Doesn't detect all the available applications #5

Closed
przmv opened this issue Apr 19, 2017 · 16 comments
@przmv
Contributor

przmv commented Apr 19, 2017

Looks like webanalyze (with this apps.json) doesn't detect as many applications as AliasIO/wappalyzer:

$ webanalyze -host="http://stackshare.io"
2017/04/19 13:44:32 Scanning with 4 workers.
2017/04/19 13:44:34 [+] http://stackshare.io (1.346715838s):
2017/04/19 13:44:34     - Google Font API        - [17]
2017/04/19 13:44:34     - Nginx  - [22]
2017/04/19 13:44:34     - Express        - [18 22]
2017/04/19 13:44:34     - Ruby on Rails  - [18]
$ docker run --rm wappalyzer/cli http://stackshare.io | jq '.applications | .[] | .name'
"Algolia Realtime Search"
"AngularJS"
"Express"
"Handlebars"
"Intercom"
"List.js"
"Mailchimp"
"Moment.js"
"New Relic"
"Nginx"
"React"
"Segment"
"Snap.svg"
"SweetAlert"
"Twitter Bootstrap"
"UserVoice"
"Varnish"
"jQuery"
"Node.js"
@rverton
Owner

rverton commented Apr 19, 2017

Wappalyzer makes use of a JavaScript environment to execute some JavaScript checks on the loaded page. We can't do this here without adding a bridge to PhantomJS or a headless browser. Maybe we can add an optional feature to include headless Chrome/Firefox in the future. I'll think it over.

@hbakhtiyor

Maybe use a more lightweight option, like https://github.com/scrapinghub/splash, which https://github.com/spectresearch/detectem uses.

@przmv
Contributor Author

przmv commented Apr 24, 2017

@rverton I'd like to help you with this issue. I'm interested in PhantomJS integration, since it's easier to install on servers. Let's discuss how it could be implemented, so I can start working on it and hopefully send a pull request in the near future.

@hbakhtiyor

@pshevtsov PhantomJS is heavy and is no longer being maintained.

@przmv
Contributor Author

przmv commented Apr 24, 2017

@hbakhtiyor what do you suggest instead? I need something that is easy to install for the end users and is cross-platform — just like static PhantomJS binaries.

@hbakhtiyor

https://github.com/scrapinghub/splash, using docker for easy installation, or headless chrome/firefox

@rverton
Owner

rverton commented Apr 24, 2017

The problem I see here is that including an external tool will have a big impact on performance. So if we implement this, we need to make it optional.

I already included PhantomJS in a Go project some time ago (https://github.com/rverton/xssmap), but @hbakhtiyor may be right: it looks like PhantomJS will be discontinued in the (near) future in favor of Chrome/FF headless browser support.

Maybe it's worth doing a test run: implement Selenium and compare the performance results? What do you think?
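
A rough sketch of what "optional" plus a timing comparison could look like. The `fetcher` interface, the `-render` flag, and `renderedFetcher` are assumptions made up for this example (only `-host` appears in the output above), and the rendered path is a stub rather than a real PhantomJS or Selenium bridge.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// fetcher is a hypothetical abstraction over "how do we get the page".
type fetcher interface {
	Fetch(url string) (string, error)
}

// staticFetcher is the plain HTTP path: fast, but no JavaScript execution.
type staticFetcher struct{ client *http.Client }

func (f staticFetcher) Fetch(url string) (string, error) {
	resp, err := f.client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

// renderedFetcher is a placeholder where an external renderer would be called.
type renderedFetcher struct{}

func (renderedFetcher) Fetch(string) (string, error) {
	return "", errors.New("rendered fetch is not implemented in this sketch")
}

// newFetcher keeps the static path as the default so the renderer stays opt-in.
func newFetcher(render bool) fetcher {
	if render {
		return renderedFetcher{}
	}
	return staticFetcher{client: &http.Client{Timeout: 10 * time.Second}}
}

// timed measures one fetch so both paths can be compared on the same URL.
func timed(f fetcher, url string) (time.Duration, error) {
	start := time.Now()
	_, err := f.Fetch(url)
	return time.Since(start), err
}

func main() {
	host := flag.String("host", "", "URL to scan")
	render := flag.Bool("render", false, "hypothetical: use a JavaScript-capable renderer")
	flag.Parse()
	if *host == "" {
		fmt.Fprintln(os.Stderr, "missing -host")
		os.Exit(2)
	}
	d, err := timed(newFetcher(*render), *host)
	fmt.Printf("render=%v took %s (err: %v)\n", *render, d, err)
}
```

Running it once with and once without `-render` against the same host would give a first, very coarse number for the performance cost being discussed.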

@przmv
Contributor Author

przmv commented Apr 24, 2017

@rverton @hbakhtiyor Using headless Chrome or Firefox seems like a decent solution for desktop users (since it's already there), but the Go application I'm currently working on is mostly targeted at servers, and having a huge GUI application like Chrome or Firefox as a dependency of a CLI tool looks like overkill.

@rverton
Owner

rverton commented Apr 24, 2017

The question we have to ask is whether it's worth implementing this and invoking a different renderer, because then we could also just build a wrapper around the existing Wappalyzer PhantomJS driver: https://github.com/AliasIO/Wappalyzer/tree/master/src/drivers/phantomjs

It may be worth making a small test implementation to see whether we would still perform better, but I have to say I'm a bit skeptical.

@hbakhtiyor

@rverton It would be nice to make it optional, using a CDP client.
@pshevtsov You don't need any GUI to run Chrome or FF in headless mode.

@rverton
Owner

rverton commented Apr 28, 2017

@hbakhtiyor But you need to install a full Chrome/FF, which may require a lot of other dependencies to be installed.

I don't currently have the time to test any of these approaches. If someone wants to, feel free to send me PRs.

@rverton rverton mentioned this issue Jan 5, 2019
@j3ssie

j3ssie commented Apr 14, 2019

@rverton Did you check out this awesome lib? https://github.com/chromedp/chromedp

@5amu

5amu commented Apr 22, 2021

Hello @rverton!

First of all, I really like this tool and I'd like to use it for work too.

I'm not very familiar with the Wappalyzer code, but if the problem is that you need to execute JavaScript, https://github.com/rogchap/v8go might be an easy enough fix. Otherwise, I'd be very glad if anyone could tell me where the JS execution is needed, so I can implement it myself and open a PR.

@rverton
Owner

rverton commented Apr 22, 2021

Hi @5amu,

Sadly, it's not that easy in this case, because it's not just JavaScript that is missing here: the whole (browser) DOM is missing when not using a browser. Maybe there is a way to emulate this with some libs, but I don't know of any. If we make use of a headless browser, the speed we gain from the non-browser approach is gone.

If you want to go this route, I guess it's easier to wrap the Wappalyzer script or Docker container. This will be slower by a huge margin, but more precise, because it can detect client-side JavaScript technologies.

Greetings

@bugbaba

bugbaba commented Apr 22, 2021

Hi @5amu,

The Wappalyzer code is heavily documented. If you don't want to use Docker, you can refer to https://github.com/AliasIO/wappalyzer/tree/master/src/drivers/npm to get started with a Node binary on your system.
Also, on top of the performance concerns: if browser support were added to webanalyze, the project would just become Wappalyzer rewritten in Go, which wouldn't make a big difference in performance.

--
Regards,
@bugbaba

@5amu

5amu commented Apr 23, 2021

Thanks @bugbaba @rverton for the clarification,

I'll keep an eye on the project to see if someone, eventually, will come up with an idea to solve this issue without many performance penalties. For now, I'll keep using wappalyzer in docker.

Best of luck!

@rverton rverton closed this as completed Sep 21, 2023