
Add functionality to extract JS strings as links in a javascript blob #1121

Open
amiremami opened this issue Feb 23, 2024 · 12 comments
Labels: enhancement (New feature or request)

@amiremami
Contributor

Couldn't get JS strings as links to be able to grep them.

My command:
bbot -t trickest.com -m httpx -c web_spider_distance=2 web_spider_depth=3 web_spider_links_per_page=1000 omit_event_types=[] url_extension_httpx_only=[]

[screenshot]

[screenshot]

🙏

@amiremami amiremami added the bug Something isn't working label Feb 23, 2024
@TheTechromancer
Collaborator

TheTechromancer commented Feb 24, 2024

@liquidsec what do you think about this? We would essentially be implementing a JS link extractor.
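
For reference, the core of such an extractor could be quite small: pull quoted string literals out of the script body, keep the ones that look like URLs or root-relative paths, and resolve relative ones against the page URL. A rough sketch (the regexes and function name here are illustrative, not bbot's actual excavate code):

import re
from urllib.parse import urljoin

# Quoted string literals inside a JS blob,
# e.g. "/_next/static/chunks/webpack-8af07453075e2970.js"
JS_STRING = re.compile(r"""["']([^"'\s]{4,512})["']""")
# Heuristic: keep only strings that look like absolute URLs or root-relative paths
LOOKS_LIKE_LINK = re.compile(r"^(?:https?://|/)[\w./%-]+$")

def extract_js_links(js_blob: str, base_url: str) -> set[str]:
    """Return absolute URLs found as string literals in a JS blob."""
    links = set()
    for match in JS_STRING.finditer(js_blob):
        candidate = match.group(1)
        if LOOKS_LIKE_LINK.match(candidate):
            # urljoin() resolves relative paths against the page URL
            links.add(urljoin(base_url, candidate))
    return links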

@TheTechromancer TheTechromancer added enhancement New feature or request and removed bug Something isn't working labels Feb 24, 2024
@amiremami
Contributor Author

This is my command:

bbot -t react.dev -m httpx -c web_spider_distance=3 web_spider_depth=3 web_spider_links_per_page=500 omit_event_types=[]

And bbot can't detect any of these JS files as links

[screenshot]

For example, this link does not exist in the output file:
https://react.dev/_next/static/chunks/webpack-8af07453075e2970.js

@TheTechromancer
Collaborator

Added support for extracting URLs from <link> elements: #1132.

@amiremami
Contributor Author

amiremami commented Feb 27, 2024

I'm adding some more examples here for future testing; I guess all of them are related to JS blobs.

openai.com [screenshot]
shopify.com [screenshot]
atlassian.com [screenshot]
whatsapp.com [screenshot]
ahrefs.com [screenshot]
clickup.com [screenshot]

@TheTechromancer
Collaborator

@amiremami thanks for testing. Did bbot fail to extract these? It always finds full URLs regardless of whether they're embedded in js blobs, so it definitely should have gotten the atlassian one.
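
A fully-qualified URL can be matched anywhere in a response body with a simple pattern, which is why absolute URLs embedded in JS blobs should surface even without JS-aware parsing. A rough illustration of the idea (the regex is a simplification, not bbot's actual pattern):

import re

# Absolute http(s) URLs anywhere in a response body, including inside JS blobs
FULL_URL = re.compile(r"https?://[\w.-]+(?::\d+)?(?:/[\w./%?#&=-]*)?")

body = 'e.src="https://atl-global.atlassian.com/js/atl-global.min.js";'
print(FULL_URL.findall(body))
# ['https://atl-global.atlassian.com/js/atl-global.min.js']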

@amiremami
Contributor Author

> @amiremami thanks for testing. Did bbot fail to extract these? It always finds full URLs regardless of whether they're embedded in js blobs, so it definitely should have gotten the atlassian one.

bbot -t https://www.atlassian.com/software -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[]

[screenshot]

I have it like this tens of times in the output file, but not as "url": "https://atl-global.atlassian.com/js/atl-global.min.js"

@TheTechromancer
Collaborator

TheTechromancer commented Feb 27, 2024

> bbot -t https://www.atlassian.com/software -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[]

I think you're forgetting a config option ;)

[screenshot]

(The reason this config option exists is that almost everyone wants to search JavaScript files for secrets etc., but if a file doesn't contain anything interesting, they usually don't want to see it in the output.)
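
In pseudocode, the tradeoff looks something like this (purely a conceptual sketch of the behavior described above; the function and pattern are made up, not bbot's actual code):

import re

# Toy "interesting content" check; bbot's real scanning is far more thorough
SECRET_PATTERN = re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]")

def handle_js_response(url: str, body: str) -> list[str]:
    """The JS file is always downloaded and scanned, but the bare .js URL
    is only worth surfacing in output when the scan found something."""
    events = []
    if SECRET_PATTERN.search(body):
        events.append(f"FINDING {url}")
    # otherwise: emit nothing, so uninteresting .js URLs stay out of the output
    return events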

@amiremami
Contributor Author

Thanks 🙏 I also used that config, but still the same :(

@amiremami
Contributor Author

amiremami commented Feb 27, 2024

> This is my command:
>
> bbot -t react.dev -m httpx -c web_spider_distance=3 web_spider_depth=3 web_spider_links_per_page=500 omit_event_types=[]
>
> And bbot can't detect any of these JS files as links
>
> [screenshot]
>
> For example, this link does not exist in the output file: https://react.dev/_next/static/chunks/webpack-8af07453075e2970.js

For this one, I just upgraded bbot to v1.1.7.2998rc, and this JS only exists as URL_UNVERIFIED, but shouldn't it exist as URL too?

https://react.dev/_next/static/chunks/webpack-a1ff329830897a9a.js

My command:
bbot -t react.dev -m httpx -c web_spider_distance=2 web_spider_depth=2 web_spider_links_per_page=500 omit_event_types=[] url_extension_httpx_only=[]

[screenshot]

@TheTechromancer
Collaborator

@amiremami that specific file is 4 levels deep. The reason it's not showing up is because the spider is set to a depth of 2 (web_spider_depth=2).

If you enable --debug, it will tell you the reason:

2024-02-27 17:00:10,924 [DEBUG] bbot.modules.internal.excavate base.py:1175 Tagging URL_UNVERIFIED("https://react.dev/_next/static/chunks/webpack-ccf89d5e32b01f59.js", module=excavate, tags={'in-scope', 'extension-js', 'endpoint'}) as spider-danger because its spider depth or distance exceeds the scan's limits
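
For reference, "4 levels deep" refers to the number of path segments in the URL. A quick way to check a URL against web_spider_depth (an illustrative helper, not bbot's internals):

from urllib.parse import urlparse

def url_depth(url: str) -> int:
    """Count non-empty path segments."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])

url = "https://react.dev/_next/static/chunks/webpack-8af07453075e2970.js"
print(url_depth(url))  # 4 -> exceeds web_spider_depth=2, so it gets tagged spider-danger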

@amiremami
Contributor Author

> @amiremami thanks for testing. Did bbot fail to extract these? It always finds full URLs regardless of whether they're embedded in js blobs, so it definitely should have gotten the atlassian one.

Still couldn't get the atlassian one, neither as URL nor as URL_UNVERIFIED. If this problem is different from the JS blob one, please check, thanks a lot 🙏

Got this today
[screenshot]

@TheTechromancer
Collaborator

TheTechromancer commented Feb 29, 2024

@amiremami keep in mind that https://atl-global.atlassian.com/js/atl-global.min.js is on a different subdomain than www.atlassian.com, so it's not in scope. If you want to see it you will need to either:

  1. increase your scope report distance to see the URL_UNVERIFIED (-c scope_report_distance=1)
  2. whitelist all of atlassian.com to also produce a URL (-w atlassian.com)
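
For example, adapting the earlier command (illustrative; the other options are unchanged):

  bbot -t https://www.atlassian.com/software -m httpx -c web_spider_distance=2 web_spider_depth=2 scope_report_distance=1
  bbot -t https://www.atlassian.com/software -w atlassian.com -m httpx -c web_spider_distance=2 web_spider_depth=2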

[screenshot]
