ghdb_scraper.py no longer retrieves dork #21

Closed
Astroida opened this issue Dec 5, 2018 · 5 comments

Astroida commented Dec 5, 2018

The GHDB scraper no longer works - presumably this is because the exploit-db website has been updated.

Here's the output I am getting:

[*] Initiation timestamp: 20181205_104042
[*] Spawing thread #0
[*] Spawing thread #1
[*] Spawing thread #2
[+] Retrieving dork 6: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 7: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 9: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 10: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 5: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 8: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 12: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 13: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 15: Penetration Testing with Kali Linux (PWK)

opsdisk (Owner) commented Dec 5, 2018

Hi @Astroida - thank you for alerting me to that! The site definitely looks different. I'll take a look at updating the code. For the time being, you can change the URL (https://github.com/opsdisk/pagodo/blob/master/ghdb_scraper.py#L34) to point to https://old.exploit-db.com, which is the old site that worked with ghdb_scraper.py.

Astroida (Author) commented Dec 6, 2018

Hi @opsdisk, thanks for the temp fix! Wasn't aware that the old link still exists.

opsdisk (Owner) commented Dec 6, 2018

@Astroida Try taking this branch for a spin: https://github.com/opsdisk/pagodo/tree/issue-21

My original testing shows they start blocking attempts after 500-1000 requests even with 1 thread, so I may have to add some logic to back off / randomize the request rate like I do in https://github.com/opsdisk/metagoofil/blob/master/metagoofil.py
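For anyone reading along, here is a rough sketch of what "back off / randomize the request rate" could look like. It is not the code from metagoofil.py or the issue-21 branch; the `RequestPacer` class, its parameter names, and the default values are all hypothetical and only illustrate the idea of a randomized delay that grows while the site is blocking requests.

```python
import random


class RequestPacer:
    """Hypothetical helper: randomized delay that backs off while blocked."""

    def __init__(self, base_delay=2.0, jitter=1.0, factor=2.0,
                 max_delay=120.0, rng=None):
        self.base_delay = base_delay  # seconds to wait when not blocked
        self.jitter = jitter          # upper bound of the random extra wait
        self.factor = factor          # multiplier applied on each block
        self.max_delay = max_delay    # cap on the non-random part of the wait
        self.current = base_delay
        self.rng = rng or random.Random()

    def next_delay(self, blocked=False):
        """Return how many seconds to sleep before the next request."""
        if blocked:
            # Grow the delay geometrically, but never past max_delay.
            self.current = min(self.current * self.factor, self.max_delay)
        else:
            # A successful request resets the delay to the baseline.
            self.current = self.base_delay
        # Add jitter so requests do not arrive at a fixed interval.
        return self.current + self.rng.uniform(0, self.jitter)
```

In a scraper loop, the caller would sleep for `pacer.next_delay(blocked=...)` seconds between requests, passing `blocked=True` whenever the response looks like a block. Which status codes or page contents count as a block on exploit-db is an assumption that would need checking.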

opsdisk (Owner) commented Dec 7, 2018

Never mind about testing this. I figured out how to pull all Google dorks with 1 HTTP GET request. It may take a day or two to push the code.

opsdisk (Owner) commented Dec 9, 2018

Pull the latest from master branch. Pushed some fresh updates.

opsdisk closed this as completed Dec 9, 2018