PritamSarbajna/dark-web-scraper


Python Kali Linux Debian TOR

🎯 Usage :

Currently, this tool is designed to do only the following, without requiring the Tor Browser:

  • Scrape the dark web for onion links
  • Scrape images from the dark web
  • Detect the language of a dark web page
  • Check whether an onion link is valid

✨ Unique Features :

  • Uses a Tor proxy instead of the Tor Browser
  • Randomizes the IP address for anonymity
  • Spoofs the user agent to make tracking harder
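
The proxy and user-agent features listed above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the names `TOR_SOCKS_PROXY`, `USER_AGENTS`, `build_request_settings`, and `fetch_via_tor` are hypothetical, port 9050 is assumed to be Tor's usual default SOCKS port, and the fetch helper assumes the third-party `requests` library installed with SOCKS support (`pip install requests[socks]`).

```python
import random

# Assumption: Tor's SOCKS proxy listens on its usual default, 127.0.0.1:9050.
# The "socks5h" scheme lets the proxy resolve hostnames, which .onion needs.
TOR_SOCKS_PROXY = "socks5h://127.0.0.1:9050"

# Hypothetical pool of user-agent strings to rotate through.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def build_request_settings():
    """Return proxy and header settings for one request through Tor."""
    return {
        "proxies": {"http": TOR_SOCKS_PROXY, "https": TOR_SOCKS_PROXY},
        "headers": {"User-Agent": random.choice(USER_AGENTS)},
    }


def fetch_via_tor(url, timeout=30):
    """Fetch a URL through the Tor proxy (needs requests[socks] installed)."""
    import requests  # imported lazily so the settings helper works without it

    return requests.get(url, timeout=timeout, **build_request_settings())
```

Each call to `build_request_settings()` picks a user agent at random, so consecutive requests do not necessarily present the same browser string.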

🔧 Current Dependencies:

  • Linux (a Debian-based distro was used)

⚙️ Prerequisite :

Enable the Tor SOCKS proxy

  • Update package lists
$ sudo apt update
  • Install tor package
$ sudo apt install tor
  • Start Tor service
$ sudo service tor start
  • Verify installation status
$ sudo service tor status
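
Once the service is running, it can be useful to confirm from Python that the SOCKS port accepts connections. The sketch below assumes Tor is listening on its usual default, 127.0.0.1:9050; `is_port_open` is a hypothetical helper and is not part of dark-web-scraper.

```python
import socket


def is_port_open(host="127.0.0.1", port=9050, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # Assumption: 9050 is the SOCKS port of the local Tor service.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `is_port_open()` with no arguments checks the assumed default address. A `False` result suggests the Tor service is not running or uses a different port.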

📚 Tutorial :

Install using pip

$ pip install dark-web-scraper

1. Find onion URLs on a dark web page

  • Request : find_onion_links( str )
  • Response: Links are saved in result.txt
  • Example :
# Main.py

from dark_web_scraper import find_onion_links
find_onion_links('http://random_url.onion')
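
For readers curious how onion links can be pulled out of a page, here is a minimal sketch. It is not the package's implementation: `ONION_V3_PATTERN`, `extract_onion_links`, and `save_links` are hypothetical names, and the pattern assumes only v3 addresses (56 base32 characters before `.onion`) are of interest.

```python
import re

# Assumption: only v3 onion addresses (56 base32 characters) are of interest.
ONION_V3_PATTERN = re.compile(r"\b[a-z2-7]{56}\.onion\b", re.IGNORECASE)


def extract_onion_links(html):
    """Return the unique v3 .onion hostnames found in html, sorted."""
    return sorted({match.lower() for match in ONION_V3_PATTERN.findall(html)})


def save_links(links, path="result.txt"):
    """Write one link per line, mirroring the result.txt output described above."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(links))
```

The set comprehension removes duplicates, and lowercasing makes the comparison case-insensitive, since hostnames are not case-sensitive.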

2. Scrape images from a dark web page

  • Request : find_images_from_onion_link( str )
  • Response: Images are saved in /static/images
  • Example :
# Main.py

from dark_web_scraper import find_images_from_onion_link
find_images_from_onion_link('http://random_url.onion')
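
The first step of image scraping, collecting image URLs from a page, might look like the sketch below. It uses only the Python standard library and is an illustration, not the package's code: `ImageSourceParser` and `extract_image_urls` are hypothetical names, and downloading the files is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageSourceParser(HTMLParser):
    """Collect absolute URLs from the src attribute of <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return image URLs in document order, resolved against base_url."""
    parser = ImageSourceParser(base_url)
    parser.feed(html)
    return parser.image_urls
```

Each returned URL could then be fetched through the Tor proxy and written to an output directory such as /static/images.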

3. Detect the language of a dark web page

  • Request : detect_onion_link_language( str )
  • Response: Returns the name of the detected language
  • Example :
# Main.py

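from dark_web_scraper import detect_onion_link_language

# The function gives back the language name, so print it to see the result
language = detect_onion_link_language('http://random_url.onion')
print(language)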

4. Check whether an onion link is valid

  • Request : is_onion_site_valid( str )
  • Response: Returns True or False
  • Example :
# Main.py

from dark_web_scraper import is_onion_site_valid
is_onion_site_valid('http://random_url.onion')
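
Before making a network request, a cheap syntactic pre-check can filter out strings that cannot be onion URLs at all. The sketch below is an assumption-laden illustration and is not how `is_onion_site_valid` is known to work: `looks_like_onion_url` is a hypothetical helper that only accepts http/https URLs whose host ends in a 56-character v3 onion label, and it says nothing about whether the site is reachable.

```python
import re
from urllib.parse import urlparse

# Assumption: a valid address is a v3 onion label of 56 base32 characters.
V3_LABEL = re.compile(r"[a-z2-7]{56}")


def looks_like_onion_url(url):
    """Return True if url is an http(s) URL with a v3 .onion hostname."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if not host.endswith(".onion"):
        return False
    # The label immediately before ".onion" carries the address itself.
    label = host[: -len(".onion")].split(".")[-1]
    return V3_LABEL.fullmatch(label) is not None
```

Note that the placeholder `http://random_url.onion` used in the examples would fail this check, since it is not a real v3 address.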

🚀 Features to be added :

  • Language Detection
  • Language translation
  • Onion link validator
  • Object detection in images
  • Named entity recognition
  • Search specific keywords in a list of urls
  • Sentiment analysis of webpage contents

⚠️ Disclaimer:

  • I do not promote illegal activity.
  • This project is for educational purposes only.