Simple tool to scrape search engine results from Google, Bing, and other search engines.

linuxhackingid/lhi-search-engine-scraper

Python Web Indexing

This script searches for data on specific domains. For example, to gather data about "chocolate" from the domain "wikipedia.com", the script automates the search by going through every domain listed in the urls file.
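The idea above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `build_queries` and the use of the `site:` search operator are assumptions made for the example.

```python
def build_queries(keyword, domains):
    """Return one search query string per domain (hypothetical helper)."""
    queries = []
    for domain in domains:
        domain = domain.strip()
        if not domain:
            continue  # skip blank lines read from the urls file
        # Assumption: the search is restricted with the "site:" operator.
        queries.append(f"{keyword} site:{domain}")
    return queries


if __name__ == "__main__":
    for query in build_queries("chocolate", ["wikipedia.com", "example.org"]):
        print(query)
```

Each resulting string could then be sent to a search engine, one request per domain.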

Installation

Dependencies

- requests
- beautifulsoup4

Install Dependencies

pip install -r ./requirements.txt

Run program

python main.py

Configuration

After running the program, you may wonder why the URL output looks odd and how to fix it. You can switch to strict mode: open "config/url.py", set "must start with" to true, and enable the domain under "must contain". The results will be cleaner, but keep in mind that strict mode returns fewer URLs than the bare-minimum or standard config.
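To show why strict mode returns fewer URLs, here is a rough sketch of such a filter. The names `filter_urls`, `must_start_with`, and `must_contain` are hypothetical and do not reflect the real contents of "config/url.py". The sketch also assumes that "must start with" means the URL begins with an http or https scheme.

```python
from urllib.parse import urlparse


def filter_urls(urls, domain, must_start_with=False, must_contain=False):
    """Keep only URLs that pass the enabled strict-mode checks (illustrative)."""
    kept = []
    for url in urls:
        # Assumption: "must start with" requires an http(s) scheme.
        if must_start_with and not url.startswith(("http://", "https://")):
            continue
        # Assumption: "must contain" requires the domain in the host name.
        host = urlparse(url).netloc.lower()
        if must_contain and domain.lower() not in host:
            continue
        kept.append(url)
    return kept


if __name__ == "__main__":
    found = [
        "https://en.wikipedia.com/wiki/Chocolate",
        "/search?q=chocolate",
        "https://ads.example.net/wikipedia.com",
    ]
    print(filter_urls(found, "wikipedia.com"))              # standard: keeps all
    print(filter_urls(found, "wikipedia.com", True, True))  # strict: keeps fewer
```

With both checks enabled, relative links and links hosted on other domains are dropped, which is the trade-off described above.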

Update

You may ask how to search data from all websites. You can now search with only a keyword, without giving a specific domain. In the domain input, enter '*' as a wildcard; the script automatically reads it as a search across all website domains.
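The wildcard behavior might look roughly like this. It is only a sketch: `build_query` is a hypothetical name, and restricting by domain with the `site:` operator is an assumption, not something confirmed by the script.

```python
def build_query(keyword, domain):
    """Build a search query; '*' means no domain restriction (illustrative)."""
    domain = domain.strip()
    if domain == "*":
        # Wildcard: search all websites using the keyword alone.
        return keyword
    # Assumption: a specific domain is applied with the "site:" operator.
    return f"{keyword} site:{domain}"


if __name__ == "__main__":
    print(build_query("chocolate", "*"))
    print(build_query("chocolate", "wikipedia.com"))
```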

Unit Test

Run Unit Test

python test.py

Notes

This program is for educational purposes only; use it at your own risk.
