A multi-threaded web crawler written in Python
For now, the purpose of this tool is to gather links only.
See the todo section for planned work.
To run the crawler, please type python3 index.py
and enter a URL to crawl.
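
As a rough illustration of what gathering links with several threads can look like, here is a minimal sketch using only the standard library. It is not the code in index.py; the names LinkParser, fetch, gather_links, and crawl are hypothetical and chosen for this example only.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def fetch(url):
    """Download a page and return its HTML as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def gather_links(url, fetcher=fetch):
    """Return every link found on a single page."""
    parser = LinkParser(url)
    parser.feed(fetcher(url))
    return parser.links


def crawl(urls, max_workers=4, fetcher=fetch):
    """Gather links from several pages in parallel; returns {url: [links]}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda u: gather_links(u, fetcher), urls)
        return dict(zip(urls, results))
```

Calling crawl with a list of page URLs returns a dictionary that maps each URL to the links found on that page. The fetcher parameter is an assumption of this sketch that lets the download step be swapped out, for example in tests.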
- Gather Page links
- Multi-threaded
- Configuration support
- Crawl images with alt.
- Benchmarking
- Get metadata (description, keywords)
- Make the package flexible and easy to use without touching any core files
- Components to extend project
- Database layer
- Analytics
- Data harvesting
- Searching algorithms
- Add more tests
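
The items about images with alt text and page metadata (description, keywords) could be approached along these lines. This is only a sketch under assumptions, not the project's implementation; PageInfoParser and extract_page_info are hypothetical names.

```python
from html.parser import HTMLParser


class PageInfoParser(HTMLParser):
    """Collects images that have alt text and description/keywords metadata."""

    def __init__(self):
        super().__init__()
        self.images = []    # list of (src, alt) pairs
        self.metadata = {}  # e.g. {"description": "...", "keywords": "..."}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("alt"):
            # Keep only images with a non-empty alt attribute.
            self.images.append((attrs.get("src"), attrs["alt"]))
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.metadata[name] = attrs.get("content") or ""


def extract_page_info(html):
    """Return (images, metadata) parsed from an HTML string."""
    parser = PageInfoParser()
    parser.feed(html)
    return parser.images, parser.metadata
```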
There is still a lot of work to do, so feel free to contribute by opening a PR.
MIT
Donate a coffee?
Here is the Bitcoin address:
37x6PA4qtPu2fQnYdW5U7jztYhbchASpBV
Thank you so much.
I do not accept responsibility for any illegal usage.