🕷️ Fast Async Python Site Spider

A personal, no-limits site spider written in Python — built out of necessity when every "free tool" out there was either limited, paid, or just didn't work well on macOS.

🚀 Why I Built This

I needed to spider a site quickly and without restrictions, but:

Most tools had limits or required paid plans
Others didn't work smoothly on macOS
I just needed a simple and fast way to extract links and crawl a site, saving everything in a .csv

So I built my own — with the help of AI and some Python magic.

⚙️ Features

🔗 Parses HTML using BeautifulSoup
🔁 Recursively crawls all internal links
💾 Saves crawled URLs in a results.csv file
⚡ Powered by asyncio and aiohttp, so pages are fetched concurrently instead of one at a time
🧠 Smart deduplication of URLs (no repeat crawls)
🎯 Designed for single-site deep crawling
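
The features above could fit together roughly like this. It is a minimal sketch, not the code in spider.py: the names `internal_links`, `crawl`, `save_csv`, the `fetch` and `max_concurrency` parameters, and the `url` column header are all assumptions made for illustration. To stay dependency-free it swaps in the standard library's `HTMLParser` where the real spider uses BeautifulSoup, and it takes the page fetcher as an argument rather than calling aiohttp directly.

```python
import asyncio
import csv
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html, host):
    """Return absolute, fragment-free links on the page that stay on `host`."""
    parser = LinkParser()
    parser.feed(html)
    links = set()
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            links.add(absolute)
    return links


async def crawl(start_url, fetch, max_concurrency=10):
    """Crawl every internal page reachable from start_url.

    `fetch` is a hypothetical async callable: fetch(url) -> HTML string.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}  # deduplication: each URL is scheduled only once
    limit = asyncio.Semaphore(max_concurrency)

    async def visit(url):
        async with limit:  # cap the number of simultaneous fetches
            html = await fetch(url)
        new = internal_links(url, html, host) - seen
        seen.update(new)
        if new:
            # Recurse into newly discovered pages concurrently.
            await asyncio.gather(*(visit(u) for u in new))

    await visit(start_url)
    return sorted(seen)


def save_csv(urls, path="results.csv"):
    """Write one URL per row under an assumed 'url' header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url"])
        writer.writerows([u] for u in urls)
```

In real use, `fetch` might be a coroutine that requests the URL through an `aiohttp.ClientSession` and returns `await response.text()`. The result of `asyncio.run(crawl(start_url, fetch))` can then be passed to `save_csv`. Keeping the fetcher separate also makes the crawl logic easy to try against an in-memory dictionary of pages.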

🧰 Requirements

Python 3.7+
aiohttp
beautifulsoup4
lxml

Install with:

pip install aiohttp beautifulsoup4 lxml

Usage:

python spider.py

Mid90sAhsan/PythonSpider