
This project is a web scraper that automates the process of extracting repository details from GitHub. It navigates through pages, clicks the "Load more" button, and collects relevant data such as repository names, descriptions, forks, star counts, and programming languages.


🕷️ Web Scraper Project – GitHub Repo Extractor

This project is a powerful, stealth-enabled web scraper built using Python + Selenium to automate the process of extracting repository details from GitHub collections.


✨ Features

  • 🔍 Automated Web Scraping – Extracts data like repo names, stars, forks, and languages.
  • 🧠 Stealth Mode Enabled – Avoids bot detection using ChromeDriver stealth configuration.
  • 🌐 Handles JavaScript-Heavy Pages – Scrapes dynamic content by rendering the full page.
  • 📊 Data Analytics – Visualizes data with charts and graphs via Streamlit.
  • 💾 Export Options – Save scraped results directly as CSV.
  • 📁 Download-Free Setup – Uses webdriver-manager, so no need to manually install ChromeDriver.
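The stealth and headless features above boil down to the launch flags passed to Chrome. As a minimal sketch, here is how such a flag set might be assembled — these are commonly used options, not scrape.py's verified configuration, and the function name is illustrative:

```python
def build_chrome_flags(headless: bool = True) -> list[str]:
    """Return Chrome launch flags commonly used for headless scraping
    with a reduced automation fingerprint. Illustrative defaults only,
    not the exact flags scrape.py uses."""
    flags = [
        # Hide the navigator.webdriver automation hint from the page.
        "--disable-blink-features=AutomationControlled",
        # Avoid sandbox / shared-memory issues in containers and CI.
        "--no-sandbox",
        "--disable-dev-shm-usage",
        # A realistic window size makes sites serve their desktop layout.
        "--window-size=1920,1080",
    ]
    if headless:
        flags.append("--headless=new")
    return flags
```

With Selenium, each flag would be passed to a `ChromeOptions` object via `add_argument()` before the driver starts.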

📸 Screenshots

Output result 1

Output result 2

Output result 3


💡 Use Cases

  • 📈 Market & Trend Analysis
  • 🧑‍💻 GitHub-based Research & Repo Discovery
  • 🏢 Competitor & Project Intelligence
  • 🤖 Dataset creation for AI/ML Models

🛠️ Setup & Installation

1. Clone the Repository

```shell
git clone https://github.com/yokodrea/scraper-project.git
cd scraper-project
```

2. Install Dependencies

```shell
pip install -r requirements.txt
```

✅ No need to download ChromeDriver manually — it's handled by webdriver-manager.

3. 🚀 Run the Scraper

```shell
streamlit run scrape.py
```

You can customize the GitHub collection URL inside scrape.py.
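The project overview notes that the scraper clicks the "Load more" button until every repository is visible. That pagination loop can be sketched independently of Selenium by injecting the button lookup as a callable; the names here (`click_until_exhausted`, `find_button`) are illustrative, not taken from scrape.py:

```python
def click_until_exhausted(find_button, max_clicks: int = 50) -> int:
    """Repeatedly click a 'Load more'-style button until it disappears.

    find_button: callable returning an object with a .click() method,
                 or None once no button remains. With Selenium this
                 would wrap a driver.find_elements(...) lookup.
    max_clicks:  safety cap so a misbehaving page cannot loop forever.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        button = find_button()
        if button is None:
            break  # no more 'Load more' button: all items are loaded
        button.click()
        clicks += 1
    return clicks
```

Injecting the lookup keeps the loop testable with a fake button, while the real scraper supplies a Selenium-backed callable.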

4. 📂 Output

The scraped data will be saved as:

project_list.csv
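As a minimal sketch of the CSV export step, the scraped records could be written out like this with the standard library (the project itself uses pandas, and the column names below are an assumed schema, not scrape.py's exact one):

```python
import csv

def save_repos(rows: list[dict], path: str = "project_list.csv") -> None:
    """Write scraped repository records to a CSV file.
    Column names are an illustrative schema, not scrape.py's exact output."""
    fieldnames = ["name", "description", "stars", "forks", "language"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

# Example: one scraped record, written to project_list.csv.
save_repos([
    {"name": "example/repo", "description": "demo", "stars": 42,
     "forks": 7, "language": "Python"},
])
```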

5. 🔧 Tech Stack

Language: Python

Core Libraries: selenium (automation), pandas (data handling), streamlit (UI), webdriver-manager (auto-handles ChromeDriver)

Scraping Mode: Headless browser via Chrome

Export Format: CSV

📅 Future Enhancements

⏱️ Multi-threading for speed boost

📬 Real-time alerts on repo updates

🕒 Scheduler for auto-scraping (cron jobs)

🌐 Deploy as a hosted scraping service

🧪 Requirements

📌 To be filled in once final dependencies are set. 👉 See requirements.txt for more info.

📄 License

This project is licensed under the MIT License.
