forktober

Here is 1 public repository matching this topic...

ACM-VIT / scrag

A flexible web scraper that intelligently adapts to different website structures using multiple extraction strategies (newspaper3k, readability-lxml, BeautifulSoup, and optional headless rendering). It outputs clean, structured data for RAG pipelines or local LLMs, with an optional extension to automatically build RAG indexes from web queries.
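
The multi-strategy idea in the description can be illustrated with a fallback chain: try the most specific extractor first and fall through to more generic ones when the result is empty or too short. The sketch below is not scrag's actual code. It uses only the Python standard library as a stand-in for newspaper3k, readability-lxml, and BeautifulSoup, and every name in it (`extract`, `STRATEGIES`, `article_strategy`, `paragraph_strategy`, `full_text_strategy`, `min_chars`) is hypothetical.

```python
from html.parser import HTMLParser
from typing import Callable, List, Optional, Tuple


class _TextCollector(HTMLParser):
    """Collects visible text, optionally only text nested inside one tag."""

    def __init__(self, target_tag: Optional[str] = None):
        super().__init__()
        self.target_tag = target_tag
        self.depth = 0   # nesting depth inside target_tag
        self.skip = 0    # nesting depth inside <script>/<style>
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
        elif tag == self.target_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip = max(0, self.skip - 1)
        elif tag == self.target_tag:
            self.depth = max(0, self.depth - 1)

    def handle_data(self, data):
        inside = self.target_tag is None or self.depth > 0
        if inside and not self.skip and data.strip():
            self.chunks.append(data.strip())


def _collect(html: str, target_tag: Optional[str] = None) -> str:
    parser = _TextCollector(target_tag)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical strategies, ordered from most to least specific.
def article_strategy(html: str) -> Optional[str]:
    return _collect(html, "article") or None


def paragraph_strategy(html: str) -> Optional[str]:
    return _collect(html, "p") or None


def full_text_strategy(html: str) -> Optional[str]:
    return _collect(html) or None


STRATEGIES: List[Tuple[str, Callable[[str], Optional[str]]]] = [
    ("article", article_strategy),
    ("paragraph", paragraph_strategy),
    ("full_text", full_text_strategy),
]


def extract(html: str, min_chars: int = 20) -> dict:
    """Return the first strategy result with at least min_chars of text."""
    for name, strategy in STRATEGIES:
        text = strategy(html)
        if text and len(text) >= min_chars:
            return {"strategy": name, "text": text}
    return {"strategy": None, "text": ""}


if __name__ == "__main__":
    page = "<body><nav>Menu</nav><article><p>This is the main article body text.</p></article></body>"
    print(extract(page))
```

Returning the name of the strategy that succeeded alongside the text keeps the output structured, which is useful when the records are later chunked and indexed for retrieval.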

hacktoberfest hacktoberfest-accepted hacktoberfest2025 forktober