🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Updated Jun 13, 2025 - TypeScript
Python scraper based on AI
🕷️ An undetectable, powerful, flexible, high-performance Python library that makes web scraping as easy and effortless as it should be!
A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama
Lightweight library for scraping websites with LLMs
➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...
[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.
AI web scraper built with Crawl4AI for extracting structured leads data from websites.
How-to guides on web crawling and scraping
Python, JavaScript, and Rust libraries for the Spider Cloud API.
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
Fastest and cheapest distributed residential proxy network.
AI Scraper: scrape and extract data from websites in any format (CSV, JSON, HTML...) using Selenium or Crawl4AI, with Ollama or the SambaNova API, and a Streamlit chatbot UI.
A CLI tool and REST API that converts web content to clean Markdown, bypassing anti-scraping measures using headless browsers. Perfect for AI/LLM applications
All Scrapers Resource Available Here! Give Us Stars🌟
ScrapeGraphAI is a Python-based web-scraping framework that pairs large-language-model reasoning with a graph-style pipeline engine to turn websites (or local XML/HTML/JSON/Markdown files) into structured data with just a handful of lines of code.
Use LLaMA 3 and Python to extract structured data from websites like Amazon, leveraging LLM-powered parsing for resilient, AI-driven web scraping.
AI-powered web scraper using JavaScript/TypeScript.