🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Updated Jul 1, 2025 - TypeScript
Python scraper based on AI
🕷️ An undetectable, powerful, flexible, high-performance Python library that makes web scraping easy and effortless, as it should be!
A powerful web scraper powered by LLMs (OpenAI, Gemini & Ollama)
Lightweight library for scraping websites with LLMs
➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...
[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.
AI web scraper built with Crawl4AI for extracting structured lead data from websites.
How-to guides on web crawling and scraping
Python, JavaScript, and Rust libraries for the Spider Cloud API.
Fastest and cheapest distributed residential proxy network.
All scraper resources available here! Give us stars 🌟
AI Scraper: scrape and extract data from websites in any format (CSV, JSON, HTML...) using Selenium or Crawl4AI, the Ollama or SambaNova API, and a Streamlit chatbot UI.
ScrapeGraphAI is a Python-based web-scraping framework that pairs large-language-model reasoning with a graph-style pipeline engine to turn websites (or local XML/HTML/JSON/Markdown files) into structured data with just a handful of lines of code.
A CLI tool and REST API that converts web content to clean Markdown, bypassing anti-scraping measures using headless browsers. Perfect for AI/LLM applications.
AI-powered web scraper using JavaScript/TypeScript.
Use LLaMA 3 and Python to extract structured data from websites like Amazon, leveraging LLM-powered parsing for resilient, AI-driven web scraping.