scraper

Here are 432 public repositories matching this topic...

cheeriojs / cheerio

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

html jquery parser scraper dom cheerio selector hacktoberfest htmlparser2 htmlparser

Updated May 28, 2024
TypeScript

apify / crawlee

Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated May 28, 2024
TypeScript

mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl, search and extract with a single API.

markdown crawler data scraper ai html-to-markdown web-crawler scraping rag llm ai-scraping

Updated May 28, 2024
TypeScript

mishushakov / llm-scraper

Turn any webpage into structured data using LLMs

scraper browser ai artificial-intelligence openai llama gpt browser-automation puppeteer playwright gpt-4 llm langchain

Updated May 18, 2024
TypeScript

consumet / api.consumet.org

A modern search engine API for anime, movies/TV shows, books, light novels, manga, etc.

Updated May 28, 2024
TypeScript

linvo-io / linvo-scraper

LinkedIn automation bot with extensive scraping features. Valid for 2022; used by Linvo.io.

scraper automation linkedin hacktoberfest puppeteer hactoberfest-accepted

Updated Aug 22, 2023
TypeScript

lmmfranco / nintendo-switch-eshop

Crawler for Nintendo Switch eShop

game crawler scraper nintendo lib price switch eshop nintendo-switch

Updated Nov 5, 2021
TypeScript

josephlimtech / linkedin-profile-scraper-api

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Updated Apr 5, 2024
TypeScript

jacktuck / unfurl

Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js ⚡

nodejs slack metadata scraper microservice open-graph oembed twitter-cards embed micro ogp meta-tags unfurl

Updated Apr 9, 2024
TypeScript

gigobyte / HLTV

The unofficial HLTV Node.js API

parser scraper hltv

Updated May 28, 2024
TypeScript

consumet / consumet.ts

Node.js library that provides high-level APIs for obtaining information on various entertainment media such as books, movies, comic books, anime, and manga.

api npm scraper streaming typescript movies books anime manga npm-package reading anilist light-novels streaming-api movies-api anime-list manga-api

Updated May 24, 2024
TypeScript

openzim / mwoffliner

MediaWiki scraper: all your wiki articles in one highly compressed ZIM file

nodejs scraper offline mediawiki wikipedia archive zim openzim

Updated May 27, 2024
TypeScript

epiqueras / getsy

A simple browser/client-side web scraper.

scraper browser web-scraper client-side

Updated Apr 24, 2017
TypeScript

t3chnoboy / thepiratebay

💀 The Pirate Bay node.js client

parser torrent scraper piratebay

Updated Jan 6, 2023
TypeScript

videomanagertools / scraper

A scraper that switches between normal mode and gentleman mode, built on Electron and React

scraper movies video tool manager nfo jav av

Updated Dec 27, 2020
TypeScript

ghoshRitesh12 / aniwatch-api

Node.js API for obtaining anime information from hianime.to (formerly aniwatch.to) written in TypeScript, made with Cheerio & Axios

api scraper anime aniwatch

Updated May 10, 2024
TypeScript

bitmakerla / estela

estela, an elastic web scraping cluster 🕸

react python docker kubernetes scraper django scraping crawling requests web-scraping scrapy hacktoberfest python-requests scrapyd scrapy-visualization webscraping-python

Updated Feb 8, 2024
TypeScript

the-convocation / twitter-scraper

A port of n0madic/twitter-scraper to Node.js.

scraper twitter node-js

Updated Apr 28, 2024
TypeScript

sedgwickz / jsonHunter

Online web scraper

scraper

Updated Jul 11, 2022
TypeScript

get-set-fetch / scraper

Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSDOM.

nodejs scraper cloud web scraping

Updated Mar 13, 2023
TypeScript
