Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Updated Jun 7, 2024 - Python
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
✅ Parse your browser's exported HTML bookmark file to Markdown.
A CLI tool that fetches a webpage's main content and prints it as Markdown
A simplified online encyclopedia with Markdown-formatted entries. Powered by Django.
Tooling for extracting content from the old Geotribu site (web scraping, conversion to Markdown...)
Scrape Codewars and fetch all the solution code, along with the READMEs, at once
Website text scraper with conversion to Markdown (.md) and directory structuring