Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Updated Jun 7, 2024 - Python
📝 Python package to calculate readability statistics of a text object: paragraphs, sentences, articles.
📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more
A Python library for calculating a large variety of metrics from text
A Python utility for moving bookmarks/reading lists between services
This project serves HTML files (and a few more) saved on your computer with a UI suitable for the Kindle web browser. On top of that, it includes a Read Mode (thanks to ReadabiliPy) that displays the text at a comfortable size without having to use the 'Article Mode' in the Kindle web browser.
Simple Smart Pipe: Python productivity tool for rapid data manipulation
Plain Russian Language: clear (simple) Russian.
PyYAML-based module to produce prettier, more readable YAML-serialized data
Web scraper with a simple REST API, running in Docker and using a headless browser and Readability.js for parsing.
Extract clean(er), readable text from web pages via Mercury Web Parser.
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
📚 A collection of useful Natural Language Processing utilities: detecting the language of a text, splitting text into sentences, extracting the main content from an HTML document
🌐 Translation plugin (multi-engine, fast, flexible) for SublimeText 3 & 4, works without API keys, works in China
Simple Python script that parses a Twitter feed to generate an RSS feed.
From local functions to cloud deployed pipelines
The more often you click a word in the headlines, the more interesting your news becomes.
The god of human readable numbers
Collect actual content of any article, blog, news, etc.