Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Updated Jun 7, 2024 - Python
📝 Python package to calculate readability statistics of a text object: paragraphs, sentences, articles.
📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more
PyYAML-based module to produce prettier, more readable YAML-serialized data
A Python library for calculating a large variety of metrics from text
Plain Russian Language / Понятный (простой) русский язык.
Web scraper with a simple REST API that runs in Docker and uses a headless browser and Readability.js for parsing.
A Python utility for moving bookmarks/reading lists between services
Extract clean(er), readable text from web pages via Mercury Web Parser.
Collect actual content of any article, blog, news, etc.
Simple Smart Pipe: Python productivity tool for rapid data manipulation
Powerful Python crawler: proxy IP, multiprocessing + Queue, YAML-configurable crawler, readability, bs4 (Beautiful Soup), pybloom, PooledDB, MySQLdb, selenium-webdriver-phantomjs, Redis, anti-geetest, YAML, email
📚 A collection of useful things from Natural Language Processing: detecting the language of a text, splitting text into sentences, extracting the main content from an HTML document
Real-time chat app demonstrating architectural, testing, readability, clean-code, and infrastructure skills, built as a personal profile project
This project serves HTML files (and a few more) saved on your computer through a UI suited to the Kindle web browser. It also includes a Read Mode (thanks to ReadabiliPy) that displays text at a comfortable size without having to use the 'Article Mode' in the Kindle web browser.
Simple Python script that parses a Twitter feed to generate an RSS feed.
🌐 Translation plugin (multi-engine, fast, flexible) for SublimeText 3 & 4, works without API keys, works in China
The more often you click a word in the headlines, the more interesting your news becomes.
Generating Summaries with Controllable Readability Levels (EMNLP 2023)