# html-parser

Here are 83 public repositories matching this topic...

ispras / dedoc

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser

html pdf ocr table-of-contents excel html-parser docx documents doc scanned-documents txt document-analysis odt pdf-parser table-recognition docx-parser document-content-extraction logical-structure-extraction

Updated Oct 1, 2024
Python

OwenOrcan / YiraBot-Crawler

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

open-source machine-learning data-mining scraping python3 text-extraction web-scraping html-parser robots-txt data-extraction seotools command-line-tool beginner-friendly contributions-welcome big-data-analytics seo-analysis good-first-issue sitemap-parser web-crawlers

Updated Sep 16, 2024
Python

Scorpi-ON / citatyinfo_bot

An asynchronous parser bot for the Russian quotes portal citaty.info

python bot quotes parsing telegram-bot asynchronous html-parser uvloop pyrogram lexbor selectolax

Updated Aug 21, 2024
Python

Winterwind / MovieReccomendationSystem

My personal summer project: a program that prompts the user for the desired genre(s) and keyword(s) and outputs a list of movies matching that query; results are printed in the terminal

Updated Aug 18, 2024
Python

hawa1222 / data-stream-etl

Python-MySQL ETL pipeline to centralise personal data from sources like YouTube and Apple into a structured database, enabling advanced data analysis and application development.

mysql python pipeline etl html-parser xml-parser api-integration

Updated Aug 9, 2024
Python

rajatomar788 / pywebcopy

Locally saves webpages to your hard disk with images, css, js & links as is.

python html crawler web webpage mirror html-parser archive-tool

Updated Jul 31, 2024
Python

fearless-spider / python_playground

My Python playground

python ssh tcp email hacking hash html-parser xml-parser text-processing

Updated Jul 25, 2024
Python

alphanome-ai / sec-parser

Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual (semantic) structure of the document.

Updated Jul 13, 2024
Python

ve3xone / urfu-backup-company

Backups of UrFU admissions campaigns. For applicants. Competition lists.

css python html ratings parser backup practice rating parser-generator api-client rtf html-parser html-css urfu 2023 2024 competition-lists reception-companies for-applicants

Updated Jul 7, 2024
Python

P-Sakowski / HTML-Parser

HTML parser that displays the last element of the longest unordered list found at the URL passed as a parameter. Written in Python.

python html parser html-parser urllib

Updated May 28, 2024
Python
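
The task this entry describes can be sketched with the standard library alone. The following is a minimal illustration, not the repository's actual code: the names `LongestListParser` and `last_item_of_longest_list` are hypothetical, the sketch takes an HTML string rather than a URL, and it assumes `<li>` items belong to the innermost open `<ul>`.

```python
from html.parser import HTMLParser


class LongestListParser(HTMLParser):
    """Collect the item texts of every <ul> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.open_lists = []  # stack of open <ul> elements, innermost last
        self.finished = []    # item texts of every closed <ul>

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.open_lists.append({"items": [], "in_li": False})
        elif tag == "li" and self.open_lists:
            current = self.open_lists[-1]
            current["items"].append("")
            current["in_li"] = True

    def handle_endtag(self, tag):
        if tag == "li" and self.open_lists:
            self.open_lists[-1]["in_li"] = False
        elif tag == "ul" and self.open_lists:
            self.finished.append(self.open_lists.pop()["items"])

    def handle_data(self, data):
        # Text is attributed to the current item of the innermost open list.
        if self.open_lists and self.open_lists[-1]["in_li"]:
            self.open_lists[-1]["items"][-1] += data


def last_item_of_longest_list(html):
    """Return the text of the last item of the longest <ul>, or None."""
    parser = LongestListParser()
    parser.feed(html)
    parser.close()
    lists = [items for items in parser.finished if items]
    if not lists:
        return None
    # On ties, max() keeps the first list that was closed.
    return max(lists, key=len)[-1].strip()
```

To work from a URL, as the description suggests, the HTML string could first be fetched with `urllib.request.urlopen` and decoded before being passed to `last_item_of_longest_list`.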

miso-belica / jusText

Heuristic-based boilerplate removal tool

python text-extraction html-parser html-parsing

Updated May 9, 2024
Python

menggatot / youtube-watch-history-to-csv

This project lets you convert your YouTube watch history HTML file from Google Takeout into a CSV file that universalscrobbler.com can use to scrobble manually in bulk.

scrobble youtube csv lastfm youtube-dl html-parser google-takeout yt-dlp youtube-watch-history universalscrobbler

Updated Apr 16, 2024
Python

imgurbot12 / pyxml

Pure Python 3 alternative to the stdlib xml.etree with HTML support

python parser xml python3 html-parser xml-parser

Updated Mar 3, 2024
Python

ParisaArbab / surf-webpage-textual-content

Analyzes the textual content of a webpage, identifying and ranking the most common words found in it

natural-language-processing counter user-agent regular-expression html-parser nltk stopwords urllib custom-html-parser get-most-common-words fetch-parse nltk-downloads

Updated Feb 24, 2024
Python
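
The idea behind this entry (extract visible text, drop stopwords, rank words by frequency) can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: `TextExtractor`, `most_common_words`, and the tiny `STOPWORDS` set are hypothetical stand-ins (the repository's tags suggest it uses NLTK stopwords instead), and the input is an HTML string rather than a fetched page.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stopword set (an assumption; a real tool would use a fuller list).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def most_common_words(html, n=5):
    """Return the n most frequent non-stopword words as (word, count) pairs."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.chunks).lower()
    words = re.findall(r"[a-z']+", text)
    return Counter(w for w in words if w not in STOPWORDS).most_common(n)
```

The word pattern only matches ASCII letters and apostrophes, so pages in other scripts would need a broader regular expression.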

dmitrijbes / hltb-parser

Simple howlongtobeat.com parser.

parser games parsing html-parser time-tracker timetracking howlongtobeat

Updated Jan 21, 2024
Python

100backslash001 / leetcode

LeetCode-Info is a small parser that lets you view brief profile statistics on LeetCode

python html leetcode python3 requests html-parser leetcode-python

Updated Jan 20, 2024
Python

ssh319 / pc-info_tgbot

Telegram bot that prints CPU or GPU parameters parsed from the website chaynikam.info at the user's request in chat.

python telegram-bot html-parser

Updated Jan 12, 2024
Python

omar2535 / BioLife-AU-01-attendance-parser

Biolife-AU-01 time clock parsing program

parser html-parser docx docx-parser

Updated Dec 5, 2023
Python

Gill-Singh-A / Github-Analytics-Tool

A program written in Python that uses the requests module to fetch and analyze publicly available information about a GitHub account

github python git requests scrapping html-parser beautifulsoup scrapping-python beautifulsoup4

Updated Nov 14, 2023
Python

hong539 / html_crashed_lab

html_crashed_lab is a workshop for building a small program that parses HTML tags with Python, optionally combined with other programming languages.

html web workshop html-parser pytohn

Updated Oct 27, 2023
Python
