# extractor

Here are 195 public repositories matching this topic...

fhamborg / news-please

news-please - an integrated web crawler and information extractor for news that just works

Updated Jun 6, 2024
Python

StanGirard / seo-audits-toolkit

SEO & Security Audit for Websites. Lighthouse & Security Headers crawler, Sitemap/Keywords/Images Extractor, Summarizer, etc.

python crawler dashboard analysis seo extractor serp headers summarizer audits lighthouse internal-links seo-tools link-extractor securityheader

Updated Feb 6, 2023
Python

tatuylonen / wiktextract

Wiktionary dump file parser and multilingual data extractor

multilingual parser lua dictionary extractor templates wikitext scribunto wiktionary wiktionary-parser

Updated Jun 13, 2024
Python

opensemanticsearch / open-semantic-etl

Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database

Updated Oct 9, 2022
Python

lipoja / URLExtract

URLExtract is a Python class for collecting (extracting) URLs from a given text based on locating TLDs.

extractor extract urls hacktoberfest

Updated Feb 29, 2024
Python

MikeMeliz / TorCrawl.py

Crawl and extract (regular or onion) webpages through the Tor network

python crawler osint extractor tor onion

Updated Jan 22, 2024
Python

AlexMathew / scrapple

A framework for creating semi-automatic web content extractors

python crawler tutorial extractor scraping web-scraper selector css-selector web-scraping scrapy scrapers beautifulsoup xpath-expression lxml selector-expression

Updated May 22, 2024
Python

mefistotelis / pylabview

Python reader of LabVIEW RSRC files (VI, CTL, LLB). File format description on the Wiki.

extractor reverse-engineering labview python3 fileformat

Updated Aug 15, 2023
Python

nexB / extractcode

A mostly universal file extraction library and CLI tool to extract almost any archive in a reasonably safe way on Linux, macOS and Windows.

gzip zip extractor extract tar cab bzip2 decompression archive zstd lzma iso9660 xz libarchive 7zip cpio

Updated May 16, 2024
Python

iAkashPattnaik / AudioExtractorBot

Source code for a Telegram bot that extracts audio from videos.

audio ffmpeg telegram-bot extractor

Updated Sep 20, 2023
Python

theLSA / burp-sensitive-param-extractor

Burp Suite extension for checking and extracting sensitive request parameters

checker parameters extractor burp-plugin burpsuite sensitive

Updated Nov 29, 2020
Python

DanielJDufour / date-extractor

Extract dates from text

python nlp parser time parse datetime date extractor iso taiwan chinese french arabic temporal kurdish sorani extract-dates

Updated Jan 27, 2021
Python

ZKAW / website-cloner

Basic website cloner written in Python

python html website downloader html5 download extractor cloner tor python3 pentesting scrap beautifulsoup tor-network scrapper pentest-tool beautifulsoup4 website-clone website-cloner

Updated Sep 13, 2023
Python

JingShing / AI-image-tag-extractor

A tool to help you get image info.

python ai tags extractor tkinter tag tkinter-gui imageai ai-image novelai stable-diffusion

Updated Mar 24, 2023
Python

chi0tzp / PyVideoFramesExtractor

Extract frames from videos in Python using OpenCV.

python opencv frames extractor videos opencv-python

Updated Jun 14, 2023
Python

hxz393 / BrutalityExtractor

Multiprocess decompression software for high-performance systems

scalable optimization high-performance extractor parallel-computing decompression brute-force data-processing parallel-processing performance-testing performance-optimization brute-force-attack efficient-compression-tool computational-efficiency performance-enhancement brute-force-techniques parallel-decompression high-speed-decompression parallel-optimization brute-force-decompression

Updated Nov 19, 2023
Python

verarong / CommonOcrExtractor

Visual custom OCR templates, structured data extraction, general-purpose invoice OCR post-processing, and mask correction

ocr extractor invoice

Updated May 12, 2021
Python

nabinkhadka / readable-content

Collect actual content of any article, blog, news, etc.

python extractor python3 readability readability-lxml

Updated May 24, 2020
Python

PROxZIMA / DarkSpider

Anatomy and visualization of the network structure of the dark web using a multi-threaded crawler

github python github-pages crawler scraper osint extractor tor networkx onion collaborate hacktoberfest dark-web

Updated Mar 9, 2023
Python

verarong / invoice_ocr

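maskrcnn segmentation, angle (rotation direction) detection, AdvanceEast text-box localization, and dense recognition, including the complete post-processing project and serving deployment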

docker ocr docker-compose extractor invoice serving

Updated May 12, 2021
Python
