PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Updated Jun 12, 2024 - Python
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
A simple resume parser used for extracting information from resumes
Extract data from HTML tables
Extract colors from an image. Colors are grouped based on visual similarities using the CIE76 formula.
Get lyrics for any song by passing in the song name (spelled correctly or misspelled), in less than 2 seconds, using this Python library.
This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.
Unofficial Python client for Twitter
Extract audio and other data from the Digitech Trio Plus guitar pedal's SD card
Extract structured data from any unstructured web page
Assorted Python utility scripts to help automate mundane, repetitive tasks. Useful for performance testers, data scientists, or anyone who wants to automate such tasks in Python.
Extract data from Octopus mdict (*.mdd, *.mdx) files
A Python module for reading data from a plot provided as an SVG file.
This is a library for making batch requests to the Google Analytics Core Reporting API v3 and extracting data from a Google Analytics property into Python 3 data structures.
A simple UI tool to batch crop images to prepare datasets from images and videos.
A toolkit for element extraction and visualization for the Waymo Open Dataset
Singer Tap for dbt API v2 built with the Meltano SDK
This program parses an NCBI GenBank file to create a tabulated CSV file.
This repository takes an *.xlsx file that contains a pivot table with hidden external source data and converts the pivot cache into CSV. It handles files too large to fit in memory by dividing the original data into n batches.
Extract emails and phone numbers from a list of URLs