marciohssilveira / observatorio

Organize news excerpts from PDF files

regex noticias pdfplumber relacoes-internacionais

Updated Oct 3, 2023
Jupyter Notebook

jaspreetsidhu3 / text_to_mp3-audiobook

Convert PDF into an audiobook.

python mp3 audiobook pycharm jelly texttospeech audiobooks college-project gtts pythoncoding pdfplumber pdftomp3 texttomp3 textplayer pdfplayer audiobookmaker

Updated Nov 9, 2020
Python

renan-siqueira / python-pdf-tool

This project extracts text from PDF files using various Python libraries. It is designed to be flexible: you can choose among different text extraction libraries, and it supports both a single PDF file and a directory containing multiple PDF files.

python pdf mit-license pdf-to-text pypdf2 pdf-extractor pdfminer pymupdf pdfplumber

Updated Nov 18, 2023
Python

YuCheng21 / course-extract

Extract course structure planning tables

pdfplumber

Updated Aug 22, 2021
Python

wolfsbane9513 / knowledgegraph

Create a knowledge graph from PDFs

nlp python3 knowledge-graph pdfplumber

Updated Jul 5, 2020
Jupyter Notebook

nezamtrm / Extracting-contents-of-a-table-in-pdf-file-by-pdfplumber

python table text-extraction text-processing pdfplumber text-extrction-from-pdf

Updated Sep 15, 2023
Jupyter Notebook

department-of-veterans-affairs / DAPM-PFAS-PACT-ACT

Scrapes hazardous waste data from a website and a PDF file for the PACT Act. Cleans the data to prepare it for mapping.

json-api plotly pandas fuzzy-matching matplotlib spatial-data fuzzywuzzy webscraping geopandas pdf-document-processor urllib3 pandas-python pdf-mining pdfplumber pdf-miner webscraping-data requests-library-python

Updated Jan 26, 2024
Jupyter Notebook

Pevicsanch / project-data-of-the-territorial-division-of-Barcelona

Collects data from the Barcelona City Hall Open Data Service on socioeconomic indicators of the territorial division of the city of Barcelona

python api data-science database opendata regular-expression pandas api-rest barcelona extract-data datamining pdfplumber

Updated Nov 20, 2022
Jupyter Notebook

avr2002 / CV-JD-Matching

Extracts details from resumes (CVs) and matches them with job descriptions (JDs) using a pretrained model such as DistilBERT, ranking them by cosine similarity.

python nlp regex pdf-document-processor distilbert pdfplumber huggingface-transformers

Updated Sep 18, 2023
Jupyter Notebook

tushark01 / InfoExtract

Chat with or query your pdf, txt, csv, and docs files, as well as content from blog links.

python openai pdfplumber langchain

Updated Mar 18, 2024
Python

eli64s / pypdf

Common Python PDF parsing utilities 📑

python pdf pdf-document pdf-generation pypdf2 python-pdfkit python-pdf pdfreader pdfplumber pdf-python

Updated Jun 29, 2023
Python

AAC-Open-Source-Pool / Text-Summarization-and-information-extraction

Interface for extracting information from the web through scraping and summarizing the given data.

nlp spacy beautifulsoup4 pdfplumber

Updated Jan 1, 2024
Python

davidepaci / linea138

Web app to query the Cosenza bus timetable

nodejs javascript sqlite pandas python3 tailwindcss pdfplumber

Updated Oct 1, 2024
EJS

bekbolsky / pdf-extract-to-excel

Extracts text from selected pages of a PDF file and inserts it line by line into a table in an empty Excel workbook

pdf openpyxl pdfplumber

Updated Oct 27, 2020
Python

himanshu-kalundia / resume-screening

This is a resume screening web app. The user can upload a resume in PDF format. The app will go through all the text in the resume and find the best possible job role for the user among a set of roles.

python nlp knn-classification tfidf-vectorizer streamlit pdfplumber

Updated Jan 19, 2024
Jupyter Notebook

praveen2410-pk / PDF_Comparsion

This repository contains a Python script for comparing PDF files between a local source folder and a remote server. The script logs results, highlighting identical and non-identical files based on size and page count. It employs "pdfplumber" for PDF handling and "paramiko" for SSH connections.

python3 paramiko pdfcompare pdfplumber

Updated Jan 22, 2024
Python

VaibhavDongre1311 / End_to_end_Resume_Classification__project

Business objective: the document classification solution should significantly reduce manual effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.