pdf-to-excel-automation

Developed a Python automation script at Pinchin Ltd. that extracted data from PDFs and exported it to Excel, significantly reducing manual data entry time.

PDF to Excel Automation

This project automates the workflow of extracting structured data from PDF test results and writing it into a clean Excel template for analysis and storage.

It was originally built to reduce manual data entry time for laboratory staff by parsing common report formats and writing the results into spreadsheets.

Features

Extracts tabular or semi-structured data from PDFs
Maps extracted fields to a defined Excel layout
Skips duplicate or malformed entries to protect data integrity
Logs processing results (success, skipped, errors)
Designed to be extended for new PDF templates
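
The skip-and-log behaviour listed above could look roughly like the sketch below. It is not the project's actual code: the function name `filter_rows`, the logger name, and the field names are hypothetical, and counting malformed rows under "errors" and duplicates under "skipped" is an assumption made for illustration.

```python
import logging
from collections import Counter

logger = logging.getLogger("pdf_to_excel")  # hypothetical logger name


def filter_rows(rows, required_fields, key_field):
    """Return (kept_rows, counts), dropping malformed and duplicate rows.

    A row is treated as malformed when a required field (or the key field)
    is missing or blank, and as a duplicate when its key was already seen.
    """
    fields = sorted(set(required_fields) | {key_field})
    seen = set()
    kept = []
    counts = Counter(success=0, skipped=0, errors=0)

    for row in rows:
        # Collect every required field that is absent or empty.
        missing = [
            f for f in fields
            if row.get(f) is None or str(row.get(f)).strip() == ""
        ]
        if missing:
            counts["errors"] += 1
            logger.warning("Malformed row, missing %s: %r", missing, row)
            continue

        key = row[key_field]
        if key in seen:
            counts["skipped"] += 1
            logger.info("Duplicate row skipped: %r", key)
            continue

        seen.add(key)
        kept.append(row)
        counts["success"] += 1

    return kept, dict(counts)
```

Returning the counts alongside the kept rows lets the caller write a single summary line per PDF once processing finishes.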

Tech Stack

Python 3.x
PDF parsing: pdfplumber
Excel writing: openpyxl
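
As a rough illustration of how these two libraries fit together, the sketch below reads tables with pdfplumber and writes mapped values with openpyxl. The helper names (`map_row`, `extract_rows`, `write_rows`), the layout dictionary, and the assumption that the first table row is a header are all hypothetical; the real extraction logic lives in `mold_processing.py` and may differ.

```python
def map_row(header, row, layout):
    """Map one extracted table row to {column_letter: value}.

    `layout` maps a PDF field name to an Excel column letter,
    e.g. {"Sample ID": "A", "Result": "C"} (hypothetical fields).
    """
    record = dict(zip(header, row))
    return {col: record.get(field) for field, col in layout.items()}


def extract_rows(pdf_path):
    """Yield (header, row) pairs from the first table on each page."""
    import pdfplumber  # imported lazily so map_row works without it

    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            table = page.extract_table()  # list of rows, or None
            if not table:
                continue
            # Assumption: the first row of each table is its header.
            header, *body = table
            for row in body:
                yield header, row


def write_rows(mapped_rows, xlsx_path, start_row=2):
    """Write mapped rows into a new workbook, one spreadsheet row each."""
    from openpyxl import Workbook  # imported lazily, as above

    wb = Workbook()
    ws = wb.active
    for offset, cells in enumerate(mapped_rows):
        for col, value in cells.items():
            ws[f"{col}{start_row + offset}"] = value
    wb.save(xlsx_path)
```

Keeping the field-to-column mapping in a plain dictionary is one way to support new PDF templates: a new template would only need a new layout, not new writing code.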

Project Structure

src/
  main.py             # CLI / entry point
  mold_processing.py  # PDF extraction logic
  testing.py
samples/
  Example.xlsx
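
A CLI entry point such as `main.py` might be wired up along the lines below. This is only a sketch: the argument names, the `-o/--output` flag, and the default output filename are assumptions, not the script's documented interface.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build the argument parser (all argument names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Extract data from PDF reports into an Excel workbook."
    )
    parser.add_argument(
        "pdfs", nargs="+", type=Path, help="one or more PDF reports to process"
    )
    parser.add_argument(
        "-o", "--output", type=Path, default=Path("output.xlsx"),
        help="Excel file to write (default: output.xlsx)",
    )
    return parser


def main(argv=None):
    """Parse arguments and report what would be processed."""
    args = build_parser().parse_args(argv)
    for pdf in args.pdfs:
        # The real script would call its extraction logic here.
        print(f"Would process {pdf} -> {args.output}")
    return 0
```

With a layout like this, calling `main(["report.pdf", "-o", "results.xlsx"])` exercises the same path as running the script from a shell.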

ConnorDBooth/pdf-to-excel-automation
