Convert PDF to Excel: the script reads all the data in each PDF and returns an Excel file containing that data.
The script needs two arguments:
first argument: the directory where the PDF files are.
second argument: the directory where the script saves the output Excel files.
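As a rough illustration of how two positional arguments like these could be read, here is a minimal sketch using `argparse`. The names `parse_args`, `source_dir` and `output_dir` are hypothetical and are not taken from main.py; creating the output directory when it is missing is also an assumption.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two positional arguments: PDF directory, then Excel directory."""
    parser = argparse.ArgumentParser(description="Convert PDF files to Excel.")
    # Argument names are hypothetical; only their order comes from the README.
    parser.add_argument("source_dir", type=Path, help="directory containing the PDF files")
    parser.add_argument("output_dir", type=Path, help="directory for the output Excel files")
    args = parser.parse_args(argv)

    # Stop early with a usage error if the source directory does not exist.
    if not args.source_dir.is_dir():
        parser.error(f"source directory not found: {args.source_dir}")

    # Assumption: create the output directory if it is missing.
    args.output_dir.mkdir(parents=True, exist_ok=True)
    return args
```

Calling `parse_args()` with no arguments reads `sys.argv`, so the same function works both from the command line and from a test that passes a list.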
py main.py /source/pdf/path /final/excel/path
py --version
Output: v3.10.5
pip --version
Output: pip 24.2
You need to install the following libraries:
- pandas - Manages the Excel files
- tabula-py - Converts/reads the PDF files
- JPype1 - Needed by tabula-py
Run this command from cmd/bash:
pip install JPype1 tabula-py pandas
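With those libraries installed, the conversion step could look roughly like the sketch below. It is not the actual main.py: the function names, the one-sheet-per-table layout and the `.xlsx` naming are all assumptions. Note also that tabula-py needs a Java runtime, and that pandas writes `.xlsx` files through a separate engine such as openpyxl, which may have to be installed as well.

```python
from pathlib import Path


def excel_path_for(pdf_path: Path, output_dir: Path) -> Path:
    """Map report.pdf to <output_dir>/report.xlsx (naming scheme is an assumption)."""
    return output_dir / (pdf_path.stem + ".xlsx")


def convert_pdf(pdf_path: Path, output_dir: Path):
    """Write every table found in one PDF to its own sheet; return the Excel path."""
    # Imported here so the path helper above works without Java installed.
    import pandas as pd
    import tabula

    # read_pdf returns a list of DataFrames, one per detected table.
    tables = tabula.read_pdf(str(pdf_path), pages="all", multiple_tables=True)
    if not tables:
        return None  # nothing to write for this PDF

    target = excel_path_for(pdf_path, output_dir)
    with pd.ExcelWriter(target) as writer:
        for i, table in enumerate(tables, start=1):
            table.to_excel(writer, sheet_name=f"table_{i}", index=False)
    return target


def convert_all(source_dir: Path, output_dir: Path):
    """Convert every *.pdf file in source_dir, in name order."""
    return [convert_pdf(pdf, output_dir) for pdf in sorted(source_dir.glob("*.pdf"))]
```

For example, `convert_all(Path("/source/pdf/path"), Path("/final/excel/path"))` would process the same directories as the command shown earlier.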