A simple Python tool that scrapes links from PDF files
Clone the project
git clone https://github.com/umegbewe/pdf-link-extract
Install the pikepdf and PyMuPDF Python libraries
pip3 install pikepdf PyMuPDF
Go to the project directory
cd pdf-link-extract
Specify the PDF to scan by setting the file variable on line 3 of the script you want to run
file = "(pdfname).pdf"
Run:
python3 pdflinkscraper1.py
or:
python3 pdflinkscraper2.py
#
pdflinkscraper1.py extracts only the clickable links in the PDF, which makes it the more accurate of the two scripts.
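A minimal sketch of how clickable links can be read from a PDF, assuming the script relies on pikepdf and that each link is stored as a page annotation with a /A action holding a /URI entry. The function names iter_link_uris and extract_clickable_links are hypothetical and are not taken from the repository.

```python
def iter_link_uris(page_dicts):
    """Yield the /URI target of every link annotation found on the given pages.

    page_dicts is any iterable of dictionary-like page objects that support
    .get(), for example the .obj of each pikepdf page.
    """
    for page in page_dicts:
        # A page without annotations has no /Annots entry at all.
        for annot in page.get("/Annots") or []:
            action = annot.get("/A")
            if action is None:
                continue  # not an action annotation (e.g. a comment)
            uri = action.get("/URI")
            if uri is not None:
                yield str(uri)


def extract_clickable_links(path):
    """Hypothetical helper: open a PDF with pikepdf and list its link URIs."""
    import pikepdf  # imported here so iter_link_uris works without pikepdf

    with pikepdf.open(path) as pdf:
        return list(iter_link_uris(page.obj for page in pdf.pages))
```

Calling extract_clickable_links("(pdfname).pdf") would return the link targets as a list of strings. Because it reads the link annotations rather than the visible text, it does not pick up URLs that are printed on the page but are not clickable.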
#
pdflinkscraper2.py extracts links by matching the PDF text against a specified regex [Check pdflinkscraper2.py line 5]
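A minimal sketch of the regex approach, assuming the page text is read with PyMuPDF (imported as fitz). The pattern below is a simple illustrative one and is not the regex used in the script; URL_PATTERN, find_urls and extract_text_links are hypothetical names.

```python
import re

# Illustrative pattern only: matches http(s) URLs up to whitespace,
# a quote, an angle bracket, or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def find_urls(text):
    """Return every substring of text that looks like an http(s) URL."""
    return URL_PATTERN.findall(text)


def extract_text_links(path):
    """Hypothetical helper: scan the text of each page for URLs."""
    import fitz  # PyMuPDF, imported here so find_urls works without it

    urls = []
    with fitz.open(path) as doc:
        for page in doc:
            urls.extend(find_urls(page.get_text()))
    return urls
```

This approach also finds URLs that are printed in the text without being clickable, but it can miss links that wrap across lines and may include trailing punctuation, which is one reason the annotation-based script tends to be more accurate.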
#
You must have at least Python 3 and pip installed