PDF-Link-Extract

A simple Python tool that scrapes links from PDF files.

Setup

Clone the project

  git clone https://github.com/umegbewe/pdf-link-extract

Install the pikepdf and PyMuPDF Python libraries

  pip3 install pikepdf PyMuPDF

Go to the project directory

  cd pdf-link-extract

Specify the PDF to scan by editing line 3 of the script you want to run

  file = "(pdfname).pdf"

Run:

  python3 pdflinkscraper1.py

or:

  python3 pdflinkscraper2.py

# pdflinkscraper1.py extracts only clickable links, which is more accurate.
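
As an illustration of what extracting clickable links can look like, here is a minimal sketch using pikepdf. It is not the code in pdflinkscraper1.py; the function names `uris_from_annotations` and `extract_clickable_links` are hypothetical, and the sketch assumes links are stored as page annotations whose `/A` action dictionary carries a `/URI` entry.

```python
def uris_from_annotations(annotations):
    """Collect the /URI target of every annotation that has one."""
    if annotations is None:  # a page without an /Annots entry
        return []
    uris = []
    for annot in annotations:
        action = annot.get("/A")  # action dictionary; absent on non-link annotations
        if action is None:
            continue
        uri = action.get("/URI")  # absent on internal "go to page" actions
        if uri is not None:
            uris.append(str(uri))
    return uris


def extract_clickable_links(path):
    """Return the clickable URLs found in the PDF at `path` (hypothetical helper)."""
    import pikepdf  # imported lazily so the helper above can be used without it

    links = []
    with pikepdf.open(path) as pdf:
        for page in pdf.pages:
            links.extend(uris_from_annotations(page.get("/Annots")))
    return links
```

Calling `extract_clickable_links("(pdfname).pdf")` would return a list of URL strings. Because it only reads link annotations, it skips URLs that merely appear as text on the page.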

# pdflinkscraper2.py extracts links by matching a specified regex (see line 5 of pdflinkscraper.py).
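
The regex approach can be sketched roughly as follows. This is an assumption-laden example, not the project's script: `URL_PATTERN` is a deliberately simple placeholder rather than the regex used in the repository, and `find_urls` and `extract_links_by_regex` are hypothetical names. It assumes PyMuPDF is used to pull the page text.

```python
import re

# Placeholder pattern (an assumption): match http/https URLs up to the next
# whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_urls(text):
    """Return URLs found in `text`, trimming trailing punctuation."""
    return [match.rstrip(".,;:)") for match in URL_PATTERN.findall(text)]


def extract_links_by_regex(path):
    """Return URLs found in the text of the PDF at `path` (hypothetical helper)."""
    import fitz  # PyMuPDF; imported lazily so find_urls works without it

    urls = []
    with fitz.open(path) as doc:
        for page in doc:
            urls.extend(find_urls(page.get_text()))
    return urls
```

A text search like this also catches URLs that are printed but not clickable. It can miss or truncate links that wrap across lines, which is one reason the annotation-based approach tends to be more accurate.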

# You must have at least Python 3 and pip installed.
