The goal of this project is to collect a large dataset of PDF documents.

py-pdf/pdf-crawler

pdf-crawler

The goal of pdf-crawler is to download PDF files from web pages for testing PyPDF2.

Install

pip install -r requirements.txt

Usage

It's organized as a set of mostly isolated scripts, e.g.

python crawl.py

starts downloading PDF documents.
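
The source does not show what crawl.py does internally, so the following is only a minimal sketch of one step such a crawler might perform: finding links to PDF files in a page's HTML. The names LinkCollector and find_pdf_links are hypothetical and are not taken from this repository; the sketch uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_pdf_links(html, base_url):
    """Return absolute URLs of links whose path ends in .pdf (case-insensitive)."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links against the URL of the page they came from.
        absolute = urljoin(base_url, href)
        # Check only the path, so query strings such as ?x=1 are ignored.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            links.append(absolute)
    return links
```

Each returned URL could then be fetched and saved in a separate step, for example with urllib.request. Matching on the file extension is a simplification: PDFs served from URLs without a .pdf suffix would be missed.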
