A simple PDF scraper, which I used to create study guides based on lecture materials.
Usage is simple: place the PDF files you want to scrape in an input folder with a name of your choosing, then run the following command:
python scraper.py
You will be prompted to enter the name of the folder containing the PDF files as well as the name of the output file. Then, the program will extract the text from the PDFs and save it to a text file in the working directory.
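For readers who want a picture of what this step involves, here is a minimal sketch of folder-to-text extraction. It is not the actual contents of scraper.py: the function names scrape_folder and extract_with_pypdf are hypothetical, and the use of the pypdf library is an assumption, since this README does not say which PDF library the script relies on.

```python
from pathlib import Path
from typing import Callable


def extract_with_pypdf(pdf_path: Path) -> str:
    """Extract text from one PDF with pypdf (an assumed library: pip install pypdf)."""
    from pypdf import PdfReader  # imported lazily so the rest works without pypdf

    reader = PdfReader(str(pdf_path))
    # extract_text() may yield nothing for image-only pages, hence the `or ""`
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def scrape_folder(folder: str, output_file: str,
                  extract: Callable[[Path], str] = extract_with_pypdf) -> int:
    """Write the text of every PDF in `folder` to `output_file`; return the PDF count."""
    # Pick up .pdf files regardless of extension case, in a stable order
    pdf_paths = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".pdf")
    with open(output_file, "w", encoding="utf-8") as out:
        for pdf_path in pdf_paths:
            # A header line per source file (a hypothetical format choice)
            out.write(f"=== {pdf_path.name} ===\n")
            out.write(extract(pdf_path).strip() + "\n\n")
    return len(pdf_paths)
```

Under these assumptions, calling scrape_folder("input", "notes.txt") would write the combined text of every PDF in the input folder to notes.txt in the working directory. The extract parameter is only there so a different PDF library can be swapped in.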
Cleanup then prompts for the name of the input file (the text file produced by the scraper) along with the name for the output file, and uses regular expressions and string manipulation to clean up the text and format it into a more readable study guide. The cleanup script should be modified to suit your needs. Running cleanup is optional but recommended. The command to run cleanup is as follows:
python cleanup.py
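As a starting point for adapting the cleanup to your own material, the sketch below shows the kind of regex rules such a step might apply. The function name clean_text and every rule in it are illustrative assumptions, not the rules cleanup.py actually uses.

```python
import re


def clean_text(raw: str) -> str:
    """Apply a few generic cleanup rules to text extracted from PDFs."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words split across a line break: "lec-\nture" -> "lecture".
    # Note this also joins genuinely hyphenated words that wrap at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop lines holding only a page number, such as "12" or "Page 12"
    text = re.sub(r"(?im)^[ \t]*(?:page[ \t]+)?\d+[ \t]*$\n?", "", text)
    # Collapse runs of spaces and tabs into a single space
    text = re.sub(r"[ \t]+", " ", text)
    # Strip trailing spaces at the end of each line
    text = re.sub(r"(?m) +$", "", text)
    # Collapse three or more newlines into a single blank line
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"
```

Rules like the page-number filter are deliberately blunt and will also remove any content line that consists of a bare number, so it is worth checking the output against your own lecture materials before relying on it.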