AUTO-PDF-Scraper

Python scripts to extract text from PDFs, save it as a text file, export a list of words and their frequencies to a CSV file for further analysis, extract dates from the text, and graph the text's parts of speech.
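As a rough illustration of the word-frequency export described above, here is a minimal sketch using only the standard library. It is not taken from the repository's scripts: the function names `word_frequencies` and `write_frequencies_csv`, the tokenizing regex, and the CSV column names are all assumptions made for this example.

```python
import csv
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase words in text (hypothetical helper, not from the repo)."""
    # Assumed tokenization: runs of letters, optionally with one inner apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)


def write_frequencies_csv(counts, path):
    """Write word/count rows to a CSV file, most frequent words first."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "count"])  # assumed header names
        writer.writerows(counts.most_common())
```

Calling `write_frequencies_csv(word_frequencies(text), "frequencies.csv")` on already-extracted text would produce a file that opens directly in a spreadsheet or loads into a data-analysis tool.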

Standalone versions of the part of speech grapher and the date scraper can be found here and here, respectively.
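For a sense of what date scraping can look like, the sketch below pulls a few common date formats out of text with a regular expression. The pattern, the name `extract_dates`, and the set of supported formats are assumptions for illustration; the actual date scraper may work differently and recognize other formats.

```python
import re

# Assumed formats: 4/1/2016, 2015-03-10, and "March 3, 2015".
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_PATTERN = re.compile(
    r"\b(?:\d{1,2}/\d{1,2}/\d{2,4}"
    r"|\d{4}-\d{2}-\d{2}"
    r"|(?:" + MONTHS + r")\s+\d{1,2},\s+\d{4})\b"
)


def extract_dates(text):
    """Return every substring matching one of the assumed date formats."""
    return DATE_PATTERN.findall(text)
```

A regex like this only finds dates written in the listed shapes and does not validate them (for example, it would accept `13/45/2016`), so results may need a parsing or sanity-check step afterwards.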

To Use:

  • Download the scripts in the "scripts" folder
  • Place the PDF files you'd like to scrape in the same folder as the scripts
  • Run pdf_scraper.py

Dependencies