pdf2txt

This script converts .pdf files to .txt files.

Pre-requisites

Python libraries

  $ sudo apt install -y python3-pip
  $ sudo pip3 install --upgrade pip
  $ sudo pip3 install argparse
  $ sudo pip3 install xlsxwriter
  $ sudo pip3 install numpy
  $ sudo pip3 install pandas
  $ sudo pip3 install colorama

Installation

Clone

To clone and run this application, you'll need Git installed on your computer. From your command line:

  # Clone this repository
  $ git clone https://github.com/glenjasper/pdf2txt.git

  # Go into the repository
  $ cd pdf2txt

  # Run the app
  $ python3 pdf2txt.py --help

Download

You can download the latest installable version of pdf2txt.

Built With

  • XpdfReader: Xpdf is a free PDF viewer and toolkit, including a text extractor, image converter, HTML converter, and more. Most of the tools are available as open source.
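Because the script is built on Xpdf, one plausible way to perform the conversion is to call Xpdf's `pdftotext` command-line tool once per PDF. The sketch below is an assumption about the general approach, not the script's actual code: the helper names `build_commands` and `convert_folder` are hypothetical, and it assumes `pdftotext` is installed and available on the PATH.

```python
import subprocess
from pathlib import Path


def build_commands(folder_pdf, output):
    """Return one pdftotext command per .pdf file found in folder_pdf."""
    commands = []
    for pdf in sorted(Path(folder_pdf).glob("*.pdf")):
        txt = Path(output) / (pdf.stem + ".txt")
        # pdftotext takes the input PDF followed by the output text file.
        commands.append(["pdftotext", str(pdf), str(txt)])
    return commands


def convert_folder(folder_pdf, output):
    """Run pdftotext on every .pdf in folder_pdf (assumes it is installed)."""
    Path(output).mkdir(parents=True, exist_ok=True)
    for command in build_commands(folder_pdf, output):
        # check=True raises CalledProcessError if pdftotext fails on a file.
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it possible to inspect what would run before any file is written.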

How To Use

  $ python3 pdf2txt.py --help
  usage: pdf2txt.py [-h] -f FOLDER_PDF [-o OUTPUT] [--version]

  This script converts .pdf files to .txt files.

  optional arguments:
    -h, --help            show this help message and exit
    -f FOLDER_PDF, --folder_pdf FOLDER_PDF
                          Folder that contains all .pdf files
    -o OUTPUT, --output OUTPUT
                          Output folder
    --version             show program's version number and exit

  Thank you!
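The help text above implies a small `argparse` interface with a required `-f/--folder_pdf` option and an optional `-o/--output` option. The following is a minimal sketch of a parser that would accept the same options; it is a reconstruction from the usage line only, not the script's source. The function name `build_parser`, the version placeholder, and the folder names `pdfs` and `txt` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options shown in the help output."""
    parser = argparse.ArgumentParser(
        prog="pdf2txt.py",
        description="This script converts .pdf files to .txt files.",
        epilog="Thank you!",
    )
    parser.add_argument("-f", "--folder_pdf", required=True,
                        help="Folder that contains all .pdf files")
    parser.add_argument("-o", "--output", help="Output folder")
    # Placeholder version string: the real value is not stated in this README.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version unknown)")
    return parser


# Equivalent to: python3 pdf2txt.py -f pdfs -o txt
args = build_parser().parse_args(["-f", "pdfs", "-o", "txt"])
print(args.folder_pdf, args.output)
```

In this sketch, leaving out `-f` makes `argparse` print the usage line and exit, while leaving out `-o` sets `args.output` to `None`. What the real script does when no output folder is given is not documented here.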

Author

Organization

License

This project is licensed under the GNU General Public License v3.0; see the LICENSE file for details.
