pdf2txt

This script converts .pdf files to .txt files.

Table of contents

Pre-requisites
- Python libraries
Installation
- Clone
- Download
Built With
How To Use
Author
Organization
License

Pre-requisites

Python libraries

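  # Install pip for Python 3 and upgrade it
  $ sudo apt install -y python3-pip
  $ sudo pip3 install --upgrade pip

  # Install the required libraries
  # (argparse ships with the Python 3 standard library, so it needs no separate install)
  $ sudo pip3 install xlsxwriter
  $ sudo pip3 install numpy
  $ sudo pip3 install pandas
  $ sudo pip3 install colorama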

Installation

Clone

To clone and run this application, you'll need Git installed on your computer. From your command line:

  # Clone this repository
  $ git clone https://github.com/glenjasper/pdf2txt.git

  # Go into the repository
  $ cd pdf2txt

  # Run the app
  $ python3 pdf2txt.py --help

Download

You can download the latest installable version of pdf2txt.

Built With

XpdfReader: Xpdf is a free PDF viewer and toolkit, including a text extractor, image converter, HTML converter, and more. Most of the tools are available as open source.
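
As a rough illustration of how a folder of PDFs can be passed through Xpdf's text extractor, here is a minimal sketch. It is not the actual code of pdf2txt.py: the function names `plan_conversions` and `convert_folder` are hypothetical, and it assumes a `pdftotext` executable is reachable on your PATH (or that you pass its path explicitly).

```python
import subprocess
from pathlib import Path


def plan_conversions(folder_pdf, output):
    """Pair every .pdf file in folder_pdf with a .txt path inside output."""
    src = Path(folder_pdf)
    dst = Path(output)
    pdfs = sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    return [(pdf, dst / (pdf.stem + ".txt")) for pdf in pdfs]


def convert_folder(folder_pdf, output, pdftotext="pdftotext"):
    """Run pdftotext once per PDF (assumption: the binary is installed)."""
    Path(output).mkdir(parents=True, exist_ok=True)
    for pdf, txt in plan_conversions(folder_pdf, output):
        # Xpdf command form: pdftotext [options] PDF-file [text-file]
        subprocess.run([pdftotext, str(pdf), str(txt)], check=True)
```

Separating the planning step from the conversion step makes it possible to check which files would be written before any external program is run.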

How To Use

  $ python3 pdf2txt.py --help
  usage: pdf2txt.py [-h] -f FOLDER_PDF [-o OUTPUT] [--version]

  This script converts .pdf files to .txt files.

  optional arguments:
    -h, --help            show this help message and exit
    -f FOLDER_PDF, --folder_pdf FOLDER_PDF
                          Folder that contains all .pdf files
    -o OUTPUT, --output OUTPUT
                          Output folder
    --version             show program's version number and exit

  Thank you!
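
The help text above can be reproduced with a small argparse setup. The following is a sketch reconstructed from that output only, not the script's actual source; the function name `build_parser` is hypothetical and the version string is a placeholder, since the real version number is not shown.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="pdf2txt.py",
        description="This script converts .pdf files to .txt files.",
        epilog="Thank you!",
    )
    # -f is required: the usage line shows it without square brackets
    parser.add_argument("-f", "--folder_pdf", required=True,
                        help="Folder that contains all .pdf files")
    parser.add_argument("-o", "--output", help="Output folder")
    # Placeholder version string (assumption)
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version unknown)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.folder_pdf, args.output)
```

Note that although `-f` appears under "optional arguments" in the help output, the usage line marks it as mandatory. A typical call would therefore look like `python3 pdf2txt.py -f ./pdfs -o ./txt`, where both folder names are examples.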

Author

Glen Jasper

Organization

Molecular and Computational Biology of Fungi Laboratory (LBMCF, ICB - UFMG, Belo Horizonte, Brazil).

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.
