This script converts .pdf files to .txt files.
$ sudo apt install -y python3-pip
$ sudo pip3 install --upgrade pip
$ sudo pip3 install argparse
$ sudo pip3 install xlsxwriter
$ sudo pip3 install numpy
$ sudo pip3 install pandas
$ sudo pip3 install colorama
To clone and run this application, you'll need Git installed on your computer. From your command line:
# Clone this repository
$ git clone https://github.com/glenjasper/pdf2txt.git
# Go into the repository
$ cd pdf2txt
# Run the app
$ python3 pdf2txt.py --help
You can download the latest installable version of pdf2txt.
- XpdfReader: Xpdf is a free PDF viewer and toolkit, including a text extractor, image converter, HTML converter, and more. Most of the tools are available as open source.
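The README does not say how the script uses Xpdf, so the following is only a minimal sketch. It assumes the conversion is delegated to Xpdf's `pdftotext` command-line tool, invoked once per file as `pdftotext input.pdf output.txt`. The helper names `pdftotext_command` and `convert_folder` are hypothetical and are not taken from pdf2txt.py.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, output_dir):
    """Build the pdftotext call for one PDF (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    # The output file keeps the PDF's base name with a .txt extension.
    txt_path = Path(output_dir) / (pdf_path.stem + ".txt")
    return ["pdftotext", str(pdf_path), str(txt_path)]


def convert_folder(folder_pdf, output_dir):
    """Convert every .pdf in folder_pdf, assuming pdftotext is on PATH."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install the Xpdf tools first")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    converted = []
    for pdf in sorted(Path(folder_pdf).glob("*.pdf")):
        cmd = pdftotext_command(pdf, out)
        # check=True raises CalledProcessError if pdftotext fails on a file.
        subprocess.run(cmd, check=True)
        converted.append(Path(cmd[-1]))
    return converted
```

Running `shutil.which("pdftotext")` first is a quick way to confirm that the Xpdf tools are installed before starting a batch.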
$ python3 pdf2txt.py --help
usage: pdf2txt.py [-h] -f FOLDER_PDF [-o OUTPUT] [--version]

This script converts .pdf files to .txt files.

optional arguments:
  -h, --help            show this help message and exit
  -f FOLDER_PDF, --folder_pdf FOLDER_PDF
                        Folder that contains all .pdf files
  -o OUTPUT, --output OUTPUT
                        Output folder
  --version             show program's version number and exit
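For readers who want to see how such an interface is typically declared, here is a minimal `argparse` sketch that matches the usage line above. It is an illustration and not the script's actual source: the function name `build_parser` is hypothetical, and the version string is a placeholder because the README does not state one.

```python
import argparse


def build_parser():
    """Declare the options shown in the help text (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="pdf2txt.py",
        description="This script converts .pdf files to .txt files.",
    )
    # -f is required: the usage line shows it without square brackets.
    parser.add_argument("-f", "--folder_pdf", required=True,
                        help="Folder that contains all .pdf files")
    # -o is optional; args.output is None when it is omitted.
    parser.add_argument("-o", "--output", help="Output folder")
    # Placeholder version string, not the real pdf2txt version.
    parser.add_argument("--version", action="version",
                        version="%(prog)s 0.0.0")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-f", "pdfs", "-o", "txt"])
print(args.folder_pdf, args.output)
```

Because `-f` is declared with `required=True`, running the parser without it exits with a usage error, which is consistent with the help text above.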
Thank you!
- Molecular and Computational Biology of Fungi Laboratory (LBMCF, ICB - UFMG, Belo Horizonte, Brazil).
This project is licensed under the GNU General Public License v3.0; see the LICENSE file for details.