Textify-PDF

Textify-PDF is a Python script that extracts the text from every PDF file in a specified folder and saves each result as a .txt file. It uses the Tika library for the extraction.

Installation

Clone the repository or download the ZIP file and extract it to a folder.
Install the required Python libraries using pip: pip install -r requirements.txt

Usage

Open a terminal or command prompt in the folder where you extracted the files.
Run the script using the command: python main.py
Enter the path to the folder containing the PDF files when prompted, e.g. 'D:\CODES\textify-pdf'

The script will extract text from all PDF files in the specified folder and save them as .txt files in a "txt" subfolder. It also generates a zip file containing all the processed .txt files.
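The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of main.py: the function names `tika_extract` and `textify_folder` and the archive name `txt_files.zip` are assumptions made for the example. The only Tika call used is `parser.from_file`, which needs the `tika` package and a Java runtime; the extractor is passed in as a parameter so the folder logic can be tried without Tika installed.

```python
import zipfile
from pathlib import Path
from typing import Callable


def tika_extract(pdf_path: Path) -> str:
    """Extract text with Tika (needs the `tika` package and a Java runtime)."""
    from tika import parser  # imported lazily so the rest works without Tika

    parsed = parser.from_file(str(pdf_path))
    # Tika reports the text under "content"; it is None when nothing was found.
    return parsed.get("content") or ""


def textify_folder(folder: str, extract: Callable[[Path], str] = tika_extract) -> Path:
    """Write one .txt per PDF into <folder>/txt, then zip the .txt files."""
    source = Path(folder)
    out_dir = source / "txt"
    out_dir.mkdir(exist_ok=True)

    written = []
    for pdf_path in sorted(source.glob("*.pdf")):
        txt_path = out_dir / (pdf_path.stem + ".txt")
        txt_path.write_text(extract(pdf_path), encoding="utf-8")
        written.append(txt_path)

    zip_path = source / "txt_files.zip"  # hypothetical archive name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for txt_path in written:
            archive.write(txt_path, arcname=txt_path.name)
    return zip_path
```

Under these assumptions, calling `textify_folder(r"D:\CODES\textify-pdf")` would process that folder with Tika and return the path of the generated zip file.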

New feature: The script now handles password-protected PDF files. When one is encountered, the script skips it and logs a warning message, then continues with the remaining files.
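One way to get this skip-and-warn behaviour is to wrap the extraction call, as in the sketch below. It is only an illustration: the helper `extract_or_skip` and the logger name `textify_pdf` are hypothetical, and how an extractor signals a locked file (raising an exception or returning no text) is an assumption, so both cases are treated as a skip.

```python
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger("textify_pdf")  # hypothetical logger name


def extract_or_skip(pdf_path: Path, extract: Callable[[Path], str]) -> Optional[str]:
    """Return the extracted text, or None if the file had to be skipped."""
    try:
        text = extract(pdf_path)
    except Exception as exc:  # assumption: the extractor may raise on a locked file
        logger.warning("Skipping %s: %s", pdf_path.name, exc)
        return None
    if not text or not text.strip():
        # Assumption: the extractor may instead return no text for a locked file.
        logger.warning("Skipping %s: no text could be extracted", pdf_path.name)
        return None
    return text
```

A caller can then write a .txt file only when the return value is not None, so one unreadable PDF does not stop the rest of the folder from being processed.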

Usage screenshot Samples

License

Textify-PDF is licensed under the MIT License.
