This project uses Python to create an audio file from a PDF.

Brandon-Martinez27/python-pdf2audio

Convert PDF to Audiobook with Python

About the Project

Goals

  • Create an audiobook (mp3 file) from a PDF file

Background

An audiobook is simply a book recorded in an audio format, in other words a book that is read aloud. Audiobooks can help improve vocabulary, comprehension, and pronunciation. If you love books but rarely find the time or energy to read them yourself, you can create your own audiobook with a few lines of Python and listen without the subscription fee charged by platforms such as Audible or Scribd.

Deliverables

  • A file.mp3

Acknowledgments

Project Steps

Prerequisite

Python has a large number of modules containing reusable code that performs a desired function when invoked. A module must be installed on your system before you can use it. This project needs just two of them:

  • Install PyPDF2 module, using pip install PyPDF2
  • Install pyttsx3 module, using pip install pyttsx3
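After installing, a quick check can confirm that the interpreter in use actually sees both packages. The sketch below is one way to do that with the standard library only; the helper name missing_modules is made up for illustration and is not part of this project.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Hypothetical usage: report whichever of the two dependencies is absent.
missing = missing_modules(["PyPDF2", "pyttsx3"])
if missing:
    print("Install first with: pip install " + " ".join(missing))
else:
    print("PyPDF2 and pyttsx3 are both available.")
```

If the check reports a missing module even though pip succeeded, the most likely cause is that pip installed into a different interpreter or virtual environment than the one running the script.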

PyPDF2 module

PyPDF2 is a pure-Python library, so it runs on any Python platform without external dependencies. It can work entirely on in-memory file objects (such as io.BytesIO) rather than files on disk. Among other things, PyPDF2 can perform the following tasks:

  • Fetching document information like title, author, etc.
  • Split and merge documents page by page.
  • Merging multiple pages into a single page.
  • Cropping a page to the required ratio.
  • Encryption and decryption of PDF files.
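As an illustration of the first task in the list, the sketch below gathers the page count, title, and author from a reader object. It assumes the PyPDF2 3.x attribute names (reader.pages and reader.metadata); older releases spell these differently. The function name describe_pdf is hypothetical.

```python
def describe_pdf(reader):
    """Collect basic facts from a PyPDF2 reader object.

    Assumes the PyPDF2 3.x names: `reader.pages` and `reader.metadata`.
    """
    info = reader.metadata  # may be None when the PDF has no info dictionary
    return {
        "pages": len(reader.pages),
        "title": getattr(info, "title", None),
        "author": getattr(info, "author", None),
    }


# Hypothetical usage (needs PyPDF2 installed and a real file):
#   import PyPDF2
#   print(describe_pdf(PyPDF2.PdfReader("book.pdf")))
```

Because the function only touches two attributes, it can be tried out with any stand-in object that provides them, which keeps it easy to test without a real PDF.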

pyttsx3 module

pyttsx3 is a popular Python module that takes text as input and produces speech as output. It works offline, which makes it more convenient than text-to-speech modules that need a network connection. It is compatible with both Python 2 and Python 3.

Importing modules

The modules above must be imported before the program can use them. Use one import statement per module: import PyPDF2 and import pyttsx3.

Reading a PDF file

A PDF file must be opened before its contents can be manipulated. Python can perform many operations on a PDF file: reading and writing it, embedding an attachment, adding a bookmark, and so on. We can also retrieve useful information such as the number of pages or the page layout, and fetch an individual page by its page number. All of these operations are handled by the PyPDF2 module. To read a PDF file, use a command such as variable_name = PyPDF2.PdfFileReader(open('file_name', 'rb')). Note that PyPDF2 3.0 and later removed PdfFileReader in favor of PdfReader, so on a current install the equivalent is variable_name = PyPDF2.PdfReader(open('file_name', 'rb')).

  • where variable_name → the name of the variable
  • PdfFileReader( ) → class under PyPDF2 module.
  • open( ) → function used for opening a file.
  • file_name → the name of the file that needs to be opened.
  • rb → mode of a file (opens file in binary format to read)
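The pieces listed above can be put together roughly as follows. This is a minimal sketch that assumes PyPDF2 3.x, where the reader class is named PdfReader; the helper name load_reader and the choice to copy the file into memory are illustrative, not taken from this project.

```python
import io
from pathlib import Path


def load_reader(pdf_path):
    """Open a PDF in binary mode and wrap it in a PyPDF2 reader."""
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such PDF file: {path}")

    import PyPDF2  # third-party; imported here so the check above runs first

    # "rb" = read, binary. Copying the bytes into memory lets the file
    # be closed right away while the reader keeps working on the copy.
    with path.open("rb") as pdf_file:
        buffer = io.BytesIO(pdf_file.read())
    return PyPDF2.PdfReader(buffer)


# Hypothetical usage:
#   readpdf = load_reader("book.pdf")
```

Reading the whole file into memory is fine for typical books; for very large PDFs it may be preferable to keep the file open and pass the file object to the reader directly.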

Initializing speaker

Next, initialize the speech engine so that pyttsx3 can convert text to audio. Use the command speaker = pyttsx3.init() to initialize the speaker.
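Once the speaker exists, its speaking rate and volume can optionally be adjusted through setProperty before any text is queued. The sketch below shows the idea; the helper name configure_speaker and the default values are assumptions chosen for illustration.

```python
def configure_speaker(speaker, rate=150, volume=1.0):
    """Apply optional settings to a pyttsx3 engine and return it."""
    speaker.setProperty("rate", rate)      # speech rate in words per minute
    speaker.setProperty("volume", volume)  # 0.0 (silent) to 1.0 (full)
    return speaker


# Hypothetical usage (needs pyttsx3 and a working speech driver):
#   import pyttsx3
#   speaker = configure_speaker(pyttsx3.init())
```

A slower rate than the engine default tends to be easier to follow for long passages, but the best value depends on the voice installed on the machine.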

Extracting text

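First, fetch a page by its page number; page numbers start at 0. The number of the required page is passed as an argument to the getPage( ) method, and extractText( ) is then called on the returned page. The command can be written as text = readpdf.getPage(pagenumber).extractText(), where readpdf is the reader created in the previous step.

  • getPage(pagenumber) → retrieves the page with the given page number.
  • extractText( ) → extracts the text from that page.

In PyPDF2 3.0 and later these calls are spelled text = readpdf.pages[pagenumber].extract_text().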
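A book usually spans many pages, so the single-page command is typically run in a loop. The following sketch does that under the assumption of PyPDF2 3.x names (reader.pages and page.extract_text()); the function name extract_pages and its first/last parameters are hypothetical.

```python
def extract_pages(reader, first=0, last=None):
    """Join the text of pages first..last-1 (zero-based) into one string."""
    pages = reader.pages
    last = len(pages) if last is None else min(last, len(pages))
    chunks = []
    for number in range(first, last):
        # extract_text() can come back empty, e.g. for scanned pages
        # that contain only images.
        chunks.append(pages[number].extract_text() or "")
    return "\n".join(chunks)


# Hypothetical usage:
#   text = extract_pages(readpdf)         # whole document
#   text = extract_pages(readpdf, 0, 5)   # first five pages only
```

Text extraction quality depends heavily on how the PDF was produced, so it is worth printing a page or two before generating a long audio file.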

Text to Audio

Now pass the extracted text as an argument to the say( ) method of the speaker object created by pyttsx3, which queues the text to be spoken. The command is speaker.say(text), where text is the string extracted in the previous step. The speech is only produced once speaker.runAndWait() is called, so follow say( ) with that call.
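In pyttsx3, say( ) only queues the text; the queue is processed when runAndWait( ) is called. A small wrapper such as the one below keeps the two calls together. The name speak and the empty-text guard are illustrative additions.

```python
def speak(speaker, text):
    """Queue `text` on a pyttsx3 engine and block until it has been spoken.

    Returns False without speaking when there is nothing to say.
    """
    if not text.strip():
        return False
    speaker.say(text)       # queues the utterance
    speaker.runAndWait()    # processes the queue; returns when done
    return True


# Hypothetical usage:
#   import pyttsx3
#   speak(pyttsx3.init(), "Hello from the first page.")
```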

Saving voice to a file

The generated voice can also be saved to an audio file so that it can be played back later. If only a file name is given, the file is written to the current working directory, which is the folder the script is run from.

The command to save the voice to a file is speaker.save_to_file(text, 'filename.mp3'), followed by speaker.runAndWait() so that the queued command is actually processed.

  • speaker → the variable that was initialized earlier.
  • save_to_file(text, 'filename.mp3') → method used for saving an audio file.
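Like say( ), save_to_file( ) only queues work, so nothing is written until runAndWait( ) is called. The sketch below wraps both calls and returns the absolute path so it is clear where the file ended up. The helper name save_speech is hypothetical, and the audio encoding actually written may depend on the platform's speech driver regardless of the file extension.

```python
from pathlib import Path


def save_speech(speaker, text, out_path="filename.mp3"):
    """Render `text` to an audio file and return the file's absolute path."""
    target = Path(out_path).resolve()  # relative names resolve against the cwd
    speaker.save_to_file(text, str(target))  # queues the file output
    speaker.runAndWait()                     # performs the queued work
    return target


# Hypothetical usage:
#   import pyttsx3
#   print(save_speech(pyttsx3.init(), text, "book.mp3"))
```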

How to Reproduce

Steps

  1. Open an IDE such as VS Code
  2. Navigate to your root project folder
  3. Create a virtual environment: python -m venv <folder_name>
  4. Activate the virtual environment: source <folder_name>/bin/activate (macOS/Linux) or <folder_name>\Scripts\activate (Windows)
  5. Install dependencies: pip install -r requirements.txt
  6. Run the main script: python main.py
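For orientation, a script run in the last step could look roughly like the sketch below. It is not the repository's actual main.py; the command-line arguments, function names, and the use of PyPDF2 3.x names (PdfReader, pages, extract_text) are all assumptions made for illustration.

```python
import sys


def pdf_to_audio(reader, speaker, out_path):
    """Read every page of `reader` and save the speech to `out_path`.

    Returns the number of characters of text that were converted.
    """
    text = "\n".join((page.extract_text() or "") for page in reader.pages)
    speaker.save_to_file(text, out_path)
    speaker.runAndWait()
    return len(text)


def main(argv):
    """Hypothetical entry point: python main.py <input.pdf> <output.mp3>"""
    if len(argv) != 3:
        print("usage: python main.py <input.pdf> <output.mp3>")
        return 2
    import PyPDF2   # third-party, installed via requirements.txt
    import pyttsx3  # third-party, installed via requirements.txt
    chars = pdf_to_audio(PyPDF2.PdfReader(argv[1]), pyttsx3.init(), argv[2])
    print(f"Converted {chars} characters of text to {argv[2]}")
    return 0


# In a real script, finish with: raise SystemExit(main(sys.argv))
```

Keeping the conversion in a function that receives the reader and speaker as arguments makes it possible to exercise the logic with stand-in objects, without a PDF or a sound device.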

Tools & Requirements

  • VS Code
  • Python 3.x
  • PyPDF2
  • pyttsx3

License

MIT License

Creators

Brandon Martinez
