# Create an Audiobook from a PDF
## This task applies PDF text extraction and text-to-speech conversion to create an audiobook from a PDF file

### Steps
- Extract text from PDF file
- Clean the text
- Convert the text into speech
- Save the speech
- Play the speech

## 1. Extract text from PDF
- Use PyPDF2

### Install the library

In [1]:
# %pip install PyPDF2

### Import the library

In [2]:
import PyPDF2
import re


### Extract the text

In [3]:
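def extract_text_from_pdf(file_path):
    """Return the text of every page in the PDF, joined with newlines."""
    with open(file_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        # Pages without a text layer (e.g. scanned images) may yield nothing,
        # so fall back to an empty string instead of failing on concatenation.
        pages = [page.extract_text() or '' for page in reader.pages]
    # A newline between pages keeps the last word of one page from fusing
    # with the first word of the next.
    return '\n'.join(pages)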
    

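def clean_pdf_text(raw_text):
    # Rejoin words hyphenated across a line break
    cleaned = re.sub(r'-\n', '', raw_text)
    # Replace the remaining line breaks with spaces
    cleaned = re.sub(r'\n+', ' ', cleaned)
    # Collapse runs of whitespace into a single space
    cleaned = re.sub(r'\s{2,}', ' ', cleaned)
    # Remove standalone numbers (page numbers, citation markers).
    # Note: this also strips numeric values from the body text.
    cleaned = re.sub(r'\b\d+\b', '', cleaned)
    return cleaned.strip()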



### Print the extracted text

In [4]:
text = extract_text_from_pdf("doc.pdf")
clean_text = clean_pdf_text(text)

print(clean_text)  


Attention is All You Need . Introduction The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder . The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer , based solely on attention mechanisms, dispensing with recurrence and convolutions entirely . Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves . BLEU on the WMT  English-to-German translation task, improving over the existing best results, including ensembles. On the WMT  English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of . after training for . days on eight GPUs, a small fraction of the training costs of the best models from the literature. . Background
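
The output above shows a side effect of the `\b\d+\b` rule: every number is removed, so a phrase such as "achieves . BLEU" is read aloud without its score. A minimal sketch of a gentler variant follows. The function name `clean_pdf_text_keep_numbers` is hypothetical, and the sketch assumes page numbers sit alone on their own line, which will not hold for every PDF.

```python
import re


def clean_pdf_text_keep_numbers(raw_text):
    """Variant of clean_pdf_text that keeps numbers inside sentences."""
    # Rejoin words hyphenated across a line break: "trans-\nlation" -> "translation".
    cleaned = re.sub(r'(?<=\w)-\n(?=\w)', '', raw_text)
    # Drop lines that contain only digits (assumed to be page numbers).
    cleaned = re.sub(r'(?m)^[ \t]*\d+[ \t]*$', '', cleaned)
    # Collapse every whitespace run, including newlines, into one space.
    cleaned = re.sub(r'\s+', ' ', cleaned)
    # Remove the stray space that extraction can leave before punctuation.
    cleaned = re.sub(r'\s+([.,;:!?])', r'\1', cleaned)
    return cleaned.strip()


sample = "Our model achieves 28.4 BLEU on the trans-\nlation task .\n\n3\n\nBackground"
print(clean_pdf_text_keep_numbers(sample))
```

Whitespace is collapsed after the page-number lines are removed, so no double spaces are left behind. A line that legitimately consists of a single number (a table cell, for instance) would also be dropped, so check the result against the source PDF.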

## 2. Convert the Text into Speech
- Use **pyttsx3** OR **gTTS**

### Install the library

In [None]:
# %pip install pyttsx3
# %pip install gTTS


Note: you may need to restart the kernel to use updated packages.


### Import the library

In [6]:
from gtts import gTTS


### Initialize a Speaker object

In [7]:
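# Not required for gTTS: the gTTS object is created inside the conversion function below.
# With pyttsx3, an engine would be created here instead: engine = pyttsx3.init()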

### Convert the text

In [8]:
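def convert_text_to_speech(text, language='en', output_file='audio.mp3'):
    """Convert text to speech with gTTS and save it as an MP3 file.

    gTTS calls an online service, so an internet connection is required.
    """
    tts = gTTS(text=text, lang=language, slow=False)
    tts.save(output_file)
    print(f"Audio saved as {output_file}")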
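
For a long PDF, it may be more practical to save the audio in several parts, so that one failed request does not lose the whole file and each part can serve as a rough chapter. The sketch below is one possible approach, not part of the original task: `split_into_chunks`, `save_chunks_as_audio`, the `audio_part` file prefix, and the 1000-character limit are all hypothetical choices.

```python
import re


def split_into_chunks(text, max_chars=1000):
    """Group whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ''
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f'{current} {sentence}' if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The current chunk is full: store it and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def save_chunks_as_audio(text, language='en', prefix='audio_part', max_chars=1000):
    """Save each chunk as its own MP3 file and return the list of file names."""
    # Imported here so split_into_chunks can be used without gTTS installed.
    from gtts import gTTS

    paths = []
    for index, chunk in enumerate(split_into_chunks(text, max_chars), start=1):
        path = f'{prefix}_{index:03d}.mp3'
        gTTS(text=chunk, lang=language, slow=False).save(path)
        paths.append(path)
    return paths
```

Calling `save_chunks_as_audio(clean_text)` would then write `audio_part_001.mp3`, `audio_part_002.mp3`, and so on. Splitting at sentence boundaries keeps each part from ending mid-sentence.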


### Save the audio

In [9]:
convert_text_to_speech(clean_text)


Audio saved as audio.mp3
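
The last step in the list, playing the speech, can be sketched inside the notebook with `IPython.display.Audio`. The helper name `play_audio` is hypothetical, and the sketch assumes the code runs in Jupyter, where IPython is available and an audio player widget is rendered when the returned object is the last expression in a cell.

```python
from pathlib import Path


def play_audio(file_path='audio.mp3'):
    """Return an inline audio player for the given file (Jupyter only)."""
    path = Path(file_path)
    # Fail early with a clear message if the file was never saved.
    if not path.is_file():
        raise FileNotFoundError(f'No audio file found at {path}')
    # Imported here because IPython is only needed when playing inside a notebook.
    from IPython.display import Audio

    return Audio(filename=str(path))
```

Placing `play_audio('audio.mp3')` on the last line of a cell should display the player. Outside a notebook, the saved MP3 can be opened with any media player.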
