## Table of Contents

- [Getting Started](#getting-started)
- [Setup](#setup)
- [Generate podcast from URL](#generate-podcast-from-url)
- [Generate podcast from multiple sources](#generate-podcast-from-multiple-sources)
- [Generate transcript](#generate-transcript)
- [Generate audio from transcript](#generate-audio-from-transcript)
- [Generate podcast from PDF](#generate-podcast-from-pdf)
- [Multilingual Support](#multilingual-support)
  - [French (fr)](#french-fr)
  - [Portuguese (pt-br)](#portuguese-pt-br)

## Getting Started
Podcastfy: Your GenAI-Powered Companion for Transforming Multi-Source Text into Captivating Audio Conversations

In [8]:
# Import necessary modules
from podcastfy.client import generate_podcast

The following helper function embeds the audio files we generate so they can be played inside the notebook.

In [None]:
%pip install ipython
from IPython.display import Audio, display

def embed_audio(audio_file):
	"""
	Embeds an audio file in the notebook, making it playable.

	Args:
		audio_file (str): Path to the audio file.
	"""
	try:
		display(Audio(audio_file))
		print(f"Audio player embedded for: {audio_file}")
	except Exception as e:
		print(f"Error embedding audio: {str(e)}")

## Setup

The project uses a `.env` file for API keys and other sensitive information, and a `config.yaml` file for non-sensitive configuration settings. Follow the steps described in [Config](usage/config.md) to set up your configuration.
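Before running the examples, it can help to confirm that the API keys are actually visible to the Python process. The snippet below is a minimal sketch, not part of Podcastfy: the helper `missing_keys` is hypothetical, and the key names are assumptions that should be checked against the Config page. It only inspects environment variables and does not load the `.env` file itself.

```python
import os

# Assumed key names for illustration; confirm the exact names in the Config page.
ASSUMED_KEYS = ("GEMINI_API_KEY", "OPENAI_API_KEY", "ELEVENLABS_API_KEY")


def missing_keys(names=ASSUMED_KEYS, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("All assumed API keys are set.")
```

Not every key is needed for every example; for instance, the ElevenLabs key only matters when `tts_model="elevenlabs"` is used.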

## Generate podcast from URL

This code demonstrates how to generate a podcast from a single URL, in this case Wikipedia's page on "Podcast":
1. Extract content from the URL
2. Generate a Q&A transcript from the extracted content
3. Convert the transcript to speech using a Text-to-Speech (TTS) model
4. Save the generated audio file to `data/audio`

In [12]:
audio_file = generate_podcast(urls=["https://en.wikipedia.org/wiki/Podcast"])

2024-10-05 09:50:03,308 - podcastfy.client - INFO - Processing 1 links


[("Welcome to Podcastfy - Your Personal GenAI Podcast! Uh, what's up everyone? You know, it's funny how we use this technology every day, but have you ever stopped to think about the history of podcasts?", "I know what you mean. It's like, they're just there, you know? We hit play and boom - instant entertainment or information. But, where did this all start?"), ('Well, get this: the word "podcast" is actually a mashup of "iPod" and "broadcast"! Ben Hammersley, a journalist, first used it back in 2004.', "Wow, 2004? That's way earlier than I would've guessed! But, weren't MP3 players around before that?"), ('Totally! In fact, there was a company, i2Go, that offered a service kinda like podcasting back in 2000. It let people download news to their MP3 players. They were onto something, but it fizzled out quickly.', 'So, if that was happening in 2000, what really made podcasts take off later?'), ('It was a perfect storm of tech advancements. Apple launched iTunes with podcast support in 

2024-10-05 09:51:06,711 - podcastfy.client - INFO - Podcast generated successfully using openai TTS model


In [13]:
# Embed the audio file generated from transcript
embed_audio(audio_file)

Audio player embedded for: ./data/audio/podcast_e1525fed48054896af5645c203138dca.mp3


It works, but the audio does not sound great. The default backend uses OpenAI's TTS model for speech generation. In the next example, we use the ElevenLabs model, which in my experience improves results dramatically.

## Generate podcast from multiple sources

Here, we go one step further and generate a podcast from multiple sources:
1. Podcastfy's own GitHub README file
2. A YouTube video about Google's NotebookLM going viral

In [16]:
# Define multiple URLs to process
urls = [
	"https://github.com/souzatharsis/podcastfy/blob/main/README.md",
	"https://www.youtube.com/watch?v=jx2imp33glc"
]

# Generate podcast from multiple URLs
audio_file_multi = generate_podcast(
	urls=urls,
	tts_model="elevenlabs"
)

2024-10-05 10:06:04,940 - podcastfy.client - INFO - Processing 2 links
2024-10-05 10:07:25,914 - podcastfy.client - INFO - Podcast generated successfully using elevenlabs TTS model


In [17]:
print(f"Podcast generated and saved as: {audio_file_multi}")

# Embed the generated audio file
embed_audio(audio_file_multi)

Podcast generated and saved as: ./data/audio/podcast_829a531a20334c949f76e077b846cc7f.mp3


Audio player embedded for: ./data/audio/podcast_829a531a20334c949f76e077b846cc7f.mp3


This AI-generated transcript is interesting for several reasons:

- Realism: The transcript demonstrates the ability of AI to generate realistic, conversational dialogue. It includes elements like filler words ("uh", "umm"), casual language, and back-and-forth banter that mimic human conversation patterns.

- Irony: There's an ironic element in that the transcript presents AI-generated characters expressing concern about the implications of AI-generated content on their own (fictional) careers as podcasters.

- Ethical and legal concerns: The characters discuss potential implications of this technology, including copyright issues, voice replication without consent, and the impact on human content creators. This reflects real-world debates surrounding AI-generated content.

- Meta-commentary: The podcast is an AI-generated discussion about AI-generated content, specifically AI-created podcasts. This creates an intriguing layer of self-reference, as an AI-generated conversation discusses the ability of AI to generate conversations.

However, this particular transcript did not pick up on the content of Podcastfy's README and focused solely on the YouTube video. This can happen because the AI podcast hosts may pick a particular concept from one of the provided sources and develop the conversation around it. There is room for improvement in guiding the AI podcast hosts to balance coverage across the provided input sources.

## Generate transcript

There is also the option to generate only the transcript from the input URLs. This allows users to edit or process the transcript before generating audio from it.

In [18]:
# Generate transcript only
transcript_file = generate_podcast(
	urls=["https://github.com/souzatharsis/podcastfy/blob/main/README.md"],
	transcript_only=True
)

2024-10-05 10:15:06,561 - podcastfy.client - INFO - Processing 1 links
2024-10-05 10:15:29,500 - podcastfy.client - INFO - Transcript generated successfully




In [19]:

print(f"Transcript generated and saved as: {transcript_file}")
# Read and print the first 100 characters from the transcript file
with open(transcript_file, 'r') as file:
	transcript_content = file.read(100)
	print(f"First 100 characters of the transcript: {transcript_content}")

Transcript generated and saved as: ./data/transcripts/transcript_f6ab3ee241444e999ed4d1142564b9fe.txt
First 100 characters of the transcript: <Person1> "Welcome to Podcastfy - YOUR Personal GenAI Podcast! You know, the other day I was struggl
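Since the transcript is saved as plain text, it can be inspected or edited before audio generation. The snippet below is a minimal sketch, assuming the transcript consists of speaker turns opened by tags such as `<Person1>` (as in the output above) and optionally closed by matching tags such as `</Person1>`; the closing tags, the helper `parse_turns`, and the sample text are assumptions for illustration, not part of Podcastfy.

```python
import re

# One speaker turn: an opening <PersonN> tag, the spoken text, and then either
# a matching closing tag, the next opening tag, or the end of the text.
TURN = re.compile(r"<(Person\d+)>\s*(.*?)\s*(?:</\1>|(?=<Person\d+>)|\Z)", re.S)


def parse_turns(transcript):
    """Return a list of (speaker, text) tuples with surrounding quotes removed."""
    return [
        (match.group(1), match.group(2).strip().strip('"'))
        for match in TURN.finditer(transcript)
    ]


if __name__ == "__main__":
    # Hypothetical sample text, not taken from a real transcript file.
    sample = '<Person1> "Welcome!" </Person1>\n<Person2> "Glad to be here." </Person2>'
    for speaker, text in parse_turns(sample):
        print(f"{speaker}: {text}")
```

Counting turns or words per speaker this way is a quick check on a transcript before spending TTS credits on it.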


## Generate audio from transcript

Users can also generate audio from a given transcript. Here, we generate a podcast from the transcript of the Podcastfy README generated in the previous section. This allows users to reuse previously generated transcripts or provide their own custom transcripts for podcast generation.

In [23]:
# Generate podcast from existing transcript file
audio_file_from_transcript = generate_podcast(
	transcript_file=transcript_file,
	tts_model="elevenlabs"
)

2024-10-05 10:28:37,745 - podcastfy.client - INFO - Using transcript file: ./data/transcripts/transcript_f6ab3ee241444e999ed4d1142564b9fe.txt
2024-10-05 10:30:17,300 - podcastfy.client - INFO - Podcast generated successfully using elevenlabs TTS model


In [24]:
# Embed the audio file generated from transcript
embed_audio(audio_file_from_transcript)

Audio player embedded for: ./data/audio/podcast_c06620d918d4419884f9c7558a4a2cf1.mp3


## Generate podcast from PDF

One or more PDFs can be processed in the same way as URLs, by passing the corresponding file paths.

In [None]:
# Pass the PDF path inside a list, just like the URLs in the earlier examples
audio_file_from_pdf = generate_podcast(urls=["/data/pdf/s41598-024-58826-w.pdf"])

The PDF is a Scientific Reports article about climate change in France. A podcast generated from it is already available in our data directory. Let's listen to it:

In [None]:
file_path = "/data/audio/Agro_paper.mp3"
# Embed the pre-generated audio file for the PDF
embed_audio(file_path)
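Because PDFs and URLs are passed through the same `urls` argument, a mistyped local path may only surface once processing has started. The snippet below is a minimal sketch of a pre-check that separates web URLs from local files and fails early when a file is missing; the helper `check_sources` is hypothetical and not part of Podcastfy.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_sources(sources):
    """Split sources into (web_urls, local_files); raise if a local file is missing."""
    web, local = [], []
    for src in sources:
        if urlparse(src).scheme in ("http", "https"):
            web.append(src)
        elif Path(src).is_file():
            local.append(src)
        else:
            raise FileNotFoundError(
                f"Source is neither an http(s) URL nor an existing file: {src}"
            )
    return web, local


if __name__ == "__main__":
    web, local = check_sources(["https://en.wikipedia.org/wiki/Podcast"])
    print(f"{len(web)} web source(s), {len(local)} local file(s)")
```

The combined list can then be passed to `generate_podcast(urls=...)` unchanged once the check passes.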

## Multilingual Support

A description of how to generate non-English content is still to be written (TBD). For now, here are a couple of audio examples:

### French (fr)

Generates a podcast about the [AgroClim website](https://agroclim.inrae.fr/), a French government service unit that studies the climate and its impacts on agroecosystems.

In [None]:
embed_audio("/data/audio/podcast_FR_AGRO.mp3")

### Portuguese (pt-br)

Generates a podcast in Brazilian Portuguese from a news article on the most recent voting polls on [Sao Paulo's 2024 Elections](https://noticias.uol.com.br/eleicoes/2024/10/03/nova-pesquisa-datafolha-quem-subiu-e-quem-caiu-na-disputa-de-sp-03-10.htm).

In [None]:
embed_audio("/data/audio/podcast_thatupiso_BR.mp3")