🎧 PodAgent: A Comprehensive Framework for Podcast Generation

This repository contains the official implementation of "PodAgent: A Comprehensive Framework for Podcast Generation".

Given a topic, PodAgent simulates human behavior to create podcast-like audio presented as a talk show with one host and several guests. The show includes diverse and insightful viewpoints, delivered in appropriate voices, along with structured sound effects and background music to enrich the listening experience.

News

🥂 2025.03 PodAgent is released! We currently support podcast generation in two languages: English and Chinese.

Download Code

Download PodAgent

git clone https://github.com/yujxx/PodAgent.git

Download CosyVoice

cd PodAgent
mkdir TTS
cd TTS
git clone https://github.com/FunAudioLLM/CosyVoice.git
cd CosyVoice
git submodule update --init --recursive
cd ../..

Environment Setup

Install the environment with the setup script (might take some time):

bash ./scripts/EnvsSetup.sh

Or set up the environment step by step (recommended):

conda create -n podcast -y python=3.10
conda activate podcast
conda install -y -c conda-forge pynini==2.1.5
pip install -r TTS/CosyVoice/requirements.txt -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host=mirrors.aliyun.com
pip install -U git+https://git@github.com/facebookresearch/audiocraft@c5157b5bf14bf83449c17ea1eeb66c19fb4bc7f0#egg=audiocraft
pip install pip==23.2.1
pip install -r requirements.txt

Activate the conda environment

conda activate podcast

Download Models

Pre-download the models (might take some time)

python scripts/download_models.py

Services Setup

Set the environment variables for the API services, including the GPT-4 API:

export OPENAI_BASE_URL=your_openai_url_here
export PODAGENT_OPENAI_KEY=your_openai_key_here
export PODAGENT_SERVICE_PORT=8021
export PODAGENT_SERVICE_URL=127.0.0.1
export PODAGENT_MAX_SCRIPT_LINES=999
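Before starting the services, it can help to confirm that the variables above are actually visible to Python. The following is a minimal sketch, not part of PodAgent: the helper name missing_env_vars is hypothetical, and treating all five variables as mandatory is an assumption.

```python
import os

# Variable names are copied from the export lines above.
# Assumption: all five must be set; PodAgent may apply defaults for some.
REQUIRED_VARS = [
    "OPENAI_BASE_URL",
    "PODAGENT_OPENAI_KEY",
    "PODAGENT_SERVICE_PORT",
    "PODAGENT_SERVICE_URL",
    "PODAGENT_MAX_SCRIPT_LINES",
]


def missing_env_vars(env=None):
    """Return the listed variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All PodAgent environment variables are set.")
```

Run it in the same shell session as the export commands, since exported variables are only inherited by processes started from that shell.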

Start Python API services (e.g., Text-to-Speech, Text-to-Audio)

bash scripts/start_services.sh

Then wait a moment and check the log in services_logs/service.out. The services are ready to be called once the following output appears:

 * Running on http://127.0.0.1:8021
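The manual log check can also be scripted. The sketch below polls the log file until the readiness line appears; the function name wait_for_services, the default timeout, and the polling interval are illustrative assumptions, and the marker string is taken from the sample output above.

```python
import time
from pathlib import Path


def wait_for_services(log_path="services_logs/service.out",
                      marker="Running on http://",
                      timeout=300.0,
                      poll_interval=2.0):
    """Poll the log until the marker appears; return True if found before the timeout."""
    log_file = Path(log_path)
    deadline = time.monotonic() + timeout
    while True:
        # The log may not exist yet right after the services are launched.
        if log_file.exists() and marker in log_file.read_text(errors="replace"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


if __name__ == "__main__":
    if wait_for_services():
        print("Services are ready.")
    else:
        print("Timed out waiting for the services; inspect the log manually.")
```

Note that a stale log from an earlier run would also contain the marker, so this check is only meaningful if the log is fresh.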

(Optional) Kill the running services when you are finished:

python scripts/kill_services.py

Usage

python podagent.py --topic "What are the primary factors that influence consumer behavior?" --guest-number "2" --session-id "test"
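To generate several episodes in a row, the command above can be wrapped in a small script. This is a sketch under stated assumptions: build_command and run_batch are hypothetical helpers, the "batch-N" session ids are made up for illustration, and it assumes the script is run from the repository root with the services already started. Only the three flags shown in the usage example are used.

```python
import subprocess
import sys


def build_command(topic, guest_number=2, session_id="test"):
    """Build the podagent.py command line from the flags in the usage example."""
    return [
        sys.executable, "podagent.py",
        "--topic", topic,
        "--guest-number", str(guest_number),
        "--session-id", session_id,
    ]


def run_batch(topics, guest_number=2):
    """Run PodAgent once per topic, sequentially, with one session id per run."""
    for index, topic in enumerate(topics):
        # Illustrative session id scheme; any unique string should do.
        command = build_command(topic, guest_number, f"batch-{index}")
        # check=True stops the batch at the first failing run.
        subprocess.run(command, check=True)
```

For example, run_batch(["What are the primary factors that influence consumer behavior?"]) launches a single run with session id "batch-0".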

(Optional) If you want to reuse responses for repeated requests (e.g., during debugging), you can enable caching:

export USE_OPENAI_CACHE=True

Citation

If you find this work useful, you can cite the paper below:

@misc{xiao2025podagentcomprehensiveframeworkpodcast,
      title={PodAgent: A Comprehensive Framework for Podcast Generation}, 
      author={Yujia Xiao and Lei He and Haohan Guo and Fenglong Xie and Tan Lee},
      year={2025},
      eprint={2503.00455},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2503.00455}, 
}

Appreciation

WavJourney for providing an extensive audio generation workflow.
CosyVoice2 for a zero-shot text-to-speech synthesis model.
AudioCraft for state-of-the-art audio generation models.

Disclaimer

We are not liable for any audio generated using the semantics produced by this model. Please ensure that it is not used for any illegal purposes.
We provide voice libraries under data/voice_presets_cv_* for quick usage. The .wav files under voice_presets_cv_en and voice_presets_cv_zh are sourced from LibriTTS-R and AISHELL-3, respectively. Please ensure their usage complies with the respective licenses.
