TrashGPT

Welcome to our repository. This project was completed for CS 685 (Spring 2023). Its goal is to generate clips of the Trash Taste Podcast: language models write the transcripts, and voice cloning turns them into audio. Each folder contains one stage of the pipeline.

Setup

This repository contains several independent pipeline components. To avoid package conflicts, install only the packages needed for the component you are running.

Install Python 3.10.X.
Set up a virtual environment: python -m venv .venv
Activate the virtual environment. The command is platform dependent.
Install PyTorch using the install command from the PyTorch website. The command is platform dependent.
Install the remaining requirements listed in the README of the pipeline component you are running.
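
Since the activation command in the steps above differs by platform, here is a minimal sketch of how to pick it. The helper name activation_command is hypothetical and not part of this repository; the paths follow the standard layout created by python -m venv.

```python
import sys
from pathlib import PurePosixPath, PureWindowsPath


def activation_command(venv_dir: str = ".venv", platform: str = sys.platform) -> str:
    """Return the command that activates a venv created with `python -m venv`.

    Hypothetical helper for illustration only; it is not part of this repository.
    """
    if platform.startswith("win"):
        # Windows cmd.exe uses Scripts\activate.bat
        # (PowerShell uses Scripts\Activate.ps1 instead).
        return str(PureWindowsPath(venv_dir) / "Scripts" / "activate.bat")
    # Linux and macOS shells (bash/zsh) source bin/activate.
    return "source " + str(PurePosixPath(venv_dir) / "bin" / "activate")


print(activation_command())
```

Run it with the same interpreter you used to create .venv and paste the printed command into your shell.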

Pipeline

Data download
- Create a directory named raw_data/ and change into it.
- Run yt-dlp.exe --yes-playlist --write-sub --write-auto-sub --sub-lang "en.*" --sub-format json3 -f m4a "https://www.youtube.com/playlist?list=PLUHmmIt9sU6i4JlDABqLeWybD_ZrJf9LB" (on Linux or macOS, the executable is yt-dlp rather than yt-dlp.exe).
Data preprocessing (look in data_prep/)
- Speaker diarization
- Speaker naming tool
- Dataset formulation
Training the models and generating transcripts (look in modeling/)
- Training and generation code for GPT2-Small, GPT2-Medium, Bloom 560M, and LLaMA 7B
Voice cloning and audio performance generation (look in voice_cloning/)
- Transcript parser
- TTS with voice cloning using tortoise-tts
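
The data download step above can also be scripted. The following is a minimal sketch, assuming yt-dlp is installed and available on your PATH; the function names build_download_command and download are hypothetical and not part of this repository. The flags and playlist URL are copied from the command in the Data download step. Passing the arguments as a list avoids shell quoting problems with the ? in the URL.

```python
import shutil
import subprocess
from pathlib import Path

# Playlist URL taken from the Data download step.
PLAYLIST_URL = "https://www.youtube.com/playlist?list=PLUHmmIt9sU6i4JlDABqLeWybD_ZrJf9LB"


def build_download_command(executable: str = "yt-dlp") -> list[str]:
    """Build the yt-dlp argument list used in the Data download step."""
    return [
        executable,
        "--yes-playlist",          # download the whole playlist
        "--write-sub",             # save uploaded subtitles
        "--write-auto-sub",        # save auto-generated subtitles
        "--sub-lang", "en.*",      # English subtitle tracks only
        "--sub-format", "json3",   # subtitle file format
        "-f", "m4a",               # audio-only m4a format
        PLAYLIST_URL,
    ]


def download(raw_dir: str = "raw_data") -> None:
    """Create raw_dir if needed and run yt-dlp inside it (hypothetical helper)."""
    Path(raw_dir).mkdir(exist_ok=True)
    executable = shutil.which("yt-dlp")  # also resolves yt-dlp.exe on Windows
    if executable is None:
        raise FileNotFoundError("yt-dlp was not found on PATH")
    # check=True raises CalledProcessError if yt-dlp exits with an error.
    subprocess.run(build_download_command(executable), cwd=raw_dir, check=True)
```

Calling download() from the repository root writes the audio and subtitle files into raw_data/, the same location the manual steps use.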

Results

Sample text generations are in test/
Evaluation
LLaMA 7B Clip
Bloom 560M Clip
GPT2-Medium Clip
