How to create a text and audio dataset #7

AI-Guru · 2023-04-05T18:40:29Z

Hi!

First and foremost: congratulations on this fine collection of repositories! I am slowly working my way through them and I am amazed by how easy and effective your work is.

I will soon start some work on conditional audio generation. What would be a good starting point for creating something like a WAVDataset that yields both audio and text? Would simply extending WAVDataset be the best approach?

Best,
Tristan

flavioschneider · 2023-04-06T17:06:56Z

Hi @AI-Guru, thanks a lot!

A subclass of WAVDataset with extra text metadata would be a good starting option. I personally used a WebDataset (with the custom AudioWebDataset), which loads a set of tar files containing numbered wav/json pairs. WebDatasets work well with large amounts of data, but they are a bit more involved to set up.
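
As a rough illustration of the first option, here is a minimal sketch of a dataset that pairs each wav file with a text caption. Everything in it is an assumption for illustration: the class name `TextAudioDataset`, the sidecar layout (`name.wav` next to `name.json`), and the `"text"` field are hypothetical and not part of the library. It is written as a standalone map-style dataset using only the standard library rather than as a real WAVDataset subclass, since the WAVDataset constructor is not shown in this thread; the same `__getitem__` idea should carry over to a subclass.

```python
import json
import wave
from pathlib import Path


class TextAudioDataset:
    """Map-style dataset returning (PCM bytes, sample rate, text) per item.

    Assumed (hypothetical) layout: every `name.wav` has a sidecar
    `name.json` with a "text" field. Wav files without a sidecar are skipped.
    """

    def __init__(self, root):
        # Keep only wav files that have a matching JSON sidecar.
        self.wav_paths = sorted(
            p for p in Path(root).rglob("*.wav") if p.with_suffix(".json").exists()
        )

    def __len__(self):
        return len(self.wav_paths)

    def __getitem__(self, idx):
        wav_path = self.wav_paths[idx]
        # Read the raw PCM frames and the sample rate from the wav header.
        with wave.open(str(wav_path), "rb") as f:
            sample_rate = f.getframerate()
            frames = f.readframes(f.getnframes())
        # Read the caption from the sidecar file ("text" is an assumed key).
        with open(wav_path.with_suffix(".json"), encoding="utf-8") as f:
            text = json.load(f)["text"]
        return frames, sample_rate, text
```

In practice the raw bytes would be decoded into a tensor (and cropped or resampled) before batching, and because the text is a string, a custom collate function is likely needed to tokenize or group the captions.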
