Question: can we generate 1sec/ms length audio? #3

wassimbj · 2023-04-28T13:35:41Z

The homepage shows only 10-second clips for all the audio samples. I want to know if the audio length is controllable, or if there is a minimum as in AudioLDM, where you can't generate less than 2.5 seconds.

deepanwayx · 2023-04-28T18:42:17Z

Do you mean trimming the audio to a desired length after generating a 10-second long sample? This is easily doable by truncating the generated wave in tango.py:

def generate(self, prompt, steps=100, guidance=3, samples=1, disable_progress=True, desired_length_in_seconds=10):
    """ Generate audio for a single prompt string. """
    with torch.no_grad():
        latents = self.model.inference([prompt], self.scheduler, steps, guidance, samples, disable_progress=disable_progress)
        mel = self.vae.decode_first_stage(latents)
        wave = self.vae.decode_to_waveform(mel)
        # Sampling rate is 16 kHz; keep only the first desired_length_in_seconds of audio
        wave = wave[:, :int(desired_length_in_seconds * 16000)]
    return wave[0]

However, constraining generation so that the events described in the text occur within the first n seconds is not straightforward. Because of the nature of the training dataset, the events described in the text prompt tend to be spread over the entire 10-second duration.
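For sub-second or millisecond lengths, the truncation step can be sketched on its own, outside the model. This is a minimal sketch, assuming the decoded waveform is a NumPy array of shape (batch, num_samples) at 16 kHz; `trim_waveform` is a hypothetical helper, not part of Tango. Note the leading colon in the slice, which keeps everything up to the cut point rather than selecting a single sample.

```python
import numpy as np

SAMPLE_RATE = 16000  # 16 kHz, as stated in the generate() snippet


def trim_waveform(wave: np.ndarray, seconds: float, sample_rate: int = SAMPLE_RATE) -> np.ndarray:
    """Keep only the first `seconds` of a (batch, num_samples) waveform array.

    Hypothetical helper for illustration; not part of the Tango codebase.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Convert the (possibly fractional) duration to a whole number of samples.
    num_samples = int(round(seconds * sample_rate))
    # Slicing past the end is safe: NumPy simply returns what is available.
    return wave[:, :num_samples]


# Stand-in for one generated 10-second waveform (assumption: silence, float32).
wave = np.zeros((1, 10 * SAMPLE_RATE), dtype=np.float32)
one_second = trim_waveform(wave, 1.0)
quarter_second = trim_waveform(wave, 0.25)
```

This only shortens the output; it does not move the described events into the kept portion, so a trimmed clip may contain little or none of the prompted sound.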

wassimbj · 2023-04-28T19:12:35Z

Yes, I meant getting the events within n seconds. Do you mean that if I train it on short audio files, I will get short results too? What length should the dataset clips be, in your opinion? And what do you think should be done to control the length of the audio?

deepanwayx · 2023-04-29T08:26:56Z

You need to train on shorter audio samples to achieve that control. The duration variable in train.py specifies the length of the audio in seconds. It is set to 10, which you can reduce to a smaller number and then train with appropriately short audio samples.
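One way to prepare short training samples is to cut longer recordings into fixed-length clips that match the reduced duration. The following is a rough sketch, assuming each recording is already loaded as a 1-D NumPy array at 16 kHz; `split_into_clips` is a hypothetical helper and is not part of the Tango training code.

```python
import numpy as np


def split_into_clips(wave: np.ndarray, clip_seconds: float, sample_rate: int = 16000) -> np.ndarray:
    """Split a 1-D waveform into non-overlapping clips of equal length.

    Hypothetical helper for illustration. Any trailing audio shorter than
    one full clip is dropped.
    """
    clip_len = int(round(clip_seconds * sample_rate))
    if clip_len <= 0:
        raise ValueError("clip_seconds is too small for this sample rate")
    num_clips = len(wave) // clip_len
    # Drop the remainder, then view the rest as (num_clips, clip_len).
    return wave[: num_clips * clip_len].reshape(num_clips, clip_len)


# Stand-in for a 5-second recording (assumption: synthetic data).
recording = np.arange(5 * 16000, dtype=np.float32)
clips = split_into_clips(recording, 2.0)  # two 2-second clips; last second dropped
```

Blind slicing like this can separate a sound event from the caption that describes it, so each clip would still need a caption that matches what it actually contains.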

wassimbj · 2023-04-29T08:46:09Z

Thanks 😁

wassimbj changed the title ~~Question: can we generate 1sec/ms sounds ?~~ Question: can we generate 1sec/ms length audio? Apr 28, 2023

wassimbj closed this as completed Apr 29, 2023
