Speaker id argument #7

ilnmtlbnm · 2020-05-15T12:34:10Z

There is a speaker id argument in inference.py: parser.add_argument('-i', '--id', help='Speaker id', type=int).

Whenever I try to change it to something other than 0, I get the following error:

Traceback (most recent call last):
  File "inference.py", line 122, in <module>
    args.n_frames, args.sigma, args.seed)
  File "inference.py", line 63, in infer
    speaker_vecs = trainset.get_speaker_id(speaker_id).cuda()
  File "/data/code/flowtron/data.py", line 83, in get_speaker_id
    return torch.LongTensor([self.speaker_ids[int(speaker_id)]])
KeyError: 2


karkirowle · 2020-05-15T12:57:32Z

If you are using the LJS model, that might be expected, as it is a single-speaker model. You could try using the LibriTTS model.
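To illustrate why a single-speaker filelist leads to this KeyError, here is a minimal sketch. It assumes the lookup table in data.py is built the same way as the `create_speaker_lookup_table` helper quoted later in this thread (raw speaker id mapped to a consecutive index). The toy id lists and the `get_speaker_index` name are made up for illustration and are not taken from the repository.

```python
def create_speaker_lookup_table(speaker_ids_in_filelist):
    # Map every raw speaker id that occurs in the filelist to a consecutive index.
    unique_ids = sorted({int(s) for s in speaker_ids_in_filelist})
    return {sid: idx for idx, sid in enumerate(unique_ids)}


def get_speaker_index(lookup, speaker_id):
    # Same kind of dict lookup as in the traceback: unknown ids raise KeyError.
    return lookup[int(speaker_id)]


# Hypothetical single-speaker filelist: every line carries speaker id 0.
ljs_lookup = create_speaker_lookup_table(["0", "0", "0"])

# Hypothetical multi-speaker filelist: the ids are not consecutive.
multi_lookup = create_speaker_lookup_table(["40", "78", "40", "83"])

try:
    get_speaker_index(ljs_lookup, 2)
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

With a table like `ljs_lookup`, only `-i 0` can succeed. With `multi_lookup`, only 40, 78 and 83 are accepted, not 0, 1 and 2.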

Quasimondo · 2020-05-15T13:18:25Z

Just a note - when using LibriTTS you will also have to change the n_speakers parameter in config.json to 123:

"model_config": {
    "n_speakers": 123,
    "n_speaker_dim": 128,
    "n_text": 185,
    "n_text_dim": 512,
    "n_flows": 2,
    "n_mel_channels": 80,
    "n_attn_channels": 640,
    "n_hidden": 1024,
    "n_lstm_layers": 2,
    "mel_encoder_n_hidden": 512,
    "n_components": 0,
    "mean_scale": 0.0,
    "fixed_gaussian": true,
    "dummy_speaker_embedding": false,
    "use_gate_layer": true
}
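If you would rather not edit the file by hand, the same change can be sketched with the standard json module. This is only an illustration: the `with_n_speakers` helper is hypothetical, and the starting value of 1 in the example is an assumption, not something read from the repository's config.json.

```python
import json


def with_n_speakers(config_text, n_speakers):
    # Parse the config, change only model_config.n_speakers, and serialize it again.
    config = json.loads(config_text)
    config["model_config"]["n_speakers"] = n_speakers
    return json.dumps(config, indent=4)


# Tiny stand-in for config.json (assumed values, heavily shortened).
example = '{"model_config": {"n_speakers": 1, "n_speaker_dim": 128}}'
updated = with_n_speakers(example, 123)
```

For a real run you would read config.json into a string, pass it through the helper, and write the result back.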

ilnmtlbnm · 2020-05-15T13:40:56Z

> If you are using the LJS model that might be expected as it is a single speaker model. You could try using the LibrITTS.

Of course, thanks @karkirowle!
And thanks @Quasimondo for specifying n_speakers for LibriTTS.

ilnmtlbnm · 2020-05-15T13:45:42Z

D'oh! Again, I closed too fast: it still doesn't work with LibriTTS.

python inference.py -c config.json -f models/flowtron_libritts.pt -w models/waveglow_256channels_v4.pt -t "But the machine only creates what humans have taught it to " -i 15 -n 777 -s 0.5

Quasimondo · 2020-05-15T13:47:16Z

Yeah - I realized that you will also have to adjust the "data_config" section:
"training_files": "filelists/libritts_train_clean_100_audiopath_text_sid_shorterthan10s_atleast5min_train_filelist.txt"

And lastly, you will have to pick a speaker ID that actually exists. They are not numbered consecutively, so you have to look them up in that filelist (it's the number at the end of each line).
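The lookup described above can be sketched in a few lines of plain Python. It assumes that each filelist line has `|`-separated fields with the speaker id last; the sample lines and paths below are invented for illustration, and `valid_speaker_ids` is a hypothetical helper name.

```python
def valid_speaker_ids(filelist_lines):
    # Collect the distinct speaker ids found at the end of each non-empty line.
    ids = set()
    for line in filelist_lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: fields are separated by '|' and the speaker id comes last.
        ids.add(int(line.rsplit("|", 1)[1]))
    return sorted(ids)


# Invented sample lines in the assumed "audiopath|text|speaker_id" layout.
sample_lines = [
    "audio/a.wav|First sentence.|40",
    "audio/b.wav|Second sentence.|78",
    "audio/c.wav|Third sentence.|40",
]
ids = valid_speaker_ids(sample_lines)
requested_id_is_valid = 15 in ids
```

For the real filelist you could pass an open file object instead of `sample_lines`, then check the value you intend to give to `-i` against the returned list before running inference.py.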

ilnmtlbnm · 2020-05-15T14:31:05Z

Thanks again @Quasimondo

For reference, here are the valid ids for LibriTTS:

40 78 83 87 118 125 196 200 250 254 374 405 446 460 587 669 696 730 831 887 1069 1088 1116 1246 1263
 1502 1578 1841 1867 1963 1970 2092 2136 2182 2196 2289 2416 2436 2836 2843 2911 2952 3240 3242 3259
 3436 3486 3526 3664 3857 3879 3982 3983 4018 4051 4088 4160 4195 4267 4297 4362 4397 4406 4640 4680
 4788 5022 5104 5322 5339 5393 5652 5678 5703 5750 5808 6019 6064 6078 6081 6147 6181 6209 6272 6367
 6385 6415 6437 6454 6476 6529 6818 6836 6848 7059 7067 7078 7178 7190 7226 7278 7302 7367 7402 7447
 7505 7511 7794 7800 8051 8088 8098 8108 8123 8238 8312 8324 8419 8468 8609 8629 8770 8838

rafaelvalle · 2020-05-15T15:35:42Z

Thank you for compiling this list!

yhgon · 2020-05-19T04:42:56Z

I added a script that extracts the available speaker ids (sid). See below:

https://github.com/yhgon/flowtron/blob/master/inference_colab.ipynb

import os
import sys
import random
from itertools import cycle

import numpy as np
import pandas as pd

from data import load_filepaths_and_text  # helper from the Flowtron repo

# Notebook-only shell command: preview a few rows of the speaker info file.
!cat /content/flowtron/filelists/libritts_speakerinfo.txt | tail -n +12  | head -n 10

filelist_path = "/content/flowtron/filelists/libritts_train_clean_100_audiopath_text_sid_shorterthan10s_atleast5min_train_filelist.txt"

def create_speaker_lookup_table(audiopaths_and_text):
    # Map each raw speaker id found in the filelist to a consecutive index.
    speaker_ids = np.sort(np.unique([x[2] for x in audiopaths_and_text]))
    d = {int(speaker_ids[i]): i for i in range(len(speaker_ids))}
    print("Number of speakers :", len(d))
    return d

audiopaths_and_text = load_filepaths_and_text(filelist_path)
# Raw speaker ids present in the filelist, i.e. the valid values for -i.
speaker_ids = create_speaker_lookup_table(audiopaths_and_text).keys()
print(speaker_ids)

# Speaker metadata: lines starting with ';' are comments, fields are separated by '|'.
speakers = pd.read_csv('/content/flowtron/filelists/libritts_speakerinfo.txt', engine='python', header=None, comment=';', sep=r' *\| *', names=['ID', 'SEX', 'SUBSET', 'MINUTES', 'NAME'])
# Keep the id of speakers that appear in the filelist; mark all others with -1.
speakers['FLOWTRON_ID'] = speakers['ID'].apply(lambda x: x if x in speaker_ids else -1)

# Shuffled ids of usable speakers with more than 20 minutes of audio, split by sex.
female_speakers = speakers.query("SEX == 'F' and MINUTES > 20 and FLOWTRON_ID >= 0")['FLOWTRON_ID'].sample(frac=1).tolist()
male_speakers = speakers.query("SEX == 'M' and MINUTES > 20 and FLOWTRON_ID >= 0")['FLOWTRON_ID'].sample(frac=1).tolist()

print("female speakers : ", len(female_speakers), female_speakers)
print("male speakers   : ", len(male_speakers), male_speakers)

ilnmtlbnm closed this as completed May 15, 2020

ilnmtlbnm reopened this May 15, 2020

ghost mentioned this issue Sep 3, 2020

Steps to replicate pretrained models on LibriTTS #57

Open
