I think it is wrong code, please confirm this #8

seohyeonShin · 2024-02-23T06:13:03Z

First, I ran dataset.py to prepare the dataset.
Then I found that line 377 in dataset.py is wrong, because there is no utils/utils in your code.
So I think it should be "from utils.tools import to_device".
Is this correct?

Also, I can't find the file opened here:
open("./config/LJSpeech/preprocess.yaml", "r")
Where is it, and what is it for?

GalaxyCong · 2024-03-01T10:19:44Z

Hello,

You can import the Dataset module in train.py with "from dataset import Dataset". That is correct, because you don't need to run the dataset.py file anymore.

Thus, running the dataset.py file on its own is not a required preparation step.

You can ignore open("./config/LJSpeech/preprocess.yaml", "r") because we did not use this dataset.

Source of open("./config/LJSpeech/preprocess.yaml", "r"):
https://github.com/ming024/FastSpeech2/blob/master/dataset.py
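The reason importing is safe can be shown with a small sketch. It assumes that the test code in dataset.py (the to_device import and the open(...) call) sits under an `if __name__ == "__main__":` guard, as it does in the linked FastSpeech2 file. The module source, the toy_dataset name, and the ran_self_test flag below are hypothetical and exist only for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical stand-in for dataset.py: a class plus a guarded self-test block.
MODULE_SOURCE = '''
class Dataset:
    pass

ran_self_test = False
if __name__ == "__main__":
    # Only reached with "python toy_dataset.py", never on import.
    ran_self_test = True
'''


def import_from_source(source, name="toy_dataset"):
    """Write the source to a temporary file and import it as a module."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / f"{name}.py"
        path.write_text(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module


toy = import_from_source(MODULE_SOURCE)
print(hasattr(toy, "Dataset"), toy.ran_self_test)
```

On import, the module's `__name__` is the module name rather than "__main__", so the guarded block is skipped and a broken import or missing config file inside it does not affect train.py.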

seohyeonShin · 2024-03-27T01:55:45Z

Thank you for your kind response.
Unfortunately, because I don't understand the project well yet, I am poorly prepared to run your code.
If you don't mind, may I ask some additional questions?
I tried running the mentioned lip2wav once.
Also, Baidu is not accessible where I am, so I tried your Google Drive link, but I can't access it because of the permission settings. For now, I have downloaded the chem data.
My questions are below.

1. Is there a separate database for the corpus in preprocess.yaml?
--> How do I get the corpus? Do I just run MFA?

2. Is raw_path where the original mp4 data (the chem YouTube videos) should be?
--> preprocessed_path: "data/conggaoxiang/V2C/V2C_Code/example_Chem16_framelevel/chem"
--> Do I paste the folders created by running 1_get_your_frames.py and the following scripts in HPDubbing-how-to-get-face-and-lip, in order, into the preprocessed_path folder?

Overall, it is difficult to set up the required folder structure.
I'm really sorry to ask such basic questions; this is my first time doing research in the speech field.
It would be a great help if you could answer.

GalaxyCong · 2024-04-26T09:04:57Z

Hello, sorry for the delayed response.
I'm glad to answer your questions:

Q1
Audio part:
The first step is to download the dataset.
The second step is to run prepare_align.py.
The third step is to use MFA to get the *.TextGrid files, or to directly download the ones we processed.
The fourth step is to run preprocess.py; the preprocessed audio is then saved in the preprocessed_path you set.
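The four steps above can be turned into a rough progress check. This is only a sketch under assumptions: it guesses that the *.TextGrid files are placed somewhere under preprocessed_path and that preprocess.py writes additional, non-TextGrid files there. The next_step function is a hypothetical helper, not part of the repository.

```python
import tempfile
from pathlib import Path


def next_step(raw_path, preprocessed_path):
    """Guess which of the four audio steps to run next from the files on disk."""
    raw = Path(raw_path)
    pre = Path(preprocessed_path)
    # Steps 1-2: prepare_align.py should leave .wav and .lab files in raw_path.
    if not raw.is_dir() or not any(raw.rglob("*.wav")) or not any(raw.rglob("*.lab")):
        return "run prepare_align.py (after downloading the dataset)"
    # Step 3: alignments from MFA (assumed location: under preprocessed_path).
    if not pre.is_dir() or not any(pre.rglob("*.TextGrid")):
        return "run MFA or download the provided *.TextGrid files"
    # Step 4: assumed to produce files other than the TextGrid alignments.
    others = [p for p in pre.rglob("*") if p.is_file() and p.suffix != ".TextGrid"]
    if not others:
        return "run preprocess.py"
    return "audio preprocessing looks complete"


# Small demonstration on a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    pre = Path(tmp) / "preprocessed"
    empty_state = next_step(raw, pre)
    raw.mkdir()
    (raw / "a.wav").touch()
    (raw / "a.lab").touch()
    aligned_state = next_step(raw, pre)

print(empty_state)
print(aligned_state)
```

If the actual layout differs, only the glob patterns and the directory passed for the TextGrid check would need to change.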

Q2.1
raw_path holds the output of prepare_align.py run on the original data: *.lab files (raw text) and *.wav files (normalized audio).
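A quick way to sanity-check raw_path is to confirm that every .wav file has a matching .lab file. The sketch below assumes the two files of an utterance share the same base name and folder; unpaired_utterances is a hypothetical helper written for this illustration.

```python
import tempfile
from pathlib import Path


def unpaired_utterances(raw_path):
    """Return utterance names that have a .wav without a .lab, or the reverse."""
    raw = Path(raw_path)
    wavs = {p.with_suffix("") for p in raw.rglob("*.wav")}
    labs = {p.with_suffix("") for p in raw.rglob("*.lab")}
    # Symmetric difference: names present in exactly one of the two sets.
    return sorted(str(p.relative_to(raw)) for p in wavs ^ labs)


# Small demonstration: "a" is complete, "b" is missing its .lab file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.wav", "a.lab", "b.wav"):
        (Path(tmp) / name).touch()
    missing = unpaired_utterances(tmp)

print(missing)
```

An empty list suggests the folder is consistent with what prepare_align.py is described as producing.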

Q2.2

No, you do not need to paste them. preprocessed_path is only related to audio processing, while "HPDubbing-how-to-get-face-and-lip" is related to video preprocessing. In "HPDubbing-how-to-get-face-and-lip", we provide examples and code showing how to extract the lip and face regions.

The processing takes some time, so we directly provide the features and have released the extracted mouth and face regions (.jpg) for the two datasets, chem and V2C. Thanks again for your reply. I will reply as soon as possible when I have time.

seohyeonShin closed this as completed May 9, 2024
