I think it is wrong code, please confirm this #8

Closed
seohyeonShin opened this issue Feb 23, 2024 · 3 comments
Comments

@seohyeonShin

First, I ran dataset.py to prepare the dataset.
Then I found that line 377 in dataset.py looks wrong,
because there is no utils/utils module in your code.
I think it should be: from utils.tools import to_device
Is this correct?

Also, I can't find the file opened here:
open("./config/LJSpeech/preprocess.yaml", "r")
Where is it, and what is it for?

@GalaxyCong
Owner

GalaxyCong commented Mar 1, 2024

Hello,

You can import the Dataset class in train.py with "from dataset import Dataset". That import is all you need, so you don't have to run the dataset.py file at all.

In other words, running dataset.py on its own is not a required preparation step.

You can ignore the open("./config/LJSpeech/preprocess.yaml", "r") line because we do not use the LJSpeech dataset.

That line comes from the original FastSpeech2 source:
https://github.com/ming024/FastSpeech2/blob/master/dataset.py
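
A minimal sketch of why the import is safe, assuming the config-reading line sits inside an `if __name__ == "__main__":` test block as it does in the upstream FastSpeech2 file linked above (this has not been checked against this repository). The module name `demo_dataset` and its contents are hypothetical stand-ins, not the real dataset.py:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for dataset.py: a class plus a guarded test block.
MODULE_SOURCE = '''
class Dataset:
    """Placeholder for the real Dataset class."""


if __name__ == "__main__":
    # Runs only with `python demo_dataset.py`, never on import.
    open("./config/LJSpeech/preprocess.yaml", "r")
'''


def import_demo_dataset():
    """Write the stand-in module to a temporary directory and import it."""
    tmp_dir = tempfile.mkdtemp()
    Path(tmp_dir, "demo_dataset.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        return importlib.import_module("demo_dataset")
    finally:
        sys.path.remove(tmp_dir)


# The import succeeds even though the LJSpeech config file does not exist,
# because the guarded block is skipped when the module is imported.
module = import_demo_dataset()
print(module.Dataset.__name__)  # prints "Dataset"
```

If the same file were executed directly, the guarded block would run and fail on the missing config file, which matches what the question describes.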

@seohyeonShin
Author

Thank you for your kind response.
Unfortunately, because I am new to this area, I am having trouble getting set up to run your code.
If you don't mind, may I ask some additional questions?
I tried running the lip2wav data you mentioned.
Also, Baidu is not accessible where I am, so I tried your Google Drive link instead, but I can't access it because I can't set the permissions. I did download the chem data first.
My questions are below.

  1. Is there a separate database for corpus in preprocess.yaml?
     1.1 How do I get the corpus? Do I just run MFA?

  2. Is raw_path where the original mp4 data (the chem YouTube videos) should be?
     --> preprocessed_path: "data/conggaoxiang/V2C/V2C_Code/example_Chem16_framelevel/chem"
     --> Should I paste the folders created by running the scripts in HPDubbing-how-to-get-face-and-lip in sequence, starting with 1_get_your_frames.py, into the preprocessed_path folder?

  • Overall, I am finding it difficult to set up the required folder structure.
    I'm sorry to ask such basic questions; this is my first time doing research in the speech field.
    Your answers would be a great help.

@GalaxyCong
Owner

Hello, sorry for the delayed response.
I'm glad to answer your questions:

Q1
Audio part:
The first step is to download the dataset.
The second step is to run prepare_align.py.
The third step is to use MFA (Montreal Forced Aligner) to get the *.TextGrid files, or to download the ones we have already processed.
The fourth step is to run preprocess.py. The preprocessed audio is then saved to the preprocessed_path you set.

Q2.1
No, raw_path is not for the original mp4 files. It holds the output of prepare_align.py on the original data, which contains *.lab (raw text) and *.wav (normalized audio) files.
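
As a rough way to sanity-check raw_path before moving on to alignment, the sketch below lists audio files without a transcript and transcripts without audio. The helper name `find_unpaired` and the per-speaker subfolder layout in the demo are assumptions made for illustration; only the *.lab / *.wav pairing comes from the answer above.

```python
import tempfile
from pathlib import Path


def find_unpaired(raw_path):
    """Return (wavs_without_lab, labs_without_wav) as sorted relative stems."""
    root = Path(raw_path)
    # Compare files by their path relative to raw_path, minus the extension.
    wavs = {p.relative_to(root).with_suffix("").as_posix() for p in root.rglob("*.wav")}
    labs = {p.relative_to(root).with_suffix("").as_posix() for p in root.rglob("*.lab")}
    return sorted(wavs - labs), sorted(labs - wavs)


# Demo on a throwaway directory (hypothetical layout: one folder per speaker).
with tempfile.TemporaryDirectory() as demo_root:
    speaker = Path(demo_root, "speaker_a")
    speaker.mkdir()
    (speaker / "clip_001.wav").touch()
    (speaker / "clip_001.lab").write_text("hello world")
    (speaker / "clip_002.wav").touch()  # transcript deliberately missing
    missing_lab, missing_wav = find_unpaired(demo_root)

print(missing_lab, missing_wav)  # ['speaker_a/clip_002'] []
```

Pointing `find_unpaired` at your own raw_path should return two empty lists if every clip has both files.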

Q2.2

No, you do not need to paste them. preprocessed_path is only used for audio processing, whereas "HPDubbing-how-to-get-face-and-lip" covers video preprocessing. In "HPDubbing-how-to-get-face-and-lip", we provide examples and code showing how to extract the lip and face regions.

The processing pipeline takes some time to run, so we provide the extracted features directly and have released the extracted mouth and face regions (.jpg) for both datasets, chem and V2C. Thanks again for your reply; I will respond as soon as I can.
