
About training time: it takes two hours to complete 4 epochs #10

Open
sumorday opened this issue Mar 11, 2024 · 7 comments

Comments

@sumorday

sumorday commented Mar 11, 2024

Is this speed normal? If it takes two hours to complete 4 epochs, then it would take nearly 250 hours, or about 11 days, to complete a total of 500 epochs. Is this normal for the CelebA (256x256) dataset (celeb256_dit.txt)?

[Screenshot 2024-03-11 19:09:12]
@quandao10
Collaborator

If you are using the DiT architecture, you should install torch>=2.0; flash attention will let you train faster but will sacrifice some performance. Alternatively, you could run the encoder on the training data to get latent data before training LFM, which lets you use a larger batch size. Hope these tricks help you.
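
For concreteness, a minimal sketch of the latent-caching idea could look like the following. This is not from the repo: it assumes the diffusers AutoencoderKL (stabilityai/sd-vae-ft-mse) as a stand-in for the repo's first-stage model, and the loader name is hypothetical.

```python
# Minimal sketch (not from the repo) of caching latents with a pretrained autoencoder,
# using the diffusers AutoencoderKL as a stand-in for the repo's first-stage model.
import torch
from diffusers import AutoencoderKL

device = "cuda" if torch.cuda.is_available() else "cpu"
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to(device).eval()

@torch.no_grad()
def encode_batch(images):
    # images: float tensor in [-1, 1], shape (B, 3, 256, 256)
    latents = vae.encode(images.to(device)).latent_dist.sample()
    return latents * 0.18215  # the usual SD VAE scaling factor

# Example usage with a hypothetical CelebA-HQ 256 loader:
# all_latents = torch.cat([encode_batch(x).cpu() for x, _ in celeba_loader])
# torch.save(all_latents, "celeba256_latents.pt")
```

Training on cached latents avoids re-running the encoder every step, which is what frees up memory for a larger batch size.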

@sumorday
Author

If you are using the DiT architecture, you should install torch>=2.0; flash attention will let you train faster but will sacrifice some performance. Alternatively, you could run the encoder on the training data to get latent data before training LFM, which lets you use a larger batch size. Hope these tricks help you.

Thank you for the response. I will try adjusting the torch version (I'm mainly concerned about compatibility issues).
As for running the encoder, hasn't the autoencoder already been used in train_flow_latent.py? Or do I need to configure something to run the encoder on the training data?
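
Before switching versions, a quick check like the one below (a sketch, not from the repo) can confirm whether the installed torch actually exposes the fused attention path that flash attention relies on:

```python
# Sanity check that torch >= 2.0 and the fused scaled-dot-product attention
# (flash attention) path is available on this build.
import torch

print(torch.__version__)
assert hasattr(torch.nn.functional, "scaled_dot_product_attention"), \
    "torch >= 2.0 is needed for the fused attention path"

if torch.cuda.is_available():
    # Reports whether the flash-attention kernel is enabled for SDPA.
    print("flash SDP enabled:", torch.backends.cuda.flash_sdp_enabled())
```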

@sumorday
Author

If you are using the DiT architecture, you should install torch>=2.0; flash attention will let you train faster but will sacrifice some performance. Alternatively, you could run the encoder on the training data to get latent data before training LFM, which lets you use a larger batch size. Hope these tricks help you.

[Screenshots 2024-03-13 10:55:59 and 10:56:13]
Is the "--f" flag here automatically using an autoencoder?

@sumorday
Author

[Screenshot 2024-03-15 13:45:06]
Is this it? Do I change false to true here?

@sumorday
Author

@hao-pt Hello, Quandao suggested that I enable the encoder to speed up training. I would like to ask: can I restart training of the encoder-decoder model from scratch here by changing the value from false to true for the first_stage_model_train? Looking forward to your answer. (It seems that I don't need a pretrained model, because I am using different datasets and have added other methods.)

@hao-pt
Collaborator

hao-pt commented Mar 20, 2024

The pretrained autoencoder is used to improve the training efficiency and performance of the model. Hence, the first_stage_model here runs only in inference mode, without further training. If you want to train an autoencoder from scratch, that is outside our focus. As long as your dataset follows statistics similar to natural images, there is no need to retrain the autoencoder.
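
In code, this "frozen, inference-only" use of the first stage typically looks like the sketch below (illustrative only; `first_stage_model` and its `encode` method stand in for whatever train_flow_latent.py actually uses, which presumably already does the equivalent):

```python
# Sketch of the usual inference-only first-stage pattern: freeze the autoencoder
# and encode images to latents without building a graph through it.
import torch

first_stage_model.eval()                 # no dropout / norm-statistics updates
first_stage_model.requires_grad_(False)  # exclude from the optimizer

@torch.no_grad()
def to_latent(x):
    # Encode a batch of images to latents for the flow-matching model.
    return first_stage_model.encode(x)
```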

@sumorday
Author

run the encoder on the training data to get latent data before training LFM

Thank you. Because I saw another comment saying "run the encoder on the training data to get latent data before training LFM," I'm curious whether I need to set up a separate encoder. So as long as the code being run is !bash ./bash_scripts/run.sh ./test_args/celeb256_dit.txt, it will be the correct Flow Matching in Latent Space, right?

