
How many training steps are required to achieve the effect in the sample? #15

Closed
arceus-jia opened this issue Feb 7, 2023 · 7 comments


@arceus-jia

I tried training for 100,000 steps, but the results still look strange. Is this normal?
[attached image: sample-100000]
Can you tell me how many steps are needed to get the expected result? Thank you!

@zhangjiewu
Collaborator

The results look weird, as if the model were not trained at all. It usually takes 300–500 steps to train on an 8-frame video. Can you provide more info (e.g., environment, code snippets) so I can look into this issue?

@arceus-jia
Author

Well, I'm not sure if it was an xformers version conflict, but after I reinstalled the environment, upgraded torch to 1.13.1 and torchvision to 0.14.1, and installed the latest xformers version, retraining now gives the expected result.
Anyway, thank you!
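
For reference, the reinstall above boils down to roughly the following commands (a sketch, not my exact history; note that the xformers build I ended up with is a 0.0.17 pre-release, so a plain pip install may give you an older stable build):

pip install --upgrade torch==1.13.1 torchvision==0.14.1   # default PyPI wheels, which bundle the CUDA 11.7 libraries
pip install -U --pre xformers                              # --pre allows a 0.0.17 dev build
python -m xformers.info                                    # sanity-check the resulting install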

@zhangjiewu
Collaborator

Glad to hear that. Let me know if you have any other questions. :)

@liangbingzhao

Can you share your results after running python -m xformers.info? I set up a new virtual environment with torch 1.13 (cu117) and torchvision 0.14, but after installing xformers with pip install -U xformers, the triton module was not installed. I ran pip install triton to install it, but the results from this repo still look like yours. Any idea how to fix this?
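
A few quick checks that cover the pieces mentioned above (a minimal sketch; nothing here is specific to this repo):

python -m xformers.info                                                                              # prints the xformers build and which features/kernels are available
python -c "import triton; print(triton.__version__)"                                                 # confirms triton actually imports
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"    # torch build, CUDA version, GPU visibility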

@arceus-jia
Author

> Can you share your results after running python -m xformers.info? I set up a new virtual environment with torch 1.13 (cu117) and torchvision 0.14, but after installing xformers with pip install -U xformers, the triton module was not installed. I ran pip install triton to install it, but the results from this repo still look like yours. Any idea how to fix this?

Here is my environment; you can refer to it and compare it with yours:

absl-py==1.4.0
accelerate==0.16.0
antlr4-python3-runtime==4.9.3
bitsandbytes==0.35.4
cachetools==5.3.0
certifi @ file:///croot/certifi_1671487769961/work/certifi
cffi @ file:///tmp/abs_98z5h56wf8/croots/recipe/cffi_1659598650955/work
charset-normalizer==3.0.1
decord==0.6.0
diffusers==0.11.1
einops==0.6.0
filelock==3.9.0
flit_core @ file:///opt/conda/conda-bld/flit-core_1644941570762/work/source/flit_core
ftfy==6.1.1
future @ file:///home/builder/ci_310/future_1640790123501/work
google-auth==2.16.0
google-auth-oauthlib==0.4.6
grpcio==1.51.1
huggingface-hub==0.12.0
idna==3.4
imageio==2.25.0
importlib-metadata==6.0.0
Jinja2==3.1.2
Markdown==3.4.1
MarkupSafe==2.1.2
mkl-fft==1.3.1
mkl-random @ file:///home/builder/ci_310/mkl_random_1641843545607/work
mkl-service==2.4.0
modelcards==0.1.6
mypy-extensions==1.0.0
numpy @ file:///croot/numpy_and_numpy_base_1672336185480/work
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
oauthlib==3.2.2
omegaconf==2.3.0
packaging==23.0
Pillow==9.4.0
protobuf==3.20.3
psutil==5.9.4
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pyre-extensions==0.0.23
PyYAML @ file:///croot/pyyaml_1670514731622/work
regex==2022.10.31
requests==2.28.2
requests-oauthlib==1.3.1
rsa==4.9
six @ file:///tmp/build/80754af9/six_1644875935023/work
tensorboard==2.11.2
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tokenizers==0.13.2
torch==1.13.1
torchvision==0.14.1
tqdm==4.64.1
transformers==4.26.0
typing-inspect==0.8.0
typing_extensions @ file:///croot/typing_extensions_1669924550328/work
urllib3==1.26.14
wcwidth==0.2.6
Werkzeug==2.2.2
xformers==0.0.17.dev444
zipp==3.12.1
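
If it helps, one simple way to compare your own environment against this list (a sketch; the file names are arbitrary):

pip freeze > my_env.txt          # snapshot your current environment
diff my_env.txt working_env.txt  # working_env.txt = the list above saved to a file; pay particular attention to torch, torchvision, xformers, and diffusers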

@liangbingzhao

Thank you for your response. I upgraded xformers from 0.0.16 to 0.0.17. The upgraded model now generates the following:

[attached image: sample-300]

This seems better, but there are still many inconsistencies.

@arceus-jia
Author

arceus-jia commented Feb 20, 2023

> This seems better, but there are still many inconsistencies.

Yep, that means the training was successful. In fact, the sample given by the author is similar to this one. The author mainly provides an idea for AI-generated animation with a diffusion model; if you want to productize it, it still needs a lot of improvement.
