Where is the link of pretrained ckpt? #5

Open
LeopoldACC opened this issue Dec 18, 2022 · 2 comments
Comments

@LeopoldACC

LeopoldACC commented Dec 18, 2022

Hi, @Philipflyg. Thanks for sharing your work, but there is no link in the pretrained ckpt table. I tried to train the ssc_pretrain task, which needs more than 10 hours per epoch, so reproducing the whole training pipeline would take about 1000 hours. Could you give me a link to download the pretrained ckpt?

I misunderstood the progress bar; the whole training pipeline may take about 11 hours in total.

@LeopoldACC
Author

After training the ssc_pretrain task, I tried to run the train task and got the error shown below.
My environment is:

cuda 11.1
torch 1.9.0
[W CUDAGuardImpl.h:112] Warning: CUDA warning: an illegal memory access was encountered (function destroyEvent)
terminate called after throwing an instance of 'c10::CUDAError'
  what():  CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:1055 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f24650eba22 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x10aa3 (0x7f246534caa3 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a7 (0x7f246534e147 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7f24650d55a4 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #4: std::vector<c10d::Reducer::Bucket, std::allocator<c10d::Reducer::Bucket> >::~vector() + 0x2f9 (0x7f250a2515a9 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: c10d::Reducer::~Reducer() + 0x276 (0x7f250a247fd6 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #6: std::_Sp_counted_ptr<c10d::Reducer*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x12 (0x7f250a277c92 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #7: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x46 (0x7f25099c19d6 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #8: std::_Sp_counted_ptr<c10d::Logger*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x1d (0x7f250a27c9ad in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #9: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x46 (0x7f25099c19d6 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0xdaf48f (0x7f250a27a48f in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #11: <unknown function> + 0x4ff598 (0x7f25099ca598 in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #12: <unknown function> + 0x50089e (0x7f25099cb89e in /data/packages/anaconda3/envs/sisc/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #13: /data/packages/anaconda3/envs/sisc/bin/python() [0x4acafd]
frame #14: /data/packages/anaconda3/envs/sisc/bin/python() [0x4d5894]
frame #15: /data/packages/anaconda3/envs/sisc/bin/python() [0x4bbc68]
frame #16: /data/packages/anaconda3/envs/sisc/bin/python() [0x4d05bb]
frame #17: /data/packages/anaconda3/envs/sisc/bin/python() [0x4d05d1]
frame #18: /data/packages/anaconda3/envs/sisc/bin/python() [0x4d05d1]
frame #19: /data/packages/anaconda3/envs/sisc/bin/python() [0x4a1947]
frame #20: PyDict_SetItemString + 0x8b (0x4a529b in /data/packages/anaconda3/envs/sisc/bin/python)
frame #21: PyImport_Cleanup + 0x89 (0x5745b9 in /data/packages/anaconda3/envs/sisc/bin/python)
frame #22: Py_FinalizeEx + 0x67 (0x5702f7 in /data/packages/anaconda3/envs/sisc/bin/python)
frame #23: /data/packages/anaconda3/envs/sisc/bin/python() [0x5449b9]
frame #24: _Py_UnixMain + 0x3c (0x54487c in /data/packages/anaconda3/envs/sisc/bin/python)
frame #25: __libc_start_main + 0xe7 (0x7f251179fc87 in /lib/x86_64-linux-gnu/libc.so.6)
frame #26: /data/packages/anaconda3/envs/sisc/bin/python() [0x54472e]
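
As the trace itself notes, CUDA kernel errors can be reported asynchronously, so the frames above (all in teardown code) may not point at the kernel that actually failed. A minimal sketch of following the `CUDA_LAUNCH_BLOCKING=1` hint from Python is below. It assumes the variable is set at the very top of the training entry script, before `torch` is imported; the helper name `enable_blocking_cuda_launches` is made up for illustration and is not part of this repo.

```python
import os


def enable_blocking_cuda_launches() -> str:
    """Ask CUDA to run kernels synchronously so errors surface at the failing call.

    Hypothetical helper: it only sets the environment variable named in the
    error message. Call it before the CUDA context is created.
    """
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return os.environ["CUDA_LAUNCH_BLOCKING"]


# Set the variable first, then import torch and start training afterwards.
enable_blocking_cuda_launches()
print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

With blocking launches the run is slower, but the Python traceback should then stop at the operation that triggered the illegal memory access rather than at interpreter shutdown.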

@Philipflyg
Collaborator

Hi, sorry for the late reply. I don't know why the ckpt link does not work on the repo's front page. You can find the ckpt link in the raw content of the README file.
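
If the link does not render on the page, one way to pull it out of the raw README text is to scan for URLs directly. The snippet below is only a sketch: the `sample_readme` string and the `example.com` addresses are placeholders invented for illustration, not the actual content of this repo's README, and `extract_links` is a hypothetical helper name.

```python
import re
from typing import List

# Matches http(s) URLs and stops at whitespace or common closing delimiters.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def extract_links(raw_text: str) -> List[str]:
    """Return every http(s) URL found in raw markdown/HTML text, in order."""
    return URL_PATTERN.findall(raw_text)


# Placeholder text standing in for the raw README (not the real file).
sample_readme = (
    "| model | [ckpt](https://example.com/model.ckpt) |\n"
    '<a href="https://example.com/other.pth">other</a>\n'
)

for url in extract_links(sample_readme):
    print(url)
```

To use it on the real file, read the raw README into a string (for example with `open("README.md").read()` in a local clone) and pass that to `extract_links`.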
