Hey, help when running the first sample test #38

Closed
cyrstem opened this issue May 5, 2020 · 2 comments

cyrstem commented May 5, 2020

▶ python jukebox/sample.py --model=5b_lyrics --name=sample_5b --levels=3 --sample_length_in_seconds=20 --total_sample_length_in_seconds=180 --sr=44100 --n_samples=6 --hop_fraction=0.5,0.5,0.125

I get this error:
`
Using cuda True
{'name': 'sample_5b', 'levels': 3, 'sample_length_in_seconds': 20, 'total_sample_length_in_seconds': 180, 'sr': 44100, 'n_samples': 6, 'hop_fraction': (0.5, 0.5, 0.125)}
Setting sample length to 881920 (i.e. 19.998185941043083 seconds) to be multiple of 128
Downloading from gce
Restored from /home/jacos/.cache/jukebox-assets/models/5b/vqvae.pth.tar
0: Loading vqvae in eval mode
Conditioning on 1 above level(s)
Checkpointing convs
Checkpointing convs
Loading artist IDs from /home/jacos/jukebox/jukebox/data/ids/v2_artist_ids.txt
Loading artist IDs from /home/jacos/jukebox/jukebox/data/ids/v2_genre_ids.txt
Level:0, Cond downsample:4, Raw to tokens:8, Sample length:65536
Downloading from gce
Traceback (most recent call last):
  File "jukebox/sample.py", line 237, in <module>
    fire.Fire(run)
  File "/home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "jukebox/sample.py", line 234, in run
    save_samples(model, device, hps, sample_hps)
  File "jukebox/sample.py", line 157, in save_samples
    vqvae, priors = make_model(model, device, hps)
  File "/home/jacos/jukebox/jukebox/make_models.py", line 185, in make_model
    priors = [make_prior(setup_hparams(priors[level], dict()), vqvae, 'cpu') for level in levels]
  File "/home/jacos/jukebox/jukebox/make_models.py", line 185, in <listcomp>
    priors = [make_prior(setup_hparams(priors[level], dict()), vqvae, 'cpu') for level in levels]
  File "/home/jacos/jukebox/jukebox/make_models.py", line 169, in make_prior
    restore(hps, prior, hps.restore_prior)
  File "/home/jacos/jukebox/jukebox/make_models.py", line 54, in restore
    checkpoint = load_checkpoint(checkpoint_path)
  File "/home/jacos/jukebox/jukebox/make_models.py", line 37, in load_checkpoint
    checkpoint = t.load(restore, map_location=t.device('cpu'))
  File "/home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/serialization.py", line 529, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/serialization.py", line 709, in _legacy_load
    deserialized_objects[key].set_from_file(f, offset, f_should_read_directly)
RuntimeError: unexpected EOF, expected 113540 more bytes. The file might be corrupted.
terminate called after throwing an instance of 'c10::Error'
  what(): owning_ptr == NullType::singleton() || owning_ptr->refcount_.load() > 0 INTERNAL ASSERT FAILED at /opt/conda/conda-bld/pytorch_1579040055865/work/c10/util/intrusive_ptr.h:348, please report a bug to PyTorch. intrusive_ptr: Can only intrusive_ptr::reclaim() owning pointers that were created using intrusive_ptr::release(). (reclaim at /opt/conda/conda-bld/pytorch_1579040055865/work/c10/util/intrusive_ptr.h:348)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7fbd602ab627 in /home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x14879df (0x7fbd6345d9df in /home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #2: THStorage_free + 0x17 (0x7fbd63c25fe7 in /home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x563a9d (0x7fbd915f3a9d in /home/jacos/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

frame #27: __libc_start_main + 0xf3 (0x7fbd9fb34153 in /usr/lib/libc.so.6)

[1] 30984 abort (core dumped) python jukebox/sample.py --model=5b_lyrics --name=sample_5b --levels=3
`


ZVK commented May 5, 2020

Duplicate of #30 issue-611515705

Make sure the model was fully downloaded from GCE to ~/.cache/jukebox-assets/ (the directory shown in the "Restored from" line of your log).
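One way to check for a partial download is to compare each cached file's size on disk with the size it should have. The snippet below is only a sketch: `find_truncated` is a hypothetical helper, not part of Jukebox, and the expected sizes are something you would have to look up yourself (for example from the remote listing). None are given in this thread.

```python
import os


def find_truncated(cache_dir, expected_sizes):
    """Return (relative_path, actual, expected) for each file whose size is wrong.

    expected_sizes maps a path relative to cache_dir to its full size in bytes.
    A missing file is reported with an actual size of None.
    """
    problems = []
    for rel_path, expected in sorted(expected_sizes.items()):
        full_path = os.path.join(cache_dir, rel_path)
        actual = os.path.getsize(full_path) if os.path.isfile(full_path) else None
        if actual != expected:
            problems.append((rel_path, actual, expected))
    return problems


# Example call; expected_bytes is a placeholder you would supply yourself.
# find_truncated(os.path.expanduser("~/.cache/jukebox-assets"),
#                {"models/5b/vqvae.pth.tar": expected_bytes})
```

Any file it reports is a candidate for deletion and redownload.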

@prafullasd
Collaborator

Yes, sometimes the download fails silently (we used the quiet flag in wget, so it doesn't show you an error). That leaves a truncated checkpoint file in the cache, which causes restoring the checkpoint to fail. You'll need to redownload.
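If you redownload by hand, it helps to use something that fails loudly on a short transfer instead of leaving a partial file behind. The following is a minimal sketch using only the Python standard library, not the code Jukebox itself uses. `download_checked` is a hypothetical name, and you would need to supply the checkpoint URL yourself. It assumes the server sends a `Content-Length` header; if it does not, the size check is skipped.

```python
import os
import shutil
import tempfile
import urllib.request


def download_checked(url, dest, opener=urllib.request.urlopen):
    """Download url to dest, raising IOError if the transfer is truncated."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    os.makedirs(dest_dir, exist_ok=True)
    # Write to a temporary file first so a failed transfer never leaves
    # a partial file at the final path.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as out, opener(url) as resp:
            expected = resp.headers.get("Content-Length")
            shutil.copyfileobj(resp, out)
        written = os.path.getsize(tmp_path)
        if expected is not None and written != int(expected):
            raise IOError(f"truncated download: got {written} of {expected} bytes")
        # Only move the file into place once its size has been verified.
        os.replace(tmp_path, dest)
    finally:
        # Clean up the temporary file if it was not moved into place.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return dest
```

The `opener` parameter exists only so the function can be exercised without a network connection. In normal use you would leave it at its default.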
