Corrupted 1b_lyrics checkpoint? #25

Desm0nt · 2020-05-02T19:47:06Z

I have the same issue on both my local machine (Ubuntu 20.04, 1080 Ti, Anaconda, Python 3.7, everything installed as in the README) and Google Colab.

It happens when I fetch the checkpoint for the 1b_lyrics model and try to start sampling:

(jukebox) desm0nt@desm0nt-linux:~/jukebox$ python jukebox/sample.py --model=1b_lyrics --name=sample_1b --levels=3 --sample_length_in_seconds=20 --total_sample_length_in_seconds=180 --sr=44100 --n_samples=4 --hop_fraction=0.5,0.5,0.125
Using cuda True
{'name': 'sample_1b', 'levels': 3, 'sample_length_in_seconds': 20, 'total_sample_length_in_seconds': 180, 'sr': 44100, 'n_samples': 4, 'hop_fraction': (0.5, 0.5, 0.125)}
Setting sample length to 881920 (i.e. 19.998185941043083 seconds) to be multiple of 128
Downloading from gce
Restored from /home/desm0nt/.cache/jukebox-assets/models/5b/vqvae.pth.tar
0: Loading vqvae in eval mode
Conditioning on 1 above level(s)
Checkpointing convs
Checkpointing convs
Loading artist IDs from /home/desm0nt/jukebox/jukebox/data/ids/v2_artist_ids.txt
Loading artist IDs from /home/desm0nt/jukebox/jukebox/data/ids/v2_genre_ids.txt
Level:0, Cond downsample:4, Raw to tokens:8, Sample length:65536
Downloading from gce
Restored from /home/desm0nt/.cache/jukebox-assets/models/5b/prior_level_0.pth.tar
0: Loading prior in eval mode
Conditioning on 1 above level(s)
Checkpointing convs
Checkpointing convs
Loading artist IDs from /home/desm0nt/jukebox/jukebox/data/ids/v2_artist_ids.txt
Loading artist IDs from /home/desm0nt/jukebox/jukebox/data/ids/v2_genre_ids.txt
Level:1, Cond downsample:4, Raw to tokens:32, Sample length:262144
Downloading from gce
Restored from /home/desm0nt/.cache/jukebox-assets/models/5b/prior_level_1.pth.tar
0: Loading prior in eval mode
Creating cond. autoregress with prior bins [79, 2048], 
dims [384, 6144], 
shift [ 0 79]
input shape 6528
input bins 2127
Self copy is False
Loading artist IDs from /home/desm0nt/jukebox/jukebox/data/ids/v3_artist_ids.txt
Loading artist IDs from /home/desm0nt/jukebox/jukebox/data/ids/v3_genre_ids.txt
Level:2, Cond downsample:None, Raw to tokens:128, Sample length:786432
Downloading from gce
Traceback (most recent call last):
  File "jukebox/sample.py", line 237, in <module>
    fire.Fire(run)
  File "/home/desm0nt/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/desm0nt/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/desm0nt/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "jukebox/sample.py", line 234, in run
    save_samples(model, device, hps, sample_hps)
  File "jukebox/sample.py", line 157, in save_samples
    vqvae, priors = make_model(model, device, hps)
  File "/home/desm0nt/jukebox/jukebox/make_models.py", line 185, in make_model
    priors = [make_prior(setup_hparams(priors[level], dict()), vqvae, 'cpu') for level in levels]
  File "/home/desm0nt/jukebox/jukebox/make_models.py", line 185, in <listcomp>
    priors = [make_prior(setup_hparams(priors[level], dict()), vqvae, 'cpu') for level in levels]
  File "/home/desm0nt/jukebox/jukebox/make_models.py", line 169, in make_prior
    restore(hps, prior, hps.restore_prior)
  File "/home/desm0nt/jukebox/jukebox/make_models.py", line 54, in restore
    checkpoint = load_checkpoint(checkpoint_path)
  File "/home/desm0nt/jukebox/jukebox/make_models.py", line 37, in load_checkpoint
    checkpoint = t.load(restore, map_location=t.device('cpu'))
  File "/home/desm0nt/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/serialization.py", line 529, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/desm0nt/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/serialization.py", line 709, in _legacy_load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: unexpected EOF, expected 61312207 more bytes. The file might be corrupted.
corrupted double-linked list
Aborted (core dumped)


Jovonni · 2020-05-02T21:41:42Z

@Desm0nt can you post the solution before you close it?

Desm0nt · 2020-05-03T09:39:29Z

@Jovonni On my local machine I simply ran out of free space on the /home drive, which aborted the download. I still don't know what causes the problem in Colab or how to fix it.
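A minimal sketch of checking free space before starting a download, using only the Python standard library. The helper name `free_gib` is made up for illustration, and this thread does not state how much space the checkpoints actually need, so the number it prints has to be judged against the download yourself.

```python
import shutil
from pathlib import Path


def free_gib(path):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30


# The log above shows the cache under ~/.cache/jukebox-assets,
# so the home directory's filesystem is the one that matters here.
print(f"Free space under {Path.home()}: {free_gib(Path.home()):.1f} GiB")
```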

ssrp · 2020-05-04T10:21:42Z

@Desm0nt @Jovonni Sorry, I still couldn't fix this problem. Does anybody have a solution? I have 50 GB of free space on the system, 128 GB of CPU RAM and 32 GB of GPU memory.

apeguero1 · 2020-05-04T17:35:15Z

I had this problem after prematurely exiting a sampling execution before the prior models could finish downloading. You might have a truncated prior model file already saved in your cache folder.

Try clearing the cache found at /root/.cache/jukebox-assets (in Google Colab). For the 5b model I had to delete /root/.cache/jukebox-assets/models/5b/prior_level_0.pth.tar so that a fresh download would start instead of the existing truncated file being read.
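The deletion step can be sketched in Python as follows. `remove_cached_file` is a hypothetical helper, not part of jukebox. The example call is left commented out and uses the Colab path mentioned above; on other machines the cache root is under your own home directory, as in the log in the first post.

```python
from pathlib import Path


def remove_cached_file(cache_root, relative_name):
    """Delete one cached file if it exists; return True if a file was deleted."""
    target = Path(cache_root) / relative_name
    if target.is_file():
        target.unlink()
        return True
    return False


# Example with the Colab path from this comment (uncomment to run):
# remove_cached_file("/root/.cache/jukebox-assets", "models/5b/prior_level_0.pth.tar")
```

After the file is gone, rerunning the sampling command should trigger a fresh download of that checkpoint.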

ssrp · 2020-05-07T08:09:33Z

@apeguero1 Thank you for the prompt response -- it works now! :)

LeapGamer · 2020-06-21T06:48:40Z

I am getting this error when following the main instructions:

(jukebox) C:\Users\james\jukebox>python jukebox/sample.py --model=5b_lyrics --name=sample_5b --levels=3 --sample_length_in_seconds=20 --total_sample_length_in_seconds=180 --sr=44100 --n_samples=6 --hop_fraction=0.5,0.5,0.125
C:\Users\james\Anaconda3\envs\jukebox\lib\site-packages\librosa\util\decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location.
Import requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
C:\Users\james\Anaconda3\envs\jukebox\lib\site-packages\librosa\util\decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location.
Import of 'jit' requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
Using cuda True
{'name': 'sample_5b', 'levels': 3, 'sample_length_in_seconds': 20, 'total_sample_length_in_seconds': 180, 'sr': 44100, 'n_samples': 6, 'hop_fraction': (0.5, 0.5, 0.125)}
Setting sample length to 881920 (i.e. 19.998185941043083 seconds) to be multiple of 128
Downloading from gce
Traceback (most recent call last):
  File "jukebox/sample.py", line 279, in <module>
    fire.Fire(run)
  File "C:\Users\james\Anaconda3\envs\jukebox\lib\site-packages\fire\core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "C:\Users\james\Anaconda3\envs\jukebox\lib\site-packages\fire\core.py", line 366, in _Fire
    component, remaining_args)
  File "C:\Users\james\Anaconda3\envs\jukebox\lib\site-packages\fire\core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "jukebox/sample.py", line 276, in run
    save_samples(model, device, hps, sample_hps)
  File "jukebox/sample.py", line 181, in save_samples
    vqvae, priors = make_model(model, device, hps)
  File "c:\users\james\jukebox\jukebox\make_models.py", line 191, in make_model
    vqvae = make_vqvae(setup_hparams(vqvae, dict(sample_length=hps.get('sample_length', 0), sample_length_in_seconds=hps.get('sample_length_in_seconds', 0))), device)
  File "c:\users\james\jukebox\jukebox\make_models.py", line 95, in make_vqvae
    restore_model(hps, vqvae, hps.restore_vqvae)
  File "c:\users\james\jukebox\jukebox\make_models.py", line 55, in restore_model
    checkpoint = load_checkpoint(checkpoint_path)
  File "c:\users\james\jukebox\jukebox\make_models.py", line 37, in load_checkpoint
    checkpoint = t.load(restore, map_location=t.device('cpu'))
  File "C:\Users\james\Anaconda3\envs\jukebox\lib\site-packages\torch\serialization.py", line 386, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\james\Anaconda3\envs\jukebox\lib\site-packages\torch\serialization.py", line 563, in _load
    magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input

I have deleted the cache and still get it. I have 30 GB of free space. Any ideas?

LeapGamer · 2020-06-21T23:35:25Z

I am on Windows, and my vqvae.pth.tar is 0 KB. This happens with both the 5b and 1b models.
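A minimal sketch for spotting zero-byte files like this one in the cache, assuming the cache lives at `.cache/jukebox-assets` under the user's home directory as the paths in this thread suggest. `find_empty_files` is a hypothetical helper. It only reports that a download wrote no data and says nothing about why that happens on Windows.

```python
from pathlib import Path


def find_empty_files(cache_root):
    """Return a sorted list of zero-byte files found anywhere below `cache_root`."""
    root = Path(cache_root)
    if not root.is_dir():
        return []  # nothing cached yet, so nothing to report
    return sorted(p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0)


# Assumed cache location, taken from the paths quoted in this thread.
cache = Path.home() / ".cache" / "jukebox-assets"
for path in find_empty_files(cache):
    print("empty file:", path)
```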

NoiseGener8r · 2020-07-03T18:07:56Z

I have the same issue. I've tried deleting the file found at C:\Users\rfnoi\.cache\jukebox-assets\models\5b\vqvae.pth.tar, but it gets re-created and the run crashes with the same error: EOFError: Ran out of input.

Edit: I have also attempted this with the 1b model. Nothing new is found in C:\Users\rfnoi\.cache\jukebox-assets\models\. Should I expect a /1b/ directory?

TheLionArye · 2020-09-29T20:44:45Z

> I have the same issue. I've tried deleting the file found at C:\Users\rfnoi\.cache\jukebox-assets\models\5b\vqvae.pth.tar, but it re-creates it and crashes with the same error EOFError: Ran out of input.
>
> E: I have also attempted this with the 1b model. Nothing new is found in C:\Users\rfnoi\.cache\jukebox-assets\models\. Should I expect a /1b/ directory?

Did you get an answer? I'm having this problem too.

mwcm · 2021-04-16T03:54:39Z

Same issue here, Windows 10.

cicinwad · 2022-03-31T17:39:42Z

> same issue here, WIndows 10

No issue here, Windows 8.1.

Desm0nt closed this as completed May 2, 2020
