
Cannot finish tasks in Colab because runtime crashes due to low RAM. #57

Closed · ritog opened this issue Jun 22, 2023 · 4 comments · Fixed by #108

@ritog (Contributor) commented Jun 22, 2023

In the setup section of this course, it says:

Google Colab for hands-on exercises. The free version is enough.

But in the section where the feature extractor is applied to the music dataset, the Colab runtime crashes, reporting that it ran out of RAM.

[screenshot: Colab crash message reporting the session ended due to low RAM]

What could be a possible workaround?

@MKhalusova (Contributor) commented:

@sanchit-gandhi Can you please take a look?

@sanchit-gandhi (Collaborator) commented:

Thanks for flagging, @ritog! There are a few tricks we can employ to try and get this working with lower RAM (I'm fairly confident it's just a case of tweaking the `.map` hyper-parameters to get this to work on a free Google Colab).

Could you please try reducing two parameters?

  • `batch_size`: defaults to 1000; let's try setting this to 100, and if that doesn't work, reduce it by a factor of 2 again to 50
  • `writer_batch_size`: defaults to 1000; let's try setting this to 500, and if that doesn't work, reduce it by a factor of 2 to 250

Using a combination of the two should work best here, so I would try `batch_size=100, writer_batch_size=500`, and if that doesn't work, `batch_size=50, writer_batch_size=500`:

```python
gtzan_encoded = gtzan.map(
    preprocess_function,
    remove_columns=["audio", "file"],
    batched=True,
    num_proc=1,
    batch_size=100,
    writer_batch_size=500,
)
```
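For context (not from the original comment): `batch_size` controls how many examples are passed to `preprocess_function` per call, while `writer_batch_size` controls how many processed rows 🤗 Datasets buffers in memory before flushing them to the on-disk Arrow cache, so lowering either reduces peak RAM. Below is a minimal sketch for watching RAM headroom around the call, assuming `gtzan` and `preprocess_function` are defined as in the course unit (`psutil` ships preinstalled on Colab):

```python
import psutil

def free_ram_gib() -> float:
    """Currently available system RAM in GiB."""
    return psutil.virtual_memory().available / 1024**3

print(f"free RAM before map: {free_ram_gib():.2f} GiB")

# `gtzan` and `preprocess_function` are assumed from the course unit
gtzan_encoded = gtzan.map(
    preprocess_function,
    remove_columns=["audio", "file"],
    batched=True,
    num_proc=1,
    batch_size=100,        # examples handed to preprocess_function per call
    writer_batch_size=500, # rows buffered before flushing to the Arrow cache
)

print(f"free RAM after map: {free_ram_gib():.2f} GiB")
```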

@sanchit-gandhi (Collaborator) commented:

Hey @ritog - wondering if you had any luck here? I'd be interested in hearing whether you found a configuration that worked for the `.map` method. If so, I can update the Unit to use your configs. Otherwise, we'll have to find a different workaround!

@MHRDYN7 (Contributor) commented Jul 15, 2023

@sanchit-gandhi
It works fine with `batch_size=100`; no need to change `writer_batch_size`. You may also want to update the output pointed out in #95.
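For reference, a minimal sketch of the configuration reported to work here, with `writer_batch_size` left at its default of 1000 (again assuming `gtzan` and `preprocess_function` from the course unit):

```python
# Configuration reported to work on a free Colab instance in this thread:
# only batch_size is lowered; writer_batch_size stays at its default (1000).
gtzan_encoded = gtzan.map(
    preprocess_function,
    remove_columns=["audio", "file"],
    batched=True,
    num_proc=1,
    batch_size=100,
)
```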
