
Can't run this code on low-spec equipment #6

Closed
ChloeWongxt opened this issue Jul 21, 2020 · 5 comments

@ChloeWongxt

Hi, thank you for releasing your code. I am currently trying to reproduce the pre-training experiment results on E2E.

However, I can't run this code because my PC's configuration is too low.

Is there any way to deal with this?

Maybe by lowering the experiment's parameters? I have tried lowering the batch size to 16 several times, but it still doesn't work. I have no idea how to deal with it.

How can I run this code on the following equipment?

I am looking forward to your reply. Thank you very much.

My PC's system parameters:
[screenshot of system specifications]

@qibinc
Collaborator

qibinc commented Jul 21, 2020

Hi @ChloeWongxt ,

Please tell us what you mean by "can't work". Is any error raised during pretraining? For example, reducing the batch size only helps if it's an OOM error on the GPU. If it's an OOM error on the CPU, decreasing the num_workers argument might be a better way to save memory in the dataset loader.
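To illustrate the trade-off, here is a minimal sketch assuming a standard PyTorch DataLoader; the argument names mirror the pretrain.sh flags, but the actual loader wiring in GCC may differ:

# Minimal sketch of the memory trade-off, assuming a standard PyTorch DataLoader;
# the argument names mirror the pretrain.sh flags, but GCC's actual loader may differ.
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_loader(batch_size=16, num_workers=1):
    dataset = TensorDataset(torch.randn(1000, 64))  # stand-in for the pretraining dataset
    return DataLoader(
        dataset,
        batch_size=batch_size,    # smaller batches reduce GPU memory per training step (GPU OOM)
        num_workers=num_workers,  # fewer worker processes reduce CPU RAM usage (CPU OOM)
    )

if __name__ == "__main__":
    loader = make_loader()
    (batch,) = next(iter(loader))
    print(batch.shape)  # torch.Size([16, 64])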

Our pretraining requires a lot of CPU computation for the matrix decomposition, and it took about one day on our 56-core Xeon(R) CPU E5-2680 v4 @ 2.40GHz, so please expect the running time to be long. However, with a proper configuration the experiments should run without errors. Please paste the specific error/reason preventing the experiments from running here, and we can see how to solve it.
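For context on why this step is CPU-bound, the snippet below is a hypothetical illustration of the kind of sparse matrix decomposition involved (an eigendecomposition of a normalized graph Laplacian via scipy's eigsh); it is not our actual code, and the graph size is made up:

# Hypothetical illustration of the kind of sparse matrix decomposition that dominates CPU time;
# not GCC's actual implementation, and the graph here is random.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 2000
adj = sp.random(n, n, density=1e-3, format="csr")      # stand-in for a graph adjacency matrix
adj = adj + adj.T                                      # symmetrize
deg = np.asarray(adj.sum(axis=1)).ravel()
d_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
laplacian = sp.eye(n) - d_inv_sqrt @ adj @ d_inv_sqrt  # normalized graph Laplacian

# Extracting eigenpairs from a large sparse Laplacian is the CPU-heavy part,
# which is why pretraining benefits from many cores.
vals, vecs = eigsh(laplacian, k=8, which="SA")
print(vals)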

@ChloeWongxt
Author

ChloeWongxt commented Jul 21, 2020

Our pretraining requires a lot of CPU computation for the matrix decomposition, and it took about one day on our 56-core

Thanks for your reply!
There isn't any error when I run the code.

But when I execute the following command, my computer crashes soon after.

bash scripts/pretrain.sh 0 --batch-size 16 --num-workers 6 --num-copies 3 --num-samples 200 --epochs 10

The mouse didn't move either, and I had to reboot to get back to normal use.

So I can't get the pre-training result.

@qibinc
Collaborator

qibinc commented Jul 21, 2020

Hi @ChloeWongxt ,

Thanks for your feedback, and I apologize for the trouble. It must be frustrating to see the computer crash every time you run the code. I suggest that you try the lowest-profile setting:

bash scripts/pretrain.sh 0 --num-workers 1 --num-copies 1

This should consume minimal CPU and RAM on your machine. If this still fails, I'm sorry, but it's not possible to run pretraining on your machine. However, you can still download our pretrained models and evaluate them on downstream tasks: https://github.com/THUDM/GCC/blob/master/GETTING_STARTED.md#download-pretrained-models.

@ChloeWongxt
Author

bash scripts/pretrain.sh 0 --num-workers 1 --num-copies 1

Hi @qibinc
Thanks for your patient and detailed reply!
I will try again.

@qibinc
Collaborator

qibinc commented Jul 29, 2020

Hi @ChloeWongxt ,

I hope you're doing well. I will close this issue as it seems to be resolved. If you run into any other issues on this topic, please feel free to reopen it.

qibinc closed this as completed on Jul 29, 2020