
How to train on multiple GPUs #132

Open
Jianghaiyang0729 opened this issue Jun 1, 2024 · 1 comment

Comments

@Jianghaiyang0729

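Hello! I would like to test several baseline models on large-scale datasets such as GBA and GLA. When I run them on an H100, however, some baselines run out of memory, for example some Transformer models (Pyraformer). How do I set up multi-GPU training? (On the server I use, each node has three H100s.) Do I need to add some settings to baselines/Pyraformer/GBA.py? (I added this file myself; your original repository does not include a GBA.py.)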

@zezhishao
Owner

Suppose you have 3 GPUs numbered 0, 1, and 2. Set CFG.GPU_NUM to 3, and pass --gpus "0,1,2" when running the https://github.com/zezhishao/BasicTS/blob/master/experiments/train.py script.
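
A minimal sketch of the two settings mentioned in this reply, for illustration only. `CFG` here is a plain `SimpleNamespace` stand-in rather than the actual config object in baselines/Pyraformer/GBA.py, and `gpus_argument` and `check_gpu_settings` are hypothetical helpers, not part of BasicTS. The argument that selects the config file is left out because the thread does not state it. The sketch assumes the number of ids passed to --gpus should equal CFG.GPU_NUM, as in the 3-GPU example.

```python
from types import SimpleNamespace

# Stand-in for the config object in baselines/Pyraformer/GBA.py.
# Assumption: the real config is built differently; only the GPU_NUM
# field name comes from the reply in this thread.
CFG = SimpleNamespace()
CFG.GPU_NUM = 3  # one process per GPU: devices 0, 1, 2


def gpus_argument(device_ids):
    """Format device ids as the comma-separated string passed to --gpus."""
    return ",".join(str(i) for i in device_ids)


def check_gpu_settings(cfg, gpus):
    """Return True if the number of ids in `gpus` equals cfg.GPU_NUM."""
    ids = [s for s in gpus.split(",") if s.strip()]
    return len(ids) == cfg.GPU_NUM


gpus = gpus_argument([0, 1, 2])

# Hypothetical sanity check before launching: both settings should agree.
if not check_gpu_settings(CFG, gpus):
    raise ValueError(f"--gpus {gpus!r} does not match CFG.GPU_NUM={CFG.GPU_NUM}")

# Command line to run from the repository root; the config-selection
# argument is intentionally omitted because it is not given in the thread.
command = ["python", "experiments/train.py", "--gpus", gpus]
print(" ".join(command))
```

The script only prints the command; it does not start training. Checking the two values together up front is meant to catch a mismatch such as GPU_NUM left at 1 while three device ids are passed.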
