Multi-gpu training #125

Open
Jinjun58 opened this issue Feb 21, 2024 · 1 comment

@Jinjun58

Thanks for your great work.
I followed the multi-GPU training instructions in the documentation, but only the first GPU seems to be used in my project. Why?

CUDA_VISIBLE_DEVICES=1,2,3 mpirun -n 3 python entry.py train \
            --conf_files configs/seem/focall_unicl_lang_v1.yaml \
@Beck-127

Beck-127 commented May 8, 2024

I ran into the same problem. I checked the log output, and it looks like there is a problem with MPI.
My command:
CUDA_VISIBLE_DEVICES=0,1,2,3 mpirun -n 4 python entry.py train \
            --conf_files configs/seem/focalt_unicl_lang_v1.yaml \
            --overrides \
            FP16 True \
            COCO.INPUT.IMAGE_SIZE 1024 \
            MODEL.DECODER.HIDDEN_DIM 512 \
            MODEL.ENCODER.CONVS_DIM 512 \
            MODEL.ENCODER.MASK_DIM 512 \
            TEST.BATCH_SIZE_TOTAL 8 \
            TRAIN.BATCH_SIZE_TOTAL 16 \
            TRAIN.BATCH_SIZE_PER_GPU 2 \
            SOLVER.MAX_NUM_EPOCHS 50 \
            SOLVER.BASE_LR 0.0001 \
            SOLVER.FIX_PARAM.backbone True \
            SOLVER.FIX_PARAM.lang_encoder True \
            SOLVER.FIX_PARAM.pixel_decoder True \
            MODEL.DECODER.COST_SPATIAL.CLASS_WEIGHT 5.0 \
            MODEL.DECODER.COST_SPATIAL.MASK_WEIGHT 2.0 \
            MODEL.DECODER.COST_SPATIAL.DICE_WEIGHT 2.0 \
            MODEL.DECODER.TOP_SPATIAL_LAYERS 10 \
            MODEL.DECODER.SPATIAL.ENABLED True \
            MODEL.DECODER.GROUNDING.ENABLED True \
            FIND_UNUSED_PARAMETERS True \
            ATTENTION_ARCH.SPATIAL_MEMORIES 32 \
            MODEL.DECODER.SPATIAL.MAX_ITER 5 \
            ATTENTION_ARCH.QUERY_NUMBER 3 \
            STROKE_SAMPLER.MAX_CANDIDATE 10 \
            WEIGHT True \
            RESUME_FROM ./xdecoder_data/pretrained/xdecoder_focalt_last.pt
ERROR LOG: [two screenshots]
Looking forward to a response. Thanks!
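
One way to narrow down whether the problem is on the MPI side, as the log suggests, is to launch a tiny script with the same `mpirun` prefix and print what each process sees. The sketch below is only a diagnostic under stated assumptions: it assumes Open MPI, which exports `OMPI_COMM_WORLD_RANK`, `OMPI_COMM_WORLD_LOCAL_RANK`, and `OMPI_COMM_WORLD_SIZE` to every process it launches (other MPI implementations use different variable names), and the helper names `describe_process` and `looks_like_single_process` are hypothetical. It does not inspect how `entry.py` itself assigns GPUs.

```python
import os

# Variables that Open MPI's mpirun exports to each launched process.
# Assumption: Open MPI is the MPI implementation in use.
RANK_VARS = (
    "OMPI_COMM_WORLD_RANK",
    "OMPI_COMM_WORLD_LOCAL_RANK",
    "OMPI_COMM_WORLD_SIZE",
)


def describe_process(env=None):
    """Collect the rank-related variables and the GPU mask seen by this process."""
    env = os.environ if env is None else env
    info = {name: env.get(name) for name in RANK_VARS}
    info["CUDA_VISIBLE_DEVICES"] = env.get("CUDA_VISIBLE_DEVICES")
    return info


def looks_like_single_process(info):
    """Return True when no multi-process Open MPI world is visible."""
    size = info.get("OMPI_COMM_WORLD_SIZE")
    return size is None or int(size) <= 1


if __name__ == "__main__":
    info = describe_process()
    print(info, flush=True)
    if looks_like_single_process(info):
        print("No multi-process Open MPI world visible to this process.", flush=True)
```

Saved under a hypothetical name such as `check_ranks.py`, it can be run as `CUDA_VISIBLE_DEVICES=0,1,2,3 mpirun -n 4 python check_ranks.py`. If the four processes report distinct ranks and a world size of 4, the launcher is doing its part and the cause is more likely in how the training code picks its device. If the variables are missing or the size is 1, the `mpirun` on the path may not be the Open MPI one this sketch assumes.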
