Multi-gpu training #12

Draft
wants to merge 6 commits into
base: main

qinjian623

Adds a new file, train_culane_mp.py, which is a single-node, multi-GPU training script.

The original train_culane.py was deleted while developing the new one; maybe we should restore it.

The default batch_size is still 2, which is probably too small for multi-GPU training. 64 is acceptable on a node with four V100s.

A new variable, metric_skips=10, is introduced. It can lower how often the F1 metric, which runs on the CPU only, is computed.
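One way such a skip counter could work is sketched below. This is an assumption about the mechanism, not the code in train_culane_mp.py: `should_compute_metric` and the step loop are hypothetical names, and only the value `metric_skips=10` comes from this PR.

```python
def should_compute_metric(step: int, metric_skips: int = 10) -> bool:
    """Hypothetical helper: return True on every `metric_skips`-th step."""
    if metric_skips <= 1:
        return True  # no skipping requested, evaluate on every step
    return step % metric_skips == 0


# Hypothetical loop: the CPU-only F1 evaluation runs on a subset of steps,
# so the GPUs spend less time waiting for it.
evaluated_steps = [s for s in range(30) if should_compute_metric(s)]
print(evaluated_steps)
```

With the default of 10, the metric would be evaluated on one step in ten under this scheme.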

Start training with:

CUDA_VISIBLE_DEVICES=4,5,6,7 python train_culane_mp.py ...

to control which GPUs, and therefore how many, are used for training.
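As a rough illustration of what that variable conveys, the sketch below parses CUDA_VISIBLE_DEVICES with the standard library only. `visible_gpu_ids` is a hypothetical helper and is not part of this PR; in a PyTorch script the same count would normally come from `torch.cuda.device_count()`.

```python
import os


def visible_gpu_ids(env=None):
    """Hypothetical helper: parse CUDA_VISIBLE_DEVICES into a list of IDs.

    Returns None when the variable is unset, meaning no restriction applies.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [tok.strip() for tok in value.split(",") if tok.strip()]


# Same value as in the command above; passed explicitly so the example
# does not depend on the caller's environment.
ids = visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "4,5,6,7"})
world_size = len(ids)
print(ids, world_size)
```

Inside the process the listed devices are renumbered from 0, so physical GPUs 4 to 7 appear as devices 0 to 3.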


qinjian623 commented May 7, 2021

Sorry, I forgot to mention this:
in the file ./datasets/culane.py, lines 115 to 124,
the l[1:] versus l[:] differences come from an unofficial version of the CULane dataset, in which we fixed some of the links in the list file.

So some minor changes may be needed to use the official CULane.

                if self.image_set == 'test':
                    self.img_list.append(os.path.join(self.data_dir_path,
                                                      l[1:]))  # l[1:] drops the leading '/' so os.path.join keeps data_dir_path
                else:
                    self.img_list.append(os.path.join(self.data_dir_path, l))

                if self.image_set == 'test':
                    self.seg_list.append(os.path.join(self.data_dir_path, 'laneseg_label_w16_test', l[1:-3] + 'png'))
                else:
                    self.seg_list.append(os.path.join(self.data_dir_path, 'laneseg_label_w16', l[:-3] + 'png'))
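One possible way to avoid branching on the list-file variant is to strip any leading '/' before joining. The sketch below is hypothetical and not part of this PR: `build_paths` is an invented name, and it assumes each entry is an image path whose label file has the same relative path with a .png extension, as the snippet above suggests.

```python
import os


def build_paths(data_dir_path, entry, image_set):
    """Hypothetical helper: return (image_path, label_path) for one entry.

    Any leading '/' is removed first, because os.path.join discards the
    earlier components when a later component is an absolute path.
    """
    rel = entry.strip().lstrip("/")
    label_dir = ("laneseg_label_w16_test" if image_set == "test"
                 else "laneseg_label_w16")
    img_path = os.path.join(data_dir_path, rel)
    # Swap the image extension for .png to get the segmentation label.
    seg_path = os.path.join(data_dir_path, label_dir,
                            os.path.splitext(rel)[0] + ".png")
    return img_path, seg_path


# Entries with and without a leading '/' resolve to the same paths.
with_slash = build_paths("data", "/driver_23/00000.jpg", "train")
without_slash = build_paths("data", "driver_23/00000.jpg", "train")
print(with_slash == without_slash)
```

With a helper like this, the same code path could serve list files written with or without the leading slash, at the cost of hiding which variant is in use.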
