Gpu enable roar #20
Conversation
@@ -133,7 +146,7 @@ def _mask_batch(self, batch):

     def _mask_dataset(self, dataloader, name):
         outputs = []
-        for batch in tqdm(dataloader(batch_size=self.batch_size, num_workers=0, shuffle=False),
+        for batch in tqdm(dataloader(batch_size=8, num_workers=0, shuffle=False),
Is the batch size fixed here due to memory issues?
Yes. I haven't found another way. It doesn't appear to affect performance much, but it is probably something I should look into further.
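One alternative to hard-coding the batch size would be to start larger and back off when memory runs out. A minimal sketch of that idea, where `run_batch` and `MemoryError` are stand-ins for the real masking step and the actual out-of-memory error (names and structure are hypothetical, not the project's code):

```python
def mask_with_fallback(run_batch, data, batch_size=32, min_batch_size=1):
    """Process `data` in batches, halving the batch size on memory errors."""
    while batch_size >= min_batch_size:
        try:
            return [run_batch(data[i:i + batch_size])
                    for i in range(0, len(data), batch_size)]
        except MemoryError:
            batch_size //= 2  # halve and retry with smaller batches
    raise MemoryError("could not fit even the minimum batch size")

# Stand-in batch step that "runs out of memory" above 4 items.
def run_batch(batch):
    if len(batch) > 4:
        raise MemoryError
    return sum(batch)

print(mask_with_fallback(run_batch, list(range(10))))  # [6, 22, 17]
```

This only helps if the work done before the failure is side-effect free, since partially processed batches are discarded on retry.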
@@ -21,4 +21,4 @@ pip3 install --no-index --find-links $HOME/python_wheels \

 # Install comp550
 cd $HOME/workspace/comp550
 pip3 install --no-index --no-deps -e .
-python3 -u -X faulthandler "$@" --use-gpu True --num-workers 4 --persistent-dir $SCRATCH/comp550
+python3 -u -X faulthandler "$@" --use-gpu True --num-workers 3 --persistent-dir $SCRATCH/comp550
Why did we enable the fault handler again? I think you mentioned it in a comment before, but I couldn't find it.
It has essentially no performance impact, and in the case of a segmentation fault it provides a Python-level stack trace. I was getting segmentation faults while looking into the memory issues.
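For reference, `-X faulthandler` enables Python's standard-library `faulthandler` module at startup; the same effect can be achieved programmatically:

```python
import faulthandler

# Equivalent to running the interpreter with `python -X faulthandler`:
# on a fatal signal (e.g. SIGSEGV, SIGABRT), the interpreter dumps a
# Python traceback for every thread to stderr before dying.
faulthandler.enable()

print(faulthandler.is_enabled())  # True
```

This is why a hard crash inside a C extension still produces a usable Python traceback rather than just a core dump.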
This looks good to merge to me.
Depends on #19 landing. I will rebase this once #19 lands.

I enabled CUDA for ROAR and recalibrated all the walltime estimates. Everything should be working and running in a reasonable amount of time now.
I did look into using `Trainer().predict()`, which was added in 1.2.0. However, it doesn't appear to work, and while it is mentioned in the CHANGELOG, it is still undocumented.
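In the meantime, a plain loop that collects per-batch outputs covers the same use case. A minimal stand-in sketch (the model and dataloader here are placeholders, not the project's actual classes, and real inference would also need gradient tracking disabled):

```python
def predict_all(model_fn, dataloader):
    """Collect model outputs for every batch, like a manual predict()."""
    outputs = []
    for batch in dataloader:
        outputs.append(model_fn(batch))
    return outputs

# Stand-in "model" and "dataloader" for illustration.
batches = [[1, 2], [3, 4]]
double = lambda batch: [2 * x for x in batch]
print(predict_all(double, batches))  # [[2, 4], [6, 8]]
```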