Gpu enable roar #20

Merged
merged 1 commit into from
Mar 3, 2021

AndreasMadsen
Owner

@AndreasMadsen AndreasMadsen commented Feb 23, 2021

Depends on #19 landing. I will rebase this once #19 lands.

I enabled CUDA for ROAR and recalibrated all the walltime estimates. Everything should now be working and running in a reasonable amount of time.

I did look into using Trainer().predict(), which was added in 1.2.0. However, it doesn't appear to work, and although it is listed in the CHANGELOG, it is still undocumented.

@AndreasMadsen AndreasMadsen force-pushed the gpu-enable-roar branch 3 times, most recently from 94838dc to 3f45565 Compare February 24, 2021 15:43
@AndreasMadsen AndreasMadsen mentioned this pull request Feb 28, 2021
@@ -133,7 +146,7 @@ def _mask_batch(self, batch):

     def _mask_dataset(self, dataloader, name):
         outputs = []
-        for batch in tqdm(dataloader(batch_size=self.batch_size, num_workers=0, shuffle=False),
+        for batch in tqdm(dataloader(batch_size=8, num_workers=0, shuffle=False),
Collaborator

@ncmeade ncmeade Mar 3, 2021

Is the batch size fixed here due to memory issues?

Owner Author

Yes. I haven't found another way. It doesn't appear to affect performance much, but it is probably something I should look into more.
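
For illustration, here is a minimal sketch of the general pattern being discussed: processing a dataset in small fixed-size batches, so that peak memory depends on the batch size and not on the dataset size. It uses plain Python instead of the project's dataloader, and `iter_batches` and `map_in_batches` are hypothetical names, not functions from this repository.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    iterator = iter(items)
    while True:
        # Only one batch is materialized at a time.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def map_in_batches(
    items: Iterable[T],
    process_batch: Callable[[List[T]], List[U]],
    batch_size: int = 8,
) -> List[U]:
    """Apply `process_batch` to each batch and collect the results in order."""
    outputs: List[U] = []
    for batch in iter_batches(items, batch_size):
        outputs.extend(process_batch(batch))
    return outputs
```

A smaller batch size lowers the peak memory used per step at the cost of more iterations, which is the trade-off raised in the question above.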

@@ -21,4 +21,4 @@ pip3 install --no-index --find-links $HOME/python_wheels \
 # Install comp550
 cd $HOME/workspace/comp550
 pip3 install --no-index --no-deps -e .
-python3 -u -X faulthandler "$@" --use-gpu True --num-workers 4 --persistent-dir $SCRATCH/comp550
+python3 -u -X faulthandler "$@" --use-gpu True --num-workers 3 --persistent-dir $SCRATCH/comp550
Collaborator

Why did we enable the fault handler again? I think you mentioned it in a comment before, but I couldn't find it.

Owner Author

It has essentially no performance impact, and in case of a segmentation fault it dumps the Python stack trace. I was getting segmentation faults when looking into the memory issues.
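
As a rough sketch, the same handler that `-X faulthandler` installs can also be turned on from inside a script with the standard-library `faulthandler` module. The helper name `enable_fault_handler` is hypothetical and not part of this repository.

```python
import faulthandler
import sys


def enable_fault_handler(file=sys.stderr):
    """Hypothetical helper: enable faulthandler from inside the script.

    On fatal signals such as SIGSEGV, the Python traceback of all threads
    is written to `file`, which must stay open while the handler is enabled.
    """
    if not faulthandler.is_enabled():
        faulthandler.enable(file=file, all_threads=True)
    return faulthandler.is_enabled()
```

Calling `enable_fault_handler()` early in the entry point should have an effect comparable to the command-line flag. The flag has the advantage of covering crashes that happen while modules are still being imported.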

@ncmeade
Collaborator

ncmeade commented Mar 3, 2021

This looks good to merge to me.

@AndreasMadsen AndreasMadsen merged commit 97292d9 into master Mar 3, 2021
@AndreasMadsen AndreasMadsen deleted the gpu-enable-roar branch March 16, 2021 16:22