
Hardcoded GPU 0? #9

Open
mfelice opened this issue Mar 24, 2021 · 2 comments · May be fixed by #23

Comments
mfelice commented Mar 24, 2021

Hi there,

I'm facing an issue with your PyTorch implementation on certain input sentences. For example:

s = 'RT @HISPANlCPROBS : When u walk straight into the kitchen to eat & ur mom hits u with the " ya saludaste " #ThanksgivingWithHispanics https://…'
print(scorer.score_sentences([s]))

gives the following error:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 11.91 GiB total capacity; 451.65 MiB already allocated; 12.12 MiB free; 40.35 MiB cached)

I'm working on a server with three GPUs and tried setting ctxs = [mx.gpu(0)], ctxs = [mx.gpu(1)], ctxs = [mx.gpu(2)] and ctxs = [mx.cpu()] but I always get the same error about GPU 0. I'm wondering if this is hardcoded somewhere in your code? Changing the ctxs variable seems to have no effect.

Thanks.

@DarrenAbramson

From the fourth line of the readme:

ctxs = [mx.cpu()] # or, e.g., [mx.gpu(0), mx.gpu(1)]

Did you happen to try ctxs = [mx.gpu(0), mx.gpu(1), mx.gpu(2)]?

As for finding things that are hard-coded, are you aware that you can search the repository?

mfelice (Author) commented Mar 24, 2021

Thanks! The values in ctxs seem to be ignored. However, I've been able to circumvent the issue by setting CUDA_VISIBLE_DEVICES. I believe the culprit is cuda:0 and/or device_ids=[0] in the following block:

self._device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
# TODO: This does not restrict to specific GPUs however, use CUDA_VISIBLE_DEVICES?
# TODO: It also unnecessarily locks the GPUs to each other
self._model.to(self._device)
self._model = torch.nn.DataParallel(self._model, device_ids=[0])
self._model.eval()
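
For the record, the workaround works by hiding all but the desired GPU before CUDA is initialised, so the hardcoded cuda:0 maps onto the card you actually want. Either from the shell (the script name here is just a placeholder):

CUDA_VISIBLE_DEVICES=1 python score.py

or at the top of the script, before torch is imported:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # expose only physical GPU 1; it shows up as cuda:0
import torch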

Maybe the device in that block should be set to whatever is specified by ctxs?
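
To illustrate, here is a minimal sketch of what I mean, assuming the user-supplied MXNet contexts are kept on the scorer as self._ctxs (a hypothetical attribute name):

# Hypothetical: self._ctxs holds the contexts the caller passed in,
# e.g. [mx.gpu(1), mx.gpu(2)] or [mx.cpu()].
gpu_ids = [ctx.device_id for ctx in self._ctxs if ctx.device_type == "gpu"]
if gpu_ids and torch.cuda.is_available():
    # Put the model on the first requested GPU and let DataParallel
    # fan out over the remaining ones.
    self._device = torch.device("cuda:%d" % gpu_ids[0])
    self._model.to(self._device)
    self._model = torch.nn.DataParallel(self._model, device_ids=gpu_ids)
else:
    self._device = torch.device("cpu")
    self._model.to(self._device)
self._model.eval()

That way the PyTorch side would follow the same device selection as the MXNet side instead of always grabbing GPU 0.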

@zolastro linked a pull request Dec 20, 2022 that will close this issue