AttributeError: Can't pickle local object... when using multiprocessing #1554

Closed
harpone opened this issue Jan 24, 2020 · 5 comments

Comments

@harpone

harpone commented Jan 24, 2020

🐛 Bug

This isn't really a torch_xla bug, but rather a "feature" when using distributed samplers with multiple worker processes as described here.

The problem arises if you use a lambda expression or a locally defined function inside a get_datasets-style function instead of, e.g., a proper transform class, because multiprocessing can't pickle local objects (I don't even know why multiprocessing needs to pickle anything in the first place, but I'm sure there's a good reason for it).
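For anyone hitting this outside torch_xla, here is a minimal, standalone illustration of the underlying limitation (plain Python, nothing torch-specific): pickle serializes functions by qualified name, and a lambda or function defined inside another function has no module-level name it can be looked up by.

import pickle

def get_datasets():
    # A lambda defined inside a function is a "local object": pickle can only
    # serialize functions it can re-import by module-level name, so this fails.
    transform_target = lambda y: y + 1
    return transform_target

pickle.dumps(get_datasets())
# AttributeError: Can't pickle local object 'get_datasets.<locals>.<lambda>'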

To Reproduce

Steps to reproduce the behavior:

Let's say we have
dataset_trn, loader_trn = get_datasets(args_)

and

def get_datasets(args):
    # Locally defined lambda: this is the object multiprocessing fails to pickle
    transform_target = lambda y: dict_of_stuff[y]
    # ... other setup ...
    dataset_trn = torch.utils.data.Dataset(...,
                                           target_transform=transform_target,
                                           ...)
    sampler_trn = DistributedSampler(...)
    loader_trn = torch.utils.data.DataLoader(dataset_trn, sampler=sampler_trn, ...)

    return dataset_trn, loader_trn

This will throw

AttributeError: Can't pickle local object 'get_datasets.<locals>.<lambda>'

but only when using multiprocessing-style training. The fix is to use a class for the transform instead of a lambda expression or a locally defined function (see the sketch below).
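A minimal sketch of that class-based fix, reusing the names from the snippet above (dict_of_stuff stands for whatever mapping you were using): define the transform as a module-level callable class, which worker processes can pickle by name.

class TargetTransform:
    """Picklable replacement for `lambda y: dict_of_stuff[y]`."""

    def __init__(self, mapping):
        self.mapping = mapping

    def __call__(self, y):
        return self.mapping[y]

# inside get_datasets():
#     target_transform=TargetTransform(dict_of_stuff)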

Anyway, I struggled quite a bit with this weird error (multiprocessing makes debugging pretty hard), so I'm just posting it here in case others encounter it. Not sure there's an official fix per se, so feel free to close immediately.

@dlibenzi
Collaborator

Are you creating these datasets before calling xmp.spawn() and passing them as args or globals?

@harpone
Author

harpone commented Jan 27, 2020

No, they're created for each process.
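For context, a rough sketch of the per-process pattern described here, assuming the get_datasets() from above (the function and argument names are placeholders, not torch_xla requirements):

import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp

def _mp_fn(index, args):
    # Each spawned process builds its own dataset and loader, so only `args`
    # needs to be picklable for xmp.spawn() itself. DataLoader worker processes
    # (num_workers > 0) may still pickle the dataset and its transforms, which
    # is where a local lambda can still fail.
    device = xm.xla_device()
    dataset_trn, loader_trn = get_datasets(args)
    ...

xmp.spawn(_mp_fn, args=(args_,), nprocs=8)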

@harpone
Author

harpone commented Jan 27, 2020

Also closing, since this isn't really a torch_xla bug...

@Rainbowman0

Are you creating these datasets before calling xmp.spawn() and passing them as args or globals?

I did exactly what you said, i.e. 'creating these datasets before calling xmp.spawn() and passing them as args or globals', and ran into the same problem. How can I solve it?

@Rainbowman0

I did this because I was using NVIDIA's DALI to speed up data loading, and I found that if I create a DataLoader in each process spawned by xmp.spawn(), DALI creates four DataLoaders instead of one, unlike PyTorch.
