
On Windows 10, AttributeError: Can't pickle local object 'StringEncoder.<locals>.EncodeField' #130

Closed
DONJYARAHOI opened this issue Aug 30, 2020 · 4 comments

@DONJYARAHOI

Hello,

I'm using Windows 10, torch 1.5.1+cu101, and torchvision 0.6.1+cu101.

I tried to run the example scripts. visualise_data worked well, but in agent_motion_prediction I get the following error when converting the dataloader to an iterator:

AttributeError                            Traceback (most recent call last)
<ipython-input-19-b7d59c605d01> in <module>
      1 # ==== TRAIN LOOP
----> 2 tr_it = iter(train_dataloader)
      3 progress_bar = tqdm(range(cfg["train_params"]["max_num_steps"]))
      4 losses_train = []
      5 for _ in progress_bar:

~\Anaconda3\envs\Kaggle_Lyft_201126A\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
    277             return _SingleProcessDataLoaderIter(self)
    278         else:
--> 279             return _MultiProcessingDataLoaderIter(self)
    280 
    281     @property

~\Anaconda3\envs\Kaggle_Lyft_201126A\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
    717             #     before it starts, and __del__ tries to join but will get:
    718             #     AssertionError: can only join a started process.
--> 719             w.start()
    720             self._index_queues.append(index_queue)
    721             self._workers.append(w)

~\Anaconda3\envs\Kaggle_Lyft_201126A\lib\multiprocessing\process.py in start(self)
    119                'daemonic processes are not allowed to have children'
    120         _cleanup()
--> 121         self._popen = self._Popen(self)
    122         self._sentinel = self._popen.sentinel
    123         # Avoid a refcycle if the target function holds an indirect

~\Anaconda3\envs\Kaggle_Lyft_201126A\lib\multiprocessing\context.py in _Popen(process_obj)
    222     @staticmethod
    223     def _Popen(process_obj):
--> 224         return _default_context.get_context().Process._Popen(process_obj)
    225 
    226 class DefaultContext(BaseContext):

~\Anaconda3\envs\Kaggle_Lyft_201126A\lib\multiprocessing\context.py in _Popen(process_obj)
    325         def _Popen(process_obj):
    326             from .popen_spawn_win32 import Popen
--> 327             return Popen(process_obj)
    328 
    329     class SpawnContext(BaseContext):

~\Anaconda3\envs\Kaggle_Lyft_201126A\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     91             try:
     92                 reduction.dump(prep_data, to_child)
---> 93                 reduction.dump(process_obj, to_child)
     94             finally:
     95                 set_spawning_popen(None)

~\Anaconda3\envs\Kaggle_Lyft_201126A\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

AttributeError: Can't pickle local object 'StringEncoder.<locals>.EncodeField'
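For context on the error itself: on Windows, torch starts DataLoader workers with multiprocessing's spawn method, which pickles the dataset and everything it references, and functions defined inside other functions (like EncodeField inside StringEncoder) cannot be pickled. A minimal sketch with hypothetical names that reproduces the same message:

```python
import pickle

def make_encoder():
    # A function defined inside another function is a "local object";
    # pickle serializes functions by their importable qualified name,
    # and 'make_encoder.<locals>.encode_field' cannot be looked up.
    def encode_field(value):
        return str(value).encode("utf-8")
    return encode_field

enc = make_encoder()
pickle.dumps(enc)
# AttributeError: Can't pickle local object 'make_encoder.<locals>.encode_field'
```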
@lucabergamini self-assigned this Sep 1, 2020
@lucabergamini
Contributor

Hi @DONJYARAHOI,
This seems to be a torch multiprocessing-related issue; I'm not sure it's something we can fix on our side. Do you still get it with num_workers set to 0, or alternatively when using a for loop over the dataloader?
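For reference, a minimal sketch of the num_workers=0 workaround; the dataset variable and cfg keys here are assumptions based on the example notebook:

```python
from torch.utils.data import DataLoader

# num_workers=0 loads batches in the main process, so nothing needs to be
# pickled and shipped to worker processes (which is the step failing on Windows).
train_dataloader = DataLoader(
    train_dataset,                                      # assumed dataset from the notebook
    shuffle=cfg["train_data_loader"]["shuffle"],        # assumed cfg layout
    batch_size=cfg["train_data_loader"]["batch_size"],
    num_workers=0,
)
```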

@DONJYARAHOI
Author

Setting num_workers to 0 works well (though I don't know how fast it is).
Using a for loop gives the same error.
I see. I'll consider other environments, such as WSL or a Kaggle kernel.

@lucabergamini
Contributor

> Setting num_workers to 0 works well (though I don't know how fast it is).

I expect it to be quite slow, as rasterisation is our current bottleneck.

> I see. I'll consider other environments, such as WSL or a Kaggle kernel.

Yeah, we haven't disabled support for Windows, as some people have successfully run L5Kit on it, but we're currently lacking an active developer for that platform, so our support is very limited.

@nosound2

nosound2 commented Sep 6, 2020

Needing num_workers = 0 is a torch-level limitation on Windows; it is a well-known issue. I am able to train successfully on Windows.
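A common pattern for keeping one training script portable is to drop to single-process loading on Windows only; a sketch with illustrative names:

```python
import sys
from torch.utils.data import DataLoader

# On Windows, workers are spawned (not forked), so everything the dataset
# references must be picklable; num_workers=0 sidesteps that requirement.
num_workers = 0 if sys.platform.startswith("win") else 4

if __name__ == "__main__":  # spawn re-imports the main module, so guard the entry point
    train_dataloader = DataLoader(train_dataset, batch_size=64,
                                  num_workers=num_workers)
    for batch in train_dataloader:
        ...  # training step
```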
