Can't pickle local object 'TrainAugmentation.__init__.<locals>. #19

Closed
ZLeopard opened this issue Nov 20, 2018 · 9 comments

Comments

@ZLeopard

    w.start()
  File "D:\SoftWare\Anaconda3\envs\deeplearning\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "D:\SoftWare\Anaconda3\envs\deeplearning\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\SoftWare\Anaconda3\envs\deeplearning\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\SoftWare\Anaconda3\envs\deeplearning\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\SoftWare\Anaconda3\envs\deeplearning\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'TrainAugmentation.__init__.<locals>.<lambda>'
2018-11-20 16:16:08,607 - root - INFO - working_dir is ./events/ Please using command : tensorboard --logdir=.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\SoftWare\Anaconda3\envs\deeplearning\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\SoftWare\Anaconda3\envs\deeplearning\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Have you run into this pickle-related problem? And first of all, thank you for the code.

@ZLeopard
Author

Oh, I found the reason: the OS is Windows. On Windows, multiprocessing uses pickle to transfer objects between processes, and local objects such as the lambdas created in TrainAugmentation.__init__ cannot be pickled. It works fine on Linux.
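The traceback names a lambda created inside TrainAugmentation.__init__ as the object that fails to pickle. On Windows, multiprocessing starts workers with the spawn method, which pickles the process object and everything it references. The following is a minimal sketch of that failure using only the standard library. `LambdaAugmentation` and `try_pickle` are hypothetical names for illustration, not code from the repository, and the value 128.0 is an arbitrary assumption.

```python
import pickle


class LambdaAugmentation:
    """Hypothetical stand-in for TrainAugmentation: builds a lambda inside __init__."""

    def __init__(self, std):
        # The lambda is local to __init__, so pickle cannot look it up by name.
        self.scale = lambda img, boxes=None, labels=None: (img / std, boxes, labels)


def try_pickle(obj):
    """Return None if obj pickles, otherwise the exception that was raised."""
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError) as exc:
        # The exact exception type depends on the Python version.
        return exc
    return None


error = try_pickle(LambdaAugmentation(std=128.0))
print(type(error).__name__, error)
```

The same object works when used in a single process; it only fails once something tries to serialize it, which is what happens when a spawned worker is started.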

@ygean

ygean commented Dec 4, 2018

@ZLeopard Hey! What should I change in the code, and where, if I want to run it on Windows? I'd like to know your solution, or did you just run it on Linux?

@ZLeopard
Author

ZLeopard commented Dec 4, 2018

@zhouyuangan No, I have no idea how to use the code on Windows. You could search Stack Overflow; some answers suggest replacing the pickle library with another library.

@OakSuen

OakSuen commented Jan 14, 2019

Hi, have you found a way to run the code on Windows without this error?

@FanLu1994

Has anybody solved this problem?

@lianxxx

lianxxx commented Feb 25, 2019

It looks like a problem with multiprocessing. How about setting num_workers = 0?

@drcdr

drcdr commented Feb 26, 2019

I'm using Windows 10 and a recent PyTorch. I was able to get things to run by:
(1) using num_workers = 0; or,
(2) using num_workers = 4, and:
(2a) replacing
lambda img, boxes=None, labels=None: (img / std, boxes, labels),
with a call to
ScaleStd(std),
in the 3 appropriate places in TrainAugmentation in data_preprocessing.py, and adding an appropriate ScaleStd function to transforms.py.

(2b) for safety, I also added:

    if __name__ == '__main__':
        main()

in train_ssd, see https://discuss.pytorch.org/t/if-name-main-for-window10/19377
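The ScaleStd used in step (2a) was not posted in this thread. One possible shape, as a sketch only, is a module-level class with a `__call__` method that keeps the same signature as the lambda it replaces. Because the class is defined at module level, pickle can find it by name, so instances survive the trip to a spawned worker. The `roundtrip` helper, the `main` body, and the value 128.0 are assumptions for illustration.

```python
import pickle


class ScaleStd:
    """Picklable stand-in for
    `lambda img, boxes=None, labels=None: (img / std, boxes, labels)`."""

    def __init__(self, std):
        self.std = std

    def __call__(self, img, boxes=None, labels=None):
        # Same behavior as the lambda: scale the image, pass the rest through.
        return img / self.std, boxes, labels


def roundtrip(obj):
    """Pickle and unpickle obj, roughly what happens when a worker is spawned."""
    return pickle.loads(pickle.dumps(obj))


def main():
    scale = roundtrip(ScaleStd(128.0))
    img, boxes, labels = scale(256.0)
    print(img, boxes, labels)  # prints: 2.0 None None


# Guard from step (2b): spawned workers re-import this module, and the guard
# keeps them from running main() again.
if __name__ == "__main__":
    main()
```

A plain function that returns a closure would hit the same pickling error as the lambda, which is why this sketch stores `std` on an instance instead.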

Using num_workers=0: [~77s/100 steps]
2019-02-26 00:04:29,682 - root - INFO - Epoch: 0, Step: 100, Average Loss: 10.2757, Average Regression Loss 2.7290, Average Classification Loss: 7.5467
2019-02-26 00:05:47,321 - root - INFO - Epoch: 0, Step: 200, Average Loss: 7.1005, Average Regression Loss 2.3017, Average Classification Loss: 4.7987

Using num_workers=4: [~25s/100 steps]
2019-02-26 00:09:41,438 - root - INFO - Epoch: 0, Step: 100, Average Loss: 10.2432, Average Regression Loss 2.7734, Average Classification Loss: 7.4699
2019-02-26 00:10:05,654 - root - INFO - Epoch: 0, Step: 200, Average Loss: 6.9949, Average Regression Loss 2.2865, Average Classification Loss: 4.7084

So, ~3x speedup (in both cases, I'm reading files from an SSD drive).
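The per-100-step times can be checked from the log timestamps quoted above. This is a small sketch of that arithmetic; `seconds_between` is a hypothetical helper, and the timestamps are copied from the four log lines.

```python
from datetime import datetime

# Format used by the log lines, e.g. "2019-02-26 00:04:29,682".
FMT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start, end):
    """Elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Step 100 to step 200 with num_workers=0.
single = seconds_between("2019-02-26 00:04:29,682", "2019-02-26 00:05:47,321")
# Step 100 to step 200 with num_workers=4.
multi = seconds_between("2019-02-26 00:09:41,438", "2019-02-26 00:10:05,654")

speedup = single / multi
print(f"{single:.3f}s vs {multi:.3f}s, speedup {speedup:.2f}x")
```

This gives about 77.6 s against about 24.2 s per 100 steps, a ratio of roughly 3.2.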

I made a few other changes, but I think they are inconsequential. Let me know if this doesn't work for you, and I can check for other possible changes.

@drcdr

drcdr commented Feb 26, 2019

@ZLeopard @zhouyuangan @OakSuen @CodeForWuyu @lianxxx updating you, as this issue is marked closed.

@littlerain2310

littlerain2310 commented Dec 27, 2019

Can you find another solution? This doesn't seem to work on my laptop.
