New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize absubmit #3003

Merged
merged 7 commits into from Aug 14, 2018

Conversation

Projects
None yet
2 participants
@bemeurer
Contributor

bemeurer commented Aug 14, 2018

Solves #2442

@bemeurer bemeurer changed the title from Parallelize absubmit to [WIP] Parallelize absubmit Aug 14, 2018

@sampsyo

This comment has been minimized.

Show comment
Hide comment
@sampsyo

sampsyo Aug 14, 2018

Member

Looks like just what the doctor ordered! Would you mind adding a changelog entry?

Member

sampsyo commented Aug 14, 2018

Looks like just what the doctor ordered! Would you mind adding a changelog entry?

@bemeurer

This comment has been minimized.

Show comment
Hide comment
@bemeurer

bemeurer Aug 14, 2018

Contributor

@sampsyo Almost! Trying to fix this right now

Traceback (most recent call last):
  File "./beet", line 23, in <module>
    beets.ui.main()
  File "/home/bemeurer/src/beets/beets/ui/__init__.py", line 1256, in main
    _raw_main(args)
  File "/home/bemeurer/src/beets/beets/ui/__init__.py", line 1243, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/home/bemeurer/src/beets/beetsplug/absubmit.py", line 110, in command
    pool.map(self.analyze_submit, items)
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
    put(task)
  File "/usr/lib64/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib64/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread._local objects

Any ideas?

Contributor

bemeurer commented Aug 14, 2018

@sampsyo Almost! Trying to fix this right now

Traceback (most recent call last):
  File "./beet", line 23, in <module>
    beets.ui.main()
  File "/home/bemeurer/src/beets/beets/ui/__init__.py", line 1256, in main
    _raw_main(args)
  File "/home/bemeurer/src/beets/beets/ui/__init__.py", line 1243, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/home/bemeurer/src/beets/beetsplug/absubmit.py", line 110, in command
    pool.map(self.analyze_submit, items)
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
    put(task)
  File "/usr/lib64/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib64/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread._local objects

Any ideas?

@sampsyo

This comment has been minimized.

Show comment
Hide comment
@sampsyo

sampsyo Aug 14, 2018

Member

Hmm, not exactly. Maybe the problem is that multiprocessing is trying to pickle self to send it across the channel there, which could be causing problems? Maybe it’s possible to use pdb to find out what obj is here.

Member

sampsyo commented Aug 14, 2018

Hmm, not exactly. Maybe the problem is that multiprocessing is trying to pickle self to send it across the channel there, which could be causing problems? Maybe it’s possible to use pdb to find out what obj is here.

@bemeurer

This comment has been minimized.

Show comment
Hide comment
@bemeurer

bemeurer Aug 14, 2018

Contributor

Good idea, on it!

Contributor

bemeurer commented Aug 14, 2018

Good idea, on it!

bemeurer added some commits Aug 14, 2018

@bemeurer bemeurer changed the title from [WIP] Parallelize absubmit to Parallelize absubmit Aug 14, 2018

@bemeurer

This comment has been minimized.

Show comment
Hide comment
@bemeurer

bemeurer Aug 14, 2018

Contributor

@sampsyo Tested and ready for merging!

Contributor

bemeurer commented Aug 14, 2018

@sampsyo Tested and ready for merging!

@sampsyo

This comment has been minimized.

Show comment
Hide comment
@sampsyo

sampsyo Aug 14, 2018

Member

Hmm; threads do seem like the right thing here, but I’m slightly concerned that ThreadPool isn’t documented in the stdlib. Does this work on Python 2? If not, maybe we’d be better off with the thread pool in concurrent.futures?

Member

sampsyo commented Aug 14, 2018

Hmm; threads do seem like the right thing here, but I’m slightly concerned that ThreadPool isn’t documented in the stdlib. Does this work on Python 2? If not, maybe we’d be better off with the thread pool in concurrent.futures?

@bemeurer

This comment has been minimized.

Show comment
Hide comment
@bemeurer

bemeurer Aug 14, 2018

Contributor

Oh, I don't have Python2 on my machine. Let me install it and test.

Contributor

bemeurer commented Aug 14, 2018

Oh, I don't have Python2 on my machine. Let me install it and test.

@bemeurer

This comment has been minimized.

Show comment
Hide comment
@bemeurer

bemeurer Aug 14, 2018

Contributor

It explodes with Python2

Contributor

bemeurer commented Aug 14, 2018

It explodes with Python2

@sampsyo

This comment has been minimized.

Show comment
Hide comment
@sampsyo

sampsyo Aug 14, 2018

Member

Got it. Well, it would be a perfectly defensible move to make this a Python-3-only feature and fall back to sequential execution on Python 2. If we do that, then the concurrent.futures route might be a more reliable route to get similar parallel map behavior here.

Member

sampsyo commented Aug 14, 2018

Got it. Well, it would be a perfectly defensible move to make this a Python-3-only feature and fall back to sequential execution on Python 2. If we do that, then the concurrent.futures route might be a more reliable route to get similar parallel map behavior here.

@bemeurer

This comment has been minimized.

Show comment
Hide comment
@bemeurer

bemeurer Aug 14, 2018

Contributor

@sampsyo I think I can make it works in parallel for both Python 2 and Python 3, but with slightly different techniques. I am giving it a shot right now :)

Contributor

bemeurer commented Aug 14, 2018

@sampsyo I think I can make it works in parallel for both Python 2 and Python 3, but with slightly different techniques. I am giving it a shot right now :)

bemeurer added some commits Aug 14, 2018

@bemeurer

This comment has been minimized.

Show comment
Hide comment
@bemeurer

bemeurer Aug 14, 2018

Contributor

@sampsyo I couldn't get it to work in Python 2, so I reverted to the nice original multithreding code for Python 3, and made it fallback to sequential on Python 2.

Contributor

bemeurer commented Aug 14, 2018

@sampsyo I couldn't get it to work in Python 2, so I reverted to the nice original multithreding code for Python 3, and made it fallback to sequential on Python 2.

@sampsyo sampsyo merged commit 4eafa40 into beetbox:master Aug 14, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

sampsyo added a commit that referenced this pull request Aug 14, 2018

sampsyo added a commit that referenced this pull request Aug 14, 2018

@sampsyo

This comment has been minimized.

Show comment
Hide comment
@sampsyo

sampsyo Aug 14, 2018

Member

Looks fantastic; thank you for your help!

Member

sampsyo commented Aug 14, 2018

Looks fantastic; thank you for your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment