pending python 3 upgrade tasks #111

Open
alastair opened this issue Jun 25, 2020 · 3 comments

@alastair

alastair commented Jun 25, 2020

✔️ This item is fixed in #115

There is still some python code which only runs in python 2.
The most important cases are the uses of basestring and unicode:

    if not isinstance(key, basestring):
        raise TypeError('A key in the map is not a string; can not convert it')
    # until python 3, everything should always be utf-8 encoded bytestrings
    if isinstance(key, unicode):
        key = key.encode('utf-8')
    if isinstance(value, unicode):
        value = value.encode('utf-8')

These will have to be converted, but first we have to understand how the types are used in this method. In theory both basestring and unicode can be changed to str, but in python 3 we should double-check which items should be actual strings and which should be encoded bytes.
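
As a minimal sketch of what a python 3 version of that check could look like, assuming the method should still produce UTF-8 encoded bytes (which is exactly the open question above): the python 2 `basestring` covered both text and byte strings, so the closest python 3 type check is `(str, bytes)`, and `unicode` becomes `str`. The helper name `encode_pair` is hypothetical and not part of the project.

```python
def encode_pair(key, value):
    """Return (key, value) with any text converted to UTF-8 bytes.

    Hypothetical helper: assumes downstream code expects bytes.
    """
    # Python 3 has no basestring; accept text or already-encoded bytes.
    if not isinstance(key, (str, bytes)):
        raise TypeError('A key in the map is not a string; can not convert it')
    # Python 3 str is what python 2 called unicode.
    if isinstance(key, str):
        key = key.encode('utf-8')
    if isinstance(value, str):
        value = value.encode('utf-8')
    return key, value
```

If the surrounding code turns out to want real strings instead, the same structure works with the direction reversed (decode bytes to str), so the decision can be made in one place.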

@alastair

config = cPickle.load(sys.stdin)

This doesn't work in python 3: pickle.load requires a binary stream that returns bytes, but sys.stdin is a text stream that returns str (cPickle is also gone, it is just pickle now). Ideally this would be fixed by #96, allowing us to remove cluster mode, but in the meantime we should work out a solution to this specific problem.
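
One possible stopgap, sketched under the assumption that the sending side writes the pickle as raw bytes: read from `sys.stdin.buffer`, the binary stream underneath the text-mode `sys.stdin`. The function name `load_config` and the sample config are hypothetical.

```python
import io
import pickle
import sys


def load_config(stream=None):
    """Unpickle a config object from a binary stream (default: stdin)."""
    if stream is None:
        # sys.stdin is a text stream in Python 3; .buffer is the
        # underlying binary stream that pickle.load can read from.
        stream = sys.stdin.buffer
    return pickle.load(stream)


# Round trip through an in-memory binary stream, standing in for stdin.
example = load_config(io.BytesIO(pickle.dumps({"clustermode": True})))
```

Note that `sys.stdin.buffer` may be missing if stdin has been replaced by another object, such as an `io.StringIO` in tests, which is why the stream is passed in as a parameter here.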

@alastair

alastair commented Jun 26, 2020

In python 3, taskhash is generating different hashes for the same combination of parameters.

When I run this script multiple times, it doesn't skip already completed jobs as it does in python 2.

Running the same script multiple times in the same docker container results in different hashes each time. There is a random seed set in the project file.

I've replaced it with sha1(json.dumps(config)), and now it's stable
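
The replacement can be sketched roughly as follows. Two details here are assumptions and not stated in the comment: in python 3 `hashlib.sha1` only accepts bytes, so the JSON text is encoded first, and `sort_keys=True` is added so the result does not depend on dict ordering. The function name `stable_taskhash` is hypothetical.

```python
import hashlib
import json


def stable_taskhash(config):
    """Hash a JSON-serialisable config deterministically across runs."""
    # sort_keys makes the serialisation independent of dict ordering;
    # sha1 needs bytes in Python 3, so encode the JSON text first.
    payload = json.dumps(config, sort_keys=True).encode('utf-8')
    return hashlib.sha1(payload).hexdigest()
```

A possible explanation for the original instability, if the old taskhash relied on the built-in `hash()` of strings: python 3 randomizes string hashes per process unless `PYTHONHASHSEED` is set, and a random seed in the project file would not affect that. This is a guess, since the original implementation is not shown here.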

@alastair

Running in python 3 with clustermode=False, I get an exception after about 290 jobs:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 295, in _queue_management_worker
    shutdown_worker()
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 253, in shutdown_worker
    call_queue.put_nowait(None)
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 129, in put_nowait
    return self.put(obj, False)
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 83, in put
    raise Full
queue.Full

This has happened more than once. Is something not cleaning up properly? Maybe a difference between concurrent.futures in python 3.6 and the backport we use in python 2?
