scheduler and worker fail in PyPy3 #2734

Open
superit23 opened this issue May 30, 2019 · 4 comments

@superit23

Python version: Python 3.5.3 (89428233efed, Apr 12 2018, 16:18:00)
Platform: pypy3-5.10
OS: Fedora 28 5.0.16-100.fc28.x86_64

Calling dask-scheduler or dask-worker fails due to GC tuning from #1653.

[root@localhost ~]$ dask-scheduler
Traceback (most recent call last):
  File "/usr/lib64/pypy3-5.10/bin/dask-scheduler", line 10, in <module>
    sys.exit(go())
  File "/usr/lib64/pypy3-5.10/site-packages/distributed/cli/dask_scheduler.py", line 245, in go
    main()
  File "/usr/lib64/pypy3-5.10/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib64/pypy3-5.10/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib64/pypy3-5.10/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib64/pypy3-5.10/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib64/pypy3-5.10/site-packages/distributed/cli/dask_scheduler.py", line 141, in main
    g0, g1, g2 = gc.get_threshold()  # https://github.com/dask/distributed/issues/1653

Commenting out these lines solves the problem. Maybe include a check using platform.python_implementation() before attempting to tune the GC.
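
A minimal sketch of what such a guard could look like, assuming the tuning only needs to run on CPython. The helper name `tune_gc` and its `factor` parameter are hypothetical and not part of distributed; the multiplier the CLI actually applies is not shown in the traceback above.

```python
import gc
import platform


def tune_gc(factor=3):
    """Scale the GC thresholds on CPython; do nothing on other interpreters.

    `factor` is an illustrative assumption, not the value distributed uses.
    Returns the new thresholds, or None if tuning was skipped.
    """
    # Skip interpreters whose collector is not generation-threshold based.
    if platform.python_implementation() != "CPython":
        return None
    # Extra safety net: only proceed if both functions actually exist.
    if not (hasattr(gc, "get_threshold") and hasattr(gc, "set_threshold")):
        return None
    g0, g1, g2 = gc.get_threshold()
    gc.set_threshold(g0 * factor, g1 * factor, g2 * factor)
    return gc.get_threshold()
```

The `hasattr` check alone would probably be enough; the explicit implementation check just makes the intent easier to read.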

@mrocklin
Member

mrocklin commented May 31, 2019 via email

@superit23
Author

Yeah, should be able to. Just gotta play with getting dev environment up.

@superit23
Author

I'm having difficulties setting up the dev environment on a clean VM (Fedora Server 29, PyPy 6.0.0 (3.5.3)). The current versions of numpy and PyPy have a bug that prevents installation of the dev-requirements.

...
    Traceback (most recent call last):
      File "/usr/lib64/pypy3-6.0/site-packages/pkg_resources/__init__.py", line 359, in get_provider
        module = sys.modules[moduleOrReq]
    KeyError: 'numpy'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-jz_92hpa/pandas/setup.py", line 746, in <module>
        **setuptools_kwargs)
      File "/usr/lib64/pypy3-6.0/site-packages/setuptools/__init__.py", line 145, in setup
        return distutils.core.setup(**attrs)
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/core.py", line 148, in setup
        dist.run_commands()
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/dist.py", line 955, in run_commands
        self.run_command(cmd)
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib64/pypy3-6.0/site-packages/setuptools/command/install.py", line 61, in run
        return orig.install.run(self)
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/command/install.py", line 549, in run
        self.run_command('build')
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/command/build.py", line 135, in run
        self.run_command(cmd_name)
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib64/pypy3-6.0/lib-python/3/distutils/command/build_ext.py", line 347, in run
        self.build_extensions()
      File "/tmp/pip-install-jz_92hpa/pandas/setup.py", line 373, in build_extensions
        build_ext.build_extensions(self)
      File "/tmp/pip-install-jz_92hpa/pandas/setup.py", line 125, in build_extensions
        numpy_incl = pkg_resources.resource_filename('numpy', 'core/include')
      File "/usr/lib64/pypy3-6.0/site-packages/pkg_resources/__init__.py", line 1144, in resource_filename
        return get_provider(package_or_requirement).get_resource_filename(
      File "/usr/lib64/pypy3-6.0/site-packages/pkg_resources/__init__.py", line 361, in get_provider
        __import__(moduleOrReq)
      File "/usr/lib64/pypy3-6.0/site-packages/numpy/__init__.py", line 142, in <module>
        from . import core
      File "/usr/lib64/pypy3-6.0/site-packages/numpy/core/__init__.py", line 40, in <module>
        from . import multiarray
      File "/usr/lib64/pypy3-6.0/site-packages/numpy/core/multiarray.py", line 44, in <module>
        _reconstruct.__module__ = 'numpy.core.multiarray'
    AttributeError: readonly attribute '__module__'
    ----------------------------------------
ERROR: Command "/usr/bin/pypy3 -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-jz_92hpa/pandas/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-5kwohlby/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-jz_92hpa/pandas/

After pinning numpy to 1.15.4, numpy and pandas installed successfully. However, running pytest now fails with a new error:

ImportError while importing test module '/root/Git/distributed/distributed/protocol/tests/test_pandas.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib64/pypy3-6.0/site-packages/pandas/__init__.py:26: in <module>
    from pandas._libs import (hashtable as _hashtable,
E   ImportError: /usr/lib64/pypy3-6.0/site-packages/pandas/_libs/hashtable.pypy3-60-x86_64-linux-gnu.so: undefined symbol: Py_IS_NAN

During handling of the above exception, another exception occurred:
distributed/protocol/tests/test_pandas.py:4: in <module>
    import pandas as pd
/usr/lib64/pypy3-6.0/site-packages/pandas/__init__.py:35: in <module>
    "the C extensions first.".format(module))
E   ImportError: C extension: /usr/lib64/pypy3-6.0/site-packages/pandas/_libs/hashtable.pypy3-60-x86_64-linux-gnu.so: undefined symbol: Py_IS_NAN not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.

I haven't been able to find information on how to solve this one. The installed pandas and numpy versions do not seem to conflict, and both were installed using pip.

[root@localhost distributed]# pipdeptree -p pandas
pandas==0.24.2
  - numpy [required: >=1.12.0, installed: 1.15.4]
  - python-dateutil [required: >=2.5.0, installed: 2.8.0]
    - six [required: >=1.5, installed: 1.12.0]
  - pytz [required: >=2011k, installed: 2019.1]

I could submit a PR, but I wouldn't be able to run the test suite with PyPy. I could still verify that the tests succeed with CPython, and that the worker and scheduler run with PyPy. How do we want to proceed?

@TomAugspurger
Member

NumPy and pandas shouldn't be required. We may attempt to import them to register special serializers, but I believe all the tests requiring them should be skipped.
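
As a rough illustration of the "attempt to import, otherwise carry on" pattern described here, the sketch below uses a hypothetical `optional_import` helper. It is not distributed's actual registration code, only a stdlib-only example of treating a dependency as optional.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it cannot be imported.

    ImportError also covers a broken C extension, such as the pandas
    failure shown earlier in this thread.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Only touch the optional dependency when the import succeeded.
pd = optional_import("pandas")
if pd is not None:
    pass  # a pandas-specific serializer could be registered here
```

In a test module, `pytest.importorskip("pandas")` plays a similar role by skipping the tests when the import fails.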
