TypeError: sequence item 0: expected str instance, NoneType found on running python setup.py java on source #127
Comments
Thanks again!
Thanks for testing all that out. As you can see, dask-sql does not have very good support for non-conda installations (so far).
Hello @nils-braun
I am planning to do a patch release soon, possibly within the next few days. Hopefully it will be easier after that!
Thanks again for continuing to test even after running into all those blockers. In the related PR I have added a better error message for the error you mentioned, along with some more documentation. After the PR is merged, you should be able to install the development requirements with
* Change conda.yaml to conda.txt and make pip installation more clear
* Give better error message on missing maven (see #127)
* Missing change in Dockerfile
* Added dev requirements
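The `TypeError` in the issue title is what Python raises when `str.join` is handed `None` as its first item, which is what would happen if a lookup for the `mvn` executable came back empty and the result was joined into a command string. A minimal sketch of failing early with a clearer message, assuming the build step locates Maven via `shutil.which`. The helper names `find_maven` and `build_command`, the message wording, and the `clean install` arguments are hypothetical and are not taken from dask-sql's setup.py:

```python
import shutil


def find_maven(executable="mvn"):
    """Return the path to the Maven executable or raise a readable error.

    Hypothetical helper for illustration; not the actual dask-sql code.
    """
    # shutil.which returns None when the executable is not on PATH.
    path = shutil.which(executable)
    if path is None:
        raise OSError(
            f"Could not find '{executable}' on PATH. "
            "Install Maven (e.g. 'sudo apt install maven') and try again."
        )
    return path


def build_command(maven_path):
    """Join command parts into one string, as a build step might do.

    The 'clean install' arguments are an assumption for this sketch.
    """
    return " ".join([maven_path, "clean", "install"])
```

Without the explicit check, `build_command(None)` fails with `TypeError: sequence item 0: expected str instance, NoneType found`, which says nothing about Maven being missing.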
It got much further but still fails some tests:
$ git clone https://github.com/nils-braun/dask-sql.git
$ cd dask-sql
$ sudo apt install maven
$ pip install pytest-cov
$ python setup.py java
$ pytest tests
=============================================== short test summary info ================================================
FAILED tests/integration/test_analyze.py::test_analyze - TypeError: assert_frame_equal() got an unexpected keyword ar...
FAILED tests/integration/test_groupby.py::test_group_by_nan - AssertionError: Attributes of DataFrame.iloc[:, 0] (col...
FAILED tests/integration/test_model.py::test_training_and_prediction - ModuleNotFoundError: No module named 'dask_ml'
FAILED tests/integration/test_model.py::test_clustering_and_prediction - ValueError: Can not import model dask_ml.clu...
FAILED tests/integration/test_model.py::test_iterative_and_prediction - ModuleNotFoundError: No module named 'dask_ml'
================================ 5 failed, 119 passed, 3 skipped, 8 warnings in 47.94s =================================
$ pip install "dask[complete]"
$ pip install dask-ml
$ pytest tests
================================================================= short test summary info ==================================================================
FAILED tests/integration/test_analyze.py::test_analyze - TypeError: assert_frame_equal() got an unexpected keyword argument 'atol'
FAILED tests/integration/test_groupby.py::test_group_by_nan - AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="c") are different
================================================== 2 failed, 122 passed, 3 skipped, 6 warnings in 55.26s ===================================================
The first error seems to be a pandas version error:
$ python
Python 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'1.0.5'
Right! Just for your information, I have added
Ah, you edited your answer at the same moment that I did. Very good! So it seems there is some incompatibility. Could you try again with 1.1.5 if possible?
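The `unexpected keyword argument 'atol'` failure is the typical symptom of calling a function with a keyword that only newer releases of a library accept. One way to stay compatible with both is to inspect the signature before passing the keyword. The sketch below uses two stand-in functions, `compare_old` and `compare_new`, that only mimic an older and a newer comparison signature. They are hypothetical and are not the pandas implementations:

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins: the older signature has no `atol` parameter.
def compare_old(left, right):
    return left == right


def compare_new(left, right, atol=1e-8):
    return abs(left - right) <= atol


def compare(func, left, right, atol=1e-8):
    """Pass `atol` only when the comparison function supports it."""
    if accepts_keyword(func, "atol"):
        return func(left, right, atol=atol)
    return func(left, right)
```

Requiring a minimum library version, as is done later in this thread, is the simpler fix. Signature inspection is mainly useful when older versions have to stay supported.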
Just updated:
$ pip install --upgrade pandas
Collecting pandas
Downloading pandas-1.2.1-cp37-cp37m-manylinux1_x86_64.whl (9.9 MB)
|████████████████████████████████| 9.9 MB 5.5 MB/s
Requirement already satisfied, skipping upgrade: pytz>=2017.3 in /home/saulo/anaconda3/lib/python3.7/site-packages (from pandas) (2020.1)
Requirement already satisfied, skipping upgrade: numpy>=1.16.5 in /home/saulo/anaconda3/lib/python3.7/site-packages (from pandas) (1.19.4)
Requirement already satisfied, skipping upgrade: python-dateutil>=2.7.3 in /home/saulo/anaconda3/lib/python3.7/site-packages (from pandas) (2.8.1)
Requirement already satisfied, skipping upgrade: six>=1.5 in /home/saulo/anaconda3/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
ERROR: dask-sql 0.3.0 has requirement pandas<1.2.0, but you'll have pandas 1.2.1 which is incompatible.
Installing collected packages: pandas
Attempting uninstall: pandas
Found existing installation: pandas 1.0.5
Uninstalling pandas-1.0.5:
Successfully uninstalled pandas-1.0.5
Successfully installed pandas-1.2.1
Despite the error, it passed the tests with some warnings:
===================================================================== warnings summary =====================================================================
tests/integration/test_rex.py::test_like
tests/integration/test_rex.py::test_date_functions
/mnt/d/Programs/dask/dask-sql/dask_sql/context.py:201: DeprecationWarning: register_dask_table is deprecated, use the more general create_table instead.
DeprecationWarning,
tests/integration/test_rex.py::test_math_operations
tests/integration/test_rex.py::test_math_operations
/home/saulo/anaconda3/lib/python3.7/site-packages/pandas/core/arraylike.py:358: RuntimeWarning: invalid value encountered in arcsin
result = getattr(ufunc, method)(*inputs, **kwargs)
tests/integration/test_rex.py::test_math_operations
tests/integration/test_rex.py::test_math_operations
/home/saulo/anaconda3/lib/python3.7/site-packages/pandas/core/arraylike.py:358: RuntimeWarning: invalid value encountered in arccos
result = getattr(ufunc, method)(*inputs, **kwargs)
tests/integration/test_rex.py::test_date_functions
/home/saulo/anaconda3/lib/python3.7/site-packages/dask/dataframe/accessor.py:88: FutureWarning: Series.dt.weekofyear and Series.dt.week have been deprecated. Please use Series.dt.isocalendar().week instead.
if callable(getattr(self._meta, key)):
tests/integration/test_rex.py::test_date_functions
tests/integration/test_rex.py::test_date_functions
/home/saulo/anaconda3/lib/python3.7/site-packages/dask/dataframe/accessor.py:43: FutureWarning: Series.dt.weekofyear and Series.dt.week have been deprecated. Please use Series.dt.isocalendar().week instead.
out = getattr(getattr(obj, accessor, obj), attr)
-- Docs: https://docs.pytest.org/en/latest/warnings.html
======================================================= 124 passed, 3 skipped, 9 warnings in 51.46s ========================================================
The warnings are fine (they are deprecations in the pandas <-> dask interface). Thank you so much for testing all of this out. I will increase the minimum required pandas version to 1.1.0 (which works, I just tested).
Always a pleasure to help and to add a great tool to the toolbox :)
After running:
$ python dask-sql-test.py
name id x
0 Tim 1017 0.999988
1 Norbert 994 0.999949
0 Frank 983 0.999970
0 Oliver 990 0.999987
1 Dan 979 0.999991
0 Michael 1021 0.999992
0 Quinn 1012 0.999973
1 Xavier 986 0.999925
0 Charlie 961 0.999986
1 Alice 1003 0.999981
0 Ingrid 1030 0.999991
0 Zelda 1050 0.999923
0 Sarah 1084 0.999987
0 Edith 1013 0.999989
0 Ursula 1015 0.999990
1 Patricia 974 0.999999
0 Jerry 994 0.999999
0 Wendy 1000 0.999990
0 Laura 1014 0.999986
0 Ray 975 0.999939
1 Hannah 940 0.999986
0 Yvonne 1033 0.999981
0 Bob 976 0.999978
0 George 1026 0.999993
1 Kevin 992 0.999981
0 Victor 983 0.999997
0.9999788188689883
I have added a PR in #129 that fixes the tests for pandas 1.0 and 1.1. The requirement is now only >=1.0 and <1.2 (we still need the upper bound because of dask/dask#7156).
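To make the `>=1.0` and `<1.2` range concrete, here is a small sketch that checks the pandas versions seen in this thread against it. The helpers `parse_version` and `in_supported_range` are hypothetical and handle only plain numeric versions such as `1.0.5`. Real installers use a full version-specifier implementation:

```python
def parse_version(version):
    """Turn '1.0.5' into (1, 0, 5); only plain numeric releases are handled."""
    return tuple(int(part) for part in version.split("."))


def in_supported_range(version, minimum="1.0", upper="1.2"):
    """Check minimum <= version < upper using tuple comparison."""
    return parse_version(minimum) <= parse_version(version) < parse_version(upper)


for candidate in ("1.0.5", "1.1.5", "1.2.1"):
    print(candidate, in_supported_range(candidate))
```

With these bounds, 1.0.5 and 1.1.5 fall inside the range and 1.2.1 falls outside it, which matches the `pandas<1.2.0` complaint pip printed earlier in the thread.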
The upper pandas version requirement is gone now - the problem in dask is fixed.
$ java -version
openjdk version "14.0.2" 2020-07-14
OpenJDK Runtime Environment (build 14.0.2+12-Ubuntu-120.04)
OpenJDK 64-Bit Server VM (build 14.0.2+12-Ubuntu-120.04, mixed mode, sharing)