CI: Restore PySpark tests #2212

timothydijamco
Contributor

See #2201.

This PR restores and troubleshoots the PySpark tests in CI, but not the Spark tests, which have issues of their own.

Locally I'm seeing the same failing PySpark tests that were mentioned in #2201, and I expect the CI for this PR to fail on the same tests for now.

I'm doing this in parallel with #2205, which has other CI config improvements that I'd guess we want (probably with some merging in the future).

@timothydijamco
Contributor Author

It looks like the CI will fail before it even gets to the tests, until #2211 is in

--

In the meantime, troubleshooting progress notes:

Currently, these PySpark tests fail locally (and probably will in the CI here too):

FAILED ibis/tests/test_version.py::test_import_time - assert 4.270475671000895 < 2.0
FAILED ibis/tests/all/test_temporal.py::test_strftime[PySpark-%Y%m%d-%Y%m%d] - py4j.protocol.Py4JJavaError: An error occurred while calling o...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_scalar[PySpark-2017-01-03-1-Tuesday] - py4j.protocol.Py4JJavaError: An error occurre...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_scalar[PySpark-2017-01-04-2-Wednesday] - py4j.protocol.Py4JJavaError: An error occur...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_scalar[PySpark-2017-01-06-4-Friday] - py4j.protocol.Py4JJavaError: An error occurred...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_scalar[PySpark-2017-01-01-6-Sunday] - py4j.protocol.Py4JJavaError: An error occurred...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_scalar[PySpark-2017-01-05-3-Thursday] - py4j.protocol.Py4JJavaError: An error occurr...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_scalar[PySpark-2017-01-07-5-Saturday] - py4j.protocol.Py4JJavaError: An error occurr...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_scalar[PySpark-2017-01-02-0-Monday] - py4j.protocol.Py4JJavaError: An error occurred...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_column_group_by[PySpark-<lambda>-<lambda>0] - py4j.protocol.Py4JJavaError: An error ...
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_column[PySpark] - py4j.protocol.Py4JJavaError: An error occurred while calling o671....
FAILED ibis/tests/all/test_temporal.py::test_day_of_week_column_group_by[PySpark-<lambda>-<lambda>1] - py4j.protocol.Py4JJavaError: An error ...
FAILED ibis/tests/all/test_vectorized_udf.py::test_elementwise_udf[PySpark] - py4j.protocol.Py4JJavaError: An error occurred while calling o7...
======================== 13 failed, 4302 passed, 1098 skipped, 243 xfailed, 2 xpassed, 15 warnings in 304.33s (0:05:04) ========================

In requirements-3.6-dev.yml and requirements-3.7-dev.yml, changing

  - pyarrow>=0.13

to

  - pyarrow>=0.13, <0.15.0

makes the tests pass (locally) for 3.6 and 3.7
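
To make the effect of the pin concrete, here is a minimal sketch of which versions a `>=0.13, <0.15.0` range admits. The helper names `version_ge` and `in_pin_range` are hypothetical, the sample versions are only illustrative, and the comparison assumes purely numeric dotted versions; Conda's own version matching is more involved than this.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds when dotted version $1 >= $2.
# Assumes every component is a plain non-negative integer.
version_ge() {
  local IFS=.
  local -a a=($1) b=($2)
  local i x y
  for ((i = 0; i < ${#a[@]} || i < ${#b[@]}; i++)); do
    x=${a[i]:-0}   # missing components count as 0, so 0.13 == 0.13.0
    y=${b[i]:-0}
    ((10#$x > 10#$y)) && return 0
    ((10#$x < 10#$y)) && return 1
  done
  return 0
}

# Succeeds when $1 satisfies the pin from this comment: >=0.13, <0.15.0
in_pin_range() {
  version_ge "$1" 0.13 && ! version_ge "$1" 0.15.0
}

for v in 0.12.1 0.13 0.14.1 0.15.0; do
  if in_pin_range "$v"; then
    echo "pyarrow $v: allowed by the pin"
  else
    echo "pyarrow $v: excluded by the pin"
  fi
done
```

The upper bound is exclusive, so 0.15.0 itself is rejected while any 0.14.x release is accepted.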

However, when pymapd is re-enabled (as it will be in #2205), Conda reports that the requirements conflict

Will continue troubleshooting

@jreback
Contributor

jreback commented May 20, 2020

@timothydijamco so what we want to do is have a separate .yaml for each of the 3 backend sets, so there are no conflicts :->

IOW, simply add one with deps as permissive as you can

Contributor

@datapythonista datapythonista left a comment

I'm refactoring all this code in #2205. The problem with Spark is not so much adding it back, but that the Spark tests are failing (that's why they were disabled). Maybe you want to have a look at that?

@@ -81,7 +96,10 @@ jobs:
displayName: 'Setup BigQuery credentials'
condition: eq(variables['System.PullRequest.IsFork'], 'False')

- bash: make start PYTHON_VERSION=$PYTHON_VERSION BACKENDS="${BACKENDS}"
- bash: |
if [ ! -z "${BACKENDS}" ]; then
Contributor

I think -n is the opposite of -z ;)
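
For reference, a small sketch of how the two `test` operators behave: `-z` is true for an empty string and `-n` is true for a non-empty one, so `[ ! -z "$x" ]` and `[ -n "$x" ]` are equivalent. The function names and sample values below are hypothetical and are not taken from the CI config.

```bash
#!/usr/bin/env bash

# Hypothetical helper: reports whether a BACKENDS-style value is set, using -n.
backends_state() {
  local value="$1"
  if [ -n "$value" ]; then      # -n: true when the string is non-empty
    echo "non-empty"
  else
    echo "empty"
  fi
}

# The same check written with a negated -z, as in the diff above.
backends_state_negated_z() {
  local value="$1"
  if [ ! -z "$value" ]; then    # -z: true when the string is empty
    echo "non-empty"
  else
    echo "empty"
  fi
}

backends_state ""
backends_state "postgres sqlite"
backends_state_negated_z ""
backends_state_negated_z "postgres sqlite"
```

Both spellings give the same answer; `-n` just avoids the double negative. Quoting the variable matters in either form, since an unquoted empty value would leave `[ -n ]` with a single argument, which evaluates as true.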

@datapythonista
Contributor

@timothydijamco I think you'll want to close this PR. #2205 has been merged, and I don't think the changes here are relevant anymore. To restore Spark, it should be "as easy" as fixing the Spark tests, and then you can just remove the not pyspark stuff in this line: https://github.com/ibis-project/ibis/blob/master/ci/azure/linux.yml#L20

Closing this, but let me know if you disagree and want to continue here, happy to reopen.

@timothydijamco
Contributor Author

@datapythonista
Thanks for the comments! I think we can keep this open to track troubleshooting the PySpark tests. I wasn't really wedded to any particular way of re-enabling the PySpark tests in the CI, because I started this in parallel with #2205, which has other CI config improvements. But I did want to re-enable them in some way in this PR from the start (88ad219), to see how the CI would behave as I did the troubleshooting.

I'll do this (below) now for the re-enabling aspect since #2205 is in:

just remove the not pyspark stuff in this line: https://github.com/ibis-project/ibis/blob/master/ci/azure/linux.yml#L20

@jreback
I'll try that out, thanks

@datapythonista
Contributor

Sure, but I think it'll be easier to start in a new PR; the conflicts here won't be trivial to fix.

@timothydijamco
Contributor Author

@datapythonista
I think you're right, I'll start a new PR to keep things clean
