
Reenable development build, uses upstream libraries #4696

Merged
merged 7 commits into dask:master from pentschev:reenable-development-build on Apr 15, 2019

Conversation

4 participants

@pentschev (Member) commented Apr 12, 2019

It's been suggested that we reenable development builds, using some libraries from upstream. We'll try here to find a useful yet reliable way of testing those cases for everyday use.

@jakirkham @jcrist @mrocklin @hameerabbasi

  • Tests added / passed
  • Passes flake8 dask

@pentschev referenced this pull request on Apr 12, 2019

Merged

Fixed mean() and moment() on sparse arrays #4525

0 of 2 tasks complete

@hameerabbasi (Contributor) left a comment

LGTM

@@ -52,15 +52,15 @@ jobs:
        - *no_optimize
        - *imports

-    - env: &py36_dev
+    - env: &py37_dev

@jcrist (Member) commented Apr 12, 2019

I'd prefer that this build be marked as an allowed failure: upstream libraries frequently make changes that break our test suite. While we should fix these (either by reporting fixes upstream or by patching locally), they leave our tests red for long stretches of time. A test suite that is always red, but has "known failures that we're ignoring for now", isn't useful. Marking it as an allowed failure lets developers try to fix these failures in the CI (tests are still run and results are still visible), but doesn't block normal development.

@pentschev (Member, Author) commented Apr 12, 2019

@jcrist I agree with your point: a test suite that is always red isn't useful at all. However, I feel that having an entire build that is allowed to fail, or is manually triggered, is probably just as useless as the previous case; it will almost certainly get forgotten or ignored.

Honestly, I don't know what a well-balanced solution for this situation would be, which is very likely the reason the development build got disabled in the first place.

I think the only possible uses for such a build are either as a means to track down when a breaking change was introduced, or as a nightly build that may indeed fail without disturbing contributors with PRs failing all the time. Both cases would involve someone manually checking for issues frequently, and I don't know whether that's realistically useful.

@jcrist (Member) commented Apr 12, 2019

I see a dev build as useful for having CI introspection when debugging fixes for upstream dependencies (say I want to use pyarrow dev with dask; it's nice to use the CI to double-check that my fixes actually work against pyarrow dev). However, with the number of upstream libraries we care about, and the pace of development of these libraries, having a dev build has led to frequent breakages in the past. While we could try to stay on top of these, it's often easier to just wait for the next release and fix them all then. I suspect the latter approach uses less dev time overall (there is some overhead for each patch we make to keep up with e.g. pandas master).

In my comment here I suggested that we either make the build an allowed failure or use Travis conditional build logic to make running it optional. This still allows checking whether changes fix existing upstream bugs without slowing down our development pace.

@pentschev (Member, Author) commented Apr 12, 2019

Fair enough. I'd say having such a dev build run nightly (assuming Travis allows that, of course) would then be a useful thing to have; this way we can track how upstream is developing without having to trigger that build manually.

@jcrist (Member) commented Apr 12, 2019

I suggest still allowing a way to run the build in a PR with a conditional build. We already do this for the hdfs tests:

dask/.travis.yml

Lines 71 to 78 in c9285c5

    - env:
        - TEST_HDFS='true'
      if: type != pull_request OR commit_message =~ test-hdfs  # Skip on PRs unless the commit message contains "test-hdfs"
      sudo: true
      services:
        - docker
      before_install:
        - source continuous_integration/hdfs/startup_hdfs.sh

I think the following should work:

- env:
  if: type == cron OR commit_message =~ test-upstream  # Run on cron jobs or any build where the commit message contains "test-upstream"
  ...

We'd also want to add:

- env:
  if: type != cron  # don't run on cron builds
  ...

to the other builds.

We can then add a cron build as described here: https://docs.travis-ci.com/user/cron-jobs/
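
Putting those two fragments together, the dev-build entry might end up looking roughly like this (a sketch only: the py37_dev anchor comes from the diff above, while the PYTHON and UPSTREAM_DEV variables are illustrative assumptions rather than the actual contents of dask's .travis.yml):

    # Dev build against upstream packages: runs on nightly cron builds,
    # or when the commit message contains "test-upstream" (illustrative sketch).
    - env: &py37_dev
        - PYTHON=3.7
        - UPSTREAM_DEV=1  # hypothetical flag selecting upstream/nightly packages
      if: type == cron OR commit_message =~ test-upstream

    # Regular builds would get the opposite condition if they should skip the cron run.
    - env: &py37
        - PYTHON=3.7
      if: type != cron

With that in place, a nightly cron job configured in the Travis repository settings would exercise the dev build, while any contributor could still opt in on a PR by including "test-upstream" in a commit message.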

@pentschev (Member, Author) commented Apr 15, 2019

@jcrist thanks for the directions, and yes, this is what I meant: allow triggering both during PRs and nightly.

@jakirkham @mrocklin @hameerabbasi what do you guys think of having this dev build running nightly in addition to the possibility of triggering it during PRs?

@pentschev (Member, Author) commented Apr 15, 2019

Btw, I think we should run the other builds nightly as well; I can't think of a reason not to, at least.

@hameerabbasi (Contributor) commented Apr 15, 2019

Per-commit status is always good, especially for auto-merged code. Just because a PR succeeded doesn't mean that merged master will. :-)

@jakirkham (Member) commented Apr 12, 2019

@pentschev, I pushed a commit to try and resolve the merge conflict introduced by merging PR #4525. Hope that is ok.

Based on your comment in the aforementioned PR, I think we should just pin Numba to the last known working version (i.e. 0.41.0) for now to fix the CI.
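
For context, such a pin would be a small change in the conda dependencies installed by CI; a minimal sketch, assuming an environment-file style specification (the file name and surrounding entries are not shown here and may differ in dask's actual CI scripts):

    dependencies:
      - numba=0.41.0  # pin to the last known working version while the regression is investigated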

@pentschev (Member, Author) commented Apr 12, 2019

Thanks @jakirkham, I think this is probably the best solution we can get for now!

@hameerabbasi (Contributor) commented Apr 12, 2019

If someone could get me a reproducer (with versions), I'd like to check once more whether this is an issue with sparse or Numba.

@jakirkham (Member) commented Apr 12, 2019

The CI seems to reproduce it, but I am having trouble doing so locally. It may be that I'm not reproducing the same environment yet.

FWIW I noticed that numba in conda-forge is pinned to numpy version 1.15.*, which could be related (as the same version shows up on CI). This would mean we have a version of numpy that doesn't support the __array_function__ protocol.

@hameerabbasi (Contributor) commented Apr 12, 2019

Hmm. That should have no bearing on whether or not sparse works...

@pentschev (Member, Author) commented Apr 12, 2019

@hameerabbasi sorry, I don't think I understand your question: versions of what? What I observed in Travis' log seems to indicate that the most likely cause is the new version of Numba.

This is a diff I compiled from the passing Python 3.7 build and the failing development build (only the relevant part; I omitted packages that are exactly the same):

--- pass        2019-04-12 22:53:53.505918282 +0200
+++ fail        2019-04-12 22:53:53.737914841 +0200
-arrow-cpp                 0.13.0           py37h73cd1b8_0    conda-forge
+arrow-cpp                 0.11.1           py37h5c3f529_0
-boost-cpp                 1.68.0            h11c811c_1000    conda-forge
-brotli                    1.0.7             hf484d3e_1000    conda-forge
-certifi                   2019.3.9                 py37_0    conda-forge
+certifi                   2019.3.9                 py37_0
-dask                      1.1.5                      py_0    conda-forge
+dask                      1.1.5                      py_0    conda-forge
-gflags                    2.2.2             hf484d3e_1001    conda-forge
+gflags                    2.2.2                he6710b0_0
-glog                      0.3.5             hf484d3e_1001    conda-forge
+glog                      0.3.5                hf484d3e_1
+libboost                  1.67.0               h46d08c1_4
+libevent                  2.1.8                h1ba5d50_0
-libprotobuf               3.7.1                h8b12597_0    conda-forge
-llvmlite                  0.26.0          py37hdbcaa40_1000    conda-forge
+llvmlite                  0.28.0           py37hdbcaa40_0    conda-forge
-numba                     0.41.0          py37h637b7d7_1000    conda-forge
+numba                     0.43.1           py37he1b5a44_0    conda-forge
-numpy                     1.16.2           py37h8b7e671_1    conda-forge
+numpy                     1.17.0.dev0+96d892a           <pip>
-pandas                    0.24.2           py37hf484d3e_0    conda-forge
+pandas                    0.25.0.dev0+390.ge02ec8f89           <pip>
-parquet-cpp               1.5.1                         2    conda-forge
+parquet-cpp               1.4.0+20.nightly               0    twosigma
-pyarrow                   0.13.0           py37h0978efd_0    conda-forge
+pyarrow                   0.11.1           py37he6710b0_0
-re2                       2019.04.01           he1b5a44_0    conda-forge
-snappy                    1.1.7             hf484d3e_1002    conda-forge
+snappy                    1.1.7                hbae5bb6_3
-sparse                    0.7.0                     <pip>
+sparse                    0.7.0+6.g948cee7           <pip>
-thrift-cpp                0.12.0            h0a07b25_1002    conda-forge
+thrift-cpp                0.11.0               h02b749d_3
-zict                      0.1.4                      py_0    conda-forge
+zict                      0.1.4                      py_0    conda-forge
-zstd                      1.3.3                         1    conda-forge
+zstd                      1.3.3                h84994c4_0

@hameerabbasi (Contributor) commented Apr 13, 2019

Hmm... This is weird. The only sparse dependency in that list is Numba, and I can't seem to reproduce the error with the exact same Numba version and sparse master...

The entire test suite passes for me, other than the distributed failures I already mentioned in another thread.

The entire sparse test suite also passes with that Numba version.

I made sure to get the relevant dependencies from conda-forge as well.

@pentschev (Member, Author) commented Apr 15, 2019

@hameerabbasi I've adapted a Dockerfile that I sometimes use to reproduce Dask's Travis builds locally for debugging, and with it I was able to reproduce the errors we see in this PR. If you would like to try it, please make a copy of https://gist.github.com/pentschev/dcbe78364640f3e143795f03c6faaed8 in an empty directory named dask-devbuild (or any other name you prefer) and run it as follows:

cd dask-devbuild
docker build -t dask-devbuild .
docker run -it dask-devbuild /bin/bash -c "cd /usr/src/dask; source /root/miniconda/etc/profile.d/conda.sh; conda activate test-environment; source ./continuous_integration/travis/run_tests.sh"

Let me know if that works for you.

@hameerabbasi (Contributor) left a comment

I would prefer manual triggering, yes.

However, about nightlies, I’m only +1 on them if they don’t make the main badge red.

@jcrist (Member) commented Apr 15, 2019

To make the build an allowed failure, you need to update:

  allow_failures:
    - env: *py36_dev
    - os: osx

to

  allow_failures:
    - env: *py37_dev
    - os: osx

@pentschev (Member, Author) commented Apr 15, 2019

Now it looks like Travis completely ignored building this PR, but I don't see why that happened.

.travis.yml (resolved review conversation, outdated)
@jcrist (Member) commented Apr 15, 2019

Looks good to me, good to merge?

@hameerabbasi (Contributor) commented Apr 15, 2019

Maybe a test commit to check it?

@hameerabbasi (Contributor) commented Apr 15, 2019

LGTM. I guess we can merge.

@pentschev (Member, Author) commented Apr 15, 2019

Let's just wait for this last build to pass, which it looks like it will now (the allowed failure is properly flagged by Travis); after that, it can be merged.

@jcrist (Member) commented Apr 15, 2019

LGTM. Merging. Thanks @pentschev!

@jcrist merged commit 2e0a315 into dask:master on Apr 15, 2019

3 checks passed

codecov/project: 91.24% (target 90%)
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed

@pentschev (Member, Author) commented Apr 15, 2019

Thanks everyone for the reviews!

@pentschev deleted the pentschev:reenable-development-build branch on Apr 17, 2019

asmith26 added a commit to asmith26/dask that referenced this pull request Apr 22, 2019

Reenable development build, uses upstream libraries (dask#4696)
* Reenable development build, uses upstream libraries

* Change development build trigger and failure conditions

* Check test-upstream build

* Fix .travis.yml

* Build test-upstream

jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this pull request May 14, 2019

Reenable development build, uses upstream libraries (dask#4696)
* Reenable development build, uses upstream libraries

* Change development build trigger and failure conditions

* Check test-upstream build

* Fix .travis.yml

* Build test-upstream