Migrate CI to GitHub Actions #2637

Open · ssanderson opened this issue Jan 28, 2020 · 1 comment

ssanderson commented Jan 28, 2020

@danisim has been working on a PR at #2634 that adds a GitHub Actions based CI build. GitHub Actions seems like a big improvement over our current use of Travis and Appveyor, for a few reasons:

  • More Workers: GH Actions gives us more free workers than Travis or Appveyor. It also provides the option to self-host workers, which we could potentially use to scale up further.
  • Single Platform: We currently use Appveyor for our windows builds, and we don't have any OSX builds. GH Actions natively supports linux, osx, and windows, which would allow us to consolidate on a single platform.
  • Bigger Ecosystem: GH Actions has a large and growing community of people providing pre-built plugins that encapsulate best practices. It'd be great to be able to take advantage of these.

Since part of the goal of migrating to GH Actions would be to migrate away from our existing Travis/Appveyor builds, for this project to be successful we need to actually drop those builds. Those builds have accumulated quite a lot of complexity over the years, so to make progress here we need to decide which parts of that complexity we want to retain and which we should remove and/or replace.

Current Status

What Does Our Travis Build Do Today?

The basic flow of our Travis build right now is as follows (a condensed sketch follows the list):

  1. Install conda via ci/travis/install_miniconda.sh.
  2. Set dependency versions. We currently build in the following configurations:
    • Python 2.7, Numpy 1.11.3, Pandas 0.18.1
    • Python 3.5, Numpy 1.11.3, Pandas 0.18.1
    • Python 3.5, Numpy 1.14.1, Pandas 0.22.0
  3. Build conda packages for all libraries listed in conda/. If we're running a build after a merge to master, upload the results to anaconda.org under the quantopian channel, and tag them with ci. We do this because many of zipline's dependencies do not have conda packages in the default channel for the versions of python and numpy that we target. A few of our dependencies just don't have conda packages at all, but many of them simply don't have packages for our currently-supported versions.
  4. Create a conda environment and use it to install python, numpy, pandas, and a few other cherry-picked dependencies.
  5. Activate the conda environment, and use pip to install our remaining dependencies.
  6. Run flake8.
  7. Run the zipline test suite.
  8. Run a conda build of the zipline package.
  9. Install the built conda package, using the dependencies created in (3).
  10. If we're running on a merge to master, upload the built zipline package to anaconda.org.
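
For reference, here's a heavily condensed sketch of that flow in Travis-config terms. This is illustrative only, not the actual .travis.yml; aside from ci/travis/install_miniconda.sh and the conda/ directory, the script names, env var names, and paths below are placeholders.

```yaml
# Condensed, illustrative sketch of the current Travis flow (not the real .travis.yml).
language: python
matrix:
  include:
    - python: "2.7"
      env: NUMPY_VERSION=1.11.3 PANDAS_VERSION=0.18.1   # env var names are placeholders
    - python: "3.5"
      env: NUMPY_VERSION=1.11.3 PANDAS_VERSION=0.18.1
    - python: "3.5"
      env: NUMPY_VERSION=1.14.1 PANDAS_VERSION=0.22.0
install:
  - ./ci/travis/install_miniconda.sh          # (1) install conda
  - ./ci/travis/build_dep_packages.sh         # (3) build packages from conda/; upload on master (script name is a placeholder)
  - conda create -q -n testenv python=$TRAVIS_PYTHON_VERSION numpy=$NUMPY_VERSION pandas=$PANDAS_VERSION   # (4)
  - source activate testenv && pip install -r etc/requirements.txt   # (5) remaining deps via pip (requirements path is a guess)
script:
  - flake8                                    # (6)
  - nosetests                                 # (7) test runner assumed
  - conda build conda/zipline                 # (8) recipe path assumed
  - conda install --use-local zipline         # (9)
after_success:
  - ./ci/travis/upload_zipline_package.sh     # (10) on master only (script name is a placeholder)
```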

What Does Our Appveyor Build Do Today?

Basically the same thing as Travis, but on Windows. We also install a few more packages via conda, presumably because pip-installable wheels aren't available for those package versions.

Takeaways

As is hopefully clear from the writeup above, most of the complexity in the Travis/Appveyor build comes from the fact that we install dependencies with a mix of conda and pip, and that we build our own conda packages as part of the zipline build. The fact that we use conda in this way is driven by a few historical factors, many of which (I believe) are no longer applicable to us:

  1. Historically, we used conda to install numpy/python/pandas, because conda provided pre-compiled binaries, and we didn't want to waste time during the build compiling stuff. We also didn't want to have to deal with the hassle of building our own packages on windows. These concerns are less important today because, at least for modern package versions, most of our dependencies provide wheels for the platforms we care about. There's also good support for caching dependency installs via a built-in action (see the example step after this list), so I think we can avoid most of the concerns about spending time building dependencies.
  2. Historically, we built our own conda packages, because there weren't generally-available conda packages for many of our dependencies. Today, conda forge is widely used, and I suspect that many of our home-built packages could be pulled from conda forge instead. The others could likely be added to conda forge by us. This would most likely only work for newer versions of those packages though.
  3. All of this conda machinery primarily exists to support non-Quantopian users of Zipline. Most internal users at Q install zipline as part of larger systems that use pip for dependency management. In particular, the teams that primarily work on Zipline generally don't use conda at all, and they work on either linux or OSX. This means that the Zipline conda machinery, especially the windows machinery, is not used or well understood by most of the Zipline development team.
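
To make the caching point in (1) concrete, a pip cache step in a GitHub Actions workflow could look roughly like this, using the actions/cache action. The cache path, key, and requirements filename here are just one plausible setup, not a decision:

```yaml
# Illustrative pip-cache step; path, key, and requirements filename are assumptions.
- uses: actions/cache@v1
  with:
    path: ~/.cache/pip                # pip's cache dir on linux; differs on osx/windows
    key: ${{ runner.os }}-pip-${{ hashFiles('etc/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
```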

Proposal

I propose that we stop building conda packages for Zipline's dependencies. Instead, we should direct zipline users to install dependencies from conda forge (which I believe is now used by conda by default anyway). This will most likely result in us dropping support for using conda to install zipline with old versions of numpy/pandas/python, but I expect that legacy version support is primarily used by Quantopian, and we don't use conda to install with those old versions anyway.

Assuming we can stop building conda packages for our dependencies, I think we can radically simplify our GitHub Actions build. The structure of the new build would be, essentially:

GitHub Actions CI Build

  1. Create a new virtualenv using the desired Python version.
  2. Install required versions of numpy and Cython. This currently needs to be a separate step, but could be improved in the future by using pyproject.toml. We should cache these installs and re-use them if possible.
  3. Install requirements.
  4. Run lint checks.
  5. Run tests.

On linux, I'd expect the above structure to work for any set of package versions. On windows, it might additionally require that pre-compiled wheels exist for our binary dependencies (although it looks like the windows environment provided by GitHub comes with Visual Studio, so we should be able to build them ourselves).
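
To make that concrete, a minimal workflow along these lines might look something like the following. This is a sketch, not a finished design: the action versions, requirements paths, and test runner are assumptions, and the exact version matrix is an open question below.

```yaml
# .github/workflows/ci.yml -- rough sketch of the proposed build; details are illustrative.
name: CI
on: [push, pull_request]

jobs:
  tests:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: [2.7, 3.5]
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v1       # (1) stands in for "create a virtualenv"
        with:
          python-version: ${{ matrix.python-version }}
      # (a pip cache step like the one sketched earlier could go here)
      - name: Install build-time deps       # (2)
        run: python -m pip install numpy cython   # exact pins would come from the matrix
      - name: Install requirements          # (3)
        run: python -m pip install -r etc/requirements.txt -e .   # paths assumed
      - name: Lint                          # (4)
        run: flake8
      - name: Run tests                     # (5)
        run: nosetests                      # test runner assumed
```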

Replacing Our Conda Build Infrastructure

Our current conda build infrastructure serves three primary purposes:

  1. It builds conda packages for Zipline's dependencies so that conda users can install them.
  2. It builds a conda package for Zipline itself.
  3. It ensures that Zipline's conda package installs properly and works as expected.

My proposal is to eliminate (1) in favor of conda forge, but (2) and (3) are still potentially useful. In particular, it probably still makes sense to have some automated system be responsible for building Zipline's conda packages. I think the interesting question is whether we should still do this "ourselves" as part of the zipline build, or if we should set up zipline on conda forge.
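
If we do keep (2) and (3) in our own CI rather than handing them to conda forge, the shape of that piece could be a separate job along these lines. The recipe path, channels, and conda setup are all assumptions; GitHub-hosted runners document a pre-installed Miniconda exposed via the $CONDA environment variable, which is one option (a setup-miniconda style action from the marketplace would be another):

```yaml
# Rough sketch of a job (under the same `jobs:` key as above) that builds and
# smoke-tests the zipline conda package. Recipe path and channels are assumptions.
  conda-package:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Build the zipline conda package
        run: |
          $CONDA/bin/conda install -y conda-build
          $CONDA/bin/conda build conda/zipline -c conda-forge
      - name: Install and smoke-test the built package
        run: |
          $CONDA/bin/conda create -y -n smoketest --use-local -c conda-forge python=3.5 zipline
          $CONDA/bin/conda run -n smoketest python -c "import zipline"
```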

Open Questions

  • Assuming people are in favor of this proposal, it's going to take some time to migrate the conda machinery to conda forge. Do we need to delay switching to GitHub Actions until that happens? I think my preferred answer to this is no, but I'm curious what others think.
  • Once we've switched over to Actions, what package/environment versions should we be testing? I think we need at least the existing versions supported by the travis/appveyor matrices, but one of the goals for this project should be to make it easier for us to add support for newer python/pandas versions (see discussion at Support for Newer Python Versions #2616).
  • How does this affect CI for Quantopian's other projects? I think alphalens, pyfolio, and empyrical may be implicitly depending on the conda packages we build here.
@richafrank (Member) commented:

xref #2665
