Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI/CD refactor and switch to github actions #1620

Merged
merged 213 commits into from
Jan 23, 2022
Merged

CI/CD refactor and switch to github actions #1620

merged 213 commits into from
Jan 23, 2022

Conversation

lmmentel
Copy link
Contributor

@lmmentel lmmentel commented Nov 16, 2021

Reference Issues/PRs

See also #1150
Refactoring of github actions workflows started in #1321

Currently blocked by:

What does this implement/fix? Explain your changes.

This is a rather substantial effort involving changes to how sktime is build and distributed. The main changes are:

  • Add pyproject.toml (see PEP 621 and PEP 633]. The intention behind is to collect all package matadata and dependencies in one place.
  • Add sets of optional dependencies:
    • all_extras all "soft" dependencies
    • dev packages useful for development and testing
    • binder packages required for binder and running notebook examples
    • docs packages required to build the documentation
  • Add section to build time dependencies.
  • Refactor existing CI/CD pipelines to use the new build configuration.
  • Introduced a new CI/CD pipelines for running tests across win, mac, linux and py37, py38, py39 with github actions.
  • Removed the obsolete requirements files.
  • Include wheels build for macosx on arm64.

Removed

  • ❌ Drop support for python3.6
  • ❌ Drop appveyor and azure pipelines

High level diagram for the CI/CD workflows

sktime-cicd

Does your contribution introduce a new dependency? If yes, which one?

It does not introduce any new runtime/install dependencies but it could introduce an optional development depencency/build system such as poetry to simplify development/build/distribution.

What should a reviewer concentrate their feedback on?

This PR introduces changed to the CI/CD pipelines that would hopefully allow to drop appveyor and azure pipelines in favor of github actions which would simplify the current build and maintenance process.

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.

@lmmentel
Copy link
Contributor Author

@mloning if this is a potential way forward it would be great to trigger the CI runs in order to check that everything runs as expected. I tried that locally with act but some of the steps fail since it's hard to reproduce the build env exactly.

@lmmentel
Copy link
Contributor Author

My initial plan is to simplify the pipelines:

  • try to run all on github actions since it support the main os hosts
  • use cibuildwheel to build wheels across the matrix of win, linux, mac and python 3.6, 3.7, 3.8, 3.9
  • publish to PyPI

After testing locally it seems that cibuildwheel does not work well with pure python distributions, an alternative would be to simply use python -m build --sdist --wheel.

Are there any C extensions for sktime?

@lmmentel lmmentel changed the title Add pyproject.toml to manage dependencies Add pyproject.toml and refactor CI pipelines Nov 20, 2021
@fkiraly
Copy link
Collaborator

fkiraly commented Nov 21, 2021

minor comment: is all (morally) the same as all_extras so far? If so, I'd suggest to keep the old name all_extras to avoid to break users' CI/CD which may use pip install sktime[all_extras] or similar.

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 22, 2021

it would be great to trigger the CI runs

Slightly weird, the familiar button to run the CI does not appear. Not sure why, is everything correct?
This is how it looks like to me:
image

@lmmentel
Copy link
Contributor Author

Hmmm, I'm not sure but maybe the unresolved file conflicts are still blocking? Couldn't find any relevant information in the docs about it so I'm guessing here.

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 22, 2021

might be - would you mind merge/resolving?

@lmmentel
Copy link
Contributor Author

I rebased on the latest main and resolved the conflict.

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 22, 2021

I rebased on the latest main and resolved the conflict.

Yes, that was it - now I can "approve and run". Good luck with the tests...

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 22, 2021

uh-oh, you made the linter unhappy, since there are no newlines at the end of the files!

I corrected your terrible mistake 😁 - I hope you don't mind, just so it runs through the interesting part.

@lmmentel
Copy link
Contributor Author

uh-oh, you made the linter unhappy, since there are no newlines at the end of the files!
I corrected your terrible mistake grin - I hope you don't mind, just so it runs through the interesting part.

Thanks!

@lmmentel
Copy link
Contributor Author

lmmentel commented Nov 22, 2021

I'm having some challenges understanding the build process and in particular the contents on the setup.py where a lot is happening. Any chance there exists documentation about it?

To be clear I understand the code but I'm missing the why.

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 22, 2021

To be clear I understand the code but I'm missing the why.

Same here - I think that only @mloning can answer, he created the original script a couple years ago.
Might be best to ping him on slack if you expect there to be a back/forth of questions.

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 22, 2021

I think some of it might be copy-pasted from scikit-learn, so it may be worth investigating there, to.

@lmmentel
Copy link
Contributor Author

It seems that a lot of the build code is quite incompatible with tools like build specifically

It's quite challenging to piece together what's really happening. On top of that both extensions

  • sktime/distances/elastic_cython.pyx
  • sktime/classification/shapelet_based/mrseql/mrseql.pyx

that seem to require all this extra configuration are marked for removal, and the share the following header:

# TODO remove in v0.10.0
# the functionality in this file is depreciated and to be replaced with a version
# based on numba.

Maybe it is worth to deal with that first since then the build pipeline might be simplified. Correct me if I'm wrong but replacing the current extensions with numba will leave only pure python code without necessity for compilation.

I'll join slack and try to initiate the discussion about this over there. I only joined discord but it seems a bit quiet. Is slack the preferred communication channel?

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 23, 2021

I'll join slack and try to initiate the discussion about this over there. I only joined discord but it seems a bit quiet. Is slack the preferred communication channel?

We've been using slack for developer communication, and discord for events mostly, like community meet-up or dev meetings.
User questions and troubleshooting comes in through three channels currently: GitHub discussions, gist, and discord.

We may want to clean up this proliferation of communication tools at some point, but that's just my opinion...

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 23, 2021

FYI, here's the "remove cython" project: https://github.com/alan-turing-institute/sktime/projects/24
@TonyBagnall runs the board. Once on slack, I'd recommend to ping @mloning, @freddyaboulton (for CI/CD) and @TonyBagnall (for cython removal)

@fkiraly fkiraly mentioned this pull request Nov 23, 2021
5 tasks
@lmmentel
Copy link
Contributor Author

FYI, here's the "remove cython" project: https://github.com/alan-turing-institute/sktime/projects/24 @TonyBagnall runs the board. Once on slack, I'd recommend to ping @mloning, @freddyaboulton (for CI/CD) and @TonyBagnall (for cython removal)

Great! There is already a thread on slack about this issue (you're tagged as well, provided I used the right handle 😉 )

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 24, 2021

There is already a thread on slack about this issue

Yes, I saw this after I posted only.

@mloning
Copy link
Contributor

mloning commented Nov 25, 2021

Hi @lmmentel, sorry for the slow reply, in case you're not aware of this already, we're having a CI/CD session tomorrow, would be good to discuss your suggested changes there.

I may have missed discussions elsewhere, a few general comments:

  • A lot of the setup.py related code is quite old and was originally copied from scikit-learn.
  • Moving everything to GitHub Actions would simplify things, but may slow down CI/CD due to the limit on parallel jobs. Moving away from appveyor would be good.
  • Using cibuildwheel seems like a good idea.

@lmmentel
Copy link
Contributor Author

@mloning thanks for the context! I wasn't aware of the CI/CD session, but It would be great if I could join the discussion. Although I'm trying to go step by step, the changes I'm exploring here are rather severe so it would definitely be useful to discuss and get broader perspective.

I don't have any numbers yet in terms of timing of different CI workflows but that would be interesting to collect.

@fkiraly
Copy link
Collaborator

fkiraly commented Nov 25, 2021

but It would be great if I could join the discussion

on discord, tomorrow as part of the community collaboration session starting at 1pm UTC. We'll might have some introductions to new community members and a stand-up first, so I'd expect the CI/CD overview to start around 1:15pm.

@mloning
Copy link
Contributor

mloning commented Nov 26, 2021

There's one other reason why relying on multiple platforms may be preferable over using a single one: sometimes CI platforms break down or make changes which break our pipelines and it may be easier to have some other platform to fall back on in these cases (makes it easier to isolate issues, prevents total standstill of dev work, etc).

@lmmentel
Copy link
Contributor Author

@mloning that is true although I would not underestimate the maintenance effort required to support multiple different technologies for essentially doing the same job. I like to think about it in terms of tradeoffs rather that optimal solutions since with a particular technology choice you gain some desired feature while incurring some debt.

@mloning
Copy link
Contributor

mloning commented Nov 27, 2021

@lmmentel yes good point! One other thing that would be nice to have is to use reusable actions (https://docs.github.com/en/actions/learn-github-actions/reusing-workflows), so that for example the CI pipelines use the same jobs for testing as the release pipeline.

@lmmentel
Copy link
Contributor Author

@fkiraly I rebased the branch on top of latest main. I had to do manual edits in several places and unfortunately cannot guarantee that the documentation has no errors since it was not a straighforward change given that same files we modified and then modified content was moved to different files. That makes it hard to track changes. Before merging consider build the docs locally to make sure they are as expected.

@lmmentel
Copy link
Contributor Author

@fkiraly should be ready to merge 🙂

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright!

Changes to the docs look good too, so ready to go.
I'll give it another skim, and then I'll press the button 😄

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, some minor things:

  • it should be "Martin Walter", not "Marting Walter"
  • while you're there, can you write out my first name (Franz) in the author/maintainer fields?
  • the binder set of dependencies includes LunarCalendar - why is that even there? It does not seem to be used for anything.

fkiraly
fkiraly previously approved these changes Jan 22, 2022
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, looks all good now!

pyproject.toml Outdated Show resolved Hide resolved
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just removed convertdate too - same problem, didn't seem to be used anywhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants