refactor(ci): move to poetry #2937
Hello @cpcloud! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2021-10-15 18:02:14 UTC
I don't think we want to add poetry.lock to the repo. This will freeze versions, so we won't be aware if installing Ibis with the latest dependencies is broken. It will also force us to regenerate the file. Or does linguist-generated mean some magic I'm unaware of?
Yes, that is their entire reason for existence :)
we can check this in CI with another job
Yes, that's a good thing. It prevents two people from having different sets of dependencies. Additionally, we can use dependabot to handle the boring process of upgrading existing dependencies on a regular basis. Lock files are a really great way of preventing a large swath of "works on my machine" problems.
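To make the lock file argument concrete, here is a minimal sketch of the relationship between a declared version range and a locked pin. The package range and pin are hypothetical values chosen for illustration, not taken from this PR, and the version parsing is a toy that only handles plain numeric versions (Poetry's real resolver is far more involved).

```python
def parse(version, width=3):
    """Turn '5.0' into (5, 0, 0) so versions compare as tuples (numeric parts only)."""
    parts = [int(piece) for piece in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def within(version, lower, upper):
    """True when lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)


def check_lock(constraints, lock):
    """Report, per package, whether the locked pin satisfies its declared range."""
    return {
        name: within(lock[name], lower, upper)
        for name, (lower, upper) in constraints.items()
    }


# Hypothetical values for illustration only.
constraints = {"pyarrow": ("1", "6")}  # declared range, i.e. >=1,<6
lock = {"pyarrow": "5.0.0"}            # exact pin recorded in the lock file

print(check_lock(constraints, lock))
```

The declared range stays liberal for users installing the package, while everyone working from the same lock file installs the same pin.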
On the other hand, |
I fail to see your point. The way I see things, there are two different kinds of projects:
I guess I'm missing something, but |
I think this is a misunderstanding of how lock files work. Dependencies are constrained by With that in mind, how does checking in |
As easy as |
@jreback |
my main concern here is to have a lower bound of deps that are tested in the CI that are the oldest supported version. we don't have a reason to bump pyarrow for example from 1.* i think (which is where we had it). ibis likely just works on the oldest versions so we should allow as liberal as possible. now in a future version of course we should bump, but i don't see why we would do that now.
@jreback I've added some minimum version testing jobs, let's see what shakes out.
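As a rough illustration of what a minimum version job needs, the sketch below turns declared ranges into exact pins at their lower bounds. The function names, the constraint strings, and the package versions are all assumptions made up for this example; the PR's actual jobs may derive their pins differently.

```python
def lower_bound(spec):
    """Return the version following '>=' in a comma-separated spec, or None."""
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            return clause[2:].strip()
    return None


def minimum_pins(constraints):
    """Build 'name==version' pins at each package's declared lower bound."""
    pins = []
    for name, spec in sorted(constraints.items()):
        bound = lower_bound(spec)
        if bound is not None:  # skip packages without an explicit lower bound
            pins.append(f"{name}=={bound}")
    return pins


# Hypothetical constraints for illustration only.
constraints = {"pyarrow": ">=1,<6", "pandas": ">=0.25"}
print(minimum_pins(constraints))
```

Installing these pins in a dedicated job checks that the oldest supported versions still work, independently of what the lock file resolves to.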
I also don't understand the new division of builds/backends. Maybe I just got used to what we've got. But those only_backends, linux_only ... builds are not very clear to me. Do you mind explaining a bit why these new builds are better than what we've got now please?
.github/workflows/main.yml
Outdated
python-version:
  - "3.7"
  - "3.9"
pyspark:
  - "2.4.3"
  - "3"
pyarrow:
  - "0.12.1"
  - "5"
exclude:
  - python-version: "3.7"
    pyspark: "3"
    pyarrow: "0.12.1"
  - python-version: "3.9"
    pyspark: "3"
    pyarrow: "0.12.1"
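For readers unfamiliar with matrix builds, the following sketch enumerates the jobs the snippet above would produce. It assumes the GitHub Actions rule that a combination is dropped when it matches every key of an exclude entry; the helper name expand_matrix is made up for this example.

```python
from itertools import product


def expand_matrix(axes, exclude=()):
    """Cross all axis values, then drop combinations matching an exclude rule."""
    keys = list(axes)
    combos = [
        dict(zip(keys, values))
        for values in product(*(axes[key] for key in keys))
    ]
    return [
        combo
        for combo in combos
        # a rule matches when every key/value pair it names agrees with the combo
        if not any(
            all(combo.get(key) == value for key, value in rule.items())
            for rule in exclude
        )
    ]


axes = {
    "python-version": ["3.7", "3.9"],
    "pyspark": ["2.4.3", "3"],
    "pyarrow": ["0.12.1", "5"],
}
exclude = [
    {"python-version": "3.7", "pyspark": "3", "pyarrow": "0.12.1"},
    {"python-version": "3.9", "pyspark": "3", "pyarrow": "0.12.1"},
]

jobs = expand_matrix(axes, exclude)
for job in jobs:
    print(job)
```

The full cross product has eight combinations; the two exclude entries remove the pairing of pyspark "3" with pyarrow "0.12.1" on both Python versions, leaving six jobs.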
I personally find this a step backwards from what we've got now. If these versions can be obtained from the pyproject.toml or poetry files, my preference would be to keep what we've got now (files in ci/deps/...).
There are two very different things in question here, let's discuss them individually:
There are a handful of other jobs whose role is hopefully clear from their name, e.g.,
Thanks for all the information, and for all the work on this. While I agree with your points, and there are clearly some advantages to this approach, there is a trade-off, and we're losing several things:
We had lots of problems with the CI in the past, and I spent months making it simple, robust and fast. To me this PR seems to add a lot of complexity and slower builds, for no gain. I think this PR is making several independent changes (I may be wrong, but I don't think removing the shell scripts in favor of github actions, or the new job division, is related to poetry). Is any of this needed for the automation of packaging?
We can keep
Can you spell out the logic of Generally this script seems to be doing a huge number of unnecessarily manual operations, despite it being only 77 lines. This is abuse of Bash in the name of concision. To my eye this is what's happening:
By my calculation I'm adding 611 (line count of How are you calculating 700+ additional lines?
Which one are you talking about? There are two.
Ultimately yes. I want to move to a model where we reuse github actions, even if that means repeating some of the configuration. That configuration hardly ever changes. Why should we continue to roll our own scripts that are clearly generating broken CI jobs?
If we had a single job, I'd surely prefer your approach to setting up the conda environment. You don't like
Instead, we have that script implemented once, and in each job we simply have:
I'm open to your approach, even if I don't think it adds value, and it creates a lot of duplication. But I think this discussion will be much more efficient if you propose each of the changes here in a separate PR. There are many unrelated things being mixed, and it's difficult to know what each of them involves with this huge PR. Would you mind opening separate PRs for:
I think discussions will be much more focussed, and reviewing much more efficient, if we don't mix things.
I will open separate PRs. Just to set expectations:
ok
ok
sounds good
sure
These need to be done together because poetry doesn't know anything about versioneer, and importlib metadata replaces the |
Small modification: the first two steps (yaml refactoring and setup_env.sh) don't make sense to split because of the minimum deps issue.
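Since the thread mentions importlib metadata in connection with versioneer, here is a minimal sketch of looking up an installed distribution's version through the standard library's importlib.metadata (Python 3.8+). The helper name, the fallback value, and the distribution name passed in are assumptions for illustration; the thread does not spell out how the PR wires this in.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution, default="unknown"):
    """Return the installed version of a distribution, or a fallback if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # Raised when no installed distribution has that name,
        # e.g. when running from an uninstalled source checkout.
        return default


# Hypothetical distribution name, chosen so the fallback path is exercised.
print(installed_version("no-such-distribution-for-this-example"))
```

With this approach the version string lives in the package metadata written at build time, so no generated version module needs to be kept in the source tree.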
dev/poetry2recipe.py
Outdated
from ruamel.yaml import YAML


def main(
@cpcloud could you consolidate the two poetry scripts into a single one?
Eventually I think we should have a python library containing all the required functionality for development. Including the poetry conversion scripts, datamgr, impalamgr and possibly other utilities in the future. It would make testing the utility scripts easier as well.
Maybe but these are pretty different tools, so I'm not sure putting them into one place makes sense. I think making poetry2recipe a separate package outside of ibis would be useful.
After talking with @mariusvniekerk I think we can remove poetry2recipe altogether after the next release of ibis, and depend on conda-forge machinery to discover dependencies.
.github/workflows/main.yml
Outdated
PYTEST_BACKENDS: ${{ join(fromJSON(steps.set_backends.outputs.backends), ' ') }}
run: |
  pytest \
    ibis/backends/{${{ join(fromJSON(steps.set_backends.outputs.backends), ',') }}} \
Why not pytest ibis/backends/? I assume PYTEST_BACKENDS should automatically skip the backends that were not requested; otherwise we should tune that.
I will open up a new issue, I don't think it currently works that way unfortunately.
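For context on the workflow step under discussion, this sketch mimics what the join(fromJSON(...), ',') expression produces and what the shell is then expected to expand it into. The backend names are illustrative, and the helper names are invented for this example. One caveat worth knowing: bash only performs brace expansion when the braces contain a comma or a sequence expression, so a single-backend list would leave the braces in the path as literal characters.

```python
import json


def brace_pattern(backends_json, root="ibis/backends"):
    """Mimic the workflow expression: join the JSON list with commas inside braces."""
    backends = json.loads(backends_json)
    return root + "/{" + ",".join(backends) + "}"


def backend_paths(backends_json, root="ibis/backends"):
    """Build one explicit path per backend, which also works for a single backend."""
    backends = json.loads(backends_json)
    return [f"{root}/{name}" for name in backends]


# Illustrative backend list; the real value comes from a previous workflow step.
backends_json = '["pandas", "sqlite"]'

print(brace_pattern(backends_json))
print(backend_paths(backends_json))
```

Passing explicit paths, as in backend_paths, sidesteps the single-element case at the cost of a slightly longer command line.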
a question
thanks @cpcloud. as a follow-up, can you verify that we have sufficient documentation around how to use this (e.g. where / how to update the min versions of things)?
Yup, there's some more automation to implement as well, for example to avoid manually updating minor/patch versions of dependencies.
🎉 This PR is included in version 2.1.0 🎉 The release is available on:
Your semantic-release bot 📦🚀
This PR is an extraction from #2924 of just the poetry/ci bits.
The piece of code here with the most changes is the main CI workflow (.github/workflows/main.yml). This file is the main thing to focus on for review here.