Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to poetry for inspection #827

Merged
merged 43 commits into from Mar 7, 2024

Conversation

glatterf42
Copy link
Collaborator

@glatterf42 glatterf42 commented Mar 1, 2024

Closes #823 and #831.

Please confirm that this PR has done the following:

  • CI updated
  • Documentation Added
  • Name of contributors Added to AUTHORS.rst
  • Description in RELEASE_NOTES.md Added

Description of PR

This PR migrates the contents of ruff.toml, setup.cfg and setup.py to pyproject.toml to use poetry for package management. Please check that no information was lost and everything still works as usual.

@glatterf42
Copy link
Collaborator Author

glatterf42 commented Mar 1, 2024

Some notes about pyproject.toml:

  • Authors need to be listed with name and email, I'm not sure we can just include a file. I didn't do the detective work to find as many email addresses as possible, so there are some blanks in there.

  • Some classifiers like the supported Python version and license are added automatically.

  • The RELEASE_NOTES still need adapting

  • The CI workflows might also need adaptation to use poetry

  • The docs don't mention poetry yet

pyproject.toml Outdated Show resolved Hide resolved
@glatterf42
Copy link
Collaborator Author

At the moment, I did not migrate the legacy-pytest workflow to poetry since I'm not sure what the best way is to install these specific versions. We could define them for the oldest supported python version and then run the legacy-pytest workflow on precisely that python version as suggested here.

@danielhuppmann
Copy link
Member

When installing ixmp4 into my pyam-dev-environmemt, I was able to simply do pip install -e . in the ixmp4 directory - seems that pip can handle the pyproject toml directly (without needing to install poetry).

@danielhuppmann
Copy link
Member

Also, the nightly tests should not use the poetry-lock packages but should install the latest available versions of all packages. This test is helpful as a warning when e.g pandas releases a new version that is incompatible with the existing codebase. So this tests replicates the behavior when a user installs pyam from pypi with the latest available dependencies.

@glatterf42
Copy link
Collaborator Author

The nightly tests now update the dependencies before installing them, this should be the desired behaviour, if I understand correctly.

Also, I similarly think you can run pip install -e here, but you don't have to. Using poetry instead takes care of venvs and compatible dependency versions, etc.

@glatterf42

This comment was marked as resolved.

@glatterf42

This comment was marked as resolved.

Copy link

codecov bot commented Mar 5, 2024

Codecov Report

Attention: Patch coverage is 85.71429% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 95.1%. Comparing base (ddbb88e) to head (c7821de).

Files Patch % Lines
pyam/__init__.py 33.3% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main    #827     +/-   ##
=======================================
+ Coverage   95.0%   95.1%   +0.1%     
=======================================
  Files         64      63      -1     
  Lines       6134    6130      -4     
=======================================
+ Hits        5828    5831      +3     
+ Misses       306     299      -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@glatterf42
Copy link
Collaborator Author

@danielhuppmann @phackstock All tests are passing now, so please review if you like this setup. Please also consider the following notes:

Caching on Windows runners

For the pytest workflow, we might want to cache poetry's venv on all runners, including the Windows runners. If that doesn't work as expected, this is a known issue and a solution has been suggested here: https://github.com/marketplace/actions/install-poetry-action#caching-on-windows-runners

Activating poetry's venv in a CI workflow

To avoid having to specify different paths to activation scripts on different runners in the CI, the install poetry action we use offers an environment variable that we could use to activate the venv in an OS-agnostic way (see the second point here). However, for all current workflows, this is never necessary since we can also use poetry run to run command line tools inside the venv without activating it explicitly first, so e.g. poetry run pytest --mpl works perfectly fine.

Default version constraints for poetry

One of poetry's main downsides is that it doesn't support PEP 621 yet, which (among other things) appears in the default form that dependency requirements take:

(pyam-iamc-py3.12) fridolin@fridolin-Latitude-5520 ~/pyam (migrate/to-poetry)> (.venv) poetry add black --group dev
Using version ^24.2.0 for black

By default, poetry will try to use the latest version and add a caret specification to it, which allows limited updates. In particular, caret requirements will prevent poetry from picking up new major version releases.
Since you want the nightly workflow to test latest versions, I'm calling poetry update before poetry install and I specified all requirements as >=, but be careful when adding new requirements to explicitly use the >= if you don't want the default ^.
Please also note that the PR checks don't include the nightly workflow (which might be worth adapting just for one run once everything else is fine).

If you decide to merge this PR, we should of course still adapt the install instruction docs as well as the release notes.

Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall. One thing to make sure before merging is that pyam.__version__ still works correctly.
A few minor comments in line below.

.github/workflows/nightly.yml Outdated Show resolved Hide resolved
.github/workflows/pytest-legacy.yml Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
@glatterf42
Copy link
Collaborator Author

Okay, let me write up what I've learned about caching for GitHub Actions now :)
There are several ways to do it. The setup-python action already has caching ability implemented and also showcases some examples. This works for various programming languages and for various package managers for python, one of them explicitly being poetry.
However, this action says it uses toolkit/cache under the hood, which seems to be the same as the official cache action, so I figure we can use the original (as is also showcased by the install-poetry action).
When called, the complete cache action tries to restore a cache from the key provided to the path provided. It then provides an output option that we can use to decide if further steps should be run. After the final step of the workflow is run, the cache action tries to save a cache to the key provided.
By now, the action also offers both parts of this functionality as separate actions, so we can fine-tune when cache should be restored and saved. They even provide a whole document on caching strategies on how to utilize these options best.

This got me thinking: before 65a82af, we restored the cache in the pytest-legacy workflow from the poetry.lock file on main and then downgraded and installed some dependency versions. This had the possibility of still using some of the cached packages, but we could make better use of the cache. So I switched the order of adding packages to the lock file and restoring a cache.
The idea is now this: after installing poetry, we use poetry add --lock ... to add the pinned dependency versions to poetry.lock. This file will therefore change during the CI run, but this change will not be saved or committed to git or some such! We then call the cache action with a key that references the hash of the lock file. This ensures the cache will be updated if and only if our pytest-legacy-poetry.lock changes. After running the tests as usual, the cache action will finally save a cache based on our pytest-legacy-poetry.lock, which should be exactly what we want :)

@glatterf42

This comment was marked as outdated.

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I installed this (using pip install -e .) in a clean conda environment and it worked as expected (including dynamically changing the version when checking out a different commit).

A few minor issues inline.

.github/workflows/build-docs.yml Outdated Show resolved Hide resolved
CONTRIBUTING.rst Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
@glatterf42
Copy link
Collaborator Author

I installed this (using pip install -e .) in a clean conda environment and it worked as expected (including dynamically changing the version when checking out a different commit).

Were you able to install the optional groups as well?
Because to clarify, I'm not sure you currently can install them via pip. We currently define the optional dependencies pyam requires as optional dependency groups, but in theory, we could also list them as extras. I think the current way is clearer, though.

@danielhuppmann
Copy link
Member

Were you able to install the optional groups as well?

You are right, this is ignored. Given that we only have a handful of packages per group, I think the extras-approach would still be clear enough, right?

@glatterf42
Copy link
Collaborator Author

I'll convert the optional groups to extras, then.

@danielhuppmann
Copy link
Member

Thank you!

@glatterf42
Copy link
Collaborator Author

No problem!
Some notes for handling extras:

  • Generally speaking, you should be able to equivalently use either --extras or -E
  • To add a new extra, use poetry add xyz --optional --extras <names of extras>. I'm not sure which form <names of extras> needs to take, I'm hoping a single extra will do fine by itself, but it might need "extra_name" or "extra_1 extra_2", if it's supposed to go into several extras
  • poetry update seems to target all packages, even optional ones
  • Use poetry install --all-extras to install all extras or specify desired extras via poetry install --extras "extra_1 extra_2"
  • pip install -e .[...] should now work as expected

@glatterf42
Copy link
Collaborator Author

Just noticing myself: poetry install (without -E) after poetry install -E will remove packages installed via -E.

@glatterf42
Copy link
Collaborator Author

The tests for python 3.12 will fail again for this reason: pandas-datareader is still using distutils, which was deprecated and removed with python 3.12. Setuptools still contains it, that's why it was found earlier when we still included setuptools as a dependency, but I'm not sure we should do that. We could instead install it from somewhere else, e.g. here, but I'm also not sure this is the best course of action.
Looking at pandas-datareader, the last version was published in 2021, even though their main branch seems somewhat active with a commit five months ago. Is this package integral to pyam's functionality or could we replace it with something more modern?

@danielhuppmann
Copy link
Member

Looks good, pip install -e.[...] now install the extras as expected. Note that -e for pip means "editable", meaning that source code will not be move to site-packages. Makes it easier to develop code because you don't have to run pip install every time...

Th pandas-datareader is used to access the World Bank data sources, but it could probably be replaced by https://pypi.org/project/wbdata/. Simply remove the test and the installation in the GitHub Actions and start an issue so that I don't forget to migrate to the new package (or a good alternative).

@glatterf42
Copy link
Collaborator Author

A small correction first: poetry add wbdata --extras optional_io_formats --optional will add wbdata = {version = "^1.0.0", optional = true, extras = ["optional_io_formats"]} to pyproject.toml, which is not quite what we want. It looks like there's currently no CLI option to immediately add a dependency to one of pyam's extras, so this needs to be done manually.

@glatterf42
Copy link
Collaborator Author

Please note: my mypy is not happy with the recent changes to

logging.configure_logging()

It claims that logging has no attribute configure_logging.

@danielhuppmann
Copy link
Member

Just remove the worldbank-feature entirely from the tests for now, and start an issue as a reminder to re-insert later.

@glatterf42
Copy link
Collaborator Author

glatterf42 commented Mar 7, 2024

Just remove the worldbank-feature entirely from the tests for now, and start an issue as a reminder to re-insert later.

But I'm pretty sure it's fixed here.

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, looks great!

@danielhuppmann danielhuppmann merged commit e77080c into IAMconsortium:main Mar 7, 2024
14 checks passed
@glatterf42 glatterf42 deleted the migrate/to-poetry branch March 7, 2024 11:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrating to Poetry for project management
3 participants