CI: Remove old scipy-wheels-nightly uploads to ensure space #23349

Merged

Conversation

matthewfeickert
Contributor

@matthewfeickert matthewfeickert commented Jun 26, 2022

PR Summary

Resolves the space problem in #22757 (comment)

Remove all but the last "${N_LATEST_UPLOADS}" package version uploads to the matplotlib scipy-wheels-nightly index to ensure space. To do this, rely on the output form of anaconda show to be able to filter on the item delimiter character sequence for each version currently uploaded.

As an explicit example (from today 2022-06-26):

$ anaconda show scipy-wheels-nightly/matplotlib
Using Anaconda API: https://api.anaconda.org
Name:    matplotlib
Summary:
Access:  public
Package Types:  pypi
Versions:
   + 3.6.0.dev2553+g3245d395d9
   + 3.6.0.dev2569+g3522217386
   + 3.6.0.dev2573+g3eadeacc06

To install this package with pypi run:
     pip install -i https://pypi.anaconda.org/scipy-wheels-nightly/simple matplotlib

shows that by filtering on +

$ anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+')
   + 3.6.0.dev2553+g3245d395d9
   + 3.6.0.dev2569+g3522217386
   + 3.6.0.dev2573+g3eadeacc06

and then stripping ' + '

$ anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+') | \
    sed 's/.* + //'
3.6.0.dev2553+g3245d395d9
3.6.0.dev2569+g3522217386
3.6.0.dev2573+g3eadeacc06

one can obtain a newline-separated list of all package uploads, where the most recent uploads are listed last. After stripping off the last "${N_LATEST_UPLOADS}" lines, which correspond to the package versions to keep for testing

$ anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+') | \
    sed 's/.* + //' | \
    head --lines "-${N_LATEST_UPLOADS}" > remove-package-versions.txt

the remaining (older) package uploads can be removed with anaconda remove.
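
The steps above can be exercised offline as a minimal sketch. It assumes GNU `head` (a negative `--lines` count is not POSIX) and replays the sample listing from this PR through a hypothetical `sample_show_output` function instead of calling `anaconda show`. `N_LATEST_UPLOADS=2` is chosen only so that the three-line sample leaves something to remove.

```shell
# Hypothetical stand-in for `anaconda show scipy-wheels-nightly/matplotlib`,
# replaying the sample output quoted above so no network access is needed.
sample_show_output() {
  cat <<'EOF'
Using Anaconda API: https://api.anaconda.org
Name:    matplotlib
Summary:
Access:  public
Package Types:  pypi
Versions:
   + 3.6.0.dev2553+g3245d395d9
   + 3.6.0.dev2569+g3522217386
   + 3.6.0.dev2573+g3eadeacc06

To install this package with pypi run:
     pip install -i https://pypi.anaconda.org/scipy-wheels-nightly/simple matplotlib
EOF
}

N_LATEST_UPLOADS=2  # illustrative value; the workflow uses 5

# Keep only version lines, strip the ' + ' marker, drop the newest N lines.
versions_to_remove="$(
  sample_show_output |
    grep '+' |
    sed 's/.* + //' |
    head --lines "-${N_LATEST_UPLOADS}"
)"

printf '%s\n' "${versions_to_remove}"
```

With the sample listing, only the oldest upload (`3.6.0.dev2553+g3245d395d9`) is selected for removal.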

PR Checklist

Tests and Styling

  • [N/A] Has pytest style unit tests (and pytest passes).
  • [N/A] Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • [N/A] New features are documented, with examples if plot related.
  • [N/A] New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • [N/A] API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • [N/A] Documentation is sphinx and numpydoc compliant (the docs should build without error).

- name: Remove old uploads to save space
  shell: bash
  run: |
    N_LATEST_UPLOADS=5
Contributor Author

This comes from @ogrisel's suggestion #22757 (comment) to

only keep the 5 most recent dev wheels for a given project and platform spec

but the number here is arbitrary.

Copy link
Member

Seems like a good number, I could see a case for going up to like 14 (2 weeks), but given that other projects are replacing their wheels nightly 5 seems pretty good!

@matthewfeickert
Copy link
Contributor Author

Tagging @QuLogic and @tacaswell for review given their involvement in Issue #22757. @ogrisel it would similarly be good to know if you see this as a sufficient first stab at a shared script (#22757 (comment)) and if you see any hurdles for other projects.

@matthewfeickert matthewfeickert force-pushed the ci/remove-old-uploads-to-save-space branch from 0a20b7f to 1b00fa1 Compare June 26, 2022 05:14
Comment on lines +73 to +76
# N.B.: `anaconda show` places the newest packages at the bottom of the output
# of the 'Versions' section and package versions are preceded with a ' + '.
anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+') | \
sed 's/.* + //' | \
Copy link
Contributor Author

@matthewfeickert matthewfeickert Jun 26, 2022

This is somewhat brittle as it relies on user facing output not changing, instead of consuming API output. Though this seems about as good/bad as trying to use

$ python -m pip index \
    --index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple \
    --pre \
    versions matplotlib | \
    grep 'Available versions'
WARNING: pip index is currently an experimental command. It may be removed/changed in a future release without prior warning.
Available versions: 3.6.0.dev2573+g3eadeacc06, 3.6.0.dev2569+g3522217386, 3.6.0.dev2553+g3245d395d9

to get the versions available. Also the current version of anaconda-client used is frozen given anaconda/anaconda-client#540.
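
For comparison, a minimal sketch of how the `pip index` line could be turned into the same oldest-first, newline-separated list. It works on a copy of the sample line above rather than a live call, assumes the `Available versions:` line keeps listing newest first as in that sample, and uses GNU `tac` to reverse the order.

```shell
# Sample line copied from the `pip index versions` output above (newest first).
pip_index_line='Available versions: 3.6.0.dev2573+g3eadeacc06, 3.6.0.dev2569+g3522217386, 3.6.0.dev2553+g3245d395d9'

# Drop the label, remove spaces, split on commas, and reverse so the oldest
# version comes first, matching the ordering of `anaconda show`.
oldest_first="$(
  printf '%s\n' "${pip_index_line}" |
    sed 's/^Available versions: //' |
    tr -d ' ' |
    tr ',' '\n' |
    tac
)"

printf '%s\n' "${oldest_first}"
```

This parsing depends on user-facing text in the same way as the `anaconda show` approach, so it is no less brittle.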

@timhoffm
Copy link
Member

timhoffm commented Jun 26, 2022

Do I interpret this correctly: If the anaconda output format changed, we would not get any packages and would simply not remove anything. There is no danger of removing everything in case of an output change.

That’s a reasonable defensive behavior.

@matthewfeickert
Copy link
Contributor Author

matthewfeickert commented Jun 26, 2022

Do I interpret this correctly: If the anaconda output format changed, we would not get any packages and would simply not remove anything. There is no danger of removing everything in case of an output change.

That’s a reasonable defensive behavior.

@timhoffm It is correct that if the Versions output format changed so that entries were no longer prefaced with ' + ', then nothing would be found by the grep and so nothing would be removed. However, if other information were added that is also prefaced with ' + ', then there would be problems. Though this shouldn't be scary, as until anaconda/anaconda-client#540 is solved and there are new versions of anaconda-client up on PyPI, anaconda-client is being pinned to a specific Git commit

# c.f. https://github.com/Anaconda-Platform/anaconda-client/issues/540
python -m pip install git+https://github.com/Anaconda-Server/anaconda-client@be1e14936a8e947da94d026c990715f0596d7043

, and even after it can be installed from PyPI again it would be pinned at a stable version and not left to float on install. So any new behavior in new versions can be manually checked before getting bumped.
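
One detail may be worth checking locally when reasoning about this fail-safe: the dev version strings themselves contain a '+' (for example `3.6.0.dev2553+g3245d395d9`), so a bare `grep '+'` could still match version lines even if the leading marker changed. The sketch below compares the loose pattern with a stricter one that requires the leading '+ ' marker. The `changed_format` listing and `strict_pattern` are hypothetical and not part of the merged workflow.

```shell
# Hypothetical future listing that marks versions with '*' instead of '+'.
changed_format='Versions:
   * 3.6.0.dev2553+g3245d395d9
   * 3.6.0.dev2569+g3522217386
   * 3.6.0.dev2573+g3eadeacc06'

# Current listing format, as quoted in this thread.
current_format='Versions:
   + 3.6.0.dev2553+g3245d395d9
   + 3.6.0.dev2569+g3522217386
   + 3.6.0.dev2573+g3eadeacc06'

# Stricter pattern: optional leading whitespace, a literal '+', then a space.
strict_pattern='^[[:space:]]*[+] '

# Count matching lines for each combination of pattern and listing.
loose_on_changed="$(printf '%s\n' "${changed_format}" | grep -c '+')"
strict_on_changed="$(printf '%s\n' "${changed_format}" | grep -c "${strict_pattern}")"
strict_on_current="$(printf '%s\n' "${current_format}" | grep -c "${strict_pattern}")"

echo "loose pattern on changed format:  ${loose_on_changed}"
echo "strict pattern on changed format: ${strict_on_changed}"
echo "strict pattern on current format: ${strict_on_current}"
```

In the loose case the `sed 's/.* + //'` step would leave such lines unstripped, so the resulting spec would not name a real release. How `anaconda remove` responds to that is not verified here.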

An idea I had this morning is that if @ogrisel was okay with the idea this could get setup as a GitHub Action, and so all of the scipy-wheels-nightly org repos could just hit the GHA with something like

    - name: Remove old uploads
      uses: scipy-wheels-nightly/remove-wheels@v1.2.3
      with:
        token: ${{ secrets.ANACONDA_ORG_UPLOAD_TOKEN }}
        package: matplotlib
        keep_latest: 5

so people wouldn't need to worry about the specifics of what is being checked for, as that's now internally abstracted away against a fixed version of anaconda-client.

# Remove all _but_ the last "${N_LATEST_UPLOADS}" package versions
# N.B.: `anaconda show` places the newest packages at the bottom of the output
# of the 'Versions' section and package versions are preceded with a ' + '.
anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+') | \
Copy link
Contributor Author

I'm using &> >(grep '+') here because anaconda show doesn't output to stdout.
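
A small sketch of why a plain pipe is not enough, assuming (as the comment implies) that the listing goes to stderr. `fake_show` is a hypothetical stand-in for the CLI. The `2>&1 |` form used here is a more portable way to merge the streams and should behave equivalently to `&> >(grep '+') |` for this purpose.

```shell
# Hypothetical stand-in for a CLI that writes its listing to stderr instead
# of stdout, which is how `anaconda show` is described as behaving here.
fake_show() {
  printf '   + 1.0.dev1\n   + 1.0.dev2\n' >&2
}

# A plain pipe only carries stdout, so grep sees nothing (stderr is discarded
# here just to keep the demo quiet).
plain_count="$(fake_show 2>/dev/null | grep -c '+')"

# Merging stderr into stdout first lets the filter see the listing.
merged_count="$(fake_show 2>&1 | grep -c '+')"

echo "plain pipe: ${plain_count} lines, merged streams: ${merged_count} lines"
```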

Remove all but the last "${N_LATEST_UPLOADS}" package version uploads
to the matplotlib scipy-wheels-nightly index to ensure space. To do this,
rely on the output form of `anaconda show` to be able to filter on the item
delimiter character sequence for each version currently uploaded.

As an explicit example:

```
$ anaconda show scipy-wheels-nightly/matplotlib
Using Anaconda API: https://api.anaconda.org
Name:    matplotlib
Summary:
Access:  public
Package Types:  pypi
Versions:
   + 3.6.0.dev2553+g3245d395d9
   + 3.6.0.dev2569+g3522217386
   + 3.6.0.dev2573+g3eadeacc06

To install this package with pypi run:
     pip install -i https://pypi.anaconda.org/scipy-wheels-nightly/simple matplotlib
```

shows that by filtering on '+' and then stripping ' + ' one can obtain a
newline separated list of all package uploads, where the _most recent_
uploads are listed last. After stripping off the "${N_LATEST_UPLOADS}" lines
that correspond to the package versions to keep for testing, the remaining (older)
package uploads can be removed with `anaconda remove`.

Resolves the space problem in https://github.com/matplotlib/matplotlib Issue 22757
@matthewfeickert matthewfeickert force-pushed the ci/remove-old-uploads-to-save-space branch from 1b00fa1 to 5c66fe9 Compare June 27, 2022 21:16
Comment on lines +81 to +83
anaconda --token ${{ secrets.ANACONDA_ORG_UPLOAD_TOKEN }} remove \
  --force \
  "scipy-wheels-nightly/matplotlib/${package_version}"
Copy link
Contributor Author

While @tacaswell is (AFAIK) the only person to have removed package versions before with this API (c.f. #22757 (comment)), this is the correct syntax to remove all wheels for a given dev release on the package index. From https://github.com/Anaconda-Platform/anaconda-client/blob/be1e14936a8e947da94d026c990715f0596d7043/binstar_client/commands/remove.py we can see that the package spec is written as <user>[/<package>[/<version>[/<filename>]]], and that for a version there is the prompt (when run sans --force) of Are you sure you want to remove the package release %s ? (and all files under it?).

Also you can just try this and note the same clarifying prompt

$ anaconda remove scipy-wheels-nightly/matplotlib/3.6.0.dev2553+g3245d395d9
Using Anaconda API: https://api.anaconda.org
Are you sure you want to remove the package release scipy-wheels-nightly/matplotlib/3.6.0.dev2553+g3245d395d9 ? (and all files under it?) [y|N]:

Copy link
Member

I have been removing things via mind-less clicking on the website (which is how I took out all of the uploads at one point).

Copy link
Member

I "tested" this by running the commands locally with the token (had to trim to 3 as we only had 4 releases up!) and it deleted files as expected.
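
The need to lower the count for that local test follows from how GNU `head` treats a negative line count: when there are fewer uploads than the number to keep, nothing is selected. A minimal sketch, where the four-version list is illustrative and not necessarily the exact set that was on the index at the time:

```shell
# Illustrative list of four uploaded versions, oldest first.
uploaded='3.6.0.dev2553+g3245d395d9
3.6.0.dev2569+g3522217386
3.6.0.dev2573+g3eadeacc06
3.6.0.dev2576+gd2f87e8ce3'

# Keeping the newest 5 of 4 uploads leaves nothing to remove.
remove_with_5="$(printf '%s\n' "${uploaded}" | head --lines "-5")"

# Keeping the newest 3 of 4 uploads selects only the oldest one.
remove_with_3="$(printf '%s\n' "${uploaded}" | head --lines "-3")"

echo "keep 5 -> remove: '${remove_with_5}'"
echo "keep 3 -> remove: '${remove_with_3}'"
```

In the empty case the `[ -s remove-package-versions.txt ]` guard in the workflow skips the removal loop.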

@tacaswell tacaswell added this to the v3.6.0 milestone Jun 28, 2022
@tacaswell
Copy link
Member

I verified by hand that the bash does what we think it does and am going to merge this on one review as this is not affecting our tests or the code we ship to users.

@tacaswell tacaswell merged commit d276edb into matplotlib:main Jun 28, 2022
@matthewfeickert matthewfeickert deleted the ci/remove-old-uploads-to-save-space branch June 28, 2022 21:20
@matthewfeickert
Copy link
Contributor Author

I verified by hand that the bash does what we think it does and am going to merge this on one review as this is not affecting our tests or the code we ship to users.

Thanks @tacaswell! We can check on 2022-06-30 if the first run of this goes as expected, but given that you've manually tested it, it seems like things are good to go. 👍

@matthewfeickert
Copy link
Contributor Author

matthewfeickert commented Jul 1, 2022

Hm, so now that it has run for the first time, apparently the only output that you get from

if [ -s remove-package-versions.txt ]; then
    while LANG=C IFS= read -r package_version ; do
        anaconda --token ${{ secrets.ANACONDA_ORG_UPLOAD_TOKEN }} remove \
          --force \
          "scipy-wheels-nightly/matplotlib/${package_version}"
    done <remove-package-versions.txt
fi

is

Using Anaconda API: https://api.anaconda.org/

where we can see from

$ anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+') | \
    sed 's/.* + //'
3.6.0.dev2573+g3eadeacc06
3.6.0.dev2576+gd2f87e8ce3
3.6.0.dev2591+gd276edb4c2
3.6.0.dev2608+g4f9ac38c8c
3.6.0.dev2618+g9fb42c9d5c

in comparison to the previous output that 3.6.0.dev2569+g3522217386 was removed.

Maybe I should have had that be

          if [ -s remove-package-versions.txt ]; then
              while LANG=C IFS= read -r package_version ; do
                  echo "# Removing scipy-wheels-nightly/matplotlib/${package_version}"
                  anaconda --token ${{ secrets.ANACONDA_ORG_UPLOAD_TOKEN }} remove \
                    --force \
                    "scipy-wheels-nightly/matplotlib/${package_version}"
              done <remove-package-versions.txt
          fi

so it would be more explicit. I guess that's something for either further maintenance or for the GitHub Action suggestion.
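
A dry-run sketch of the more explicit loop, for trying the logging locally without touching the index. It shadows the real CLI with a hypothetical `anaconda` shell function that only echoes its arguments, uses a placeholder `ANACONDA_TOKEN` variable in place of the GitHub secret, and writes an illustrative version list to a temporary file.

```shell
# Hypothetical stub that shadows the real `anaconda` CLI so nothing is
# deleted; it only echoes the arguments it would have received.
anaconda() {
  echo "DRY RUN: anaconda $*"
}

# Placeholder for the secret; in the workflow this comes from GitHub secrets.
ANACONDA_TOKEN="dummy-token"

# Illustrative removal list written to a temporary file.
versions_file="$(mktemp)"
printf '%s\n' '3.6.0.dev2553+g3245d395d9' '3.6.0.dev2569+g3522217386' \
  > "${versions_file}"

remove_listed_versions() {
  # Skip the loop entirely when the list is empty.
  if [ -s "${versions_file}" ]; then
    while IFS= read -r package_version; do
      echo "# Removing scipy-wheels-nightly/matplotlib/${package_version}"
      anaconda --token "${ANACONDA_TOKEN}" remove \
        --force \
        "scipy-wheels-nightly/matplotlib/${package_version}"
    done < "${versions_file}"
  fi
}

dry_run_output="$(remove_listed_versions)"
printf '%s\n' "${dry_run_output}"
rm -f "${versions_file}"
```

Each version produces one log line and one stubbed command line, so a run over the real list would show exactly which releases were targeted.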

@matthewfeickert matthewfeickert mentioned this pull request Aug 10, 2022
11 tasks
@ogrisel
Copy link

ogrisel commented Aug 10, 2022

LGTM!

@ogrisel
Copy link

ogrisel commented Aug 10, 2022

Also +1 for more explicit outputs as suggested above.


4 participants