Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Set environment cache key based on current week of year to avoid outdated cache in the "Setup Micromamba" step #3234

Merged
merged 8 commits into from
May 12, 2024

Conversation

seisman
Copy link
Member

@seisman seisman commented May 7, 2024

Take the latest run of the "Tests" workflow on the main branch as an example. The "Setup Micromamba" step in the "Ubuntu + Python 3.10" job uses the cache with key micromamba-environment--linux-64-pygmt-args-4fc7725-root-dcc80ee, which was cached last month as shown here (https://github.com/GenericMappingTools/pygmt/actions/caches).

image

The setup-micromamba action says (https://github.com/mamba-org/setup-micromamba#caching):

Note that the specified cache key is only the prefix of the real cache key. setup-micromamba will append a hash of the environment file and the custom-args as well as the environment name and OS to the cache key.

And GitHub says (https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy):

GitHub will remove any cache entries that have not been accessed in over 7 days.

So, the cached environment won't be refreshed unless we make changes to the packages listed in the create-args argument. We're using outdated packages in our workflows, which explains why sometimes we see old packages installed in CI.

The solution is to define the cache key based on the current date.

We can define the cache key like micromamba-environment-2024-05-07, which means the cache will expire on the next day so the first CI run on the next day will be slower since the cache is missing. I feel one day is a little too short for PyGMT, so in this PR, the cache key is defined like micromamba-environment-2024-W19, in which 2024-W19 means the 19th week of 2024. The cache will persist for one week.

Patches #2946.

@seisman seisman marked this pull request as ready for review May 7, 2024 06:11
@seisman seisman changed the title WIP: CI: Avoid using outdated cache in the "Setup Micromamba" step CI: Set environment cache key based on current week of year to avoid outdated cache in the "Setup Micromamba" step May 7, 2024
@seisman
Copy link
Member Author

seisman commented May 7, 2024

With the latest packages installed, now we can also see the Fiona errors reported in #3180 (comment)

______________ test_geopandas_info_shapely[multipolygon-desired0] ______________

geojson = <MULTIPOLYGON (((0 0, 20 0, 10 20, 0 0), (3 2, 10 16, 17 2, 3 2)), ((6 4, 14...>

    @contextmanager
    def tempfile_from_geojson(geojson):
        """
        Saves any geo-like Python object which implements ``__geo_interface__`` (e.g. a
        geopandas.GeoDataFrame or shapely.geometry) to a temporary OGR_GMT text file.
    
        Parameters
        ----------
        geojson : geopandas.GeoDataFrame
            A geopandas GeoDataFrame, or any geo-like Python object which
            implements __geo_interface__, i.e. a GeoJSON.
    
        Yields
        ------
        tmpfilename : str
            A temporary OGR_GMT format file holding the geographical data.
            E.g. '1a2b3c4d5e6.gmt'.
        """
        with GMTTempFile(suffix=".gmt") as tmpfile:
            import geopandas as gpd
    
            Path(tmpfile.name).unlink()  # Ensure file is deleted first
            ogrgmt_kwargs = {"filename": tmpfile.name, "driver": "OGR_GMT", "mode": "w"}
            try:
                # OGR_GMT only supports 32-bit integers. We need to map int/int64
                # types to int32/float types depending on if the column has an
                # 32-bit integer overflow issue. Related issues:
                # https://github.com/geopandas/geopandas/issues/967#issuecomment-84287[770](https://github.com/GenericMappingTools/pygmt/actions/runs/8981820061/job/24669172952?pr=3234#step:8:771)4
                # https://github.com/GenericMappingTools/pygmt/issues/2497
>               if geojson.index.name is None:
E               AttributeError: 'MultiPolygon' object has no attribute 'index'

../pygmt/helpers/tempfile.py:142: AttributeError

During handling of the above exception, another exception occurred:

>   ???

fiona/ogrext.pyx:136: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   fiona._err.CPLE_OpenFailedError: <fiona.io.MemoryFile object at 0x7f843fe3d410>: No such file or director

@seisman
Copy link
Member Author

seisman commented May 7, 2024

Ping @GenericMappingTools/pygmt-maintainers for comments before I apply the same changes to other workflows.

@seisman seisman added maintenance Boring but important stuff for the core devs needs review This PR has higher priority and needs review. skip-changelog Skip adding Pull Request to changelog labels May 7, 2024
@seisman seisman added this to the 0.13.0 milestone May 7, 2024
@@ -58,8 +58,6 @@ jobs:
channels:
- conda-forge
- nodefaults
cache-downloads: false
cache-environment: true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This workflow is only scheduled to run weekly. We don't care about the speed of the workflow, so no need to cache the enviroment.

@seisman seisman added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels May 10, 2024
@seisman seisman requested a review from a team May 10, 2024 05:09
.github/workflows/ci_docs.yml Outdated Show resolved Hide resolved
.github/workflows/ci_tests.yaml Outdated Show resolved Hide resolved
.github/workflows/ci_tests_dev.yaml Outdated Show resolved Hide resolved
seisman and others added 2 commits May 10, 2024 18:02
Co-authored-by: Yvonne Fröhlich <94163266+yvonnefroehlich@users.noreply.github.com>
@seisman seisman merged commit 5d1a8b4 into main May 12, 2024
14 of 18 checks passed
@seisman seisman deleted the ci/mamba-cache branch May 12, 2024 11:13
@seisman seisman removed the final review call This PR requires final review and approval from a second reviewer label May 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Boring but important stuff for the core devs skip-changelog Skip adding Pull Request to changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants