WIP: Support multiple feedstock directories #27

Draft: wants to merge 10 commits into main

@jbusecke jbusecke commented Dec 12, 2023

Absolute WIP to support a reorganization of https://github.com/leap-stc/data-management

This would enable the user to run the action over a set of feedstock subdirectories (each one marked by the presence of a meta.yaml file) and still have fine-grained control over which recipe is executed.

The current implementation naively loops over all subdirectories, but there may be a smarter way to quickly check all the meta.yaml files and see which feedstocks are actually selected by the labels.
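For reference, the discovery step described above (a feedstock is any directory that contains a meta.yaml) could be sketched roughly as follows. This is only an illustration and not the code in this PR: the function name `find_feedstock_dirs` is made up, and it does not attempt the label matching, which would additionally require parsing each meta.yaml.

```python
from pathlib import Path


def find_feedstock_dirs(root: str) -> list[str]:
    """Return sorted, root-relative paths of all directories that contain a meta.yaml.

    Hypothetical helper: the name and return format are assumptions for this sketch.
    """
    root_path = Path(root)
    return sorted(
        meta.parent.relative_to(root_path).as_posix()
        for meta in root_path.rglob("meta.yaml")
    )
```

Each returned path could then be handed to the runner as the feedstock subdirectory, one at a time.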

@jbusecke

Most importantly, to get this running I needed to remove this code block:

        # # if calling `pangeo-forge-runner` directly, `--feedstock-subdir` can be passed as a CLI arg.
        # # in the action context, users do not compose their own `pangeo-forge-runner` CLI calls, so if
        # # they want to use a non-default value for feedstock-subdir, it must be passed via the long-form
        # # name in the config JSON (i.e., `{"BaseCommand": {"feedstock_subdir": ...}}`).
        # feedstock_subdir = (
        #     config["BaseCommand"]["feedstock_subdir"]
        #     if "BaseCommand" in config and "feedstock_subdir" in config["BaseCommand"]
        #     else "feedstock"
        # )

But I would like to challenge this thinking: it assumes that there will only ever be one feedstock directory. How would the user provide multiple feedstocks here?

@jbusecke

Just adding some thoughts here:

  • I think we might also want to customize the config files for each feedstock (e.g. use Dataflow Prime for some recipes?). Perhaps in the future we would also like to use different target buckets. This could also be helpful in the CMIP context.
  • In that case, would parsing at the GitHub workflow level (spawning one job per feedstock) be better? That would quickly get complicated with the labels, though.

Just thinking out loud and writing down for later.
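One way the per-feedstock config idea could look is a shared base config with small per-feedstock overrides merged on top. The sketch below is purely illustrative: `merge_config` is a made-up helper, and the `TargetStorage` / `root_path` keys in the example are placeholders rather than a confirmed config schema.

```python
import copy


def merge_config(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`; values in `override` win.

    Hypothetical helper for illustration only.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, or new keys: the override replaces the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder keys, not a real schema: shared defaults plus one feedstock-specific bucket.
base_config = {"BaseCommand": {"feedstock_subdir": "feedstock"}, "TargetStorage": {"root_path": "bucket-a"}}
feedstock_override = {"TargetStorage": {"root_path": "bucket-b"}}
feedstock_config = merge_config(base_config, feedstock_override)
```

The base config is left untouched, so the same defaults can be reused for every feedstock in the loop.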

@jbusecke

jbusecke commented Mar 21, 2024

Just noticed that this will install dependencies from the root folder if a requirements.txt is present there. We might consider explicitly checking for that (no meta.yaml/requirements.txt should exist except in the 'most nested' folders).

This might not be a problem anymore with #30
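A rough sketch of what such a check could look like, assuming "most nested" means a directory with no meta.yaml or requirements.txt anywhere below it. The function name `misplaced_marker_files` and that interpretation are assumptions, not something implemented in this PR.

```python
from pathlib import Path

MARKER_FILES = ("meta.yaml", "requirements.txt")


def misplaced_marker_files(root: str) -> list[str]:
    """List marker files that sit in a directory with other marker files further below it.

    Hypothetical helper: an empty result means markers only exist in the most nested folders.
    """
    root_path = Path(root)
    # Every directory that holds at least one marker file.
    marker_dirs = {p.parent for name in MARKER_FILES for p in root_path.rglob(name)}
    misplaced = []
    for directory in marker_dirs:
        # A directory is "not most nested" if another marker directory lives beneath it.
        has_nested = any(directory in other.parents for other in marker_dirs)
        if has_nested:
            misplaced.extend(
                (directory / name).relative_to(root_path).as_posix()
                for name in MARKER_FILES
                if (directory / name).is_file()
            )
    return sorted(misplaced)
```

The action could fail early with a clear message whenever this list is non-empty.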

@jbusecke jbusecke closed this Mar 21, 2024
@jbusecke jbusecke reopened this Mar 21, 2024
@jbusecke

jbusecke commented Mar 22, 2024

Just noticed that when running from this branch without setting autodetect_feedstock_folders: true, a single-feedstock repo ends up iterating over the individual characters of the directory name, because feedstock_subdirs is a str, not a list. See here.

I'll fix that really quickly, but just wanted to make a note to myself here.
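For context, the failure mode is that looping over a str yields its characters, so "feedstock" becomes "f", "e", "e", and so on. A minimal sketch of one possible guard, assuming the value is either a single string or an iterable of strings (`as_subdir_list` is a made-up name, not the fix that landed on this branch):

```python
def as_subdir_list(feedstock_subdirs) -> list[str]:
    """Wrap a single directory name in a list; pass other iterables through as a list.

    Hypothetical helper for illustration only.
    """
    if isinstance(feedstock_subdirs, str):
        # A bare string would otherwise be iterated character by character.
        return [feedstock_subdirs]
    return list(feedstock_subdirs)


# Looping over the normalized value visits whole directory names in both cases.
for subdir in as_subdir_list("feedstock"):
    print(subdir)
```

Normalizing once, right after the config is read, keeps the rest of the loop unchanged.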
