Option for using auto-generated names for kedro runs instead of pipeline name #426

Closed
AlexandreOuellet opened this issue Jun 16, 2023 · 7 comments · Fixed by #481
Labels
enhancement New feature or request

Comments

@AlexandreOuellet

Description

When running the same kedro pipeline multiple times in parallel (doing experiments in parallel), I would like an option to generate a random name for each run, instead of always having __default__ or pipelinename.

Context

In my scenario, I have a lot of parameters to test, and I want to test them in parallel (one kedro run per parameter set). I don't really care about the experiment name, although I do want to be able to refer to the runs quickly. Instead of spending time naming my runs pipelinename_flipr, coming up with names and/or exposing the values in them, I would like the run name to be generated automatically (so that I can still refer to it in the future if needed).

Possible Implementation

Have a setting in mlflow.yml's tracking->run to specify whether to use a predefined name (as it currently does) or a randomly generated run name.

Possible Alternatives

Right now, I would need to modify mlflow.yml's tracking->run->name parameter before each run, once for every set of parameter values I want to test.

@Galileo-Galilei
Owner

Galileo-Galilei commented Jun 16, 2023

Hi, I agree such an option would be great! I thought that explicitly writing the following would do it:

server: 
    tracking:
        run:
            name: null

but it uses __default__ by default. I need to think about the best syntax, since I guess the current behaviour is the best default, but I agree kedro-mlflow should support your use case.

@Galileo-Galilei
Owner

Note for myself: I have found a way to make this possible and consistent with the interactive workflow (which is the hardest part): I'll create a custom resolver which will import the _generate_string from mlflow.
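
A minimal sketch of what such a resolver could look like, not the code that was merged. It assumes OmegaConf's `register_new_resolver` is how the resolver gets exposed to the configuration, and it uses MLflow's private `_generate_random_name` helper (a sibling of the `_generate_string` mentioned above in `mlflow.utils.name_utils`; being private, it may move or change between MLflow versions). The fallback generator and its word lists are hypothetical and only exist so the sketch runs without MLflow installed.

```python
import random

try:
    # Private MLflow helper (assumption: available in the installed MLflow version).
    from mlflow.utils.name_utils import _generate_random_name
except ImportError:
    # Hypothetical stand-in so the sketch still runs without MLflow.
    def _generate_random_name(sep="-"):
        adjectives = ["brave", "calm", "eager", "lucky"]
        nouns = ["finch", "otter", "panda", "whale"]
        parts = [random.choice(adjectives), random.choice(nouns), str(random.randint(0, 999))]
        return sep.join(parts)


def resolve_random_name() -> str:
    """Return a freshly generated, human-readable run name."""
    return _generate_random_name()


try:
    from omegaconf import OmegaConf
except ImportError:
    OmegaConf = None  # the resolver function is still usable on its own

if OmegaConf is not None:
    if not OmegaConf.has_resolver("km.random_name"):
        # use_cache=True keeps the value stable when the same node is read twice.
        OmegaConf.register_new_resolver(
            "km.random_name", resolve_random_name, use_cache=True
        )
    conf = OmegaConf.create({"tracking": {"run": {"name": "${km.random_name:}"}}})
    print(conf.tracking.run.name)
else:
    print(resolve_random_name())
```

Registering the resolver once and caching its result is one way to keep the name consistent within a single session, which seems to be the concern with the interactive workflow.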

I'll add this after 0.19 is officially out.

@MosaicMan

I would love to see this merged. Any way I can help?

Galileo-Galilei added a commit that referenced this issue Feb 9, 2024
…426)

* ✨ Add a km.random_name resolver to enable auto-generated names in configuration (#426)

* add template example, add tests, idempotency still fails

* add tests, idempotency still fails

* add doc, ignore idempotency test, remove unused argument

* add syntax with resolver in mlflow.yml

* fix typo in doc

* add changelog

* add nested key again in mlflow.yml

* fix typo in changelog
@Galileo-Galilei
Owner

Galileo-Galilei commented Feb 9, 2024

> I would love to see this merged. Any way I can help?

Sorry for the late reply. I have had little time recently, and there is a very subtle bug in how kedro handles resolvers that needs to be addressed on the core framework side. I am releasing this feature anyway, since the bug should be hardly noticeable and this is a commonly requested feature.

@MosaicMan

Thank you so much for all your great work on this package!

@Galileo-Galilei
Owner

Galileo-Galilei commented Feb 9, 2024

Thanks!
This is already on PyPI, you can try with kedro-mlflow==0.12.1

For the record, the syntax is:

#mlflow.yml
tracking: 
    run: 
        name: ${km.random_name:} # don't forget the trailing ":" at the end ! 
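
With that line in mlflow.yml, the parallel experiments from the original request no longer need a hand-written name per run. The sketch below only builds the command lines, one per parameter combination. It assumes the `--params=key=value,...` form of `kedro run` (Kedro 0.19 and later); the parameter names and values are hypothetical.

```python
import itertools
import shlex


def build_run_commands(param_grid: dict[str, list]) -> list[list[str]]:
    """Build one `kedro run` command per combination of parameter values."""
    keys = sorted(param_grid)
    commands = []
    for values in itertools.product(*(param_grid[k] for k in keys)):
        overrides = ",".join(f"{k}={v}" for k, v in zip(keys, values))
        commands.append(["kedro", "run", f"--params={overrides}"])
    return commands


# Hypothetical parameter names and values, for illustration only.
grid = {"learning_rate": [0.01, 0.1], "max_depth": [3, 5]}
commands = build_run_commands(grid)
for cmd in commands:
    print(shlex.join(cmd))

# To launch them in parallel from the project root, one option is:
#     import subprocess
#     procs = [subprocess.Popen(cmd) for cmd in commands]
#     exit_codes = [p.wait() for p in procs]
```

Each process loads the configuration on its own, so each run should resolve `${km.random_name:}` to a different name.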

@MosaicMan

It is a beautiful thing. Thank you!
