Compose parameters for DEBUG mode #605

Closed
noklam opened this issue Nov 12, 2020 · 5 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@noklam
Contributor

noklam commented Nov 12, 2020

Description

Because parameters live in a static config file, I cannot do dynamic things like:

if DEBUG:
  epochs=1
else:
  epochs=100

One way to do this is to use configuration environments and create a debug environment. However, environments only compose when the keys are new: if the same key exists in both, the debug environment overwrites the base value completely.

Context

Having a debug mode that runs quickly during development is a very common use case.

Possible Implementation

Instead of overwriting the key, the loader should merge the values, with the provided environment taking priority. This also avoids duplicating parameters; in ML projects especially, the number of parameters can be large. We should only have to specify the parameters that change.

What I wanted:

base

model:
   layer: 32
   epochs: 100

debug

model:
   epochs: 1

result

model:
  layer: 32 
  epochs: 1

Current behavior

base

model:
   layer: 32
   epochs: 100

debug

model:
   epochs: 1

result

model:
  epochs: 1
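
The behavior above matches what a top-level dictionary update would produce. As a minimal sketch, assuming the two YAML files have already been parsed into plain dicts (the variable names here are hypothetical, and this is not Kedro's actual loading code):

```python
# Hypothetical stand-ins for the parsed base and debug parameter files.
base = {"model": {"layer": 32, "epochs": 100}}
debug = {"model": {"epochs": 1}}

# A shallow merge only looks at top-level keys, so the whole "model"
# mapping from debug replaces the one from base and "layer" is lost.
shallow = {**base, **debug}
print(shallow)  # {'model': {'epochs': 1}}
```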

Possible Alternatives

A library like hydra provides the ability to compose two configuration files. We would just need to merge the two dictionaries instead of overwriting one with the other.
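
To illustrate what merging instead of overwriting could look like, here is a rough sketch of a recursive merge over plain dicts. The `deep_merge` helper is hypothetical; it is not part of Kedro or hydra, and it assumes both configs have already been parsed from YAML:

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where `override` wins, merging nested dicts key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: recurse so sibling keys survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, or mismatched types: the override replaces the base.
            merged[key] = deepcopy(value)
    return merged


base_params = {"model": {"layer": 32, "epochs": 100}}
debug_params = {"model": {"epochs": 1}}

result = deep_merge(base_params, debug_params)
print(result)  # {'model': {'layer': 32, 'epochs': 1}}
```

Lists are replaced rather than concatenated in this sketch; that is one possible design choice, not the only one.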

@noklam noklam added the Issue: Feature Request New feature or improvement to existing feature label Nov 12, 2020
@noklam noklam changed the title Compose parameters Compose parameters for DEBUG mode Nov 12, 2020
@noklam
Contributor Author

noklam commented Nov 12, 2020

It seems that overriding currently works only at the level of top-level keys. Is there a reason to restrict it to the top level?

https://github.com/quantumblacklabs/kedro/blob/25fd242faafd1740f37e5094cd2e51f2120a0313/kedro/config/config.py#L182

@lorenabalan
Contributor

lorenabalan commented Nov 19, 2020

Hi @noklam
I think you can achieve this currently using the TemplatedConfigLoader. Assuming you have a base config

model:
    layer: 32
    epochs: ${model_epochs}

Then you can dedicate globals.yml in each environment:

debug/globals.yml

model_epochs: 1

base/globals.yml

model_epochs: 100

You can find an example in the documentation.
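
The idea behind this workaround can be sketched in plain Python. This is only an illustration of per-environment placeholder substitution, not the TemplatedConfigLoader implementation; the `resolve` helper and the variable names are hypothetical, and the configs are assumed to be already parsed into dicts:

```python
import re
from typing import Any, Dict

# Matches a value that consists solely of one placeholder, e.g. "${model_epochs}".
PLACEHOLDER = re.compile(r"^\$\{(\w+)\}$")


def resolve(config: Any, global_values: Dict[str, Any]) -> Any:
    """Replace whole-value placeholders with entries from `global_values`."""
    if isinstance(config, dict):
        return {key: resolve(value, global_values) for key, value in config.items()}
    if isinstance(config, str):
        match = PLACEHOLDER.match(config)
        if match and match.group(1) in global_values:
            # Return the raw value so an integer stays an integer.
            return global_values[match.group(1)]
    return config


template = {"model": {"layer": 32, "epochs": "${model_epochs}"}}
base_globals = {"model_epochs": 100}   # stands in for base/globals.yml
debug_globals = {"model_epochs": 1}    # stands in for debug/globals.yml

base_config = resolve(template, base_globals)
debug_config = resolve(template, debug_globals)
print(base_config)   # {'model': {'layer': 32, 'epochs': 100}}
print(debug_config)  # {'model': {'layer': 32, 'epochs': 1}}
```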

@noklam
Contributor Author

noklam commented Nov 20, 2020

Thanks @lorenabalan, this would be a valid workaround.

I am not sure TemplatedConfigLoader is designed for this purpose; I feel it is designed for string templating. Being able to compose configuration directly would still be useful, as we now have 4 configuration files,
i.e.
local/parameters -> local/globals -> debug/parameters -> debug/globals. It becomes harder to look up a parameter, since it may be defined in any of these 4 files.

@lorenabalan
Contributor

lorenabalan commented Nov 20, 2020

Yeah I agree with you, I had made a note of this for a future team discussion. :)
There might've been an "explicit is better than implicit" motivation back when this was implemented, though.

@lorenabalan
Contributor

Closing this as resolved, with configuration environments to be looked at holistically for a future release.
