Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling param limit exceeded error #69

Closed
crypdick opened this issue Sep 23, 2020 · 6 comments · Fixed by #140
Closed

Handling param limit exceeded error #69

crypdick opened this issue Sep 23, 2020 · 6 comments · Fixed by #140
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@crypdick
Copy link

I have a node which takes an S3 artifact URL as an input. This broke my pipeline because of an mlflow.exceptions.MlflowException: Param value 's3://very/long/string' had length 593, which exceeded length limit of 250.

IMO there should at least be the option to truncate long param strings.

@turn1a
Copy link
Contributor

turn1a commented Sep 23, 2020

@crypdick This is a limitation imposed by MLflow and I think we shouldn't change that behaviour.

@Galileo-Galilei
Copy link
Owner

Galileo-Galilei commented Sep 23, 2020

I slightly disagree with @kaemo here: I agree we should obviously align with mlflow and we will not support a specific trick to enable logging above this limit, but we still have to manage this situation in the plugin. Indeed, if someone still use a parameter above this limit I don't think the best solution is "do not use the plugin at all".

I can see 2 situations (in my personal experience) where too long parameters are used:

  • when you pass a dict which contains a lot of keys as a parameter (typically, all the hyperparameters of a ml model like xgboost). => This situation is already managed with the flatten_dict_params argument of MlflowNodeHook
  • when you pass a path to a file / an url to a node (your situation here), which can be very long. This situation is seen as a bad practice in Kedro, because it means you want to perform some I/O operations and they should be addressed in the DataCatalog (if you want to implement your own logic and no Kedro dataset suits you, they recommend to create your own dataset for clear separation with compute / better portability. However, I acknowledge that I sometimes encountered situations where I preferred doing this operation inside a node because the logic was very specific and not reusable elsewhere.

Potential solution

A possible solution I would support would be adding a long_parameters_strategy in the mlflow.yml, in the section:

hooks:
    node:
        long_parameters_strategy

This could take the following values:
- fail: raise an error in the pipeline like current situation.
- truncate: truncate the string by logging param[0:250]
- tag: log the string with set_tag instead of log_param if it is above the limit, as it seems to be the recommended way in mlflow: mlflow/mlflow#1976

And then:

  1. Add the same key long_parameters_strategy in the KedroMlflowConfig class:
    NODE_HOOK_OPTS = {"flatten_dict_params": False, "recursive": True, "sep": "."}

    and modify accordingly the methods calls
  2. modify the before_pipeline_run method of the MlflowNodeHook to replace this line and add the corresponding logical tests to address all possible situations.

mlflow.log_params(params_inputs)

Some points to have in mind:

  • a logger should put a warning in the log because it prevents reproducibility
  • the logical test on length must use MAX_PARAM_VAL_LENGTH (and not hardcoded 250 value to ensure we will adapt in the future.

I don't have time right now and it is not in my top priorities for the plugin, but I will definitely address this in the coming months (llikely by the end of november). If you are in hurry @crypdick, feel free to open a PR 😉. I am open to discussion about the best way to address this, so do not hesitate to suggest alternative possibilities.

@Galileo-Galilei Galileo-Galilei added the enhancement New feature or request label Sep 23, 2020
@crypdick
Copy link
Author

@Galileo-Galilei ty for the ideas! Indeed, I had to disable the MlflowNodeHook, which was a bummer.

In the interest of time, I'm going to port the full URIs into a YAMLDataSet, and keep just the pointers in the parameters.yml.

@turn1a
Copy link
Contributor

turn1a commented Sep 24, 2020

@crypdick @Galileo-Galilei I might have assesed this issue a bit too early. I agree with the possible solution you have outlined.

@Galileo-Galilei Galileo-Galilei self-assigned this Sep 29, 2020
@Galileo-Galilei Galileo-Galilei added this to the Release 0.4.0 milestone Sep 29, 2020
@Galileo-Galilei Galileo-Galilei moved this from To do to Planned for next release in Ongoing development Oct 12, 2020
@Galileo-Galilei Galileo-Galilei moved this from Planned for next release to To do in Ongoing development Oct 26, 2020
@Galileo-Galilei Galileo-Galilei moved this from To do to Planned for next release in Ongoing development Nov 4, 2020
@Galileo-Galilei Galileo-Galilei moved this from Planned for next release to To do in Ongoing development Nov 25, 2020
@Galileo-Galilei Galileo-Galilei moved this from To do to Planned for next release in Ongoing development Nov 28, 2020
@Galileo-Galilei
Copy link
Owner

Galileo-Galilei commented Nov 30, 2020

Hello @crypdick, I am struggling a little with this issue. I have implemented a PR for this, and while I am adding tests I can't reproduce the issue. Actually the following code works without any issue:

import mlflow
with mlflow.start_run():
    mlflow.log_param("fake_param", "this is a very long string"*1000)

This logs and render in the UI. I suspect your issue might be related to remote logging to a server. I created a server on my localhost, and it stil works (logs without error).

I have some additionals questions:

  • What version of mlflow do you have?
  • Can you provide a minimal reproducible example? Does above code works for you?
  • Can you check if it works locally (wih a mlruns folder) and remotely (on your S3 tracking)?

I'll try to make somme additional tests on my side.

@crypdick
Copy link
Author

crypdick commented Dec 5, 2020

@Galileo-Galilei I'm super sorry but I am under an avalanche of work right now and don't have any spare bandwidth

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging a pull request may close this issue.

3 participants