Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid maximum limit when mlflow saves information #3651

Merged
merged 8 commits into from Jun 20, 2022

Conversation

nzw0301
Copy link
Member

@nzw0301 nzw0301 commented Jun 9, 2022

Motivation

Takeover #3530

Description of the changes

  • Split information sent to MLflow into chunks of 100 or 1000 (as in the original PR
  • Add test

@github-actions github-actions bot added the optuna.integration Related to the `optuna.integration` submodule. This is automatically labeled by github-actions. label Jun 9, 2022
@nzw0301 nzw0301 added the enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. label Jun 9, 2022
@xadrianzetx xadrianzetx self-requested a review June 9, 2022 17:38
@xadrianzetx xadrianzetx self-assigned this Jun 9, 2022
@xadrianzetx
Copy link
Collaborator

I've assigned myself as reviewer of #3530. Hope that's fine @nzw0301.

@nzw0301
Copy link
Member Author

nzw0301 commented Jun 9, 2022

@xadrianzetx Thanks. It's perfect!

@nzw0301 nzw0301 marked this pull request as ready for review June 9, 2022 18:54
@xadrianzetx
Copy link
Collaborator

Cool. Will check it out over the weekend.

@codecov-commenter
Copy link

Codecov Report

Merging #3651 (5be165c) into master (a58db66) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #3651      +/-   ##
==========================================
- Coverage   90.66%   90.65%   -0.01%     
==========================================
  Files         162      162              
  Lines       12585    12597      +12     
==========================================
+ Hits        11410    11420      +10     
- Misses       1175     1177       +2     
Impacted Files Coverage Δ
optuna/integration/mlflow.py 100.00% <100.00%> (ø)
optuna/integration/botorch.py 97.81% <0.00%> (-0.88%) ⬇️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@HideakiImamura
Copy link
Member

@contramundum53 Could you review this PR if you have time?

Copy link
Collaborator

@xadrianzetx xadrianzetx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks great, but I was wondering if we (or anyone else using mlflow APIs) should care about implementation details such as batch sizes or max tag length. This makes us tightly coupled and might result in additional maintenance cost. Imo, ideally all this should be handled by mlflow internally, and at quick glance it seems like there are some existing solutions.

Approach from this PR is fine for now, but it could be worth to reconsider later on. What do you think @nzw0301?

run_dict = run.to_dictionary()

# The `tags` contains param's distributions and other information too, such as trial number.
print(run_dict["data"]["tags"])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this line should be removed.

@nzw0301
Copy link
Member Author

nzw0301 commented Jun 12, 2022

Thank you for sharing your opinion and I agree with you; this might be able to be handled by mlflow's python API. So let me create a feature request issue on mlflow repo. If the issue sounds reasonable to them, I'll add a TODO comment to this PR (or as a follow-up).

Copy link
Collaborator

@xadrianzetx xadrianzetx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Seems like mlflow maintainers agree with your proposition, so I guess this one could be merged to unblock and removed later once changes are published mlflow side. LGTM!

Copy link
Member

@contramundum53 contramundum53 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@contramundum53 contramundum53 merged commit 49ca4df into optuna:master Jun 20, 2022
@contramundum53 contramundum53 added this to the v3.0.0-rc0 milestone Jun 20, 2022
@nzw0301 nzw0301 deleted the mlflow-chunk branch June 20, 2022 03:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. optuna.integration Related to the `optuna.integration` submodule. This is automatically labeled by github-actions.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants