Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move user to tag #1230

Merged
merged 21 commits into from May 13, 2019
Merged

Move user to tag #1230

merged 21 commits into from May 13, 2019

Conversation

acroz
Copy link
Contributor

@acroz acroz commented May 9, 2019

What changes are proposed in this pull request?

Following #1188 and further discussion with @aarondav, this PR migrates storage of the user who started a run to tags.

In particular:

  • The tag mlflow.user has been added and is set on creation of a run.
  • The user_id field on runs has been marked as deprecated and subject to removal in a future release, as was done previously for source_name, source_type etc.
  • user_id has been removed as an argument from the Python and R create_run methods (it was not exposed on the Java client createRun method).
  • Clients continue to pass the user_id attribute, which is read from tags on run creation, for the duration of the deprecation period.
  • Documentation has been updated to reflect that user_id is deprecated.

I've also changed the logic for detecting system username from pwd.getpwuid(os.getuid())[0] to the standard library helper getpass.getuser, which should support Windows systems in most cases.

How is this patch tested?

These changes affect a number of existing unit and integration tests, which have been updated to the change as appropriate.

Release Notes

  • The run field user_id has been deprecated in favour of the new tag mlflow.user.
  • user_id has been removed as an argument from the create_run methods of the Python and R clients.

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

I've selected 'no' as the Python fluent interface and Java client interfaces are unchanged, and it seems likely as users would not often be using the user_id argument to mlflow_start_run in the R client.

What component(s) does this PR affect?

  • UI
  • CLI
  • API
  • REST-API
  • Examples
  • Docs
  • Tracking
  • Projects
  • Artifacts
  • Models
  • Scoring
  • Serving
  • R
  • Java
  • Python

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

@@ -454,6 +454,8 @@ message RunInfo {
optional string experiment_id = 2;

// User who initiated the run.
// This field is deprecated, and will be removed in a future MLflow release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe include a since/as of MLflow 1.0 in this comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@aarondav
Copy link
Contributor

aarondav commented May 9, 2019

I think we may want to go ahead and make sqlalchemy_store and file_store auto-upgrade the user attribute into a tag, so that any runs created in MLflow 1.0 will not require migration once we drop the attribute.

This is similar what we did for RUN_NAME here: https://github.com/mlflow/mlflow/pull/1188/files#diff-73334c51ec814d7e8b40d09d7d828f82L365

@@ -581,6 +583,8 @@ message CreateRun {
optional string experiment_id = 1;

// ID of the user executing the run.
// This field is deprecated, and will be removed in a future MLflow release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jfyi - we are planning on adding a doc page specific to MLflow tags, as opposed to documenting within the REST API docs. I think someone may have accidentally deleted your previous docs, but those will be resurrected there.

@acroz
Copy link
Contributor Author

acroz commented May 9, 2019

I think we may want to go ahead and make sqlalchemy_store and file_store auto-upgrade the user attribute into a tag, so that any runs created in MLflow 1.0 will not require migration once we drop the attribute.

Do you mean to set it for new runs or to migrate existing runs?

In this PR, the user ID is being passed to the stores as both an attribute and a tag, so all runs created in MLflow 1.0 onwards would have both.

@aarondav
Copy link
Contributor

aarondav commented May 9, 2019

Hmm, yeah, guess I was thinking for old clients, but you're right. Seems unnecessary.

@akshaya-a
Copy link
Contributor

@akarloff @namikhai for awareness - i think we already started ignoring this in our server, thanks for the clarity!

@aarondav aarondav merged commit f23648f into mlflow:master May 13, 2019
@sueann sueann added the rn/breaking-change Mention under Breaking Changes in Changelogs. label May 30, 2019
avflor pushed a commit to avflor/mlflow that referenced this pull request Aug 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rn/breaking-change Mention under Breaking Changes in Changelogs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants