
Store run info in tags #926

Merged
merged 23 commits into from Mar 8, 2019

Conversation

@acroz
Contributor

acroz commented Feb 26, 2019

The plan is to store less run information as top-level attributes on runs and instead push it into tags. This will simplify the models and remove some inconsistencies in how data is currently structured.

This PR adds additional tags to the system-reserved mlflow. namespace and updates mlflow.tracking.fluent to write them on run creation. Specifically, the following tags have been added:

  • mlflow.source.type - currently stored as the source_type attribute of the run
  • mlflow.source.name - currently stored as the source_name attribute of the run
  • mlflow.source.git.commit - currently stored as the source_version attribute of the run
  • mlflow.project.entryPoint - currently stored as the entry_point_name attribute of the run

In addition, the following tags have been renamed to match the above pattern:

  • mlflow.source.git.branch - previously mlflow.gitBranchName
  • mlflow.source.git.repoURL - previously mlflow.gitRepoURL

The existing top-level attributes of runs continue to be set, as do the older versions of the renamed tags, to preserve existing functionality. A future PR (before the v1.0 release) is expected to remove these top-level attributes in favour of reading from tags, by which time users will hopefully have all or most of their data in tags for use by the UI and other tools.

This PR is dependent on PR #913, which implements tests of the existing mlflow.tracking.fluent.start_run functionality. See facultyai/mlflow@fluent-start-run-tests...facultyai:run-info-in-tags for a preview of the changes introduced by this PR in isolation.
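To make the mapping from attributes to tags concrete, here is a hypothetical sketch of the tags written at run creation. The key strings match the tag names listed above; the helper function itself is illustrative and is not MLflow's actual implementation:

```python
# Reserved tag keys from this PR (mirroring mlflow/utils/mlflow_tags.py).
MLFLOW_SOURCE_TYPE = "mlflow.source.type"
MLFLOW_SOURCE_NAME = "mlflow.source.name"
MLFLOW_GIT_COMMIT = "mlflow.source.git.commit"
MLFLOW_PROJECT_ENTRY_POINT = "mlflow.project.entryPoint"

def run_creation_tags(source_type, source_name, git_commit=None, entry_point=None):
    """Build the system tags a fluent start_run call would write.

    Optional values (commit, entry point) are only included when known,
    matching how the corresponding run attributes are optional today.
    """
    tags = {
        MLFLOW_SOURCE_TYPE: source_type,
        MLFLOW_SOURCE_NAME: source_name,
    }
    if git_commit is not None:
        tags[MLFLOW_GIT_COMMIT] = git_commit
    if entry_point is not None:
        tags[MLFLOW_PROJECT_ENTRY_POINT] = entry_point
    return tags
```

For example, `run_creation_tags("LOCAL", "train.py", git_commit="abc123")` yields a dict with three of the four keys, since no project entry point is involved for a local script run.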

@acroz

Contributor Author

acroz commented Feb 26, 2019

@aarondav Can we rerun the tests on this? AFAICT they've failed for an unrelated reason.

acroz added a commit to facultyai/mlflow that referenced this pull request Feb 27, 2019
Currently, tests are failing in CI because the latest version of tensorflow on
PyPI (1.13.1) is ahead of that on Anaconda (1.12.0).

This is because the Keras tests dump out a model which bakes in the version of
tensorflow installed as a conda dependency. However, as tensorflow is
installed from PyPI when testing, this results in an unsatisfiable dependency
being written out with the serialised model. When the Keras tests
re-read the generated model, we attempt to create a new conda
environment with tensorflow==1.13.1 as a conda dependency, which
crashes.

It should probably be reconsidered whether relevant test dependencies should
just be installed from conda in the future, or alternatively whether we
should create a default environment using pip dependencies rather than
conda dependencies in `mlflow.keras` (and `mlflow.tensorflow` and other
model types as appropriate).

For the moment, though, this version issue is causing tests to fail (see
at least PRs mlflow#926 and mlflow#930), so I suggest pinning the tensorflow version
as in this PR to facilitate other development.
@acroz acroz mentioned this pull request Feb 27, 2019
acroz added a commit to facultyai/mlflow that referenced this pull request Feb 27, 2019
dbczumar added a commit that referenced this pull request Feb 27, 2019
* Pin tensorflow to 1.12.0 in test-requirements.txt

* Experiment: See if pinning azureml fixes tests on CI

* Revert "Experiment: See if pinning azureml fixes tests on CI"

This reverts commit 0eb9188.

* Pin cryptography package
@acroz acroz force-pushed the facultyai:run-info-in-tags branch from 6b7358c to 31ae4fe Feb 28, 2019
@acroz acroz force-pushed the facultyai:run-info-in-tags branch from 31ae4fe to 5daed4d Feb 28, 2019
mlflow/utils/mlflow_tags.py
MLFLOW_RUN_NAME = "mlflow.runName"
MLFLOW_GIT_BRANCH_NAME = "mlflow.gitBranchName"
MLFLOW_GIT_REPO_URL = "mlflow.gitRepoURL"
MLFLOW_SOURCE_TYPE = "mlflow.source.type"

@aarondav

aarondav Mar 6, 2019

Contributor

Currently, source type is typed -- NOTEBOOK / JOB / PROJECT / LOCAL / UNKNOWN. With this change, we will lose the documentation on what "built-in" source types are, and what expected values look like. It seems like we should probably have a docs page where we define the default tags for MLflow, similar to our REST API docs.
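The built-in source types named here could be captured as an enumeration. This is a sketch for illustration only; MLflow's actual `SourceType` is defined in its protobuf messages, and the numeric values there are not reproduced here:

```python
from enum import Enum

class SourceType(Enum):
    """Built-in run source types; the tag value records where a run came from."""
    NOTEBOOK = "NOTEBOOK"
    JOB = "JOB"
    PROJECT = "PROJECT"
    LOCAL = "LOCAL"
    UNKNOWN = "UNKNOWN"

def source_type_tag_value(source_type):
    """String value to store under the mlflow.source.type tag."""
    return source_type.value
```

Keeping the set of expected values in one documented place is exactly the concern raised in this comment: once the attribute becomes a free-form tag, the enumeration only survives if it is written down somewhere, such as the REST API docs.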

@acroz

acroz Mar 7, 2019

Author Contributor

As discussed, I've added this to the REST API docs for now. I'm not sure of the meaning of some of the Databricks-specific tags, though - can you have a look and provide descriptions?

tests/store/test_sqlalchemy_store.py
acroz added 8 commits Mar 7, 2019
Currently, all other values in `mlflow.utils.mlflow_tags` are tag keys,
and there is nothing to indicate in that module that `MLFLOW_DOCKER` and
`MLFLOW_CONDA` are actually intended for use as tag values.

At some point, an enumeration of valid values for this tag could be
created, but since that seems outside the scope of this PR and these tag
values are only used in a single location, I've just baked them in place
for the time being.
Waiting for @aarondav to confirm the meaning of some of the
Databricks-specific tags before completing the second table.
docs/source/rest-api.rst
@@ -1029,6 +1029,53 @@ Tag for a run.
| value | ``STRING`` | The tag value. |
+------------+------------+----------------+

Tag keys that start with ``mlflow.`` are reserved for internal use. The following tags are set

@aarondav

aarondav Mar 8, 2019

Contributor

cc @stbof Here we are documenting MLflow system tags. We are currently planning on putting them in the REST API documentation as part of RunTag. Any thoughts on this location, or places where we should link to these? (Also any comments on the docs themselves?)

@stbof

stbof Mar 8, 2019

Contributor

@aarondav I think there should be a link from Set Tag to the list of disallowed tag keys.
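A client-side guard for the reserved namespace discussed here might look like the following. This is illustrative only; MLflow's actual tag validation lives in its own validation utilities, and the function name below is made up:

```python
RESERVED_TAG_PREFIX = "mlflow."

def check_tag_key(key):
    """Reject user-supplied tag keys in the reserved mlflow. namespace."""
    if key.startswith(RESERVED_TAG_PREFIX):
        raise ValueError(
            "Tag key %r is reserved for internal use (prefix 'mlflow.')" % key
        )
    return key
```

Linking the Set Tag docs to the list of reserved keys, as suggested above, would let users discover this restriction before hitting a validation error.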

@aarondav

Contributor

aarondav commented Mar 8, 2019

LGTM - let's just remove the databricks-specific tags for now, and we can merge. Any other docs changes can be done in a follow-up.

@aarondav

This comment has been minimized.

Copy link
Contributor

aarondav commented Mar 8, 2019

Btw, went ahead and removed the databricks-specific tags to kick off a build.

@aarondav aarondav merged commit b423804 into mlflow:master Mar 8, 2019
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
@acroz acroz deleted the facultyai:run-info-in-tags branch Mar 8, 2019
eedeleon added a commit to eedeleon/mlflow that referenced this pull request Mar 13, 2019
@sueann sueann mentioned this pull request Mar 28, 2019
sueann added a commit that referenced this pull request Mar 28, 2019
The current rest-api.rst file in the repo seems out of sync with the published docs (https://www.mlflow.org/docs/latest/rest-api.html), so trying to synchronize them. The file currently is generated from PB + some manually added information (see #926). We should try and make it consistent with the PB without having to manually edit it, but for now putting it in a workable state.