Improve string representations of entities #371

Merged
smurching merged 2 commits into mlflow:master on Aug 27, 2018

@smurching smurching commented Aug 25, 2018

Currently, MLflow entities use default Python string serialization, resulting in hard-to-interpret output like:

>>> from mlflow.tracking import get_service
>>> service = get_service()
>>> experiments = service.list_experiments()
>>> experiments
[<mlflow.entities.experiment.Experiment at 0x7f93a8902a90>]

This PR improves entities' string representations to include a listing of their fields (and recursive serialization of lists of entities, e.g. lists of metrics/params within a run). The above now results in:

>>> from mlflow.tracking import get_service
>>> service = get_service()
>>> experiments = service.list_experiments()
>>> experiments
[<mlflow.entities.experiment.Experiment: 'artifact_location'='/Users/sid/code/mlflow/mlruns/0', 'experiment_id'=0, 'name'='Default'>]

Runs now appear as:

>>> r = service.get_run(run_id)
>>> r
<mlflow.entities.run.Run: 'data'=<mlflow.entities.run_data.RunData: 'metrics'=[<mlflow.entities.metric.Metric: 'key'='random_int', 'timestamp'=1534987517, 'value'=75.0>, <mlflow.entities.metric.Metric: 'key'='foo', 'timestamp'=1534987517, 'value'=7.0>], 'params'=[<mlflow.entities.param.Param: 'key'='param1', 'value'='5'>], 'tags'=[]>, 'info'=<mlflow.entities.run_info.RunInfo: 'artifact_uri'='/Users/sid/code/mlflow/mlruns/0/145d84e1b29c4fe9a3b191f27e63a79f/artifacts', 'end_time'=1534987517756, 'entry_point_name'='', 'experiment_id'=0, 'name'='Run 6', 'run_uuid'='145d84e1b29c4fe9a3b191f27e63a79f', 'source_name'='example/advanced/test_remote_server.py', 'source_type'=4, 'source_version'='e375c52679ab74b3f6e864d6bcab0118125e31e4', 'start_time'=1534987517633, 'status'=3, 'user_id'='sid'>>

Feedback is welcome - run serialization is still somewhat unwieldy due to the number of fields in a run.
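
For readers who want to see the shape of the approach described above, here is a minimal sketch rather than the PR's actual code. The `Entity` base class, the `FIELDS` tuple, and the `render` function are hypothetical names introduced for illustration; only the output format (fully qualified class name, quoted keys in sorted order, recursion into lists) is taken from the examples in this description.

```python
import pprint


class Entity(object):
    """Hypothetical base class; subclasses list their field names in FIELDS."""
    FIELDS = ()

    def __repr__(self):
        return render(self)


def render(obj):
    """Render entities field by field, recursing into lists of entities."""
    if isinstance(obj, Entity):
        cls = type(obj)
        name = "%s.%s" % (cls.__module__, cls.__name__)
        # Keys are rendered like any other value (hence the quotes) and sorted.
        fields = ", ".join(
            "%s=%s" % (render(key), render(getattr(obj, key)))
            for key in sorted(cls.FIELDS))
        return "<%s: %s>" % (name, fields)
    if isinstance(obj, list):
        return "[%s]" % ", ".join(render(item) for item in obj)
    # Plain values fall back to pprint's formatting.
    return pprint.pformat(obj)


class Metric(Entity):
    """Illustrative entity with three fields."""
    FIELDS = ("key", "value", "timestamp")

    def __init__(self, key, value, timestamp):
        self.key = key
        self.value = value
        self.timestamp = timestamp


class RunData(Entity):
    """Illustrative entity holding a list of other entities."""
    FIELDS = ("metrics",)

    def __init__(self, metrics):
        self.metrics = metrics


metric = Metric("foo", 7.0, 1534987517)
print(RunData([metric]))
```

Because `__repr__` delegates to `render`, the same text appears whether an entity is printed directly or shown inside a Python list at the REPL.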

@mparkhe mparkhe left a comment

Very cool implementation. Minor comment about property ordering in output.

        return self.printer.pformat(obj)

    def serialize_entity(self, entity):
        return ", ".join(["%s=%s" % (self.serialize(key), self.serialize(value))

Can we iterate over the fields in the order specified by each entity's _properties() method?

Here's why, with an example. Experiment would read more intuitively this way:

[<mlflow.entities.experiment.Experiment: 'experiment_id'=0, 'name'='Default', 'artifact_location'='/Users/sid/code/mlflow/mlruns/0'>]

That reads better than the current hash-sorted order.
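
A small sketch of the difference being requested. `DECLARED_ORDER` stands in for whatever an entity's `_properties()` would return and is an assumption made for illustration; alphabetical sorting is used for the "before" case only because it reproduces the field order shown in the PR description.

```python
# Hypothetical field order, as an entity's _properties() might declare it.
DECLARED_ORDER = ["experiment_id", "name", "artifact_location"]

values = {
    "artifact_location": "/Users/sid/code/mlflow/mlruns/0",
    "experiment_id": 0,
    "name": "Default",
}


def join_fields(keys, values):
    """Join key=value pairs in exactly the order given by keys."""
    return ", ".join("%r=%r" % (key, values[key]) for key in keys)


# Sorted keys: artifact_location comes first, ahead of the identifying fields.
alphabetical = join_fields(sorted(values), values)

# Declared order: the identifier and name lead, the location trails.
declared = join_fields(DECLARED_ORDER, values)

print(alphabetical)
print(declared)
```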

+1, that would be nice. Also, don't put quotes around the key: just change self.serialize(key) to key.

        super(_MLflowObjectPrinter, self).__init__()
        self.printer = pprint.PrettyPrinter()

    def serialize(self, obj):

👍 very neat and tight implementation!

        return serialize(self)


def serialize(obj):

Just as a note, I think "serialize" can mean converting to formats other than string representation, so I would call this to_string or pretty_print or something.


    def serialize(self, obj):
        if isinstance(obj, _MLflowObject):
            return "<%s: %s>" % (get_classname(obj), self.serialize_entity(obj))

I'd include just the last part of the class name: Experiment instead of mlflow.entities.experiment.Experiment.

* rename serialize -> to_string
* order fields based on self._properties
* don't use quotes in keys
* show final segment of classname
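
Taken together, the four changes listed above might look roughly like the following. This is a sketch under assumptions, not the merged code: the `Entity` base class and the `Experiment` subclass here are simplified stand-ins, and only the names `to_string` and `_properties` come from the review discussion.

```python
import pprint


class Entity(object):
    """Simplified stand-in for the entity base class (an assumption)."""

    @classmethod
    def _properties(cls):
        # Subclasses return their field names in the order they should print.
        raise NotImplementedError

    def __repr__(self):
        return to_string(self)


def to_string(obj):
    """Build a readable string for entities, lists of entities, and plain values."""
    if isinstance(obj, Entity):
        # Unquoted keys, iterated in the order given by _properties().
        fields = ", ".join(
            "%s=%s" % (key, to_string(getattr(obj, key)))
            for key in obj._properties())
        # Only the final segment of the class name is shown.
        return "<%s: %s>" % (type(obj).__name__, fields)
    if isinstance(obj, list):
        return "[%s]" % ", ".join(to_string(item) for item in obj)
    return pprint.pformat(obj)


class Experiment(Entity):
    """Illustrative entity mirroring the fields shown in the PR description."""

    def __init__(self, experiment_id, name, artifact_location):
        self.experiment_id = experiment_id
        self.name = name
        self.artifact_location = artifact_location

    @classmethod
    def _properties(cls):
        return ["experiment_id", "name", "artifact_location"]


exp = Experiment(0, "Default", "/Users/sid/code/mlflow/mlruns/0")
print([exp])
```

Values still go through `pprint.pformat`, so strings keep their quotes while keys do not, which makes it easy to tell a field name from a string value at a glance.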
@mateiz mateiz added the LGTM label Aug 26, 2018
@smurching

Thanks @mateiz @mparkhe, merging to master!

@smurching smurching merged commit 33060e4 into mlflow:master Aug 27, 2018