Best commit in replicate ls/ps using primary metric #44

andreasjansson · 2020-07-31T22:21:03Z

Looks something like

experiment  started             status   host      user     param-1  latest   step  label-1  best     step  label-1
1eeeeee     10 seconds ago      running  10.1.1.1  andreas  100      3cccccc  20    0.02     2cccccc  20    0.01
2eeeeee     about a second ago  stopped  10.1.1.2  andreas  200      4cccccc  5              N/A

To support --storage-url and metrics defined in config, we need to fetch the config from storage. This means that updating metrics in your local replicate.yaml won't affect replicate ls. I added a TODO to revisit this.
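
For readers without the diff at hand, here is a minimal sketch of how the "best" column could be chosen from an experiment's commits. It is not the code in this PR: the `best_commit` helper, the commit dictionary layout, and the `goal` values `"minimize"`/`"maximize"` are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


def best_commit(
    commits: List[Dict[str, Any]], metric_name: str, goal: str
) -> Optional[Dict[str, Any]]:
    """Return the commit with the best primary-metric value, or None."""
    # Only commits that actually recorded the metric can be compared.
    candidates = [c for c in commits if metric_name in c.get("labels", {})]
    if not candidates:
        return None  # would be shown as N/A in the listing

    def value(commit: Dict[str, Any]) -> float:
        return commit["labels"][metric_name]

    if goal == "minimize":
        return min(candidates, key=value)
    return max(candidates, key=value)


# Hypothetical commits for one experiment.
commits = [
    {"id": "2cccccc", "step": 10, "labels": {"label-1": 0.01}},
    {"id": "3cccccc", "step": 20, "labels": {"label-1": 0.02}},
]
print(best_commit(commits, "label-1", "minimize")["id"])  # prints 2cccccc
```

An experiment with no commits, or none that recorded the metric, yields `None`, which corresponds to the N/A row in the listing above.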

andreasjansson · 2020-08-03T09:08:00Z

@bfirsh I assume from your comment in Slack that this is okay to merge? Happy to help with merge conflict resolution when your branch is ready to land.

bfirsh · 2020-08-03T18:34:32Z

Haha, no, I commented in Slack specifically so it wasn't confused as a code review. I'll take a look now.

bfirsh

Looks good generally! Comments inline.

bfirsh · 2020-08-03T18:38:55Z

python/replicate/commit.py

+        missing_keys = metric_keys - label_keys
+        if missing_keys:
+            sys.stderr.write(
+                "Warning: Missing metric{} in commit: {}".format(


bfirsh · 2020-08-03T18:40:50Z

cli/pkg/list/list.go

+
+// pull out the saved config from the commits list
+// TODO(andreas): this is a temporary hack, refactor once
+// we've migrated to the new data format.


Would you mind elaborating on this? How do you imagine this working with the new data model?

I don't know :) it depends on the data model. But I imagine that there will be some sort of Experiment object you can retrieve by experiment ID. That Experiment has a Config inside it, since config is saved by replicate.init().

List now seems to read the config from the latest commit. What is the reasoning behind this?

Surely when you update the primary metric in replicate.yaml, you want replicate list to change immediately? In the current implementation, it seems it won't change until you run another experiment.

bfirsh · 2020-08-03T18:46:00Z

python/replicate/experiment.py

@@ -66,6 +68,7 @@ def get_metadata(self) -> Dict[str, Any]:
            "params": self.params,
            "user": self.get_user(),
            "host": self.get_host(),
+            "config": self.config,


Is this intended as temporary, as hinted at in your TODO in the CLI, or do we want to keep this?

I think I like the idea of the point-in-time config being in the experiment (following our principle of just gathering as much information as possible) but it raises an interesting semantic issue: the storage URL is now in the experiment metadata. You can imagine moving storages around, and it's a bit weird that the old storage URL is then baked inside the experiment. An actual problem? Perhaps not. A bit messy? Yes.

It's a bit like a Git commit embedding the URL of the default remote. It just feels a bit conceptually weird.

Does the storage URL need to be in here? Maybe we can store everything except the storage URL. Or, maybe this is an early hint that the storage URL should be stored/configured elsewhere.
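
As a rough sketch of the "everything except the storage URL" option, assuming the config is a plain dictionary and the storage URL lives under a `storage` key (both the key name and the `metadata_config` helper are hypothetical, not taken from this PR):

```python
from typing import Any, Dict


def metadata_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the config for experiment metadata, leaving out the storage URL."""
    # "storage" is an assumed key name for the storage URL.
    return {key: value for key, value in config.items() if key != "storage"}


# Hypothetical parsed replicate.yaml.
config = {
    "storage": "s3://example-bucket",
    "metrics": [{"name": "precision", "goal": "maximize", "primary": True}],
}
saved = metadata_config(config)
print(sorted(saved))  # prints ['metrics']
```

The original config is left untouched, so the storage URL can still be read from wherever it ends up being configured.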

It's an interesting question I've been going back and forth on. You can imagine scenarios in either direction.

Let's say you're working on a search model, and you use precision as the primary metric. Then you realize that the business cares more about recall, so you start tracking recall, and make that the primary metric. You didn't track recall before, but you still want to keep your old experiments. In this scenario it's good that the config is saved alongside the experiment metadata, so that "best" means best precision for the old experiments and best recall for the new experiments.

The other scenario is the one we've discussed, where you start training a model without defining metrics in replicate.yaml, and tack on metrics later. In that scenario you want to use the local replicate.yaml.

I can't think of a way we can support both of these, so we probably have to choose one. I'm leaning towards the first option, saving the config alongside the experiment, so that it's frozen at the point of replicate run.
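
To make the two scenarios concrete, here is a small sketch of how the choice changes which metric defines "best" for an old experiment. The `primary` flag, the dictionary layout, and both helper names are assumptions for illustration only.

```python
from typing import Any, Dict, Optional


def primary_metric(config: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the metric marked primary in a config, if any."""
    for metric in config.get("metrics", []):
        if metric.get("primary"):
            return metric
    return None


def metric_for_listing(
    experiment: Dict[str, Any], local_config: Dict[str, Any], frozen: bool
) -> Optional[Dict[str, Any]]:
    """Pick the config that defines "best" for one experiment."""
    # frozen=True: use the config saved with the experiment (first scenario).
    # frozen=False: use the current local replicate.yaml (second scenario).
    config = experiment.get("config", {}) if frozen else local_config
    return primary_metric(config)


# Hypothetical old experiment, recorded when precision was the primary metric.
old_experiment = {
    "id": "1eeeeee",
    "config": {
        "metrics": [{"name": "precision", "goal": "maximize", "primary": True}]
    },
}
# Hypothetical current local config, after switching to recall.
local_config = {
    "metrics": [{"name": "recall", "goal": "maximize", "primary": True}]
}

print(metric_for_listing(old_experiment, local_config, frozen=True)["name"])   # prints precision
print(metric_for_listing(old_experiment, local_config, frozen=False)["name"])  # prints recall
```

With the frozen option, an experiment started without any metrics in its config has no primary metric at all, which is exactly the downside described in the second scenario.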

Yeah, I think it makes sense to store the config alongside the experiment, as per our principle of gathering as much information as possible. You could argue this is duplicated, though: the replicate.yaml file is in the commit data. It being in the metadata could be thought of as an optimization.

I think the interesting problem is the storage URL being in there. There is a semantic mismatch of some kind going on.

andreasjansson · 2020-08-03T19:37:51Z

Haha, no, I commented in Slack specifically so it wasn't confused as a code review. I'll take a look now.

Sorry about that, I misunderstood. I'll hold off on merging until I get an Approved.

bfirsh · 2020-08-03T21:03:20Z

No biggy, don't worry! At this stage it doesn't matter if we review before or after it gets into master, just want to make sure I get my eyes on it somehow. :)

bfirsh · 2020-08-03T22:23:09Z

More discussion, for future readers of this issue: https://replicatehq.slack.com/archives/CPRGK33J5/p1596491607005100

andreasjansson requested a review from bfirsh July 31, 2020 22:21

andreasjansson force-pushed the andreas/best-commit-in-ls branch from 4e8734a to 8f5c031 Compare July 31, 2020 22:41

best commit in replicate ls/ps using primary metric

658f0de

andreasjansson force-pushed the andreas/best-commit-in-ls branch from 8f5c031 to 658f0de Compare August 3, 2020 08:15

andreasjansson merged commit e3ecea0 into master Aug 3, 2020

andreasjansson deleted the andreas/best-commit-in-ls branch August 3, 2020 09:15

bfirsh reviewed Aug 3, 2020
