Best commit in replicate ls/ps using primary metric #44
Conversation
Force-pushed from 4e8734a to 8f5c031
Force-pushed from 8f5c031 to 658f0de
@bfirsh I assume from your comment in Slack that this is okay to merge? Happy to help with merge conflict resolution when your branch is ready to land.
Haha, no, I commented in Slack specifically so it wasn't confused as a code review. I'll take a look now.
Looks good generally! Comments inline.
```python
missing_keys = metric_keys - label_keys
if missing_keys:
    sys.stderr.write(
        "Warning: Missing metric{} in commit: {}".format(
```
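The quoted diff cuts off mid-call; for context, here is a self-contained sketch of the same warning (the plural handling and the format arguments are my assumptions, not necessarily the PR's actual code):

```python
import sys


def warn_missing_metrics(metric_keys, label_keys):
    # Metrics declared in the config that have no value among the commit's labels
    missing_keys = set(metric_keys) - set(label_keys)
    if missing_keys:
        sys.stderr.write(
            "Warning: Missing metric{} in commit: {}\n".format(
                "s" if len(missing_keys) > 1 else "",
                ", ".join(sorted(missing_keys)),
            )
        )
    return missing_keys
```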
Nice
```go
// pull out the saved config from the commits list
// TODO(andreas): this is a temporary hack, refactor once
// we've migrated to the new data format.
```
Would you mind elaborating on this? How do you imagine this working with the new data model?
I don't know :) it depends on the data model. But I imagine that there will be some sort of `Experiment` object you can retrieve by experiment ID. That `Experiment` has a `Config` inside it, since config is saved by `replicate.init()`.
List now seems to read the config from the latest commit. What is the reasoning behind this?
Surely when you update the primary metric in replicate.yaml, you want `replicate list` to change immediately? In the current implementation, it seems it won't change until you run another experiment.
```diff
@@ -66,6 +68,7 @@ def get_metadata(self) -> Dict[str, Any]:
             "params": self.params,
             "user": self.get_user(),
             "host": self.get_host(),
+            "config": self.config,
```
Is this intended as temporary, as hinted at in your TODO in the CLI, or do we want to keep this?
I think I like the idea of the point-in-time config being in the experiment (following our principle of just gathering as much information as possible), but it raises an interesting semantic issue: the storage URL is now in the experiment metadata. You can imagine moving storages around, and it's a bit weird that the old storage URL is then baked inside the experiment. An actual problem? Perhaps not. A bit messy? Yes.
It's a bit like a Git commit embedding the URL of the default remote. It just feels a bit conceptually weird.
Does the storage URL need to be in here? Maybe we can store everything except the storage URL. Or, maybe this is an early hint that the storage URL should be stored/configured elsewhere.
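A sketch of the "everything except the storage URL" idea, assuming the config is a plain dict and the URL lives under a hypothetical `storage` key:

```python
def config_for_metadata(config):
    # Drop the storage URL before baking the config into experiment
    # metadata, since storages can be moved around later
    return {key: value for key, value in config.items() if key != "storage"}
```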
It's an interesting question I've been going back and forth on. You can imagine scenarios in either direction.
Let's say you're working on a search model, and you use precision as the primary metric. Then you realize that the business cares more about recall, so you start tracking recall, and make that the primary metric. You didn't track recall before, but you still want to save your old experiments. In this scenario it's good that the config is saved along the experiment metadata, so that "best" means best precision for the old experiments and best recall for the new experiments.
The other scenario is the one we've discussed, where you start training a model without defining metrics in replicate.yaml, and tack on metrics later. In that scenario you want to use the local replicate.yaml.
I can't think of a way we can support both of these, so we probably have to choose one. I'm leaning towards the first option: save the config along with the experiment, so that it's frozen at the point of `replicate run`.
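Under that first option, "best" within an experiment is judged by the primary metric frozen into that experiment's own saved config. Roughly like this (the field names and the `goal` key are my assumptions, not the PR's schema):

```python
def best_commit(experiment):
    # Use the primary metric frozen in this experiment's config at run time,
    # not whatever the local replicate.yaml currently says
    metric = experiment["config"]["primary_metric"]
    goal = experiment["config"].get("goal", "maximize")
    pick = max if goal == "maximize" else min
    return pick(experiment["commits"], key=lambda commit: commit["labels"][metric])
```

So an old experiment frozen with `primary_metric: precision` keeps ranking by precision even after newer experiments switch to recall.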
Yeah, I think it makes sense to store the config alongside the experiment, as per our principle of gathering as much information as possible. You could argue this is duplicated, though: the replicate.yaml file is in the commit data. It being in the metadata could be thought of as an optimization.
I think the interesting problem is the storage URL being in there. There is a semantic mismatch of some kind going on.
Sorry about that, I misunderstood. I'll hold off on merging until I get an Approved.
No biggy, don't worry! At this stage it doesn't matter if we review before or after it gets into master, just want to make sure I get my eyes on it somehow. :)
More discussion, for future readers of this issue. https://replicatehq.slack.com/archives/CPRGK33J5/p1596491607005100
Looks something like
In order to support `--storage-url` and config metrics we need to fetch config from storage. This means that when you update metrics in your local replicate.yaml it won't affect `replicate ls`. I added a TODO to re-visit this.
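A sketch of the behavior that TODO describes (the helper and field names are assumptions): `replicate ls` reads the config saved with the most recent commit in storage, so editing the local replicate.yaml has no effect until a new experiment writes a fresh commit:

```python
def config_for_ls(commits):
    # Read the config saved with the latest commit in storage;
    # the local replicate.yaml is deliberately not consulted here
    if not commits:
        return {}
    return commits[-1]["config"]
```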