[question] Standard API endpoints? #845
Hey! I don't believe there is such a standard, which says a lot about the maturity of the field. But I may be wrong :) I'm sure it's not too hard to work out some specs. Is there an established format for writing down such a spec? Like an RFC?
I've never gone through creating a formal RFC, although I did create an RFC template for opencontainers! https://specs.opencontainers.org/image-spec/?v=v1.0.1. I think early work probably wouldn't need to be official - my thinking is I'll write up a spec.md doc alongside what I'm testing and see if anyone else is interested.
heyo! So I started a very basic spec, based on chantilly and then django-river-ml. I tried to keep it as simple as possible since it's a first shot 👉 https://vsoch.github.io/riverapi/getting_started/spec.html please provide any feedback, and point anyone in this direction who might be interested in helping or thinking more about it! Closing the issue since my question is answered and resolved.
Very cool! I take it this overlaps with tools like OpenAPI and Swagger. But those are generated once the implementation is done; they're not specs. The routes look good to me. One thing though: for me there should also be a
Oh indeed! Yes I can add that view to the django plugin - there are easy ways to do that.
So you are saying
So this is what I try to explain in my talks, but it's not an easy concept. Basically:
Under the hood, the features and the label can be joined to make the model learn. This is helpful because it avoids stuttering: the features are passed once in `/predict`, so they don't have to be sent again when the label arrives in `/label`. It is up to the system to decide what to do when a label arrives for an identifier with no stored prediction. Does that make more sense?
I can give it a shot! If you updated the example in your test.py for chantilly with this approach, what would that look like?
Ah well something like this I suppose:

```python
x = {...}
uuid = ...
requests.post('/predict', json={'features': x, 'id': uuid})
label = True
requests.post('/label', json={'id': uuid, 'label': label})
```
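To make the join concrete, here is a minimal sketch of the server side implied by those two client calls. Everything here is an assumption for illustration: the in-memory `cache`, the toy linear model in `weights`, and the function names are hypothetical, not part of chantilly or the riverapi spec.

```python
import uuid

# Hypothetical in-memory state; a real server would use a river model
# and persistent storage. This only sketches the /predict + /label join.
cache = {}    # identifier -> features awaiting their label
weights = {}  # toy linear model: feature name -> weight

def predict(features, identifier=None):
    # Generate an identifier when the client doesn't supply one.
    identifier = identifier or str(uuid.uuid4())
    # Keep the features so /label doesn't need to send them again.
    cache[identifier] = features
    score = sum(weights.get(k, 0.0) * v for k, v in features.items())
    return {'id': identifier, 'prediction': score}

def label(identifier, y, lr=0.1):
    # Join: the features arrived earlier via /predict.
    features = cache.pop(identifier)
    # Toy SGD update standing in for model.learn_one(x, y).
    pred = sum(weights.get(k, 0.0) * v for k, v in features.items())
    for k, v in features.items():
        weights[k] = weights.get(k, 0.0) + lr * (float(y) - pred) * v
```

The key point is that `label` needs only the identifier and the label; the features are recovered from the cache, which is what avoids the stuttering described above.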
Gotcha, so you would store one or more labels with a model name and identifier? And is a label != the ground truth provided in `/learn`?
Yes, you could store it like that. But once you consider the case of multiple models being updated in parallel, this storage scheme might not make much sense. And yes, label and ground truth are synonyms.
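For the parallel-models case, one option (an assumption for illustration, not something the spec prescribes) is to key the pending-label store by both model name and identifier, so predictions from different models never collide:

```python
# Hypothetical storage keyed by (model_name, identifier), so several
# models trained in parallel can each have a pending prediction for
# the same identifier without overwriting one another.
cache = {}

def store_prediction(model_name, identifier, features, prediction):
    cache[(model_name, identifier)] = (features, prediction)

def pop_prediction(model_name, identifier):
    # Remove and return the pending entry once its label arrives.
    return cache.pop((model_name, identifier))
```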
Agree, so just to clarify the use cases:
Where exactly does this identifier come from? I see it's optional for various endpoints, but it's not clear how it's generated. Shouldn't the server generate it (and return it somewhere) for the user, so the user could then do something like update a previous identifier? I also think that if ground truth == label, the API should use them consistently, choosing either ground truth or label (but not both). What do you think?
Indeed, when you have a ground truth, it usually means you made a prediction beforehand.
It depends. Ideally the user should provide this. But you could also generate one for each prediction as a convenience for the user.
Yes, I suppose so. I would go with ground truth, as label is usually only used for classification.
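The "generate one as a convenience" idea could look like the following. This is a sketch under assumptions: the helper name `resolve_identifier` and the payload shape are hypothetical, not part of the spec.

```python
import uuid

def resolve_identifier(payload):
    # Prefer the client-supplied identifier when present; otherwise
    # generate one server-side as a convenience, and return it in the
    # response so the client can refer back to this prediction later.
    return payload.get('id') or str(uuid.uuid4())
```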
Follow-up question about the label here: instead of trying to store it, can we not just use it to update the metrics from the previous prediction (and then delete the identifier from the cache, since we've labeled it and reflected the accuracy etc. in the model)? It looks like in the current implementation, when we get a ground truth for a label we:
So I'm inclined for label to do the same, and not actually save/cache it anywhere - it's basically the same as predict, minus doing the prediction, because we get it from the cache. Does that sound OK?
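That flow (look up the cached prediction, update the metric, let the model learn, then evict the identifier) could be sketched as follows. The names here are stand-ins, not chantilly's actual objects: the running-accuracy counters replace a river metric, and `learn_one` replaces the real model update.

```python
# Hypothetical state: /predict would have stored (features, prediction)
# under the identifier; a plain counter stands in for a river metric.
cache = {}
n, correct = 0, 0

def learn_one(x, y):
    pass  # stand-in for model.learn_one(x, y)

def handle_label(identifier, y_true):
    global n, correct
    # 1. Pop the cached entry: the identifier is deleted because its
    #    label has now been consumed, so nothing extra is stored.
    x, y_pred = cache.pop(identifier)
    # 2. Update the running metric using the earlier prediction.
    n += 1
    correct += int(y_pred == y_true)
    # 3. Let the model learn, exactly as /learn would, minus predicting.
    learn_one(x, y_true)
    return correct / n
```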
Yes, of course, you can do that. I'm only saying that doing the learning in the background might be desirable for performance reasons.
Hi! Is there any work in the online ML community to derive a standard set of API endpoints / interactions for a service (which implementations could then adopt and extend as needed)? An example in the containers community is the OCI distribution-spec: https://github.com/opencontainers/distribution-spec and I've made one for workflows too: https://github.com/panoptes-organization/monitor-schema/blob/main/spec.md.
I ask because if a bunch of us are making similar servers, it might make sense to work from the same or a similar design. Thank you!