-
Notifications
You must be signed in to change notification settings - Fork 757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Model Metadata API #1179
Conversation
Hello @jackyzha0, Thanks for updating this PR. There are currently no PEP 8 issues detected in this PR. Cheers! 🍻 Comment last updated at 2020-11-13 00:57:08 UTC |
still need to add coverage tests and web UI support, but would to hear your thoughts |
Codecov Report
@@ Coverage Diff @@
## master #1179 +/- ##
==========================================
+ Coverage 66.03% 66.08% +0.04%
==========================================
Files 135 141 +6
Lines 8553 9157 +604
==========================================
+ Hits 5648 6051 +403
- Misses 2905 3106 +201
Continue to review full report at Codecov.
|
@@ -104,7 +104,7 @@ def _load_from_directory(self, path): | |||
tokenizer = getattr( | |||
import_module("transformers"), self._tokenizer_type | |||
).from_pretrained(path) | |||
self._model = {"model": transformers_model, "tokenizer": tokenizer} | |||
return {"model": transformers_model, "tokenizer": tokenizer} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed this to actually just return model information after looking at how all the other frameworks currently set their self._model
property
# Protos | ||
gen-protos: ## Build protobufs for Python and Node | ||
@./protos/generate-docker.sh | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
bentoml/saved_bundle/config.py
Outdated
s = Struct() | ||
s.update(artifact_config["metadata"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add validation and type check for the metadata?
@jackyzha0 let's fix the test and we should be good to go |
I've got some other work I need to wrap up today but I'll try to find some time to wrap it up this week |
…nto metadata-api
@@ -204,6 +210,7 @@ def setup_routes(self): | |||
/healthz Health check ping | |||
/feedback Submitting feedback | |||
/metrics Prometheus metrics endpoint | |||
/metadata BentoService Artifact Metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also update the get_open_api_spec_json
method in the open_api.py
file to include this new endpoint?
""" | ||
Pack the in-memory trained model object to this BentoServiceArtifact | ||
|
||
Note: add "# pylint:disable=arguments-differ" to child class's pack method | ||
""" | ||
if metadata: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be cleaner to move this and similarly the code in load
below to the __getattribute__
method in this class. That way we can avoid modifying all the other artifact classes.
Everything else looks great! Thanks for the PR @jackyzha0! Sorry about the delay in getting to help with the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jackyzha0 - found a few small issues I didn't notice on the first pass, could you take a look at the comments above?
@@ -48,7 +56,7 @@ def name(self): | |||
""" | |||
return self._name | |||
|
|||
def pack(self, model): | |||
def pack(self, model, metadata=None): # pylint: disable=unused-argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a type annotation here? e.g. metadata: dict=None
@@ -125,6 +146,16 @@ def wrapped_load(*args, **kwargs): | |||
"`load` an artifact multiple times may lead to unexpected " | |||
"behaviors" | |||
) | |||
|
|||
# load metadata if exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do the load lazily in the metadata
property call?
@property
def metadata(self):
if not self._metadata:
# load
return self._metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would imagine that introducing two other properties to the class like _metadata_path
and _metadata_loaded
is adding a lot of extra complexity for just a simple YAML load, what would be the main benefit of lazy loading here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackyzha0 got it, that makes sense. I think we would just need to add a property _load_path
, and if self._metadata is None
can be used to check if it's loaded.
You are right there isn't much benefit besides avoiding loading this file. I would assume in most use cases, the metadata yaml file will be very small so it should no be a problem, this LGTM!
if isinstance(kwargs['metadata'], dict): | ||
self._metadata = kwargs['metadata'] | ||
else: | ||
logger.warning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for doing a warning here instead of raising a TypeError here? It seems to me it's better to raise an error explicitly, and help the user to discover this mistake early
Thanks for updating the PR, looking great! I will re-trigger the test run to reflect a fix just pushed to master, will merge once all tests are passing. |
Just added a fix for the new TypeError exception in the test |
* basic implementation of metadata saving * make consistent format across all frameworks and artifact types * update proto defs and include metadata in info command * formatting fixes * added tests and improved error messages * more complete tests for invalid artifact metadata * add /metadata endpoint * add basic api_server test * fix test for restoring metadata from saved bundle * update metadata load/save for frameworks * run lint+format, move to ruamel instead of regular yaml * remove comment * bring tf file up to date w master * fix naming and broken tf2 test * refactor load, save, pack back to service>artifacts>__init__ * whitespace fixes and more cleanup * update open_api spec * address review comments * fix raise exception test * remove unused vars
Description
.yml
file containing metadata for each artifactMotivation and Context
closes #881
How Has This Been Tested?
Types of changes
Component(s) if applicable
Checklist:
./dev/format.sh
and./dev/lint.sh
script have passed(instructions).