kfserver add load/unload endpoint #1082
Conversation
Hi @wengyao04. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign @cliveseldon
@@ -0,0 +1,39 @@
# Copyright 2019 kubeflow.org.
nit: 2020
you can run the script hack/boilerplate.sh
model_type, model_full_path = get_kfmodel_type(name, self.models_dir)

model = KFModelFactory.create_model(name, self.models_dir, model_type)
model.set_full_model_path(model_full_path)
Can we call set_full_model_path in create_model?
Ditto, can lines 47 to 50 be aggregated into one line like
model = KFModelFactory.create_model(name, self.models_dir, model_type)
Also, should full_model_path be passed in __init__?
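For illustration, a minimal sketch of that aggregation — get_kfmodel_type, set_full_model_path, and create_model are from this PR's diff, while the MODEL_CLASSES mapping is hypothetical:

```python
class KFModelFactory:
    @staticmethod
    def create_model(name: str, models_dir: str, model_type: str):
        # Resolve the full path inside the factory so callers get a fully
        # initialized model from a single call.
        _, model_full_path = get_kfmodel_type(name, models_dir)
        # MODEL_CLASSES maps model_type -> KFModel subclass (hypothetical).
        model = MODEL_CLASSES[model_type](name, models_dir)
        model.set_full_model_path(model_full_path)
        return model
```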
path = os.path.join(model_dir, model_name + extension)
if os.path.exists(path):
    return model_type, path
return None, ""
suggest throwing an exception instead of returning None
agreed, def throw an exception
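A minimal sketch of the raise variant — the exception class and the EXTENSION_TO_TYPE mapping are assumptions, not the merged code:

```python
import os


class ModelNotFoundError(Exception):
    pass


def get_kfmodel_type(model_name: str, model_dir: str):
    # EXTENSION_TO_TYPE maps file suffixes (e.g. ".bst") to model types;
    # the mapping itself is hypothetical here.
    for extension, model_type in EXTENSION_TO_TYPE.items():
        path = os.path.join(model_dir, model_name + extension)
        if os.path.exists(path):
            return model_type, path
    # Raise instead of returning (None, "") so a missing model cannot be
    # silently ignored by the caller.
    raise ModelNotFoundError(
        f"no model file found for {model_name} under {model_dir}")
```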
async def post(self, name: str):
    try:
        loop = asyncio.get_running_loop()
        ready = loop.run_until_complete(self.models.load(name))
I thought we wanted to make this async and use await here.
I prefer to wait on the future until it is finished, to ensure the model is loaded successfully
@wengyao04 effectively await does that, and that's probably also how await is implemented under the hood. The main benefit of async is that it gives other handlers or tasks the chance to run while this one is waiting on IO. alibi uses that mainly because it is calling an async predict in the sync explainer handler.
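For reference, a minimal sketch of the await variant — handler and repository names follow this PR, but the body is illustrative, not the merged code:

```python
import tornado.web


class LoadHandler(tornado.web.RequestHandler):
    def initialize(self, models):  # models is the KFModelRepository
        self.models = models

    async def post(self, name: str):
        # await suspends only this handler while the model loads, so the
        # event loop can serve other requests; the coroutine still resumes
        # only after the load has actually completed.
        await self.models.load(name)
        self.write(f"succeed to load model {name}")
```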
except Exception as e:
    ex_type, ex_value, ex_traceback = sys.exc_info()
    raise tornado.web.HTTPError(
        status_code=503,
this sounds more like a 500 than a 503
    status_code=404,
    reason="Model with name %s does not exist." % name
)
self.write(f"succeed to unload model {name}")
suggest a JSON response with the model name instead of a plain string
    status_code=503,
    reason=f"Model with name {name} is not ready."
)
self.write(f"succeed to load model {name}")
suggest a JSON response with the model name instead of a plain string
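A sketch of that JSON variant — the field names are an assumption; Tornado serializes a dict passed to write() as JSON and sets the Content-Type to application/json:

```python
# in LoadHandler.post, instead of a bare string:
self.write({"name": name, "load": True})

# in UnloadHandler.post:
self.write({"name": name, "unload": True})
```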
python/sklearnserver/setup.py
Outdated
@@ -21,7 +21,7 @@
]
setup(
    name='sklearnserver',
    version='0.4.0',
    version='0.4.1',
do not change this for now
logging.error(f"fail to load model {args.model_name} from dir {args.model_dir}. "
              f"exception type {ex_type}, exception msg: {ex_value}")
model.ready = False
# if fail to load model, start kfserver with an empty model list
I think here we might want to check whether the model repository is empty, and only then start with an empty list
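A rough sketch of that check — has_models is a hypothetical helper, args and model come from the entrypoint shown above, and KFServer/KFModelRepository follow this PR's new signature:

```python
import os

import kfserving


def has_models(models_dir: str) -> bool:
    # Treat a missing or empty directory as "no models configured".
    return os.path.isdir(models_dir) and any(os.scandir(models_dir))


models = []
if has_models(args.model_dir):
    model.load()  # a genuine load failure now surfaces instead of being logged away
    models = [model]
# An empty repository legitimately starts the server with an empty model list.
kfserving.KFServer(registered_models=KFModelRepository(args.model_dir)).start(models)
```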
@wengyao04 Liked how you structured the model repo code, thanks for the contribution!
/ok-to-test
Failed at sdk-test
But I can pass ./test/scripts/sdk-test.sh locally, checking ...
def load(self):
def load_from_model_dir(self):
    model_path = kfserving.Storage.download(self.model_dir)
Do we still need to download the model? I think the model agent will handle the download from bcs/gcs, and the model server only needs to load from the local file system.
We still need to use the storage initializer for the single-model mode, so let's keep this logic for the single model inference service. The storage initializer downloads the files:
https://github.com/kubeflow/kfserving/blob/master/python/storage-initializer/scripts/initializer-entrypoint
load_from_model_dir will not download again; it does some sanity checks and creates a symlink.
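A sketch of how the two entry points could coexist — the _load_from helper and the __init__ body are illustrative, and full_model_path is assumed to be set via this PR's set_full_model_path:

```python
import kfserving


class SKLearnModel(kfserving.KFModel):
    def __init__(self, name: str, model_dir: str):
        super().__init__(name)
        self.model_dir = model_dir
        self.ready = False

    def load(self):
        # MMS path: the model agent has already pulled the artifact onto the
        # local file system, so read it from the stored full model path.
        self._load_from(self.full_model_path)

    def load_from_model_dir(self):
        # Single-model path: Storage.download handles gs://, s3://, etc.; for
        # an already-local directory it only sanity-checks and symlinks.
        model_path = kfserving.Storage.download(self.model_dir)
        self._load_from(model_path)

    def _load_from(self, model_path: str):
        ...  # framework-specific deserialization
        self.ready = True
```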
def __init__(self, name: str, model_dir: str, nthread: int, booster: \
        XGBModel = None):
def __init__(self, name: str, model_dir: str, nthread: int = DEFAULT_NTHREAD,
             booster: XGBModel = None):
Should we just remove the booster argument? We should always load the model, not init an XGBoostModel using an existing booster.
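If the booster argument were dropped, the model would always be reconstructed from the saved file; a sketch using xgboost's Booster(model_file=...) constructor — the artifact name and DEFAULT_NTHREAD value are assumptions:

```python
import os

import kfserving
import xgboost as xgb

DEFAULT_NTHREAD = 1            # default as in the diff; exact value assumed
BOOSTER_FILE = "model.bst"     # hypothetical artifact name


class XGBoostModel(kfserving.KFModel):
    def __init__(self, name: str, model_dir: str, nthread: int = DEFAULT_NTHREAD):
        super().__init__(name)
        self.model_dir = model_dir
        self.nthread = nthread

    def load(self):
        model_file = os.path.join(self.model_dir, BOOSTER_FILE)
        # Always load from the saved artifact rather than accepting a live
        # XGBModel instance at construction time.
        self._booster = xgb.Booster(params={"nthread": self.nthread},
                                    model_file=model_file)
        self.ready = True
```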
pkg/agent/storage/provider.go
Outdated
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
move this change to a separate PR?
(these header comment changes)
/retest
@wengyao04 I think we should leave the storage uri as the model repository path in the control plane. This change assumes the single model case, but it won't work for MMS: for example, triton always expects a model repository path, so we should keep the same behavior for sklearn and xgboost. When we start the sklearn/xgboost server, the --model-name is always passed in, so you can derive the model path for the single model case.
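A sketch of deriving the single-model path on the server side — flag names follow the existing servers where visible in this thread, otherwise they are assumptions:

```python
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--model_dir", required=True,
                    help="model repository path, as the control plane passes it")
parser.add_argument("--model_name", default="model")
args, _ = parser.parse_known_args()

# Keep the storage uri pointing at the repository; derive the model's own
# directory from --model_name for the single model case.
model_path = os.path.join(args.model_dir, args.model_name)
```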
@@ -162,14 +162,19 @@ func (mi *StorageInitializerInjector) InjectStorageInitializer(pod *v1.Pod) error
	storageInitializerImage = mi.config.Image
}

modelLocalMountPath := constants.DefaultModelLocalMountPath
if modelName, ok := pod.ObjectMeta.Labels[constants.InferenceServicePodLabelKey]; ok {
modelName may not always be the same as the service name; a custom container can pass modelName as an argument
securityContext := userContainer.SecurityContext.DeepCopy()
// Add an init container to run provisioning logic to the PodSpec
initContainer := &v1.Container{
	Name:  StorageInitializerContainerName,
	Image: storageInitializerImage,
	Args: []string{
		srcURI,
		constants.DefaultModelLocalMountPath,
		modelLocalMountPath,
this is likely a non-backwards-compatible change, as some users might depend on the old model path
class KFServer:
    def __init__(self, http_port: int = args.http_port,
    def __init__(self, registered_models: KFModelRepository = KFModelRepository(),
Can we move registered_models to the end of the argument list to ensure backwards compatibility?
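A sketch of the reordered signature — the other keyword arguments are taken from the existing KFServer where visible in this thread, otherwise assumed:

```python
class KFServer:
    def __init__(self, http_port: int = args.http_port,
                 grpc_port: int = args.grpc_port,
                 max_buffer_size: int = args.max_buffer_size,
                 workers: int = args.workers,
                 registered_models: KFModelRepository = KFModelRepository()):
        # Existing positional callers such as KFServer(8080) keep working
        # because the new argument now comes last.
        self.registered_models = registered_models
        ...
```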
/retest
@wengyao04 Thanks for the awesome contribution! /lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: yuzisun
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
What this PR does / why we need it:
Add load/unload endpoints for KFModel, which are used for Multi-Model Serving.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1043
Special notes for your reviewer:
The load/unload methods only work for a single process and a single thread; we have issue #1074 to investigate parallelism in the KFServing python model server.
This PR adds:
- class KFModelRepository with methods to load/unload models
- class KFModelFactory, which is used to create a KFModel based on the model suffix
- LoadHandler for endpoint v1/models/${MODEL_NAME}/load
- UnloadHandler for endpoint v1/models/${MODEL_NAME}/unload
Release note: