
kfserver add load/unload endpoint #1082

Merged
merged 22 commits into kserve:master on Sep 16, 2020

Conversation

wengyao04 (Contributor) commented Sep 4, 2020

What this PR does / why we need it:
Add load/unload endpoints for KFModel, which are used for Multi-Model Serving.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1043

Special notes for your reviewer:

The load/unload methods only work for a single process and single thread; issue #1074 tracks investigating parallelism in the KFServing Python model server.

  • move the sklearn and xgboost models into the kfserving package
  • add class KFModelRepository with the following methods (see the sketch after this list):
    get_models()
    get_model()
    update()
    async load()
    unload()

  • add class KFModelFactory, which creates a KFModel based on the model file suffix
  • add LoadHandler for endpoint v1/models/${MODEL_NAME}/load
  • add UnloadHandler for endpoint v1/models/${MODEL_NAME}/unload
  • Test sklearn/xgboost (run docker locally)
    • single model case: does not break
    • multi model case:
      • able to load/unload model
      • able to list models
      • able to predict if the model exists, and returns the expected error when the model does not exist
  • Test pytorch (we don't support pytorch in the MMS POC):
    • single model case: does not break
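
A minimal sketch of the KFModelRepository interface described above (illustrative only: the models_dir default, the in-memory dict, and the method bodies are assumptions, not this PR's actual implementation):

from typing import Dict, List, Optional


class KFModel:
    # Stand-in for kfserving.KFModel with only the fields this sketch needs.
    def __init__(self, name: str):
        self.name = name
        self.ready = False


class KFModelRepository:
    def __init__(self, models_dir: str = "/mnt/models"):
        self.models_dir = models_dir
        self.models: Dict[str, KFModel] = {}

    def get_models(self) -> List[KFModel]:
        return list(self.models.values())

    def get_model(self, name: str) -> Optional[KFModel]:
        return self.models.get(name)

    def update(self, model: KFModel):
        self.models[model.name] = model

    async def load(self, name: str) -> bool:
        # The PR detects the model type from the file suffix and creates a
        # concrete KFModel via KFModelFactory; this sketch just marks it ready.
        model = KFModel(name)
        model.ready = True
        self.update(model)
        return model.ready

    def unload(self, name: str):
        if name not in self.models:
            raise KeyError(f"Model with name {name} does not exist.")
        del self.models[name]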

Release note:

Add load/unload endpoints for KFModel, used for Multi-Model Serving.

@kubeflow-bot

This change is Reviewable

@k8s-ci-robot (Contributor)

Hi @wengyao04. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

yuzliu (Contributor) commented Sep 4, 2020

/assign @cliveseldon

@@ -0,0 +1,39 @@
# Copyright 2019 kubeflow.org.
Member:

nit: 2020
you can run the script under hack/boilerplate.sh

model_type, model_full_path = get_kfmodel_type(name, self.models_dir)

model = KFModelFactory.create_model(name, self.models_dir, model_type)
model.set_full_model_path(model_full_path)
Member:

Can we call set_full_model_path in create_model?

yuzliu (Contributor) commented Sep 7, 2020:

Ditto, can lines 47 to 50 be aggregated into one line like

 model = KFModelFactory.create_model(name, self.models_dir, model_type)

Contributor:

Also should full_model_path be passed in init?

path = os.path.join(model_dir, model_name + extension)
if os.path.exists(path):
    return model_type, path
return None, ""
Member:

suggest throwing an exception instead of returning None

Contributor:

agreed, definitely throw an exception
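
A minimal sketch of the suggested change, raising instead of returning (None, "") (the helper name follows the snippet above; the suffix-to-type mapping is an assumption):

import os

# Assumed mapping of file suffix to model type, for illustration only.
SUPPORTED_SUFFIXES = {".joblib": "sklearn", ".bst": "xgboost"}


def get_kfmodel_type(model_name: str, model_dir: str):
    for extension, model_type in SUPPORTED_SUFFIXES.items():
        path = os.path.join(model_dir, model_name + extension)
        if os.path.exists(path):
            return model_type, path
    # Fail loudly so the caller cannot silently create a model of unknown type.
    raise Exception(f"Cannot determine model type for {model_name} under {model_dir}")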

async def post(self, name: str):
    try:
        loop = asyncio.get_running_loop()
        ready = loop.run_until_complete(self.models.load(name))
Member:

I thought we wanted to make this async and use await here.

Contributor (Author):

I prefer to wait on the future until it finishes, to ensure the model is loaded successfully

yuzisun (Member) commented Sep 5, 2020:

@wengyao04 effectively await does that, and that's probably also how await is implemented under the hood. The main benefit of async is that it gives other handlers or tasks a chance to run while this one is waiting on IO. Alibi uses it mainly because it calls an async predict inside the sync explainer handler.
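
A sketch of the await-based handler the reviewers suggest; the handler and repository names follow the snippets in this thread, while the 500 status code and JSON body reflect the other review comments and are assumptions:

import tornado.web


class LoadHandler(tornado.web.RequestHandler):
    def initialize(self, models):  # models is a KFModelRepository
        self.models = models

    async def post(self, name: str):
        try:
            # await still blocks this request until the load finishes, but it
            # lets other handlers and tasks run while the load waits on IO.
            await self.models.load(name)
        except Exception as e:
            raise tornado.web.HTTPError(
                status_code=500,
                reason=f"Fail to load model {name}: {e}"
            )
        self.write({"name": name, "load": True})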

except Exception as e:
    ex_type, ex_value, ex_traceback = sys.exc_info()
    raise tornado.web.HTTPError(
        status_code=503,
Member:

this sounds more like 500 instead of 503

    status_code=404,
    reason="Model with name %s does not exist." % name
)
self.write(f"succeed to unload model {name}")
Member:

suggest a JSON response with the model name instead of a string

    status_code=503,
    reason=f"Model with name {name} is not ready."
)
self.write(f"succeed to load model {name}")
Member:

suggest a JSON response with the model name instead of a string
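
A tiny sketch of the suggested JSON payloads (the helper and field names are hypothetical, not part of this PR):

import json


def success_payload(name: str, action: str) -> str:
    # e.g. success_payload("my-model", "unload") -> '{"name": "my-model", "unload": true}'
    return json.dumps({"name": name, action: True})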

@@ -21,7 +21,7 @@
]
setup(
name='sklearnserver',
version='0.4.0',
version='0.4.1',
Member:

do not change this for now

logging.error(f"fail to load model {args.model_name} from dir {args.model_dir}. "
              f"exception type {ex_type}, exception msg: {ex_value}")
model.ready = False
# if the model fails to load, start kfserver with an empty model list
Member:

I think here we might want to check if the model repository is empty and start with an empty list
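
A sketch of the startup logic being discussed, falling back to an empty model list when the single model fails to load (the helper name is hypothetical, and the reviewer's repository-emptiness check is not shown):

import logging
import sys


def models_to_register(model) -> list:
    # Return the list passed to kfserver at startup: the single model if it
    # loaded, otherwise an empty list so the server can still start (e.g. the
    # multi-model case, where models are loaded later via the load endpoint).
    try:
        model.load()  # expected to set model.ready = True on success
    except Exception:
        ex_type, ex_value, _ = sys.exc_info()
        logging.error(f"fail to load model {model.name}. "
                      f"exception type {ex_type}, exception msg: {ex_value}")
        model.ready = False
    return [model] if model.ready else []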

yuzisun (Member) commented Sep 4, 2020

@wengyao04 Liked how you structured the model repo code, thanks for the contribution!

yuzisun (Member) commented Sep 4, 2020

/ok-to-test

wengyao04 (Contributor, Author) commented:

Failed at sdk-test

tornado.httpclient.HTTPClientError: HTTP 503: Model with name model is not ready. Error type: <class 'AttributeError'> error msg: module 'asyncio' has no attribute 'get_running_loop'

But I can pass ./test/scripts/sdk-test.sh locally, checking ...


def load(self):
def load_from_model_dir(self):
model_path = kfserving.Storage.download(self.model_dir)
Contributor:

Do we still need to download the model? I think the model agent will handle the download from bcs/gcs, and the model server only needs to load from the local file system.

Contributor (Author):

We still need to use the storage initializer for the single-model mode, so let's keep this logic for the single-model inference service.
The storage initializer downloads the files:
https://github.com/kubeflow/kfserving/blob/master/python/storage-initializer/scripts/initializer-entrypoint

load_from_model_dir will not download again; it does some sanity checks and creates a symlink.

def __init__(self, name: str, model_dir: str, nthread: int, booster: \
XGBModel = None):
def __init__(self, name: str, model_dir: str, nthread: int = DEFAULT_NTHREAD,
booster: XGBModel = None):
Contributor:

Should we just remove the booster argument? We should always load the model, not initialize an XGBoostModel from an existing booster.
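
A sketch of that suggestion, dropping booster and always loading from disk (the class shape, default nthread, and model file name are assumptions):

import os

import xgboost as xgb

DEFAULT_NTHREAD = 1  # assumed default


class XGBoostModel:
    def __init__(self, name: str, model_dir: str, nthread: int = DEFAULT_NTHREAD):
        self.name = name
        self.model_dir = model_dir
        self.nthread = nthread
        self.ready = False

    def load(self):
        # Always construct the booster from the model file instead of
        # accepting a pre-built booster in __init__.
        model_file = os.path.join(self.model_dir, "model.bst")
        self._booster = xgb.Booster(params={"nthread": self.nthread},
                                    model_file=model_file)
        self.ready = True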

WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
Contributor:

move this change to a separate PR?

Contributor:

(these header comment changes)

yuzisun (Member) commented Sep 10, 2020

/retest

@k8s-ci-robot removed the lgtm label Sep 10, 2020

yuzisun (Member) left a comment:

@wengyao04 I think we should leave the storage uri as the model repository path in the control plane. This change assumes the single-model case, but it won't work for MMS; for example, triton always expects a model repository path, so we should keep the same behavior for sklearn and xgboost. When we start the sklearn/xgboost server, --model-name is always passed in, so you can derive the model path for the single-model case.
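
A small sketch of deriving the single-model path from the repository path and the model name, so the storage uri can stay the repository path (the helper name and file extension are assumptions):

import os


def single_model_path(model_repository: str, model_name: str,
                      extension: str = ".joblib") -> str:
    # In the single-model case the server is started with --model-name, so the
    # concrete model path can be derived from the repository path.
    return os.path.join(model_repository, model_name + extension)


print(single_model_path("/mnt/models", "my-sklearn-model"))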

pkg/apis/serving/v1beta1/predictor_sklearn.go (outdated, resolved)
pkg/apis/serving/v1alpha2/framework_triton.go (outdated, resolved)
@@ -162,14 +162,19 @@ func (mi *StorageInitializerInjector) InjectStorageInitializer(pod *v1.Pod) erro
storageInitializerImage = mi.config.Image
}

modelLocalMountPath := constants.DefaultModelLocalMountPath
if modelName, ok := pod.ObjectMeta.Labels[constants.InferenceServicePodLabelKey]; ok {
Member:

modelName may not always be the same as the service name; a custom container can pass modelName as an argument

securityContext := userContainer.SecurityContext.DeepCopy()
// Add an init container to run provisioning logic to the PodSpec
initContainer := &v1.Container{
Name: StorageInitializerContainerName,
Image: storageInitializerImage,
Args: []string{
srcURI,
constants.DefaultModelLocalMountPath,
modelLocalMountPath,
Member:

this is likely a non-backwards-compatible change, as some users might depend on the old model path

pkg/apis/serving/v1alpha2/framework_onnx.go (outdated, resolved)
pkg/apis/serving/v1alpha2/explainer_alibi.go (outdated, resolved)
class KFServer:
def __init__(self, http_port: int = args.http_port,
def __init__(self, registered_models: KFModelRepository = KFModelRepository(),
yuzisun (Member) commented Sep 15, 2020:

Can we move registered_models to the end of the argument list to ensure backwards compatibility?
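
A sketch of the suggested reordering, with registered_models last so existing positional callers keep working (the other parameters and defaults shown are assumptions):

class KFModelRepository:
    # Stand-in for the repository class; see the sketch earlier in this thread.
    pass


class KFServer:
    def __init__(self, http_port: int = 8080,    # assumed default
                 grpc_port: int = 8081,           # assumed default
                 registered_models: KFModelRepository = KFModelRepository()):
        self.http_port = http_port
        self.grpc_port = grpc_port
        self.registered_models = registered_models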

yuzisun (Member) commented Sep 16, 2020

/retest

yuzisun (Member) commented Sep 16, 2020

@wengyao04 Thanks for the awesome contribution!

/lgtm
/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit d883c34 into kserve:master Sep 16, 2020