Improve KServe compatibility #190

Merged: 15 commits, May 18, 2023

@varunsh-xilinx (Member) commented May 17, 2023

Summary of Changes

  • Change STRING type to BYTES
  • Add versioning support
  • Change shape type from uint64 to int64

Closes #5

Motivation

Compatibility with KServe is a key feature of the inference server.

Implementation

Changing STRING to BYTES is a change in nomenclature only. The implementation could still use some follow-up changes, because the data is handled internally as char rather than std::byte.
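As a rough illustration of the char versus std::byte distinction, the sketch below copies character data into a byte buffer. The helper name `toBytes` is hypothetical and is not taken from this PR; it only shows what treating BYTES data as `std::byte` could look like.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of this PR): copy character data into a
// buffer of std::byte so that it is treated as raw bytes, not as text.
std::vector<std::byte> toBytes(std::string_view text) {
  std::vector<std::byte> bytes;
  bytes.reserve(text.size());
  for (char c : text) {
    // Go through unsigned char so negative char values map to 128..255.
    bytes.push_back(static_cast<std::byte>(static_cast<unsigned char>(c)));
  }
  return bytes;
}
```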

Versioning support is a more substantial feature. It is only supported for modelLoad when loading from a repository. The other caveat is that a model loaded with a version can only be accessed through the versioned APIs. The standard non-versioned APIs continue to exist. For versioned models, a version string is suffixed to the endpoint so that multiple versions of the model can be active at the same time. Non-versioned models omit the version suffix entirely. Internally, an empty version string indicates no version.
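The empty-string convention can be sketched as follows. Both helper names and the `_` separator in `endpointName` are assumptions made for illustration, since the PR text does not say how the suffix is formatted. The URL layout in `inferPath` follows the KServe v2 REST protocol, where the `/versions/<version>` segment is optional.

```cpp
#include <string>

// Hypothetical helper: derive an internal endpoint name. The "_" separator is
// an assumption; the PR only says a version string is suffixed.
std::string endpointName(const std::string& model, const std::string& version) {
  if (version.empty()) {
    return model;  // empty version string means "no version"
  }
  return model + "_" + version;
}

// Build a KServe v2 REST inference path; the version segment is only added
// for versioned models.
std::string inferPath(const std::string& model, const std::string& version) {
  std::string path = "/v2/models/" + model;
  if (!version.empty()) {
    path += "/versions/" + version;
  }
  return path + "/infer";
}
```

With this scheme, a model loaded without a version keeps its plain endpoint, while two versions of the same model map to distinct endpoints and can be active side by side.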

The KServe spec is inconsistent in how it defines the type of shape: the text says uint64, but the gRPC proto definition uses int64. Using int64 for shape allows the server to specify dynamic shape sizes for models that don't have fixed sizes (e.g. -1920 indicates that the dimension could be up to 1920).
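One possible reading of the signed-shape convention is sketched below: a non-negative declared dimension is fixed, and a negative one is dynamic with its magnitude as the upper bound. The function name `accepts` is hypothetical, and the exact validation rules are an assumption based only on the description above.

```cpp
#include <cstdint>

// Hypothetical check (not from this PR): does a concrete dimension satisfy
// the dimension declared in the model metadata?
//   declared >= 0 : fixed size, must match exactly
//   declared <  0 : dynamic size, any value from 0 up to -declared
bool accepts(std::int64_t declared, std::int64_t actual) {
  if (actual < 0) {
    return false;  // a concrete tensor cannot have a negative extent
  }
  if (declared >= 0) {
    return actual == declared;
  }
  // Compare as -actual >= declared to avoid negating declared, which would
  // overflow for INT64_MIN.
  return -actual >= declared;
}
```

For example, a declared dimension of -1920 would accept 1080 and 1920 but reject 1921, while a declared dimension of 1920 would accept only 1920.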

Notes

N/A

Signed-off-by: Varun Sharma <varun.sharma@amd.com>
@gbuildx (Collaborator) commented May 17, 2023

Build successful!

@gbuildx (Collaborator) commented May 18, 2023

Build successful!

@varunsh-xilinx varunsh-xilinx merged commit f1bafff into main May 18, 2023
@varunsh-xilinx varunsh-xilinx deleted the improve-kserve-compatibility branch May 18, 2023 16:12
Linked issue closed by this pull request: Improve compatibility with KServe API