Roadmap changes (kubeflow#554)
* Updated roadmap for 0.3 and 0.4

* more changes

* blah

* merged

* typo
ellistarn authored and k8s-ci-robot committed Nov 20, 2019
1 parent 9d89afb commit 42a6abd
101 changes: 58 additions & 43 deletions ROADMAP.md
# KF Serving Roadmap
## 2019 - 2020

### v0.4 Performance (ETA: Jan 31, 2020)
Objective: "Prevent performance regressions across a known set of representative models."
* Automated Performance Tests
* Define a set of models to test, covering a wide array of use cases and frameworks.
* Publish performance results over time to enable regression tracking.
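The regression-tracking idea above can be sketched as a small harness that records latency percentiles per run; the harness, model, and record shape are illustrative, not KFServing code:

```python
import json
import statistics
import time

def benchmark(predict, payloads, warmup=10, runs=100):
    """Measure per-request latency (ms) for a model's predict function."""
    for p in payloads[:warmup]:
        predict(p)  # warm up caches before timing
    latencies = []
    for i in range(runs):
        p = payloads[i % len(payloads)]
        start = time.perf_counter()
        predict(p)
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    return {
        "p50_ms": statistics.median(latencies),
        "p99_ms": latencies[int(0.99 * (len(latencies) - 1))],
    }

# Trivial stand-in model; a real suite would cover each supported framework.
def dummy_predict(instances):
    return [sum(x) for x in instances]

result = benchmark(dummy_predict, [[[1.0, 2.0], [3.0, 4.0]]])
# Appending timestamped records to a log is what enables tracking over time.
record = {"model": "dummy", "ts": time.time(), **result}
print(json.dumps(record))
```

Publishing each such record with a timestamp is the minimal mechanism for spotting regressions between releases.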

Objective: "Enable users to deploy latency sensitive models with KFServing."
* High Performance Dataplane
* Enable support for gRPC or a similar high-performance protocol.
* Continue to support existing HTTP Dataplane.
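The existing HTTP data plane follows a TF-Serving-style `:predict` REST contract; a minimal sketch of building such a request (host and model name are placeholders, and the exact path shape should be checked against the KFServing docs):

```python
import json
from urllib import request

def predict_request(host, model_name, instances):
    """Build a predict call against the v1-style HTTP data plane."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )

req = predict_request("flowers.example.com", "flowers-sample",
                      [[6.8, 2.8, 4.8, 1.4]])
print(req.full_url)
```

A gRPC data plane would replace this JSON-over-HTTP hop with a binary, streaming-capable protocol, which is where the latency win comes from.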

Objective: "Reduce Total Cost of Ownership when deploying multiple underutilized models."
* GPU Sharing
* Reduce TCO by enabling models of the same framework and version to be co-hosted in a single model server.
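The co-hosting idea can be sketched as a single server process that routes requests by model name; all names and the callable "models" here are hypothetical:

```python
class MultiModelServer:
    """Co-host several models of the same framework/version in one process,
    so they share a single server (and its GPU) rather than each paying for
    an underutilized replica."""

    def __init__(self):
        self._models = {}

    def load(self, name, model):
        self._models[name] = model

    def predict(self, name, instances):
        if name not in self._models:
            raise KeyError(f"model {name!r} is not loaded")
        return self._models[name](instances)

server = MultiModelServer()
server.load("model-a", lambda xs: [x * 2 for x in xs])
server.load("model-b", lambda xs: [x + 1 for x in xs])
print(server.predict("model-a", [1, 2, 3]))  # → [2, 4, 6]
```

The TCO reduction comes from the shared process: one accelerator serves many low-traffic models instead of one each.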

### v0.3 Stability (ETA: Dec 15, 2019)
Objective: "Improve practices around dependency management."
* Migrate to Kubebuilder 2.0.
* Use Go Modules.
* Stop Vendoring dependencies.
* Avoid the extremely heavy dependency on Tensorflow.
* Migrate to Kubernetes 1.15.
* Enable LabelSelectors for the Pod Mutation Webhook.

Objective: "Prevent feature regressions with greater end-to-end test coverage against a live cluster."
* Automated End-to-End Tests
* Execute against a Kubeflow-maintained GKE cluster.
* Execute against a Kubeflow-maintained AKS cluster.
* Achieve >80% Test Coverage of Supported Features.


Objective: "Improve the Serverless Experience by reducing cold starts/stops to 10 seconds on warmed models."
* Model Caching
* Reduce model download time by caching models from cloud storage on Persistent Volumes.
* Image Caching
* Reduce container download time by ensuring images are cached in all cloud environments.
* Server Shutdown
* Ensure that all model servers shutdown within 10 seconds of not receiving traffic.
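The model-caching behavior above can be sketched as a download that is skipped on a cache hit; the persistent-volume path and pluggable downloader are illustrative, not KFServing's storage initializer API:

```python
import tempfile
from pathlib import Path

def fetch_model(storage_uri, cache_dir, download):
    """Download a model from cloud storage only on a cache miss; later cold
    starts reuse the copy persisted on the volume."""
    cached = Path(cache_dir) / storage_uri.replace("://", "_").replace("/", "_")
    if not cached.exists():
        download(storage_uri, cached)  # slow path: pull from gs://, s3://, ...
    return cached

# Simulated downloader; a real one would call the cloud storage SDK.
def fake_download(uri, dest):
    dest.write_text(f"weights for {uri}")

cache = tempfile.mkdtemp()  # stands in for a mounted Persistent Volume
first = fetch_model("gs://models/flowers", cache, fake_download)
second = fetch_model("gs://models/flowers", cache, fake_download)  # cache hit
print(first == second)
```

Because the cache outlives any single pod, the download cost is paid once per volume rather than once per cold start.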

Objective: "Simplify user experience with handling credentials for storage backends."
* Secure storage mechanisms
* Implement a simplified user experience for storage backends protected by credentials (e.g. S3/GCS accounts).
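One plausible shape of that simplified experience, sketched in Python: resolve credentials from a mounted Kubernetes Secret first and fall back to the environment, so users configure them once. The variable names are conventional AWS ones; none of this is a KFServing API:

```python
import os
import tempfile
from pathlib import Path

def resolve_s3_credentials(secret_dir, env=os.environ):
    """Prefer credentials mounted from a Secret; fall back to env vars."""
    secret = Path(secret_dir)
    creds = {}
    for key in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
        mounted = secret / key
        if mounted.exists():
            creds[key] = mounted.read_text().strip()
        elif key in env:
            creds[key] = env[key]
    return creds

d = tempfile.mkdtemp()  # stands in for a Secret volume mount
(Path(d) / "AWS_ACCESS_KEY_ID").write_text("AKIA-example\n")
creds = resolve_s3_credentials(d, env={"AWS_SECRET_ACCESS_KEY": "from-env"})
print(sorted(creds))
```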
Objective: "Improve build and release processes to improve the developer experience and avoid regressions."
* Improve build reliability
* Implement build retries.
* Reduce PyTorch build time.
* Automated Image Injection for Model Servers.
* Implement new developer commands to deploy KFServing with local images.
* Improve versioning of XGBoost, SKLearn, and PyTorch.
* Replace the KFServing version with the corresponding framework version.

# Future
## Unscheduled Work
* Multi-Model Serving.
* Multiple InferenceServices share resources in the backend.
* GPU Sharing.
* Flexible Inference Graphs (e.g. the [MLGraph CRD](https://github.com/SeldonIO/mlgraph)).
* Model Experimentation.
* Ensembling.
* Queue and batch requests to increase throughput.
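The queue-and-batch bullet can be illustrated with a trivial batcher; a real server would also bound queueing delay, not just batch size:

```python
def batch_requests(requests, max_batch_size):
    """Group queued single-instance requests into batches so the model
    server amortizes per-call overhead across many inputs."""
    batches = []
    for i in range(0, len(requests), max_batch_size):
        batches.append(requests[i:i + max_batch_size])
    return batches

queued = [[0.1], [0.2], [0.3], [0.4], [0.5]]
print(batch_requests(queued, 2))  # → [[[0.1], [0.2]], [[0.3], [0.4]], [[0.5]]]
```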

# Historical
### v0.2 Integrate with the ML Ecosystem (Oct 31, 2019)
Objective: "Continue to simplify the user experience by deeply integrating with the Kubeflow Ecosystem."
* Kubeflow Integration
* Prepare KFServing to release v0.2 and v0.3 alongside Kubeflow v0.7.
* Integrate with `kfctl generate` and `kfctl apply`.
* Deploy as a [Kubernetes Application](https://github.com/kubernetes-sigs/application).
* Integrate with Kubeflow Pipelines to enable model deployment from a Pipeline.
* Integrate with Fairing to enable model deployment from a Notebook.
* Achieve 20% End-to-End Test Coverage of Supported Features. (See v0.3 for 80%).
* Support PVCs to enable integration with on-prem Kubeflow installations.
* Document Installation for various cloud providers (GCP, IBM Cloud, Azure, AWS).

Objective: "Empower users to deeply understand their predictions and validate KFServing's static graph architecture."
* Explainability
* Deploy a predictor and explainer, powered by Alibi.
* Deploy a predictor and explainer powered by a user-specified explainer container.
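To illustrate the predictor/explainer split, here is a toy perturbation-based explainer: it scores each feature by how much the prediction moves when that feature is zeroed out. This is not Alibi's algorithm, just the smallest thing with the same interface:

```python
def explain(predict, instance, baseline=0.0):
    """Attribute a prediction to features via single-feature perturbation."""
    base = predict(instance)
    importances = []
    for i in range(len(instance)):
        perturbed = list(instance)
        perturbed[i] = baseline  # replace one feature with the baseline
        importances.append(base - predict(perturbed))
    return importances

# A linear "model" makes the attributions easy to check by hand.
weights = [2.0, -1.0, 0.5]

def predict(x):
    return sum(w * v for w, v in zip(weights, x))

print(explain(predict, [1.0, 1.0, 1.0]))  # → [2.0, -1.0, 0.5]
```

The explainer only needs the predictor as a black-box function, which is exactly why it can run as a separate container next to any predictor.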

Objective: "Increase coverage of ML frameworks to support previously unsupported customer workloads."
* Frameworks
* Deploy an ONNX model.
* Explore supporting other model serialization mechanisms for certain frameworks (e.g. saving PyTorch models with dill).
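The serialization question can be illustrated with the stdlib `pickle`; the roadmap item concerns dill, which extends pickle to objects (such as lambdas) that pickle itself rejects:

```python
import pickle

class TinyModel:
    """Stand-in for a trained model; the real targets are PyTorch modules,
    where pickle's limits on lambdas and local classes motivate dill."""
    def __init__(self, scale):
        self.scale = scale

    def predict(self, xs):
        return [self.scale * x for x in xs]

blob = pickle.dumps(TinyModel(3))  # serialize the whole object, not just weights
restored = pickle.loads(blob)      # ...and reconstruct it at serving time
print(restored.predict([1, 2]))    # → [3, 6]
```

Whole-object serialization is convenient for users but ties the serving image to the training-time class definitions, which is the trade-off being explored.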



## Q2 2019
### v0.1: InferenceService Minimum Viable Product (June 30, 2019)
Objective: "Simplify the user experience and provide a low barrier to entry by minimizing the amount of YAML necessary to deploy a trained model."
* High Level Interfaces
* Deploy a Tensorflow model without specifying a Tensorflow Serving Technology.
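In the spirit of the minimal-YAML objective above, a manifest of roughly this shape deploys a Tensorflow model without naming a serving technology; the field names approximate the v1alpha2 API and the storage URI is the public flowers sample, but treat the details as illustrative rather than verbatim from this commit:

```yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: flowers-sample
spec:
  default:
    predictor:
      tensorflow:
        storageUri: "gs://kfserving-samples/models/tensorflow/flowers"
```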
