
NVIDIA inference server docs need to be updated to not use ksonnet #959

Closed
jlewi opened this issue Jul 22, 2019 · 21 comments

@jlewi
Contributor

jlewi commented Jul 22, 2019

https://www.kubeflow.org/docs/components/serving/trtinferenceserver/

Docs still use ksonnet.

For serving, I think it's simpler if, rather than providing kustomize manifests, our docs just contain example YAML specs that people can copy and paste in order to create a deployment.

@pdmack Any interest in picking this up?

@issue-label-bot

Issue Label Bot is not confident enough to auto-label this issue. See dashboard for more details.

@jlewi jlewi added this to New in 0.6.0 via automation Jul 22, 2019
@jlewi jlewi added this to To do in ksonnet-turndown via automation Jul 22, 2019
@sarahmaddox sarahmaddox added the release/v0.6 Doc updates reqd before we can archive the v0.6 doc branch label Jul 30, 2019
@sarahmaddox
Contributor

We've added a comment in the doc, saying that it still uses ksonnet and needs updating:
https://www.kubeflow.org/docs/components/serving/trtinferenceserver/#kubernetes-generation-and-deploy

It'd be good to get this updated before we archive the v0.6 branch of the docs.

@pdmack is this something you could take on?

@asispatra

Is this document available for kustomize?

@svnoesis

We are trying to set up the NVIDIA TensorRT Inference Server image on Kubernetes/Kubeflow. The README instructions on GitHub still talk about ksonnet.
https://github.com/kubeflow/kubeflow/tree/master/kubeflow/nvidia-inference-server#kubernetes-generation-and-deploy

Are there any updated steps for deploying nvidia-inference-server with kustomize?
Could anyone please help?

@jlewi
Contributor Author

jlewi commented Sep 8, 2019

@svnoesis Here's a sample Kubernetes spec (I generated it using the ksonnet commands) for deploying the NVIDIA inference server.

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: inference-server
    ksonnet.io/component: iscomp
  name: inference-server
  namespace: default
spec:
  ports:
  - name: http-inference-server
    port: 8000
    targetPort: 8000
  - name: grpc-inference-server
    port: 8001
    targetPort: 8001
  - name: metrics-inference-server
    port: 8002
    targetPort: 8002
  selector:
    app: inference-server
  type: LoadBalancer
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: inference-server
    ksonnet.io/component: iscomp
  name: inference-server-v1
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: inference-server
        version: v1
    spec:
      containers:
      - args:
        - trtserver
        - --model-store=gs://inference-server-model-store/tf_model_store
        image: nvcr.io/nvidia/inferenceserver:18.08.1-py2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          httpGet:
            path: /api/health/live
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 5
        name: inference-server
        ports:
        - containerPort: 8000
        - containerPort: 8001
        - containerPort: 8002
        readinessProbe:
          httpGet:
            path: /api/health/ready
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          limits:
            nvidia.com/gpu: 1
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      imagePullSecrets:
      - name: ngc

You should just modify that spec as needed to make it work for your use case; e.g. change --model-store to point to your model.
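
For example, a minimal tweak to the container args (the bucket path below is purely illustrative):

      containers:
      - args:
        - trtserver
        - --model-store=gs://<your-bucket>/<your-model-store>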

It would be great if someone would open up a PR updating the docs.
https://www.kubeflow.org/docs/components/serving/trtinferenceserver/

@jlewi jlewi added help wanted Extra attention is needed good first issue Good for newcomers labels Sep 8, 2019
@sarahmaddox sarahmaddox added good first issue Good for newcomers and removed good first issue Good for newcomers labels Oct 30, 2019
@jlewi jlewi added this to To Do in Needs Triage Nov 26, 2019
@sarahmaddox sarahmaddox added the doc-sprint Issues to work on during the Kubeflow Doc Sprint label Jan 1, 2020
@sarahmaddox sarahmaddox added this to Sprint backlog in doc-sprint Jan 3, 2020
@jtfogarty

jtfogarty commented Jan 5, 2020

/area docs
/kind feature

@jtfogarty jtfogarty moved this from To Do to Assigned to Area Owner For Triage in Needs Triage Jan 6, 2020
@jlewi jlewi added this to To do in KF1.0 via automation Jan 6, 2020
@jlewi
Contributor Author

jlewi commented Jan 6, 2020

Bumping to P0; we should get the doc updated for KF 1.0.

@kubeflow-bot kubeflow-bot removed this from Assigned to Area Owner For Triage in Needs Triage Jan 13, 2020
@sarahmaddox
Contributor

/assign @pdmack

@pdmack I'm assigning this to you, based on an earlier Slack conversation. (The issue includes an example of a YAML spec, but that may be out of date by now.) This issue is a P0 for Kubeflow v1.0. Let me know if you can't address it. In that case, I'll remove the content of the page (leaving the page in place) and say that NVIDIA Inference Server is not supported beyond Kubeflow v0.6.

@sarahmaddox sarahmaddox removed doc-sprint Issues to work on during the Kubeflow Doc Sprint good first issue Good for newcomers labels Jan 29, 2020
@jlewi jlewi moved this from To do to Docs in KF1.0 Jan 29, 2020
@mapsacosta

Any progress on this? Can someone provide hints on how to deploy this through a non-deprecated tool, namely, kustomize? This is a critical step for us.

@jlewi
Contributor Author

jlewi commented Feb 3, 2020

@mapsacosta see my comment above
#959 (comment)

You can just create K8s YAML specs to deploy the NVIDIA inference server.
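
For example (a minimal sketch; the filename is just an example), save the spec from my earlier comment as inference-server.yaml and apply it:

kubectl apply -f inference-server.yaml
kubectl get svc inference-server    # the Service is type LoadBalancer, so wait for an external IP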

@sarahmaddox
Contributor

/unassign @pdmack

Unassigning this issue in case someone else can pick it up.

@hefedev
Contributor

hefedev commented Feb 5, 2020

@mapsacosta There's also a chart repository you can look into: https://github.com/NVIDIA/tensorrt-inference-server/tree/master/deploy/single_server
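
A rough sketch of installing that chart (assuming Helm 3 and the chart path in the linked repo; the release name is arbitrary):

git clone https://github.com/NVIDIA/tensorrt-inference-server.git
helm install triton-inference-server ./tensorrt-inference-server/deploy/single_server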

@jlewi
Contributor Author

jlewi commented Feb 7, 2020

@sarahmaddox I think we can close this after #1608 is merged. I don't think there is any more immediate work.

@sarahmaddox sarahmaddox added priority/p2 doc-sprint Issues to work on during the Kubeflow Doc Sprint and removed priority/p0 release/v1.0 labels Feb 7, 2020
@sarahmaddox
Contributor

Thanks @jlewi Agreed that there's no immediate work required. But I think it would be good to add more info to this page if we can. I've labeled this issue for the doc sprint, in case anyone wants to pick it up then.

For people interested in picking up this issue: There's useful info in the comments on the issue. In addition, you may find useful content (though out of date) in the Kubeflow v0.6 docs.

@hefedev
Contributor

hefedev commented Feb 7, 2020

This is what you need to get this working with the YAML spec posted above. I'd wait for the kustomize build, though, since it offers better integration with Kubeflow.

# Credentials and versions for pulling the inference server image from NGC (nvcr.io)
export NVIDIA_API_KEY="your_nvidia_api_key"
export NVIDIA_API_EMAIL="your_nvidia_api_email"
export NVIDIA_IMAGE_TAG="18.08.1-py2"
export NVIDIA_EXAMPLE_MODEL_URL="github.com/NVIDIA/tensorrt-inference-server"

# Create the image pull secret referenced by the Deployment above (imagePullSecrets: name: ngc).
# The NGC username is the literal string "$oauthtoken"; the escaped "\$" prevents shell expansion.
kubectl create secret \
    docker-registry ngc \
    --docker-server=nvcr.io \
    --docker-username="\$oauthtoken" \
    --docker-password=$NVIDIA_API_KEY \
    --docker-email=$NVIDIA_API_EMAIL
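
After that, a quick sanity check and deploy (assuming the spec above was saved as inference-server.yaml; the filename is an assumption):

kubectl get secret ngc                    # confirm the image pull secret exists
kubectl apply -f inference-server.yaml    # the Deployment references this secret via imagePullSecrets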

@sarahmaddox
Contributor

@deadeyegoodwin Do you have bandwidth to update the Triton Inference Server docs to include the info provided in the comments on this issue?

If you don't have bandwidth, I'll close this issue now, since the docs no longer reference ksonnet. (They just link to the NVIDIA docs without providing additional info.)

@mengdong

@sarahmaddox could you be more specific about what you're requesting for the Triton Inference Server docs? Thanks. Would the Helm chart be sufficient for you, or would you prefer a different example?

@sarahmaddox
Contributor

@mengdong I'd be happy for someone from NVIDIA to confirm that the Triton Inference Server deployment using Helm is the best one for integration with Kubeflow. If that's the case, then no further work is needed and I'll close this issue.

Otherwise, if someone is able to develop and document a kustomize deployment, I'll leave this issue open. (I'd actually thought that the comments above were referring to a kustomize YAML file, but I see the YAML uses ksonnet, which Kubeflow no longer uses.)
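
For reference, a kustomize deployment could be a minimal sketch along these lines, assuming the Deployment/Service spec from the earlier comment is saved as inference-server.yaml:

# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- inference-server.yaml

It could then be applied with kubectl apply -k . (or kustomize build . | kubectl apply -f -).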

@mengdong

Okay, thanks for clarifying. I am from NVIDIA. We can document a basic kustomize deployment for Triton Inference Server. I will keep you posted.

@sarahmaddox
Contributor

Thanks, that's excellent news.

@stale

stale bot commented Jun 30, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot closed this as completed Jul 7, 2020
ksonnet-turndown automation moved this from To do to Done Jul 7, 2020
doc-sprint automation moved this from Sprint backlog to Done Jul 7, 2020
KF1.0 automation moved this from Docs to Done Jul 7, 2020