- Verify the Hardware and Software Requirements for NeMo Microservices
- Follow the prerequisites instructions and create a `.env` file with the following fields:
```shell
NGC_API_KEY="<your_Nvidia_key>"
HF_Token="<your_HF_key>"
```
- Clone this git repository to your working directory.
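The `.env` file step above can be scripted with a heredoc; a minimal sketch (the two values are placeholders you must replace with your real keys):

```shell
# Create the .env file in the working directory.
# Both values are placeholders; substitute your real NGC and Hugging Face keys.
cat > .env <<'EOF'
NGC_API_KEY="<your_Nvidia_key>"
HF_Token="<your_HF_key>"
EOF
```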
- Connect to your OpenShift cluster.
- Run the commands:
```shell
chmod +x clear_namespace.sh
chmod +x nemo_prerequisites.sh
chmod +x deploy_microservices.sh
chmod +x run.sh
```
- Update the `NAMESPACE` variable in the `run.sh` script.
- Run:
```shell
bash run.sh
```
- Track the installation on your cluster.
- Enable the NIM operator on OpenShift AI.
- Verify the existence of GPUs with:
```shell
oc get nodes -o json | jq -r '.items[] | select(.spec.taints != null) | {name: .metadata.name, taints: .spec.taints}'
```
- Use `llama-nim.yaml` to deploy the LLM. This might take about 10-15 minutes to complete; track the pod's events to make sure that there are no errors (for example, an authentication error):
```shell
oc apply -f llama-nim.yaml
```
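To see what the GPU check above returns, the same `jq` filter can be tried locally on sample node JSON. This is only a sketch: `gpu-node-1`, `cpu-node-1`, and the taint shown are made-up values, and `jq` must be installed.

```shell
# Sample shaped like `oc get nodes -o json`, with one tainted (GPU) node and
# one untainted node; node names and the taint are illustrative only.
sample='{"items":[
  {"metadata":{"name":"gpu-node-1"},
   "spec":{"taints":[{"key":"nvidia.com/gpu","value":"true","effect":"NoSchedule"}]}},
  {"metadata":{"name":"cpu-node-1"},"spec":{}}]}'

# Same filter as the cluster command above: keep only nodes that carry taints.
out=$(printf '%s' "$sample" | jq -r \
  '.items[] | select(.spec.taints != null) | {name: .metadata.name, taints: .spec.taints}')
echo "$out"
```

Only `gpu-node-1` survives the `select`, since the untainted node has no `.spec.taints` field.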
- Expose the service:
```shell
oc expose svc jupyter-service
```
  or expose the pod:
```shell
oc expose pod jupyter-notebook-b7d5479dd-rx8v7 --port=8888 --name=jupyter-notebook-service
```
- Get the route (use http, not https):
```shell
oc get route
```
- Get the token via the pod:
```shell
oc exec jupyter-notebook-b7d5479dd-rx8v7 -- jupyter server list
```
  The output would look like:
```
Currently running servers:
http://jupyter-notebook-b7d5479dd-rx8v7:8888/?token=token :: /home/jovya
```
  so in my case the token is simply "token".
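The token can also be pulled out of the `jupyter server list` output automatically; a sketch using the sample output above (in a real run, pipe the output of the `oc exec` command instead of the hard-coded string):

```shell
# Sample output copied from the step above; in practice, replace with:
#   output=$(oc exec jupyter-notebook-b7d5479dd-rx8v7 -- jupyter server list)
output='Currently running servers:
http://jupyter-notebook-b7d5479dd-rx8v7:8888/?token=token :: /home/jovya'

# Grab everything after "?token=" up to the next space.
token=$(printf '%s\n' "$output" | sed -n 's/.*?token=\([^ ]*\).*/\1/p')
echo "$token"   # -> token
```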