![Notebook Banner](cosmos-predict1_banner.png)

## Set Up Docker Login (Run in a Local Terminal)

Replace `nvapi-xxx` with your own NGC API key.

```bash
export NGC_CLI_API_KEY="nvapi-xxx"
echo "$NGC_CLI_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
```

## Deploy NIM (Run in This Jupyter Notebook)

In [21]:
%%bash

export NGC_API_KEY="nvapi-xxx"

export CONTAINER_NAME="cosmos-predict1-7b-text2world"
export IMG_NAME="nvcr.io/nim/nvidia/cosmos-predict1-7b-text2world:1.0.0"

# A path on your system to cache the downloaded models
export LOCAL_NIM_CACHE=/ephemeral/.cache/nim
mkdir -p $LOCAL_NIM_CACHE

# Start the NVIDIA NIM for Cosmos
docker run -d --rm --name=$CONTAINER_NAME \
    --runtime=nvidia \
    --gpus all \
    --shm-size=16GB \
    -e NGC_API_KEY \
    -v $LOCAL_NIM_CACHE:/opt/nim/.cache \
    -u $(id -u) \
    -p 8000:8000 \
    $IMG_NAME

37e4944c0567b7f6057dff2e67f8ac22e5e6d8abe7520808b3221b18f36eb595


### Make Sure the Container Is Up and Running
The following message in the container logs indicates a successful startup:

{'message': 'Starting HTTP Inference server', 'port': 8000, 'workers_count': 1, 'host': '0.0.0.0', 'log_level': 'info', 'SSL': 'disabled'}

In [17]:
!docker logs cosmos-predict1-7b-text2world


== Triton Inference Server ==

NVIDIA Release 25.01 (build 136230209)
Triton Server Version 2.54.0

Copyright (c) 2018-2024, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: CUDA Forward Compatibility mode ENABLED.
  Using CUDA 12.8 driver version 570.86.10 with kernel driver version 535.183.06.
  See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.

INFO 2025-07-30 11:56:07.598] Starting nimlib 0.8.3 nim_sdk 0.7.6
INFO 2025-07-30 11:56:07.598] NIM VERSION:
1.0.0
INFO 2025-07-30 11:56:07.598] NIM NOTICE:
GOVERNING TERMS: The NIM container is governed by the NVIDIA Software License Agreement and Product-Speci
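
Reading the full log output by eye is error-prone, so the check for the startup message can be automated. The following is a minimal sketch, not part of the NIM tooling: the helper names `server_started` and `read_container_logs` are hypothetical, and the marker string is taken from the startup message quoted above.

```python
import subprocess

# Text taken from the startup message shown in this notebook
STARTUP_MARKER = "Starting HTTP Inference server"


def server_started(log_text: str, marker: str = STARTUP_MARKER) -> bool:
    """Return True if the startup marker appears on any line of the log text."""
    return any(marker in line for line in log_text.splitlines())


def read_container_logs(container_name: str) -> str:
    """Collect the output of `docker logs` for the given container."""
    result = subprocess.run(
        ["docker", "logs", container_name],
        capture_output=True,
        text=True,
        check=True,
    )
    # docker logs replays the container's stdout and stderr separately,
    # so both streams are joined before searching.
    return result.stdout + result.stderr
```

In a notebook cell, `server_started(read_container_logs("cosmos-predict1-7b-text2world"))` would return `False` for the truncated log above, because the startup message has not been printed yet.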

In [19]:
!curl -i -X GET 'http://0.0.0.0:8000/v1/health/live'

HTTP/1.1 503 Service Unavailable
[1mdate[0m: Wed, 30 Jul 2025 12:41:11 GMT
[1mserver[0m: uvicorn
[1mcontent-length[0m: 233
[1mcontent-type[0m: application/json

{"description":"Triton liveness check","status":"error","error":"[StatusCode.UNAVAILABLE] failed to connect to all addresses; last error: UNKNOWN: ipv6:%5B::1%5D:8001: Failed to connect to remote host: Timeout occurred: FD Shutdown"}
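
The 503 response above shows that the liveness endpoint can answer before the Triton backend is reachable, so a single `curl` call is not enough to confirm readiness. One option is to poll the endpoint until it succeeds. The sketch below uses only the Python standard library; `http_status` and `wait_until_live` are hypothetical helper names, and treating a 200 status as "live" is an assumption, since this notebook only shows the failing case.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout_s: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 503 while the server is still starting
    except (urllib.error.URLError, OSError):
        return 0  # nothing is accepting connections yet


def wait_until_live(url, probe=http_status, attempts=60, delay_s=10.0, sleep=time.sleep):
    """Poll `url` until the probe reports 200 (assumed to mean live) or attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next attempt
    return False
```

Calling `wait_until_live("http://0.0.0.0:8000/v1/health/live")` before sending an inference request would block until the endpoint reports success or the attempts are exhausted. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running container.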

## Start Inferencing


In [18]:
import requests
import base64

prompt = "first person view from a camera in a car driving down a two lane neighborhood street, " \
    "viewed from the dashcam as we drive down the street. " \
    "The camera faces forward. There are nice houses and sidewalks " \
    "in this suburban area with green grass front yards and flower gardens and large oak trees. " \
    "It is a rainy day and there are grey clouds overhead. " \
    "The road has puddles on it, reflecting the sky overhead. The windshield wipers flash by."

response = requests.post(
    "http://0.0.0.0:8000/v1/infer",
    json=dict(prompt=prompt),
    headers={
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
)
response.raise_for_status()

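# Index the key directly so a missing field raises a clear KeyError
# instead of passing None to b64decode.
data = response.json()["b64_video"]
video_bytes = base64.b64decode(data)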

with open("video.mp4", "wb") as video_file:
    video_file.write(video_bytes)

HTTPError: 500 Server Error: Internal Server Error for url: http://0.0.0.0:8000/v1/infer
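
The run above ended in a 500 error, so no video was written. When a request does succeed, the decode-and-save step can be made more defensive. This is a minimal sketch under stated assumptions: `save_b64_video` is a hypothetical helper, the `b64_video` key is the one used in the cell above, and strict base64 validation is a design choice here rather than a documented requirement of the service.

```python
import base64
import binascii
from pathlib import Path


def save_b64_video(payload: dict, path: str, key: str = "b64_video") -> int:
    """Decode payload[key] from base64, write it to path, and return the byte count."""
    encoded = payload.get(key)
    if not isinstance(encoded, str) or not encoded:
        raise ValueError(f"response JSON has no usable {key!r} field")
    try:
        # validate=True rejects characters outside the base64 alphabet
        video_bytes = base64.b64decode(encoded, validate=True)
    except binascii.Error as err:
        raise ValueError(f"{key!r} is not valid base64") from err
    return Path(path).write_bytes(video_bytes)
```

With the `response` object from the cell above, `save_b64_video(response.json(), "video.mp4")` would replace the last four lines of that cell.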

## Stop the Container

In [None]:
!docker ps -a && \
docker stop cosmos-predict1-7b-text2world

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
Error response from daemon: No such container: Llama3-8B-Instruct
