# 🚀 Get started with LLM Inference using Cortex REST API

Welcome! This notebook walks you through using the Cortex REST API to access top frontier models with enterprise-grade inference.

**Step 1:** Install the OpenAI SDK.

**Step 2:** Get your Programmatic Access Token (PAT) [here](https://app.snowflake.com/_deeplink/settings/authentication)

**Step 3:** Configure the SDK by creating the client with the Cortex base URL and your PAT.

**Step 4:** Model inference, including streaming and function calling.

**Step 5:** [Track your usage](https://app.snowflake.com/_deeplink/#/account/usage/consumption?usageType=compute&consumptionServiceType=WAREHOUSE_METERING&usageServiceType=allServices)

For trial users: [Add your credit card for on-demand capacity](https://app.snowflake.com/_deeplink/snowflake-billing)

## Step 1: Install the OpenAI SDK

Install the `openai` SDK from PyPI:

```bash
pip install openai
```

## Step 2: Get your Programmatic Access Token (PAT)

You can retrieve your PAT by creating a new token in the Snowsight interface [here](https://app.snowflake.com/_deeplink/settings/authentication).
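
Since a PAT is a credential, it can help to keep it out of the notebook itself. The sketch below reads it from an environment variable; the name `SNOWFLAKE_PAT` and the helper `load_pat` are assumptions made for this example, not something Snowflake or the OpenAI SDK defines.

```python
import os


def load_pat(env=None, var_name="SNOWFLAKE_PAT"):
    """Return the PAT stored in an environment variable.

    `SNOWFLAKE_PAT` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable to your PAT.")
    return token
```

With the variable exported in your shell, `pat = load_pat()` gives you the value to pass as `api_key` in Step 3.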

## Step 3: Configure the SDK

Next, create the OpenAI client, passing your Snowflake account URL (followed by the Cortex path) as the `base_url` and the PAT you created in Step 2 as the `api_key`.

To find your account URL in Snowsight:

1. Click your user profile in the bottom-left corner.
2. Choose `Connect a tool to Snowflake`.
3. Copy the `Account URL` value and paste it into the placeholder below.

```python
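from openai import OpenAI

# Paste the PAT you created in Step 2 (avoid committing real tokens).
pat = "<YOUR_PAT>"

client = OpenAI(
    api_key=pat,
    # Replace the host with your own Account URL.
    base_url="https://ZZFDHZJ-DUB89364.snowflakecomputing.com/api/v2/cortex/openai"
)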
```
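
The `base_url` is the Account URL followed by the path `/api/v2/cortex/openai`. As a small sketch of that composition, the hypothetical helper `cortex_base_url` below tidies a copied Account URL (stray whitespace, trailing slash, missing scheme) and appends the path. The assumption that a scheme-less value should use HTTPS is mine.

```python
CORTEX_OPENAI_PATH = "/api/v2/cortex/openai"  # path taken from the example above


def cortex_base_url(account_url):
    """Normalize an Account URL and append the Cortex OpenAI-compatible path."""
    url = account_url.strip().rstrip("/")
    if url.startswith("http://"):
        url = url[len("http://"):]
    if not url.startswith("https://"):
        # Assumption: a value copied without a scheme should use HTTPS.
        url = "https://" + url
    if url.endswith(CORTEX_OPENAI_PATH):
        return url  # already a full base URL
    return url + CORTEX_OPENAI_PATH
```

You could then write `base_url=cortex_base_url("<your Account URL>")` when creating the client.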

## Step 4: Model inference

Check out [the list of Cortex supported models](https://docs.snowflake.com/user-guide/snowflake-cortex/aisql#regional-availability).

Replace `claude-sonnet-4-5` with the name of the model you want to use.

```python
response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "How does a snowflake get its unique pattern?"
        }
    ]
)

print(response.choices[0].message)
```

## Streaming responses

The Cortex AI endpoint also supports streaming responses for any supported model.

```python
stream = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "How does a snowflake get its unique pattern?"
        }
    ],
    stream=True
)

for chunk in stream:
    # Skip chunks without choices or text (for example, a final usage chunk).
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```
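
If you also want the complete reply as a single string once streaming finishes, one option is to accumulate the deltas while printing them. The helper `collect_stream` below is a hypothetical name for this sketch. It relies only on the `choices[0].delta.content` shape used in the loop above.

```python
def collect_stream(stream, echo=True):
    """Print streamed text as it arrives and return the full reply."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices at all
        text = chunk.choices[0].delta.content
        if text:
            parts.append(text)
            if echo:
                print(text, end="", flush=True)
    return "".join(parts)
```

Calling `full_text = collect_stream(stream)` in place of the `for` loop keeps the live output and leaves the whole answer in `full_text`.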

## Function calling

Cortex AI also supports function calling. In this example, we show how to use function calling with a `get_weather` function.

```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Bogotá, Colombia"
                }
            },
            "required": [
                "location"
            ],
            "additionalProperties": False
        },
        "strict": True
    }
}]

completion = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": "What is the weather like in Paris today?"}],
    tools=tools
)

print(completion.choices[0].message.tool_calls)
```
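
The call above only returns the model's request to run `get_weather`; running the function is up to your code. The sketch below shows one way to dispatch a tool call and build the reply message, assuming the tool-call objects follow the standard OpenAI SDK shape (`id`, `function.name`, `function.arguments` as a JSON string). The `get_weather` body, `TOOL_REGISTRY`, and `run_tool_call` are hypothetical stand-ins.

```python
import json


def get_weather(location):
    """Hypothetical stand-in: a real version would query a weather service."""
    return {"location": location, "temperature_c": None}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call, registry=TOOL_REGISTRY):
    """Execute one tool call and build the `tool` message to send back."""
    func = registry[tool_call.function.name]
    # The SDK delivers the arguments as a JSON string, not a dict.
    args = json.loads(tool_call.function.arguments or "{}")
    result = func(**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    }
```

In the usual OpenAI flow you would append the assistant message and each `run_tool_call(...)` result to `messages`, then call `client.chat.completions.create` again so the model can write its final answer. Whether every Cortex model accepts that second round is something to verify against the Cortex documentation.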

## Get more from your data with AI

*   Go beyond inference and **[build with Cortex Agents](https://quickstarts.snowflake.com/guide/getting_started_with_cortex_agents/index.html#0)** to query across structured data (like metrics and KPIs) and unstructured data (such as customer conversations and meeting transcripts).
*   **[Try more templates](https://app.snowflake.com/templates)** to quickly explore guided Worksheets, Notebooks, and Streamlit apps with no setup required.
*   Join the **[Snowflake Marketplace](https://www.snowflake.com/en/product/features/marketplace/)** to reach more than 12,000 potential customers and unlock new monetization opportunities.



Perfect, Mario 💪
Here is **just the step-by-step that worked**, cleaned up and in order:

---

### ✅ **Setup and connection**

1️⃣ **Launch an EC2 instance** (Amazon Linux)

* Enable **Auto-assign public IPv4**
* Attach a **Key Pair (.pem)**
* Save the `.pem` file on your Mac.

2️⃣ **Move and secure the key**

```bash
mv ~/Downloads/mggkeypair.pem ~/.ssh/
chmod 400 ~/.ssh/mggkeypair.pem
```

3️⃣ **Connect from your Mac to the EC2 instance**

```bash
ssh -i ~/.ssh/mggkeypair.pem ec2-user@<EC2_IP>
```

4️⃣ **Get your public IP**

```bash
curl -4 ifconfig.me
```

---

### ✅ **Configure Snowflake**

5️⃣ **Create the Network Policy (Snowsight)**

```sql
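-- Allow only your local machine and the EC2 instance (/32 = a single host).
CREATE OR REPLACE NETWORK POLICY CORTEX_SDK_POLICY
  ALLOWED_IP_LIST = ('<YOUR_LOCAL_IP>/32', '<EC2_IP>/32')
  BLOCKED_IP_LIST = ();
-- Applies to the whole account: every other IP is blocked once this runs.
ALTER ACCOUNT SET NETWORK_POLICY = CORTEX_SDK_POLICY;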
```

*(Sign in to Snowsight again)*
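
Because a typo in the allow list can lock you out of the account, it may be worth validating the addresses before pasting them into the policy. The sketch below uses Python's standard `ipaddress` module to check IPv4 entries and render the list literal; `allowed_ip_list` is a hypothetical helper, and it assumes IPv4 addresses as in the example above.

```python
import ipaddress


def allowed_ip_list(addresses):
    """Validate IPv4 addresses/CIDRs and render them as a SQL list literal."""
    entries = []
    for raw in addresses:
        # strict=False lets a bare address such as 203.0.113.7 become /32;
        # an invalid value raises ValueError.
        network = ipaddress.IPv4Network(raw.strip(), strict=False)
        entries.append(f"'{network.with_prefixlen}'")
    return "(" + ", ".join(entries) + ")"
```

For example, `allowed_ip_list(["203.0.113.7", "198.51.100.20/32"])` produces the text to place after `ALLOWED_IP_LIST =`.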

---

### ✅ **Upload the script**

6️⃣ **Copy your Python file to the EC2 instance**

```bash
scp -i ~/.ssh/mggkeypair.pem /Users/mariogalvis/Documents/mggsnowflake/API/mgg.py ec2-user@<EC2_IP>:/home/ec2-user/
```

---

### ✅ **Install dependencies on EC2**

7️⃣ **Install the OpenAI SDK**

```bash
pip install openai
```

---

### ✅ **Run the test script**

8️⃣ **File `mgg.py`:**

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SNOWFLAKE_PAT",  # placeholder: paste your PAT here
    base_url="https://ZZFDHZJ-DUB89364.snowflakecomputing.com/api/v2/cortex/openai"
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
)

print(resp.choices[0].message.content)
```

9️⃣ **Run it**

```bash
python3 mgg.py
```

---

### ✅ **Create an API with FastAPI**

🔹 Install dependencies:

```bash
pip install fastapi uvicorn openai
```

🔹 Create `app.py`:

```python
from fastapi import FastAPI, Request
from openai import AsyncOpenAI

app = FastAPI()

# Async client so the model call does not block FastAPI's event loop.
client = AsyncOpenAI(
    api_key="YOUR_SNOWFLAKE_PAT",  # placeholder: paste your PAT here
    base_url="https://ZZFDHZJ-DUB89364.snowflakecomputing.com/api/v2/cortex/openai"
)

@app.post("/ask")
async def ask(request: Request):
    # Expects a JSON body such as {"prompt": "..."}.
    data = await request.json()
    prompt = data.get("prompt", "")
    response = await client.chat.completions.create(
        model="claude-sonnet-4-5",
        messages=[{"role": "user", "content": prompt}]
    )
    return {"answer": response.choices[0].message.content}
```
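
The `ask` handler passes whatever arrives in `prompt` straight to the model, including an empty string. As a sketch of a guard you could add, the hypothetical helper `parse_prompt` below rejects missing, empty, or oversized prompts. The `max_chars` limit is an arbitrary value picked for illustration.

```python
def parse_prompt(data, max_chars=4000):
    """Return a cleaned prompt or raise ValueError.

    `max_chars` is an arbitrary limit chosen for this sketch.
    """
    if not isinstance(data, dict):
        raise ValueError("Request body must be a JSON object.")
    prompt = data.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("'prompt' must be a non-empty string.")
    prompt = prompt.strip()
    if len(prompt) > max_chars:
        raise ValueError(f"'prompt' exceeds {max_chars} characters.")
    return prompt
```

Inside `ask`, you could call `parse_prompt(data)` and turn a `ValueError` into FastAPI's `HTTPException(status_code=400, detail=str(err))` so the client gets a clear error instead of a wasted model call.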

🔹 Run the server:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

---

### ✅ **Open the port**

🔹 In the AWS Console → **EC2 → Security Groups → Inbound rules**
Add:

* Type: **Custom TCP**
* Port: **8000**
* Source: **My IP**

---

### ✅ **Test the API from your Mac**

```bash
curl -X POST http://<EC2_IP>:8000/ask \
     -H "Content-Type: application/json" \
     -d '{"prompt":"What are dynamic tables in Snowflake?"}'
```
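
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the `curl` call above; `build_ask_request` and `ask` are hypothetical helper names, and the 60-second timeout is an assumption.

```python
import json
import urllib.request


def build_ask_request(host, prompt, port=8000):
    """Build the POST request for the /ask endpoint without sending it."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host, prompt, timeout=60):
    """Send the prompt and return the `answer` field of the JSON reply."""
    request = build_ask_request(host, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["answer"]
```

With the server running, `print(ask("<EC2_IP>", "What are dynamic tables in Snowflake?"))` should print the model's answer.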

---

### ✅ **Optional**

* Stop the server:

  ```bash
  sudo pkill -f uvicorn
  ```
* Stop the instance in AWS:
  EC2 → Instances → **Stop instance**

---

💥 And that's it.
That was the complete flow that worked end to end:
**Mac → EC2 → Snowflake Cortex → FastAPI → Client (curl / VSCode).**

