# OpenAI OSS Model Getting Started Guide on Amazon Bedrock

This notebook introduces gpt-oss-120b and gpt-oss-20b on Amazon Bedrock, including how to use the familiar OpenAI SDK interface with Bedrock. We'll cover how to make API requests, explore available parameters and payload structures, and examine use cases for both model variants.

## Model Variants

### GPT-OSS-120b

Parameters: 120 billion

Use Cases: Complex reasoning tasks, agentic use cases

### GPT-OSS-20b

Parameters: 20 billion

Use Cases: Faster inference, cost-effective deployments, simpler tasks

## Core Capabilities

Both OpenAI model variants share the following characteristics:

**Input/Output:** Text-in, text-out 

**Context Window:** 128,000 tokens  

**Model Type:** Advanced reasoning models

**Region:** us-west-2

**Tool Calling:** ✅ Supported

**Bedrock Guardrails:** ✅ Supported

**Converse API:** ✅ Supported

**OpenAI Chat Completions API:** ✅ Supported

**Web Search:** ❌ Not available at this time

## What You'll Learn in This Getting Started Guide

- Setting up the OpenAI SDK to work with Amazon Bedrock
- Understanding request parameters and response structures
- Choosing between the 120b and 20b models for your use case
- Implementing tool calling capabilities

## Model Access on Amazon Bedrock

Make sure your IAM identity has permission to access OpenAI's models on Amazon Bedrock. To request model access, follow these steps:

- Go to Amazon Bedrock → Model access
- Click Modify model access
- Scroll to OpenAI
- Select the checkbox next to each model you want to access
- Click Next and accept any EULAs
- Click Submit

## IAM Permissions

To use Bedrock models, your AWS credentials need the following permissions:


In [None]:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}
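The policy above allows invoking every model (`"Resource": "*"`). As a hedged sketch, the snippet below builds a narrower policy limited to the two gpt-oss model IDs listed later in this guide. It assumes the usual Bedrock foundation-model ARN shape and the us-west-2 region; `build_policy` is a hypothetical helper name, and the generated ARNs should be checked against your own account before use.

```python
import json

REGION = "us-west-2"  # region used throughout this guide
MODEL_IDS = ["openai.gpt-oss-120b-1:0", "openai.gpt-oss-20b-1:0"]


def build_policy(region, model_ids):
    """Return an IAM policy dict that allows invoking only the given models."""
    # Assumed ARN shape for Bedrock foundation models (no account ID segment)
    arns = [f"arn:aws:bedrock:{region}::foundation-model/{m}" for m in model_ids]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": arns,
            }
        ],
    }


policy = build_policy(REGION, MODEL_IDS)
print(json.dumps(policy, indent=2))
```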


## Step 1: Environment Configuration

First, we need to install the required packages and tell the OpenAI SDK to talk to Bedrock instead of OpenAI's servers.

### Required Imports:
- `os` → For environment variables
- `boto3` → For native Bedrock Converse API  
- `openai` → For OpenAI SDK compatibility with Bedrock
- `strands-agents` → For Amazon Strands agent framework

### Environment Variables:
We set environment variables to point the OpenAI SDK at Bedrock:
- `AWS_BEARER_TOKEN_BEDROCK` → Your Bedrock API key  
- `OPENAI_BASE_URL` → Bedrock's OpenAI-compatible endpoint

In [None]:
%pip install -U boto3 openai strands-agents

In [None]:
import os
import boto3
from openai import OpenAI
from strands import Agent
from strands.models.openai import OpenAIModel
from strands.models import BedrockModel

### Model IDs

- **openai.gpt-oss-120b-1:0**
- **openai.gpt-oss-20b-1:0**

In [None]:
# Model configuration - change this to either of the model IDs listed above
MODEL_ID = "openai.gpt-oss-120b-1:0"

print(f"✅ Using model: {MODEL_ID}")

In [None]:
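# Set environment variables to point to Bedrock
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "<insert your bedrock API key>"
# The OpenAI SDK reads its key from OPENAI_API_KEY, so reuse the Bedrock API key there
os.environ["OPENAI_API_KEY"] = os.environ["AWS_BEARER_TOKEN_BEDROCK"]
os.environ["OPENAI_BASE_URL"] = "https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1"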

print("✅ Environment configured for Bedrock!")
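Before creating any clients, it can help to confirm the variables actually hold real values rather than the placeholder text. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper, and treating any value that starts with `<` as a placeholder is an assumption based on the template string used above.

```python
import os


def missing_env_vars(names):
    """Return the names that are unset, empty, or still hold a <placeholder>."""
    missing = []
    for name in names:
        value = os.environ.get(name, "").strip()
        if not value or value.startswith("<"):
            missing.append(name)
    return missing


# Variables set in the cell above
problems = missing_env_vars(["AWS_BEARER_TOKEN_BEDROCK", "OPENAI_BASE_URL"])
if problems:
    print("Not configured yet:", ", ".join(problems))
else:
    print("Environment looks configured")
```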

## Step 2: Import and Initialize OpenAI Client

Now we use the **exact same OpenAI SDK** you're familiar with. The client will automatically read the environment variables we just set.

**Key Point**: This is the same OpenAI library, but now it's talking to Amazon Bedrock.

In [None]:
# Initialize both clients
client = OpenAI()  # For chat completions API
bedrock_client = boto3.client('bedrock-runtime', region_name='us-west-2')  

print("✅ OpenAI client initialized (pointing to Bedrock)")
print(f"✅ Bedrock client initialized in region: {bedrock_client.meta.region_name}")

## Step 3: Make API Calls 

The API call structure is identical to OpenAI:
- Same `messages` format with `role` and `content`
- Same `model` parameter (but uses Bedrock model IDs)  
- Same `stream` parameter for real-time responses

In [None]:
response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {"role": "system", "content": "You are a concise, highly logical assistant."},
        {"role": "user",   "content": "What is the largest city in the southern hemisphere?"}
    ],
    temperature=0.2,
    max_tokens=1000,
    stream=True                         
)

print("✅ API request created")


## Step 4: Process Streaming Response

Handle the response exactly like you would with OpenAI. Each `item` in the response is a chunk of the model's output.

In [None]:
for item in response:
    print(item)
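Printing each `item` shows the full chunk object. If you only want the generated text, one option is to join the `delta.content` field of each chunk. The sketch below assumes the chunks follow the usual chat-completions streaming shape (`choices[0].delta.content`); `collect_stream_text` and `fake_chunk` are hypothetical names, and the stand-in chunks exist only so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_stream_text(chunks):
    """Join the text deltas from an iterable of chat-completion chunks."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all, so skip them
        if not chunk.choices:
            continue
        text = getattr(chunk.choices[0].delta, "content", None)
        if text:
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streaming chunk."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo = [fake_chunk("Hello, "), fake_chunk(None), fake_chunk("world"), SimpleNamespace(choices=[])]
print(collect_stream_text(demo))
```

With a live stream you would call `collect_stream_text(response)` instead of looping manually. A stream can only be consumed once, so create a new request if the loop above has already run.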

## Bedrock Converse Support


The Bedrock Converse API provides a consistent interface for working with all Bedrock models that support messages. This means you can write your code once and use it across different models without changes.

Key Benefits:
- Universal Interface: Same API structure works with Claude, Llama, Titan, and other models
- Model-Specific Parameters: Pass unique parameters when needed for specific models
- Privacy: Amazon Bedrock doesn't store any content you provide - data is only used for response generation
- Advanced Features: Built-in support for guardrails, tools/function calling, and prompt management 

In [None]:
response = bedrock_client.converse(
    modelId=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": [{"text": "Say this is a test"}]
        }
    ],
    system=[{"text": "You are a concise, highly logical assistant."}],
    inferenceConfig={
        "temperature": 0.2,
        "maxTokens": 1000
    }
)

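# Reasoning models may return non-text blocks (such as reasoning content) first,
# so print every text block instead of assuming it is at index 0
for block in response['output']['message']['content']:
    if 'text' in block:
        print(block['text'])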
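The capabilities list marks tool calling as supported. As a minimal sketch of what that could look like with Converse, the snippet below defines one tool in the `toolConfig` shape and builds the follow-up message for a `toolUse` block. `get_weather`, its schema, and its return value are hypothetical placeholders, and the sample assistant turn is hand-written; no request is sent.

```python
import json

# Hypothetical tool definition in the Converse `toolConfig` shape
TOOL_CONFIG = {
    "tools": [
        {
            "toolSpec": {
                "name": "get_weather",
                "description": "Return the current temperature for a city.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    }
                },
            }
        }
    ]
}


def get_weather(city):
    """Placeholder implementation; a real tool would call a weather service."""
    return {"city": city, "temperature_c": 21}


def handle_tool_use(assistant_message):
    """Run each requested tool and build the follow-up user message."""
    results = []
    for block in assistant_message["content"]:
        tool_use = block.get("toolUse")
        if tool_use and tool_use["name"] == "get_weather":
            output = get_weather(**tool_use["input"])
            results.append(
                {"toolResult": {"toolUseId": tool_use["toolUseId"], "content": [{"json": output}]}}
            )
    return {"role": "user", "content": results}


# Hand-written example of an assistant turn that requests the tool
sample = {
    "role": "assistant",
    "content": [{"toolUse": {"toolUseId": "tool-1", "name": "get_weather", "input": {"city": "Seattle"}}}],
}
print(json.dumps(handle_tool_use(sample), indent=2))
```

In a real exchange you would pass `toolConfig=TOOL_CONFIG` to `bedrock_client.converse(...)`. When the response's `stopReason` is `tool_use`, append the assistant message and the output of `handle_tool_use` to `messages`, then call `converse` again.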

## What's Happening Behind the Scenes?

When you use the OpenAI SDK with Bedrock, your requests are automatically translated to Bedrock's native `InvokeModel` API.

### Request Translation
- **OpenAI SDK Request** → **Bedrock InvokeModel** 
- The request body structure remains the same
- But there are some key differences in how parameters are handled:

| Parameter | OpenAI SDK | Bedrock InvokeModel |
|-----------|------------|-------------------|
| **Model ID** | In request body | Part of the URL path |
| **Streaming** | `stream=True/False` | Different API endpoints:<br/>• `InvokeModel` (non-streaming)<br/>• `InvokeModelWithResponseStream` (streaming) |
| **Request Body** | Full chat completions format | Same format, but `model` and `stream` are optional |
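To make the table concrete, here is an illustrative sketch of the mapping it describes. This is not Bedrock's actual implementation: `to_invoke_model_call` is a hypothetical function that only shows where each piece of a chat-completions request would end up.

```python
import json


def to_invoke_model_call(chat_request):
    """Map a chat-completions style request dict onto InvokeModel-style arguments."""
    body = dict(chat_request)           # copy so the caller's dict is untouched
    model_id = body.pop("model")        # the model ID moves out of the body
    stream = body.pop("stream", False)  # streaming selects the operation instead
    operation = "InvokeModelWithResponseStream" if stream else "InvokeModel"
    return {"operation": operation, "modelId": model_id, "body": json.dumps(body)}


call = to_invoke_model_call(
    {
        "model": "openai.gpt-oss-120b-1:0",
        "messages": [{"role": "user", "content": "Say this is a test"}],
        "stream": True,
    }
)
print(call["operation"], call["modelId"])
print(call["body"])
```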

## Conclusion

You're now using the **OpenAI SDK** with **Amazon Bedrock infrastructure**!

### What Just Happened?
1. **Same Code**: You wrote familiar OpenAI SDK code
2. **Different Backend**: Requests went to Amazon Bedrock instead of OpenAI
3. **Same Experience**: The API format and streaming behavior are unchanged

### Benefits
- ✅ Use familiar OpenAI SDK patterns
- ✅ Run on AWS infrastructure  
- ✅ Access to Bedrock's model catalog
- ✅ AWS security, logging, and monitoring

## Use Cases


Let's test reasoning with the Chat Completions API:

In [None]:
def run_prompt(prompt, temperature=0.02, max_tokens=1000):
    """Helper function to run prompts with chat completions"""
    response = client.chat.completions.create(
        model=MODEL_ID, # update your model id to either 120b or 20b
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens,
        stream=False
    )
    return response.choices[0].message.content

print("✅ Helper function ready!")

## Reasoning Use Case Examples

Testing complex reasoning and problem-solving capabilities with the Chat Completions API.

### Exponential Growth Problem

Tests compound growth calculations and iterative reasoning.

In [None]:
prompt = """Solve and verify.

A factory makes widgets. Day 1 output is 500. Each day after, output increases by 8%.
How many days until daily output first exceeds 1,000?

Return:
- Short reasoning outline (≤4 lines)
- Final answer (integer days)
- One-line self-check (plug back)."""

print(run_prompt(prompt, max_tokens=2000))
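To check the model's answer, the reference value can be computed directly. In this short sketch (`first_day_exceeding` is a hypothetical helper), day 1 is counted as the 500-widget day, so output on day n is 500 × 1.08^(n−1). Day 10 gives about 999.5 and day 11 about 1,079.5, so day 11 is the first to exceed 1,000, which is 10 days after day 1.

```python
def first_day_exceeding(start, growth, target):
    """Return the first day number (day 1 = start) whose output exceeds target."""
    day, output = 1, start
    while output <= target:
        output *= 1 + growth
        day += 1
    return day


day = first_day_exceeding(500, 0.08, 1000)
print(day)
```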

### Kinematics Problem

Tests multi-phase motion analysis and distance calculations.

In [None]:
prompt = """A car accelerates uniformly from rest to 27 m/s in 9 s, then coasts at constant speed for 60 s, then brakes uniformly to rest in 6 s.
Compute total distance traveled.

Return:
- Piecewise distances
- Sum
- One-line unit/sanity check."""

print(run_prompt(prompt, max_tokens=1000))
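A reference calculation for comparing against the model's output: under uniform acceleration, distance equals average speed times duration, so each phase reduces to one line. `ramp_distance` is a hypothetical helper name.

```python
def ramp_distance(v_start, v_end, seconds):
    """Distance under uniform acceleration: average speed times duration."""
    return (v_start + v_end) / 2 * seconds


accelerate = ramp_distance(0, 27, 9)  # rest to 27 m/s in 9 s
coast = 27 * 60                       # constant 27 m/s for 60 s
brake = ramp_distance(27, 0, 6)       # 27 m/s to rest in 6 s
total = accelerate + coast + brake
print(accelerate, coast, brake, total)
```

The three phases come to 121.5 m, 1,620 m, and 81 m, for a total of 1,822.5 m.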

### Scientific Problem Solving

In [None]:
science_prompt = """A ball is thrown vertically upward from ground level with an initial velocity of 20 m/s.
Calculate:
1. Maximum height reached
2. Time to reach maximum height  
3. Total time in the air
4. Velocity when it returns to ground level

Use g = 9.8 m/s². Show your physics reasoning and formulas."""

print("Scientific Problem Solving:")
print("=" * 50)
print(run_prompt(science_prompt, max_tokens=1000))
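For comparison with the model's response, the four quantities follow from the constant-acceleration equations, ignoring air resistance (an assumption the prompt implies but does not state). A short sketch:

```python
G = 9.8    # m/s^2, as given in the prompt
V0 = 20.0  # m/s, initial upward velocity

max_height = V0**2 / (2 * G)  # from v^2 = v0^2 - 2*g*h with v = 0
time_up = V0 / G              # from v = v0 - g*t with v = 0
total_time = 2 * time_up      # flight is symmetric when launch and landing heights match
return_velocity = -V0         # same speed as launch, directed downward

print(f"{max_height:.2f} m, {time_up:.2f} s, {total_time:.2f} s, {return_velocity:.1f} m/s")
```

This gives roughly 20.41 m, 2.04 s, 4.08 s, and −20 m/s.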

### System Design 

In [None]:
# System Design Example  
system_prompt = """Design a URL shortener service like bit.ly for 100M URLs per day.
Address:
1. Database schema design
2. Encoding/decoding algorithm choice
3. Scalability considerations (caching, load balancing)
4. One potential bottleneck and solution

Keep response concise but show your reasoning."""

print("System Design Example:")
print("=" * 50)
print(run_prompt(system_prompt, max_tokens=3000))

## Amazon Strands & OpenAI OSS on Amazon Bedrock

### Use-case — Two-agent Support-ticket JSON pipeline

Strands runs a mini-workflow in which:

- Summarizer turns raw Zendesk tickets into a compact JSON array: `[{id, summary, severity}]`

- Triager reads that JSON and returns routing decisions: `[{id, route}]`, where route ∈ {escalate, backlog}

Because Strands is model-agnostic, you can swap in Claude 3, Titan, or any future Bedrock model by changing one string.

Strands itself is an open-source SDK from AWS that gives you single-file agents, built-in multi-agent hand-offs, and first-class Bedrock support.

In [None]:
MODEL_ID = "openai.gpt-oss-20b-1:0"

In [None]:
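# Strands wrapper for the Bedrock model (us-west-2, as above); shared by both agents
bedrock = BedrockModel(model_id=MODEL_ID, region_name="us-west-2")

summarizer = Agent(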
    model=bedrock,
    system_prompt=(
        "You are SupportSummarizer. "
        "For each ticket in the list you receive:\n"
        "  * Produce JSON with keys id, summary (≤25 words), severity (low|medium|high)\n"
        "Return **only** a JSON array."
    ),
)

triager = Agent(
    model=bedrock,
    system_prompt=(
        "You are TriageBot. You get a JSON array of tickets that already "
        "contain severity. For every item output:\n"
        "  * id (same as input)\n"
        "  * route = 'escalate' if severity is high, else 'backlog'\n"
        "Respond with **only** a JSON array."
    ),
)

In [None]:
tickets = [
    {"id": "1", "text": "User cannot log in after resetting password – URGENT, blocks payroll."},
    {"id": "2", "text": "Minor typo in footer on marketing site."},
]

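import json

# Summarize: send the tickets (ids included) as a single JSON string prompt
json_summaries = str(summarizer(json.dumps(tickets)))

# Triage: feed the summarizer's JSON output to the second agent
routes = str(triager(json_summaries))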

print("Summaries:", json_summaries)
print("Routes   :", routes)
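Because both agents are asked to return only JSON, their output can be checked mechanically. The sketch below is one possible validation step, not part of Strands: `parse_summaries` and `expected_routes` are hypothetical helpers, and the sample string is hand-written rather than real model output. `expected_routes` applies the same rule the Triager prompt describes, so it can serve as a deterministic cross-check.

```python
import json

ALLOWED_SEVERITIES = {"low", "medium", "high"}


def parse_summaries(raw):
    """Parse the summarizer output and check the expected keys and values."""
    items = json.loads(raw)
    for item in items:
        if set(item) != {"id", "summary", "severity"}:
            raise ValueError(f"unexpected keys: {sorted(item)}")
        if item["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"unexpected severity: {item['severity']}")
    return items


def expected_routes(items):
    """Apply the triage rule from the prompt deterministically."""
    return [
        {"id": item["id"], "route": "escalate" if item["severity"] == "high" else "backlog"}
        for item in items
    ]


# Hand-written sample standing in for real model output
sample = (
    '[{"id": "1", "summary": "Login fails after reset", "severity": "high"}, '
    '{"id": "2", "summary": "Footer typo", "severity": "low"}]'
)
print(expected_routes(parse_summaries(sample)))
```

If the model wraps its answer in extra text or a code fence, `json.loads` will raise, which is a useful signal that the "only a JSON array" instruction was not followed.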
