# Bedrock DeepSeek-R1 Model Support
In this notebook we quickly explore how you can use the popular [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1#usage-recommendations) model via Amazon Bedrock. Within Bedrock this model is now available as a fully managed serverless option that you can invoke via the standard [Runtime Client](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html) `InvokeModel` API call or, even more simply, through the [Converse API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html).

## Credits/References & Additional Resources
- [Official AWS Blog](https://aws.amazon.com/blogs/aws/deepseek-r1-now-available-as-a-fully-managed-serverless-model-in-amazon-bedrock/)
- [DeepSeek with Bedrock Guardrails](https://aws.amazon.com/blogs/machine-learning/protect-your-deepseek-model-deployments-with-amazon-bedrock-guardrails/)

## Sample Invocation Via Invoke Model API Call
Docs: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/invoke_model.html

In [None]:
import json
import boto3

# Setup Bedrock runtime client, working in us-west-2
runtime = boto3.client("bedrock-runtime", region_name="us-west-2")
model_id = "us.deepseek.r1-v1:0"

# Sample payload, replace with your request
text_payload = {
    "prompt": "<｜begin_of_sentence｜><｜User｜>Who is Roger Federer?<｜Assistant｜><think>\n",
    "max_tokens": 1000,
    "temperature": 0.6,
    "top_p": 0.9
}
body_payload = json.dumps(text_payload)

# sample inference
response = runtime.invoke_model(
    body=body_payload,
    modelId=model_id,
    accept="application/json",
    contentType="application/json"
)
response_body = json.loads(response["body"].read())
generated_text = response_body.get("choices", [{}])[0].get("text", "")
print(generated_text)
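
The raw prompt above follows DeepSeek's chat template, and because it ends with an opened `<think>` tag, the returned text typically holds the reasoning trace first, then a closing `</think>` tag, then the final answer. Below is a minimal sketch for building the prompt and separating the two parts, assuming that output layout. `build_prompt` and `split_reasoning` are hypothetical helper names, not part of boto3 or Bedrock, and the sample string is hand-written rather than real model output.

```python
def build_prompt(question: str) -> str:
    """Wrap a user question in the DeepSeek-R1 template used in the cell above."""
    return f"<｜begin_of_sentence｜><｜User｜>{question}<｜Assistant｜><think>\n"


def split_reasoning(generated_text: str) -> tuple[str, str]:
    """Split model output into (reasoning, answer).

    Assumption: the reasoning ends at the first "</think>" tag. If the tag is
    missing (for example, the output was cut off by max_tokens), the whole
    text is treated as reasoning and the answer is empty.
    """
    reasoning, tag, answer = generated_text.partition("</think>")
    if not tag:
        return generated_text.strip(), ""
    return reasoning.strip(), answer.strip()


# Hand-written string standing in for a model response (not real output)
sample_output = "The user asks about a tennis player.\n</think>\n\nRoger Federer is a Swiss former tennis player."
reasoning, answer = split_reasoning(sample_output)
print(answer)
```

If the answer comes back empty, the reasoning was probably truncated, and raising `max_tokens` in the payload is a reasonable first thing to try.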

## Invoke Via Converse API
The Converse API provides a standard request format across models, so you don't have to restructure the payload for each model provider. This makes it simple to test the same payload across different providers.

In [None]:
import boto3

# Structure payload, replace user_message with your request
user_message = "Who is Roger Federer?"
conversation = [
    {
        "role": "user",
        "content": [{"text": user_message}],
    }
]

# sample inference via Converse API: same runtime client as above, different method
response = runtime.converse(
    modelId=model_id,
    messages=conversation,
    inferenceConfig={"maxTokens": 2000, "temperature": 0.6},
)
response_text = response["output"]["message"]["content"][0]["text"]
print(response_text)
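
For reasoning models, a Converse response can carry more than one content block: alongside `text` blocks, the API defines a `reasoningContent` block type. Indexing `content[0]["text"]` therefore relies on the first block being plain text. Below is a hedged sketch that walks all blocks instead. `extract_converse_text` is a hypothetical helper, and the mock response is hand-written for illustration, not real model output.

```python
def extract_converse_text(response: dict) -> tuple[str, str]:
    """Return (answer_text, reasoning_text) from a Converse API response dict."""
    answer_parts, reasoning_parts = [], []
    for block in response["output"]["message"]["content"]:
        if "text" in block:
            answer_parts.append(block["text"])
        elif "reasoningContent" in block:
            # reasoningText may be absent (e.g. redacted content), so use .get
            reasoning_text = block["reasoningContent"].get("reasoningText", {})
            reasoning_parts.append(reasoning_text.get("text", ""))
    return "".join(answer_parts), "".join(reasoning_parts)


# Hand-written mock with the same nesting as a Converse response
mock_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"reasoningContent": {"reasoningText": {"text": "Recall key facts."}}},
                {"text": "Roger Federer is a Swiss former tennis player."},
            ],
        }
    }
}
answer_text, reasoning_text = extract_converse_text(mock_response)
print(answer_text)
```

In the notebook, the same helper could be called on the `response` returned by `runtime.converse(...)`.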

## Sample Inference with HuggingFace (HF) Math Dataset
Here we iterate over a sample HF math dataset to showcase R1's advanced reasoning capabilities. You can swap in different datasets to test the model further. In coming sections we will also explore how to evaluate LLMs across different tasks.

- <b>Dataset</b>: https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k
- <b>License</b>: MIT

In [None]:
!pip install datasets --quiet

In [None]:
# load sample dataset to work with
from datasets import load_dataset
ds = load_dataset("microsoft/orca-math-word-problems-200k")

In [None]:
# preview sample input questions (the slice 1:10 returns 9 questions and skips index 0)
ds['train'][1:10]['question']

In [None]:
# wrap invoke model API call into a function to simplify code a little as we iterate over the dataset
def inference(question: str) -> str:
    text_payload = {
        "prompt": f"<｜begin_of_sentence｜><｜User｜>{question}<｜Assistant｜><think>\n",
        "max_tokens": 1000,
        "temperature": 0.6,
        "top_p": 0.9
    }
    body_payload = json.dumps(text_payload)
    # invoke_model API call
    response = runtime.invoke_model(
        body=body_payload,
        modelId=model_id,
        accept="application/json",
        contentType="application/json"
    )
    # parse model output
    response_body = json.loads(response["body"].read())
    generated_text = response_body.get("choices", [{}])[0].get("text", "")
    return generated_text

# iterate over some samples in the dataset and run inference with the DeepSeek-R1 model
for i, question in enumerate(ds['train'][1:10]['question'], start=1):
    print(f"Question {i}: {question}")
    model_output = inference(question)
    print(f"Generated Answer: {model_output}")
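
To prepare for the evaluation work mentioned above, it can help to store each model output next to its reference answer rather than only printing it. Below is a minimal sketch under stated assumptions: `collect_results` is a hypothetical helper, and `fake_inference` is a stand-in for the `inference` function so the example runs without AWS access.

```python
from typing import Callable


def collect_results(
    questions: list[str],
    references: list[str],
    infer_fn: Callable[[str], str],
) -> list[dict]:
    """Run infer_fn on each question and pair the output with its reference."""
    records = []
    for question, reference in zip(questions, references):
        records.append(
            {
                "question": question,
                "reference": reference,
                "model_output": infer_fn(question),
            }
        )
    return records


def fake_inference(question: str) -> str:
    """Stand-in for the notebook's `inference` function (no Bedrock call)."""
    return f"(model output for: {question})"


records = collect_results(["What is 2 + 3?"], ["2 + 3 = 5"], fake_inference)
print(records[0]["model_output"])
```

With the notebook's objects, the call might look like `collect_results(batch["question"], batch["answer"], inference)` where `batch = ds["train"][1:10]`, assuming the dataset exposes an `answer` column alongside `question`.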