In [2]:
!pip install --upgrade cerebras_cloud_sdk

Collecting cerebras_cloud_sdk
  Downloading cerebras_cloud_sdk-1.50.1-py3-none-any.whl.metadata (19 kB)
Downloading cerebras_cloud_sdk-1.50.1-py3-none-any.whl (91 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m91.8/91.8 kB[0m [31m2.5 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: cerebras_cloud_sdk
Successfully installed cerebras_cloud_sdk-1.50.1


In [11]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
cerebras_api_key = user_secrets.get_secret("CEREBRAS_API")
gemini_api_key = user_secrets.get_secret("GOOGLE_API_KEY")
hf_api_key = user_secrets.get_secret("HF_API_KEY")

In [18]:
from cerebras.cloud.sdk import Cerebras
import time

# Initialize the Cerebras client
client = Cerebras(api_key=cerebras_api_key)

# Function to make the GPT-OSS call and retrieve only the content
def gpt_oss(prompt,
            model="gpt-oss-120b", 
            temperature=1.0, 
            max_tokens=1024,
            verbose=False,
            max_tries=3):
    
    # Structure the message for the API
    messages = [
        {
            "role": "user",
            "content": prompt,
        }
    ]
    
    # Show details if verbose mode is on
    if verbose:
        print(f"Prompt:\n{prompt}\n")
        print(f"Model: {model}")
        print(f"Temperature: {temperature}")
        print(f"Max Tokens: {max_tokens}")
    
    # Attempt to get the response from the Cerebras API
    for num_tries in range(max_tries):
        try:
            # Call the chat completions API of Cerebras
            chat_completion = client.chat.completions.create(
                messages=messages,
                model=model,
                temperature=temperature,
                max_tokens=max_tokens
            )

            # If verbose, print the full response for debugging
            if verbose:
                print(f"Full Response Object: {chat_completion}")

            # Check if we have choices and return just the content
            if hasattr(chat_completion, 'choices') and len(chat_completion.choices) > 0:
                # Return the content directly from the response
                return chat_completion.choices[0].message.content

            # If the structure is unexpected, print the response and handle gracefully
            print("Unexpected response structure:", chat_completion)
            return None

        except Exception as e:
            # Report the failure, then retry unless this was the last attempt
            print(f"Error: {e}")
            print(f"Attempt {num_tries + 1}/{max_tries}")

            # Back off only when another attempt will follow
            if num_tries < max_tries - 1:
                wait_time = 2 ** num_tries  # Exponential backoff: 1s, 2s, 4s, ...
                print(f"Waiting for {wait_time} seconds before retrying...")
                time.sleep(wait_time)
    
    print(f"Tried {max_tries} times, but failed to get a valid response.")
    return None

# Example usage
response = gpt_oss("Why is fast inference important?", verbose=False)
print(response)

## In a nutshell  

Fast inference – the ability of a model to produce predictions in a very short amount of time – is a **must‑have** property for most deployed machine‑learning systems.  It impacts everything that matters to a product or a service: user experience, hardware cost, energy consumption, business revenue, and even the feasibility of the solution itself.

Below is a structured breakdown of **why fast inference matters**, illustrated with concrete examples and practical implications for engineers, product managers, and executives.

---

## 1. User‑Facing Reasons (Latency, UX, Trust)

| Reason | What it means for the user | Example |
|--------|---------------------------|---------|
| **Low latency → smoother interaction** | Humans start noticing delays around 100 ms; above ~300 ms the experience feels “sluggish”. | Voice assistants (Alexa, Siri) must turn speech → text → response in < 200 ms to feel conversational. |
| **Real‑time feedback** | Users need immediate answers to

In [19]:
# define the prompt
prompt = "Help me write a birthday card for my dear friend Andrew."

In [20]:
# pass prompt to the gpt_oss function, store output as 'response' then print
response = gpt_oss(prompt)
print(response)

**Front of the Card**  
*“Another year brighter, wiser, and more awesome—just like you, Andrew!”*  

---

**Inside the Card**

> Dear Andrew,
> 
> Happy Birthday! 🎉  
> 
> From the moment we first met, you’ve been a constant source of laughter, encouragement, and good‑vibes. Whether we’re pulling all‑nighters on crazy projects, sharing a cold beer after a long day, or just hanging out and swapping stories, every moment with you feels a little bit brighter.
> 
> This year, I hope you get everything you’ve been wishing for—more adventures, endless opportunities, and plenty of time to relax and do the things you love. May your day be filled with all the things that make you smile: great company, tasty cake, and maybe a surprise or two (because you deserve it!).
> 
> Thank you for being the kind of friend who’s always there, no matter what. Here’s to another amazing year of memories, success, and good times.  
> 
> Cheers to you, Andrew—may the year ahead be as fantastic as you are.
> 
> W

In [21]:
# Set verbose to True to see the full prompt that is passed to the model.
prompt = "Help me write a birthday card for my dear friend Andrew."
response = gpt_oss(prompt, verbose=True)

Prompt:
Help me write a birthday card for my dear friend Andrew.

Model: gpt-oss-120b
Temperature: 1.0
Max Tokens: 1024
Full Response Object: ChatCompletionResponse(id='chatcmpl-166d1f66-49b1-42bb-8c4d-e2d243ee67c8', choices=[ChatCompletionResponseChoice(finish_reason='stop', index=0, message=ChatCompletionResponseChoiceMessage(role='assistant', content='### A Sweet, Heart‑Felt Birthday Card for Andrew  \n\n#### 1. Classic & Warm  \n> **Front:** *“Happy Birthday, Andrew!”*  \n>   \n> **Inside:**  \n> \n> Dear Andrew,  \n> \n> From the moment we first met, I’ve been grateful for every laugh, every late‑night chat, and every adventure we’ve shared. You’re more than a friend—you’re a brother, a confidant, and a constant source of joy.  \n> \n> On your special day, I hope you’re surrounded by the same warmth and kindness you give to everyone around you. May this year bring you new dreams to chase, unforgettable moments, and all the happiness you deserve.  \n> \n> Cheers to many more birthd

In [23]:
### chat model
prompt = "What is the capital of France?"
response = gpt_oss(prompt, 
                 verbose=True)

Prompt:
What is the capital of France?

Model: gpt-oss-120b
Temperature: 1.0
Max Tokens: 1024
Full Response Object: ChatCompletionResponse(id='chatcmpl-2e63d112-c1b4-4e1a-9ce9-6d9ff3893384', choices=[ChatCompletionResponseChoice(finish_reason='stop', index=0, message=ChatCompletionResponseChoiceMessage(role='assistant', content='The capital of France is **Paris**.', reasoning='The user asks: "What is the capital of France?" Simple question. Answer: Paris. Probably also mention extra info.', tool_calls=None), logprobs=None)], created=1758377131, model='gpt-oss-120b', object='chat.completion', system_fingerprint='fp_5ab1f1b86af079c8f1af', time_info=ChatCompletionResponseTimeInfo(completion_time=0.026281932, prompt_time=0.00143548, queue_time=0.000318909, total_time=0.03572583198547363, created=1758377131), usage=ChatCompletionResponseUsage(completion_tokens=44, prompt_tokens=74, prompt_tokens_details=ChatCompletionResponseUsagePromptTokensDetails(cached_tokens=0), total_tokens=118), serv

In [24]:
print(response)

The capital of France is **Paris**.
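The verbose output above shows that the response object carries more than the answer text: each choice has a `finish_reason`, the message has a separate `reasoning` string, and `usage` holds token counts. A minimal sketch of pulling those fields out is given below. The helper name `summarize_completion` is hypothetical, and the stand-in object only mimics the attribute names visible in the printed response, so no API call is made.

```python
from types import SimpleNamespace


def summarize_completion(completion):
    """Collect a few fields from a chat completion for quick inspection."""
    choice = completion.choices[0]
    message = choice.message
    usage = getattr(completion, "usage", None)
    return {
        "content": message.content,
        # 'reasoning' appears in the printed response; guard in case it is absent
        "reasoning": getattr(message, "reasoning", None),
        "finish_reason": choice.finish_reason,
        "completion_tokens": getattr(usage, "completion_tokens", None),
        "total_tokens": getattr(usage, "total_tokens", None),
    }


# Stand-in object (assumption): mirrors the attribute names in the output above
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(
                content="The capital of France is **Paris**.",
                reasoning="Simple question. Answer: Paris.",
            ),
        )
    ],
    usage=SimpleNamespace(completion_tokens=44, prompt_tokens=74, total_tokens=118),
)

for key, value in summarize_completion(fake_completion).items():
    print(f"{key}: {value}")
```

With a real client, the same helper could be applied to the object returned by `client.chat.completions.create(...)`, assuming the attribute names match those in the printed response.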


In [25]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = gpt_oss(prompt, temperature=0.0)
print(response)

**Front of the Card**  
*(Printed on light‑blue cardstock with a subtle silhouette of a panda strolling along a shoreline)*  

> **“Happy Birthday, Andrew!”**  
> *May your day be as bright as a sunrise over the sea and as cozy as a quiet corner in your favorite bookstore.*

---

**Inside the Card**

Dear Andrew,

Another wonderful year has rolled in—just like the gentle tide you love to walk beside. I hope today you get to:

- **Stroll the beach** with the sand between your toes and the sound of waves cheering you on.  
- **Lose yourself in a bookstore**, flipping through pages that whisk you to new worlds (and maybe a few research papers that spark that brilliant mind of yours).  
- **Share your insights** on stage, because the world is better when you speak, and we’re all lucky to hear you.

May your birthday be painted in your favorite shade of light blue—calm, clear, and full of endless possibilities. And just for fun, here’s a little panda to keep you company:

```
   ʕ•ᴥ•ʔ
   Ha

In [27]:
# Run the code again - the output should be identical
response = gpt_oss(prompt, temperature=0.0)
print(response)

**Front of the Card**  
*(Printed on light‑blue cardstock with a subtle silhouette of a panda strolling along a shoreline)*  

> **“Happy Birthday, Andrew!”**  
> *May your day be as bright as a sunrise over the sea and as cozy as a quiet corner in your favorite bookstore.*

---

**Inside the Card**

Dear Andrew,

Another wonderful year has rolled in—just like the gentle tide you love to walk beside. I hope today you get to:

- **Stroll the beach** with the sand between your toes and the sound of waves cheering you on.  
- **Lose yourself in a bookstore**, flipping through pages that whisk you to new worlds (and maybe a few research papers that spark that brilliant mind of yours).  
- **Share your insights** on stage, because the world is better when you speak, and we’re all lucky to hear you.

May your birthday be painted in your favorite shade of light blue—calm, clear, and full of endless possibilities. And just for fun, here’s a little panda to keep you company:

```
   ʕ•ᴥ•ʔ
   Ha

In [28]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = gpt_oss(prompt, temperature=0.9)
print(response)

**Front Cover**  
*(Light‑blue background with a gentle illustration of a panda holding a tiny stack of books)*  

> **Happy Birthday, Andrew!**  
> *May your day be as bright as the sea‑foam and as cozy as a favorite bookstore nook.*

---

**Inside (Left Page)**  

*Picture of footprints leading along a sun‑kissed beach, ending at a little panda perched on a bench with a book.*

> “A walk on the beach clears the mind,  
>  A good book feeds the soul,  
>  And a panda—well, that’s just the perfect sprinkle of joy!”  

---

**Inside (Right Page)**  

Dear Andrew,

Another year of brilliant ideas, inspiring talks, and countless pages turned—what a wonderful journey you’ve charted!  

- May your **long walks on the beach** feel like a gentle tide of calm, each step a reminder of how far you’ve come.  
- May the **bookshelves** you love so much be filled with fresh discoveries, from the latest research paper to the next literary adventure.  
- May every **conference stage** you step onto s

In [29]:
# run the code again - the output should be different
response = gpt_oss(prompt, temperature=0.9)
print(response)

**Front of the Card**  
*(light‑blue background with a cute panda holding a tiny beach‑ball)*  

> **Happy Birthday, Andrew!**  
> *May your day be as bright as the sea and as cozy as a good book.*

---

**Inside (Left Page)**  

A little reminder of the things that make you *you*:

- 🌊 Long walks on the beach, where the sand tickles your feet and the waves whisper new ideas.  
- 📚 Getting lost among the aisles of a bookstore, where every shelf is a new adventure.  
- 🧠 Diving into research papers and emerging with fresh insights—your brain is always on a scholarly stroll.  
- 🎤 Owning the stage at conferences, turning complex thoughts into captivating stories.  
- 🐼 Sharing a smile with your favorite panda pals (they’re practically your spirit animals).  

---

**Inside (Right Page)**  

> **Dear Andrew,**  
> 
> On this special day, I wanted to thank you for the countless moments you’ve 
> brightened—whether it’s a thoughtful conversation after a beach walk, a 
> recommendation for a
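The temperature runs above behave differently because temperature rescales the model's token scores before sampling. The sketch below uses the standard textbook formulation (softmax over logits divided by the temperature, with temperature 0 treated as greedy selection). It is an illustration only: the logits are made up, the function name is hypothetical, and the hosted service may implement sampling differently.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all mass on the top score
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens (assumption for illustration)
logits = [2.0, 1.0, 0.5]

for t in (0.0, 0.5, 0.9):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Lower temperatures concentrate probability on the highest-scoring token, which is why repeated runs at `temperature=0.0` tend to match while runs at `temperature=0.9` diverge.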

### Changing the max tokens setting

In [31]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = gpt_oss(prompt, max_tokens=20)
print(response)

None
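The `None` above is worth a closer look. A likely explanation, given that the earlier verbose output shows a separate `reasoning` field, is that the 20-token budget was spent before any answer text was produced, leaving `content` empty. The sketch below shows one way to tell that case apart from other failures. The helper name is hypothetical, and the `"length"` value for `finish_reason` is an assumption based on the convention used by OpenAI-compatible APIs; the stand-in object avoids a real API call.

```python
from types import SimpleNamespace


def diagnose_empty_content(completion):
    """Explain why a completion has no visible answer text, if that is the case."""
    choice = completion.choices[0]
    if choice.message.content:
        return "ok"
    if choice.finish_reason == "length":
        # Assumed convention: 'length' means the max_tokens budget was exhausted
        return "truncated: token budget ran out before any answer text was produced"
    return f"empty content (finish_reason={choice.finish_reason!r})"


# Stand-in for a response cut off at max_tokens=20 (assumption, not real output)
truncated = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content=None, reasoning="The user wants a card"),
        )
    ]
)

print(diagnose_empty_content(truncated))
```

Calling `gpt_oss(prompt, max_tokens=20, verbose=True)` would print the full response object, which is the quickest way to confirm what actually happened in this run.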


In [32]:
with open("/kaggle/input/prompt-engineering-with-gpt-oss/TheVelveteenRabbit.txt", "r", encoding='utf-8') as file:
    text = file.read()

In [33]:
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = gpt_oss(prompt)

In [34]:
print(response)

On Christmas, a velveteen rabbit is gifted, ignored, and befriended by the wise Skin Horse, who teaches that true love makes a toy “real.” The boy cherishes the rabbit, which endures wear, illness, and seasons, eventually becoming real through the boy’s unwavering love and deep emotional bond that heals all.
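The prompt asked for a 50-word summary, and models do not always respect length limits exactly. A quick check is sketched below. The helper names `word_count` and `within_limit` are hypothetical, and splitting on whitespace is only a rough definition of a word.

```python
def word_count(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())


def within_limit(text, limit):
    """Return True when the text has at most `limit` words."""
    return word_count(text) <= limit


# Summary text copied from the output above
summary = (
    "On Christmas, a velveteen rabbit is gifted, ignored, and befriended by the "
    "wise Skin Horse, who teaches that true love makes a toy “real.” The boy "
    "cherishes the rabbit, which endures wear, illness, and seasons, eventually "
    "becoming real through the boy’s unwavering love and deep emotional bond "
    "that heals all."
)

print(f"words: {word_count(summary)}, within 50: {within_limit(summary, 50)}")
```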


In [38]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = gpt_oss(prompt)
print(response)

**Front Cover**  
*In a soft wash of light‑blue, a gentle wave curls across the page…*  

**Happy Birthday, Andrew!**  

---

**Inside (left)**  

> “May your days be as endless as a walk on the shore,  
>  and as comforting as the quiet hum of a bookstore.”  

---

**Inside (right)**  

Dear Andrew,

Another wonderful year has rolled in—just like the tide you love to follow on those long beach walks. I hope today brings you moments as peaceful as the soft sand beneath your feet and as exciting as flipping through a fresh stack of research papers.

May your next conference stage be filled with eager listeners, sharp questions, and that unmistakable spark you bring to every discussion. And when the applause fades, may you find a cozy nook in a bookstore (maybe with a light‑blue bookmark tucked in) where you can lose yourself in a good story—just the way you like it.

I’ve also heard the pandas are planning a surprise party in your honor (they’re great at keeping things *bear*‑y friendly

In [39]:
prompt_2 = """
Oh, he also likes teaching. Can you rewrite it to include that?
"""
response_2 = gpt_oss(prompt_2)
print(response_2)

Sure thing! Could you please share the original passage (or a quick summary of it) that you’d like revised? Once I have the text, I’ll weave in his love of teaching and give you a polished rewrite.
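The model asks for the original text because each call to `gpt_oss` sends a single user message, so the follow-up arrives with no record of the earlier card. One common way to carry context is to resend the earlier turns in the `messages` list. The sketch below only builds that list. The helper name is hypothetical, and the assistant text is a placeholder for whatever the first call actually returned.

```python
def build_followup_messages(first_prompt, first_response, followup_prompt):
    """Assemble a message list that includes the earlier exchange as context."""
    return [
        {"role": "user", "content": first_prompt},
        {"role": "assistant", "content": first_response},
        {"role": "user", "content": followup_prompt},
    ]


messages = build_followup_messages(
    first_prompt="Help me write a birthday card for my dear friend Andrew.",
    first_response="<text of the first card goes here>",  # placeholder
    followup_prompt="Oh, he also likes teaching. Can you rewrite it to include that?",
)

for message in messages:
    print(f"{message['role']}: {message['content']}")
```

The resulting list could be passed as `messages=` to `client.chat.completions.create(...)` in place of the single-message list that `gpt_oss` builds, assuming `response` from the earlier cell is supplied as `first_response`.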
