In [27]:
from mlx_lm import load, stream_generate


In [28]:
model_path = "../../huggingface-models/qwen3-8b-4bit/"
model, tokenizer = load(model_path)

In [57]:
prompt = "Write a story about a man climbing in the treacherous mountain of K2."
messages = [{"role": "user", "content": prompt}]

In [58]:
def raw_prompt_without_generation(tokenizer, messages):
    # Token ids for the conversation only; no assistant turn is opened.
    return tokenizer.apply_chat_template(
        messages, add_generation_prompt=False, enable_thinking=True
    )

def raw_prompt_with_generation(tokenizer, messages):
    # Same conversation, with the assistant turn opened for generation.
    return tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, enable_thinking=True
    )

def raw_prompt_without_thinking(tokenizer, messages):
    # Assistant turn opened, with thinking mode switched off in the template.
    return tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, enable_thinking=False
    )

def raw_prompt_with_system(tokenizer, messages):
    # Prepend a system message (the same line repeated 10 times) before templating.
    system_message = {"role": "system", "content": 10 * "You are an AI!\n"}
    updated_messages = [system_message] + messages
    return tokenizer.apply_chat_template(
        updated_messages, add_generation_prompt=True, enable_thinking=True
    )
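
To make the effect of the two flags easier to see without loading a model, here is a minimal sketch that renders a ChatML-style prompt as a plain string. `render_chatml` is a hypothetical helper, not part of `mlx_lm` or the tokenizer, and it only approximates the real template. In particular, the empty `<think>` block appended when thinking is disabled is an assumption about how the Qwen3 template behaves.

```python
def render_chatml(messages, add_generation_prompt=True, enable_thinking=True):
    """Approximate a ChatML-style chat template as a string (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues as the assistant.
        parts.append("<|im_start|>assistant\n")
        if not enable_thinking:
            # Assumption: the template pre-fills an empty thinking block.
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


demo_messages = [{"role": "user", "content": "Hi"}]
print(render_chatml(demo_messages, add_generation_prompt=False))
print(render_chatml(demo_messages, add_generation_prompt=True))
print(render_chatml(demo_messages, add_generation_prompt=True, enable_thinking=False))
```

The real template is applied by `tokenizer.apply_chat_template` and returns token ids rather than a string, so this sketch is only useful for reasoning about the prompt layout.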

In [59]:
tok_without_generation = raw_prompt_without_generation(tokenizer, messages)
tok_with_generation = raw_prompt_with_generation(tokenizer, messages)
tok_without_thinking = raw_prompt_without_thinking(tokenizer, messages)
tok_with_system = raw_prompt_with_system(tokenizer, messages)

# Decode each token list back to text and print it under a titled separator.
sections = {
    "Without Generation": tok_without_generation,
    "With Generation": tok_with_generation,
    "Without Thinking": tok_without_thinking,
    "With System": tok_with_system,
}

print(100 * "-")
for title, tokens in sections.items():
    print(f"{title}:\n" + 100 * "-" + "\n", tokenizer.decode(tokens))
    print(100 * "-")

----------------------------------------------------------------------------------------------------
Without Generation:
----------------------------------------------------------------------------------------------------
 <|im_start|>user
Write a story about a man climbing in the treacherous mountain of K2.<|im_end|>

----------------------------------------------------------------------------------------------------
With Generation:
----------------------------------------------------------------------------------------------------
 <|im_start|>user
Write a story about a man climbing in the treacherous mountain of K2.<|im_end|>
<|im_start|>assistant

----------------------------------------------------------------------------------------------------
Without Thinking:
----------------------------------------------------------------------------------------------------
 <|im_start|>user
Write a story about a man climbing in the treacherous mountain of K2.<|im_end|>
<|im_start|>assistant
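
Comparing the first two decoded prompts above, the generation prompt looks like a pure suffix on the conversation. A small sketch to check that on strings follows; `added_suffix` is a hypothetical helper and the two sample strings are shortened stand-ins, not the actual decoded prompts.

```python
def added_suffix(base, extended):
    """Return what `extended` appends to `base`, or None if `base` is not a prefix."""
    if not extended.startswith(base):
        return None
    return extended[len(base):]


# Shortened stand-ins for the decoded prompts.
without_generation = "<|im_start|>user\nHi<|im_end|>\n"
with_generation = "<|im_start|>user\nHi<|im_end|>\n<|im_start|>assistant\n"

print(repr(added_suffix(without_generation, with_generation)))
# prints '<|im_start|>assistant\n'
```

In the notebook, the same check could be run on `tokenizer.decode(tok_without_generation)` and `tokenizer.decode(tok_with_generation)`.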

In [60]:
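def next_token_preds(model, tokenizer, tokenized_prompt):
    """Stream the model's continuation of an already-tokenized prompt to stdout."""
    # max_tokens=2**15 allows up to 32768 new tokens before generation is cut off.
    for response in stream_generate(model, tokenizer, tokenized_prompt, max_tokens=2**15):
        print(response.text, end="", flush=True)
    print()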
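
Since the function above only prints, it can help to also keep the generated text for later inspection. The sketch below is one way to do that. `collect_stream` is a hypothetical helper that accepts any iterable of objects with a `.text` attribute, which is the attribute the loop above reads from each `stream_generate` response. The fake responses are placeholders so the example runs without a model.

```python
from types import SimpleNamespace


def collect_stream(responses, echo=False):
    """Join the `.text` chunks of a response stream, optionally echoing them."""
    chunks = []
    for response in responses:
        chunks.append(response.text)
        if echo:
            print(response.text, end="", flush=True)
    if echo:
        print()
    return "".join(chunks)


# Placeholder stream standing in for stream_generate(model, tokenizer, prompt, ...).
fake_responses = [SimpleNamespace(text="Hello"), SimpleNamespace(text=", K2")]
collected = collect_stream(fake_responses)
print(collected)
```

With a loaded model, the call would presumably look like `collect_stream(stream_generate(model, tokenizer, tok_with_generation, max_tokens=512), echo=True)`, where `512` is an arbitrary illustrative limit.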


In [61]:
next_token_preds(model, tokenizer, tok_without_generation)

<|im_start|>
<|im_start|>
Okay, I need to write a story about a man climbing K2. Let me start by thinking about the key elements. First, K2 is known as the second-highest mountain in the world, right? It's also considered one of the most dangerous mountains to climb. So the story should highlight the danger and the challenges.

The main character is a man. Maybe he's an experienced climber, but perhaps there's a personal motivation. Maybe he's climbing for a loved one, or to prove something. That could add depth. Let me think... Maybe he lost someone before, and this climb is a way to honor them. That could give emotional stakes.

Setting the scene: The mountain is treacherous. I should describe the harsh environment—extreme cold, thin air, unpredictable weather. Maybe include some specific dangers like crevasses, avalanches, or the difficulty of the route. Also, the physical and mental toll on the climber.

I need to build tension. Maybe the climber faces a critical moment where he ha

In [62]:
next_token_preds(model, tokenizer, tok_with_generation)

<think>
Okay, the user wants a story about a man climbing K2. Let me start by thinking about the key elements. K2 is known for being one of the most dangerous mountains, so I need to highlight the challenges. Maybe focus on the physical and mental struggles. The protagonist should be someone relatable, maybe with a personal motivation. Perhaps he's an experienced climber with a past failure or a personal loss. 

I should include the harsh environment—extreme weather, thin air, avalanches. Maybe some specific dangers like crevasses or storms. The story needs tension and a climax. Maybe he faces a critical moment where he has to make a tough decision. 

Also, the user might want a realistic portrayal, so I should research some real aspects of K2 climbing. Maybe mention the altitude, the lack of oxygen, the teamwork involved. But since it's a story, I can take some creative liberties. 

I need a strong opening to hook the reader. Maybe start with him at the summit, then flashback to his j
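
The output above opens with a `<think>` tag, so the reasoning and the final story arrive in one stream. A minimal sketch for separating the two is shown here. `split_thinking` is a hypothetical helper, and the closing `</think>` tag is an assumption, since the truncated output above does not reach it.

```python
def split_thinking(text, open_tag="<think>", close_tag="</think>"):
    """Return (reasoning, answer) from generated text; either part may be empty."""
    start = text.find(open_tag)
    end = text.find(close_tag)
    if end == -1:
        if start == -1:
            # No thinking markers at all: treat everything as the answer.
            return "", text.strip()
        # Opened but never closed, e.g. generation stopped mid-thought.
        return text[start + len(open_tag):].strip(), ""
    # If the opening tag is missing, assume the reasoning starts at position 0.
    reasoning_start = start + len(open_tag) if start != -1 and start < end else 0
    reasoning = text[reasoning_start:end].strip()
    answer = text[end + len(close_tag):].strip()
    return reasoning, answer


reasoning, answer = split_thinking("<think>\nPlan the plot.\n</think>\n\nThe wind howled.")
print(reasoning)
print(answer)
```

The missing-opening-tag branch also covers the case where the text starts directly with reasoning and only the closing tag is present.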

In [63]:
next_token_preds(model, tokenizer, tok_without_thinking)

**Title: The Summit of Shadows**

The wind howled like a vengeful spirit, tearing at the edges of the sky. Above the jagged teeth of K2, the world seemed to hold its breath. At 8,611 meters, the air was thin, the cold relentless, and the path treacherous. But for Marcus Hale, the mountain was not just a challenge—it was a calling.

Marcus had spent years preparing for this moment. A former mountaineer turned wilderness guide, he had led expeditions across the Andes and the Himalayas, but K2 had always been the ultimate test. It was the second-highest mountain in the world, known for its unpredictable weather, deadly crevasses, and the ever-present specter of death. Yet, for Marcus, it was a place of redemption.

He had lost his wife, Clara, two years ago in a sudden avalanche on Denali. The grief had been a weight he carried, heavy and unrelenting. He had vowed to climb K2 not for glory, but to confront the silence of the mountains and find some measure of peace.

The team was small—fi