# Chapter 5: Formatting Output and Speaking for Claude

- [Lesson](#lesson)
- [Exercises](#exercises)
- [Example Playground](#example-playground)

## Setup

Run the following setup cells to load the local model and tokenizer and to define the `get_completion` helper function.

In [None]:
import re
from mlx_lm import load, stream_generate

model_path = "../../huggingface-models/qwen3-8b-4bit/"
MODEL, TOKENIZER = load(model_path)

In [6]:
from IPython.display import Markdown, display

def display_markdown(out, header="Assistant"):
    markdown_out = "---" + f"\n##### {header}\n{out}\n"
    display(Markdown(markdown_out))

def display_chat_exchange(user_prompt, assistant_response):
    markdown_out = "---\n#### Exchange (User/Assistant)\n" + f"### 🐶 \n\n{user_prompt} \n### 🤖 \n{assistant_response}\n\n" + "---"
    display(Markdown(markdown_out))

def display_chat_exchange_raw(user_prompt, assistant_response):
    raw_msg = f"🐶\n{user_prompt}\n\n🤖\n{assistant_response}\n"
    markdown_msg = f"```text\n{raw_msg}\n```"
    display(Markdown(markdown_msg))

In [None]:
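def get_completion(
        prompt: str,
        system_prompt="", prefill="",
        max_tokens=32768,
        enable_thinking=False,
        print_stream=False):
    """Generate a reply with the local model.

    Returns a tuple of (decoded raw prompt, generated response text).
    """
    # Note: the system and assistant messages are always included, even when
    # `system_prompt` or `prefill` is empty. Combined with
    # add_generation_prompt=True, the template closes the prefill turn and
    # opens a second assistant turn, as the raw prompt in the lesson shows.
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": prefill},
    ]
    # Render the chat template to token ids (Qwen3 accepts `enable_thinking`).
    tokenized_prompt = TOKENIZER.apply_chat_template(
        messages,
        add_generation_prompt=True,
        enable_thinking=enable_thinking
    )

    total_response = []

    # Stream the generation, collecting (and optionally printing) each chunk.
    for response in stream_generate(
        MODEL,
        TOKENIZER,
        tokenized_prompt,
        max_tokens=max_tokens
    ):
        total_response.append(response.text)
        if print_stream:
            print(response.text, end="")

    return TOKENIZER.decode(tokenized_prompt), "".join(total_response)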
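
When `prefill` is empty, the helper above still appends an assistant message, so the rendered prompt contains an empty, closed assistant turn before generation starts. One possible way to avoid that is sketched below. The `build_chat_inputs` function is hypothetical (it is not part of this notebook), and the sketch assumes the underlying Hugging Face tokenizer supports the `continue_final_message` argument of `apply_chat_template`, which leaves the last assistant message open so the model continues it.

```python
def build_chat_inputs(prompt, system_prompt="", prefill=""):
    """Hypothetical helper: build messages plus chat-template keyword arguments.

    Empty system prompts and empty prefills are skipped instead of being
    rendered as empty turns.
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})

    if prefill:
        # Assumption: the tokenizer's apply_chat_template accepts
        # continue_final_message, so the assistant turn stays open.
        messages.append({"role": "assistant", "content": prefill})
        template_kwargs = {
            "add_generation_prompt": False,
            "continue_final_message": True,
        }
    else:
        # No prefill: let the template open a fresh assistant turn.
        template_kwargs = {"add_generation_prompt": True}

    return messages, template_kwargs
```

The returned pair could then be passed along as `TOKENIZER.apply_chat_template(messages, enable_thinking=enable_thinking, **template_kwargs)`. This has not been run against the model used here, so treat it as a starting point rather than a drop-in replacement.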

---

## Lesson

**Claude can format its output in a wide variety of ways**. You just need to ask!

One way is to use XML tags to separate the response from any superfluous text. You've already learned that XML tags can make your prompt clearer and more parseable to Claude. You can also ask Claude to **use XML tags to make its output clearer and more easily understandable** to humans.


### Examples

Remember the 'poem preamble problem' we solved in Chapter 2 by asking Claude to skip the preamble entirely? It turns out we can also achieve a similar outcome by **telling Claude to put the poem in XML tags**.

In [14]:
# Prompt template with a placeholder for the variable content
ANIMAL = "Bat"
SYSTEM_PROMPT = ""
PROMPT = f"Please write a haiku about {ANIMAL}. Put it in <haiku> tags."

raw_prompt, assistant_response = get_completion(
    PROMPT, 
    system_prompt=SYSTEM_PROMPT
)
display_markdown('```\n' + raw_prompt + '\n```', "Raw Input Prompt")
display_chat_exchange_raw(PROMPT, assistant_response)

---
##### Raw Input Prompt
```
<|im_start|>system
<|im_end|>
<|im_start|>user
Please write a haiku about Bat. Put it in <haiku> tags.<|im_end|>
<|im_start|>assistant
<think>

</think>

<|im_end|>
<|im_start|>assistant
<think>

</think>


```


```text
🐶
Please write a haiku about Bat. Put it in <haiku> tags.

🤖
<haiku>
Silent in the dark,  
eyes glow like distant stars—  
shadow's gentle friend.

```
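
Because the poem is wrapped in `<haiku>` tags, it can be pulled out of the response with a short regular expression. The sketch below is one way to do that; `extract_tag` is a hypothetical helper name, not something defined elsewhere in this notebook. It treats the closing tag as optional, since the sample response above ends without `</haiku>`.

```python
import re

def extract_tag(text, tag):
    """Return the content of the first <tag>...</tag> block, or None.

    If the closing tag is missing, everything after the opening tag is returned.
    """
    name = re.escape(tag)
    # Lazily match up to the closing tag, or to the end of the string.
    pattern = rf"<{name}>(.*?)(?:</{name}>|\Z)"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None
```

With the variables from the cell above, `extract_tag(assistant_response, "haiku")` would return just the poem, without the tags or any surrounding text.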