# Getting Started with Ollama: Few-Shot Prompting

In some cases, it is easier to show the model what you want than to tell it.

One way to do this is to seed the conversation with a few fabricated back-and-forth messages between user and assistant. This is called few-shot prompting.

The opposite of few-shot prompting is zero-shot prompting, where the model receives only an instruction and the question, with no examples (as in the previous examples).
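
To make the contrast concrete, here is a minimal sketch of the two message lists. The system prompt in `zero_shot` and the plain `user`/`assistant` roles for the example turn are illustrative assumptions, not taken from a real session.

```python
# Zero-shot: only an instruction and the real question, no examples.
zero_shot = [
    {"role": "system", "content": "Reply in Hindi written in Latin script."},  # assumed instruction
    {"role": "user", "content": "What is your name?"},
]

# Few-shot: a fabricated example turn comes before the real question,
# so the model can infer the expected pattern from it.
few_shot = [
    {"role": "system", "content": "You answer based on the pattern of the conversation."},
    {"role": "user", "content": "Hi, how are you?"},                       # fabricated example
    {"role": "assistant", "content": "Main accha hoon, aap kaise hain?"},  # fabricated example
    {"role": "user", "content": "What is your name?"},                     # real question
]

# Everything between the system prompt and the final question is an example.
example_messages = few_shot[1:-1]
print(f"zero-shot examples: {len(zero_shot) - 2}, few-shot examples: {len(example_messages)}")
```

Both lists end with the same real question; only the few-shot list shows the model what a good answer looks like.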

## Prerequisites
1. Make sure Python 3 is installed on your system.
2. Make sure Ollama is installed and running on your system.
3. Create a `.env` file and add the following line:
   ```
   OLLAMA_MODEL=<model_name>
   ```
   where `<model_name>` is the name of the local model you want to use (for example, `qwen2.5:latest`).
4. Create and activate a virtual environment:
   ```
   python3 -m venv venv
   source venv/bin/activate
   ```
5. Install the required libraries listed in the `requirements.txt` file:
   ```
   pip3 install -r requirements.txt
   ```

## Import Required Modules

In [1]:
from ollama import chat, ResponseError, pull    # Ollama chat API (comparable to the OpenAI chat completions API), its error type, and model pull
from dotenv import load_dotenv                  # Loads environment variables from a .env file
import os                                       # Reads the values of environment variables

## Load Environment Variables

In [2]:
# Load environment variables from .env file
load_dotenv()
MODEL = os.environ['OLLAMA_MODEL']
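
If `OLLAMA_MODEL` is missing, `os.environ['OLLAMA_MODEL']` raises a bare `KeyError`. A small helper can turn that into a clearer message. This is a sketch only; `get_model_name` is a hypothetical name and not part of Ollama or `python-dotenv`.

```python
import os


def get_model_name(env=os.environ, key="OLLAMA_MODEL"):
    """Return the configured model name, or raise a descriptive error.

    `env` is any mapping; it defaults to the process environment so the
    helper can be called right after `load_dotenv()`.
    """
    value = env.get(key, "").strip()
    if not value:
        raise RuntimeError(
            f"{key} is not set. Add `{key}=<model_name>` to your .env file."
        )
    return value


# Example with an explicit mapping instead of the real environment.
print(get_model_name({"OLLAMA_MODEL": "qwen2.5:latest"}))
```

In the notebook, `MODEL = get_model_name()` could stand in for the direct `os.environ` lookup.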

## Set the Assistant's Behavior with Few-Shot Examples

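In this example, we expect the assistant to respond in Hindi (written in Latin script), even though the user writes in English.
The `conversation` list contains a series of messages that simulate such a conversation.

To make clear that the example messages are not part of a real conversation
and should not be referred back to by the model,
set the message role to `system` and add a `name` field.
The value of the `name` field can be either `example_user` or `example_assistant`.
This convention comes from the OpenAI chat API; Ollama's message schema may not use the `name` field, in which case the examples are treated as plain `system` messages.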

In [3]:
conversation = [
    {"role": "system", "content": "You answer based on the pattern of the conversation."},
    {"role": "system", "name": "example_user", "content": "Hi, how are you?"},
    {"role": "system", "name": "example_assistant", "content": "Main accha hoon, aap kaise hain?"},
    {"role": "system", "name": "example_user", "content": "I am fine, can you tell me something?"},
    {"role": "system", "name": "example_assistant", "content": "Haan, bilkul! Aapko kya jaanana hai?"}
]
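
When the number of examples grows, writing each dictionary by hand becomes error-prone. The sketch below builds the same list from (user, assistant) pairs. The helper name `build_few_shot` is hypothetical; the message layout follows the cell above.

```python
def build_few_shot(system_prompt, examples):
    """Build a message list from a system prompt and (user, assistant) example pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        # Each pair becomes two `system` messages tagged with a `name` field.
        messages.append({"role": "system", "name": "example_user", "content": user_text})
        messages.append({"role": "system", "name": "example_assistant", "content": assistant_text})
    return messages


examples = [
    ("Hi, how are you?", "Main accha hoon, aap kaise hain?"),
    ("I am fine, can you tell me something?", "Haan, bilkul! Aapko kya jaanana hai?"),
]
conversation_sketch = build_few_shot(
    "You answer based on the pattern of the conversation.", examples
)
print(len(conversation_sketch))
```

Keeping the examples as plain pairs makes it easy to add, remove, or reorder them while experimenting.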

## Start Interactive Chat Loop

The loop continues until you type `exit` or `quit` (or interrupt the kernel).
In each iteration:
- The user is prompted to enter a question
- The question is added to the conversation history
- The AI responds based on the entire conversation, including the few-shot examples
- The AI's answer is appended to the `conversation` list, which holds the conversation history
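
The steps above can be sketched as a single-turn helper that is independent of Ollama. Here `ask` and `fake_send` are hypothetical names: `send` is any callable that takes the message list and returns the reply text, and `fake_send` stands in for the model so the sketch runs without a server. Removing the unanswered question on failure is a design choice of this sketch.

```python
def ask(conversation, question, send):
    """Run one chat turn and keep `conversation` consistent."""
    # 1. Add the user's question to the history.
    conversation.append({"role": "user", "content": question})
    try:
        # 2. Get a reply based on the entire conversation.
        answer = send(conversation)
    except Exception:
        # If the call fails, remove the unanswered question so a retry
        # does not leave two consecutive user messages in the history.
        conversation.pop()
        raise
    # 3. Record the reply so later turns can refer back to it.
    conversation.append({"role": "assistant", "content": answer})
    return answer


def fake_send(messages):
    """Stand-in for the model: echoes the last user message."""
    return f"(reply to: {messages[-1]['content']})"


history = [{"role": "system", "content": "You answer based on the pattern of the conversation."}]
reply = ask(history, "My name is Agni", fake_send)
print(reply)
```

With the names from this notebook, `send` could be `lambda msgs: chat(model=MODEL, messages=msgs).message.content`.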

In [4]:
while True:
    # Get user input
    question = input("Enter your question (exit to quit): ").strip()
    print(f"Question: {question}")

    if question.lower() in ['exit', 'quit']:
        print("Exiting the chat. Goodbye!")
        break

    # Add user question to conversation history
    conversation.append({"role": "user", "content": question})

    # Send the full conversation (few-shot examples plus history) to ollama.chat()
    try:
        response = chat(
            model = MODEL,
            messages = conversation,
            options = {             
                "temperature": 0.7,  
                "seed": 42          
            }
        )

        # Print the response for debugging
        print(f"DEBUG:: Complete response from LLM:\n{response.model_dump_json(indent=4)}")

        # Extract answer and print it
        answer = response.message.content
        print("\nAnswer from AI:")
        print(answer)

        # Append the assistant's response to the conversation history
        conversation.append({"role": "assistant", "content": answer})

    # Handle the case where the requested model is not installed
    except ResponseError as e:
        print('Error getting answer from AI:', e)
        conversation.pop()  # Drop the unanswered question so a retry does not duplicate it
        if e.status_code == 404: # Model not installed
            try:
                print('Pulling model:', MODEL)
                pull(MODEL)
                print('Model pulled successfully:', MODEL)
                print('Please ask the question again.')

            except Exception as pull_error:
                print('Error pulling model. Error:', pull_error)

    # Catch any other exceptions that occur during the request
    except Exception as e:
        print('Error getting answer from AI:', e)
        conversation.pop()  # Drop the unanswered question here as well

Question: My name is Agni
DEBUG:: Complete response from LLM:
{
    "model": "qwen2.5:latest",
    "created_at": "2025-08-27T01:28:51.294083Z",
    "done": true,
    "done_reason": "stop",
    "total_duration": 2518867125,
    "load_duration": 63846041,
    "prompt_eval_count": 73,
    "prompt_eval_duration": 115238208,
    "eval_count": 46,
    "eval_duration": 2338493083,
    "message": {
        "role": "assistant",
        "content": "Achchi nām hai, Agni! Main aapko kyunki main ne apne namse ko zyada rukhana suna hai. Kuchh aur baat kar lenge?",
        "thinking": null,
        "images": null,
        "tool_name": null,
        "tool_calls": null
    }
}

Answer from AI:
Achchi nām hai, Agni! Main aapko kyunki main ne apne namse ko zyada rukhana suna hai. Kuchh aur baat kar lenge?
Question: What is your name?
DEBUG:: Complete response from LLM:
{
    "model": "qwen2.5:latest",
    "created_at": "2025-08-27T01:29:02.77011Z",
    "done": true,
    "done_reason": "stop",
    "total_