### Libraries Used

In [2]:
import ollama

### Understanding how to generate responses using Ollama LLMs

In [2]:
# 1. Define the question
user_question = "What is the capital of India?"

# 2. Send it to the model (Generation only)
# We are NOT providing external context yet.
response = ollama.chat(
    model='gemma3:4b',  # Use the name of a model you have pulled with Ollama
    messages=[{'role': 'user', 'content': user_question}]
)

# 3. Print the answer
print("AI Answer:", response['message']['content'])

AI Answer: The capital of India is **New Delhi**. 

It was officially designated as the capital in 1911, replacing Calcutta (now Kolkata). 😊

Do you want to know anything more about New Delhi or India in general?


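We read the answer with `response['message']['content']` because the response is structured like JSON: the reply text sits in the `content` field of the nested `message` object. The sample response below was produced with `phi3:3.8b`, so its wording differs from the output above: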

```json
{
  "model": "phi3:3.8b",
  "created_at": "2025-12-15T09:45:00.5105742Z",
  "done": true,
  "done_reason": "stop",
  "total_duration": 11129766800,
  "load_duration": 3728855400,
  "prompt_eval_count": 16,
  "prompt_eval_duration": 765201100,
  "eval_count": 94,
  "eval_duration": 6592721000,
  "message": {
    "role": "assistant",
    "content": "The capital of India is New Delhi. It became the political center in 1956 when the country's government moved from Kolkata (then called Calcutta) and now serves as a major hub for culture, education, and politics, housing Parliament House, which holds the Indian legislative bodies - Lok Sabha (House of the People), Rajya Sabha (Council of States), along with numerous other government offices.",
    "thinking": null,
    "images": null,
    "tool_name": null,
    "tool_calls": null
  },
  "logprobs": null
}
```

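The other fields of a response can be read the same way as the message content. The following is a minimal sketch, not a live call: `sample_response` is a hand-copied subset of the sample response with a shortened `content` value, and the code assumes the duration fields are reported in nanoseconds.

```python
# Hand-copied subset of the sample response (content shortened for brevity).
sample_response = {
    "model": "phi3:3.8b",
    "done": True,
    "done_reason": "stop",
    "eval_count": 94,             # number of tokens in the reply
    "eval_duration": 6592721000,  # assumption: measured in nanoseconds
    "message": {
        "role": "assistant",
        "content": "The capital of India is New Delhi.",
    },
}

# Nested access: first the 'message' object, then its 'content' field.
answer = sample_response["message"]["content"]

# Generation speed = tokens generated / seconds spent generating.
NANOSECONDS_PER_SECOND = 1_000_000_000
tokens_per_second = (
    sample_response["eval_count"]
    / sample_response["eval_duration"]
    * NANOSECONDS_PER_SECOND
)

print("Answer:", answer)
print(f"Speed: {tokens_per_second:.1f} tokens/s")
```

The same pattern applies to a live `response` object: swap `sample_response` for the value returned by `ollama.chat`.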


### Different Roles

| Role      | Analogy             | Purpose                                                    |
|-----------|---------------------|------------------------------------------------------------|
| System    | The Boss / Director | Defines behavior, tone, and rules.                         |
| User      | The Customer        | Asks the question.                                         |
| Assistant | The Transcript      | Holds the model's earlier replies as conversation history. |


```py
messages=[
    # 1. We set the context (optional but good)
    {'role': 'system', 'content': 'You are a helpful assistant.'},

    # 2. First turn
    {'role': 'user', 'content': 'Who wrote Harry Potter?'},
    {'role': 'assistant', 'content': 'J.K. Rowling wrote it.'}, # <--- WE INSERT THIS MANUALLY OR FROM PREVIOUS RESPONSE

    # 3. Current question (The AI now reads the line above and knows 'she' = Rowling)
    {'role': 'user', 'content': 'What year was she born?'}
]
```

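To keep a conversation going, each reply has to be appended to the message list before the next question is sent. Here is a minimal sketch of that loop. `add_turn` and `send_with_ollama` are hypothetical helper names, not part of the `ollama` library, and the model name is the one used earlier in this notebook.

```python
def add_turn(history, user_text, send):
    """Append the user's message, get a reply via `send`, and store the reply too."""
    history.append({'role': 'user', 'content': user_text})
    reply = send(history)  # `send` receives the full message list so far
    history.append({'role': 'assistant', 'content': reply})
    return reply


def send_with_ollama(messages, model='gemma3:4b'):
    """Send the message list to a local Ollama model and return the reply text."""
    import ollama  # imported lazily so add_turn can be used without a server
    response = ollama.chat(model=model, messages=messages)
    return response['message']['content']


# Start the history with the optional system message.
history = [{'role': 'system', 'content': 'You are a helpful assistant.'}]

# With a running Ollama server and the model pulled:
# add_turn(history, 'Who wrote Harry Potter?', send_with_ollama)
# add_turn(history, 'What year was she born?', send_with_ollama)
```

Passing the sender as an argument is a design choice for this sketch: it lets the history logic be tried out with a stand-in function before a model is involved.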
### Chunking

In [4]:
# 1. The text we want to convert into an embedding
text = "The cat watched carefully as the playful kitten chased a leaf across the driveway, only stopping when a car rolled slowly past the house."


# 2. Ask Ollama to create the embedding
# We use 'nomic-embed-text' because it is a dedicated text-embedding model.
response = ollama.embeddings(
    model='nomic-embed-text',
    prompt=text
)

# 3. Get the vector (The list of numbers)
vector = response['embedding']

# 4. Inspect the result
print(f"Text: {text}")
print(f"Vector Length: {len(vector)}") # How many dimensions?
print(f"First 5 numbers: {vector[:5]}") # Just a peek

Text: The cat watched carefully as the playful kitten chased a leaf across the driveway, only stopping when a car rolled slowly past the house.
Vector Length: 768
First 5 numbers: [1.0189101696014404, 0.19626553356647491, -2.8098843097686768, 0.09355314821004868, 1.6594434976577759]

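Embedding vectors like this one are typically compared with cosine similarity, which lets texts be ranked by how close they are in meaning. The following is a minimal sketch in plain Python. `cosine_similarity` is a hypothetical helper, and the short vectors are toy values, not real `nomic-embed-text` output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for 768-dimensional embeddings.
cat = [1.0, 0.2, 0.0]
kitten = [0.9, 0.3, 0.1]
car = [0.0, 0.1, 1.0]

print(cosine_similarity(cat, kitten))  # close to 1: similar direction
print(cosine_similarity(cat, car))     # close to 0: nearly unrelated
```

With real embeddings, the two arguments would be the `response['embedding']` lists returned for two different texts.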