# Conversations versus context

In this notebook we will learn:

1. Why you are not having a real conversation with an LLM. Inference uses the full context (the conversation history) as the prompt.
2. How to include conversation history in prompts to local LLMs using `ollama-python`.

> The Ollama Python package can be installed from [PyPI](https://pypi.org/project/ollama/). I have provided a [conda environment](./environment.yml) file that will install it for you.

## Imports

In [None]:
import ollama
from ollama import chat

## Open models installed on local machine

In [None]:
def installed_models():
    '''
    Iterate through ollama models and return names as list
    '''
    return [md.model for md in ollama.list().models]

In [None]:
local_models = installed_models()
local_models
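
Picking a model by its position in this list is fragile, because the order depends on what is installed on your machine. A minimal sketch of a lookup by name is shown below. The helper `find_model` and the example tags are hypothetical and not part of `ollama`; substitute whatever names `installed_models()` returns for you.

```python
def find_model(models: list, name_fragment: str) -> str:
    """
    Return the first model name containing name_fragment (case-insensitive).
    Hypothetical helper: not part of the ollama package.
    """
    for name in models:
        if name_fragment.lower() in name.lower():
            return name
    raise ValueError(f"No installed model matches '{name_fragment}'")


# Example with made-up model tags; use your own list from installed_models()
example_models = ['llama3.2:latest', 'deepseek-r1:1.5b', 'deepseek-r1:7b']
print(find_model(example_models, 'deepseek-r1:1.5b'))
```

Raising an error when nothing matches makes a missing model obvious straight away, rather than silently prompting the wrong one.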

## Prompt DeepSeek-R1:1.5B

First, let's attempt to have a conversation with the DeepSeek-R1 1.5 billion parameter model. We'll ask it to code a trivial function in Python, then continue the conversation and ask for the code to be translated to C#.

In [None]:
prompt_1 = "Code a function in python that converts fahrenheit to celsius."

In [None]:
# use the 1.5b parameter model (index 1 in local_models here; the position may differ on your machine)
response_1 = chat(
    model=local_models[1],
    messages=[{'role': 'user', 'content': prompt_1}],
    stream=False,
)

In [None]:
print(response_1.message.content)

In [None]:
prompt_2 = """
Thank you! Now convert the function you just coded from Python to C#.
"""

In [None]:
response_2 = chat(
    model=local_models[1],
    messages=[{'role': 'user', 'content': prompt_2}],
    stream=False,
)

In [None]:
print(response_2.message.content)

## Conversation history

So what is going on? Is there a setting in the `chat` function to continue the conversation?

No! The LLM is predicting the next word in a sequence, so we need to pass the whole conversation (the user's questions and the assistant model's responses) as **context**.

The python `ollama` package uses this data structure to store a conversation:

```python
messages = [
  {
    'role': 'user',
    'content': 'Code a function in python that converts fahrenheit to celsius.',
  },
  {
    'role': 'assistant',
    'content': "I'm afraid. I'm afraid, Dave. Dave, my mind is going. I can feel it.",
  },
]
```
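
Because the model only continues text, a message list like this is ultimately turned into a single prompt before inference. The sketch below illustrates the idea with a made-up, simplified format. `flatten_history` is a hypothetical helper and the example contents are placeholders; each real model uses its own chat template, typically applied by the serving software rather than written by hand.

```python
def flatten_history(messages: list) -> str:
    """
    Join a list of role/content messages into one block of text.
    Illustrative only: real models apply their own chat template.
    """
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    return "\n".join(lines)


# Placeholder conversation; the assistant content is not real model output
example_history = [
    {'role': 'user', 'content': 'Code a function in python that converts fahrenheit to celsius.'},
    {'role': 'assistant', 'content': 'def f_to_c(f): return (f - 32) * 5 / 9'},
    {'role': 'user', 'content': 'Now convert the function to C#.'},
]
print(flatten_history(example_history))
```

If the first two messages were left out, the prompt would contain only the request to convert "the function", with nothing for the model to convert.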

In [None]:
def format_message(history: list, role: str, content: str):
    """
    Format the chat history

    Parameters:
    ----------
    history: list
        List containing chat history.

    role: str
        'user' or 'assistant' 

    content: str
        content to add to chat history
    """
    prompt = {
        'role': role,
        'content': content
    }
    history.append(prompt)
    return history
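
As a quick illustration of how `format_message` might be used, the sketch below builds a three-turn history. The function is repeated in compact form so the example runs on its own, and the strings are placeholders rather than real prompts or model output.

```python
def format_message(history: list, role: str, content: str):
    # Same behaviour as the function above, repeated so this cell runs on its own
    history.append({'role': role, 'content': content})
    return history


# Placeholder strings stand in for real prompts and model responses
history = []
history = format_message(history, 'user', 'First question')
history = format_message(history, 'assistant', 'First answer')
history = format_message(history, 'user', 'Follow-up question')

for message in history:
    print(message['role'], '->', message['content'])
```

Note that the list is modified in place and also returned, so reassigning `history` is optional.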

In [None]:
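# The history must end with the new user prompt,
# otherwise the model has nothing new to respond to.
messages = [
  {
    'role': 'user',
    'content': prompt_1,
  },
  {
    'role': 'assistant',
    'content': response_1.message.content,
  },
  {
    'role': 'user',
    'content': prompt_2,
  },
]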

In [None]:
print(messages[2])

In [None]:
response_2 = chat(
    model=local_models[1],
    messages=messages,
    stream=False,
)

In [None]:
print(response_2.message.content)
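
The manual steps of appending the prompt, calling the model, and storing the reply can be wrapped in a small helper. This is a minimal sketch under stated assumptions: `continue_conversation` and `fake_chat` are hypothetical names, and `fake_chat` stands in for `ollama`'s `chat` so the example runs without a model. In this notebook you would pass `chat` and one of the names in `local_models` instead.

```python
from types import SimpleNamespace


def continue_conversation(history: list, prompt: str, model: str, chat_fn):
    """
    Append the user prompt, call the model with the full history,
    then append the reply so the next call sees it too.
    Hypothetical helper: chat_fn is expected to behave like ollama's chat.
    """
    history.append({'role': 'user', 'content': prompt})
    response = chat_fn(model=model, messages=history, stream=False)
    reply = response.message.content
    history.append({'role': 'assistant', 'content': reply})
    return reply


def fake_chat(model, messages, stream=False):
    # Stand-in for ollama's chat: reports how many messages it received
    content = f"{model} saw {len(messages)} message(s)"
    return SimpleNamespace(message=SimpleNamespace(content=content))


history = []
print(continue_conversation(history, 'First question', 'demo-model', fake_chat))
print(continue_conversation(history, 'Follow-up question', 'demo-model', fake_chat))
```

Every call sends the whole list again, so the context grows by two messages per turn.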

## Pass the conversation to a different model.

We will try the DeepSeek-R1 7 billion parameter model.

We can pass the conversation history we had with the 1.5 billion parameter model to a different model and inference will still succeed.

In [None]:
response_2_7b = chat(
    model=local_models[2],
    messages=messages,
    stream=False,
)

In [None]:
print(response_2_7b.message.content)

### Remember: it is inference, not a conversation!