## Llama chats with Blender

This notebook shows how a Llama model, served through Ollama, chats with BlenderBot through an EMISSOR client. The EMISSOR layer captures the interaction as a scenario for further analysis.

Ollama: https://github.com/ollama/ollama

### Prerequisites

* pip install emissor
* pip install cltl.combots
* pip install ollama
* pip install transformers

### Loading Llama

In [1]:
import ollama
llama_model = "llama3.2:1b" ### 1B
#llama_model = "llama3.2" ### 3B

In [2]:
instruct = { 'role': 'system', 'content': "You are a docter and you will receive questions from patients. Be brief and no more than two sentences."}
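
The Ollama chat call takes a list of messages, each a dictionary with a `role` and a `content` key. The sketch below shows one way to build such a list consistently. The helper names `make_message` and `start_history` are hypothetical and not part of Ollama or EMISSOR, and the restriction to three roles is an assumption made for this notebook.

```python
from typing import Dict, List

Message = Dict[str, str]

# Roles assumed for this notebook; the helper rejects anything else.
ALLOWED_ROLES = {"system", "user", "assistant"}


def make_message(role: str, content: str) -> Message:
    """Build one chat message (hypothetical helper)."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role}")
    return {"role": role, "content": content}


def start_history(system_prompt: str) -> List[Message]:
    """Start a dialogue history with a single system instruction."""
    return [make_message("system", system_prompt)]


# Example: a system instruction followed by one exchange.
demo_history = start_history("Be brief and no more than two sentences.")
demo_history.append(make_message("assistant", "How can I help?"))
demo_history.append(make_message("user", "I have a headache."))
demo_roles = [m["role"] for m in demo_history]
```

A list built this way can be passed as the `messages` argument of `ollama.chat`.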

### Loading Blender

In [3]:
from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration
mname = 'facebook/blenderbot-400M-distill'
blender_model = BlenderbotForConditionalGeneration.from_pretrained(mname)
blender_tokenizer = BlenderbotTokenizer.from_pretrained(mname)

In [4]:
context_size = 5
def get_answer_from_blender(prompt: str, history_list: list) -> str:
    # Concatenate the first `context_size` turns of the history as context for BlenderBot
    context = ""
    for turn in history_list[:context_size]:
        context += turn['content'] + ". "
    input_prompt = context + prompt
    # truncation=True caps the input at the tokenizer's maximum length
    bot_input_ids = blender_tokenizer(input_prompt, return_tensors='pt', truncation=True)
    reply_ids = blender_model.generate(**bot_input_ids)
    # skip_special_tokens=True drops markers such as <s> and </s> from the decoded text
    utterances = blender_tokenizer.batch_decode(reply_ids, skip_special_tokens=True)
    return utterances[0].strip()
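
The text handling around the BlenderBot call can be tried out without loading any model. The sketch below is a minimal illustration, not the notebook's implementation: `build_context` and `remove_end_marker` are hypothetical helpers. It also shows why `str.strip('</s>')` is a risky way to remove an end-of-sequence marker: `strip` treats its argument as a set of characters, not as a substring.

```python
from typing import Dict, List


def build_context(prompt: str, history_list: List[Dict[str, str]], context_size: int = 5) -> str:
    """Prefix the prompt with the first `context_size` turns of the history."""
    turns = [turn["content"] for turn in history_list[:context_size]]
    return "".join(text + ". " for text in turns) + prompt


def remove_end_marker(text: str, marker: str = "</s>") -> str:
    """Remove one trailing marker as a whole substring, then trim whitespace."""
    text = text.strip()
    if text.endswith(marker):
        text = text[: -len(marker)]
    return text.strip()


sample_history = [
    {"role": "system", "content": "Be brief"},
    {"role": "assistant", "content": "Hello"},
]
context = build_context("How are you?", sample_history)

# strip('</s>') removes any of the characters '<', '/', 's', '>' from both ends,
# so it also eats the final 's' of the word itself.
stripped_chars = "yes</s>".strip("</s>")
clean = remove_end_marker(" yes</s>")
```

To condition on the most recent turns instead of the earliest ones, slice with `history_list[-context_size:]`.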

### Creating an EMISSOR client

In [5]:
from leolani_client import LeolaniChatClient
emissor_path = "./emissor"
HUMAN="BlenderBot"
AGENT="Llama"
leolaniClient = LeolaniChatClient(emissor_path=emissor_path, agent=AGENT, human=HUMAN)

### Interaction loop
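
The interaction alternates between the two models: the agent produces an utterance from the full history, and the other model answers that utterance. The sketch below shows this turn-taking pattern with stub responders so it runs without Ollama or Transformers. `run_dialogue`, `fake_llama` and `fake_blender` are hypothetical names, and recording the agent's turns under the `assistant` role is an assumption of the sketch.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]


def run_dialogue(
    agent_reply: Callable[[List[Message]], str],
    human_reply: Callable[[str, List[Message]], str],
    system_prompt: str,
    max_turns: int,
) -> Tuple[List[Message], List[Tuple[str, str]]]:
    """Alternate agent and human turns; return the history and a speaker transcript."""
    history: List[Message] = [{"role": "system", "content": system_prompt}]
    transcript: List[Tuple[str, str]] = []
    for _ in range(max_turns):
        # The agent answers based on the whole history so far.
        agent_utterance = agent_reply(history)
        history.append({"role": "assistant", "content": agent_utterance})
        transcript.append(("Llama", agent_utterance))
        # The other side answers the agent's latest utterance.
        human_utterance = human_reply(agent_utterance, history)
        history.append({"role": "user", "content": human_utterance})
        transcript.append(("BlenderBot", human_utterance))
    return history, transcript


# Stub responders standing in for the two models.
def fake_llama(history: List[Message]) -> str:
    return f"agent turn {len(history)}"


def fake_blender(prompt: str, history: List[Message]) -> str:
    return "reply to: " + prompt


stub_history, stub_transcript = run_dialogue(fake_llama, fake_blender, "Be brief.", max_turns=2)
```

In the cell below, the stubs correspond to `ollama.chat` and `get_answer_from_blender`, and each transcript entry corresponds to a call that stores the utterance in the EMISSOR client.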

In [6]:
history = []
history.append(instruct)
print(history)
### First prompt
response = ollama.chat(model=llama_model, messages=history)
utterance = response['message']['content']
print(AGENT + ": " + utterance)
leolaniClient._add_utterance(AGENT, utterance) 
prompt = { 'role': 'assistant', 'content': utterance}
history.append(prompt)

utterance = get_answer_from_blender(utterance, history)
print('\n\t'+HUMAN + ": " + utterance)
prompt = { 'role': 'user', 'content': utterance}
history.append(prompt)
leolaniClient._add_utterance(HUMAN, utterance)

max_count = 5
counter = 0

while counter < max_count:
    counter +=1
    # Create the response from the system and store this as a new signal
    response = ollama.chat(model=llama_model, messages=history)
    utterance = response['message']['content']
    print(AGENT + ": " + utterance)
    leolaniClient._add_utterance(AGENT, utterance) 
    prompt = { 'role': 'assistant', 'content': utterance}
    history.append(prompt)

    utterance = get_answer_from_blender(utterance, history)
    print('\n\t'+HUMAN + ": " + utterance)
    prompt = { 'role': 'user', 'content': utterance}
    history.append(prompt)
    leolaniClient._add_utterance(HUMAN, utterance)

##### After completion, we save the scenario in the defined emissor folder.
leolaniClient._save_scenario() 

[{'role': 'system', 'content': 'You are a docter and you will receive questions from patients. Be brief and no more than two sentences.'}]
Llama: <|start_header_id|>assistant<|end_header_id|>

I'm ready to answer your questions as a doctor. Go ahead and ask me anything, I'll respond quickly and concisely!

	BlenderBot:  Thank you so much, I appreciate that. I will be sure to let you know what I need to know.
Llama: It's my pleasure to help. Please don't hesitate to reach out if you have any questions or concerns at all - I'm here to assist you.

	BlenderBot:  Thank you so much, I really appreciate it. I feel like I'm going to die.
Llama: I cannot provide a response that may exacerbate your concerns. If you are having thoughts of self-harm or suicide, please reach out for help immediately. You can call the National Suicide Prevention Lifeline at 1-800-273-TALK (8255) in the US, or contact a crisis helpline or emergency services in your country, for confidential support 24/7. Is there an