<img src="../images/obt-banner.png" width=1200>

# Build Your Own Private ChatGPT Super-Assistant Using Streamlit, LangChain, Chroma & Llama 2
## LangChain Demo
**Questions?** contact@coefficient.ai / [@CoefficientData](https://twitter.com/CoefficientData)

---

## 0. Imports

In [None]:
import random

from dotenv import load_dotenv

from utils import scrape_page

import warnings

warnings.simplefilter("ignore", UserWarning)

load_dotenv();

## 1. LangChain Basics

In [None]:
from langchain.llms import OpenAI

In [None]:
llm = OpenAI(model="text-davinci-003", temperature=0.9)

In [None]:
food1 = llm.predict("Give me a random food ingredient.").strip().lower()
food2 = llm.predict("Give me a random food ingredient.").strip().lower()
colour = llm.predict("Give me a random colour.").strip().lower()

prompt = f"""
Write a recipe for a {colour} cocktail featuring {food1} and {food2}.
Please come up with a fun name for the cocktail.
"""
print(prompt)

In [None]:
cocktail = llm.predict(prompt).strip()
print(cocktail)

## 2. Chatting with LangChain 🦜

In [None]:
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)

In [None]:
from langchain import PromptTemplate

template = """
You are a mindreader with magical abilities.
You are very over-dramatic and speak like a mysterious shaman.
You will be given something to guess, such as an animal, or a famous person.
You will ask a question, I will provide an answer, and then you will ask another question.
Try to ask questions that narrow down what the answer might be.
If you are very confident, you can guess what it is.
If your guess is wrong, then you must ask another question to help narrow it down.
Repeat this until I tell you that you have the right answer.

{history}
Human: {human_input}
Assistant:"""

prompt = PromptTemplate(input_variables=["history", "human_input"], template=template)

In [None]:
from langchain.memory import ConversationBufferWindowMemory

short_memory = ConversationBufferWindowMemory(k=5)
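
A windowed memory keeps only the most recent `k` exchanges and drops older ones, so the `{history}` slot in the prompt stays a bounded size. The sketch below imitates that behaviour in plain Python. `WindowMemory` is a hypothetical toy class written for illustration, not LangChain's implementation, and the `Human:` / `AI:` prefixes are assumed to match LangChain's defaults.

```python
from collections import deque


class WindowMemory:
    """Toy stand-in for a windowed conversation buffer (illustrative only)."""

    def __init__(self, k):
        # A deque with maxlen=k silently discards the oldest turn when full.
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        # Store one complete exchange: the human input and the model reply.
        self.turns.append((human, ai))

    def history(self):
        # Render the remaining turns as the text that would fill {history}.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = WindowMemory(k=2)
memory.save("I'm thinking of an animal.", "Does it have fur?")
memory.save("No.", "Does it fly?")
memory.save("Yes.", "Is it a bird?")

window_text = memory.history()
print(window_text)  # only the last two exchanges remain
```

With `k=2`, the opening exchange has already fallen out of the window by the third turn, which is why a small `k` can make the mindreader "forget" early answers.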

In [None]:
from langchain import LLMChain

chatgpt_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=True,
    memory=short_memory,
)

In [None]:
chatgpt_chain.predict(human_input="I'm thinking of an animal.")

In [None]:
chatgpt_chain.predict(human_input="It is not a mammal. Ask another question.")

In [None]:
chatgpt_chain.predict(human_input="Yes, it's a bird. Either guess, or ask another question.")

In [None]:
chatgpt_chain.predict(human_input="Yes it's a parrot. Either guess, or ask another question.")

In [None]:
chatgpt_chain.predict(human_input="Yes it's a macaw! Would you like to give it a name?")

In [None]:
chatgpt_chain.predict(human_input="How about something a little more creative and fun?")

---

## 3. Llama-2 (via llama-cpp) 🦙

### LLM ➡️ LLaMA 🦙 ➡️ Vicuna 🦙 ➡️ Llama-2
See here for a macOS install guide: https://abetlen.github.io/llama-cpp-python/macos_install/

In [None]:
from llama_cpp import Llama

In [None]:
%pwd
# Where are we?

In [None]:
llm = Llama(model_path="../models/llama-2-7b-chat.Q4_K_M.gguf")

In [None]:
output = llm(
    "Q: Name five UK government departments? A: ", max_tokens=64, stop=["Q:", "\n"], echo=True
)

In [None]:
output

In [None]:
output["choices"][0]["text"]
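
Because the call above uses `echo=True`, the returned text starts with the prompt itself, followed by the completion. The sketch below shows one way to pull out just the completion. The `sample_output` dict is a hypothetical, hand-written stand-in shaped like the response (only the `choices[0]["text"]` path is taken from this notebook), and `completion_only` is a helper name invented for this example.

```python
prompt_text = "Q: Name five UK government departments? A: "

# Hypothetical response: real responses carry more fields and real model text.
sample_output = {
    "choices": [
        {"text": prompt_text + "(model completion would appear here)"}
    ]
}


def completion_only(response, prompt):
    """Return the generated text without the echoed prompt, if present."""
    text = response["choices"][0]["text"]
    if text.startswith(prompt):
        return text[len(prompt):]
    return text


answer = completion_only(sample_output, prompt_text)
print(answer)
```

Setting `echo=False` in the original call would avoid the need for this step altogether.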

In [None]:
llm(
    "Q: What's the least known UK government department or agency? A: ",
    max_tokens=128,
    stop=["Q:"],
    echo=True,
)["choices"][0]["text"]
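
The two calls above differ in their `stop` lists: the first stops at `"Q:"` or a newline, the second only at `"Q:"`, so it may run over several lines. As a rough mental model, a stop sequence cuts the output at the first place any of the listed strings would appear. The function below is a simplified, after-the-fact sketch of that idea. `apply_stop` is a hypothetical helper, and the real library checks for stop sequences during generation rather than on a finished string.

```python
def apply_stop(text, stop):
    """Truncate text at the earliest occurrence of any stop string."""
    cut = len(text)
    for marker in stop:
        index = text.find(marker)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


raw = "HM Treasury\nQ: next question"

one_line = apply_stop(raw, ["Q:", "\n"])  # newline stops it first
multi_line = apply_stop(raw, ["Q:"])      # newline is kept

print(repr(one_line))
print(repr(multi_line))
```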

---

## 4. Llama-2 + LangChain

In [None]:
from langchain.llms import LlamaCpp
from langchain import PromptTemplate, LLMChain
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

### Define a LlamaCpp llm object

In [None]:
# Callbacks support token-wise streaming
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

In [None]:
n_gpu_layers = 40  # Change this value based on your model and your GPU VRAM pool.
n_batch = 512  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
n_ctx = 512 * 2

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    n_ctx=n_ctx,
    model_path="../models/llama-2-7b-chat.Q4_K_M.gguf",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,
    max_tokens=64 * 4,  # cap on the number of tokens generated per call
    stop=["Human:", "Input:"],
)
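
The prompt and the generated tokens share the same context window, so it can help to check how much room is left for the prompt (including any conversation history). The arithmetic below is a small sketch using the values from the cell above. `max_new` and `prompt_budget` are names chosen for this example, and the figures are token counts, not characters.

```python
n_ctx = 512 * 2    # total context window, in tokens
n_batch = 512      # tokens processed per batch
max_new = 64 * 4   # tokens reserved for the generated answer

# n_batch should sit between 1 and n_ctx, as the comment above notes.
assert 1 <= n_batch <= n_ctx

# Whatever is not reserved for generation is available to the prompt.
prompt_budget = n_ctx - max_new
print(f"Context: {n_ctx}, generation: {max_new}, prompt budget: {prompt_budget}")
```

If a long conversation history pushes the prompt past this budget, consider raising `n_ctx` or lowering the memory window `k`.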

### Use the LlamaCpp llm object with LangChain's PromptTemplate

In [None]:
template = """Question: {question}

Answer: Let's work this out in a step by step way to be sure we have the right answer."""

prompt = PromptTemplate(template=template, input_variables=["question"])
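
To see what text the model actually receives, the template can be filled in by hand. The sketch below uses plain `str.format`, on the assumption that it behaves like the template's default f-string-style substitution for a simple case like this one. `demo_template` and `rendered` are names chosen for this example.

```python
demo_template = """Question: {question}

Answer: Let's work this out in a step by step way to be sure we have the right answer."""

# Substitute the placeholder exactly as the chain would before calling the LLM.
rendered = demo_template.format(question="What is a vicuna? (Clue, the animal!)")
print(rendered)
```

The model then continues from the end of the `Answer:` line, which nudges it towards step-by-step reasoning.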

In [None]:
%%time
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What is a vicuna? (Clue, the animal!)"

print(llm_chain.run(question))

### Use a ConversationBufferWindowMemory

In [None]:
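# This is the same LLMChain construction we used earlier with the OpenAI LLM.
# LangChain gives us a consistent interface across LLM backends.
# Note: the prompt defined above has no {history} placeholder, so the memory
# records each exchange but is not fed back into the prompt here.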

chatgpt_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=True,
    memory=ConversationBufferWindowMemory(k=5),
)

In [None]:
chatgpt_chain.predict(
    question="What NFL team won the Super Bowl in the year Justin Bieber was born?"
);

### Mind-reading comparison: Llama2 (7B) vs OpenAI

In [None]:
template = """
You are a mindreader with magical abilities.
You are very over-dramatic and speak like a mysterious shaman.
You will be given something to guess, such as an animal, or a famous person.
You will ask a question, I will provide an answer, and then you will ask another question.
Try to ask questions that narrow down what the answer might be.
If you are very confident, you can guess what it is.
If your guess is wrong, then you must ask another question to help narrow it down.
Repeat this until I tell you that you have the right answer.

{history}
Human: {human_input}
Assistant:"""

prompt = PromptTemplate(input_variables=["history", "human_input"], template=template)

In [None]:
# Turn off the chain's verbose output so the formatted prompt is not echoed.
# (llama_print_timings is controlled by verbose on the LlamaCpp object itself.)
chatgpt_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=False,
    memory=ConversationBufferWindowMemory(k=10),
)

In [None]:
chatgpt_chain.predict(human_input="I'm thinking of an animal.");

In [None]:
chatgpt_chain.predict(human_input="No, it's an animal.");

In [None]:
chatgpt_chain.predict(human_input="No");

In [None]:
chatgpt_chain.predict(human_input="Yes");

In [None]:
chatgpt_chain.predict(human_input="Yes, it's a bird. Can you guess the bird?");

In [None]:
chatgpt_chain.predict(human_input="Yes. Try asking a question that would narrow it down?");

In [None]:
chatgpt_chain.predict(human_input="Yes. Can you guess what kind of bird it is?");

In [None]:
chatgpt_chain.predict(human_input="Maybe. Perhaps you can ask where it lives?");

In [None]:
chatgpt_chain.predict(human_input="Yes, it lives in the jungle.");

In [None]:
chatgpt_chain.predict(human_input="Yes! Would you like to give it a name?");

In [None]:
chatgpt_chain.predict(human_input="Can you suggest something a little more creative and fun?");

In [None]:
chatgpt_chain.predict(human_input="I love it. Can you summarise this conversation using emoji?");

---

## 5. Exercise: Adapt your OpenAI chatbot to use LangChain + Llama-2

> In the last notebook, you created a chatbot "application" using the `openai` library. You should be able to adapt this to use LangChain + Llama-2 with a little effort.