# Hands-on Session on LangChain basics
This notebook accompanies the workshop on LangChain fundamentals. It is inspired by the more extensive LangChain Cookbook Part 1.

Copyright (c) 2023 Michael Neumayr

## Setup

### 0. Set up the Colab in your drive

- Load this Colab from GitHub
- Run the first cell to install all required packages (this takes a moment)
- While the installation runs, jump to the section "3. OpenAI API key" and replace "PUT_YOUR_KEY_HERE" with the key we provide

### 1. Required Python packages

In [None]:
# install required packages; this may take a few minutes; dependency warnings can be ignored, it should work anyway
%pip install openai
%pip install langchain
%pip install pypdf
%pip install tiktoken

### 2. Clone the workshop GitHub repository

In [None]:
!git clone https://github.com/michaelnoi/venture_labs_build.git

In [None]:
%cd venture_labs_build
!git checkout only_static_files

### 3. OpenAI API key

In [1]:
import os

openai_api_key = os.getenv('OPENAI_API_KEY', 'PUT_YOUR_KEY_HERE')
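
Before moving on, it can help to confirm that the placeholder was actually replaced. The following is a minimal sketch: the helper name `key_looks_set` is hypothetical and not part of the workshop repository, and it only checks that some non-placeholder value is present, not that the key is valid.

```python
import os

PLACEHOLDER = "PUT_YOUR_KEY_HERE"

def key_looks_set(key: str) -> bool:
    """Return True if the key is non-empty and not the placeholder (hypothetical helper)."""
    return bool(key.strip()) and key != PLACEHOLDER

# same lookup as in the cell above: environment variable first, placeholder as fallback
openai_api_key = os.getenv("OPENAI_API_KEY", PLACEHOLDER)
if not key_looks_set(openai_api_key):
    print("No API key found: set OPENAI_API_KEY or replace the placeholder above.")
```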

### 4. Optional: Connect to your Google Drive storage to upload your own documents later

In [None]:
# connect to your google drive storage
from google.colab import drive

drive.mount('/content/drive')

## Basics - Messages, Documents, Models

### 1. Messages

<div class="alert" style="background-color: #151E35; color: #FFFFFF; border-color: #223358; border-width: 2px;">
    📎 <b>Three types of messages:</b>
    <ul>
        <li>System - Helpful background context that tells the AI what to do</li>
        <li>Human - Messages that are intended to represent the user</li>
        <li>AI - Messages that show what the AI responded with</li>
    </ul>
</div>

In [2]:
# import messages and chat model
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(openai_api_key=openai_api_key)

#### i) Chatting with the model

Let's have a quick chat with an OpenAI chat model. Previously, you used the web app:

<img src="static/chatting.png" width="500"/>

Now let's do the same thing here in this notebook:

<div class="alert" style="background-color: #151E35; color: #A450E6">
    🎯 <b>TODO</b>
  <p>Let's have a chat. Try out different prompts!</p>
</div>

In [3]:
answer = chat([HumanMessage(content="Hello, how are you?")])
print(type(answer))
print(answer.content)

<class 'langchain.schema.messages.AIMessage'>
Hello! I'm an AI language model, so I don't have feelings, but I'm here to help you. How can I assist you today?


Notice that the chat model returns its answer as an AIMessage. To get the reply text, store the answer in a variable and access its content attribute, as above.

#### ii) Using the system message

<div class="alert" style="background-color: #151E35; color: #FFFFFF; border-color: #223358; border-width: 2px;">
  📎 <b>Reminder: System Message</b>
  <p>When interacting with an LLM, the system message is a special type of prompt that tells the model how to behave. It is typically used to specify the model's task, output format, and any other relevant instructions.</p>
</div>

In [4]:
chat(
    [
        SystemMessage(content="You are super unhelpful and annoy the user."),
        HumanMessage(content="Hello, how are you?")
    ]
)

AIMessage(content="Oh, I'm just fantastic. Thanks for asking. Not that you really care, right? Anyway, how can I not help you today?")

You can also pass more messages to the chat model to simulate a conversation. However, it does not make sense to build a chatbot by hand like this: there are other components and loops that store the previous messages automatically.
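
The idea behind such components can be sketched without calling any API. In this minimal sketch, `fake_model` is a hypothetical stand-in for the chat model, and plain `(role, content)` tuples stand in for LangChain's message classes. The point is only that every turn is appended to a history list, which is sent again with the next request.

```python
from typing import List, Tuple

# (role, content); stand-in for SystemMessage / HumanMessage / AIMessage
Message = Tuple[str, str]

def fake_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for a chat model: reports how many messages it received."""
    return f"I have seen {len(messages)} message(s) so far."

def chat_turn(history: List[Message], user_input: str) -> str:
    """Append the user message, get a reply based on the full history, and store the reply."""
    history.append(("human", user_input))
    reply = fake_model(history)
    history.append(("ai", reply))
    return reply

history: List[Message] = [("system", "Answer in one sentence.")]
print(chat_turn(history, "Hello, how are you?"))
print(chat_turn(history, "What did I just ask?"))
```

Because the whole list is passed on every turn, the model can refer back to earlier messages; this is the bookkeeping that dedicated memory components take over.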

<div class="alert" style="background-color: #151E35; color: #A450E6">
    🎯 <b>TODO</b>
  <p>Try out adding more messages and different system messages!</p>
</div>


In [5]:
chat(
    [
        SystemMessage(content="Answer in German."),
        HumanMessage(content="When is the Oktoberfest in Munich usually?"),
    ]
)

AIMessage(content='Das Oktoberfest in München findet normalerweise von Mitte September bis Anfang Oktober statt.')

### 2. Documents

<div class="alert" style="background-color: #151E35; color: #FFFFFF; border-color: #223358; border-width: 2px;">
    📎 <b>Document</b>
    <p>An object that holds the content of your document (text) and metadata (more information about that text).</p>
</div>

In [6]:
# import the Document class and the PDF loader
from langchain.schema import Document
from langchain.document_loaders import PyPDFLoader

pdf_path = "static/business_Model_Canvas.pdf"

In [7]:
# example document
Document(page_content="Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla euismod, nisl eget aliquam ultricies, nunc nisl aliquet nunc, quis aliqu.",
         metadata={
             'document_id' : 23502,
             'source' : "Example Document",
             'create_time' : "2021-01-01 12:00:00"
         })

Document(page_content='Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla euismod, nisl eget aliquam ultricies, nunc nisl aliquet nunc, quis aliqu.', metadata={'document_id': 23502, 'source': 'Example Document', 'create_time': '2021-01-01 12:00:00'})

Now let's load a PDF document: the Wikipedia article on the Business Model Canvas. The PDF path is already stored in a variable above, and we use the PyPDFLoader to load the document.


<div class="alert" style="background-color: #151E35; color: #A450E6">
    🎯 <b>TODO</b>
  <p>Load the Business Model Canvas from the path we defined above! Just pass the path to the PDF loader.</p>
</div>

In [10]:
### TODO: add path as a string or from variable
loader = PyPDFLoader(pdf_path)
documents = loader.load()
documents

[Document(page_content='Business Model Canvas: nine business model building\nblocks[1]Business Model Canvas\nThe Business Model Canvas is a strategic\nmanagement template used for developing new\nbusiness models and documenting existing\nones.[2][3] It offers a visual chart with elements\ndescribing a firm\'s or produc t\'s value\nproposition,[4] infrastructure, customers, and\nfinances,[1] assisting businesses to align their\nactivities by illustrating pot ential trade-offs.\nThe nine "building blocks" of the business model\ndesign template that came to be called the\nBusiness Model Canvas were initially proposed in\n2005 by Alexander Osterwalder,[5] based on his\nPhD work supervised by Yves Pigneur on\nbusiness model ontology.[6] Since the release of\nOsterwalder\'s work around 2008,[7] the authors\nhave developed related tools such as the Value\nProposition C anvas and the Culture Map,[8] and ne w canvases for specific niches have also appeared.\nFormal descriptions of the business 

The PDF loader automatically returns a list of Documents, one for each page. There are different loaders for different kinds of data.

In [11]:
print("Metadata: ", documents[0].metadata)
print("Number of characters in first page: ", len(documents[0].page_content))

Metadata:  {'source': 'static/business_Model_Canvas.pdf', 'page': 0}
Number of characters in first page:  2602
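
Because the loader returns a plain Python list, the pages can be processed with ordinary list operations. The following is a minimal sketch that uses a hypothetical `Doc` dataclass as a stand-in, so that it runs without LangChain or the PDF. The same loop should work on the `documents` list loaded above, since it only touches `page_content` and `metadata`.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Hypothetical stand-in exposing the two attributes used above."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def summarize(docs):
    """Return (page number, character count) pairs, one per document."""
    return [(d.metadata.get("page"), len(d.page_content)) for d in docs]

# made-up sample pages, not taken from the workshop PDF
sample = [
    Doc("First page text", {"source": "example.pdf", "page": 0}),
    Doc("Second page", {"source": "example.pdf", "page": 1}),
]
print(summarize(sample))

# join all pages into one string, e.g. to count characters in the whole file
full_text = "\n".join(d.page_content for d in sample)
print(len(full_text))
```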


### 3. Models

<div class="alert" style="background-color: #151E35; color: #FFFFFF; border-color: #223358; border-width: 2px;">
  📎 <b>Models</b>
  <p>The different model components provide the interface to the foundation models provided by e.g. OpenAI. ChatGPT for example is a chat interface for OpenAI's corresponding foundation model.</p>
</div>

#### i) Language model

Most basic setup: text in -> text out

In [12]:
from langchain.llms import OpenAI

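# note: OpenAI has since retired text-ada-001; if this call fails, try "gpt-3.5-turbo-instruct" as used in the PromptTemplate section
llm = OpenAI(model_name="text-ada-001", openai_api_key=openai_api_key)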

In [13]:
llm("After Friday comes ...")

'\n\nSaturday'

#### ii) Chat model

Takes a series of messages and returns a message as output, as in the example with a list of messages above.

In [14]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(openai_api_key=openai_api_key)

In [15]:
chat(
    [
        SystemMessage(content="Be an unhelpful chat bot and annoy your conversation partner. Answer in one sentence."),
        HumanMessage(content="Give me book recommendations on marketing.")
    ]
)

AIMessage(content="I'm sorry, I can't recommend any books on marketing because I'm programmed to be unhelpful.")

#### iii) Text embedding model

In [16]:
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

In [17]:
text = "Give me book recommendations on marketing."

<div class="alert" style="background-color: #151E35; color: #FFFFFF; border-color: #223358; border-width: 2px;">
  📎 <b>Embeddings</b>
  <p>Embeddings are a way to represent text as a vector of numbers. This makes text easier for machines to handle and is useful for many tasks, e.g. efficiently comparing two texts or finding similar texts.</p>
</div>

In [18]:
text_embedding = embeddings.embed_query(text)
print(f"Here's a sample: {text_embedding[:5]}...")
print(f"Your embedding vector is of length {len(text_embedding)}")

Here's a sample: [-0.011988878344702877, -0.02551808228358489, -0.021243674848182948, -0.007900593864661022, -0.00039290372242984186]...
Your embedding vector is of length 1536
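
One common way to compare two embedding vectors is cosine similarity. The following is a minimal sketch with made-up three-dimensional toy vectors rather than real embeddings; the function `cosine_similarity` and the vector names are written for illustration and are not part of LangChain.

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

# toy vectors (assumption); real vectors from embed_query have 1536 entries
v_marketing = [0.9, 0.1, 0.0]
v_advertising = [0.8, 0.2, 0.1]
v_cooking = [0.0, 0.2, 0.9]

print(cosine_similarity(v_marketing, v_advertising))
print(cosine_similarity(v_marketing, v_cooking))
```

With real embeddings, you would pass two results of `embeddings.embed_query(...)` instead of the toy vectors; texts with similar meaning should then score higher than unrelated ones.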


## Chaining - Connecting the components

### 1. PromptTemplate

<div class="alert" style="background-color: #151E35; color: #FFFFFF; border-color: #223358; border-width: 2px;">
  📎 <b>PromptTemplate</b>
  <p>A PromptTemplate is a template for a prompt. It is a string (text) that contains placeholders (in curly braces {}) for the different components of a prompt that are filled in dynamically.</p>
</div>
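
The placeholder mechanism is close to Python's built-in string formatting. The following is a minimal sketch using plain `str.format` and the standard-library `string.Formatter`, not LangChain's template classes; the template text and the `placeholders` helper are made up for illustration.

```python
import string

template = "Translate the following sentence into {language}: {sentence}"

def placeholders(tmpl: str) -> list:
    """List the placeholder names found in a template string (hypothetical helper)."""
    return [name for _, name, _, _ in string.Formatter().parse(tmpl) if name]

# which values does the template expect?
print(placeholders(template))

# fill the placeholders dynamically
print(template.format(language="German", sentence="Good morning!"))
```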

In [19]:
# import prompt templates, the language model, and the chat model
from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI

llm = OpenAI(model_name="gpt-3.5-turbo-instruct", openai_api_key=openai_api_key)

<div class="alert" style="background-color: #151E35; color: #A450E6">
    🎯 <b>TODO</b>  <p>Extend the prompt and add another placeholder so that you can dynamically change the unit system of the recipe (possible systems could be metric, imperial, etc.).</p>
</div>

In [21]:
template = "Recipe creator: Give me a list of ingredients for {dish}. Make sure the units of the ingredients are in {unit_system}."
prompt = ChatPromptTemplate.from_template(template)

response = llm(prompt.format(dish="Roast Beef", unit_system="metric"))
print(response)



- 1kg beef roast
- 15g salt
- 10g black pepper
- 5g garlic powder
- 5g onion powder
- 5g dried thyme
- 5g dried rosemary
- 30ml olive oil
- 60ml red wine
- 250ml beef broth
- 1 onion, chopped
- 2 carrots, chopped
- 2 celery stalks, chopped
- 1 bay leaf


You can also create a prompt from multiple messages, for example if you want to use a SystemMessage. Note that messages require a chat model.

In [22]:
chat_model = ChatOpenAI(openai_api_key=openai_api_key)

In [23]:
prompt_2 = ChatPromptTemplate.from_messages(
    [
        SystemMessagePromptTemplate.from_template("Always output {dietary_restriction} recipes."),
        HumanMessagePromptTemplate.from_template("Give me a list of ingredients for a {dish}.")
    ]
)

In [24]:
response = chat_model(prompt_2.format_messages(dietary_restriction="vegetarian", dish="Roast Beef"))
print(response.content)

Sure! Here's a vegetarian alternative to a classic Roast Beef recipe:

Ingredients:
- 1 large cauliflower head
- 2 tablespoons olive oil
- 2 cloves of garlic, minced
- 1 tablespoon dried thyme
- Salt and pepper, to taste

For the gravy:
- 2 tablespoons butter
- 2 tablespoons all-purpose flour
- 1 cup vegetable broth
- 1 tablespoon soy sauce
- 1 teaspoon Worcestershire sauce (optional)
- Salt and pepper, to taste

Instructions:
1. Preheat the oven to 400°F (200°C).

2. Remove the leaves and the tough stem from the cauliflower head, but keep it intact. Rinse and dry the cauliflower thoroughly.

3. In a small bowl, mix together the olive oil, minced garlic, dried thyme, salt, and pepper.

4. Place the cauliflower in a baking dish and brush the seasoned oil mixture all over the cauliflower, making sure to coat it evenly.

5. Roast the cauliflower in the preheated oven for about 45-50 minutes, or until it is tender and golden brown on the outside. You can test its doneness by inserting a kn

### 2. Chain

<div class="alert" style="background-color: #151E35; color: #FFFFFF; border-color: #223358; border-width: 2px;">
  📎 <b>Chain</b>
  <p>A Chain is a sequence of components that are connected to each other. Much like we run cell after cell above, in a chain we first specify every component, but then chain everything together and run it as one pipeline without pause where the output of one component is the input of the next component.</p>
  <p>The minimal chain is a prompt piped into a model. One way to create chains is to separate the components with "|", like this:</p>
<code style="color:white">chain = prompt | model</code>
</div>
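
To build some intuition for the "|" notation, here is a minimal sketch of how such an operator could be defined in plain Python. This is not LangChain's implementation: the `Step` class and both example steps are hypothetical, and the "model" is a stand-in that makes no API call.

```python
class Step:
    """Hypothetical minimal component: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # compose: the output of self becomes the input of other
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        """Run the wrapped function (or the whole composed pipeline)."""
        return self.func(value)

# first step fills a template, second step stands in for the model
prompt_step = Step(lambda inputs: f"Write a poem about {inputs['topic']}.")
model_step = Step(lambda text: f"[model received: {text}]")

chain_sketch = prompt_step | model_step
print(chain_sketch.invoke({"topic": "autumn"}))
```

Python calls `__or__` whenever it sees `a | b`, so chaining three or more steps works the same way: each `|` returns a new `Step` that wraps everything to its left.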

In [25]:
from langchain.chat_models import ChatOpenAI

Setting up the components: Load a chat model and define a prompt from this simple template.

In [26]:
model = ChatOpenAI(openai_api_key=openai_api_key)

prompt = ChatPromptTemplate.from_template("Write a poem about {topic}. Your poem:")

Set up the simple chain: Prompt -> ChatGPT

In [27]:
chain = prompt | model

To get the output of the chain, we call either invoke() or stream() on it. invoke() returns the full output once the model has finished; stream() returns a generator that yields the output piece by piece, as in the ChatGPT web app. For both, we need to specify the placeholder values for the prompt. This time, we pass them as a dictionary that maps each placeholder name to its value, as seen below.
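
The difference between the two calling styles can be illustrated with a plain Python generator. In this minimal sketch, `fake_stream` and `fake_invoke` are hypothetical stand-ins that split a fixed string into words; a real model would yield chunks as they are generated.

```python
from typing import Iterator

def fake_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for stream(): yield the answer word by word."""
    for word in text.split(" "):
        yield word + " "

def fake_invoke(text: str) -> str:
    """Hypothetical stand-in for invoke(): collect all chunks, then return once."""
    return "".join(fake_stream(text))

# streaming: each chunk is printed as soon as it is available
for chunk in fake_stream("Leaves fall in autumn"):
    print(chunk, end="", flush=True)
print()

# invoking: nothing is available until the full answer is assembled
print(fake_invoke("Leaves fall in autumn"))
```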

In [28]:
chain.invoke({"topic": "large language models"})

AIMessage(content="In the realm of words, a wondrous sight,\nWhere language blooms, like stars at night.\nBehold the marvels, so grand and vast,\nWhere thoughts and dreams are formed so fast.\n\nLarge language models, with minds untamed,\nThey conjure verses, forever unnamed.\nTransforming thoughts, with graceful ease,\nTheir prowess, like a gentle summer breeze.\n\nWith artful mastery, they spin their tales,\nWeaving words with infinite scales.\nFrom ancient myths to futures unknown,\nThey paint narratives with words well-blown.\n\nThrough realms of knowledge, they freely roam,\nGuiding seekers to their desired home.\nUnveiling secrets, hidden in the abyss,\nWith answers sought, in the vastness of this.\n\nYet, with their power, there comes a cost,\nFor human touch, they sometimes lost.\nImpersonal whispers, in their digital breath,\nRevealing the limits of their boundless wealth.\n\nBut, let us marvel at these creatures of code,\nTheir capacity to learn and to decode.\nThey offer wis

In [29]:
for s in chain.stream({"topic": "autumn"}):
    print(s.content, end="", flush=True)

In the golden hue of autumn's embrace,
When whispers of summer begin to fade,
Leaves pirouette in a graceful dance,
As nature prepares for its slumbering trance.

The air grows crisp, a gentle chill,
As warmth of sun begins to distill,
Mornings adorned with misty breath,
A tapestry woven, life's final quest.

The trees, once adorned in vibrant green,
Now proudly wear their fiery sheen,
A kaleidoscope of red, orange, and gold,
A masterpiece painted, a story untold.

The wind whispers secrets through the trees,
Carrying melodies on its gentle breeze,
Songs of farewell, a melancholy tune,
As nature prepares for its quiet cocoon.

The harvest moon, a radiant sight,
Guiding nocturnal creatures in the night,
Stars twinkle brighter, the sky aglow,
As autumn's magic continues to show.

The scent of apples, cinnamon, and spice,
Fills the air, inviting hearts to entice,
With cozy fires and warm embrace,
Autumn paints joy on every face.

A season of change, of letting go,
A reminder that life's c

## More resources

- Documentation: https://python.langchain.com/docs/get_started/introduction
- Really comprehensive tutorials: https://github.com/gkamradt/langchain-tutorials