# LangChain Workshop

## What is LangChain?

LangChain is a library of reusable components for building applications on top of large language models (LLMs). The main components are:
* Model I/O
* Memory 
* Data connection 
* Chains
* Agents

## Model I/O

Model I/O covers the components that interface with the LLMs themselves: formatting the input prompt, calling the model, and parsing the output into the shape you need.

#### Use LangChain's interface to make calls to LLM providers

LangChain provides interfaces for making calls to an LLM. The call can go to either:
1. an LLM provider (OpenAI, Cohere, PaLM, HuggingFace, etc.)
2. a custom LLM wrapper for your own model.

In this example, we will use the OpenAI interface to interact with ChatGPT.

#### Use prompt templates to format input 

Prompt templates are a way to parameterize and reuse prompts. A prompt template is basically a wrapper class around an f-string: you declare the prompt once with named placeholders, then supply the parameter values for each specific instance instead of redeclaring the prompt throughout your code.
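
To make the "wrapper around a format string" idea concrete, here is a minimal plain-Python sketch. `SimplePromptTemplate` is a hypothetical name used only for illustration; it is not LangChain's `PromptTemplate`, just the same idea reduced to a few lines.

```python
class SimplePromptTemplate:
    """Toy stand-in for a prompt template: a reusable string with named slots."""

    def __init__(self, template: str):
        self.template = template

    def format(self, **kwargs: str) -> str:
        # Fill the named placeholders; a missing one raises KeyError.
        return self.template.format(**kwargs)


# Declare the prompt once, then reuse it with different parameters.
joke_prompt = SimplePromptTemplate("Tell me a {adjective} joke about {topic}.")
print(joke_prompt.format(adjective="short", topic="penguins"))
print(joke_prompt.format(adjective="silly", topic="databases"))
```

The template is declared in one place and only the parameter values change per call, which is the reuse the paragraph above describes.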

#### Use Output Parsers to format LLM response into desired format

The problem with LLM outputs is that they are always strings. Even if we ask for a specific format (JSON, etc.), the LLM can only return a string representation of it. Output parsers are classes that help structure LLM responses. They work in two steps:
1) Use the output parser to get format instructions that you add to the prompt. These tell the LLM to format its output in a way the output parser can parse.
2) Use the output parser to parse the LLM output into the desired structure.
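
The two steps can be sketched in plain Python. `CommaListParser` and `fake_llm` are hypothetical stand-ins invented for this example (no real model is called); the point is only the shape of the flow: instructions go into the prompt, then the string response is parsed.

```python
from typing import List


class CommaListParser:
    """Toy output parser: asks for comma-separated values, returns a list."""

    def get_format_instructions(self) -> str:
        return "Respond with a comma-separated list only, e.g. `a, b, c`."

    def parse(self, text: str) -> List[str]:
        # Turn the raw string into a Python list, dropping empty items.
        return [item.strip() for item in text.split(",") if item.strip()]


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; it always returns a string.
    return "red, green, blue"


parser = CommaListParser()
# Step 1: add the parser's format instructions to the prompt.
prompt = f"List three colors.\n{parser.get_format_instructions()}"
# Step 2: parse the string response into the desired structure.
colors = parser.parse(fake_llm(prompt))
print(colors)
```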

## Chains

Chains let you string together multiple LLM calls and prompts to achieve a desired behavior. The simplest chain is an LLMChain (LLM + prompt).

#### LLMChain (Model + Prompt)
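
Conceptually, an LLMChain pairs one prompt template with one model: format the prompt, then call the model. The sketch below shows that pairing in plain Python, assuming a hypothetical `echo_llm` in place of a real model; `make_llm_chain` is an illustrative helper, not part of LangChain.

```python
from typing import Callable


def make_llm_chain(template: str, llm: Callable[[str], str]) -> Callable[..., str]:
    """Return a function that formats the prompt, then calls the model."""

    def run(**inputs: str) -> str:
        prompt = template.format(**inputs)
        return llm(prompt)

    return run


def echo_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model: echoes the prompt it received.
    return f"(model saw) {prompt}"


name_chain = make_llm_chain("Suggest a name for a company that makes {product}.", echo_llm)
print(name_chain(product="kites"))
```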

#### SimpleSequentialChain (multiple LLMChains)
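
A sequential chain feeds the output of one step into the next. The following is a rough plain-Python sketch of that composition, assuming each step takes one string and returns one string; `name_step` and `slogan_step` are hypothetical placeholders for what would be two LLM chains.

```python
from functools import reduce
from typing import Callable, List

Step = Callable[[str], str]


def simple_sequential(steps: List[Step]) -> Step:
    """Compose single-input, single-output steps; each output feeds the next."""
    return lambda text: reduce(lambda acc, step: step(acc), steps, text)


# Hypothetical stand-ins for two LLM chains.
def name_step(product: str) -> str:
    return f"{product.title()} Co."


def slogan_step(company: str) -> str:
    return f"{company}: built to last."


pipeline = simple_sequential([name_step, slogan_step])
print(pipeline("kites"))
```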

## Conversational Memory

#### How do we give LLMs conversational memory?

LLMs are stateless: each call is independent of previous ones, so the model does not remember anything on its own. We can only make it appear to have memory by injecting the conversation history into the prompt as context.
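
A minimal sketch of that injection, in plain Python: the transcript is kept outside the model and prepended to every new prompt. The `Human:`/`AI:` labels and the `fake_llm` function are assumptions made for the example, not a fixed LangChain format.

```python
from typing import Callable, List, Tuple

history: List[Tuple[str, str]] = []  # (speaker, text) pairs


def build_prompt(user_message: str) -> str:
    """Prepend the running transcript so a stateless model sees earlier turns."""
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"Human: {user_message}")
    lines.append("AI:")
    return "\n".join(lines)


def chat(user_message: str, llm: Callable[[str], str]) -> str:
    reply = llm(build_prompt(user_message))
    # Record both sides of the exchange for the next call.
    history.append(("Human", user_message))
    history.append(("AI", reply))
    return reply


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in: reports how many lines of context it was given.
    return f"I received {len(prompt.splitlines())} lines."


print(chat("Hi, I'm Sam.", fake_llm))
print(chat("What's my name?", fake_llm))
```

The second call receives more lines than the first because the earlier exchange travels inside the prompt; the model itself stores nothing between calls.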

Sometimes we want to limit the number of tokens we pass to the LLM, whether to reduce cost, to keep model performance from degrading on long inputs, or to stay under the model's token limit. There are several ways to limit the conversation history:
* ConversationBufferWindowMemory - keep only the last k conversational exchanges (each exchange is one message from you plus one message from the chatbot)
* ConversationTokenBufferMemory - keep only the most recent messages that fit within a token limit
* ConversationSummaryBufferMemory - use an LLM to summarize the older part of the conversation, and keep that summary alongside the most recent messages as memory
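
The window and token-limit strategies can be sketched in plain Python as follows. This is an illustration of the trimming logic only, not LangChain's implementation, and it approximates tokens by whitespace-separated words, which real tokenizers do not do.

```python
from typing import List, Tuple

Exchange = Tuple[str, str]  # (human message, AI message)


def last_k_exchanges(history: List[Exchange], k: int) -> List[Exchange]:
    """Window strategy: keep only the k most recent exchanges."""
    return history[-k:] if k > 0 else []


def within_token_budget(history: List[Exchange], max_tokens: int) -> List[Exchange]:
    """Token strategy: drop the oldest exchanges until the rest fit the budget.

    Tokens are approximated by whitespace-separated words (an assumption;
    real tokenizers count differently).
    """
    kept: List[Exchange] = []
    used = 0
    # Walk backwards from the newest exchange and stop at the first overflow.
    for human, ai in reversed(history):
        cost = len(human.split()) + len(ai.split())
        if used + cost > max_tokens:
            break
        kept.append((human, ai))
        used += cost
    return list(reversed(kept))


history = [
    ("Hi, I'm Sam.", "Hello Sam!"),
    ("I like kites.", "Kites are fun."),
    ("What should I buy?", "Maybe a new kite."),
]
print(last_k_exchanges(history, 2))
print(within_token_budget(history, 10))
```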

## Data Connection

### What's the issue with long prompts? 

The issue is that LLMs can only accept a limited number of tokens per call (the context window). So if we want to use whole documents, or lots of data, as context, it may not fit in the prompt.

### Solution: Chunking + Vector Databases

The solution is to break the document into smaller chunks, and only inject the chunks that are relevant to the question into the prompt. Vector Databases are used to store embeddings of the chunks so we can perform fast similarity search and retrieval of the relevant chunks. The workflow looks like this: 
1. Load in documents
2. Split document into chunks
3. Create embeddings of those chunks
4. Index those chunks into a Vector Database 
5. Use similarity search to retrieve the relevant chunks for the prompt
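
Before using the real libraries, the workflow above can be sketched end to end in plain Python. Everything here is a toy stand-in: the "embedding" is a word-count vector, the "vector database" is a list, and the sample lines are hypothetical. Real embedding models return dense vectors and real vector stores use optimized indexes.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Steps 1-2: "load" a document and split it into one chunk per line.
document = "- Max's birthday is in April\n- Mom's birthday is in July\n- Dad's birthday is in October"
chunks = document.split("\n")

# Steps 3-4: embed every chunk and keep (vector, chunk) pairs as the index.
index: List[Tuple[Counter, str]] = [(embed(chunk), chunk) for chunk in chunks]


def similarity_search(query: str, k: int = 1) -> List[str]:
    """Step 5: return the k chunks whose vectors are closest to the query's."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(query_vec, pair[0]), reverse=True)
    return [chunk for _, chunk in ranked[:k]]


print(similarity_search("When is Mom's birthday?"))
```

Only the retrieved chunks, not the whole document, would then be placed into the prompt.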

In [62]:
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader

#### Step 1: Load Documents

#### Step 2: Split into Chunks

In [73]:
text_splitter = CharacterTextSplitter(chunk_size=50, chunk_overlap=0, separator="\n")
chunks = text_splitter.split_documents(docs)

In [76]:
chunks

[Document(page_content='Birthdays in the family:', metadata={'source': './birthday_ideas.txt'}),
 Document(page_content="- Zarah's birthday is in February", metadata={'source': './birthday_ideas.txt'}),
 Document(page_content="- Max's birthday is in April", metadata={'source': './birthday_ideas.txt'}),
 Document(page_content="- James' birthday is in May", metadata={'source': './birthday_ideas.txt'}),
 Document(page_content="- Mom's birthday is in July", metadata={'source': './birthday_ideas.txt'}),
 Document(page_content="- Dad's birthday is in October", metadata={'source': './birthday_ideas.txt'}),
 Document(page_content="- Zyde's birthday is in December", metadata={'source': './birthday_ideas.txt'})]
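
Each line ended up as its own chunk because no two neighboring lines fit together within 50 characters. The sketch below reproduces that behavior in plain Python, assuming the splitter first splits on the separator and then greedily merges pieces up to `chunk_size`. That is my reading of how `CharacterTextSplitter` behaves, not a copy of its implementation, and `split_and_merge` is a hypothetical helper.

```python
from typing import List


def split_and_merge(text: str, separator: str, chunk_size: int) -> List[str]:
    """Split on `separator`, then greedily merge pieces up to `chunk_size` characters."""
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Still fits (or nothing accumulated yet): keep growing this chunk.
            current = candidate
        else:
            # Adding the piece would overflow: close the chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


text = "Birthdays in the family:\n- Zarah's birthday is in February\n- Max's birthday is in April"
print(split_and_merge(text, "\n", chunk_size=50))
print(split_and_merge(text, "\n", chunk_size=100))
```

With a budget of 50 characters every line stays separate, while a budget of 100 lets all three lines merge into a single chunk.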

#### Step 3: Create embedding function

#### Step 4: Create vector database

#### Step 5: Similarity Search for relevant chunk

#### Create LLM Chain to use Vector DB

In [134]:
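from langchain.chains import LLMChain
from langchain.llms import OpenAI  # needed for OpenAI() below; harmless if already imported earlier
from langchain.prompts import PromptTemplate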

In [136]:
llm = OpenAI()
template = """
Answer the following question:
```{question}```
Using ONLY the following context:
```{context}```
If the context does not provide relevant information, just say "I don't know"
"""
chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template(template),
    verbose=True
)
question = "When is James's birthday?"
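# Retrieve the chunks most similar to the question (db is the vector database from Step 4)
context = db.similarity_search(question)
# The chain was already created with verbose=True, so it is not repeated here
print(chain.run(question=question, context=context))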



> Entering new  chain...
Prompt after formatting:
Answer the following question:
```When is James's birthday?```
Using ONLY the following context:
```[Document(page_content="- James' birthday is in May", metadata={'source': './birthday_ideas.txt'}), Document(page_content='Birthdays in the family:', metadata={'source': './birthday_ideas.txt'}), Document(page_content="- Max's birthday is in April", metadata={'source': './birthday_ideas.txt'}), Document(page_content="- Dad's birthday is in October", metadata={'source': './birthday_ideas.txt'})]```
If the context does not provide relevant information, just say "I don't know"

> Finished chain.

James's birthday is in May.
