# Intro to LangChain

LangChain is an open-source library that provides developers with the tools to build applications powered by large language models (LLMs). 

More specifically, LangChain is an orchestration tool for prompts, making it easier for developers to chain different prompts interactively.

- Homepage - [link](https://python.langchain.com/en/latest/index.html)
- GitHub - [link](https://github.com/hwchase17/langchain#%EF%B8%8F-langchain)

## Why Use LangChain?

LLMs are already incredibly powerful when used with a single prompt, but they execute completions by guessing the most likely next word, rather than reasoning as humans do.

LangChain is a framework that enables developers to **build agents that can reason about problems and break them into smaller sub-tasks**. With LangChain, we can introduce context and memory into completions by **creating intermediate steps and chaining commands together**.

## Modules
- **Models**: Supported model types and integrations.

- **Prompts**: Prompt management, optimization, and serialization.

- **Chains**: Chains are structured sequences of calls (to an LLM or to a different utility).

- **Agents**: An agent is a Chain in which an LLM, given a high-level directive and a set of tools, repeatedly decides an action, executes the action and observes the outcome until the high-level directive is complete.

- **Memory**: Memory refers to state that is persisted between calls of a chain/agent.

- **Indexes**: Language models become much more powerful when combined with application-specific data - this module contains interfaces and integrations for loading, querying and updating external data.

- **Callbacks**: Callbacks let you log and stream the intermediate steps of any chain, making it easy to observe, debug, and evaluate the internals of an application.

In [None]:
import numpy as np
import pandas as pd

import warnings
warnings.filterwarnings('ignore')

###############################
### Use your OpenAI API Key ###
import openai
with open('key.txt') as f:
    openai_key = f.read().strip()  # drop any trailing newline from the key file
###############################

from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(openai_api_key=openai_key, temperature=0.0)

The `temperature` controls the "creativity" of the generated text, with higher values leading to more varied and surprising output. In this case, a `temperature` of 0 will result in the most "safe" and predictable output.
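
To make the effect of `temperature` concrete, here is a minimal sketch of temperature-scaled softmax probabilities. The logits are made-up numbers, and `softmax_with_temperature` is a hypothetical helper, not part of LangChain or the OpenAI API. How a hosted model handles a temperature of exactly 0 is provider-specific, so the sketch only uses small positive values.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by the temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.1)   # nearly all mass on the top token
high = softmax_with_temperature(logits, 2.0)  # mass spread more evenly

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

As the temperature shrinks, the top-scoring token takes almost all of the probability, which is why low settings give more repeatable output.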

## Prompt

A prompt template is a reproducible way to generate a prompt. It contains a text string (“the template”) that takes a set of parameters from the end user and generates a prompt.

In [None]:
from langchain.prompts import ChatPromptTemplate

template = "What is {base} raised to the {exponent} power?"
prompt = ChatPromptTemplate.from_template(template)

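By default, LangChain prompt templates (both `PromptTemplate` and the `ChatPromptTemplate` used above) treat the provided template as a Python f-string-style format string: each name in curly braces becomes an input variable.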
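
As a rough plain-Python analogue of this behaviour, the sketch below extracts the placeholder names from the template with the standard library's `string.Formatter` and fills them with `str.format`. It only illustrates the idea; LangChain's own implementation may differ in its details.

```python
from string import Formatter

template = "What is {base} raised to the {exponent} power?"

# Collect the placeholder names that appear in the template.
input_variables = sorted(
    field for _, field, _, _ in Formatter().parse(template) if field
)

# Fill the placeholders, much like calling `format` on a prompt template.
filled = template.format(base="13", exponent=".3432")

print(input_variables)
print(filled)
```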

In [None]:
prompt.input_variables

In [None]:
prompt.format(base="13", exponent=".3432")

In [None]:
msg = prompt.format_messages(base="13", exponent=".3432")
llm(msg).content

In [None]:
from langchain.prompts import PromptTemplate, FewShotPromptTemplate

example_formatter_template = """Word: {word}
Antonym: {antonym}
"""
example_prompt = PromptTemplate.from_template(example_formatter_template)

examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input\n",
    suffix="Word: {input}\nAntonym: ",
    input_variables=["input"],
    example_separator="\n",
)

print(few_shot_prompt.format(input="big"))
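
The few-shot prompt above is essentially a prefix, the formatted examples, and a suffix joined by a separator. The sketch below rebuilds a similar string by hand. `build_few_shot_prompt` is a hypothetical helper that approximates the layout; it is not LangChain's implementation, so the exact whitespace may differ from what `few_shot_prompt.format` prints.

```python
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]
example_template = "Word: {word}\nAntonym: {antonym}\n"
prefix = "Give the antonym of every input\n"
suffix = "Word: {input}\nAntonym: "

def build_few_shot_prompt(user_input, separator="\n"):
    """Join prefix, formatted examples, and suffix with the separator."""
    pieces = [prefix]
    pieces += [example_template.format(**ex) for ex in examples]
    pieces.append(suffix.format(input=user_input))
    return separator.join(pieces)

manual_prompt = build_few_shot_prompt("big")
print(manual_prompt)
```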

## Chain

Chains are one of the fundamental building blocks of LangChain.

Using an LLM in isolation is fine for some simple applications, but more complex applications require chaining LLMs, either with each other or with other components.

A chain is essentially a pipeline that processes an input through a specific combination of primitives. Intuitively, it can be thought of as a 'step' that performs a certain set of operations on an input and returns the result. A step can be anything from a prompt-based pass through an LLM to a Python function applied to a piece of text.
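
The pipeline idea can be sketched in plain Python, without LangChain. Everything here is hypothetical: `make_chain` composes steps from left to right, and `fake_llm` is a stand-in that merely echoes the last word of its prompt instead of calling a model.

```python
from functools import reduce

def make_chain(*steps):
    """Compose steps left to right: the output of one is the input of the next."""
    def run(value):
        return reduce(lambda acc, step: step(acc), steps, value)
    return run

def build_prompt(text):
    return f"Summarize in one word: {text}"

def fake_llm(prompt):
    # Stand-in for a model call: echoes the last word of the prompt.
    return prompt.split()[-1]

def clean(text):
    return text.strip(".!?").lower()

pipeline = make_chain(build_prompt, fake_llm, clean)
result = pipeline("LangChain chains are Pipelines!")
print(result)
```

Each step only needs to accept the previous step's output, which is what lets prompt formatting, model calls, and ordinary Python functions be mixed in one chain.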

### LLM Chain

`LLMChain` is perhaps one of the most popular ways of querying an LLM object. It formats the prompt template using the input key values provided, passes the formatted string to the LLM, and returns the LLM output.

In [None]:
from langchain.chains import LLMChain

# template = "What is {base} raised to the {exponent} power?"
# prompt = ChatPromptTemplate.from_template(template)
chain = LLMChain(llm=llm, prompt=prompt)
chain.predict(base="13", exponent=".3432") 

In [None]:
13 ** 0.3432

### LLMMathChain

`LLMMathChain` is a utility chain: a ready-made chain with a very narrow purpose, typically used to extract a specific kind of answer from an LLM, and usable out of the box.
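
The general idea behind a math utility chain is to let the model translate the question into an expression and let code do the arithmetic. The sketch below illustrates that division of labour under stated assumptions: `fake_llm_translate` is a hard-coded stand-in for the model, and `evaluate` is a hypothetical, deliberately small arithmetic evaluator. What `LLMMathChain` does internally depends on the LangChain version, so this is not a description of its implementation.

```python
import ast
import operator

# Arithmetic operators this sketch is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def evaluate(expression):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

def fake_llm_translate(question):
    # Stand-in for the model: a real LLM would produce this expression itself.
    return "13 ** 0.3432"

expression = fake_llm_translate("What is 13 raised to the .3432 power?")
answer = evaluate(expression)
print(expression, "=", answer)
```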

In [None]:
from langchain.chains import LLMMathChain

llm_math = LLMMathChain(llm=llm, verbose=True)
llm_math.run("What is 13 raised to the .3432 power?") 

In [None]:
print(llm_math.prompt.template)

### SequentialChain

Reference: [deeplearning.ai - LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain/lesson/4/chains)

Use case: Amazon product reviews

In [None]:
# prompt template 1: translate to english
first_prompt = ChatPromptTemplate.from_template(
    "Translate the following review to english:"
    "\n\n{Review}"
)
# chain 1: input= Review and output= English_Review
chain_one = LLMChain(
    llm=llm, 
    prompt=first_prompt, 
    output_key="English_Review"
)


In [None]:
second_prompt = ChatPromptTemplate.from_template(
    "Can you summarize the following review in 1 sentence:"
    "\n\n{English_Review}"
)
# chain 2: input= English_Review and output= summary
chain_two = LLMChain(
    llm=llm, 
    prompt=second_prompt, 
    output_key="summary"
)

In [None]:
# prompt template 3: detect the language of the review
third_prompt = ChatPromptTemplate.from_template(
    "What language is the following review:\n\n{Review}"
)
# chain 3: input= Review and output= language
chain_three = LLMChain(
    llm=llm, 
    prompt=third_prompt,
    output_key="language"
)

In [None]:
# prompt template 4: follow up message
fourth_prompt = ChatPromptTemplate.from_template(
    "Write a follow up response to the following "
    "summary in the specified language:"
    "\n\nSummary: {summary}\n\nLanguage: {language}"
)
# chain 4: input= summary, language and output= followup_message
chain_four = LLMChain(
    llm=llm, 
    prompt=fourth_prompt,
    output_key="followup_message"
)

In [None]:
# overall_chain: input= Review
# and output= language, English_Review, summary, followup_message
from langchain.chains import SequentialChain

overall_chain = SequentialChain(
    chains=[chain_one, chain_two, chain_three, chain_four],
    input_variables=["Review"],
    output_variables=["language", "English_Review", "summary", "followup_message"],
    verbose=True
)

In [None]:
review = '''
Je trouve le goût médiocre. La mousse ne tient pas, c'est bizarre. J'achète les mêmes dans le commerce et le goût est bien meilleur...
Vieux lot ou contrefaçon !?
'''
seqchain = overall_chain(review)
seqchain
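
The data flow through the four chains above can be sketched without any LLM calls: each step reads named inputs from a shared dictionary and writes one named output back, which is why chain 4 can use both `summary` and `language`. `run_sequential` and the lambda stubs are hypothetical stand-ins, not LangChain code.

```python
def run_sequential(steps, inputs):
    """Run steps in order; each reads named inputs and writes one named output."""
    state = dict(inputs)
    for input_keys, output_key, func in steps:
        state[output_key] = func(*(state[key] for key in input_keys))
    return state

# Stub functions standing in for the four LLM calls.
steps = [
    (["Review"], "English_Review", lambda review: f"EN({review})"),
    (["English_Review"], "summary", lambda english: f"SUM({english})"),
    (["Review"], "language", lambda review: "French"),
    (
        ["summary", "language"],
        "followup_message",
        lambda summary, lang: f"Reply in {lang} to {summary}",
    ),
]

toy_result = run_sequential(steps, {"Review": "Bon produit."})
for key, value in toy_result.items():
    print(f"{key}: {value}")
```

A step fails with a `KeyError` if it asks for a variable that no earlier step produced, which mirrors why the order of the chains matters.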

## Agent

LangChain is a framework that sits between large language models (LLMs) and tools (Google Drive, Python, Wikipedia, calculators, Wolfram Alpha, etc.). It can also connect to a custom database through a vector store (Pinecone, Milvus, Chroma, etc.), and it provides Agents that can chain different actions together.

**Tools**: Tools connect LangChain to external sources. A tool is something like a function that performs a specific duty external to the LLM, such as a Google search, a database lookup, a Python REPL, mathematical calculations, or a Wikipedia lookup. Tools are the interface between the LLM and external sources.

**Agents**: Think of Agents as ‘Bots’ that make AI do things for you. They are the interface between the LLM and the tools: they figure out the task (what needs to be done) and the tool (which tool is right for this specific task). There are many predefined agents that can already do a lot, and you can also build your own agent for a specific task.

Reference: [Medium - The Agents of AI: Data Analysis with LLMs and LangChain Agents](https://ashukumar27.medium.com/the-agents-of-ai-1402548e9b8c)
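
The decide, act, observe loop that an agent runs can be sketched as a toy in plain Python. All names here are hypothetical: `fake_decide` is a rule-based stand-in for the LLM's choice of tool, and the two tools are deliberately trivial.

```python
def calculator(expression):
    # Toy tool: handles only "a + b".
    left, right = expression.split("+")
    return str(float(left) + float(right))

def word_count(text):
    # Toy tool: counts whitespace-separated words.
    return str(len(text.split()))

TOOLS = {"calculator": calculator, "word_count": word_count}

def fake_decide(task, observations):
    """Stand-in for the LLM: pick a tool, or finish once an observation exists."""
    if observations:
        return ("finish", observations[-1])
    if "+" in task:
        return ("calculator", task)
    return ("word_count", task)

def run_agent(task, max_steps=5):
    observations = []
    for _ in range(max_steps):
        action, action_input = fake_decide(task, observations)
        if action == "finish":
            return action_input
        observations.append(TOOLS[action](action_input))  # act, then observe
    raise RuntimeError("agent did not finish within max_steps")

print(run_agent("2 + 3"))
print(run_agent("how many words are here"))
```

In a real agent the decision step is an LLM call that sees the task, the tool descriptions, and the observations so far; the `max_steps` cap is a common safeguard against loops that never finish.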

Data: [Kaggle - Black Friday](https://www.kaggle.com/datasets/sdolezel/black-friday)

In [None]:
df = pd.read_csv('train.csv')
df.head()

In [None]:
from langchain.agents import create_csv_agent

agent = create_csv_agent(
    llm=llm,
    path='train.csv',
    verbose=True
)

In [None]:
print(agent.agent.llm_chain.prompt.template)

In [None]:
agent.run("How many rows and columns are there in the dataframe?")

In [None]:
agent.run('How many unique user_id are there?')

In [None]:
agent.run('Which user made the most purchase?')

In [None]:
# import plotly.express as px
# agent.run("Plot a histogram of purchase distribution using plotly library")

In [None]:
agent.run('How many people are female?')

In [None]:
agent.run('How many people are female? Note that there can be multiple records for the same person.')
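
The two prompts above can lead to different answers because counting rows is not the same as counting distinct shoppers. A minimal sketch on made-up records shows the gap; the column names `User_ID` and `Gender` are assumptions about the dataset, and the rows are invented for illustration.

```python
# Made-up purchase records; a shopper can appear on several rows.
rows = [
    {"User_ID": 1, "Gender": "F"},
    {"User_ID": 1, "Gender": "F"},
    {"User_ID": 2, "Gender": "M"},
    {"User_ID": 3, "Gender": "F"},
]

female_rows = sum(1 for row in rows if row["Gender"] == "F")
female_users = len({row["User_ID"] for row in rows if row["Gender"] == "F"})

print(female_rows)   # counts purchase records
print(female_users)  # counts distinct shoppers
```

With pandas, the analogous check would be something like `df.loc[df["Gender"] == "F", "User_ID"].nunique()`, again assuming those column names. Running it yourself is a quick way to verify the agent's answer.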

More use cases:
- [Python Agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/python.html)
- [SQL Database Agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/sql_database.html)

## Memory

Reference: [deeplearning.ai - LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain/lesson/3/memory)

### ConversationBufferMemory

In [None]:
## ConversationBufferMemory
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

In [None]:
# print(conversation.prompt.template)

In [None]:
conversation.predict(input="Hi, my name is Yi")

In [None]:
conversation.predict(input="What is 1+1?")

In [None]:
conversation.predict(input="What is my name?")

### ConversationBufferWindowMemory

In [None]:
## ConversationBufferWindowMemory
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

In [None]:
conversation.predict(input="Hi, my name is Yi")

In [None]:
conversation.predict(input="What is 1+1?")

In [None]:
conversation.predict(input="What is my name?")
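
With `k=1`, the window memory keeps only the most recent exchange, so by the third question the introduction is no longer part of the prompt and the model is not expected to recall the name. A minimal sketch of that behaviour, using a hypothetical `WindowMemory` class rather than LangChain's own:

```python
from collections import deque

class WindowMemory:
    """Keep only the last k (human, ai) exchanges."""

    def __init__(self, k):
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        self.turns.append((human, ai))

    def history(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

window = WindowMemory(k=1)
window.save("Hi, my name is Yi", "Hello Yi!")
window.save("What is 1+1?", "1+1 is 2.")

# Only the latest exchange survives, so the name is gone from the history.
print(window.history())
```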

### ConversationSummaryBufferMemory

In [None]:
## ConversationSummaryBufferMemory
from langchain.memory import ConversationSummaryBufferMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your PowerPoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At Noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(
    llm=llm, 
    max_token_limit=100
)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, {"output": f"{schedule}"})
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
conversation.predict(input="What would be a good demo to show?")
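
The idea behind a summary buffer is to keep recent messages verbatim while they fit a token budget and to fold older ones into a running summary. The sketch below mimics that with hypothetical helpers: `count_tokens` approximates tokens by whitespace-separated words, and `fake_summarize` stands in for the LLM call that would write the actual summary.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())

def fake_summarize(summary, dropped):
    # Stand-in for the LLM summarizer: just records how much was folded in.
    return f"{summary} [{len(dropped)} older message(s) summarized]".strip()

class SummaryBufferMemory:
    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.buffer = []

    def save(self, message):
        self.buffer.append(message)
        dropped = []
        # Drop the oldest messages until the buffer fits the budget.
        while self.buffer and sum(map(count_tokens, self.buffer)) > self.max_token_limit:
            dropped.append(self.buffer.pop(0))
        if dropped:
            self.summary = fake_summarize(self.summary, dropped)

toy_memory = SummaryBufferMemory(max_token_limit=8)
for text in ["Hello", "What's up", "Not much, just hanging", "What is on the schedule today?"]:
    toy_memory.save(text)

print(toy_memory.summary)
print(toy_memory.buffer)
```

In the notebook cell above, the schedule text is much longer than the 100-token limit, so it is a natural candidate for being summarized rather than kept verbatim.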