# [LangChain Crash Course](https://www.python-engineer.com/posts/langchain-crash-course/)

[https://youtu.be/LbT1yp6quS8](https://youtu.be/LbT1yp6quS8)

LangChain is a framework for developing applications powered by language models. In this crash course you will learn how to build applications on top of large language models, covering each of the framework's main features in turn.

- [GitHub repo](https://github.com/hwchase17/langchain)
- [Official Docs](https://python.langchain.com/en/latest/index.html)

## Overview:

- Installation
- LLMs
- Prompt Templates
- Chains
- Agents and Tools
- Memory
- Document Loaders
- Indexes

Try out all the code in this [Google Colab](https://colab.research.google.com/drive/1VOwJpcZqOXag-ZXi-52ibOx6L5Pw-YJi?usp=sharing).

## Installation

In [None]:
# !pip install langchain

## LLMs

LangChain provides a generic interface for many different LLMs. Most of them work via their API but you can also run local models.

See all [LLM providers](https://python.langchain.com/en/latest/modules/models/llms/integrations.html).
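
The idea of a generic interface can be sketched without LangChain at all. The snippet below is only an illustration: `TextModel`, `EchoModel`, `UpperCaseModel`, and `ask` are hypothetical names, not part of the library, and the two "models" are trivial stand-ins for an API-backed and a local model.

```python
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical minimal interface: prompt string in, completion string out."""

    def __call__(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in for an API-backed model; it just echoes the prompt."""

    def __call__(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperCaseModel:
    """Stand-in for a local model; it upper-cases the prompt."""

    def __call__(self, prompt: str) -> str:
        return prompt.upper()


def ask(model: TextModel, prompt: str) -> str:
    # The caller depends only on the shared call signature, not on the provider.
    return model(prompt)


answers = [ask(model, "hello") for model in (EchoModel(), UpperCaseModel())]
print(answers)
```

Because `ask` only relies on the call signature, swapping one provider for another does not change the calling code, which is the property the LLM wrappers are meant to give you.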

In [1]:
# pip install openai

import os
# os.environ["OPENAI_API_KEY"] ="YOUR_OPENAI_TOKEN"

In [2]:
from langchain.llms import OpenAI

llm = OpenAI(temperature=0.9)  # model_name="text-davinci-003"
text = "What would be a good company name for a company that makes colorful socks?"
print(llm(text))

QuirkySox.


In [3]:
# os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HUGGINGFACEHUB_API_TOKEN"

In [7]:
# pip install huggingface_hub

from langchain import HuggingFaceHub

# https://huggingface.co/google/flan-t5-xl
# llm = HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature":0, "max_length":64})
llm = HuggingFaceHub(repo_id="google/flan-t5-large", model_kwargs={"temperature":0, "max_length":64})

llm("translate English to German: How old are you?")

'Wie alte sind Sie?'

## Prompt Templates

LangChain facilitates prompt management and optimization.

Normally, when you use an LLM in an application, you are not sending user input directly to the LLM. Instead, you need to take the user input and construct a prompt, and only then send that to the LLM.

In [8]:
llm("Can Barack Obama have a conversation with George Washington?")

'no'

A better prompt is this:

In [9]:
prompt = """Question: Can Barack Obama have a conversation with George Washington?

Let's think step by step.

Answer: """
llm(prompt)

'Barack Obama was born in 1961. George Washington died in 1789. So the answer is no.'

This can be achieved with `PromptTemplates`:

In [10]:
from langchain import PromptTemplate

template = """Question: {question}

Let's think step by step.

Answer: """

prompt = PromptTemplate(template=template, input_variables=["question"])
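
Conceptually, filling a template like this is plain string substitution. As a rough sketch (the helper `format_prompt` is a hypothetical name, and this is not how `PromptTemplate` is implemented internally), the same result can be produced with Python's built-in `str.format`:

```python
# Same template text as in the cell above.
template = """Question: {question}

Let's think step by step.

Answer: """


def format_prompt(template: str, **variables: str) -> str:
    # str.format replaces each {name} placeholder with the matching keyword argument.
    return template.format(**variables)


formatted = format_prompt(
    template,
    question="Can Barack Obama have a conversation with George Washington?",
)
print(repr(formatted))
```

The value of a dedicated template object over a bare format string is mostly bookkeeping: it records which input variables are expected, so a missing one can be caught early.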

In [11]:
prompt.format(question="Can Barack Obama have a conversation with George Washington?")

"Question: Can Barack Obama have a conversation with George Washington?\n\nLet's think step by step.\n\nAnswer: "

## Chains

Combine LLMs and Prompts in multi-step workflows.

In [12]:
from langchain import LLMChain

llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "Can Barack Obama have a conversation with George Washington?"

print(llm_chain.run(question))

Barack Obama was born in 1961. George Washington died in 1789. So the answer is no.
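
To make the two steps inside such a chain explicit, here is a minimal sketch in plain Python. `fake_llm` and `make_chain` are hypothetical names used only for illustration; `fake_llm` returns a canned string instead of calling a real model.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns a canned reply.
    return f"[model reply to {len(prompt)} characters of prompt]"


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[[str], str]:
    def run(question: str) -> str:
        prompt = template.format(question=question)  # step 1: build the prompt
        return llm(prompt)                           # step 2: call the model
    return run


chain = make_chain("Question: {question}\n\nAnswer: ", fake_llm)
reply = chain("Is water wet?")
print(reply)
```

Longer workflows follow the same pattern: the output of one step becomes the input of the next.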


## Agents and Tools

Agents involve an LLM deciding which actions to take, taking that action, seeing an observation, and repeating that until done.

When used correctly agents can be extremely powerful. In order to load agents, you should understand the following concepts:

- Tool: A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. See available [Tools](https://python.langchain.com/en/latest/modules/agents/tools.html).
- LLM: The language model powering the agent.
- Agent: The agent to use. See also [Agent Types](https://python.langchain.com/en/latest/modules/agents/agents/agent_types.html).
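
The decide, act, observe loop described above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the two tools are toy functions, and `scripted_policy` is a hard-coded stand-in for the decisions a real LLM would make.

```python
from typing import Callable, Dict, List, Tuple


def lookup(query: str) -> str:
    # Toy "search" tool backed by a one-entry table.
    facts = {"release year of The Departed": "2006"}
    return facts.get(query, "unknown")


def power(expression: str) -> str:
    # Toy "calculator" tool that understands only "base^exponent".
    base, exponent = expression.split("^")
    return str(float(base) ** float(exponent))


TOOLS: Dict[str, Callable[[str], str]] = {"lookup": lookup, "power": power}

Step = Tuple[str, str, str]  # (action, action input, observation)


def scripted_policy(history: List[Step]) -> Tuple[str, str]:
    """Stand-in for the LLM: pick the next (action, input) from past observations."""
    if not history:
        return "lookup", "release year of The Departed"
    if len(history) == 1:
        year = history[0][2]
        return "power", f"{year}^0.43"
    return "finish", history[-1][2]


def run_agent(max_steps: int = 5) -> str:
    history: List[Step] = []
    for _ in range(max_steps):
        action, action_input = scripted_policy(history)        # decide
        if action == "finish":
            return action_input
        observation = TOOLS[action](action_input)               # act
        history.append((action, action_input, observation))    # observe
    raise RuntimeError("agent did not finish within max_steps")


result = run_agent()
print(result)
```

In a real agent the policy is the language model itself: it reads the tool descriptions and the history of observations, then emits the next action as text.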

In [13]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent

# pip install wikipedia

from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

In [14]:
agent.run("In what year was the film Departed with Leonardo DiCaprio released? What is this year raised to the 0.43 power?")

> Entering new AgentExecutor chain...
 I need to find out the year the film was released and then use the calculator to calculate the power.
Action: Wikipedia
Action Input: Departed with Leonardo Dicaprio
Observation: Page: Leonardo DiCaprio filmography
Summary: Leonardo DiCaprio is an American actor who began his career performing as a child on television. He appeared on the shows The New Lassie (1989) and Santa Barbara (1990) and also had long running roles in the comedy-drama Parenthood (1990) and the sitcom Growing Pains (1991). DiCaprio played Tobias "Toby" Wolff opposite Robert De Niro in the biographical coming-of-age drama This Boy's Life in 1993. In the same year, he had a supporting role as a developmentally disabled boy Arnie Grape in What's Eating Gilbert Grape, which earned him nominations for the Academy Award for Best Supporting Actor and the Golden Globe Award for Best Supporting Actor – Motion Picture. In 1995, DiCaprio played th

'The film Departed with Leonardo DiCaprio was released in 2006 and this year raised to the 0.43 power is 26.30281917656938.'

## Memory

Add state to Chains and Agents.

Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
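
As a rough mental model, the simplest form of memory is a buffer of past turns that gets prepended to every new prompt. The sketch below assumes exactly that and nothing more; `BufferedConversation` and `counting_llm` are hypothetical names, and the "model" only counts how many human turns it can see.

```python
from typing import Callable, List


class BufferedConversation:
    """Hypothetical helper that keeps every turn and replays it in the next prompt."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.turns: List[str] = []  # state persisted between calls

    def predict(self, user_input: str) -> str:
        self.turns.append(f"Human: {user_input}")
        prompt = "Current conversation:\n" + "\n".join(self.turns) + "\nAI:"
        reply = self.llm(prompt)
        self.turns.append(f"AI: {reply}")
        return reply


def counting_llm(prompt: str) -> str:
    # Stand-in model: reports how many human turns are visible in the prompt.
    return f"I can see {prompt.count('Human:')} human turn(s)."


conversation = BufferedConversation(counting_llm)
first = conversation.predict("Hi there!")
second = conversation.predict("Can we talk about AI?")
print(first)
print(second)
```

The second reply sees both human turns because the buffer survived the first call. A plain buffer grows without bound, which is one reason a library would offer several memory implementations rather than just this one.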

In [15]:
from langchain import OpenAI, ConversationChain

llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)

conversation.predict(input="Hi there!")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi there!
AI:

> Finished chain.


" Hi there! It's nice to meet you. My name is AI. What's your name?"

In [16]:
conversation.predict(input="Can we talk about AI?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi there!
AI:  Hi there! It's nice to meet you. My name is AI. What's your name?
Human: Can we talk about AI?
AI:

> Finished chain.


' Sure! What would you like to know about AI?'

In [17]:
conversation.predict(input="I'm interested in Reinforcement Learning.")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi there!
AI:  Hi there! It's nice to meet you. My name is AI. What's your name?
Human: Can we talk about AI?
AI:  Sure! What would you like to know about AI?
Human: I'm interested in Reinforcement Learning.
AI:

> Finished chain.


' Reinforcement Learning is a type of machine learning algorithm that allows AI agents to learn from their environment by taking actions and receiving rewards for those actions. It is a powerful tool for training AI agents to solve complex tasks.'

## Document Loaders

Combining language models with your own text data is a powerful way to differentiate them. The first step in doing this is to load the data into documents (i.e., some pieces of text). This module is aimed at making this easy.

See all [available Document Loaders](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html).

In [18]:
from langchain.document_loaders import NotionDirectoryLoader

loader = NotionDirectoryLoader("Notion_DB")

docs = loader.load()
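
What a loader produces can be pictured as a list of small records, each holding a piece of text plus metadata about where it came from. The following sketch mimics that with the standard library only. `SimpleDocument` and `load_directory` are hypothetical stand-ins, and the `*.md` pattern is an assumption about the export format, not something taken from the Notion loader.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a loaded document: text plus metadata."""

    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def load_directory(directory: Path, pattern: str = "*.md") -> List[SimpleDocument]:
    # One document per matching file, with the file path recorded as its source.
    return [
        SimpleDocument(path.read_text(encoding="utf-8"), {"source": str(path)})
        for path in sorted(directory.rglob(pattern))
    ]


# Build a throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.md").write_text("# Page A\nFirst note.", encoding="utf-8")
    (folder / "b.md").write_text("# Page B\nSecond note.", encoding="utf-8")
    loaded = load_directory(folder)

print(len(loaded), loaded[0].metadata["source"])
```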

## Indexes

Indexes refer to ways to structure documents so that LLMs can best interact with them. This module contains utility functions for working with documents.

- Embeddings: An embedding is a numerical representation of a piece of information, for example, text, documents, images, audio, etc.
- Text Splitters: When you want to deal with long pieces of text, it is necessary to split up that text into chunks.
- Vectorstores: Vector databases store and index vector embeddings from NLP models to understand the meaning and context of strings of text, sentences, and whole documents for more accurate and relevant search results. See [available vectorstores](https://python.langchain.com/en/latest/modules/indexes/vectorstores.html).
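
To show how embeddings and a vector store fit together, here is a deliberately tiny sketch. It assumes a toy embedding (word counts) and a brute-force cosine-similarity search. Real embedding models return dense float vectors and real vector stores use specialised indexes, so `embed`, `cosine`, and `ToyVectorStore` are illustrative names only.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    # Toy embedding: word counts keyed by lower-cased word.
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self, texts: List[str]) -> None:
        # Store each text next to its vector, computed once at insert time.
        self.entries: List[Tuple[str, Dict[str, float]]] = [(t, embed(t)) for t in texts]

    def similarity_search(self, query: str, k: int = 1) -> List[str]:
        query_vec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "the senate should pass the voting rights act",
    "the president nominated a judge to the supreme court",
    "the economy added new jobs this year",
])
top = store.similarity_search("who did the president nominate to the court")
print(top)
```

Word counts only match exact words ("nominate" and "nominated" count as different), which is precisely the limitation learned embeddings are meant to overcome.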

In [20]:
import requests

url = "https://raw.githubusercontent.com/hwchase17/langchain/master/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
  f.write(res.text)

In [21]:
# Document Loader
from langchain.document_loaders import TextLoader
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()

In [22]:
# Text Splitter
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
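
The roles of `chunk_size` and `chunk_overlap` are easiest to see on a short string. The sketch below is a simplified fixed-width sliding window, not the library's algorithm (`CharacterTextSplitter` also takes a separator into account); `split_text` is a hypothetical helper.

```python
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij', 'j']
```

With an overlap of 1, the last character of each chunk is repeated as the first character of the next, so text that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.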

In [23]:
# !pip install sentence_transformers

# Embeddings
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()

In [26]:
text = "This is a test document."
query_result = embeddings.embed_query(text)
query_result

[-0.048951826989650726,
 -0.039862025529146194,
 -0.021562788635492325,
 0.009908532723784447,
 -0.03810392692685127,
 0.01268437597900629,
 0.04349449276924133,
 0.07183387875556946,
 0.009748621843755245,
 -0.006987075321376324,
 0.06352806836366653,
 -0.030322620645165443,
 0.013839483261108398,
 0.025805899873375893,
 -0.001136284088715911,
 -0.014563587494194508,
 0.041640330106019974,
 0.03622831776738167,
 -0.026800833642482758,
 0.02512073889374733,
 -0.024978604167699814,
 -0.0045332652516663074,
 -0.026667168363928795,
 0.004100714810192585,
 -0.052048057317733765,
 -0.009930485859513283,
 -0.05206530913710594,
 0.00899210199713707,
 -0.038300469517707825,
 -0.04405846446752548,
 -0.0042044054716825485,
 0.07047972828149796,
 0.005133941303938627,
 -0.07161542028188705,
 1.697531274658104e-06,
 -0.0060477107763290405,
 -0.01107643824070692,
 0.017513373866677284,
 -0.02229987643659115,
 0.040954913944005966,
 0.03379019349813461,
 0.056650374084711075,
 -0.07114940881729126,


In [27]:
doc_result = embeddings.embed_documents([text])
doc_result

[[-0.048951826989650726,
  -0.039862025529146194,
  -0.021562788635492325,
  0.009908532723784447,
  -0.03810392692685127,
  0.01268437597900629,
  0.04349449276924133,
  0.07183387875556946,
  0.009748621843755245,
  -0.006987075321376324,
  0.06352806836366653,
  -0.030322620645165443,
  0.013839483261108398,
  0.025805899873375893,
  -0.001136284088715911,
  -0.014563587494194508,
  0.041640330106019974,
  0.03622831776738167,
  -0.026800833642482758,
  0.02512073889374733,
  -0.024978604167699814,
  -0.0045332652516663074,
  -0.026667168363928795,
  0.004100714810192585,
  -0.052048057317733765,
  -0.009930485859513283,
  -0.05206530913710594,
  0.00899210199713707,
  -0.038300469517707825,
  -0.04405846446752548,
  -0.0042044054716825485,
  0.07047972828149796,
  0.005133941303938627,
  -0.07161542028188705,
  1.697531274658104e-06,
  -0.0060477107763290405,
  -0.01107643824070692,
  0.017513373866677284,
  -0.02229987643659115,
  0.040954913944005966,
  0.03379019349813461,
  0.0

In [29]:
# !pip install faiss-cpu

from langchain.vectorstores import FAISS

db = FAISS.from_documents(docs, embeddings)

query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)

In [30]:
print(docs[0].page_content)

Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. 

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.


In [31]:
# Save and load:
db.save_local("faiss_index")
new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
print(docs[0].page_content)

Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. 

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.


## End-to-end example

Check out the [chat-langchain](https://github.com/hwchase17/chat-langchain) repo.