# Getting Started with LangChain: A Beginner’s Guide to Building LLM-Powered Applications
https://towardsdatascience.com/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c

LangChain (version 0.0.147) covers six modules:

- Models: Choosing from different LLMs and embedding models
- Prompts: Managing LLM inputs
- Chains: Combining LLMs with other components
- Indexes: Accessing external data
- Memory: Remembering previous conversations
- Agents: Accessing other tools


In [23]:
import sys
!{sys.executable} -m pip install --upgrade pyopenssl ndg-httpsclient pyasn1
!{sys.executable} -m pip install --upgrade youtube-transcript-api
!{sys.executable} -m pip install --upgrade wikipedia


Collecting wikipedia
  Downloading wikipedia-1.4.0.tar.gz (27 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (pyproject.toml) ... done
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11680 sha256=c9f98ee1535a03b48de8a679602cf2b4b5e168e88b1e9ec1f8a213342c43caa5
  Stored in directory: /Users/ytchen/Library/Caches/pip/wheels/c2/46/f4/caa1bee71096d7b0cdca2f2a2af45cacf35c5760bee8f00948
Successfully built wikipedia
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0


In [13]:
from getpass import getpass
OPENAI_API_KEY = getpass('Enter your OpenAI key: ')


Enter your OpenAI key: ········


In [14]:
import os
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY


In [3]:
# Proprietary LLM from e.g. OpenAI
# pip install openai
from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")

# Alternatively, open-source LLM hosted on Hugging Face
# pip install huggingface_hub
#from langchain import HuggingFaceHub
#llm = HuggingFaceHub(repo_id = "google/flan-t5-xl")


- LLMs take a string as an input (prompt) and output a string (completion).

In [4]:
# The LLM takes a prompt as an input and outputs a completion
prompt = "Alice has a parrot. What animal is Alice's pet?"
completion = llm(prompt)
print(completion)




Alice's pet is a parrot.


In [6]:
import openai
import os

# Set up the OpenAI API client
openai.api_key = os.environ['OPENAI_API_KEY']

# Prepare the API request
prompt = "Once upon a time in a land far, far away,"
model = "text-davinci-002"  # You can replace this with your preferred model
#model = "gpt-3.5-turbo"
max_tokens = 50

# Make the API call
response = openai.Completion.create(
    engine=model,
    prompt=prompt,
    max_tokens=max_tokens,
    n=1,
    stop=None,
    temperature=0.7,
)

# Print the generated text
generated_text = response.choices[0].text
print(prompt + generated_text)


Once upon a time in a land far, far away, there lived a beautiful princess. The princess was loved by everyone in her kingdom, except for her wicked stepmother. The stepmother was always jealous of the princess and wished she could be rid of her.

One day, the stepmother asked


- Text embedding models take text as input and return a list of floats (an embedding), which is a numerical representation of the input text.

In [7]:
# Proprietary text embedding model from e.g. OpenAI
# pip install tiktoken
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

# Alternatively, open-source text embedding model hosted on Hugging Face
# pip install sentence_transformers
#from langchain.embeddings import HuggingFaceEmbeddings
#embeddings = HuggingFaceEmbeddings(model_name = "sentence-transformers/all-MiniLM-L6-v2")

# The embeddings model takes a text as an input and outputs a list of floats
text = "Alice has a parrot. What animal is Alice's pet?"
text_embedding = embeddings.embed_query(text)


In [8]:
print(text_embedding)


[0.013288114219903946, -0.009326402097940445, -0.008071756921708584, -0.011409236118197441, -0.013473529368638992, 0.011149654164910316, -0.005432675592601299, 0.0032571330666542053, -0.0020812894217669964, -0.0018170722760260105, -0.002262069610878825, 0.02381971664726734, 0.0023053332697600126, -0.02585928700864315, -0.0012044284958392382, 0.010129868984222412, 0.03948114812374115, 0.015389489941298962, 0.033869240432977676, -0.017095312476158142, -0.011761525645852089, 0.021298065781593323, -0.006761486642062664, -0.017886418849229813, 0.0030933492816984653, 0.00839314330369234, 0.012546451762318611, -0.011081668548285961, 0.023263469338417053, -0.014697272330522537, 0.02427707426249981, 0.008189186453819275, -0.01813364028930664, -0.017676280811429024, -0.02551317773759365, -0.005809687077999115, 0.012033469043672085, 0.006545168813318014, 0.025154707953333855, 0.005661354400217533, 0.002897117752581835, 0.019431548193097115, 0.00015084327606018633, -0.0015041836304590106, -0.01496
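Embeddings become useful once two of them are compared. A common measure is cosine similarity. The sketch below uses plain Python and three short made-up vectors in place of real OpenAI embeddings; the vectors and the `cosine_similarity` helper are illustrative assumptions, not part of LangChain.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional vectors standing in for real embeddings
parrot = [0.9, 0.1, 0.0]
bird = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Related texts should score higher than unrelated ones
print(cosine_similarity(parrot, bird) > cosine_similarity(parrot, invoice))
```

With real embeddings the same comparison would be applied to the lists returned by `embed_query`.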

- Prompts: Managing LLM inputs. Zero-shot setting: the prompt contains only the task, with no worked examples.

In [9]:
from langchain import PromptTemplate

template = "What is a good name for a company that makes {product}?"

prompt = PromptTemplate(
    input_variables=["product"],
    template=template,
)

prompt.format(product="colorful socks")


'What is a good name for a company that makes colorful socks?'
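For a simple template like this one, filling in the placeholder behaves much like Python's built-in `str.format`. The sketch below reproduces the result without LangChain. It is only an approximation of what `PromptTemplate` does, since the class also validates the declared input variables.

```python
template = "What is a good name for a company that makes {product}?"

# Filling a single named placeholder is ordinary string formatting
filled = template.format(product="colorful socks")
print(filled)
```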

- Few-shot setting: the prompt includes a few worked examples ahead of the actual input.

In [10]:
from langchain import PromptTemplate, FewShotPromptTemplate

examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

example_template = """
Word: {word}
Antonym: {antonym}\n
"""

example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template=example_template,
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n",
)

few_shot_prompt.format(input="big")


'Give the antonym of every input\n\nWord: happy\nAntonym: sad\n\n\n\nWord: tall\nAntonym: short\n\n\nWord: big\nAntonym:'
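The runs of newlines in the output above are easier to follow when the prompt is assembled by hand. The sketch below is a plain-Python approximation, not LangChain's actual implementation: it formats each example, joins prefix, examples and suffix with the separator, and then fills in the input.

```python
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

# Same text as example_template above, written with explicit escapes
example_template = "\nWord: {word}\nAntonym: {antonym}\n\n"
prefix = "Give the antonym of every input"
suffix = "Word: {input}\nAntonym:"
separator = "\n"

# Format each example, then join prefix, examples and suffix
example_strings = [example_template.format(**ex) for ex in examples]
pieces = [prefix, *example_strings, suffix]
few_shot_text = separator.join(pieces).format(input="big")

print(repr(few_shot_text))
```

The extra blank lines come from the newlines at both ends of `example_template` being combined with the separator.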

- Chains: Combining LLMs with other components

In [15]:
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain only specifying the input variable.
chain.run("colorful socks")


'\n\nRainbow Toes.'

In [16]:
from langchain.chains import LLMChain, SimpleSequentialChain

# Define the first chain as in the previous code example
# ...

# Create a second chain with a prompt template and an LLM
second_prompt = PromptTemplate(
    input_variables=["company_name"],
    template="Write a catchphrase for the following company: {company_name}",
)

chain_two = LLMChain(llm=llm, prompt=second_prompt)

# Combine the first and the second chain 
overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=True)

# Run the chain specifying only the input variable for the first chain.
catchphrase = overall_chain.run("colorful socks")




> Entering new SimpleSequentialChain chain...


Socktastic!


"Get your feet ready for Socktastic!"

> Finished chain.

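The idea behind a sequential chain is that each step's output becomes the next step's input. The sketch below shows that pattern with two stub functions in place of LLM calls. The function names and canned replies are hypothetical, chosen to mirror the run above.

```python
def name_company(product):
    # Stand-in for the first LLMChain (canned reply, no LLM call)
    return "Socktastic!"

def write_catchphrase(company_name):
    # Stand-in for the second LLMChain
    return f'"Get your feet ready for {company_name}"'

def run_sequential(steps, first_input):
    """Feed each step's output into the next step."""
    value = first_input
    for step in steps:
        value = step(value)
    return value

result = run_sequential([name_company, write_catchphrase], "colorful socks")
print(result)
```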

- Indexes: Accessing external data

- Q&A based on a YouTube video

In [19]:
# pip install youtube-transcript-api
# pip install pytube

from langchain.document_loaders import YoutubeLoader

loader = YoutubeLoader.from_youtube_url("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
    
documents = loader.load()


In [20]:
# pip install faiss-cpu
from langchain.vectorstores import FAISS

# Create the vector store to use as the index
db = FAISS.from_documents(documents, embeddings)


In [21]:
from langchain.chains import RetrievalQA

retriever = db.as_retriever()

qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever, 
    return_source_documents=True)

query = "What am I never going to do?"
result = qa({"query": query})

print(result['result'])


 I'm never going to say goodbye, never gonna make you cry, and never gonna let you down.
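Conceptually, the retrieval step picks the chunks most relevant to the query, and the "stuff" chain type places them directly into the prompt. The toy sketch below illustrates that flow under loud assumptions: it scores chunks by word overlap instead of embeddings and FAISS, and the chunks are hypothetical text, not the real transcript.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query, chunks, k=1):
    """Return the k chunks sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_words & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical transcript chunks, not the real video transcript
chunks = [
    "We are no strangers to love.",
    "I am never going to give you up.",
    "You know the rules and so do I.",
]

query = "What am I never going to do?"
context = "\n".join(retrieve(query, chunks))

# "Stuffing": place the retrieved text directly into the prompt
stuffed_prompt = f"Use this context to answer.\n\n{context}\n\nQuestion: {query}"
print(stuffed_prompt)
```

In the notebook, the vector store plays the role of `retrieve` and the LLM answers the stuffed prompt.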


- Memory: Remembering previous conversations
- conversational memory

In [22]:
from langchain import ConversationChain

conversation = ConversationChain(llm=llm, verbose=True)

conversation.predict(input="Alice has a parrot.")

conversation.predict(input="Bob has two cats.")

conversation.predict(input="How many pets do Alice and Bob have?")




> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Alice has a parrot.
AI:

> Finished chain.


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Alice has a parrot.
AI:  Interesting! What kind of parrot does Alice have?
Human: Bob has two cats.
AI:

> Finished chain.


> Entering new ConversationChain chain...
Prompt after formatting:
The following is 

' I believe Alice has one pet, which is a parrot, and Bob has two pets, which are cats.'
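The verbose output shows how the memory works: earlier turns are stored and prepended to each new prompt. The sketch below mimics that with a toy buffer. `BufferMemory` and `build_prompt` are hypothetical names for illustration, not LangChain classes.

```python
class BufferMemory:
    """Toy conversation buffer that stores (human, ai) turns."""

    def __init__(self):
        self.turns = []

    def add_turn(self, human, ai):
        self.turns.append((human, ai))

    def as_text(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

def build_prompt(memory, new_input):
    """Prepend the stored history to the new user input."""
    return f"Current conversation:\n{memory.as_text()}\nHuman: {new_input}\nAI:"

memory = BufferMemory()
memory.add_turn(
    "Alice has a parrot.",
    "Interesting! What kind of parrot does Alice have?",
)
next_prompt = build_prompt(memory, "Bob has two cats.")
print(next_prompt)
```

Because the whole history is resent on every call, the prompt grows with each turn.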

- Agents: Accessing other tools

In [1]:
# pip install wikipedia
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType

tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools,
                         llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                         verbose=True)

agent.run("When was Barack Obama born? How old was he in 2022?")
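An agent answers a question like this by calling tools in turn: a lookup for the birth date, then a calculation. The sketch below hardcodes that plan to show the tool-dispatch idea only. A real agent lets the LLM choose the tool at each step, and the two stub tools here are hypothetical stand-ins for `wikipedia` and `llm-math`.

```python
from datetime import date

def lookup_birth_date(name):
    # Stand-in for a Wikipedia lookup (hardcoded, hypothetical tool)
    known = {"Barack Obama": date(1961, 8, 4)}
    return known[name]

def years_between(start_year, end_year):
    # Stand-in for the calculator tool
    return end_year - start_year

tools = {"lookup": lookup_birth_date, "math": years_between}

# A fixed plan replaces the LLM's step-by-step tool choices
born = tools["lookup"]("Barack Obama")
age_turned_in_2022 = tools["math"](born.year, 2022)
print(born.isoformat(), age_turned_in_2022)
```

The subtraction gives the age reached on the 2022 birthday; before August 4 of that year the age would be one less.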