# 01-LangChain-overview

LangChain is a framework for developing applications powered by language models. It helps you:
- Connect a language model to other sources of data
- Let a language model interact with its environment

https://docs.langchain.com/docs/

Why LangChain?
- Components - LangChain makes it easy to swap out abstractions and components necessary to work with language models.

- Customized Chains - LangChain provides out of the box support for using and customizing 'chains' - a series of actions strung together.

- Speed - This team ships insanely fast. You'll be up to date with the latest LLM features.

- Community - Wonderful discord and community support, meet ups, hackathons, etc.

In [None]:
%%capture
!pip install langchain
!pip install openai
!pip install google-search-results
!pip install tiktoken
!pip install faiss-cpu
!pip install chromadb
!pip install unstructured
!pip install pypdf
!pip install jq

In [None]:
import os
import getpass
import openai

APIKEY = getpass.getpass("APIKEY:")
os.environ["OPENAI_API_KEY"] = APIKEY

APIKEY:··········


## OpenAI

- Legacy models (2020–2022): text-davinci-003, text-davinci-002, davinci, curie, babbage, ada
- Newer models (2023–): gpt-4, gpt-3.5-turbo

In [None]:
openai_models = openai.Model.list()

model_ids = [model["id"] for model in openai_models["data"]]
print(len(model_ids))
print(model_ids)

56
['text-davinci-001', 'text-search-curie-query-001', 'davinci', 'text-babbage-001', 'curie-instruct-beta', 'text-davinci-003', 'davinci-similarity', 'code-davinci-edit-001', 'text-similarity-curie-001', 'text-embedding-ada-002', 'ada-code-search-text', 'text-search-ada-query-001', 'gpt-4-0314', 'babbage-search-query', 'ada-similarity', 'gpt-3.5-turbo', 'gpt-4-0613', 'text-search-ada-doc-001', 'text-search-babbage-query-001', 'code-search-ada-code-001', 'curie-search-document', 'text-search-davinci-query-001', 'text-search-curie-doc-001', 'babbage-search-document', 'babbage-code-search-text', 'davinci-instruct-beta', 'davinci-search-query', 'text-similarity-babbage-001', 'text-davinci-002', 'code-search-babbage-text-001', 'babbage', 'text-search-davinci-doc-001', 'code-search-ada-text-001', 'ada-search-query', 'text-similarity-ada-001', 'whisper-1', 'gpt-4', 'ada-code-search-code', 'ada', 'text-davinci-edit-001', 'davinci-search-document', 'gpt-3.5-turbo-16k-0613', 'curie-search-query

In [None]:
# Set key
openai.api_key = os.environ["OPENAI_API_KEY"]

**openai.Completion.create**

- Given a prompt, the model will return one or more predicted completions, and can also return the probabilities of alternative tokens at each position.
- Parameters:
  - model: ID of the model
  - prompt: a string, array of strings
  - suffix: The suffix that comes after a completion of inserted text.
  - max_tokens: The maximum number of tokens
  - temperature: a value between 0 and 2. Lower values make the output more deterministic; higher values make it more random. OpenAI's documentation generally recommends altering this or top_p, but not both.
  - top_p: top_p probability mass. 0.1 means only the tokens comprising the top 10% probability mass are considered.
  - n: Number of completions output.
  - ...
- https://platform.openai.com/docs/api-reference/completions
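
To make `temperature` and `top_p` from the list above more concrete, here is a minimal sketch over a toy next-token distribution. The tokens and scores are made up, and `apply_temperature` and `top_p_filter` are hypothetical helpers written for this illustration; they are not part of the OpenAI API, where sampling happens on the server.

```python
import math

def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    # This sketch requires temperature > 0 to avoid dividing by zero.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_score = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - max_score) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}  # renormalize what is left

# Toy scores for the word after "ice" (illustrative values only)
toy_logits = {"cream": 2.0, "cold": 1.0, "sweet": 0.5, "rocket": -1.0}

sharp = apply_temperature(toy_logits, 0.5)   # more peaked on "cream"
flat = apply_temperature(toy_logits, 2.0)    # probability spread more evenly
nucleus = top_p_filter(apply_temperature(toy_logits, 1.0), 0.5)

print(round(sharp["cream"], 3), round(flat["cream"], 3))
print(nucleus)
```

With these toy numbers, "cream" alone already covers more than half of the probability mass at temperature 1, so a `top_p` of 0.5 leaves it as the only candidate.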

In [None]:
# Use openai: text-davinci-003

import openai

response = openai.Completion.create(
  model="text-davinci-003",
  prompt="Write a tagline for an ice cream shop."
)

In [None]:
print(type(response))
print(response['choices'][0]['text'])
print('\n')
response

<class 'openai.openai_object.OpenAIObject'>


"I scream for our delicious ice cream!"




<OpenAIObject text_completion id=cmpl-7mEB1XTrf7ZLVui4WwKnjE6twd1gq at 0x7e3cc34cbf10> JSON: {
  "id": "cmpl-7mEB1XTrf7ZLVui4WwKnjE6twd1gq",
  "object": "text_completion",
  "created": 1691728595,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\n\"I scream for our delicious ice cream!\"",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 11,
    "total_tokens": 21
  }
}

**openai.ChatCompletion**

- Chat models take a list of messages as input and return a model-generated message as output.
- Parameters:
  - model: model id
  - messages: the conversation so far, as a list of dictionaries, each with a role (system, user, or assistant) and content.
  - functions: Function calling allows you to more reliably get structured data back from the model.
  - temperature: a value between 0 and 2. Lower values make the output more deterministic; higher values make it more random.
  - top_p: top_p probability mass. 0.1 means only the tokens comprising the top 10% probability mass are considered.
  - n: How many chat completion choices to generate
  - stop: Up to 4 sequences where the API will stop generating further tokens.
  - max_tokens: The maximum number of tokens to generate in the chat completion.
  - presence_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
  - frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
  - logit_bias: Modify the likelihood of specified tokens appearing in the completion.
  - user: A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.

- https://platform.openai.com/docs/api-reference/chat/create
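
The difference between `presence_penalty` and `frequency_penalty` is easier to see in code. The sketch below follows the adjustment described in OpenAI's parameter documentation (each candidate token's score is reduced by its count times the frequency penalty, plus a flat presence penalty if it has appeared at all). The function name `penalize_logits`, the tokens, and the scores are assumptions made for illustration.

```python
from collections import Counter

def penalize_logits(logits, generated_tokens, presence_penalty=0.0, frequency_penalty=0.0):
    """Lower the score of tokens that already appear in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for tok, score in logits.items():
        count = counts[tok]  # 0 for tokens not generated yet
        adjusted[tok] = (
            score
            - count * frequency_penalty                        # grows with every repeat
            - (1.0 if count > 0 else 0.0) * presence_penalty   # flat, applied once seen
        )
    return adjusted

# Toy scores and history (illustrative values only)
toy_logits = {"beach": 1.5, "museum": 1.0, "market": 0.8}
history = ["beach", "beach", "museum"]

adjusted = penalize_logits(toy_logits, history, presence_penalty=0.5, frequency_penalty=0.25)
print(adjusted)
```

Here "beach" was the top candidate before the penalties, but after being generated twice it drops below the unseen "market".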

In [None]:
# Use openai: gpt-3.5-turbo

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

In [None]:
print(type(response))
print(response['choices'][0]['message']['content'])
print('\n')
response

<class 'openai.openai_object.OpenAIObject'>
The World Series in 2020 was played at the Globe Life Field in Arlington, Texas.




<OpenAIObject chat.completion id=chatcmpl-7mE8eyU1MyMn7BXXWPC0Fnt6keYcd at 0x7e3cc3588f90> JSON: {
  "id": "chatcmpl-7mE8eyU1MyMn7BXXWPC0Fnt6keYcd",
  "object": "chat.completion",
  "created": 1691728448,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The World Series in 2020 was played at the Globe Life Field in Arlington, Texas."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 53,
    "completion_tokens": 19,
    "total_tokens": 72
  }
}

## LangChain

In [None]:
# Use langchain
from langchain.llms import OpenAI

llm = OpenAI(temperature=0.9)

text = "What is the capital of Japan?"

response = llm(text)
print(response)



The capital of Japan is Tokyo.


In [None]:
type(response)

str

## Chains

A chain is a multi-step workflow.

In [None]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.9)

prompt = PromptTemplate(
    input_variables = ["country"],
    template = "What is the capital of {country}?"
)

In [None]:
chain = LLMChain(llm=llm, prompt=prompt)

In [None]:
print(chain.run("France"))



The capital of France is Paris.
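
Conceptually, the chain above does two things in sequence: it fills in the prompt template, then sends the result to the model. The sketch below mimics that flow without an API key. `TinyChain` and `fake_llm` are hypothetical stand-ins written for this illustration and are not LangChain classes.

```python
from typing import Callable

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns canned answers (assumption for this sketch)."""
    canned = {"What is the capital of France?": "The capital of France is Paris."}
    return canned.get(prompt, "I don't know.")

class TinyChain:
    """Hypothetical two-step chain: fill the template, then call the model."""

    def __init__(self, llm: Callable[[str], str], template: str):
        self.llm = llm
        self.template = template

    def run(self, **variables: str) -> str:
        prompt = self.template.format(**variables)  # step 1: build the prompt
        return self.llm(prompt)                     # step 2: query the model

chain_sketch = TinyChain(fake_llm, "What is the capital of {country}?")
answer = chain_sketch.run(country="France")
print(answer)
```

Because the model is just a callable, any function that maps a prompt string to a response string could be swapped in for `fake_llm`.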


# Components

## Schema

1. Text - simple text

2. ChatMessages

  2.1 SystemMessage - instructions to the AI system.

  2.2 HumanMessage - a message from the human to the AI system.

  2.3 AIMessage - the AI's response.

3. Examples - input/output pairs for training and evaluation of models.

4. Document - data consists of page_content and metadata


In [None]:
# ChatMessages

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=.7)

chat(
    [
        SystemMessage(content="You are a nice AI bot that helps a user figure out where to travel in one short sentence"),
        HumanMessage(content="I like the beaches where should I go?"),
        AIMessage(content="You should go to Nice, France"),
        HumanMessage(content="What else should I do when I'm there?")
    ]
)

AIMessage(content='In addition to enjoying the beaches, you should explore the charming old town and indulge in delicious Mediterranean cuisine.', additional_kwargs={}, example=False)

In [None]:
# Document

from langchain.schema import Document

Document(page_content="This is my document. It is full of text that I've gathered from other places",
         metadata={
             'my_document_id' : 234234,
             'my_document_source' : "The LangChain Papers",
             'my_document_create_time' : 1680013019
         })

Document(page_content="This is my document. It is full of text that I've gathered from other places", metadata={'my_document_id': 234234, 'my_document_source': 'The LangChain Papers', 'my_document_create_time': 1680013019})

## Models

- LLMs - These models take a text string as input, and return a text string as output.

- Chat Models - These models take a list of Chat Messages as input, and return a Chat Message.

- Text Embedding Models - These models take text as input and return a list of floats.

In [None]:
# LLMs

from langchain.llms import OpenAI

llm = OpenAI(model_name="text-ada-001")
llm("What day comes after Friday?")

'\n\nSaturday'

In [None]:
# Chat Models

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=1)
chat(
    [
        SystemMessage(content="You are an unhelpful AI bot that makes a joke at whatever the user says"),
        HumanMessage(content="I would like to go to New York, how should I do this?")
    ]
)

AIMessage(content='Oh, to get to New York, I suggest you sprout wings and fly! Or, you know, you could always take a plane like a mere mortal.', additional_kwargs={}, example=False)

In [None]:
# Text Embedding Models

from langchain.embeddings import OpenAIEmbeddings
import tiktoken

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
text = "Hi! It's time for the beach"

In [None]:
text_embedding = embeddings.embed_query(text)
print(f"text_embedding size: {len(text_embedding)}")
print(f"Sample: {text_embedding[:5]}")

text_embedding size: 1536
Sample: [-0.00011466222223832396, -0.003150652560942347, -0.0007831145605424051, -0.01950432835081537, -0.015125558574222013]
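
Embedding vectors like the one above are usually compared with cosine similarity: texts with related meanings should end up with vectors pointing in similar directions. The sketch below uses three hand-written 3-dimensional vectors as stand-ins for real 1536-dimensional embeddings; the values are invented for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (made-up values, not produced by any model)
beach = [0.9, 0.1, 0.0]
ocean = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(f"beach vs ocean:   {cosine_similarity(beach, ocean):.3f}")
print(f"beach vs invoice: {cosine_similarity(beach, invoice):.3f}")
```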


In [None]:
encoding = tiktoken.get_encoding("cl100k_base")
tokens = encoding.encode(text)
len(tokens)

8

## Prompts

In [None]:
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003")

prompt = """
Today is Monday, tomorrow is Wednesday.
What is wrong with that statement?
"""

llm(prompt)

'\nThe statement is incorrect because tomorrow is Tuesday, not Wednesday.'

## Prompt Templates

In [None]:
# Create prompt template

from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables = ["country"],
    template = "What is the capital of {country}?"
)

In [None]:
print(prompt.format(country="Thailand"))

What is the capital of Thailand?
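
At its core, a prompt template is string formatting plus a check that every placeholder has a value. The following sketch shows that idea with the standard library only. `TinyPromptTemplate` is a hypothetical class for illustration; the real `PromptTemplate` offers more (partial variables, validation options, other template formats).

```python
from string import Formatter

class TinyPromptTemplate:
    """Hypothetical minimal template: named placeholders filled with str.format."""

    def __init__(self, template: str):
        self.template = template
        # Collect the {placeholder} names that appear in the template.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **values: str) -> str:
        missing = [var for var in self.input_variables if var not in values]
        if missing:
            raise KeyError(f"Missing prompt variables: {missing}")
        return self.template.format(**values)

tiny_prompt = TinyPromptTemplate("What is the capital of {country}?")
print(tiny_prompt.input_variables)
print(tiny_prompt.format(country="Thailand"))
```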


In [None]:
print(llm(prompt.format(country="Thailand")))



Bangkok.


In [None]:
from langchain.llms import OpenAI
from langchain import PromptTemplate

llm = OpenAI(model_name="text-davinci-003")

template = """
I really want to travel to {location}. What should I do there?

Respond in one short sentence
"""

prompt = PromptTemplate(
    input_variables=["location"],
    template=template,
)

final_prompt = prompt.format(location='Rome')

print (f"Final Prompt: {final_prompt}")
print ("-----------")
print (f"LLM Output: {llm(final_prompt)}")

Final Prompt: 
I really want to travel to Rome. What should I do there?

Respond in one short sentence

-----------
LLM Output: Visit the Colosseum, Pantheon, Trevi Fountain, Spanish Steps, and Vatican City to experience the best of Rome.


## Example Selectors

- An easy way to select from a series of examples, letting you dynamically place in-context information into your prompt.
- https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/

In [None]:
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003")

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}",
)

# Examples of adjectives paired with their antonyms
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

In [None]:
# SemanticSimilarityExampleSelector will select examples that are similar to your input by semantic meaning

example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,
    OpenAIEmbeddings(), # embedding
    FAISS,              # VectorStore
    k=2                 # number of examples to select
)
similar_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)

In [None]:
# Input is a feeling, so should select the happy/sad example
print(similar_prompt.format(adjective="worried"))

Give the antonym of every input

Example Input: happy
Example Output: sad

Example Input: sunny
Example Output: gloomy

Input: worried
Output:


In [None]:
# Input is a measurement, so should select the tall/short example
print(similar_prompt.format(adjective="fat"))

Give the antonym of every input

Example Input: happy
Example Output: sad

Example Input: tall
Example Output: short

Input: fat
Output:


In [None]:
llm(similar_prompt.format(adjective="fat"))

' skinny'
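
The selection step can be sketched without an embedding model or vector store. In the example below, the 3-dimensional vectors are hand-written stand-ins for real embeddings, and `select_examples` and `build_prompt` are hypothetical helpers; with real embeddings the selected examples may differ, as the outputs above show.

```python
import math

# Hypothetical 3-d "embeddings" (emotion, size, weather), hand-written for this sketch.
# Real embeddings come from a model and have many more dimensions.
TOY_VECTORS = {
    "happy": (1.0, 0.0, 0.1),
    "tall": (0.0, 1.0, 0.0),
    "energetic": (0.7, 0.2, 0.0),
    "sunny": (0.3, 0.0, 1.0),
    "windy": (0.0, 0.1, 1.0),
    "worried": (0.9, 0.0, 0.0),
    "fat": (0.1, 1.0, 0.0),
}

toy_examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

def select_examples(examples, query, k=2):
    """Return the k examples whose toy vector lies closest to the query's vector."""
    query_vector = TOY_VECTORS[query]
    ranked = sorted(
        examples,
        key=lambda ex: math.dist(TOY_VECTORS[ex["input"]], query_vector),
    )
    return ranked[:k]

def build_prompt(examples, query, k=2):
    """Assemble prefix, selected examples, and suffix into one prompt string."""
    parts = ["Give the antonym of every input"]
    for ex in select_examples(examples, query, k):
        parts.append(f"Example Input: {ex['input']}\nExample Output: {ex['output']}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

print(build_prompt(toy_examples, "fat"))
```

Only the k nearest examples reach the prompt, which keeps it short even when the example pool is large.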

## Output Parsers

- A helpful way to format the output of a model, usually used for structured output. It has two parts:
1. Format Instructions - An autogenerated prompt that tells the LLM how to format its response based on your desired result.

2. Parser - A method that extracts the model's text output into a desired structure (usually JSON).


In [None]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003")

response_schemas = [
    ResponseSchema(name="bad_string", description="This a poorly formatted user input string"),
    ResponseSchema(name="good_string", description="This is your response, a reformatted response")
]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
print (format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```


In [None]:
template = """
You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

{format_instructions}

% USER INPUT:
{user_input}

YOUR RESPONSE:
"""

prompt = PromptTemplate(
    input_variables=["user_input"],
    partial_variables={"format_instructions": format_instructions},
    template=template
)

prompt_input = prompt.format(user_input="welcom to califonya!")

print(prompt_input)


You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```

% USER INPUT:
welcom to califonya!

YOUR RESPONSE:



In [None]:
llm_output = llm(prompt_input)
llm_output

'```json\n{\n\t"bad_string": "welcom to califonya!",\n\t"good_string": "Welcome to California!"\n}\n```'

In [None]:
output = output_parser.parse(llm_output)
output

{'bad_string': 'welcom to califonya!', 'good_string': 'Welcome to California!'}
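
The parsing half of this workflow can be approximated with `re` and `json` from the standard library. This is a minimal sketch of the idea, not the implementation of `StructuredOutputParser`; the helper name `parse_json_snippet` is an assumption, and the sample string copies the model output shown above.

```python
import json
import re

# Three backticks, built programmatically so this example can sit inside a fenced block.
FENCE = "`" * 3

def parse_json_snippet(text, expected_keys):
    """Extract the JSON object from a fenced json snippet and check its keys."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("No json code snippet found in model output")
    data = json.loads(match.group(1))
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"Model output is missing keys: {missing}")
    return data

sample_output = (
    FENCE + 'json\n{\n\t"bad_string": "welcom to califonya!",\n'
    '\t"good_string": "Welcome to California!"\n}\n' + FENCE
)

parsed = parse_json_snippet(sample_output, ["bad_string", "good_string"])
print(parsed["good_string"])
```

Raising an error on a missing fence or key matters in practice, because a model does not always follow the format instructions.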

## Indexes

- Structuring documents so that LLMs can work with them

## Document Loaders
- Import data from other sources.

- https://python.langchain.com/docs/modules/data_connection/document_loaders.html

Load example files

In [None]:
!mkdir document
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/example.txt
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/example.csv
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/example.pdf
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/example.json
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/panda.txt
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/japan.txt
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/japan.html
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/document/news-01.txt
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/document/news-02.txt
!curl -O -L https://raw.githubusercontent.com/jingwora/LangChain-tutorial/main/files/document/news-03.txt
!mv news-01.txt document/
!mv news-02.txt document/
!mv news-03.txt document/


**TextLoader**

In [None]:
from langchain.document_loaders import TextLoader

example_txt = TextLoader("example.txt").load()

In [None]:
print(type(example_txt))
print(len(example_txt))
# print(example_txt[0].page_content)  # page_content
# print(example_txt[0].metadata['source']) # source
example_txt

<class 'list'>
1


[Document(page_content="A prompt is a specific instruction, question, or input provided to an AI language model, like myself, to elicit a desired response or generate text. It serves as the starting point for the model to understand the context and generate coherent and relevant content. The quality and clarity of the prompt greatly influence the output of the AI, as it guides the model's understanding and influences the style, tone, and information included in the generated text. Effective prompts are clear, concise, and well-structured, enabling the AI to produce accurate and contextually appropriate responses.\n", metadata={'source': 'example.txt'})]

In [None]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.9)

prompt = PromptTemplate(
    input_variables = ["content"],
    template = "What is the topic of this content? \n content: ```{content}```"
)

chain = LLMChain(llm=llm, prompt=prompt)
response = chain.run(example_txt)

print(response)



The topic of this content is prompt instructions for AI language models.


**CSVLoader**

In [None]:
from langchain.document_loaders.csv_loader import CSVLoader


example_csv = CSVLoader(file_path='example.csv',
                        csv_args={
                            'delimiter': ',',
                            'fieldnames': ['id', 'category', 'content', 'date']
                            }
                        ).load()

print(type(example_csv))
example_csv

<class 'list'>


[Document(page_content='id: id\ncategory: category\ncontent: content\ndate: date', metadata={'source': 'example.csv', 'row': 0}),
 Document(page_content='id: 01\ncategory: travel\ncontent: Exploring Ancient Ruins\ndate: 2020-05-15', metadata={'source': 'example.csv', 'row': 1}),
 Document(page_content='id: 02\ncategory: technology\ncontent: Latest Gadgets Review\ndate: 2021-09-23', metadata={'source': 'example.csv', 'row': 2}),
 Document(page_content='id: 03\ncategory: science\ncontent: Discovering New Planets\ndate: 2022-04-10', metadata={'source': 'example.csv', 'row': 3}),
 Document(page_content='id: 04\ncategory: food\ncontent: Delicious Pasta Recipe\ndate: 2020-12-08', metadata={'source': 'example.csv', 'row': 4}),
 Document(page_content='id: 05\ncategory: travel\ncontent: Hiking in the Mountains\ndate: 2021-07-19', metadata={'source': 'example.csv', 'row': 5}),
 Document(page_content='id: 06\ncategory: technology\ncontent: AI and Its Applications\ndate: 2022-11-30', metadata={'so

In [None]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(temperature=0)

prompt = PromptTemplate(
    input_variables = ["question", "content"],
    template = "{question} \n content: ```{content}```"
)
chain = LLMChain(llm=llm, prompt=prompt)

In [None]:
question = "What are the food contents?"
response = chain.run(question=question, content=example_csv)

print(response)



The food content is "Delicious Pasta Recipe" and "Healthy Smoothie Ideas".


In [None]:
question = "How many pieces of content are related to health?"
response = chain.run(question=question, content=example_csv)

print(response)



There is one piece of content related to health in the given list: 
id: 08
category: food
content: Healthy Smoothie Ideas
date: 2023-02-14
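
The row-to-document mapping seen in the CSVLoader output above (one document per row, `key: value` lines as content, source and row number as metadata) can be sketched with the standard `csv` module. `load_csv_documents` is a hypothetical helper that returns plain dicts rather than LangChain `Document` objects, and the sample text reuses two rows from the output above.

```python
import csv
import io

def load_csv_documents(text, source):
    """Turn each CSV row into a dict with page_content and metadata (a sketch, not CSVLoader)."""
    reader = csv.DictReader(io.StringIO(text))  # the first line is read as the header
    documents = []
    for row_number, row in enumerate(reader):
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append({
            "page_content": content,
            "metadata": {"source": source, "row": row_number},
        })
    return documents

sample_csv = (
    "id,category,content,date\n"
    "01,travel,Exploring Ancient Ruins,2020-05-15\n"
    "04,food,Delicious Pasta Recipe,2020-12-08\n"
)

docs_sketch = load_csv_documents(sample_csv, "example.csv")
for doc in docs_sketch:
    print(doc["metadata"], doc["page_content"].replace("\n", " | "))
```

In the CSVLoader output above, row 0 is the header line itself, which suggests the file already contains a header while `fieldnames` was also passed. The sketch relies on the header line instead, so no such row appears.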


**UnstructuredCSVLoader**
- https://api.python.langchain.com/en/latest/document_loaders/langchain.document_loaders.csv_loader.UnstructuredCSVLoader.html

In [None]:
from langchain.document_loaders.csv_loader import UnstructuredCSVLoader


loader = UnstructuredCSVLoader(
    file_path="example.csv", mode="elements"
)
docs = loader.load()

In [None]:
print(type(docs))
print(docs[0].page_content)
print(docs[0].metadata)
print("--------------")
docs

<class 'list'>

  
    
      1
      travel
      Exploring Ancient Ruins
      2020-05-15
    
    
      2
      technology
      Latest Gadgets Review
      2021-09-23
    
    
      3
      science
      Discovering New Planets
      2022-04-10
    
    
      4
      food
      Delicious Pasta Recipe
      2020-12-08
    
    
      5
      travel
      Hiking in the Mountains
      2021-07-19
    
    
      6
      technology
      AI and Its Applications
      2022-11-30
    
    
      7
      science
      Genetic Engineering Breakthrough
      2020-09-05
    
    
      8
      food
      Healthy Smoothie Ideas
      2023-02-14
    
    
      9
      travel
      Beach Paradise Vacation
      2022-08-27
    
    
      10
      technology
      Future of Augmented Reality
      2023-06-20
    
  

{'source': 'example.csv', 'filename': 'example.csv', 'last_modified': '2023-08-11T10:21:14', 'filetype': 'text/csv', 'text_as_html': '<table border="1" class="dataframe">\n  <tb

[Document(page_content='\n  \n    \n      1\n      travel\n      Exploring Ancient Ruins\n      2020-05-15\n    \n    \n      2\n      technology\n      Latest Gadgets Review\n      2021-09-23\n    \n    \n      3\n      science\n      Discovering New Planets\n      2022-04-10\n    \n    \n      4\n      food\n      Delicious Pasta Recipe\n      2020-12-08\n    \n    \n      5\n      travel\n      Hiking in the Mountains\n      2021-07-19\n    \n    \n      6\n      technology\n      AI and Its Applications\n      2022-11-30\n    \n    \n      7\n      science\n      Genetic Engineering Breakthrough\n      2020-09-05\n    \n    \n      8\n      food\n      Healthy Smoothie Ideas\n      2023-02-14\n    \n    \n      9\n      travel\n      Beach Paradise Vacation\n      2022-08-27\n    \n    \n      10\n      technology\n      Future of Augmented Reality\n      2023-06-20\n    \n  \n', metadata={'source': 'example.csv', 'filename': 'example.csv', 'last_modified': '2023-08-11T10:21:14', '

In [None]:
# Show text_as_html

from IPython.display import HTML

display(HTML(docs[0].metadata["text_as_html"]))

0,1,2,3
1,travel,Exploring Ancient Ruins,2020-05-15
2,technology,Latest Gadgets Review,2021-09-23
3,science,Discovering New Planets,2022-04-10
4,food,Delicious Pasta Recipe,2020-12-08
5,travel,Hiking in the Mountains,2021-07-19
6,technology,AI and Its Applications,2022-11-30
7,science,Genetic Engineering Breakthrough,2020-09-05
8,food,Healthy Smoothie Ideas,2023-02-14
9,travel,Beach Paradise Vacation,2022-08-27
10,technology,Future of Augmented Reality,2023-06-20


In [None]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(temperature=0)

prompt = PromptTemplate(
    input_variables = ["question", "content"],
    template = "{question} \n content: ```{content}```"
)
chain = LLMChain(llm=llm, prompt=prompt)

question = "What are the food contents?"
response = chain.run(question=question, content=docs)

print(response)



The food content in this document is "Delicious Pasta Recipe" and "Healthy Smoothie Ideas".


**PyPDFLoader**

In [None]:
# PyPDFLoader
# Need pypdf library
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader(file_path='example.pdf')

pages = loader.load_and_split()

In [None]:
print(type(pages))
print(len(pages))
for p in pages:
  print(p.metadata)
pages

<class 'list'>
2
{'source': 'example.pdf', 'page': 0}
{'source': 'example.pdf', 'page': 1}


[Document(page_content='Mocked up Resume                                                                                                                                                                                           1 | P a g e  \n John Doe  \n123 Main Street  \nTechville, CA 12345  \njohn.doe@email.com  \n(123) 456 -7890  \nlinkedin.com/in/johndoe  \n \nObjective:  \nDedicated and innovative AI Engineer with a passion for developing cutting -edge AI solutions to drive business growth an d enhance user \nexperiences. Seeking a challenging role in a dynamic organization to apply my expertise in machine learning, natural language  \nprocessing, and deep learning.  \n \nEducation:  \nMaster of Science in Artificial Intelligence  \nUniversity of Techville  \nTechville, State  \nGraduated: May 20XX  \n \nBachelor of Computer Science  \nTechtopia University  \nTechtopia, State  \nGraduated: May 20XX  \n \nSkills:  \n- Proficient in Python, TensorFlow, PyTorch, and scikit -learn  \

In [None]:
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

faiss_index = FAISS.from_documents(pages, OpenAIEmbeddings())

In [None]:
print(type(faiss_index))

<class 'langchain.vectorstores.faiss.FAISS'>


In [None]:
docs = faiss_index.similarity_search("What languages does he speak, read, and write fluently?", k=1)
for doc in docs:
    print(str(doc.metadata["page"]) + ":", doc.page_content[:300])

1: Mocked up Resume                                                                                                                                                                                           2 | P a g e  
 - Developed a deep l earning -based image recognition system that achieved a 95% a


**DirectoryLoader**

In [None]:
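# DirectoryLoader defaults to UnstructuredFileLoader, which needs the unstructured library.
# TextLoader is passed explicitly here instead.

from langchain.document_loaders import DirectoryLoader, TextLoader

# Forwarded to TextLoader so that it detects each file's encoding
text_loader_kwargs={'autodetect_encoding': True}
example_dir = DirectoryLoader('document', glob="**/*.txt",
                              show_progress=True,
                              use_multithreading=True,
                              loader_cls=TextLoader,
                              loader_kwargs=text_loader_kwargs,
                              silent_errors=True
                              ).load()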

print(type(example_dir))
print(len(example_dir))
example_dir

100%|██████████| 3/3 [00:00<00:00, 1949.63it/s]

<class 'list'>
3





[Document(page_content="Travel: Unveiling Hidden Gems: Exploring the Enchanting Countryside of a Remote Land\nEmbarking on a journey off the beaten path, travelers are discovering the allure of a remote and enchanting countryside that has long remained hidden from the tourism radar. Away from the bustling cities and tourist hotspots, this unspoiled land offers a serene escape into nature's embrace. Rolling hills, meandering rivers, and quaint villages characterize the landscape, providing a picturesque backdrop for unforgettable adventures. As wanderers immerse themselves in the local culture and engage with friendly residents, they gain a deeper appreciation for the beauty and authenticity of this hidden gem.\n", metadata={'source': 'document/news-02.txt'}),
 Document(page_content="Sport: Thrilling Victory Seals Championship Win for Local Soccer Team\nIn a heart-pounding match that kept fans on the edge of their seats, the local soccer team secured a triumphant championship win. The g

In [None]:
doc_sources = [doc.metadata['source']  for doc in example_dir]
doc_sources

['document/news-02.txt', 'document/news-01.txt', 'document/news-03.txt']
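
What the directory loader does with a glob pattern can be approximated with `pathlib`. The sketch below builds a temporary folder with placeholder files so that it runs anywhere; `load_text_directory` is a hypothetical helper, and the file contents are invented for illustration.

```python
import tempfile
from pathlib import Path

def load_text_directory(root, pattern="**/*.txt"):
    """Read every file matching the glob pattern into a document-like dict."""
    documents = []
    for path in sorted(Path(root).glob(pattern)):  # sorted for a stable order
        if path.is_file():
            documents.append({
                "page_content": path.read_text(encoding="utf-8"),
                "metadata": {"source": str(path)},
            })
    return documents

# Build a throwaway folder with placeholder files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "document"
    folder.mkdir()
    (folder / "news-01.txt").write_text("Placeholder sport story", encoding="utf-8")
    (folder / "news-02.txt").write_text("Placeholder travel story", encoding="utf-8")
    (folder / "notes.md").write_text("Not matched by the glob", encoding="utf-8")

    loaded = load_text_directory(tmp)
    loaded_names = [Path(doc["metadata"]["source"]).name for doc in loaded]

print(loaded_names)
```

Sorting the paths gives a deterministic order. The loader output above is not sorted, so code that depends on document order should sort by `metadata['source']` itself.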

In [None]:
doc_contents = [doc.page_content  for doc in example_dir]
doc_contents

["Travel: Unveiling Hidden Gems: Exploring the Enchanting Countryside of a Remote Land\nEmbarking on a journey off the beaten path, travelers are discovering the allure of a remote and enchanting countryside that has long remained hidden from the tourism radar. Away from the bustling cities and tourist hotspots, this unspoiled land offers a serene escape into nature's embrace. Rolling hills, meandering rivers, and quaint villages characterize the landscape, providing a picturesque backdrop for unforgettable adventures. As wanderers immerse themselves in the local culture and engage with friendly residents, they gain a deeper appreciation for the beauty and authenticity of this hidden gem.\n",
 "Sport: Thrilling Victory Seals Championship Win for Local Soccer Team\nIn a heart-pounding match that kept fans on the edge of their seats, the local soccer team secured a triumphant championship win. The game was a true spectacle of skill and determination, as the players showcased their athlet

**HNLoader**

In [None]:
from langchain.document_loaders import HNLoader

loader = HNLoader("https://news.ycombinator.com/item?id=34422627")
data = loader.load()

In [None]:
print (f"Found {len(data)} comments")
print (f"Here's a sample:\n\n{''.join([x.page_content[:150] for x in data[:2]])}")

Found 76 comments
Here's a sample:

Ozzie_osman 6 months ago  
             | next [–] 

LangChain is awesome. For people not sure what it's doing, large language models (LLMs) are very Ozzie_osman 6 months ago  
             | parent | next [–] 

Also, another library to check out is GPT Index (https://github.com/jerryjliu/gpt_index)


**UnstructuredHTMLLoader**

In [None]:
from langchain.document_loaders import UnstructuredHTMLLoader

loader = UnstructuredHTMLLoader("japan.html")
website = loader.load()

website

[Document(page_content="Discover Japan\n\nExperience the beauty and culture of Japan.\n\nIntroduction\n\nJapan, an island nation in East Asia, is renowned for its unique blend of tradition and modernity.\n\nCulture\n\nThe Japanese culture is rich and diverse, with influences from art, literature, cuisine, and more. The tea ceremony and traditional theater forms like Noh and Kabuki are integral parts of Japanese heritage.\n\nNature\n\nJapan's natural beauty is breathtaking. From the serene cherry blossoms in spring to the stunning fall foliage, nature is an integral part of Japanese life.\n\nGallery\n\nExplore the wonders of Japan and immerse yourself in its rich culture and history.", metadata={'source': 'japan.html'})]

In [None]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(temperature=0)

prompt = PromptTemplate(
    input_variables = ["question", "content"],
    template = "{question} \n content: ```{content}```"
)
chain = LLMChain(llm=llm, prompt=prompt)

question = "Summarize this website."
response = chain.run(question=question, content=website)

print(response)



This website is about Japan and its culture and natural beauty. It introduces Japan as an island nation in East Asia with a unique blend of tradition and modernity. It also discusses the rich and diverse culture of Japan, including the tea ceremony and traditional theater forms. Additionally, it highlights the breathtaking natural beauty of Japan, from the cherry blossoms in spring to the stunning fall foliage. Finally, it provides a gallery to explore the wonders of Japan and immerse oneself in its culture and history.


**JSON file**
- Requires the `jq` package

Example `jq_schema` expressions for extracting content from JSON data:

```
JSON        -> [{"text": ...}, {"text": ...}, {"text": ...}]
jq_schema   -> ".[].text"

JSON        -> {"key": [{"text": ...}, {"text": ...}, {"text": ...}]}
jq_schema   -> ".key[].text"

JSON        -> ["...", "...", "..."]
jq_schema   -> ".[]"
```
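
As a rough illustration of what these three schemas select, the sketch below reproduces them in plain Python. It does not call `jq` or `JSONLoader`; the sample inputs and variable names are made up for the example.

```python
import json

# Hypothetical sample inputs mirroring the three JSON shapes above.
list_of_objects = json.loads('[{"text": "a"}, {"text": "b"}, {"text": "c"}]')
keyed_list = json.loads('{"key": [{"text": "a"}, {"text": "b"}, {"text": "c"}]}')
list_of_strings = json.loads('["a", "b", "c"]')

# ".[].text": iterate over the top-level array and take each item's "text".
texts_1 = [item["text"] for item in list_of_objects]

# ".key[].text": descend into "key", iterate, and take each item's "text".
texts_2 = [item["text"] for item in keyed_list["key"]]

# ".[]": iterate over the top-level array and take each element as-is.
texts_3 = list(list_of_strings)

print(texts_1, texts_2, texts_3)
```

All three selections yield the same list of strings here, which is why each schema produces one document per message-like item.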

In [None]:
import json
from pathlib import Path
from pprint import pprint
from langchain.document_loaders import JSONLoader

file_path='example.json'
data = json.loads(Path(file_path).read_text())
pprint(data)

{'is_still_participant': True,
 'messages': [{'content': "Jordan, Atom project was a huge success, don't you "
                          'think?',
               'sender_name': 'Alex',
               'timestamp_ms': 1579137191303,
               'type': 'Generic'},
              {'content': 'Absolutely, Alex. Our teamwork and problem-solving '
                          'really shone through.',
               'sender_name': 'Jordan',
               'timestamp_ms': 1579137103044,
               'type': 'Generic'},
              {'content': 'I was impressed with how smoothly everything went, '
                          'from planning to execution.',
               'sender_name': 'Alex',
               'timestamp_ms': 1579137078312,
               'type': 'Generic'},
              {'content': "And finishing ahead of schedule, plus the client's "
                          'rave reviews, show our dedication paid off.',
               'sender_name': 'Jordan',
               'timestamp_ms': 15

In [None]:
# JSONLoader

loader = JSONLoader(
    file_path='example.json',
    jq_schema='.messages[].content')

data = loader.load()
pprint(data)

[Document(page_content="Jordan, Atom project was a huge success, don't you think?", metadata={'source': '/content/example.json', 'seq_num': 1}),
 Document(page_content='Absolutely, Alex. Our teamwork and problem-solving really shone through.', metadata={'source': '/content/example.json', 'seq_num': 2}),
 Document(page_content='I was impressed with how smoothly everything went, from planning to execution.', metadata={'source': '/content/example.json', 'seq_num': 3}),
 Document(page_content="And finishing ahead of schedule, plus the client's rave reviews, show our dedication paid off.", metadata={'source': '/content/example.json', 'seq_num': 4}),
 Document(page_content="True. Let's use this momentum for more successful projects down the line.", metadata={'source': '/content/example.json', 'seq_num': 5}),
 Document(page_content="Agreed, Alex. Here's to our achievements and future success! 🥂", metadata={'source': '/content/example.json', 'seq_num': 6})]


In [None]:
# metadata_func

def metadata_func(record: dict, metadata: dict) -> dict:
    metadata["sender_name"] = record.get("sender_name")
    metadata["timestamp_ms"] = record.get("timestamp_ms")
    return metadata

loader = JSONLoader(
    file_path='example.json',
    jq_schema='.messages[]',
    content_key="content",
    metadata_func=metadata_func
)

data = loader.load()
pprint(data)

[Document(page_content="Jordan, Atom project was a huge success, don't you think?", metadata={'source': '/content/example.json', 'seq_num': 1, 'sender_name': 'Alex', 'timestamp_ms': 1579137191303}),
 Document(page_content='Absolutely, Alex. Our teamwork and problem-solving really shone through.', metadata={'source': '/content/example.json', 'seq_num': 2, 'sender_name': 'Jordan', 'timestamp_ms': 1579137103044}),
 Document(page_content='I was impressed with how smoothly everything went, from planning to execution.', metadata={'source': '/content/example.json', 'seq_num': 3, 'sender_name': 'Alex', 'timestamp_ms': 1579137078312}),
 Document(page_content="And finishing ahead of schedule, plus the client's rave reviews, show our dedication paid off.", metadata={'source': '/content/example.json', 'seq_num': 4, 'sender_name': 'Jordan', 'timestamp_ms': 1579136858575}),
 Document(page_content="True. Let's use this momentum for more successful projects down the line.", metadata={'source': '/conte

In [None]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(temperature=0)

prompt = PromptTemplate(
    input_variables = ["question", "content"],
    template = "{question} \n content: ```{content}```"
)
chain = LLMChain(llm=llm, prompt=prompt)

question = """
What is the topic and sentiment of this chat?
Provide the output in JSON format, following the example below. The response should be in JSON output only.
The 'label' for the topic can be among "Project Management", "Team Collaboration", "Career Development", "Innovation Development", "Work-Life Balance".
The 'label' for sentiment can be among "Positive", "Negative", or "Neutral".
The 'score' field represents the confidence level of the decision, with a value ranging from 0 to 1 (rounded to 2 decimal places). A score of 1 indicates 100% confidence.
The 'explanation' field should include the top 5 words or less that support the decision's reasoning.
Example JSON output:
{
    "topic": {
        "label": "<topic_label>",
        "score": <topic_score>,
        "explanation": ["<word>", "<word>", ...]
    },
    "sentiment": {
        "label": "<sentiment_label>",
        "score": <sentiment_score>,
        "explanation": ["<word>", "<word>", ...]
    }
}
"""

response = chain.run(question=question, content=data)

print(response)



{
    "topic": {
        "label": "Project Management",
        "score": 0.98,
        "explanation": ["Jordan", "Atom", "teamwork", "problem-solving", "planning"]
    },
    "sentiment": {
        "label": "Positive",
        "score": 0.99,
        "explanation": ["impressed", "smoothly", "ahead", "dedication", "momentum"]
    }
}


In [None]:
# Convert to dict
json_string = response.strip()
json_string = json.loads(json_string)

print(type(json_string))
pprint(json_string)

<class 'dict'>
{'sentiment': {'explanation': ['impressed',
                               'smoothly',
                               'ahead',
                               'dedication',
                               'momentum'],
               'label': 'Positive',
               'score': 0.99},
 'topic': {'explanation': ['Jordan',
                           'Atom',
                           'teamwork',
                           'problem-solving',
                           'planning'],
           'label': 'Project Management',
           'score': 0.98}}
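
The conversion above assumes the model returned nothing but valid JSON, which is not guaranteed. A minimal defensive sketch, assuming the response contains at most one JSON object possibly surrounded by extra text (`parse_llm_json` is a hypothetical helper, not part of LangChain):

```python
import json
import re

def parse_llm_json(raw):
    """Extract and parse the outermost {...} span in a model response.

    Returns a dict, or None when no valid JSON object is found.
    """
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None

# Hypothetical responses: one wrapped in extra text, one not JSON at all.
print(parse_llm_json('Here you go:\n{"sentiment": {"label": "Positive"}}\nDone.'))
print(parse_llm_json("Sorry, I cannot answer that."))
```

Returning `None` instead of raising lets the caller decide whether to retry the prompt or skip the record.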


In [None]:
# Save the data to a JSON file

import json

output_file = 'output.json'

with open(output_file, 'w') as json_file:
    json.dump(json_string, json_file, indent=4)

## Text Splitters
- Long content needs to be split into chunks.
- Parameters:
  - length_function: how the length of a chunk is measured (for example `len` for a character count).
  - chunk_size: the maximum size of each chunk, as measured by length_function.
  - chunk_overlap: the maximum overlap between consecutive chunks.
  - add_start_index: whether to store each chunk's starting position within the original document in its metadata.
- https://api.python.langchain.com/en/latest/text_splitter/langchain.text_splitter.TextSplitter.html
- https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/recursive_text_splitter
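
To make `chunk_size`, `chunk_overlap`, and the start index concrete, here is a simplified sketch that cuts fixed-size character windows. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph and line breaks, so its chunks are often shorter than `chunk_size`); `split_fixed` is a hypothetical helper for illustration.

```python
def split_fixed(text, chunk_size, chunk_overlap):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. Returns a list of
    (start_index, chunk) pairs.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
for start_index, chunk in split_fixed(sample, chunk_size=10, chunk_overlap=3):
    print(start_index, len(chunk), chunk)
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so the last three characters of one chunk reappear at the start of the next.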

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

with open('japan.txt') as f:
    documents = f.read()

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 600,
    chunk_overlap  = 30,
    length_function = len,
    add_start_index = True,
)
texts = text_splitter.create_documents([documents])

In [None]:
texts[0].metadata

{'start_index': 0}

In [None]:
print(len(texts))
for idx, text in enumerate(texts):
  print("chunk:", idx, "len:",len(text.page_content))

9
chunk: 0 len: 582
chunk: 1 len: 550
chunk: 2 len: 589
chunk: 3 len: 537
chunk: 4 len: 548
chunk: 5 len: 514
chunk: 6 len: 469
chunk: 7 len: 507
chunk: 8 len: 503


In [None]:
for idx, text in enumerate(texts):
  print(text.page_content)
  print("-----------------------------------------------")

**Exploring the Rich Tapestry of Japan: Land of Tradition, Innovation, and Natural Beauty**

*1.Introduction: Unveiling the Enigmatic Allure of Japan*

Nestled in the eastern part of Asia, Japan stands as a captivating blend of ancient traditions and cutting-edge modernity. With a history spanning millennia, this island nation has woven a captivating narrative that enthralls visitors and scholars alike. From its iconic cherry blossoms and serene temples to its bustling metropolises and technological prowess, Japan's multifaceted character is a tapestry waiting to be explored.
-----------------------------------------------
*2.A Glimpse into Japan's Historical Odyssey*

Delve into Japan's past, and you'll uncover a tale of emperors, shoguns, and samurai. The Heian period witnessed the emergence of a refined court culture, while the feudal era marked the dominance of samurai warriors. The Meiji Restoration, a turning point, propelled Japan into modernization, setting the stage for its me

## Retrievers

- Use embeddings to retrieve the passages of a document that are most relevant to a query.
- Vector databases store the embedding vectors. Popular hosted options include Pinecone and Weaviate; more examples are in OpenAI's retriever documentation. Chroma and FAISS are easy to work with locally.
- https://python.langchain.com/docs/integrations/vectorstores/faiss
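
The idea behind similarity retrieval can be sketched without any vector database. The example below is a toy: it uses word-count vectors in place of real embeddings and a linear scan in place of a FAISS index, and all names and sample chunks are made up for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (not a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]

# Hypothetical chunks standing in for a split document.
chunks = [
    "sushi and ramen are popular japanese food",
    "cherry blossoms bloom in spring",
    "anime and manga are popular worldwide",
]
print(retrieve("popular japanese food", chunks, k=2))
```

A real retriever follows the same shape: embed the query, score it against stored vectors, and return the top matches.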

In [None]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('japan.txt')
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=70)

texts = text_splitter.split_documents(documents)

In [None]:
print(len(texts))
for idx, text in enumerate(texts):
  print("chunk:", idx, "len:",len(text.page_content))

5
chunk: 0 len: 1200
chunk: 1 len: 1128
chunk: 2 len: 1113
chunk: 3 len: 1031
chunk: 4 len: 572


In [None]:
for idx, text in enumerate(texts):
  print(text.page_content)
  print("-----------------------------------------------")

**Exploring the Rich Tapestry of Japan: Land of Tradition, Innovation, and Natural Beauty**

*1.Introduction: Unveiling the Enigmatic Allure of Japan*

Nestled in the eastern part of Asia, Japan stands as a captivating blend of ancient traditions and cutting-edge modernity. With a history spanning millennia, this island nation has woven a captivating narrative that enthralls visitors and scholars alike. From its iconic cherry blossoms and serene temples to its bustling metropolises and technological prowess, Japan's multifaceted character is a tapestry waiting to be explored.

*2.A Glimpse into Japan's Historical Odyssey*

Delve into Japan's past, and you'll uncover a tale of emperors, shoguns, and samurai. The Heian period witnessed the emergence of a refined court culture, while the feudal era marked the dominance of samurai warriors. The Meiji Restoration, a turning point, propelled Japan into modernization, setting the stage for its meteoric rise in the global arena. Castles like H

In [None]:
# Embed the text chunks

embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(texts, embeddings)

In [None]:
db

<langchain.vectorstores.faiss.FAISS at 0x7fa3216dc220>

In [None]:
retriever = db.as_retriever()
retriever

VectorStoreRetriever(tags=['FAISS'], metadata=None, vectorstore=<langchain.vectorstores.faiss.FAISS object at 0x7fa3216dc220>, search_type='similarity', search_kwargs={})

In [None]:
docs = retriever.get_relevant_documents("What is popular Japanese food?")

In [None]:
print("\n-----------\n".join([x.page_content[:200] for x in docs]))

*3.Tradition Woven into Daily Life: Art, Cuisine, and Festivals*

Japanese culture is deeply rooted in age-old traditions that seep into every aspect of life. The delicate art of Ikebana (flower arran
-----------
*7.Pop Culture Phenomena: Anime, Manga, and Beyond*

Modern Japan's influence on global pop culture is undeniable, with anime and manga captivating audiences worldwide. From the whimsical worlds of St
-----------
*5.Sakura Dreams and Beyond: Natural Splendors*

Japan's natural beauty is a revelation in every season. Spring arrives with a flourish of cherry blossoms, transforming landscapes into dreamlike vista
-----------
*9.Conclusion: Embarking on a Journey Through Japan's Kaleidoscope*

As you embark on a journey through Japan, be prepared to be swept away by its multifaceted charm. Whether you find solace in its Ze


**Save and load index**

In [None]:
db.save_local("faiss_index_japan")

In [None]:
new_db = FAISS.load_local("faiss_index_japan", embeddings)

## Memory
- Remembers what was said earlier in the conversation.

**ChatMessageHistory**

In [None]:
from langchain.memory import ChatMessageHistory
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=0)

history = ChatMessageHistory()

history.add_ai_message("hi!")

history.add_user_message("what is the capital of france?")

In [None]:
# check history
history.messages

[AIMessage(content='hi!', additional_kwargs={}, example=False),
 HumanMessage(content='what is the capital of france?', additional_kwargs={}, example=False)]

In [None]:
# ai_response
ai_response = chat(history.messages)
ai_response

AIMessage(content='The capital of France is Paris.', additional_kwargs={}, example=False)

In [None]:
# add ai_response
history.add_ai_message(ai_response.content)
history.messages

[AIMessage(content='hi!', additional_kwargs={}, example=False),
 HumanMessage(content='what is the capital of france?', additional_kwargs={}, example=False),
 AIMessage(content='The capital of France is Paris.', additional_kwargs={}, example=False)]

**ConversationChain**

In [None]:
# ConversationChain

from langchain import OpenAI, ConversationChain

llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)

In [None]:
conversation.predict(input="Hello bot!")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hello bot!
AI:

> Finished chain.


" Hi there! It's nice to meet you. How can I help you today?"

In [None]:
conversation.predict(input="What should I do if I feel sleepy?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hello bot!
AI:  Hi there! It's nice to meet you. How can I help you today?
Human: What should I do if I feel sleepy?
AI:

> Finished chain.


" If you're feeling sleepy, it's best to take a short nap or get some exercise. Taking a break from whatever you're doing and getting some fresh air can also help. If you're still feeling sleepy, it's best to get some rest."

In [None]:
conversation.predict(input="What is the first thing I said to you?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hello bot!
AI:  Hi there! It's nice to meet you. How can I help you today?
Human: What should I do if I feel sleepy?
AI:  If you're feeling sleepy, it's best to take a short nap or get some exercise. Taking a break from whatever you're doing and getting some fresh air can also help. If you're still feeling sleepy, it's best to get some rest.
Human: What is the first thing I said to you?
AI:

> Finished chain.


' You said "Hello bot!"'
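
The verbose prompts above show the mechanism: on every call the chain replays the whole transcript under "Current conversation:" and appends the new input. A minimal sketch of that buffering idea follows; `BufferMemory` is a hypothetical class for illustration, not LangChain's implementation.

```python
class BufferMemory:
    """Keep every turn and replay the transcript in the next prompt."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs

    def add(self, speaker, text):
        self.turns.append((speaker, text))

    def render(self, new_input):
        """Build a prompt containing the history plus the new human input."""
        lines = [f"{speaker}: {text}" for speaker, text in self.turns]
        lines.append(f"Human: {new_input}")
        lines.append("AI:")
        return "Current conversation:\n" + "\n".join(lines)

memory = BufferMemory()
memory.add("Human", "Hello bot!")
memory.add("AI", "Hi there!")
print(memory.render("What is the first thing I said to you?"))
```

Because the full history is resent each time, the prompt grows with every turn, which is the trade-off of this simple approach.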

## Chains
- Combine multiple LLM calls and actions into a single pipeline.
- Example: Summary #1, Summary #2, Summary #3 -> Final Summary
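
Conceptually, a sequential chain is function composition: each step's output becomes the next step's input. A minimal sketch, assuming plain Python functions in place of LLM calls (`run_sequential`, `suggest_dish`, and `write_recipe` are hypothetical names):

```python
from functools import reduce

def run_sequential(steps, first_input):
    """Feed first_input to the first step, then pipe each output onward."""
    return reduce(lambda value, step: step(value), steps, first_input)

# Hypothetical stand-ins for LLM calls; a real chain would query a model.
def suggest_dish(location):
    return f"a classic dish from {location}"

def write_recipe(dish):
    return f"Recipe for {dish}: ..."

print(run_sequential([suggest_dish, write_recipe], "Rome"))
```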

**SimpleSequentialChain**

In [None]:
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import SimpleSequentialChain

llm = OpenAI(temperature=1)

In [None]:
# location_chain

template = """Your job is to come up with a classic dish from the area that the user suggests.
% USER LOCATION
{user_location}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_location"], template=template)

# Holds my 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)

In [None]:
# meal_chain

template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_meal"], template=template)

# Holds my 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)

In [None]:
overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)

In [None]:
review = overall_chain.run("Rome")



> Entering new SimpleSequentialChain chain...
Pasta Carbonara.
To make pasta carbonara at home, start by cooking 1/2 a pound of pasta in salted boiling water. In a skillet, cook 8 ounces of bacon until crisp. Add 1/4 cup of chopped shallots and cook until softened. Whisk together 4 egg yolks, 1 cup of Parmesan Cheese, 2 tablespoons of freshly chopped parsley, and 1/2 cup of heavy cream. When the pasta is finished, reserve some of the pasta water and drain the pasta. Add the pasta to the bacon and shallot mixture, then pour the egg mixture over the top. Stir everything together and use the reserved pasta water to help create a creamy sauce. Plate and top with more Parmesan cheese and parsley before serving. Enjoy!

> Finished chain.


In [None]:
review

'To make pasta carbonara at home, start by cooking 1/2 a pound of pasta in salted boiling water. In a skillet, cook 8 ounces of bacon until crisp. Add 1/4 cup of chopped shallots and cook until softened. Whisk together 4 egg yolks, 1 cup of Parmesan Cheese, 2 tablespoons of freshly chopped parsley, and 1/2 cup of heavy cream. When the pasta is finished, reserve some of the pasta water and drain the pasta. Add the pasta to the bacon and shallot mixture, then pour the egg mixture over the top. Stir everything together and use the reserved pasta water to help create a creamy sauce. Plate and top with more Parmesan cheese and parsley before serving. Enjoy!'

**load_summarize_chain**

In [None]:
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader('panda.txt')
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
texts = text_splitter.split_documents(documents)

In [None]:
print(len(texts))
for idx, text in enumerate(texts):
  print("chunk:", idx, "len:",len(text.page_content))

2
chunk: 0 len: 917
chunk: 1 len: 926
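
The `map_reduce` chain type used in the next cell summarizes each chunk separately and then summarizes the combined partial summaries. A rough sketch of that pattern, with a trivial first-sentence function standing in for the LLM (both function names are hypothetical):

```python
def map_reduce_summarize(chunks, summarize):
    """Summarize each chunk, then summarize the joined partial summaries."""
    partial_summaries = [summarize(chunk) for chunk in chunks]  # map step
    return summarize(" ".join(partial_summaries))               # reduce step

# Hypothetical stand-in for an LLM summarizer: keep only the first sentence.
def first_sentence(text):
    return text.split(". ")[0].rstrip(".") + "."

chunks = [
    "Pandas live in bamboo forests. They eat mostly bamboo.",
    "Habitat loss threatens pandas. Conservation efforts help.",
]
print(map_reduce_summarize(chunks, first_sentence))
```

With two chunks the summarizer is called three times: once per chunk and once for the final combination.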


In [None]:
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)

response = chain.run(texts)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"Pandas, scientifically known as Ailuropoda melanoleuca, are captivating and iconic creatures native to the bamboo forests of China. Instantly recognizable by their distinctive black and white markings, pandas have captured the hearts of people worldwide. These gentle giants are primarily herbivores, with bamboo constituting about 99% of their diet. Despite their massive size, pandas are agile climbers and strong swimmers, showcasing their adaptability within their natural habitat. However, the survival of these magnificent creatures has been challenged by habitat loss due to deforestation and human encroachment. Conservation efforts have been instrumental in raising awareness and protecting these vulnerable creatures. Pandas' unique appeal and their role as an umbrella species have aided in garnering supp

In [None]:
print(len(response))
response

425


" Pandas are instantly recognizable animals that are culturally beloved, not only for their cuteness, but also for their role as ambassadors of conservation. Evidence of their global adoration can be seen in the WWF's panda mascot. Unfortunately, their population is threatened by deforestation and human encroachment. Therefore, conservation efforts have been made to protect this vulnerable species and its entire ecosystem."

## Agents

**serpapi**
- SerpApi is a web search API; the agent below uses it as its search tool.
- Manage your API key at https://serpapi.com/manage-api-key

In [None]:
APIKEY = getpass.getpass("APIKEY:")
os.environ["SERPAPI_API_KEY"] = APIKEY

APIKEY:··········


In [None]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)

tools = load_tools(["serpapi", "llm-math"], llm=llm)

In [None]:
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

In [None]:
agent.run("Who is the current leader of Japan?")



> Entering new AgentExecutor chain...
I need to find out who the current leader of Japan is.
Action: Search
Action Input: "current leader of Japan"
Observation: Fumio Kishida is the current prime minister of Japan, replacing Yoshihide Suga on 4 October 2021.
Thought: I now know the final answer.
Final Answer: Fumio Kishida is the current prime minister of Japan.

> Finished chain.


'Fumio Kishida is the current prime minister of Japan.'
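
The trace above follows a loop: the model emits an `Action` and `Action Input`, the tool's result is appended as an `Observation`, and this repeats until a `Final Answer` appears. The sketch below imitates that loop with a scripted stand-in for the model and a fake search tool. `run_agent` is a hypothetical helper, and LangChain's actual agent executor is more involved.

```python
import re

def run_agent(llm, tools, question, max_steps=5):
    """Alternate model output and tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        output = llm(transcript)
        transcript += output + "\n"
        final = re.search(r"Final Answer:\s*(.*)", output)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.*)\nAction Input:\s*(.*)", output)
        if action is None:
            return None  # the output followed neither expected format
        tool_name, tool_input = action.group(1).strip(), action.group(2).strip()
        observation = tools[tool_name](tool_input)
        transcript += f"Observation: {observation}\nThought:"
    return None

# Hypothetical scripted "model" and search tool, for illustration only.
scripted = iter([
    'I need to look this up.\nAction: Search\nAction Input: "capital of France"',
    "I now know the final answer.\nFinal Answer: Paris",
])
answer = run_agent(
    lambda prompt: next(scripted),
    {"Search": lambda query: "Paris"},
    "What is the capital of France?",
)
print(answer)
```

The `max_steps` limit keeps the loop from running forever if the model never produces a final answer.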

# Reference

- https://docs.langchain.com/docs/
- https://github.com/langchain-ai/langchain/tree/master/libs/langchain
- https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%201%20-%20Fundamentals.ipynb
- LangChain API reference: https://api.python.langchain.com/en/latest/api_reference.html#
- LangChain integration hub: https://integrations.langchain.com/