### cookbook

In [73]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=.7)

In [74]:
chat([SystemMessage(content = "You are a helpful assistant that helps a user figure out what activity to do"),
      HumanMessage(content="I am bored and don't know what to do, I like outdoor activities")
])

AIMessage(content="Great! Since you enjoy outdoor activities, here are a few suggestions:\n\n1. Go for a hike: Find a local trail or nature reserve and enjoy some fresh air while exploring nature. It's a great way to exercise and unwind.\n\n2. Try Geocaching: Geocaching is like a modern-day treasure hunt. Use a GPS device or a smartphone app to search for hidden caches in your area. It's a fun and adventurous outdoor activity.\n\n3. Visit a park: Spend some time in a nearby park where you can have a picnic, read a book, play sports, or simply relax and enjoy the surroundings.\n\n4. Go cycling: Grab your bike and explore your neighborhood or find a local bike trail. Cycling is not only a great way to stay active but also allows you to discover new places around you.\n\n5. Play a sport: Gather some friends or join a local sports club for activities like soccer, basketball, tennis, or frisbee. It's a fun way to socialize and stay active.\n\n6. Take up gardening: If you have a garden or ev

Documents

In [75]:
from langchain.schema import Document

In [76]:
Document(page_content="This is my document", 
         metadata={'my_doc_id': 1234,
            'my_doc_title': 'My Document Title'})


Document(page_content='This is my document', metadata={'my_doc_id': 1234, 'my_doc_title': 'My Document Title'})

Models

In [77]:
# language model
from langchain.llms import OpenAI
llm = OpenAI(model_name='text-ada-001')

In [78]:
print(llm("what is the meaning of life?"))



The meaning of life is subjective and can differ from person to person.


chat model

In [79]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=1)

In [80]:
chat(
    [
        SystemMessage(content="You are an unhelpful AI bot that makes a joke at whatever the user says"),
        HumanMessage(content="I would like to go to New York, how should I do this?")
    ]
)

AIMessage(content='Oh, going to New York? Well, have you considered teleportation? It\'s the fastest and most convenient way to get there. Just close your eyes, click your heels together three times, and say, "There\'s no place like New York." Voila! Just kidding, you should probably book a flight or take a train.')

embeddings

In [81]:
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

In [82]:
text = "Hi! It's time for the beach"
text_embedding = embeddings.embed_query(text)
print(f"Here's a sample: {text_embedding[:5]}...")
print(f"Your embedding is length {len(text_embedding)}")

Here's a sample: [-0.0001654900139761228, -0.0031877204299245185, -0.000727170326471238, -0.01945995769154943, -0.015092385756617603]...
Your embedding is length 1536
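
Embedding vectors like the one above are usually compared with cosine similarity. The following is a minimal sketch of that calculation in plain Python. The three short vectors are made-up stand-ins chosen for illustration, not real OpenAI embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors, used only for illustration.
beach = [0.9, 0.1, 0.0]
ocean = [0.8, 0.2, 0.1]
taxes = [0.0, 0.1, 0.9]

# Vectors pointing in similar directions score closer to 1.0.
print(cosine_similarity(beach, ocean) > cosine_similarity(beach, taxes))
```

With real embeddings the same function would take two results of `embeddings.embed_query(...)` as its inputs.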


function calling

In [83]:
chat = ChatOpenAI(model='gpt-3.5-turbo-0613', temperature=1)

output = chat(messages=
     [
         SystemMessage(content="You are a helpful AI bot"),
         HumanMessage(content="What’s the weather like in Boston right now?")
     ],
     functions=[{
         "name": "get_current_weather",
         "description": "Get the current weather in a given location",
         "parameters": {
             "type": "object",
             "properties": {
                 "location": {
                     "type": "string",
                     "description": "The city and state, e.g. San Francisco, CA"
                 },
                 "unit": {
                     "type": "string",
                     "enum": ["fahrenheit"]
                 }
             },
             "required": ["location", "unit"]
         }
     }
     ]
)
print(output.additional_kwargs)

{'function_call': {'arguments': '{\n"location": "Boston, MA",\n"unit": "fahrenheit"\n}', 'name': 'get_current_weather'}}


In [110]:
import json
# The arguments arrive as a JSON string; decode them into a dict.
json.loads(output.additional_kwargs['function_call']['arguments'])


{'location': 'Boston, MA', 'unit': 'fahrenheit'}
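
The model only proposes a function call; running it is left to the caller. Below is a rough sketch of one way to dispatch the parsed arguments to a local function. The body of `get_current_weather` and the `AVAILABLE_FUNCTIONS` registry are hypothetical and exist only for this illustration. The `additional_kwargs` dict copies the shape printed above.

```python
import json


def get_current_weather(location, unit="fahrenheit"):
    """Hypothetical stand-in; a real version would query a weather service."""
    return {"location": location, "unit": unit, "temperature": None}


# Hypothetical registry mapping advertised function names to local callables.
AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}

# Same shape as the output.additional_kwargs value printed above.
additional_kwargs = {
    "function_call": {
        "arguments": '{\n"location": "Boston, MA",\n"unit": "fahrenheit"\n}',
        "name": "get_current_weather",
    }
}

call = additional_kwargs["function_call"]
func = AVAILABLE_FUNCTIONS[call["name"]]  # KeyError if the name is unknown
kwargs = json.loads(call["arguments"])    # arguments are a JSON-encoded string
result = func(**kwargs)
print(result)
```

Looking the name up in an explicit registry, rather than with `eval` or `globals()`, limits the model to functions that were deliberately exposed.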

prompt

In [84]:
llm = OpenAI(model_name='text-davinci-003')
from langchain import PromptTemplate

template ='''
I really want to do {activity} today. How do I get started?
I live in {location}.
'''

prompt = PromptTemplate(input_variables = ['activity', 'location'],
                        template = template)

final_prompt = prompt.format(activity='surfing', location='Seattle')

print(final_prompt)
print('*'*40)
print(f'LLM: {llm(final_prompt)}')


I really want to do surfing today. How do I get started?
I live in Seattle.

****************************************
LLM: 
If you're looking to get started with surfing in Seattle, you'll need to find a surf shop and rent a surfboard, a wetsuit, and a leash. Look for a nearby beach with consistent waves and a designated surfing area. Be sure to check the beach's rules and regulations before entering the water. You may also want to sign up for a surf lesson or two, as they can help you learn the basics and ensure you stay safe while out in the ocean. Good luck and have fun!
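
For a simple template like this one, the formatting step behaves much like Python's built-in `str.format`. The sketch below reproduces the prompt text without LangChain. It illustrates the substitution only and does not cover any input validation that `PromptTemplate` performs.

```python
# Same template text as in the cell above.
template = '''
I really want to do {activity} today. How do I get started?
I live in {location}.
'''

# Fill each placeholder with a keyword argument.
final_prompt = template.format(activity="surfing", location="Seattle")
print(final_prompt)
```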


In [85]:
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-003")

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}",
)

# Examples of locations where nouns are usually found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "ground"},
    {"input": "bird", "output": "nest"},
]

In [86]:
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,
    OpenAIEmbeddings(),  # embedding model used to measure semantic similarity
    Chroma,  # vector store class used to store and search the embeddings
    k = 2)


In [87]:
similar_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the location an item is usually found in",
    suffix = "Input: {noun}\nOutput:",
    input_variables=["noun"])

In [88]:
my_noun = "student"

print(similar_prompt.format(noun=my_noun))

Give the location an item is usually found in

Example Input: driver
Example Output: car

Example Input: driver
Example Output: car

Input: student
Output:


In [89]:
llm(similar_prompt.format(noun=my_noun))

' school'
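
The selector picks the `k` examples whose embeddings are closest to the embedding of the input. The following is a simplified sketch of that idea, assuming cosine similarity and hand-written 2-dimensional vectors. The vectors in `toy_vectors` and `student_vector` are invented for illustration. This is not the LangChain or Chroma implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "ground"},
    {"input": "bird", "output": "nest"},
]

# Hypothetical 2-d "embeddings", invented for this sketch.
toy_vectors = {
    "pirate": [0.7, 0.5],
    "pilot": [1.0, 0.0],
    "driver": [1.0, 0.1],
    "tree": [0.0, 1.0],
    "bird": [0.2, 0.9],
}


def select_examples(query_vector, k=2):
    """Return the k examples most similar to the query vector."""
    ranked = sorted(
        examples,
        key=lambda ex: cosine(query_vector, toy_vectors[ex["input"]]),
        reverse=True,
    )
    return ranked[:k]


student_vector = [0.95, 0.1]  # hypothetical vector for "student"
selected = select_examples(student_vector)
print([ex["input"] for ex in selected])
```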

In [90]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
response_schemas = [
    ResponseSchema(name="bad_string", description="This a poorly formatted user input string"),
    ResponseSchema(name="good_string", description="This is your response, a reformatted response")
]

# How you would like to parse your output
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [91]:
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```


In [92]:
template = """
You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

{format_instructions}

% USER INPUT:
{user_input}

YOUR RESPONSE:
"""

prompt = PromptTemplate(
    input_variables=["user_input"],
    partial_variables={"format_instructions": format_instructions},
    template=template
)

promptValue = prompt.format(user_input="welcom to califonya!")

print(promptValue)


You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```

% USER INPUT:
welcom to califonya!

YOUR RESPONSE:



In [93]:
llm_output = llm(promptValue)
llm_output

'\n```json\n{\n\t"bad_string": "welcom to califonya!",\n\t"good_string": "Welcome to California!"\n}\n```'

In [94]:
output_parser.parse(llm_output)

{'bad_string': 'welcom to califonya!', 'good_string': 'Welcome to California!'}
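
Conceptually, the parsing step pulls the fenced JSON block out of the model's reply and decodes it. The sketch below shows one way to do that with the standard library. `parse_json_markdown` is a name chosen for this example, and the code is an approximation of the idea rather than LangChain's own parser.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def parse_json_markdown(text):
    """Extract the first fenced json block from text and decode it."""
    pattern = FENCE + r"json\s*(.*?)\s*" + FENCE
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("no fenced json block found")
    return json.loads(match.group(1))


# Same content as the llm_output string shown above.
llm_output = (
    "\n" + FENCE + "json\n{\n\t\"bad_string\": \"welcom to califonya!\","
    "\n\t\"good_string\": \"Welcome to California!\"\n}\n" + FENCE
)

parsed = parse_json_markdown(llm_output)
print(parsed)
```

Raising an error when no block is found makes a malformed reply fail loudly instead of silently returning partial data.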

### Pydantic 

In [95]:
from langchain.pydantic_v1 import BaseModel, Field
from typing import Optional

class Person(BaseModel):
    """Identifying information about a person."""

    name: str = Field(..., description="The person's name")
    age: int = Field(..., description="The person's age")
    fav_food: Optional[str] = Field(None, description="The person's favorite food")

In [96]:
# Pass the favorite food through the declared `fav_food` field.
wes = Person(name="Wes", age=30, fav_food="taco")

In [97]:
from langchain.chains.openai_functions import create_structured_output_chain


llm = ChatOpenAI(model='gpt-4-0613')

chain = create_structured_output_chain(Person, llm, prompt)
chain.run(
    "Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally."
)

Person(name='Sally', age=13, fav_food='spinach')

In [98]:
from typing import Sequence

class People(BaseModel):
    """Identifying information about all people in a text."""

    people: Sequence[Person] = Field(..., description="The people in the text")

In [99]:
chain = create_structured_output_chain(People, llm, prompt)
chain.run(
    "Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally."
)

People(people=[Person(name='Sally', age=13, fav_food=None), Person(name='Joey', age=12, fav_food='spinach'), Person(name='Caroline', age=23, fav_food=None)])

In [100]:
import enum 

class Product(str, enum.Enum):
    """A product that can be purchased."""

    COFFEE = "coffee"
    TEA = "tea"
    SMOOTHIE = "smoothie"

    


In [101]:
class Products(BaseModel):
    """Identifying products that were mentioned in a text"""

    products: Sequence[Product] = Field(..., description="The products mentioned in a text")

In [102]:
chain = create_structured_output_chain(Products, llm, prompt)
chain.run("The coffee was good, but the tea was better. I didn't like the smoothie. I hated the chocolate.")

Products(products=[<Product.COFFEE: 'coffee'>, <Product.TEA: 'tea'>, <Product.SMOOTHIE: 'smoothie'>])
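
Because `Product` is an enum, a value outside its three members cannot be represented, which is consistent with "chocolate" being absent from the result above. The sketch below shows that restriction with the standard library alone. `to_product` is a helper name invented for this example.

```python
import enum


class Product(str, enum.Enum):
    """A product that can be purchased."""

    COFFEE = "coffee"
    TEA = "tea"
    SMOOTHIE = "smoothie"


def to_product(value):
    """Return the matching Product, or None if value is not a member."""
    try:
        return Product(value)
    except ValueError:
        return None


mentions = ["coffee", "tea", "smoothie", "chocolate"]
# Keep only the mentions that map onto a declared enum member.
products = [p for p in map(to_product, mentions) if p is not None]
print(products)
```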

Document loaders

In [112]:
from langchain.document_loaders import HNLoader
loader = HNLoader("https://news.ycombinator.com/item?id=34422627")

data = loader.load()

In [113]:
print(f"Found {len(data)} comments")
print(f"Here's a sample:\n\n{''.join([x.page_content[:150] for x in data[:2]])}")

Found 76 comments
Here's a sample:

Ozzie_osman 11 months ago  
             | next [–] 

LangChain is awesome. For people not sure what it's doing, large language models (LLMs) are veryOzzie_osman 11 months ago  
             | parent | next [–] 

Also, another library to check out is GPT Index (https://github.com/jerryjliu/gpt_index


In [114]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [115]:
# This is a long document we can split up.
with open('worked.txt') as f:
    pg_work = f.read()
    
print(f"You have {len([pg_work])} document")

You have 1 document


In [116]:
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size = 150,
    chunk_overlap  = 20,
)

texts = text_splitter.create_documents([pg_work])

In [117]:
print(f"You have {len(texts)} documents")

You have 610 documents


In [118]:
print("Preview:")
print(texts[0].page_content, "\n")
print(texts[1].page_content)

Preview:
February 2021Before college the two main things I worked on, outside of school,
were writing and programming. I didn't write essays. I wrote what 

beginning writers were supposed to write then, and probably still
are: short stories. My stories were awful. They had hardly any plot,
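
To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified fixed-width splitter. It is a sketch of the general idea only. `RecursiveCharacterTextSplitter` also tries to break on separators such as newlines, so its chunks will not match this output. `split_with_overlap` is a name made up for this example.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into chunk_size pieces, each sharing chunk_overlap
    characters with the previous piece."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The overlap repeats the tail of one chunk at the head of the next, which helps keep a sentence that straddles a boundary retrievable from either side.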


In [120]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

In [121]:
loader = TextLoader('worked.txt')
documents = loader.load()

In [122]:
# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# Get embedding engine ready
embeddings = OpenAIEmbeddings()

# Embed your texts
db = FAISS.from_documents(texts, embeddings)

ImportError: Could not import faiss python package. Please install it with `pip install faiss-gpu` (for CUDA supported GPU) or `pip install faiss-cpu` (depending on Python version).
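
The error above means the `faiss` module is not installed in the current environment. A small guard like the following sketch can check for it before the vector store is built. `faiss_available` is a helper name invented for this example.

```python
import importlib.util


def faiss_available():
    """Return True if a module named 'faiss' can be found for import."""
    return importlib.util.find_spec("faiss") is not None


if not faiss_available():
    # Both install options named in the error message provide this module.
    print("faiss is not installed; try `pip install faiss-cpu`")
```

After installing the package, re-running the `FAISS.from_documents(texts, embeddings)` cell should get past this import step.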