<u>Main Modules</u>

1. `Model IO`: Interface with language models: prompts, model wrappers, and output parsers.
2. `Retrieval`: Interface with application-specific data
3. `Agents`: Let chains choose which tools to use given high-level directives.

<u>Additional</u>

1. `Chains`: Common building-block compositions
2. `Memory`: Persist application state between runs of a chain
3. `Callbacks`: Log and stream intermediate steps of any chain

### Caching in LLMs

https://python.langchain.com/docs/integrations/llms/llm_caching

Interesting to try next: `SQLAlchemyCache`

In [1]:
import boto3
from langchain_community.chat_models import BedrockChat

llm = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

In [2]:
from langchain.globals import set_llm_cache
from langchain.cache import InMemoryCache

set_llm_cache(InMemoryCache())

In [3]:
%%time

llm.predict("Tell me a joke")


CPU times: user 131 ms, sys: 16.9 ms, total: 148 ms
Wall time: 1.59 s


"Here's a silly joke for you:\n\nWhy can't a bicycle stand up on its own? Because it's two-tired!\n\nHow was that? I tried to come up with a simple, lighthearted pun-based joke. Let me know if you'd like to hear another one."

In [5]:
%%time
llm.predict("Tell me a joke")

CPU times: user 2.78 ms, sys: 0 ns, total: 2.78 ms
Wall time: 2.5 ms


"Here's a silly joke for you:\n\nWhy can't a bicycle stand up on its own? Because it's two-tired!\n\nHow was that? I tried to come up with a simple, lighthearted pun-based joke. Let me know if you'd like to hear another one."

### Output Parsers

<u>Useful</u>

1. StrOutputParser
2. JsonOutputParser, SimpleJsonOutputParser
3. XMLOutputParser
4. AgentOutputParser
    - ReActJsonSingleInputOutputParser
    - ReActSingleInputOutputParser
    - JSONAgentOutputParser
    - XMLAgentOutputParser
    - SelfAskOutputParser
5. RetryOutputParser
6. OutputFixingParser

In [5]:
import boto3
from langchain_community.chat_models import BedrockChat
from langchain_core.prompts import PromptTemplate

model = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

#### JSON parser

In [12]:
from langchain.output_parsers.json import SimpleJsonOutputParser

json_prompt = PromptTemplate.from_template(
    "Return only a JSON object with an `answer` key that answers the following question: {question}"
)
json_parser = SimpleJsonOutputParser()
json_chain = json_prompt | model | json_parser

In [13]:
list(json_chain.stream({"question": "Who invented the microscope?"}))

[{},
 {'answer': ''},
 {'answer': 'The'},
 {'answer': 'The microsc'},
 {'answer': 'The microscope'},
 {'answer': 'The microscope was'},
 {'answer': 'The microscope was invented'},
 {'answer': 'The microscope was invented by'},
 {'answer': 'The microscope was invented by Hans'},
 {'answer': 'The microscope was invented by Hans L'},
 {'answer': 'The microscope was invented by Hans Lipp'},
 {'answer': 'The microscope was invented by Hans Lippersh'},
 {'answer': 'The microscope was invented by Hans Lippershey'},
 {'answer': 'The microscope was invented by Hans Lippershey,'},
 {'answer': 'The microscope was invented by Hans Lippershey, Zach'},
 {'answer': 'The microscope was invented by Hans Lippershey, Zacharias'},
 {'answer': 'The microscope was invented by Hans Lippershey, Zacharias Jan'},
 {'answer': 'The microscope was invented by Hans Lippershey, Zacharias Janssen'},
 {'answer': 'The microscope was invented by Hans Lippershey, Zacharias Janssen,'},
 {'answer': 'The microscope was inve

In [19]:
from tqdm import tqdm

mem = []  # collects every failed parse
for i in tqdm(range(500)):
    try:
        res = json_chain.invoke({"question": "Who invented the microscope?"})
    except Exception as e:
        print(i, e)
        # `res` is not assigned when invoke() raises, so record only the error
        mem.append({
            "i": i,
            "error": e,
        })

len(mem)

100%|██████████| 500/500 [06:01<00:00,  1.38it/s]


0

#### Output fixing parser

In [56]:
from langchain.output_parsers import PydanticOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field
from typing import List

class Actor(BaseModel):
    name: str = Field(description="name of an actor")
    film_names: List[str] = Field(description="list of names of films they starred in")


parser = PydanticOutputParser(pydantic_object=Actor)

misformatted = "{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}"

parser.parse(misformatted)

OutputParserException: Invalid json output: {'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}

In [57]:
from langchain.output_parsers import OutputFixingParser

fix_parser = OutputFixingParser.from_llm(parser=parser, llm=llm)

fix_parser.parse(misformatted)

Actor(name='Tom Hanks', film_names=['Forrest Gump'])

In [58]:
# prompt that's used by default
from langchain.output_parsers.prompts import NAIVE_FIX_PROMPT

NAIVE_FIX_PROMPT.pretty_print()

Instructions:
--------------
{instructions}
--------------
Completion:
--------------
{completion}
--------------

Above, the Completion did not satisfy the constraints given in the Instructions.
Error:
--------------
{error}
--------------

Please try again. Please only respond with an answer that satisfies the constraints laid out in the Instructions:


In [81]:
# custom prompt with the same input_variables as above

template = """
Instructions:
<instructions>
{instructions}
</instructions>

Completion:
<completion>
{completion}
</completion>

Above, the Completion did not satisfy the constraints given in the Instructions.
<error>
{error}
</error>

Please only respond with completion that satisfies the constraints laid out in the Instructions. Do not generate text beyond given completion.
"""

fix_prompt = PromptTemplate.from_template(template=template)

fix_prompt.pretty_print()


Instructions:
<instructions>
{instructions}
</instructions>

Completion:
<completion>
{completion}
</completion>

Above, the Completion did not satisfy the constraints given in the Instructions.
<error>
{error}
</error>

Please only respond with completion that satisfies the constraints laid out in the Instructions. Do not generate text beyond given completion.



In [82]:
fix_parser_prompt = OutputFixingParser.from_llm(
    parser=parser, llm=llm, prompt=fix_prompt
)

fix_parser_prompt.parse(misformatted)

Actor(name='Tom Hanks', film_names=['Forrest Gump'])

#### Retry parser

In [96]:
from langchain.prompts import PromptTemplate

template = """Based on the user question, provide an Action and Action Input for what step should be taken.
{format_instructions}
Question: {query}
Response: """

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

prompt_value = prompt.format_prompt(query="who invented computer?")

bad_response = '{"action": "search"}'

In [97]:
from langchain.output_parsers import PydanticOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field

class Action(BaseModel):
    action: str = Field(description="action to take")
    action_input: str = Field(description="input to the action")

parser = PydanticOutputParser(pydantic_object=Action)
parser.parse(bad_response)

OutputParserException: Failed to parse Action from completion {'action': 'search'}. Got: 1 validation error for Action
action_input
  field required (type=value_error.missing)

In [87]:
from langchain.output_parsers import OutputFixingParser

fix_parser = OutputFixingParser.from_llm(parser=parser, llm=llm, prompt=fix_prompt)
fix_parser.parse(bad_response)

Action(action='search', action_input='')

In [49]:
from langchain.output_parsers import RetryOutputParser

retry_parser = RetryOutputParser.from_llm(parser=parser, llm=llm)
retry_parser.parse_with_prompt(bad_response, prompt_value)

OutputParserException: Invalid json output: Here is the response formatted as a JSON instance that conforms to the provided schema:

{
  "action": "provide information",
  "action_input": "The computer was invented by multiple people over time, with key contributions from pioneers in computer science and engineering. Some of the major figures in the invention of the computer include:

- Charles Babbage - Designed the first mechanical computer, the Analytical Engine, in the 19th century.
- Alan Turing - Developed the theoretical foundations of computer science and the concept of the Turing machine in the 1930s.

In [89]:
from langchain.output_parsers import RetryWithErrorOutputParser

retry_with_error_parser = RetryWithErrorOutputParser(parser=parser, llm=llm)
retry_with_error_parser.parse_with_prompt(bad_response, prompt_value)

AttributeError: 'NoneType' object has no attribute 'run'

In [93]:
# prompt that's used by default
NAIVE_COMPLETION_RETRY = """Prompt:
{prompt}
Completion:
{completion}

Above, the Completion did not satisfy the constraints given in the Prompt.
Please try again:"""

NAIVE_COMPLETION_RETRY_WITH_ERROR = """Prompt:
{prompt}
Completion:
{completion}

Above, the Completion did not satisfy the constraints given in the Prompt.
Details: {error}
Please try again:"""

NAIVE_RETRY_PROMPT = PromptTemplate.from_template(NAIVE_COMPLETION_RETRY)
NAIVE_RETRY_WITH_ERROR_PROMPT = PromptTemplate.from_template(
    NAIVE_COMPLETION_RETRY_WITH_ERROR
)

# NAIVE_RETRY_PROMPT.pretty_print()
# NAIVE_RETRY_WITH_ERROR_PROMPT.pretty_print()

In [101]:
# custom prompts with the same input_variables as above

template = """
Prompt:
<prompt>
{prompt}
</prompt>

Completion:
<completion>
{completion}
</completion>

Please only respond with completion that satisfies the constraints given in the Prompt. Do not generate text beyond given completion.
"""

retry_prompt = PromptTemplate.from_template(template=template) 

# retry_prompt.pretty_print()

error_template = """
Prompt:
<prompt>
{prompt}
</prompt>

Completion:
<completion>
{completion}
</completion>

Details: {error}

Please only respond with completion that satisfies the constraints given in the Prompt. Do not generate text beyond given completion.
"""

retry_with_error_prompt = PromptTemplate.from_template(template=error_template) 

retry_with_error_prompt.pretty_print()


Prompt:
<prompt>
{prompt}
</prompt>

Completion:
<completion>
{completion}
</completion>

Details: {error}

Please only respond with completion that satisfies the constraints given in the Prompt. Do not generate text beyond given completion.



In [98]:
retry_parser = RetryOutputParser.from_llm(
    parser=parser, llm=llm, prompt=retry_prompt
)
retry_parser.parse_with_prompt(bad_response, prompt_value)

Action(action='search', action_input='who invented computer')

In [102]:
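# this did not work: the parser is constructed directly here, so its retry chain is likely never built (hence the NoneType error);
# building it with RetryWithErrorOutputParser.from_llm(...), as done for RetryOutputParser above, is the probable fix (not re-run here)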
from langchain.output_parsers import RetryWithErrorOutputParser

retry_with_error_parser = RetryWithErrorOutputParser(
    parser=parser, llm=llm, prompt=retry_with_error_prompt
)
retry_with_error_parser.parse_with_prompt(bad_response, prompt_value)

AttributeError: 'NoneType' object has no attribute 'run'
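The results above hint at the difference between the two approaches: the fixing parser only sees the bad completion and the format instructions, so it filled `action_input` with an empty string, while the retry parser also sees the original prompt and could recover `'who invented computer'`. A minimal sketch of that retry loop, using only the standard library, might look like the following. `parse_action`, `retry_parse`, and `fake_llm` are hypothetical names, the stub replaces a real model call, and the retry prompt text simply mirrors the default template printed earlier.

```python
import json
from typing import Callable


def parse_action(text: str) -> dict:
    """Parse JSON and require both keys, like the Action model above."""
    data = json.loads(text)
    missing = [k for k in ("action", "action_input") if k not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


def retry_parse(
    completion: str,
    prompt: str,
    llm: Callable[[str], str],
    max_retries: int = 1,
) -> dict:
    """Try to parse; on failure, ask the model again with prompt + bad completion."""
    for attempt in range(max_retries + 1):
        try:
            return parse_action(completion)
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            if attempt == max_retries:
                raise
            retry_prompt = (
                f"Prompt:\n{prompt}\nCompletion:\n{completion}\n"
                f"Details: {err}\nPlease try again:"
            )
            completion = llm(retry_prompt)


def fake_llm(retry_prompt: str) -> str:
    """Hypothetical model stub that always returns a complete answer."""
    return '{"action": "search", "action_input": "who invented computer"}'


result = retry_parse('{"action": "search"}', "who invented computer?", fake_llm)
print(result)
```

Passing the error text into the retry prompt corresponds to the "with error" variant; dropping the `Details:` line gives the plain retry behaviour.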

#### XML output parser & challenges

In [110]:
from langchain_core.prompts import ChatPromptTemplate

template = """Generate the shortened filmography for {actor}.
Please enclose the movies in <movie></movie> tags."""
# prompt = ChatPromptTemplate.from_messages(
#     ("human", template)
# )
prompt = PromptTemplate.from_template(
    template=template
)

prompt_value = prompt.format(actor="Tom Hanks")

print(llm.invoke(prompt_value).content)

Here is the shortened filmography for Tom Hanks:

<movie>Forrest Gump</movie>
<movie>Saving Private Ryan</movie>
<movie>Cast Away</movie>
<movie>Apollo 13</movie>
<movie>Toy Story</movie>
<movie>The Green Mile</movie>
<movie>Catch Me If You Can</movie>
<movie>Captain Phillips</movie>
<movie>Sully</movie>


In [134]:
from langchain.output_parsers import XMLOutputParser

parser = XMLOutputParser()

model = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":512}
)

new_template = """Format Instructions:
<format_instructions>
{format_instructions}
</format_instructions>

Follow the Format Instructions and generate the shortened filmography for {actor}. Do not explain. Start with XML tags.
"""

# Follow the Format Instructions and generate the  filmography for {actor}. Do not explain. Start with XML tags.

# Enclose within ```xml ```.

# print(parser.get_format_instructions())
prompt = PromptTemplate(
    template=new_template,
    input_variables=["actor"],
    partial_variables={
        "format_instructions": parser.get_format_instructions(),
    }
)

chain = prompt | model | parser

print(chain.invoke({"actor": "Tom Hanks"}))

{'filmography': [{'film': [{'title': 'Forrest Gump'}, {'year': '1994'}]}, {'film': [{'title': 'Saving Private Ryan'}, {'year': '1998'}]}, {'film': [{'title': 'Cast Away'}, {'year': '2000'}]}, {'film': [{'title': 'The Green Mile'}, {'year': '1999'}]}, {'film': [{'title': 'Toy Story'}, {'year': '1995'}]}]}


In [142]:
parser_mod = XMLOutputParser(tags=["movies", "actor", "film", "name", "genre"])

prompt_mod = PromptTemplate(
    template=new_template,
    input_variables=["actor"],
    partial_variables={
        "format_instructions": parser_mod.get_format_instructions(),
    }
)

# chain = prompt_mod | model | parser_mod
chain = prompt_mod | model

print(chain.invoke({"actor": "Tom Hanks"}))

content='<movies>\n    <actor>\n        <name>Tom Hanks</name>\n        <film>\n            <name>Forrest Gump</name>\n            <genre>Drama, Comedy</genre>\n        </film>\n        <film>\n            <name>Saving Private Ryan</name>\n            <genre>War, Drama</genre>\n        </film>\n        <film>\n            <name>Cast Away</name>\n            <genre>Drama, Adventure</genre>\n        </film>\n        <film>\n            <name>Toy Story</name>\n            <genre>Animation, Comedy, Family</genre>\n        </film>\n    </actor>\n</movies>'


In [141]:
# for a single tag, manual extraction is possible

from langchain_core.output_parsers import StrOutputParser

def _sanitize_output(text: str):
    _, after = text.split("<sql>")
    return after.split("</sql>")[0]


# chain = sql_prompt| model | StrOutputParser() | _sanitize_output

inputs = """<sql>
select *
from products;
</sql>"""

sql_query = _sanitize_output(inputs)
print(sql_query)


select *
from products;
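Note that `_sanitize_output` raises a `ValueError` when the text contains no `<sql>` tag or more than one, because the tuple unpacking expects exactly two parts. A slightly more forgiving sketch, assuming flat (non-nested) tags such as the `<movie>` list printed earlier, uses a regular expression to collect every occurrence. `extract_tag` is a hypothetical helper, not part of LangChain.

```python
import re
from typing import List


def extract_tag(text: str, tag: str) -> List[str]:
    """Return the stripped contents of every <tag>...</tag> pair in text."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    return [m.strip() for m in re.findall(pattern, text, flags=re.DOTALL)]


reply = """Here is the shortened filmography for Tom Hanks:

<movie>Forrest Gump</movie>
<movie>Saving Private Ryan</movie>
<movie>Cast Away</movie>
"""

movies = extract_tag(reply, "movie")
print(movies)
print(extract_tag("no tags here", "sql"))  # no match gives an empty list
```

The leading sentence in the model reply is simply ignored, which is convenient when the model does not start directly with the tags.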



In [144]:
# but what to do when there are many nested tags?
print('<movies>\n    <actor>\n        <name>Tom Hanks</name>\n        <film>\n            <name>Forrest Gump</name>\n            <genre>Drama, Comedy</genre>\n        </film>\n        <film>\n            <name>Saving Private Ryan</name>\n            <genre>War, Drama</genre>\n        </film>\n        <film>\n            <name>Cast Away</name>\n            <genre>Drama, Adventure</genre>\n        </film>\n        <film>\n            <name>Toy Story</name>\n            <genre>Animation, Comedy, Family</genre>\n        </film>\n    </actor>\n</movies>')

<movies>
    <actor>
        <name>Tom Hanks</name>
        <film>
            <name>Forrest Gump</name>
            <genre>Drama, Comedy</genre>
        </film>
        <film>
            <name>Saving Private Ryan</name>
            <genre>War, Drama</genre>
        </film>
        <film>
            <name>Cast Away</name>
            <genre>Drama, Adventure</genre>
        </film>
        <film>
            <name>Toy Story</name>
            <genre>Animation, Comedy, Family</genre>
        </film>
    </actor>
</movies>
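One possible answer for nested output like this, sketched here with the standard library rather than a LangChain parser, is to parse the string with `xml.etree.ElementTree` and walk the tree. This assumes the model returned well-formed XML with no text around it (otherwise `ET.fromstring` raises `ParseError`); the shape of the `filmography` dictionary is an arbitrary choice for illustration, and the XML is shortened from the output above.

```python
import xml.etree.ElementTree as ET

xml_text = """<movies>
    <actor>
        <name>Tom Hanks</name>
        <film>
            <name>Forrest Gump</name>
            <genre>Drama, Comedy</genre>
        </film>
        <film>
            <name>Cast Away</name>
            <genre>Drama, Adventure</genre>
        </film>
    </actor>
</movies>"""

root = ET.fromstring(xml_text)   # root is the <movies> element
actor = root.find("actor")       # first <actor> child

filmography = {
    "actor": actor.findtext("name"),  # direct <name> child of <actor>
    "films": [
        {
            "name": film.findtext("name"),
            "genres": [g.strip() for g in film.findtext("genre").split(",")],
        }
        for film in actor.findall("film")
    ],
}
print(filmography)
```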


#### YAML parser

In [147]:
from typing import List

from langchain.output_parsers import YamlOutputParser
from langchain.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field

class Joke(BaseModel):
    setup: str=Field(description="question to set up a joke")
    punchline: str=Field(description="answer to resolve the joke")

parser = YamlOutputParser(pydantic_object=Joke)

# print(parser.get_format_instructions())

In [149]:
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser

joke_query = "Tell me a joke."
chain.invoke({"query": joke_query})


Joke(setup="Why don't scientists trust atoms?", punchline='Because they make up everything!')

### Retrievers

<u>Flow</u>

Source -> Load -> Transform -> Embed -> Store -> Retrieve

<u>Concepts</u>
1. Document loaders: load Documents from different sources. Document(page_content, metadata)
2. Text splitting: the chunking strategy, e.g. logic optimized for code or Markdown docs
3. Text embedding models: embeddings capture the semantic meaning of the text
4. Vector stores: store and search embeddings
5. Retrievers: retrieval algorithm (semantic search etc.) + vector store
    - parent document retriever
    - self-query retriever
    - ensemble retriever
6. Indexing

##### PDF

In [150]:
! pip install pypdf rapidocr-onnxruntime --quiet

In [151]:
%%time
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("https://arxiv.org/pdf/1706.03762.pdf")
pages = loader.load_and_split()

CPU times: user 1.54 s, sys: 62.7 ms, total: 1.6 s
Wall time: 1.68 s


In [152]:
len(pages)

16

In [157]:
print(pages[3].page_content)

Figure 1: The Transformer - model architecture.
The Transformer follows this overall architecture using stacked self-attention and point-wise, fully
connected layers for both the encoder and decoder, shown in the left and right halves of Figure 1,
respectively.
3.1 Encoder and Decoder Stacks
Encoder: The encoder is composed of a stack of N= 6 identical layers. Each layer has two
sub-layers. The first is a multi-head self-attention mechanism, and the second is a simple, position-
wise fully connected feed-forward network. We employ a residual connection [ 11] around each of
the two sub-layers, followed by layer normalization [ 1]. That is, the output of each sub-layer is
LayerNorm( x+ Sublayer( x)), where Sublayer( x)is the function implemented by the sub-layer
itself. To facilitate these residual connections, all sub-layers in the model, as well as the embedding
layers, produce outputs of dimension dmodel = 512 .
Decoder: The decoder is also composed of a stack of N= 6identical layers.

##### extract images as text

In [161]:
%%time


loader = PyPDFLoader("https://arxiv.org/pdf/1706.03762.pdf", extract_images=True)
pages = loader.load_and_split()
# pages = loader.load()

len(pages)

CPU times: user 7.78 s, sys: 3.09 s, total: 10.9 s
Wall time: 10.9 s


16

In [162]:
pages

[Document(page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing 

In [164]:
print(pages[3].page_content)

Figure 1: The Transformer - model architecture.
The Transformer follows this overall architecture using stacked self-attention and point-wise, fully
connected layers for both the encoder and decoder, shown in the left and right halves of Figure 1,
respectively.
3.1 Encoder and Decoder Stacks
Encoder: The encoder is composed of a stack of N= 6 identical layers. Each layer has two
sub-layers. The first is a multi-head self-attention mechanism, and the second is a simple, position-
wise fully connected feed-forward network. We employ a residual connection [ 11] around each of
the two sub-layers, followed by layer normalization [ 1]. That is, the output of each sub-layer is
LayerNorm( x+ Sublayer( x)), where Sublayer( x)is the function implemented by the sub-layer
itself. To facilitate these residual connections, all sub-layers in the model, as well as the embedding
layers, produce outputs of dimension dmodel = 512 .
Decoder: The decoder is also composed of a stack of N= 6identical layers.

##### using Amazon Textract

In [None]:
from langchain_community.document_loaders import AmazonTextractPDFLoader
loader = AmazonTextractPDFLoader("example_data/alejandro_rosalez_sample-small.jpeg")
documents = loader.load()

##### Text Splitters

langchain-text-splitters

`Transform docs` - split a long document into smaller chunks that can fit into your model's context window.
(The goal is not to chunk for chunking's sake; it is to get the data into a format where it can be retrieved for value later.)

Text Splitters `work` as follows:
1. Split the text up into small, semantically meaningful chunks (often sentences)
2. Start combining these small chunks into a larger chunk until you reach a certain size (as measured by some function).
3. Once you reach that size, make that chunk its own piece of text and then start creating a new chunk of text with some overlap 
   (to keep context between chunks).
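The three steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code of any LangChain splitter: `split_text` is a hypothetical function, sentences are approximated by splitting on `". "`, and `length_fn` is the pluggable size measure (characters by default).

```python
from typing import Callable, List


def split_text(
    text: str,
    chunk_size: int,
    chunk_overlap: int,
    length_fn: Callable[[str], int] = len,
    separator: str = ". ",
) -> List[str]:
    """Greedy splitter: merge small pieces up to chunk_size, carrying some overlap."""
    pieces = [p for p in text.split(separator) if p]  # step 1: small units
    chunks: List[str] = []
    current: List[str] = []
    for piece in pieces:
        candidate = separator.join(current + [piece])
        if current and length_fn(candidate) > chunk_size:
            chunks.append(separator.join(current))    # step 3: emit the chunk
            # keep only trailing pieces that fit within chunk_overlap
            while current and length_fn(separator.join(current)) > chunk_overlap:
                current.pop(0)
        current.append(piece)                         # step 2: keep combining
    if current:
        chunks.append(separator.join(current))
    return chunks


text = "One two three. Four five six. Seven eight nine. Ten eleven twelve"
chunks = split_text(text, chunk_size=35, chunk_overlap=15)
for c in chunks:
    print(repr(c))
```

Swapping `length_fn` for a word or token counter, for example `lambda s: len(s.split())`, changes how the chunk size is measured without touching the merging logic, and changing `separator` changes how the text is split.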

Customize your text splitter:
1. how the text is split
2. how the chunk size is measured


`Ref`: https://github.com/FullStackRetrieval-com/RetrievalTutorials/blob/main/tutorials/LevelsOfTextSplitting/5_Levels_Of_Text_Splitting.ipynb

`Levels` of text splitting:
1. Character splitting
2. Recursive Character text splitting
3. Document specific splitting
4. Semantic splitting
5. Agentic splitting
6. Alternative Representation chunking + indexing

How to `evaluate` text splitters:
ChunkViz utility: https://www.chunkviz.com


`Interesting` ones:
1. Text: 
    - CharacterTextSplitter
    - RecursiveCharacterTextSplitter
    - RecursiveJsonSplitter
2. Code: PythonCodeTextSplitter
3. PDFs with tables
4. Multi-modal (text + images)
5. Semantic Chunking
    - SemanticChunker
6. Hypothetical Questions: generate hypothetical questions about raw documents. 
   Helpful when you have sparse unstructured data, like chat messages.
7. Split by tokens: language models have a token limit, so it is a good idea to count tokens when splitting text into chunks.
   Count tokens with the same tokenizer that the language model uses.
   - SentenceTransformersTokenTextSplitter
   - NLTKTextSplitter
   - Hugging Face tokenizers, e.g. GPT2TokenizerFast
      from transformers import GPT2TokenizerFast
      from langchain_text_splitters import CharacterTextSplitter
      tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
      text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(
         tokenizer, chunk_size=100, chunk_overlap=0
      )

Graphs:
If the data is rich with entities, relationships, and connections, a graph structure would be beneficial.
Options:
- Diffbot
- InstaGraph

Webscraping tools:
1. Diffbot - https://python.langchain.com/docs/integrations/document_loaders/diffbot


#### Semantic chunking

In [4]:
! pip install langchain_experimental -qU

In [5]:
import json, os

with open("/home/ubuntu/config.json") as f:
    config = json.loads(f.read())
os.environ["COHERE_API_KEY"] = config["cohere_api_key"]

with open("data/state_of_the_union.txt") as f:
    state_of_the_union = f.read()
print(state_of_the_union)

Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  

Last year COVID-19 kept us apart. This year we are finally together again. 

Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. 

With a duty to one another to the American people to the Constitution. 

And with an unwavering resolve that freedom will always triumph over tyranny. 

Six days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. 

He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. 

He met the Ukrainian people. 

From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. 

Groups of citizens blocking tanks with their bodies. Every

In [6]:
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.embeddings import CohereEmbeddings

text_splitter = SemanticChunker(CohereEmbeddings())
docs = text_splitter.create_documents([state_of_the_union])

print(f"splits: {len(docs)}")
print(f"first doc content: {docs[0].page_content}")

splits: 26
first doc content: Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. Last year COVID-19 kept us apart. This year we are finally together again. Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. With a duty to one another to the American people to the Constitution. And with an unwavering resolve that freedom will always triumph over tyranny. Six days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. He met the Ukrainian people. From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. Groups of citizens blocking tanks with their bo

In [8]:
text_splitter = SemanticChunker(
    CohereEmbeddings(), breakpoint_threshold_type="percentile"
)

TypeError: SemanticChunker.__init__() got an unexpected keyword argument 'breakpoint_threshold_type'

#### Retrieval

In [None]:
CacheBackedEmbeddings
- used with a vector store (embeddings cached in a LocalFileStore)
- backed by a different ByteStore (e.g. InMemoryByteStore)

In [None]:
# vectorstore
from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import LocalFileStore
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

underlying_embeddings = OpenAIEmbeddings()

# embeddings are persisted as files under ./cache/
store = LocalFileStore("./cache/")

# the namespace keeps vectors from different embedding models apart
cached_embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings, store, namespace=underlying_embeddings.model
)

# keys of the embeddings cached so far (empty until something is embedded)
list(store.yield_keys())

# bytestore
from langchain.storage import InMemoryByteStore

store = InMemoryByteStore()

cached_embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings, store, namespace=underlying_embeddings.model
)
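To make the idea behind a cache-backed embedder concrete, here is a small standard-library sketch: each text is hashed into a key, vectors are looked up in a key-value store, and the underlying model is only called for texts that are missing. `CachedEmbedder` and `toy_embed` are hypothetical, and the SHA-256 key scheme is an assumption for illustration; it does not reproduce how `CacheBackedEmbeddings` derives its keys.

```python
import hashlib
from typing import Callable, Dict, List


class CachedEmbedder:
    """Hypothetical wrapper: embed each distinct text once, then reuse the vector."""

    def __init__(
        self,
        embed_fn: Callable[[List[str]], List[List[float]]],
        namespace: str,
    ):
        self.embed_fn = embed_fn
        self.namespace = namespace  # keeps vectors from different models apart
        self.store: Dict[str, List[float]] = {}

    def _key(self, text: str) -> str:
        return self.namespace + hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # de-duplicate while preserving order, then keep only uncached texts
        missing = [t for t in dict.fromkeys(texts) if self._key(t) not in self.store]
        if missing:
            for text, vector in zip(missing, self.embed_fn(missing)):
                self.store[self._key(text)] = vector
        return [self.store[self._key(t)] for t in texts]


calls = []


def toy_embed(texts: List[str]) -> List[List[float]]:
    """Stand-in for a real embedding model; records what it was asked to embed."""
    calls.append(list(texts))
    return [[float(len(t)), float(t.count(" "))] for t in texts]


embedder = CachedEmbedder(toy_embed, namespace="toy-model:")
embedder.embed_documents(["hello world", "foo"])
vectors = embedder.embed_documents(["hello world", "bar"])
print(calls)    # the second call only embeds "bar"
print(vectors)
```

Replacing the dictionary with a file-backed store would give persistence across runs, which is the role `LocalFileStore` plays in the cell above.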

In [None]:
1. MultiQueryRetriever