# Vector stores and retrievers

Following: https://python.langchain.com/v0.2/docs/tutorials/retrievers/

## Store

In [1]:
from langchain_core.documents import Document

documents = [
    Document(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"source": "mammal-pets-doc"},
    ),
    Document(
        page_content="Cats are independent pets that often enjoy their own space.",
        metadata={"source": "mammal-pets-doc"},
    ),
    Document(
        page_content="Goldfish are popular pets for beginners, requiring relatively simple care.",
        metadata={"source": "fish-pets-doc"},
    ),
    Document(
        page_content="Parrots are intelligent birds capable of mimicking human speech.",
        metadata={"source": "bird-pets-doc"},
    ),
    Document(
        page_content="Rabbits are social animals that need plenty of space to hop around.",
        metadata={"source": "mammal-pets-doc"},
    ),
]

I'm going to use mxbai-embed-large as the embedding model.

See here for more information: https://ollama.com/blog/embedding-models

In [2]:
from langchain_community.embeddings import OllamaEmbeddings

In [3]:
embeddings = OllamaEmbeddings(model="mxbai-embed-large")

In [4]:
embeddings.embed_query("This is a test")

[0.010212134569883347,
 -0.03700023889541626,
 -0.4237123429775238,
 0.2987978160381317,
 0.7057973146438599,
 -0.1535431444644928,
 -0.39707496762275696,
 -0.4230448603630066,
 1.2715002298355103,
 1.0776206254959106,
 0.8736289143562317,
 0.45922887325286865,
 0.18683820962905884,
 0.024590475484728813,
 -0.42067667841911316,
 0.17621882259845734,
 -0.5702452063560486,
 -0.5122719407081604,
 -0.9891872406005859,
 -0.5622479319572449,
 -0.6907287836074829,
 0.3007518947124481,
 -0.5230625867843628,
 -0.9891065955162048,
 -0.6106640100479126,
 0.6101281046867371,
 -0.12390478700399399,
 0.13086625933647156,
 0.35936176776885986,
 1.0513122081756592,
 0.042592987418174744,
 -0.805928111076355,
 -0.2646864652633667,
 -0.932020902633667,
 -0.3157227635383606,
 -0.695506751537323,
 0.16210885345935822,
 -0.8042367696762085,
 0.05818185210227966,
 -0.6271530389785767,
 0.19418969750404358,
 -0.6769360303878784,
 0.5264475345611572,
 -0.7794733643531799,
 -0.3842529058456421,
 -0.33273488283
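The list above is just a long vector of floats. What makes it useful is that vectors for similar texts end up close to each other. Here is a minimal sketch of one common way to compare two vectors, cosine similarity, in plain Python. The three toy vectors are made up for illustration and are not real mxbai-embed-large output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings" (real ones have many more dimensions).
cat = [1.0, 0.0, 1.0]
kitten = [2.0, 0.0, 2.0]  # same direction as cat, different length
car = [0.0, 1.0, 0.0]     # orthogonal to cat

sim_cat_kitten = cosine_similarity(cat, kitten)  # close to 1: same direction
sim_cat_car = cosine_similarity(cat, car)        # 0: nothing in common
```

With real embeddings you would pass two results of `embeddings.embed_query(...)` to a function like this instead of the toy lists.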

In [5]:
from langchain_chroma import Chroma

In [6]:
vectorstore = Chroma.from_documents(documents, embedding=embeddings)

## Retrieve

In [7]:
vectorstore.similarity_search_with_score("cat")

[(Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'}),
  282.37957763671875),
 (Document(page_content='Rabbits are social animals that need plenty of space to hop around.', metadata={'source': 'mammal-pets-doc'}),
  283.5023498535156),
 (Document(page_content='Goldfish are popular pets for beginners, requiring relatively simple care.', metadata={'source': 'fish-pets-doc'}),
  291.3170166015625),
 (Document(page_content='Parrots are intelligent birds capable of mimicking human speech.', metadata={'source': 'bird-pets-doc'}),
  312.940673828125)]
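Two things to note about this output. Only four of the five documents come back, because `k` defaults to 4. And the score here behaves like a distance, so lower means more similar (the cat document has the smallest value). Below is a rough sketch of that ranking logic in plain Python. I'm assuming squared Euclidean (L2) distance, which I believe is Chroma's default but have not verified for this setup. The helper names and toy vectors are hypothetical.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_with_score(query_vec, indexed, k=4):
    """Return the k (text, distance) pairs closest to query_vec, nearest first."""
    scored = [(text, squared_l2(query_vec, vec)) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = better match
    return scored[:k]


# Hypothetical 2-dimensional vectors standing in for stored document embeddings.
toy_index = [
    ("cats", [1.0, 0.0]),
    ("rabbits", [0.8, 0.3]),
    ("parrots", [0.0, 1.0]),
]

results = search_with_score([1.0, 0.1], toy_index, k=2)
```

`results` holds the two nearest entries in ascending order of distance, the same shape as the list of `(Document, score)` tuples above.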

## Retrievers manually

In [8]:
from langchain_core.runnables import RunnableLambda

In [10]:
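# Wrap the vector store's search method as a Runnable and fix k=1 (top hit only)
retriever = RunnableLambda(vectorstore.similarity_search).bind(k=1)
# batch() runs the retriever once per query and returns one result list per query
retriever.batch(["cat", "shark"])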

[[Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'})],
 [Document(page_content='Goldfish are popular pets for beginners, requiring relatively simple care.', metadata={'source': 'fish-pets-doc'})]]

In [11]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 1})
retriever.batch(["cat", "shark"])

[[Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'})],
 [Document(page_content='Goldfish are popular pets for beginners, requiring relatively simple care.', metadata={'source': 'fish-pets-doc'})]]
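Both versions give the same result: one list of top-1 documents per query. To make that concrete, here is a plain-Python sketch of the same idea, with `functools.partial` playing the role of `.bind(k=1)` and a list comprehension playing the role of `.batch(...)`. The toy vectors and the `similarity_search` function below are made up for illustration and have nothing to do with LangChain's internals.

```python
from functools import partial

# Hypothetical 2-dimensional vectors for the queries and two of the documents.
TOY_QUERY_VECTORS = {"cat": (1.0, 0.0), "shark": (0.0, 1.0)}
TOY_DOCS = [
    ("Cats are independent pets that often enjoy their own space.", (0.9, 0.1)),
    ("Goldfish are popular pets for beginners, requiring relatively simple care.", (0.1, 0.9)),
]


def similarity_search(query, k=4):
    """Return the texts of the k documents nearest to the query vector."""
    qx, qy = TOY_QUERY_VECTORS[query]
    ranked = sorted(
        TOY_DOCS,
        key=lambda doc: (doc[1][0] - qx) ** 2 + (doc[1][1] - qy) ** 2,
    )
    return [text for text, _ in ranked[:k]]


# Fix k=1 once, then apply the resulting function to every query.
retrieve_top1 = partial(similarity_search, k=1)
batched = [retrieve_top1(query) for query in ["cat", "shark"]]
```

`batched` is a list with one single-item list per query, matching the nested shape of the outputs above.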

## Simple RAG

In [12]:
from langchain_community.chat_models import ChatOllama

In [13]:
llm = ChatOllama(model="llama3")

In [16]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

message = """
Answer this question using only information from the provided context. Act as a helpful and informative pet shop employee.

{question}

Context:
{context}
"""

prompt = ChatPromptTemplate.from_messages([("human", message)])

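# The input question goes to the retriever (fills {context}) and is also
# passed through unchanged (fills {question}); the prompt then goes to the LLM.
rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm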
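The dict at the start of the chain is the part that is easy to misread, so here is a sketch of the data flow in plain Python. The same question string is handed to the retriever and to the passthrough, and the two results fill the template slots. `fake_retriever` and `build_prompt` are hypothetical stand-ins, and formatting the retrieved list straight into the template is my assumption about what the real prompt does with it.

```python
TEMPLATE = """
Answer this question using only information from the provided context. Act as a helpful and informative pet shop employee.

{question}

Context:
{context}
"""


def fake_retriever(question):
    """Stand-in for the real retriever: always returns one fixed document text."""
    return ["Goldfish are popular pets for beginners, requiring relatively simple care."]


def build_prompt(question, retriever):
    """Mimic the first two chain steps: fan out the input, then fill the template."""
    inputs = {
        "context": retriever(question),  # retrieved documents
        "question": question,            # passed through unchanged
    }
    return TEMPLATE.format(**inputs)


prompt_text = build_prompt("Tell me about fish", fake_retriever)
```

In the real chain the resulting prompt is then sent to `llm`; this sketch stops before that step so it runs without Ollama.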

In [19]:
response = rag_chain.invoke("Tell me about fish")
print(response.content)

Fish! We get lots of folks coming in here wanting to learn more about our finned friends. Let me tell you, we've got a great selection of fish that make perfect pets for people of all experience levels.

Goldfish are actually one of the most popular types of fish we carry. And it's no wonder why - they're super easy to care for! As your document mentioned, goldfish require relatively simple care, making them a great choice for beginners. Just be sure to give them a big enough tank with plenty of space to swim around, and you'll be all set.

We've also got a variety of other fish species that make great pets. If you're looking for something a little more low-maintenance than goldfish, we might recommend some of our hardy community fish like guppies or neon tetras. These guys are super easy to care for and can thrive in smaller tanks.

Of course, if you're looking to get really into the world of fish-keeping, we've also got some more advanced species that can be a lot of fun to keep. Jus