# Implementing our RAG Application

We've walked through the process of ingesting our data. Now we want to set up our search application.

There are several ways to implement search, most of which we won't address in this workshop.

We're going to implement a similarity search. We can choose to use the OpenSearch®️ SDK or we can use LangChain.

## Similarity Search with LangChain

LangChain lets us reuse our existing configuration across the platform so that our settings stay consistent. This is paramount to the success of your similarity search: if your search parameters (for example, the embedding model) differ from the values you used to load the data, the query can fail or return irrelevant results.

LangChain gives us the ability to access a [preexisting OpenSearch instance](https://python.langchain.com/docs/integrations/vectorstores/opensearch/#using-a-preexisting-opensearch-instance).
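
One way to keep ingestion and search settings aligned is to define them once and validate every vector against them. The following is a minimal sketch in plain Python, not part of LangChain: `EmbeddingSettings`, `check_vector`, the model name, and the 768 dimension are all illustrative assumptions, so substitute the values you actually used when loading the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSettings:
    """Hypothetical single source of truth shared by ingestion and search."""
    model_name: str
    dimension: int


# Assumed values for illustration only; use the ones from your ingestion step.
SETTINGS = EmbeddingSettings(
    model_name="sentence-transformers/all-mpnet-base-v2",
    dimension=768,
)


def check_vector(vector, settings=SETTINGS):
    """Raise early if a vector's length does not match the stored embeddings."""
    if len(vector) != settings.dimension:
        raise ValueError(
            f"expected {settings.dimension} dimensions for "
            f"{settings.model_name}, got {len(vector)}"
        )
    return vector


check_vector([0.0] * 768)      # matches the assumed dimension, so no error
try:
    check_vector([0.0] * 384)  # e.g. a vector produced by a smaller model
except ValueError as err:
    print(err)
```

Running a check like this on the query embedding before it reaches the database turns a confusing search failure into an explicit error message.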

In [None]:
import os
from pprint import pprint

from dotenv import load_dotenv
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_postgres.vectorstores import PGVector
import psycopg
from psycopg.rows import dict_row

load_dotenv()

embeddings = HuggingFaceEmbeddings()
query = "how do I create healthy boundaries"
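# Format the embedding as a bracketed, comma-separated vector string.
# Plain concatenation avoids nested quotes in an f-string, which need Python 3.12+.
query_embed = "[" + ", ".join(str(x) for x in embeddings.embed_query(query)) + "]"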
print(query_embed)
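
The string built in the cell above is the text form that pgvector accepts for a vector: comma-separated numbers inside square brackets. As a small sketch, the formatting can be pulled into a helper so that every query builds it the same way. The name `to_pgvector_literal` is hypothetical; it is not a LangChain or pgvector API.

```python
def to_pgvector_literal(values):
    """Format a sequence of numbers as a pgvector text literal, e.g. "[0.1, 0.2]"."""
    return "[" + ", ".join(str(float(x)) for x in values) + "]"


print(to_pgvector_literal([0.1, 0.25, -1]))  # prints [0.1, 0.25, -1.0]
```

In the notebook, the helper would be called as `to_pgvector_literal(embeddings.embed_query(query))`.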

Now let's take that embedding and perform a vector search.
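
In pgvector, the `<->` operator used in the query below orders rows by Euclidean (L2) distance, so the closest matches come first. The sketch here mimics that ordering in plain Python on made-up two-dimensional vectors. `l2_distance`, `top_k`, and the toy rows are illustrative assumptions; in the workshop, the real ranking happens inside Postgres.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.dist(a, b)


def top_k(query_vec, rows, k):
    """Return the k rows whose embeddings are closest to query_vec."""
    return sorted(rows, key=lambda row: l2_distance(row["embedding"], query_vec))[:k]


# Toy data standing in for the rows of the quotes table.
toy_rows = [
    {"content": "inbox zero", "embedding": [1.0, 0.0]},
    {"content": "saying no", "embedding": [0.0, 1.0]},
    {"content": "batching email", "embedding": [0.9, 0.1]},
]

nearest = top_k([1.0, 0.0], toy_rows, k=2)
print([row["content"] for row in nearest])  # prints ['inbox zero', 'batching email']
```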

In [None]:
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Connection parameters
_CONNECTION_STRING = os.getenv("AIVEN_POSTGRES_SERVICE_URI")
conn = psycopg.connect(_CONNECTION_STRING)

k_results = 4
query = "Tips for checking email"
# Format the embedding as a bracketed, comma-separated vector string
query_embed = "[" + ", ".join(str(x) for x in embeddings.embed_query(query)) + "]"
with conn.cursor(row_factory=dict_row) as cur:
    cur.execute(
        "SELECT * FROM quotes ORDER BY embedding <-> %s LIMIT %s;",
        (
            query_embed,
            k_results,
        ),
    )
    rows = cur.fetchall()
    episodes = set()

    for row in rows:
        episodes.add(" - ".join((row["transcription_title"], row["content"])))

print(f'Here are some episodes that might help you with "{query}":')
print("\n".join(episodes))
print("-------------------")

llm = ChatOllama(model="llama3.2")
prompt = ChatPromptTemplate.from_template("""
        Offer supportive advice for the question {query} with supporting quotes from
        "{docs}".

        If there are no documents to quote, say "I don't have any information on that."

        Mention the quote you're pulling from.
        Don't include quotes from other sources.
        Make responses about 600 characters.
""")

chain = prompt | llm | StrOutputParser()
# Build the context from the rows returned by the vector search above
topic = {"query": query, "docs": "\n".join(row["content"] for row in rows)}
for chunks in chain.stream(topic):
    print(chunks, end="", flush=True)



In [None]:
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Connection parameters
_CONNECTION_STRING = os.getenv("AIVEN_POSTGRES_SERVICE_URI")
conn = psycopg.connect(_CONNECTION_STRING)

query = "Tips for Checking Email"
# Format the embedding as a bracketed, comma-separated vector string
query_embed = "[" + ", ".join(str(x) for x in embeddings.embed_query(query)) + "]"
k_results = 4

with conn.cursor(row_factory=dict_row) as cur:
    cur.execute(
        "SELECT * FROM quotes ORDER BY embedding <-> %s LIMIT %s;",
        (
            query_embed,
            k_results,
        ),
    )
    rows = cur.fetchall()
    episodes = set()

    for row in rows:
        episodes.add(" - ".join((row["transcription_title"], row["content"])))

print(f'Here are some episodes that might help you with "{query}":')
print("\n".join(episodes))
print("-------------------")

llm = ChatOllama(model="deepseek-r1:8b")
prompt = ChatPromptTemplate.from_template("""
        In the style of Dr. Seuss, make a catchy acronym from these quotes:
        "{docs}"
        
""")

chain = prompt | llm | StrOutputParser()
topic = {"query": query, "docs": "\n".join([row["content"] for row in rows])}
for chunks in chain.stream(topic):
    print(chunks, end="", flush=True)



This is good, but we probably don't want to use Ollama in our production environment. Let's look at how easy it is to use a different model.

> NOTE: ⚠️ The next block uses the OpenAI API, which requires a paid API key.

To run the OpenAI example, you need to add your OpenAI API key to the `.env` file.
To do this, replace `<REPLACE_WITH_YOUR_OPENAI_API_KEY>` in the block below and run it:


In [None]:
# Append with >> so that any settings already in .env are preserved
!echo "OPENAI_API_KEY=<REPLACE_WITH_YOUR_OPENAI_API_KEY>" >> .env
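
Before calling the API, it can be worth confirming that the key is actually visible to the process. This is a minimal sketch, assuming the variable name `OPENAI_API_KEY` and the placeholder text used above. `require_env` is a hypothetical helper, not part of `python-dotenv` or LangChain.

```python
import os


def require_env(name):
    """Return an environment variable's value, or fail with a clear message."""
    value = os.getenv(name)
    # Treat a missing value or an unedited placeholder as "not configured".
    if not value or value.startswith("<REPLACE_"):
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Demo with a throwaway variable so the sketch runs without a real key.
os.environ["DEMO_API_KEY"] = "example-value"
print(require_env("DEMO_API_KEY"))  # prints example-value
```

In the notebook, you would call `require_env("OPENAI_API_KEY")` right after `load_dotenv()`.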

Now you can run the block below to make the same sort of query as before, this time using OpenAI instead of Ollama:


In [None]:
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

load_dotenv()

print("-------------------")
print(f'Here are some episodes that might help you with "{query}":')
episodes = set()
# `rows` holds the matches returned by the vector search in the previous cell
for row in rows:
    episodes.add(" - ".join((row["transcription_title"], row["content"])))
print("\n".join(episodes))

llm = ChatOpenAI()
prompt = ChatPromptTemplate.from_messages([
    ("system",
     """Offer supportive advice for the question {query} with supporting quotes from
     ---
     {docs}
     ---
    Wrap quotes in quotation marks. Don't include quotes from other sources.
    If there are no documents to quote, say "I don't have any information on that."

    Mention the quote you're pulling from.
    Limit responses to under 1000 characters, but use multiple paragraphs for readability.
    """),
    ("user",
     "{query}"),
])

chain = prompt | llm | StrOutputParser()
# Build the context from the same rows used for the episode list
topic = {"query": query, "docs": "\n".join(row["content"] for row in rows)}
for chunks in chain.stream(topic):
    print(chunks, end="", flush=True)
