# Retrieval Augmented Generation With Postgres, Ollama and LangChain

This notebook demonstrates how to use the retrieval-augmented generation (RAG) technique to make data stored in Postgres available to an LLM served by Ollama.

**0. Prerequisites**

* Install Python and pip.
* Start a Postgres container with pgvector and load the movies dataset by following the instructions in [chapter8.md](../chapter8.md).
* Start an Ollama container and download the required embedding and large language models by following the instructions in [ai_samples/README.md](README.md).

**1. Install Required Modules**

In [None]:
pip install -q psycopg2-binary==2.9.9 langchain-ollama==0.2.3

**2. Import Modules**

In [None]:
import psycopg2
from langchain_ollama import OllamaEmbeddings, OllamaLLM

**3. Define Function Using Ollama LLM**

In [3]:
def answer_question(question, context):
    # Connecting to the tinyllama model with the LangChain Ollama interface
    llm = OllamaLLM(model="tinyllama", temperature=0.6)

    # Build the final prompt for the LLM from the question and the provided context
    prompt = f"""
    You're a movie expert and your task is to answer questions about movies based on the provided context.

    This is the user's question: {question}  
    Consider the following context to provide a detailed and accurate answer: {context}  

    The context includes the following details for each movie:
    - "Title" of the movie
    - "Vote Average" - the average rating of the movie
    - "Budget" - the budget allocated for the movie in US dollars
    - "Revenue" - the total revenue generated by the movie in US dollars
    - "Release Date" - the date the movie was released

    Respond in an engaging style that inspires the user to watch the movies.
    """
    
    # Invoke the LLM passing the prompt. The LLM will generate a response.
    response = llm.invoke(prompt)
    return response

**4. Define Function Retrieving Context from Postgres**

In [9]:
def retrieve_context_from_postgres(question):
    # Connect to the Postgres instance with the pgvector extension
    db_params = {
        "host": "localhost",
        "port": 5432,
        "dbname": "postgres",
        "user": "postgres",
        "password": "password"
    }
    conn = psycopg2.connect(**db_params)
    cursor = conn.cursor()

    # Connect to the embedding model using the OllamaEmbeddings interface
    embedding_model = OllamaEmbeddings(model="mxbai-embed-large:335m")

    # Generate the embedding for the user's question
    embedding = embedding_model.embed_query(question)

    # Perform vector similarity search to find relevant movies
    query = """
    SELECT name, vote_average, budget, revenue, release_date
    FROM omdb.movies
    ORDER BY movie_embedding <=> %s::vector LIMIT 3
    """
    
    cursor.execute(query, (embedding, ))
    
    context = ""

    # Generate context from the retrieved rows
    for row in cursor.fetchall():
        context += f"Movie title: {row[0]}, Vote Average: {row[1]}"
        context += f", Budget: {row[2]}, Revenue: {row[3]}"
        context += f", Release Date: {row[4]}\n"

    cursor.close()
    conn.close()

    return context
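
The `<=>` operator in the query above is pgvector's cosine distance operator, so `ORDER BY ... LIMIT 3` returns the three movies whose embeddings point in the direction closest to the question's embedding. For intuition only, here is a minimal pure-Python sketch of that quantity. The `cosine_distance` function and the toy two-dimensional vectors are made up for illustration. Postgres computes the distance natively, and real embeddings have far more dimensions.

```python
import math

def cosine_distance(a, b):
    """Illustrative only: 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy vectors standing in for the question and movie embeddings
question_vec = [1.0, 0.0]
movies = {
    "same direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "opposite": [-1.0, 0.0],
}

# Smallest distance first, mirroring ORDER BY movie_embedding <=> question
ranked = sorted(movies, key=lambda name: cosine_distance(question_vec, movies[name]))
print(ranked)
```

Vector length does not affect the result, only direction does: `[2.0, 0.0]` is at distance 0 from `[1.0, 0.0]` despite being twice as long.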

**5. Perform RAG**

The RAG approach provides the LLM with additional details stored in Postgres. Grounding the LLM in this retrieved context lets it answer more accurately and with fewer hallucinations. You can reduce hallucinations further by tuning the LLM parameters (such as the temperature), adjusting the prompt, or switching to a more capable LLM.
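
As one illustration of adjusting the prompt, the sketch below builds a stricter variant that tells the model to rely only on the retrieved context and to say so when the context is insufficient. The `build_grounded_prompt` helper, its wording, and the sample movie row are assumptions made for illustration and are not part of the original notebook. Whether this actually reduces hallucinations depends on the model.

```python
def build_grounded_prompt(question, context):
    """Hypothetical helper: build a prompt that restricts the LLM to the retrieved context."""
    # Use an explicit marker so the model is never handed an empty context block
    context_block = context.strip() or "(no matching movies were found)"
    return (
        "You're a movie expert. Answer the user's question using ONLY the context below.\n"
        "If the context does not contain the answer, say that you don't know.\n\n"
        f"Question: {question}\n\n"
        f"Context:\n{context_block}\n"
    )

# Made-up context row in the same format that retrieve_context_from_postgres produces
sample_prompt = build_grounded_prompt(
    "Any good pirate movies?",
    "Movie title: Example Movie, Vote Average: 7.5\n",
)
print(sample_prompt)
```

The resulting string could be passed to `llm.invoke(...)` in place of the prompt built inside `answer_question`.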

In [None]:
# Prepare a sample question. Note the trailing space that separates the two sentences.
question = "I'd like to watch the best movies about pirates. " + \
    "Any suggestions?"

# Retrieve context from Postgres based on the user's question
context = retrieve_context_from_postgres(question)

print("Context from Postgres:")
print(context)

# Use the context to answer the user's question using the LLM
answer = answer_question(question, context)

print("LLM's answer:")
print(answer)
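
The retrieve-then-answer steps above can also be wrapped in a single function. The sketch below is one possible shape, assuming the retrieval and generation steps are passed in as callables. The `rag_answer` name, the empty-context fallback message, and the stand-in functions are hypothetical and exist only so the example runs without Postgres or Ollama.

```python
def rag_answer(question, retrieve, answer):
    """Hypothetical helper: run retrieval, then generation, skipping the LLM if nothing was found."""
    context = retrieve(question)
    if not context.strip():
        # Nothing retrieved: avoid asking the LLM to answer from an empty context
        return "No matching movies were found in the database."
    return answer(question, context)

# Stand-in callables for illustration; in the notebook you would pass
# retrieve_context_from_postgres and answer_question instead.
def fake_retrieve(question):
    return "Movie title: Example Movie, Vote Average: 7.5\n"

def fake_answer(question, context):
    return f"Answer based on {len(context.splitlines())} movie(s)."

print(rag_answer("Any pirate movies?", fake_retrieve, fake_answer))
```

Passing the two steps as arguments keeps the pipeline easy to test and makes it simple to swap in a different retriever or model later.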