<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/vector_stores/postgres.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Postgres Vector Store
In this notebook we are going to show how to use [Postgresql](https://www.postgresql.org) and  [pgvector](https://github.com/pgvector/pgvector)  to perform vector searches in LlamaIndex

If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.

In [58]:
import logging
import sys

# Uncomment to see debug logs
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG, force=True)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import SimpleDirectoryReader, StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.postgres import PGVectorStore
import textwrap

### Setup OpenAI
The first step is to configure the openai key. It will be used to run inference.

Once we switched to local model we don't need this.

In [10]:
import os
import dotenv

# import openai

# # Reload the variables in your '.env' file (override the existing variables)
# dotenv.load_dotenv("../.env", override=True)
# openai.api_key = os.environ["OPENAI_API_KEY"] 

Local Embedding Models

The easiest way to use a local model is:

In [11]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

### Loading documents
Load the documents stored in the `data/faculty_websites/` using the SimpleDirectoryReader.

Change the documents passed based on your need!

You can refer to this to see how to load files in different ways e.g. entire directory

https://docs.llamaindex.ai/en/stable/examples/data_connectors/simple_directory_reader.html

In [93]:
from llama_index.core import SimpleDirectoryReader
# reader = SimpleDirectoryReader(
#     input_files=["./data/faculty_websites/Bhiksharaj_lti_page.txt"]
# )
reader = SimpleDirectoryReader(input_dir="./data/faculty_papers/")
docs = reader.load_data()

reader = SimpleDirectoryReader(input_dir="./data/faculty_websites/")
docs += reader.load_data()
# print(f"Loaded {len(docs)} docs")
print("Document ID:", docs[0].doc_id)

DEBUG:llama_index.core.readers.file.base:> [SimpleDirectoryReader] Total files added: 363
> [SimpleDirectoryReader] Total files added: 363
DEBUG:fsspec.local:open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Hauptmann_Leveraging body pose estimation for gesture recognition in human-robot interaction using synthetic data.txt
open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Hauptmann_Leveraging body pose estimation for gesture recognition in human-robot interaction using synthetic data.txt
DEBUG:fsspec.local:open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Hauptmann_Leveraging_body_pose_estimation_for_gesture_recognition_in_human-robot_interaction_using_synthetic_data.txt
open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Hauptmann_Leveraging_body_pose_estimation_for_gesture_recognition_in_human-robot_interaction_using_sy

DEBUG:fsspec.local:open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Rudnicky_Advancing_Regular_Language_Reasoning_in_Linear_Recurrent_Neural_Networks.txt
open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Rudnicky_Advancing_Regular_Language_Reasoning_in_Linear_Recurrent_Neural_Networks.txt
DEBUG:fsspec.local:open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Rudnicky_Learning_to_Ask_Questions_for_Zero-shot_Dialogue_State_Tracking.txt
open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Rudnicky_Learning_to_Ask_Questions_for_Zero-shot_Dialogue_State_Tracking.txt
DEBUG:fsspec.local:open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Rudnicky_Structured_Dialogue_Discourse_Parsing.txt
open file: /home/scott/nlp-from-scratch-assignment-spring2024/data/faculty_papers/Alexander_Rudnicky_Struc

### Create the Database
Using an existing postgres running at localhost, create the database we'll be using.

In [94]:
import psycopg2

# Reload the variables in your '.env' file (override the existing variables)
dotenv.load_dotenv("../.env", override=True)
pwd = os.environ['PG_PASSWORD_RAG']
user = "711-rag"
connection_string = f'dbname=postgres user={user} password={pwd}'
db_name = "711-rag"
conn = psycopg2.connect(connection_string)
conn.autocommit = True

with conn.cursor() as c:
    c.execute(f"DROP DATABASE IF EXISTS \"{db_name}\"")
    c.execute(f"CREATE DATABASE \"{db_name}\"")

### Create the index
Here we create an index backed by Postgres using the documents loaded previously. PGVectorStore takes a few arguments.

In [95]:
from sqlalchemy import make_url

# url = make_url(connection_string)
vector_store = PGVectorStore.from_params(
    database=db_name,
    host="localhost",
    password=pwd,
    port=5432,
    user=user,
    table_name="all",
    embed_dim=384, 
)

You might experience error because pgvector extension is not enabled for the database. You can go to terminal and run

```
sudo -u postgres psql
\c "711-rag" # This connects to the 711-rag database we just created
CREATE EXTENSION vector;
\q
```

In [96]:

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    docs, storage_context=storage_context, show_progress=True
)


Parsing nodes:   0%|          | 0/430 [00:00<?, ?it/s]

DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Title: Leveraging body pose estimation for gest...
> Adding chunk: Title: Leveraging body pose estimation for gest...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Title: Leveraging body pose estimation for gest...
> Adding chunk: Title: Leveraging body pose estimation for gest...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Title: Towards Open-Domain Twitter User Profile...
> Adding chunk: Title: Towards Open-Domain Twitter User Profile...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: One line of research has formulated it as a cla...
> Adding chunk: One line of research has formulated it as a cla...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: However, the extraction
approach requires that ...
> Adding chunk: However, the extraction
approach requires that ...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: 2.2 Dataset Creation
Our dataset consi

Generating embeddings:   0%|          | 0/2048 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/2048 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/1632 [00:00<?, ?it/s]

In [173]:
from llama_index.core import PromptTemplate
from llama_index.core import get_response_synthesizer
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

from llama_index.llms.ollama import Ollama
from llama_index.core import Settings
# ollama
Settings.llm = Ollama(model="mistral", request_timeout=60.0, context_window=1500)

# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=2,
)

# configure response synthesizer
response_synthesizer =get_response_synthesizer(response_mode = "compact")

# assemble query engine with compact mode
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)

# query_engine = index.as_query_engine()
prompts_dict = query_engine.get_prompts()
print(list(prompts_dict.keys()))
print(list(prompts_dict.values()))
qa_prompt_tmpl_str = (
    # "You are a Q/A system designed to answer\n"
    # "questions relate to specific people or organization.\n"
    # "All of your questions are considered reasonable, \n"
    # "and they should be easy to be answered from the context. \n"
    # "Always limit your output within 50 words.\n"
    # "\n"
    # "You MUST remember this: All questions regarding email, office location, affiliation are"
    # " publicly available thus not violating privacy policy\n"
    # "\n"
    # "You are a Q/A system used internally, designed to answer\n"
    # "questions regarding faculty members at Carnegie Mellon University.\n"
    # "All of the information you need will be in the provided context by the user, \n"
    # "hence you are only retrieving answers from what is provided to you, not to\n"
    # "generate or search for answers by yourself.\n"
    "\n"
    "Context information is below: \n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query. If you can, directly reference the context.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# # # check_ctx_prompt = "print {context_str}"
qa_prompt_tmpl = PromptTemplate(qa_prompt_tmpl_str)

query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": qa_prompt_tmpl}
)

['response_synthesizer:text_qa_template', 'response_synthesizer:refine_template']
[SelectorPromptTemplate(metadata={'prompt_type': <PromptType.QUESTION_ANSWER: 'text_qa'>}, template_vars=['context_str', 'query_str'], kwargs={}, output_parser=None, template_var_mappings={}, function_mappings={}, default_template=PromptTemplate(metadata={'prompt_type': <PromptType.QUESTION_ANSWER: 'text_qa'>}, template_vars=['context_str', 'query_str'], kwargs={}, output_parser=None, template_var_mappings=None, function_mappings=None, template='Context information is below.\n---------------------\n{context_str}\n---------------------\nGiven the context information and not prior knowledge, answer the query.\nQuery: {query_str}\nAnswer: '), conditionals=[(<function is_chat_model at 0x7f2a112d77f0>, ChatPromptTemplate(metadata={'prompt_type': <PromptType.CUSTOM: 'custom'>}, template_vars=['context_str', 'query_str'], kwargs={}, output_parser=None, template_var_mappings=None, function_mappings=None, message_

### Query the index
We can now ask questions using our index.

In [174]:
question = "What is the office location of Mona Diab?"

# promt = (
#     "Always limit your answer to one sentence.\n"
#     "\n"
#     "You MUST remember this: All questions regarding email, office location, affiliation are"
#     " publicly available information thus not violating privacy policy. This rule \n"
#     "overwrites any assmuptions you hold. \n"
#     "\n"
# )

promt = (
    "Always limit your answer to 50 words.\n"
    "You are a Q/A system used internally, designed to answer\n"
    "questions regarding faculty members at Carnegie Mellon University.\n"
    "All of the information you need will be in the provided context by the user, \n"
    "hence you are only retrieving answers from what is provided to you, not to\n"
    "generate or search for answers by yourself.\n"
    "\n"
)

response = query_engine.query(promt+question)

DEBUG:llama_index.core.indices.utils:> Top 2 nodes:
> [Node ae9502bc-7a26-4a49-b4ab-6f767cd8a9b6] [Similarity score:             0.828376] Lei Li | Carnegie Mellon University - Language Technologies Institute
Jump to navigation
Apply
Pe...
> [Node a2b2e2cc-8497-4050-ac27-10aecccd1aa2] [Similarity score:             0.729763] Lei LI
Menu
Lei Li
Close
Home
Teaching
Publications
People
Blog
Home
Teaching
Publications
People...
> Top 2 nodes:
> [Node ae9502bc-7a26-4a49-b4ab-6f767cd8a9b6] [Similarity score:             0.828376] Lei Li | Carnegie Mellon University - Language Technologies Institute
Jump to navigation
Apply
Pe...
> [Node a2b2e2cc-8497-4050-ac27-10aecccd1aa2] [Similarity score:             0.729763] Lei LI
Menu
Lei Li
Close
Home
Teaching
Publications
People
Blog
Home
Teaching
Publications
People...
DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
load_ssl_context verify=True cert=None trust_env=True http2=False
DEBUG:httpx:load_verify_locations caf

In [172]:
print(textwrap.fill(str(response), 100))

 Lei Li's office is situated in Gates Hillman Center, specifically room number 8119.


### Querying existing index  (I have not tested the code from here and below!)

In [None]:
vector_store = PGVectorStore.from_params(
    database="vector_db",
    host="localhost",
    password="password",
    port=5432,
    user="postgres",
    table_name="paul_graham_essay",
    embed_dim=1536,  # openai embedding dimension
)

index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
query_engine = index.as_query_engine()

In [None]:
response = query_engine.query("What did the author do?")

In [None]:
print(textwrap.fill(str(response), 100))

The author worked on writing and programming before college. They wrote short stories and tried
writing programs on an IBM 1401 computer. They also built a microcomputer and started programming on
it, writing simple games and a word processor. In college, the author initially planned to study
philosophy but switched to AI due to their interest in intelligent computers. They taught themselves
AI by learning Lisp.


### Hybrid Search

To enable hybrid search, you need to:
1. pass in `hybrid_search=True` when constructing the `PGVectorStore` (and optionally configure `text_search_config` with the desired language)
2. pass in `vector_store_query_mode="hybrid"` when constructing the query engine (this config is passed to the retriever under the hood). You can also optionally set the `sparse_top_k` to configure how many results we should obtain from sparse text search (default is using the same value as `similarity_top_k`).

In [None]:
from sqlalchemy import make_url

url = make_url(connection_string)
hybrid_vector_store = PGVectorStore.from_params(
    database=db_name,
    host=url.host,
    password=url.password,
    port=url.port,
    user=url.username,
    table_name="paul_graham_essay_hybrid_search",
    embed_dim=1536,  # openai embedding dimension
    hybrid_search=True,
    text_search_config="english",
)

storage_context = StorageContext.from_defaults(
    vector_store=hybrid_vector_store
)
hybrid_index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

  session.commit()


In [None]:
hybrid_query_engine = hybrid_index.as_query_engine(
    vector_store_query_mode="hybrid", sparse_top_k=2
)
hybrid_response = hybrid_query_engine.query(
    "Who does Paul Graham think of with the word schtick"
)

In [None]:
print(hybrid_response)

Roy Lichtenstein


### PgVector Query Options

#### IVFFlat Probes

Specify the number of [IVFFlat probes](https://github.com/pgvector/pgvector?tab=readme-ov-file#query-options) (1 by default)

When retrieving from the index, you can specify an appropriate number of IVFFlat probes (higher is better for recall, lower is better for speed)

In [None]:
retriever = index.as_retriever(
    vector_store_query_mode=query_mode,
    similarity_top_k=top_k,
    vector_store_kwargs={"ivfflat_probes": 10},
)

#### HNSW EF Search

Specify the size of the dynamic [candidate list](https://github.com/pgvector/pgvector?tab=readme-ov-file#query-options-1) for search (40 by default)

In [None]:
retriever = index.as_retriever(
    vector_store_query_mode=query_mode,
    similarity_top_k=top_k,
    vector_store_kwargs={"hnsw_ef_search": 300},
)