# Python RAG Pattern with Semantic Kernel and PgVector

## PGVector Setup - Local Container or Azure PostgreSQL Flexible Server

### Running a local database with a container:

- Pull the image: `docker pull pgvector/pgvector:pg16`
- Then start the container:

```bash
docker run --name pgvector16 \
  --restart unless-stopped \
  -p 5432:5432 \
  -e POSTGRES_PASSWORD=password \
  -v pgdata:/var/lib/postgresql/data \
  -d pgvector/pgvector:pg16
```

After the container starts, connect with psql using the command below, then run `CREATE EXTENSION vector;` at the prompt:

```bash
docker exec -it pgvector16 psql -U postgres
```

### Running a Flexible Server in Azure - Manual Instructions:

- Create a Flexible Server instance in the Azure Portal
- After creation, navigate to the Server Parameters pane:
  - Search for `azure.extensions`
  - Check the `Vector` value
  - Save the changes and wait for the deployment to finish
- After deployment, open the instance and navigate to the `Database` panel:
  - Click the `Connect` link on the Postgres database
    - Using the Cloud Shell psql, activate the vector extension by typing: `CREATE EXTENSION vector;`

Connection strings (set in the `.env` file):
- Docker: `PG_CONN_STR_PY="postgresql://<user>:<password>@<server>:5432/<database>"`
- Azure: `PG_CONN_STR_PY="postgresql://<USER>:<PASSWORD>@<NAME>.postgres.database.azure.com:5432/postgres"`
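
Passwords that contain characters such as `@` or `/` break a `postgresql://` URI unless they are percent-encoded. The following is a minimal sketch of assembling the string safely; `build_pg_conn_str` is a hypothetical helper, not part of the notebook or of any library.

```python
from urllib.parse import quote


def build_pg_conn_str(user: str, password: str, host: str,
                      database: str = "postgres", port: int = 5432) -> str:
    """Assemble a postgresql:// URI, percent-encoding the credentials."""
    # safe="" also encodes "/" and "@", which would otherwise split the URI
    # in the wrong place.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


print(build_pg_conn_str("postgres", "p@ss/word", "localhost"))
# → postgresql://postgres:p%40ss%2Fword@localhost:5432/postgres
```

The resulting value can be pasted into `PG_CONN_STR_PY` as is.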

Useful commands:

- Empty the collection table (run in psql): `truncate table public."PYCollection";`

## Setup

### Load required packages

In [None]:
%pip install -q semantic-kernel==1.9.0 python-dotenv psycopg[binary,pool] azure-search-documents azure-identity

In [1]:
import semantic_kernel as sk
from semantic_kernel.memory import SemanticTextMemory, VolatileMemoryStore
from semantic_kernel.core_plugins import TextMemoryPlugin
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    AzureTextEmbedding,
)
from semantic_kernel.connectors.memory.postgres.postgres_memory_store import (
    PostgresMemoryStore,
)
from semantic_kernel.connectors.memory.azure_cognitive_search import AzureCognitiveSearchMemoryStore, AzureAISearchSettings
from semantic_kernel.connectors.ai.open_ai import AzureChatPromptExecutionSettings, OpenAIChatPromptExecutionSettings
from semantic_kernel.prompt_template import InputVariable, PromptTemplateConfig
from semantic_kernel.functions import KernelArguments

from dotenv import load_dotenv
import os

COLLECTION_NAME = "PYCollection"
ADA_EMBEDDINGS_SIZE = 1536

### Load the environment variables

In [2]:
load_dotenv()
endpoint = os.getenv("GPT_OPENAI_ENDPOINT")
api_key = os.getenv("GPT_OPENAI_KEY")
gpt_deployment_name = os.getenv("GPT_OPENAI_DEPLOYMENT_NAME")
conn_str = os.getenv("PG_CONN_STR_PY")
ada_deployment_name = "text-embedding-ada-002"
ai_search_endpoint = os.getenv("AI_SEARCH_ENDPOINT")
ai_search_key = os.getenv("AI_SEARCH_KEY")
#print(endpoint, api_key, gpt_deployment_name, conn_str, ai_search_endpoint, ai_search_key)
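
`os.getenv` returns `None` for a variable that is not set, and the failure then tends to surface much later, inside a service call. A small check right after `load_dotenv()` can report the gap early. This is a sketch only; `missing_settings` and `REQUIRED_SETTINGS` are hypothetical names, and the variable list simply mirrors the names read in the cell above.

```python
import os

# Names read by the notebook cell above.
REQUIRED_SETTINGS = [
    "GPT_OPENAI_ENDPOINT",
    "GPT_OPENAI_KEY",
    "GPT_OPENAI_DEPLOYMENT_NAME",
    "PG_CONN_STR_PY",
    "AI_SEARCH_ENDPOINT",
    "AI_SEARCH_KEY",
]


def missing_settings(names, env=None):
    """Return the names that are unset or empty in `env` (default: os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_settings(REQUIRED_SETTINGS)
if missing:
    # Only the names are printed, never the values, so secrets stay out of the output.
    print("Missing settings:", ", ".join(missing))
```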

### Get a kernel instance configured for chat completions and embeddings

In [3]:
kernel = sk.Kernel()
kernel.add_service(AzureChatCompletion(service_id="gpt", deployment_name=gpt_deployment_name, endpoint=endpoint, api_key=api_key))
embedding_generator = AzureTextEmbedding(service_id="ada", deployment_name=ada_deployment_name, endpoint=endpoint, api_key=api_key)
kernel.add_service(embedding_generator)

In [19]:
#mem_store=VolatileMemoryStore()
#mem_store = PostgresMemoryStore(conn_str,ADA_EMBEDDINGS_SIZE,1,3)
mem_store = AzureCognitiveSearchMemoryStore(vector_size=ADA_EMBEDDINGS_SIZE,
                                            search_endpoint=ai_search_endpoint,
                                            admin_key=ai_search_key) 
if await mem_store.does_collection_exist(COLLECTION_NAME):
    await mem_store.delete_collection(COLLECTION_NAME)

# async with AzureCognitiveSearchMemoryStore(vector_size=ADA_EMBEDDINGS_SIZE,search_endpoint=ai_search_endpoint,admin_key=ai_search_key) as acs_connector:
#     pass

In [None]:
memory = SemanticTextMemory(storage=mem_store, embeddings_generator=embedding_generator)
kernel.add_plugin(TextMemoryPlugin(memory), "TextMemoryPlugin")
print("Kernel is ready to use")

## Ingestion

### Read the files and chunk them by paragraph

In [14]:
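def read_file(file: str) -> str:
    """Return the full text of a file."""
    with open(file, "r") as f:
        return f.read()


def ingest_content(path: str) -> list:
    """Split every matching file under `path` into paragraph chunks."""
    chunks = []
    for name in os.listdir(path):
        # Only files whose name ends with "water.txt" are ingested.
        if name.endswith("water.txt"):
            # Build the file path from `path` instead of a hard-coded "data/" prefix.
            content = read_file(os.path.join(path, name))
            paragraphs = content.split("\n\n")
            total = len(paragraphs)
            # Chunk ids have the form "<file>-<paragraph count>-<1-based index>".
            for index, paragraph in enumerate(paragraphs, start=1):
                chunks.append({"id": f"{name}-{total}-{index}", "chunk": paragraph, "file": name})
    return chunks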

chunks = ingest_content("data")
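
Splitting on exactly `"\n\n"` keeps empty or whitespace-only chunks when a file contains three or more consecutive newlines or blank lines with stray spaces, and those chunks would still be embedded and stored. A possible variant, shown here as a sketch with the hypothetical helper `split_paragraphs`, splits on any run of blank lines and drops empty results.

```python
import re


def split_paragraphs(content: str) -> list[str]:
    """Split text on blank lines and discard empty paragraphs."""
    # A "blank line" here is a newline, optional whitespace, and another newline.
    parts = re.split(r"\n\s*\n", content)
    return [part.strip() for part in parts if part.strip()]


sample = "Water is H2O.\n\n\nIt freezes at 0 C.\n \nIt boils at 100 C.\n"
print(split_paragraphs(sample))
# → ['Water is H2O.', 'It freezes at 0 C.', 'It boils at 100 C.']
```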

### Save the chunks and embeddings in the vector database

In [15]:
async def populate_memory(memory: SemanticTextMemory, chunks: list) -> None:
    for chunk in chunks:
        await memory.save_information(collection=COLLECTION_NAME, id=chunk["id"], text=chunk["chunk"], description=chunk["file"])

await populate_memory(memory, chunks)

## Grounding

### Find memories that match the query and collect their text to augment the prompt

In [16]:
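async def search_memory_examples(memory, question: str, limit: int = 3, relevance: float = 0.75) -> list:
    # Return at most `limit` memories whose relevance score is at least `relevance`.
    results = await memory.search(collection=COLLECTION_NAME, query=question, limit=limit, min_relevance_score=relevance)
    return results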
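
To make the `limit` and `relevance` parameters concrete, here is a pure-Python sketch of a threshold-then-top-k search over cosine similarity. It illustrates the idea only and is not the code any of the memory stores actually runs; `cosine_similarity`, `top_matches`, and the two-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec, items, limit=3, min_relevance=0.75):
    """Score every (text, vector) item, drop weak matches, return the best `limit`."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in items]
    kept = [pair for pair in scored if pair[0] >= min_relevance]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:limit]


items = [
    ("about water", [1.0, 0.0]),
    ("about fire", [0.0, 1.0]),
    ("about ice", [0.8, 0.6]),
]
for score, text in top_matches([1.0, 0.0], items):
    print(f"{score:.2f} {text}")
```

With these toy vectors, "about fire" scores 0 and falls below the 0.75 threshold, so only the other two items are returned. A higher threshold yields fewer but closer matches, and a threshold that is too strict can return nothing at all.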


### Build a context from the text chunks in the memories

In [None]:
question = "What is the chemical composition of water?"
results = await search_memory_examples(memory, question)
prompt_context = "Context: \"\"\"\n"

for result in results:
    prompt_context += f"Text:\n{result.text}\nSource:\n{result.description}\n"
    
prompt_context += "\"\"\""
prompt_context
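
The same context string can be produced by a small function that also caps its size, which helps when chunks are long and the model's context window is limited. This is a sketch under assumptions: `build_context` and the `max_chars` budget are hypothetical, and `SimpleNamespace` objects stand in for the search results, which only need `text` and `description` attributes.

```python
from types import SimpleNamespace


def build_context(results, max_chars: int = 4000) -> str:
    """Join result texts in the notebook's format, stopping at a character budget."""
    parts = []
    used = 0
    for result in results:
        entry = f"Text:\n{result.text}\nSource:\n{result.description}\n"
        if used + len(entry) > max_chars:
            break  # Adding this entry would exceed the budget.
        parts.append(entry)
        used += len(entry)
    return 'Context: """\n' + "".join(parts) + '"""'


# Stand-in for the objects returned by the memory search.
fake_results = [SimpleNamespace(text="Water is H2O.", description="water.txt")]
print(build_context(fake_results))
```

When no memory passes the relevance threshold, the function returns an empty `Context` block, so the model answers without grounding. Checking for an empty result list before invoking the prompt makes that case explicit.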

## Process Prompt & Completion

### Create a SK function

In [None]:
rag_prompt = """
{{$input}}

{{$context}}
""".strip()

execution_settings = AzureChatPromptExecutionSettings(
    service_id="gpt",
    max_tokens=50,
    temperature=0.1,
)
# Pass the execution settings through KernelArguments so they travel with the prompt invocation.
arguments = KernelArguments(settings=execution_settings, input=question, context=prompt_context)

answer = await kernel.invoke_prompt(rag_prompt, arguments=arguments)
print(answer)