# Azure Cosmos DB - Chat History Queries

This notebook demonstrates query patterns for chat history data in Azure Cosmos DB, including SQL queries and vector similarity search.

**Prerequisites:**
- Run the `cosmosdb-insert.ipynb` notebook first to seed the data
- Ensure you have a valid .env file with Cosmos DB connection details

## Setup

Install required packages and configure the Cosmos DB connection.

**Note:** If you've already run the insert notebook in this session, you may skip the setup steps.

**Step 1**: Install NuGet packages

**IMPORTANT**: If you get version conflict errors, restart the notebook kernel and run this cell again. Clearing outputs alone does not unload packages that are already loaded.

We'll use:
- `Microsoft.Azure.Cosmos` - Azure Cosmos DB SDK for .NET
- `DotNetEnv` - Load environment variables from .env file
- `System.Text.Json` - JSON serialization
- `Azure.AI.OpenAI` - Azure OpenAI for embeddings
- `Azure.Identity` - Authentication

In [None]:
#r "nuget: DotNetEnv, 3.1.0"
#r "nuget: System.Text.Json, 8.0.0"
#r "nuget: Azure.Identity, 1.13.1"
#r "nuget: Azure.AI.OpenAI, 2.1.0"
#r "nuget: Microsoft.Azure.Cosmos, 3.43.1"

**Step 2**: Load environment variables from .env file

The .env file contains connection strings and configuration. The code below looks for it one directory above the notebook's working directory.

In [None]:
using DotNetEnv;
using System.IO;

var envFilePath = Path.Combine(Directory.GetCurrentDirectory(), "..", ".env");
if (File.Exists(envFilePath))
{
    Env.Load(envFilePath);
    Console.WriteLine("Loaded environment variables from .env");
}
else
{
    Console.WriteLine($"No .env file found at {envFilePath}");
}

var cosmosEndpoint = Environment.GetEnvironmentVariable("COSMOS_DB_ENDPOINT");
var databaseName = Environment.GetEnvironmentVariable("COSMOS_DB_DATABASE_NAME");
var containerName = Environment.GetEnvironmentVariable("COSMOS_DB_CHAT_HISTORY_CONTAINER");

Console.WriteLine($"Cosmos DB Endpoint: {cosmosEndpoint}");
Console.WriteLine($"Database: {databaseName}");
Console.WriteLine($"Container: {containerName}");
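
The variables read above come back as `null` when they are missing, and the failure then only surfaces later, when a client is created. As a minimal sketch, a check like the following could list missing settings up front. `FindMissingSettings` is a hypothetical helper, not part of any SDK; the variable names are the ones this notebook reads.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the names whose values are missing or blank.
// The lookup is passed in so the check can be exercised without touching the real environment.
List<string> FindMissingSettings(IEnumerable<string> names, Func<string, string> lookup) =>
    names.Where(name => string.IsNullOrWhiteSpace(lookup(name))).ToList();

// Environment variable names read elsewhere in this notebook.
var requiredSettings = new[]
{
    "COSMOS_DB_CONNECTION_STRING",
    "COSMOS_DB_DATABASE_NAME",
    "COSMOS_DB_CHAT_HISTORY_CONTAINER",
    "AZURE_AI_FOUNDRY_SERVICE_ENDPOINT"
};

var missingSettings = FindMissingSettings(
    requiredSettings,
    name => Environment.GetEnvironmentVariable(name));

Console.WriteLine(missingSettings.Count == 0
    ? "All required settings are present."
    : $"Missing settings: {string.Join(", ", missingSettings)}");
```

Running this right after loading the .env file makes a misnamed or absent variable visible before any connection is attempted.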

**Step 3**: Define data models

Create C# classes to represent the chat history records returned by queries.

In [None]:
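using System.Text.Json.Serialization;

// Note: the Cosmos DB .NET SDK v3 serializes with Newtonsoft.Json by default, which ignores
// System.Text.Json attributes. Mapping still works here because each property name
// matches its JSON field name.
public class ChatHistoryRecord
{
    [JsonPropertyName("id")]
    public string id { get; set; }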
    
    public string ApplicationId { get; set; }
    
    public string UserId { get; set; }
    
    public string ThreadId { get; set; }
    
    public string Role { get; set; }
    
    public string Content { get; set; }
    
    public float[] ContentEmbedding { get; set; }
    
    public DateTime CreatedAt { get; set; }
    
    // System-maintained last-modified timestamp, in seconds since the Unix epoch
    [JsonPropertyName("_ts")]
    public long _ts { get; set; }
}

Console.WriteLine("Models defined");
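
The `_ts` property in the model is the last-modified timestamp that Cosmos DB maintains itself, stored as seconds since the Unix epoch, whereas `CreatedAt` is set by the application. The following is a small sketch of converting `_ts` for display; `TsToUtc` is a hypothetical helper and the sample value is made up for illustration.

```csharp
using System;

// Hypothetical helper: converts a Cosmos DB `_ts` value (Unix epoch seconds) to a UTC DateTime.
DateTime TsToUtc(long ts) => DateTimeOffset.FromUnixTimeSeconds(ts).UtcDateTime;

// Made-up sample value, not taken from the seeded data.
long sampleTs = 1700000000;
Console.WriteLine($"{TsToUtc(sampleTs):yyyy-MM-dd HH:mm:ss} UTC");
```

With a real record, the same call would be `TsToUtc(item._ts)`.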

**Step 4**: Connect to Cosmos DB

Create the client from the connection string stored in `COSMOS_DB_CONNECTION_STRING`.

In [None]:
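using Microsoft.Azure.Cosmos;

var cosmosConnectionString = Environment.GetEnvironmentVariable("COSMOS_DB_CONNECTION_STRING");
if (string.IsNullOrWhiteSpace(cosmosConnectionString))
{
    throw new InvalidOperationException("COSMOS_DB_CONNECTION_STRING is not set. Check your .env file.");
}

// GetContainer only creates a local reference; no request is sent until the first query runs
var client = new CosmosClient(cosmosConnectionString);
var container = client.GetContainer(databaseName, containerName);
Console.WriteLine($"Container reference obtained: {containerName}");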

---

## Querying Chat History

The following queries show different ways to read chat history using the SQL-like query language of Azure Cosmos DB for NoSQL.

**Query 1**: Get all chat messages for a specific user

In [None]:
var userId = "user-001";
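// Parameterized query: the user ID is passed separately instead of being concatenated into the query text
var query = "SELECT * FROM c WHERE c.UserId = @userId ORDER BY c.CreatedAt";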

var queryDefinition = new QueryDefinition(query)
    .WithParameter("@userId", userId);

var iterator = container.GetItemQueryIterator<ChatHistoryRecord>(queryDefinition);

Console.WriteLine($"Chat history for user: {userId}\n");
Console.WriteLine(new string('-', 80));

while (iterator.HasMoreResults)
{
    var response = await iterator.ReadNextAsync();
    foreach (var item in response)
    {
        Console.WriteLine($"[{item.CreatedAt:yyyy-MM-dd HH:mm}] {item.Role}: {item.Content}");
        Console.WriteLine();
    }
}

Console.WriteLine(new string('-', 80));

**Query 2**: Get recent messages (last 7 days)

In [None]:
var sevenDaysAgo = DateTime.UtcNow.AddDays(-7);
var query = "SELECT * FROM c WHERE c.CreatedAt >= @sevenDaysAgo ORDER BY c.CreatedAt DESC";

var queryDefinition = new QueryDefinition(query)
    .WithParameter("@sevenDaysAgo", sevenDaysAgo);

var iterator = container.GetItemQueryIterator<ChatHistoryRecord>(queryDefinition);

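Console.WriteLine("Recent messages (last 7 days):\n");
Console.WriteLine(new string('-', 80));

var count = 0;
while (iterator.HasMoreResults)
{
    var response = await iterator.ReadNextAsync();
    foreach (var item in response)
    {
        count++;
        // Preview at most 60 characters; add an ellipsis only when the text was actually cut
        var content = item.Content ?? string.Empty;
        var preview = content.Length > 60 ? content.Substring(0, 60) + "..." : content;
        Console.WriteLine($"[{item.CreatedAt:yyyy-MM-dd HH:mm}] {item.UserId} - {item.Role}");
        Console.WriteLine($"   {preview}");
        Console.WriteLine();
    }
}

Console.WriteLine(new string('-', 80));
Console.WriteLine($"Total messages: {count}");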
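
Cosmos DB has no dedicated date type, so the `>=` filter in Query 2 compares `CreatedAt` as a string. That matches chronological order when timestamps are stored as fixed-width ISO 8601 UTC strings, which is an assumption here because the format depends on how the insert notebook wrote the data. The sketch below illustrates the property with two made-up timestamps; `ToIso` is a hypothetical helper.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: formats a UTC timestamp with the round-trip ("O") specifier,
// which always emits seven fractional digits and a trailing Z.
string ToIso(DateTime utc) => utc.ToString("O", CultureInfo.InvariantCulture);

// Made-up timestamps, for illustration only.
var earlier = new DateTime(2025, 1, 5, 8, 30, 0, DateTimeKind.Utc);
var later = earlier.AddDays(7);

// Ordinal string comparison agrees with chronological order because the fields
// run from most to least significant and every field has a fixed width.
bool stringOrderMatches = string.CompareOrdinal(ToIso(earlier), ToIso(later)) < 0;
Console.WriteLine($"{ToIso(earlier)} < {ToIso(later)}: {stringOrderMatches}");
```

If stored timestamps mix formats (for example, different numbers of fractional digits or local offsets), string comparison may no longer match time order, so it is worth checking a stored document before relying on range filters.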

**Query 3**: Search messages containing a specific keyword

In [None]:
var keyword = "seafood";
// The third argument to CONTAINS enables case-insensitive matching
var query = "SELECT * FROM c WHERE CONTAINS(c.Content, @keyword, true)";

var queryDefinition = new QueryDefinition(query)
    .WithParameter("@keyword", keyword);

var iterator = container.GetItemQueryIterator<ChatHistoryRecord>(queryDefinition);

Console.WriteLine($"Messages containing '{keyword}':\n");
Console.WriteLine(new string('-', 80));

while (iterator.HasMoreResults)
{
    var response = await iterator.ReadNextAsync();
    foreach (var item in response)
    {
        Console.WriteLine($"[{item.ThreadId}] {item.Role}:");
        Console.WriteLine($"   {item.Content}");
        Console.WriteLine();
    }
}

Console.WriteLine(new string('-', 80));

---

## Semantic Search with Vector Embeddings

This section demonstrates **semantic search** with vector embeddings. This is how the agent "remembers" past conversations: it retrieves them by meaning rather than by exact keywords.
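
To make "similar by meaning" concrete, the sketch below computes cosine similarity between small toy vectors. It assumes cosine is the metric of interest; the metric Cosmos DB actually applies is whatever the container's vector policy specifies. `CosineSimilarity` is a hypothetical helper, and the three-dimensional vectors are made up (real embeddings have far more dimensions).

```csharp
using System;

// Hypothetical helper: cosine similarity = dot(a, b) / (|a| * |b|).
// The result ranges from -1 to 1; higher means the vectors point in more similar directions.
// A zero-length vector yields NaN because the denominator is zero.
double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Toy vectors standing in for embeddings.
var queryVector = new float[] { 1f, 0f, 1f };
var closeVector = new float[] { 1f, 0f, 0.9f };
var farVector = new float[] { 0f, 1f, 0f };

Console.WriteLine($"close: {CosineSimilarity(queryVector, closeVector):F3}");
Console.WriteLine($"far:   {CosineSimilarity(queryVector, farVector):F3}");
```

The vector that points in nearly the same direction as the query scores close to 1, while the orthogonal one scores 0. Vector search ranks stored messages by this kind of score against the query embedding.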

**Setup Azure OpenAI Embedding Client**

Configure the embedding client to generate vectors for semantic search queries.

In [None]:
using Azure.AI.OpenAI;
using Azure.Identity;

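var azureAIEndpoint = Environment.GetEnvironmentVariable("AZURE_AI_FOUNDRY_SERVICE_ENDPOINT");
// GetEmbeddingClient expects the deployment name; the fallback assumes a deployment named after the model
var embeddingModelName = Environment.GetEnvironmentVariable("AZURE_EMBEDDING_MODEL_NAME") ?? "text-embedding-ada-002";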

var credential = new DefaultAzureCredential();
var azureOpenAIClient = new AzureOpenAIClient(new Uri(azureAIEndpoint), credential);
var embeddingClient = azureOpenAIClient.GetEmbeddingClient(embeddingModelName);

Console.WriteLine($"Azure OpenAI Embedding Client configured");
Console.WriteLine($"Model: {embeddingModelName}");

**Semantic Search Query**

Use vector embeddings to find conversations based on semantic similarity, not just exact keyword matches.

In [None]:
// User query - try different questions!
var userQuery = "What was that seafood place you mentioned?";
// Other queries to try:
// var userQuery = "When is my flight to Brisbane?";
// var userQuery = "Where should I eat near the Opera House?";

Console.WriteLine($"Searching for: \"{userQuery}\"\n");

// Step 1: Generate embedding for the user's query
var queryEmbeddingResponse = await embeddingClient.GenerateEmbeddingAsync(userQuery);
var queryEmbedding = queryEmbeddingResponse.Value.ToFloats().ToArray();

// Step 2: Use Cosmos DB vector search to find similar conversations
var vectorSearchQuery = @"
SELECT TOP 5 
    c.id, 
    c.UserId, 
    c.ThreadId, 
    c.Role, 
    c.Content,
    c.CreatedAt,
    VectorDistance(c.ContentEmbedding, @queryEmbedding) AS SimilarityScore
FROM c
WHERE c.ApplicationId = @applicationId
ORDER BY VectorDistance(c.ContentEmbedding, @queryEmbedding)
";

var queryDef = new QueryDefinition(vectorSearchQuery)
    .WithParameter("@queryEmbedding", queryEmbedding)
    .WithParameter("@applicationId", "ContosoTravelApp");

var searchIterator = container.GetItemQueryIterator<dynamic>(queryDef);

Console.WriteLine("Most relevant conversations:\n");
Console.WriteLine(new string('=', 80));

var resultCount = 0;
while (searchIterator.HasMoreResults)
{
    var response = await searchIterator.ReadNextAsync();
    foreach (var item in response)
    {
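        resultCount++;
        var createdAt = DateTime.Parse(item.CreatedAt.ToString());
        // VectorDistance returns the score for the distance function set in the container's
        // vector policy. With cosine it is already a similarity in [-1, 1] (higher means more
        // similar), so the raw value is printed instead of (1 - score).
        var score = (double)item.SimilarityScore;
        
        Console.WriteLine($"\n#{resultCount} - Score: {score:F4} | {item.Role} | {createdAt:yyyy-MM-dd}");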
        Console.WriteLine($"Thread: {item.ThreadId} | User: {item.UserId}");
        Console.WriteLine($"\n{item.Content}");
        Console.WriteLine(new string('-', 80));
    }
}

if (resultCount == 0)
{
    Console.WriteLine("\nNo similar conversations found.");
}
else
{
    Console.WriteLine($"\nFound {resultCount} relevant conversation(s) based on semantic similarity!");
}