# Module 06: Define and implement an indexing strategy for Azure Cosmos DB SQL API

- [[Learning path]](https://docs.microsoft.com/en-us/learn/paths/define-implement-indexing-strategy-cosmos-db-sql-api/?ns-enrollment-type=Collection&ns-enrollment-id=1k8wcz8zooj2nx)
- [[Lab]](https://microsoftlearning.github.io/dp-420-cosmos-db-dev/instructions/11-default-indexing-policy.html): Review the default index policy for an Azure Cosmos DB SQL API container with the portal
- [[Lab]](https://microsoftlearning.github.io/dp-420-cosmos-db-dev/instructions/12-sdk-indexing-policy-custom.html): Configure an Azure Cosmos DB SQL API container’s index policy with the SDK

## Demo setup

In [None]:
Connect-AzAccount
Set-AzContext -Subscription "b895a719-7034-411a-9944-ff196d1f450f"
$connectionString = (Get-AzCosmosDBAccountKey -ResourceGroupName rg-dp-420 -Name cosmos-dp-420-sql-provisioned -Type "ConnectionStrings")["Primary SQL Connection String"]
$primaryMasterKey = (Get-AzCosmosDBAccountKey -ResourceGroupName rg-dp-420 -Name cosmos-dp-420-sql-provisioned -Type "Keys")["PrimaryMasterKey"]
$documentEndpoint = (Get-AzCosmosDBAccount -ResourceGroupName rg-dp-420 -Name cosmos-dp-420-sql-provisioned).DocumentEndpoint

In [None]:
cosmicworks --endpoint $documentEndpoint --key $primaryMasterKey --datasets product

In [None]:
#r "nuget: Newtonsoft.Json, 13.0.1"
#r "nuget: Microsoft.Azure.Cosmos, 3.22.1"

#!share --from pwsh connectionString
#!share --from pwsh primaryMasterKey
#!share --from pwsh documentEndpoint

In [None]:
using Microsoft.Azure.Cosmos;
using System;
using System.Collections.Generic;

CosmosClient client = new (connectionString);
Database database = client.GetDatabase("cosmicworks");
Container container = database.GetContainer("products");

public class Product
{
	public string id { get; set; }
	public string categoryId { get; set; }
	public string categoryName { get; set; }
	public string sku { get; set; }
	public string name { get; set; }
	public string description { get; set; }
	public double price { get; set; }
}

## Define indexes in Azure Cosmos DB SQL API

### Understand indexes

- Every Azure Cosmos DB SQL API container has a built-in index policy. 
- By default, the index includes all properties of every item in the container.
- By default, all create, update, or delete operations update the index.

Example of the default policy in action:

Item 1 in the product container

```json
{ 
   "name": "Touring-1000 Blue", 
   "tags": [ 
              { "name": "bike" }, 
              { "name": "touring" }, 
              { "name": "blue" } 
           ] 
}
```

Item 2 in the product container

```json
{ 
    "name": "Mountain-400-W Silver", 
    "sku": "BK-M38S-38", 
    "tags": [ 
               { "name": "bike" }, 
               { "name": "silver" } 
            ] 
}
```

How is the index used when we run this query?

```sql
SELECT p.id 
FROM products p 
WHERE p.name = 'Touring-1000 Blue'
```

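The index created for these two items is a tree of property paths and values. To answer the query, the engine follows the `name` branch to the value `Touring-1000 Blue`, which leads only to item 1: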

![image](https://docs.microsoft.com/en-us/learn/wwl-data-ai/define-indexes-azure-cosmos-db-sql-api/media/2-search-tree-02.png)

### Understand indexing policies

The default indexing policy consists of the following settings:

- The inverted index is updated for all create, update, or delete operations on an item
- All properties of every item are automatically indexed
- Range indexes are used for all strings or numbers

Indexing policies are defined and managed in JSON. This is the default:

```json
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    {
      "path": "/*"
    }
  ],
  "excludedPaths": [
    {
      "path": "/\"_etag\"/?"
    }
  ]
}
```

### Indexing modes and Include/Exclude paths

Index policies can be updated to better meet your container’s usage patterns.

Configure indexing mode:

- **Consistent**: Updates index synchronously with all item modifications. Default mode.
- **None**: Disables indexing on a container. Useful for bulk operations.
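
As a rough sketch of the bulk-operation pattern, the cell below switches a container to the `None` mode before an import and restores the previous policy afterwards. It assumes the `container` variable from the demo setup. It also sets `Automatic = false` together with `IndexingMode.None`, on the assumption that the service expects that combination. The names `bulkProperties` and `originalPolicy` are illustrative only.

```csharp
using Microsoft.Azure.Cosmos;

// Assumes `container` from the demo setup cell.
ContainerProperties bulkProperties = (await container.ReadContainerAsync()).Resource;

// Keep the current policy so it can be restored after the import.
IndexingPolicy originalPolicy = bulkProperties.IndexingPolicy;

// Disable indexing for the duration of the bulk load.
// Assumption: Automatic is set to false alongside IndexingMode.None.
bulkProperties.IndexingPolicy = new IndexingPolicy
{
    IndexingMode = IndexingMode.None,
    Automatic = false
};
await container.ReplaceContainerAsync(bulkProperties);

// ... run the bulk import here ...

// Restore the original policy once the import has finished.
bulkProperties.IndexingPolicy = originalPolicy;
await container.ReplaceContainerAsync(bulkProperties);
```

While the mode is `None`, queries that depend on the index should not be expected to work, so this is only worth doing around a dedicated import window.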

Including and excluding paths:

Three primary operators are used when defining a property path:
- The ? operator indicates that a path terminates with a string or number (scalar) value
- The [] operator indicates that this path includes an array and avoids having to specify an array index value
- The * operator is a wildcard and matches any element beyond the current path

Consider this JSON object that represents a product item in our Azure Cosmos DB SQL API container:

```json
{
  "id": "8B363B8B-378E-402A-9E68-A935302000B8",
  "name": "HL Touring Frame - Yellow, 46",
  "category": {
    "id": "F3FBB167-11D8-41E4-84B4-5AAA92B1E737",
    "name": "Components, Touring Frames"
  },
  "metadata": {
    "sku": "FR-T98Y-46"
  },
  "price": 1003.91,
  "tags": [
    {
      "name": "accessory"
    },
    {
      "name": "yellow"
    },
    {
      "name": "frame"
    }
  ]
}
```

Path examples:

| **Path expression** | **Description** |
| ---: | :--- |
| **/\*** | All properties |
| **/name/?** | The scalar value of the **name** property |
| **/category/\*** | All properties under the **category** property |
| **/metadata/sku/?** | The scalar value of the **metadata.sku** property |
| **/tags/[]/name/?** | Within the **tags** array, the scalar values of all possible **name** properties |

### Review indexing policy strategies

An indexing policy:

- Consists of two sets of path expressions (included and excluded) that together determine which properties are actually indexed.
- Must include the root path and all possible values (/*) as either an included or excluded path.


Example indexing policy that includes all properties except category.id:

```json
{ 
    "indexingMode": "consistent", 
    "automatic": true, 
    "includedPaths": [ 
                       { 
                           "path": "/*" 
                       } 
                     ], 
    "excludedPaths": [ 
                       { 
                           "path": "/category/id/?"
                       } 
                     ] 
}
```
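
The same opt-out change can also be applied to a container that already exists. The cell below is a minimal sketch that assumes the `container` variable from the demo setup. It reads the current container properties, adds the excluded path if it is not already present, and writes the properties back. The names `existingProperties` and `excludedPath` are illustrative only.

```csharp
using Microsoft.Azure.Cosmos;
using System.Linq;

// Assumes `container` from the demo setup cell.
ContainerProperties existingProperties = (await container.ReadContainerAsync()).Resource;

string excludedPath = "/category/id/?";

// Add the path only once; the rest of the existing policy is left untouched.
if (!existingProperties.IndexingPolicy.ExcludedPaths.Any(p => p.Path == excludedPath))
{
    existingProperties.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = excludedPath });
    await container.ReplaceContainerAsync(existingProperties);
}
```

Note that this modifies the demo container's policy, so it is best run against a container you are free to change.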

Example indexing policy that excludes all properties and selectively indexes only the name and tags[].name properties:

```json
{ 
    "indexingMode": "consistent", 
    "automatic": true, 
    "includedPaths": [ 
                       { 
                           "path": "/name/?" 
                       }, 
                       { 
                           "path": "/tags/[]/name/?" 
                       } 
                     ], 
    "excludedPaths": [ 
                       { 
                           "path": "/*" 
                       } 
                     ] 
}
```

In [None]:
using Newtonsoft.Json;

// Run a simple query and inspect how the index was used.
var query = new QueryDefinition("SELECT TOP 10 * FROM products");

// PopulateIndexMetrics must be true for resultSet.IndexMetrics to be populated.
var requestOptions = new QueryRequestOptions() { PopulateIndexMetrics = true };
var iterator = container.GetItemQueryIterator<dynamic>(query, requestOptions: requestOptions);
var resultSet = await iterator.ReadNextAsync();

// The diagnostics JSON layout is not a stable contract, so the hard-coded
// path below may need adjusting for other SDK versions.
var diagnosticsJsonString = resultSet.Diagnostics.ToString();
dynamic jsonResponse = JsonConvert.DeserializeObject(diagnosticsJsonString);

Console.WriteLine(jsonResponse.children[1].children[1].children[0].children[0].data["Query Metrics"]);
Console.WriteLine($"RequestCharge: {resultSet.RequestCharge}");
Console.WriteLine($"IndexMetrics: {resultSet.IndexMetrics}");

//Console.WriteLine($"Diagnostics: {resultSet.Diagnostics}");

## Customize indexes in Azure Cosmos DB SQL API

### Customize the indexing policy

The .NET SDK ships with a Microsoft.Azure.Cosmos.IndexingPolicy class that is a representation of a JSON policy object.

Assume we would like to use the following index policy when we create a container:

```json
{ 
    "indexingMode": "consistent", 
    "automatic": true, 
    "includedPaths": [ 
                       { 
                           "path": "/name/?" 
                       }, 
                       { 
                           "path": "/categoryName/?" 
                       } 
                     ], 
    "excludedPaths": [ 
                       { 
                            "path": "/*" 
                       } 
                     ]
}
```

Let’s use the SDK to define the policy and create the container with that index policy:

In [None]:
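// CreateContainerIfNotExistsAsync returns an existing container unchanged,
// so uncomment the next line to delete the database created during setup
// and have the new indexing policy take effect.
//await client.GetDatabase("cosmicworks").DeleteAsync();
Database database = await client.CreateDatabaseIfNotExistsAsync("cosmicworks");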

IndexingPolicy policy = new () 
{ 
    IndexingMode = IndexingMode.Consistent, 
    Automatic = true 
};

policy.IncludedPaths.Add( new IncludedPath{ Path = "/name/?" } ); 
policy.IncludedPaths.Add( new IncludedPath{ Path = "/categoryName/?" } );

policy.ExcludedPaths.Add( new ExcludedPath{ Path = "/*" } );

ContainerProperties options = new () { 
    Id = "products", 
    PartitionKeyPath = "/categoryId", 
    IndexingPolicy = policy }; 

Container container = await database.CreateContainerIfNotExistsAsync(options, throughput: 400);

// check the azure portal for index
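
As an alternative to checking the portal, the policy can be read back through the SDK. The cell below is a small sketch that assumes the `container` variable created in the previous cell. It prints the indexing mode and the included and excluded paths the service reports. The variable name `current` is illustrative only.

```csharp
using Microsoft.Azure.Cosmos;
using System;

// Assumes `container` from the previous cell.
ContainerProperties current = (await container.ReadContainerAsync()).Resource;

Console.WriteLine($"Indexing mode: {current.IndexingPolicy.IndexingMode}");
Console.WriteLine($"Automatic: {current.IndexingPolicy.Automatic}");

// List every included path reported by the service.
foreach (IncludedPath included in current.IndexingPolicy.IncludedPaths)
{
    Console.WriteLine($"Included: {included.Path}");
}

// List every excluded path reported by the service.
foreach (ExcludedPath excluded in current.IndexingPolicy.ExcludedPaths)
{
    Console.WriteLine($"Excluded: {excluded.Path}");
}
```

If the output still shows `/*` as an included path, the container most likely existed before the cell ran and kept its original policy.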

### Evaluate composite indexes

For queries that sort or filter on multiple properties, create one or more composite indexes.

Let’s assume we will run the following queries:

```sql
-- This query has a filter on two properties.
-- Note that the sort direction (ASC/DESC) of the composite
-- index paths does not matter for filters, but the equality
-- property (category) should precede the range property (price).
SELECT * 
FROM products p 
WHERE p.price > 50 AND p.category = "Saddles" 
```

```sql
-- This query sorts the results on two properties.
-- Note that the composite index must list the properties in the
-- same order as the ORDER BY clause, with matching (or fully
-- reversed) sort directions.
SELECT * 
FROM products p 
ORDER BY p.price ASC, p.name ASC
```

We can define a composite index for each query type:

```json
{ 
    "indexingMode": "consistent", 
    "automatic": true, 
    "includedPaths": [ 
                       { 
                           "path": "/*" 
                       } 
                     ], 
    "excludedPaths": [ 
                       { 
                           "path": "/_etag/?" 
                       } 
                     ], 
    "compositeIndexes": 
                     [ 
                       [ 
                         { 
                            "path": "/category", 
                            "order": "ascending" 
                         }, 
                         { 
                            "path": "/price", 
                            "order": "descending" 
                         } 
                       ], 
                       [ 
                         { 
                            "path": "/price", 
                            "order": "ascending" 
                         }, 
                         { 
                            "path": "/name", 
                            "order": "ascending" 
                         } 
                       ]
                    ] 
}
```
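
The same composite indexes can be expressed with the .NET SDK. The cell below is a minimal sketch that only builds the policy object locally. The variable name `compositePolicy` is illustrative, and the paths are taken from the JSON policy above.

```csharp
using Microsoft.Azure.Cosmos;
using System;
using System.Collections.ObjectModel;

IndexingPolicy compositePolicy = new ()
{
    IndexingMode = IndexingMode.Consistent,
    Automatic = true
};

compositePolicy.IncludedPaths.Add(new IncludedPath { Path = "/*" });

// Composite index for the filter query: equality on category, range on price.
compositePolicy.CompositeIndexes.Add(new Collection<CompositePath>
{
    new CompositePath { Path = "/category", Order = CompositePathSortOrder.Ascending },
    new CompositePath { Path = "/price", Order = CompositePathSortOrder.Descending }
});

// Composite index for ORDER BY p.price ASC, p.name ASC.
compositePolicy.CompositeIndexes.Add(new Collection<CompositePath>
{
    new CompositePath { Path = "/price", Order = CompositePathSortOrder.Ascending },
    new CompositePath { Path = "/name", Order = CompositePathSortOrder.Ascending }
});

Console.WriteLine($"Composite indexes defined: {compositePolicy.CompositeIndexes.Count}");
```

To apply it, assign `compositePolicy` to `ContainerProperties.IndexingPolicy` when creating the container, in the same way as the custom policy earlier in this module.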

## Demo teardown

In [None]:
await database.DeleteAsync();