# Cosmos DB in Fabric

## Python SDK Simple Functions Sample Notebook

This sample notebook demonstrates foundational operations for working with Cosmos DB in Fabric using the Python SDK. You'll learn essential concepts for building applications that interact with your Cosmos DB data.

### What You'll Learn
This notebook covers the following key concepts:

- **Authentication** - Secure access to Cosmos DB in Fabric using token credentials
- **Database & Container Management** - Initialize clients and work with containers
- **Querying Data** - Efficient parameterized queries with partition key optimization
- **Point Reads** - High-performance document retrieval by ID
- **Data Operations** - Create containers and insert documents with different schemas

### Prerequisites
- **Runtime**: PySpark/Python runtime environment
- **Fabric Artifact**: Cosmos DB in Fabric artifact created in your workspace
- **Sample Data**: The sample dataset loaded in Cosmos DB Data Explorer


### Getting Started
Make sure your Cosmos DB in Fabric artifact is created and loaded with sample data from the Data Explorer before running this notebook.

In [None]:
#Install packages
%pip install azure-cosmos
%pip install azure-core

Let's gather the necessary imports and set our Cosmos DB in Fabric endpoint from the artifact settings. The Cosmos database name is the same as the Cosmos artifact name in your workspace. We'll use 'SampleData' as the container name for this notebook sample. 

[!NOTE] You don't need to specify a database name because each Fabric endpoint maps to exactly one database. An optional step later in the notebook looks the name up automatically; uncomment that code if you want to use it.

In [None]:
#Imports and config values
import base64, json
from typing import Any, Optional

# This sample uses the synchronous client; azure.cosmos.aio provides an async CosmosClient
from azure.cosmos import CosmosClient, PartitionKey, ThroughputProperties
from azure.core.credentials import TokenCredential, AccessToken


COSMOS_ENDPOINT = '' # The Cosmos DB artifact endpoint from the artifact settings
COSMOS_DATABASE_NAME = '' # The Cosmos DB artifact name is the database name
COSMOS_CONTAINER_NAME = 'SampleData'

The `FabricTokenCredential` class handles authentication to Cosmos DB in Fabric. Each call to `get_token` requests a token from `notebookutils` and reads its expiry from the JWT `exp` claim, so the SDK knows when to ask for a new one.

In [None]:
## Authentication Class

class FabricTokenCredential(TokenCredential):
    """Token credential for Cosmos DB in Fabric; fetches a token from notebookutils on each call."""
    
    def get_token(self, *scopes: str, claims: Optional[str] = None, tenant_id: Optional[str] = None,
                  enable_cae: bool = False, **kwargs: Any) -> AccessToken:
        access_token = notebookutils.credentials.getToken("https://cosmos.azure.com/.default")
        parts = access_token.split(".")
        if len(parts) < 2:
            raise ValueError("Invalid JWT format")
        payload_b64 = parts[1]
        # Fix padding
        padding = (-len(payload_b64)) % 4
        if padding:
            payload_b64 += "=" * padding
        payload_json = base64.urlsafe_b64decode(payload_b64.encode("utf-8")).decode("utf-8")
        payload = json.loads(payload_json)
        exp = payload.get("exp")
        if exp is None:
            raise ValueError("exp claim missing in token")
        return AccessToken(token=access_token, expires_on=exp)
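
To make the padding and decoding steps in `get_token` concrete, here is a minimal, self-contained sketch that builds an unsigned dummy token and extracts its `exp` claim the same way. The names `jwt_expiry` and `_b64url` and the dummy token are illustrative assumptions, not part of the notebook or the SDK, and no signature is verified.

```python
import base64
import json


def jwt_expiry(token: str) -> int:
    """Return the `exp` claim (Unix seconds) from a JWT without verifying it."""
    parts = token.split(".")
    if len(parts) < 2:
        raise ValueError("Invalid JWT format")
    payload_b64 = parts[1]
    # JWTs strip base64 padding; restore it so the decoder accepts the input.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    exp = payload.get("exp")
    if exp is None:
        raise ValueError("exp claim missing in token")
    return int(exp)


def _b64url(data: dict) -> str:
    """Encode a dict as unpadded base64url JSON (hypothetical helper for the demo)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Dummy, unsigned token used only for illustration (header.payload.empty-signature).
dummy_token = ".".join([_b64url({"alg": "none"}), _b64url({"exp": 1760524200}), ""])
expiry = jwt_expiry(dummy_token)
print(expiry)
```

The returned value is what the credential passes as `expires_on` when it builds the `AccessToken`.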

Next, let's initialize all the clients we need to access and modify our Cosmos DB in Fabric artifact.

In [None]:
# Initialize the Cosmos DB client
COSMOS_CLIENT = CosmosClient(COSMOS_ENDPOINT, FabricTokenCredential())

# [!OPTIONAL] Autoload your database artifact name
# In Fabric, the Cosmos DB endpoint maps to one database automatically
# This eliminates the need to specify a database name once you provide the endpoint

# database_list = list(COSMOS_CLIENT.list_databases())
# COSMOS_DATABASE_NAME = database_list[0]['id']

# Initialize Cosmos DB database client
DATABASE_CLIENT = COSMOS_CLIENT.get_database_client(COSMOS_DATABASE_NAME)

# Initialize Cosmos DB container client
CONTAINER_CLIENT = DATABASE_CLIENT.get_container_client(COSMOS_CONTAINER_NAME) # Default is SampleData

Now let's run a query to see some of our sample data.

In [None]:
# Query example - Get all documents in the 'Devices, Tablets' category from our sample data
# Since the partition key is 'categoryName', this query is efficient and stays within one partition
queryText = "SELECT * FROM c WHERE c.categoryName = @categoryName"

results = CONTAINER_CLIENT.query_items(
    query=queryText,
    parameters=[
        dict(
            name="@categoryName",
            value="Devices, Tablets"  # Matches the sample data category
        )
        # To also filter by document type, add "AND c.docType = @docType" to the query
        # and a second entry here: dict(name="@docType", value="product")
    ],
    enable_cross_partition_query=False,  # False since we're filtering by partition key
)

# Display the results
print("QUERY EXAMPLE")
print("Devices, Tablets Items Found:")
print("=" * 60)

# The query returns every document in the 'Devices, Tablets' category, and these may have different structures based on their type.
# This demonstrates Cosmos DB's schema flexibility where different document types can coexist in the same container.
# We'll handle 'product' and 'review' documents differently since they have unique properties.

for item in results:
    doc_type = item.get('docType')
    
    if doc_type == 'product':
        # Handle product documents
        print("📦 PRODUCT")
        print(f"   Product ID: {item.get('productId', 'N/A')}")
        print(f"   Name: {item.get('name', 'N/A')}")
        print(f"   Description: {item.get('description', 'N/A')[:100]}...")
        print(f"   Current Price: ${item.get('currentPrice', 'N/A')}")
        print(f"   Inventory: {item.get('inventory', 'N/A')} units")
        print(f"   Country: {item.get('countryOfOrigin', 'N/A')}")
        print(f"   First Available: {item.get('firstAvailable', 'N/A')}")
        
    elif doc_type == 'review':
        # Handle customer review documents  
        print("⭐ CUSTOMER REVIEW")
        print(f"   Product ID: {item.get('productId', 'N/A')}")
        print(f"   Customer: {item.get('customerName', 'N/A')}")
        print(f"   Rating: {item.get('stars', 'N/A')}/5 stars")
        print(f"   Date: {item.get('reviewDate', 'N/A')}")
        # Display first 100 characters of review text if available
        review_text = item.get('reviewText', '')
        if review_text:
            preview = review_text[:100] + "..." if len(review_text) > 100 else review_text
            print(f"   Review: {preview}")
        
    else:
        # Handle unknown document types
        print(f"❓ UNKNOWN TYPE: {doc_type}")
        print(f"   ID: {item.get('id', 'N/A')}")
        
    print("-" * 60)
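
The query above filters only on the partition key. To narrow it further, for example by `docType`, each extra filter needs its own placeholder in the query text and its own entry in the parameter list. The sketch below shows one way to build both together; `build_category_query` is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_category_query(
    category_name: str, doc_type: Optional[str] = None
) -> Tuple[str, List[Dict[str, Any]]]:
    """Build a parameterized query on categoryName, optionally narrowed by docType."""
    query = "SELECT * FROM c WHERE c.categoryName = @categoryName"
    parameters: List[Dict[str, Any]] = [
        {"name": "@categoryName", "value": category_name}
    ]
    if doc_type is not None:
        # Each extra filter gets its own placeholder and parameter entry.
        query += " AND c.docType = @docType"
        parameters.append({"name": "@docType", "value": doc_type})
    return query, parameters


query_text, query_params = build_category_query("Devices, Tablets", doc_type="product")
print(query_text)
print(query_params)
```

In the notebook, the pair can be passed straight to `CONTAINER_CLIENT.query_items(query=query_text, parameters=query_params)`.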

Now let's run a point read operation.

In [None]:
# Point read example - Get a specific item by ID and partition key (category)
# This is the most efficient way to retrieve a single document
item_id = "cb919d62-80e4-4234-9403-b1f272e0c020"
partition_key = "Devices, Tablets"  # Using categoryName as partition key

# Point read using read_item method
item = CONTAINER_CLIENT.read_item(item=item_id, partition_key=partition_key)

# Display the results
print("POINT READ EXAMPLE")
print(f"   Product ID: {item.get('productId', 'N/A')}")
print(f"   Name: {item.get('name', 'N/A')}")
print(f"   Description: {item.get('description', 'N/A')[:100]}...")
print(f"   Current Price: ${item.get('currentPrice', 'N/A')}")
print(f"   Inventory: {item.get('inventory', 'N/A')} units")
print(f"   Country: {item.get('countryOfOrigin', 'N/A')}")
print(f"   First Available: {item.get('firstAvailable', 'N/A')}")
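
The product fields are printed the same way in the query loop and in the point read. A small formatter could remove that duplication. It also guards against a `description` that is present but `null`, where slicing `None` would raise a `TypeError`. This is only a sketch: `format_product` and the sample document are hypothetical and not part of the notebook.

```python
from typing import Any, Dict, List


def format_product(item: Dict[str, Any], preview_len: int = 100) -> List[str]:
    """Return display lines for a product document, tolerating missing fields."""
    # `or` covers both a missing key and an explicit null value.
    description = item.get("description") or "N/A"
    if len(description) > preview_len:
        description = description[:preview_len] + "..."
    return [
        f"   Product ID: {item.get('productId', 'N/A')}",
        f"   Name: {item.get('name', 'N/A')}",
        f"   Description: {description}",
        f"   Current Price: ${item.get('currentPrice', 'N/A')}",
        f"   Inventory: {item.get('inventory', 'N/A')} units",
        f"   Country: {item.get('countryOfOrigin', 'N/A')}",
        f"   First Available: {item.get('firstAvailable', 'N/A')}",
    ]


# Made-up document for illustration; real items come from read_item or query_items.
sample = {"productId": "p-1", "name": "Example Tablet", "description": None}
lines = format_product(sample)
print("\n".join(lines))
```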

Let's explore adding new data to a separate container related to the initial product catalog data in SampleData.

First, let's create a new container.

In [None]:
# Create a new container with customerId as partition key
COSMOS_CONTAINER_NAME2 = "SampleOrders"

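from azure.cosmos import exceptions

try:
    CONTAINER_CLIENT2 = DATABASE_CLIENT.create_container(
        id=COSMOS_CONTAINER_NAME2,
        partition_key=PartitionKey(path="/customerId"),
        # Required: set autoscale throughput or an error will be thrown
        offer_throughput=ThroughputProperties(auto_scale_max_throughput=5000)
    )
    print(f"Container {COSMOS_CONTAINER_NAME2} created.")
except exceptions.CosmosResourceExistsError:
    # Re-running the cell: reuse the existing container so CONTAINER_CLIENT2 is always defined
    CONTAINER_CLIENT2 = DATABASE_CLIENT.get_container_client(COSMOS_CONTAINER_NAME2)
    print(f"Container {COSMOS_CONTAINER_NAME2} already exists.")
except Exception as e:
    print(f"Error creating container: {e}")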

# List all containers in the database
print(f"Containers in database '{COSMOS_DATABASE_NAME}':")
for container in DATABASE_CLIENT.list_containers():
    print(f"- {container['id']}")

Now let's create a new item for a customer order within the new container.

In [None]:
# Define a customer order document (item) to insert
import uuid

customer_order = {
    "id": str(uuid.uuid4()),
    "customerId": "cust-456789", 
    "orderDate": "2025-10-15T10:30:00",
    "docType": "customerOrder",
    "status": "shipped",
    "totalAmount": 1053.54,
    "shippingAddress": {
        "street": "123 Main St",
        "city": "Seattle", 
        "state": "WA",
        "zipCode": "98101"
    },
    "items": [
        {
            "productId": "a74e7af9-7e13-40cd-90df-7b5e172a8acc",
            "productName": "HyperType Pro K100 RGB Mechanical Keyboard",
            "quantity": 2,
            "purchasePrice": 133.12,
            "categoryName": "Peripherals, Keyboards"
        },
        {
            "productId": "f9619b13-30a7-4ad4-ae00-9817576fba81",
            "productName": "InfinityCore Apex 12 Pro 5G",
            "quantity": 1, 
            "purchasePrice": 787.30,
            "categoryName": "Devices, Smartphones"
        }
    ],
    "paymentMethod": "credit_card",
    "trackingNumber": "1Z999AA1234567890"
}

# Write the item to the container
try:
    CONTAINER_CLIENT2.create_item(customer_order)
    print("Customer order created successfully!")
    print(f"Order ID: {customer_order['id']}")
    print(f"Total: ${customer_order['totalAmount']}")
    print(f"Status: {customer_order['status']}")
except Exception as e:
    print(f"Error creating order: {e}")
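
Before writing an order, it can be worth checking that `totalAmount` agrees with the line items. The sketch below recomputes the total with `Decimal` to avoid floating-point drift. It assumes the total is simply the sum of `quantity * purchasePrice` with no tax or shipping, which the sample order happens to satisfy. `order_total` is a hypothetical helper.

```python
from decimal import Decimal
from typing import Any, Dict, List


def order_total(items: List[Dict[str, Any]]) -> Decimal:
    """Sum quantity * purchasePrice over line items using exact decimal arithmetic."""
    total = Decimal("0")
    for line in items:
        # Convert via str so 133.12 becomes Decimal('133.12'), not its binary float expansion.
        total += Decimal(str(line["purchasePrice"])) * line["quantity"]
    return total


# Line items copied from the sample order above (only the fields used here).
line_items = [
    {"quantity": 2, "purchasePrice": 133.12},
    {"quantity": 1, "purchasePrice": 787.30},
]
computed = order_total(line_items)
print(computed)
```

Comparing `computed` with `Decimal(str(customer_order["totalAmount"]))` before calling `create_item` would catch a mismatched total early.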

Let's read the new order item.

In [None]:
# Point read example - Get the customer order by ID and partition key (customerId)
order_id = customer_order['id']
partition_key = customer_order['customerId']

# Point read using read_item method
order = CONTAINER_CLIENT2.read_item(item=order_id, partition_key=partition_key)

# Display the results
print(f"Order ID: {order['id']}")
print(f"Customer: {order['customerId']}")
print(f"Status: {order['status']}")
print(f"Total: ${order['totalAmount']}")
print(f"Items: {len(order['items'])}")

Lastly, let's change the autoscale throughput range for our container. A new container created through the artifact UI defaults to an autoscale maximum of 5,000 RU/s. Through the SDK, you can raise this to as much as 50,000 RU/s. For anything higher, please open a support ticket [here](https://app.fabric.microsoft.com/admin-portal/supportCenter) and our team will help you increase your limits further.

In [None]:
# Get current container throughput settings
throughput_properties = CONTAINER_CLIENT2.get_throughput()
current_autoscale_max = throughput_properties.auto_scale_max_throughput

print(f"Current autoscale max throughput: {current_autoscale_max} RU/s")

# Update container throughput - can be set between 1,000 and 50,000 RU/s for autoscale
new_max_throughput = 6000

CONTAINER_CLIENT2.replace_throughput(
    throughput=ThroughputProperties(
        auto_scale_max_throughput=new_max_throughput,
        #auto_scale_increment_percent=10  # Optional
    )
)

print("Container throughput updated successfully!")
print(f"Updated autoscale max throughput: {new_max_throughput} RU/s")
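
The limits quoted in this notebook (an autoscale maximum between 1,000 and 50,000 RU/s through the SDK) can be checked locally before calling `replace_throughput`. The sketch below does only that range check; `validate_autoscale_max` and the two constants are hypothetical names, and the bounds are taken from the text here rather than from the service.

```python
AUTOSCALE_MIN_RU = 1_000   # Lower bound quoted in this notebook
AUTOSCALE_MAX_RU = 50_000  # Upper bound quoted in this notebook for SDK changes


def validate_autoscale_max(requested: int) -> int:
    """Return `requested` if it is within the notebook's stated range, else raise."""
    if not AUTOSCALE_MIN_RU <= requested <= AUTOSCALE_MAX_RU:
        raise ValueError(
            f"Autoscale max throughput must be between {AUTOSCALE_MIN_RU} and "
            f"{AUTOSCALE_MAX_RU} RU/s, got {requested}"
        )
    return requested


checked = validate_autoscale_max(6000)
print(checked)
```

Failing fast on an out-of-range value gives a clearer message than waiting for the service to reject the request.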

## Conclusion

This notebook sample demonstrates foundational operations for working with Cosmos DB in Fabric using the Python SDK. You've learned how to:

- **Authenticate** to Cosmos DB in Fabric using the `FabricTokenCredential` class
- **Query data** efficiently using parameterized queries and partition keys
- **Perform point reads** for optimal performance and cost
- **Create containers** with appropriate partition key strategies
- **Insert new documents** and manage different document types
- **Manage container throughput** by updating autoscale settings for performance optimization

### Next Steps

- Explore more advanced querying capabilities with aggregations and joins
- Learn about indexing strategies for optimal performance
- Implement change feed processing to react to data changes in real time

### Resources

For comprehensive documentation and tutorials, visit:
- **Cosmos DB in Fabric Overview**: [https://learn.microsoft.com/fabric/database/cosmos-db/overview](https://learn.microsoft.com/fabric/database/cosmos-db/overview)
- **Python SDK Documentation**: [https://docs.microsoft.com/azure/cosmos-db/sql/sql-api-python-guide](https://docs.microsoft.com/azure/cosmos-db/sql/sql-api-python-guide)