# Using Azure Table Storage with dynamic data

## Preparation

Reference the necessary NuGet packages:

In [118]:
#r "nuget: Azure.Data.Tables"
#r "nuget: MetadataExtractor"

Define some constants we will need later:

In [None]:
const string AzureConnectionString = "DefaultEndpointsProtocol=https;AccountName=ztechtest;AccountKey=XXX;EndpointSuffix=core.windows.net";
const string RootFolderName = @"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08";
static string AzureTableName = "files" + Guid.NewGuid().ToString()[..8];

Import namespaces:

In [120]:
using Azure;
using Azure.Data.Tables;
using System.IO;
using System.Text.RegularExpressions;

## Helper function

This function computes the MD5 hash of a given string. We use hashes for partition and row keys instead of the raw values, because keys are limited in size (1 KiB) and cannot contain certain characters, such as `/`, `\`, `#` and `?`. Our raw values are file system paths and file names, so hashing them is the safe option. MD5 is good enough here: collision attacks are not relevant in this scenario, and MD5 is fast, cheap and short.

In [121]:
static string GetMD5Hash(string s) {
    var inputBytes = System.Text.Encoding.UTF8.GetBytes(s);
    var hashBytes = System.Security.Cryptography.MD5.HashData(inputBytes);
    return Convert.ToHexStringLower(hashBytes);
}
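
To illustrate why raw paths are unsuitable as keys, here is a minimal sketch of a key check. `IsSafeTableKey` is a hypothetical helper, not part of the Azure SDK, and the sample path is made up. It assumes the 1 KiB limit is measured on the UTF-16 encoded string and that the forbidden characters are `/`, `\`, `#`, `?` and control characters, as described in the Table service data model documentation.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper: returns true if the raw string could be used
// directly as a PartitionKey or RowKey.
static bool IsSafeTableKey(string key) {
    // Assumption: the 1 KiB limit applies to the UTF-16 encoded value
    if (Encoding.Unicode.GetByteCount(key) > 1024) return false;

    // Reject / \ # ? and control characters (U+0000-U+001F, U+007F-U+009F)
    return !key.Any(c => c is '/' or '\\' or '#' or '?' || char.IsControl(c));
}

Console.WriteLine(IsSafeTableKey(@"D:\Videos\08"));                     // False (backslashes)
Console.WriteLine(IsSafeTableKey("098a51024a36981db9d5097ac897b2ec"));  // True (hex digits only)
```

A hex-encoded MD5 hash is always 32 characters from `0-9a-f`, so it passes both checks regardless of the input.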

## Insert Data

Create the table client and ensure the table exists:

In [122]:
Console.Write($"Initializing Azure Table Storage client for table {AzureTableName}...");

// Create TableServiceClient
var serviceClient = new TableServiceClient(AzureConnectionString);

// Create TableClient for the specified table
var tableClient = serviceClient.GetTableClient(AzureTableName);

// Ensure the table exists
await tableClient.CreateIfNotExistsAsync();

Console.WriteLine("OK");

Initializing Azure Table Storage client for table filesd6af978d...OK


Get metadata for all files and insert them into table storage. The structure is dynamic, based on metadata available in the file. See [documentation](https://learn.microsoft.com/en-us/rest/api/storageservices/understanding-the-table-service-data-model) for limitations on property names, types and values.

We insert the data in batches, which makes the process both faster and cheaper: we are billed per transaction, and batching also lowers the number of HTTP requests we have to make.

Batches have two limitations:
* A batch can contain up to 100 operations.
* All operations in a batch must target the same partition.
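
The two limits can be sketched independently of the storage client. The following is a minimal illustration, not the approach used in the next cell (which builds batches while streaming through the files): `PlanBatches` is a hypothetical helper that groups items by partition key and splits each group into chunks of at most 100. It relies on `Enumerable.Chunk`, which requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: turns (partition key, item) pairs into batches
// that respect both limits.
static List<(string PartitionKey, List<T> Items)> PlanBatches<T>(
    IEnumerable<(string PartitionKey, T Item)> source, int maxBatchSize = 100) {
    return source
        .GroupBy(x => x.PartitionKey)                  // one partition per batch
        .SelectMany(g => g.Select(x => x.Item)
            .Chunk(maxBatchSize)                       // at most maxBatchSize operations per batch
            .Select(chunk => (g.Key, chunk.ToList())))
        .ToList();
}

// 230 items in partition "a" and 20 items in partition "b"
var sampleItems = Enumerable.Range(0, 250)
    .Select(i => (PartitionKey: i < 230 ? "a" : "b", Item: i));
var plan = PlanBatches(sampleItems);

foreach (var (pk, batchItems) in plan) {
    Console.WriteLine($"{pk}: {batchItems.Count}");
}
// Prints: a: 100, a: 100, a: 30, b: 20 (one per line)
```

Grouping up front needs all items in memory, which is why a streaming loop can be preferable for large folder trees.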

In [123]:
// Prepare to batch insert entities
var batch = new List<TableTransactionAction>();
var partitionKey = string.Empty;
var lastDirectory = string.Empty;
var itemCount = 0;
var batchCount = 0;

// Process all files in the directory 
var rootInfo = new DirectoryInfo(RootFolderName);
var files = rootInfo.GetFiles("*.*", SearchOption.AllDirectories).OrderBy(f => f.FullName);

foreach (var file in files) {
    // Commit batch if it reaches 100 items or if the folder changes
    if (batch.Count == 100 || !lastDirectory.Equals(file.DirectoryName, StringComparison.Ordinal)) {
        if (batch.Count > 0) {
            // Submit the batch of transactions
            Console.WriteLine();
            Console.Write($"Submitting batch of {batch.Count} items...");
            await tableClient.SubmitTransactionAsync(batch);
            batch.Clear();
            Console.WriteLine("OK");
            batchCount++;
        }

        // Update the partition key based on the current folder if it has changed
        if (!lastDirectory.Equals(file.DirectoryName, StringComparison.Ordinal)) {
            lastDirectory = file.DirectoryName ?? string.Empty;
            partitionKey = GetMD5Hash(lastDirectory);
            Console.Write($"Processing folder {lastDirectory} (PK: {partitionKey}):");
        }
    }

    // Process each file and extract metadata
    Console.Write(".");
    var metadata = new Dictionary<string, object> {
        {"PartitionKey", partitionKey},
        {"RowKey", GetMD5Hash(file.Name)},
        {"OriginalFileName", file.FullName},
        {"Size", file.Length}
    };

    // Extract image metadata, if possible
    try {
        static string toPropertyName(string dirName, string tagName) 
            => string.Join("_", Regex.Replace(dirName, @"[^A-Za-z0-9]", ""), Regex.Replace(tagName, @"[^A-Za-z0-9]", ""));

        var tagDirs = MetadataExtractor.ImageMetadataReader.ReadMetadata(file.FullName);
        var tags = tagDirs.SelectMany(dir => dir.Tags.Where(tag => !string.IsNullOrEmpty(tag.Description)).Select(tag => new KeyValuePair<string, object>(toPropertyName(dir.Name, tag.Name), tag.Description)));
        metadata = metadata.Concat(tags).ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
    }
    catch {
        // Ignore errors in metadata extraction
    }

    // Add action to insert or update the metadata in Azure Table Storage
    var entity = new TableEntity(metadata);
    batch.Add(new TableTransactionAction(TableTransactionActionType.UpsertMerge, entity));
    itemCount++;
}

// Submit the last batch, unless it is empty (e.g. when no files were found)
if (batch.Count > 0) {
    Console.WriteLine();
    Console.Write($"Submitting batch of {batch.Count} items...");
    await tableClient.SubmitTransactionAsync(batch);
    batch.Clear();
    Console.WriteLine("OK");
    batchCount++;
}

Console.WriteLine($"Processed {itemCount} items in {batchCount} batches.");

Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\01-LedTubeWallMount (PK: 098a51024a36981db9d5097ac897b2ec):............
Submitting batch of 12 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\02-LackAttack (PK: a64c812d4f7d573e3df8725ad27bbd94):........
Submitting batch of 8 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\03-AzureBlobs (PK: 8aebea0da173086b2c2795bdeb19af06):........
Submitting batch of 8 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\04-ExifMetadata (PK: 6d544e1095702b1655d1a87ea0b2670c):..............
Submitting batch of 14 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\05-ExifDotNet (PK: 45cfde900c3d8125be4cba8a013f584e):........
Submitting batch of 8 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues (PK: f102dad1a9a

## Retrieve data

Define a function that retrieves a record from the table.

If we knew the record type in advance, we could pass it as the type argument of the `GetEntityIfExistsAsync<T>` method and get a strongly typed object back. We don't know it, because the set of properties differs from file to file, so we have to use the general `TableEntity` type, which exposes the stored properties as a dictionary.

In [124]:
async Task DisplayMetadata(string fileName) {
    var partitionKey = GetMD5Hash(Path.GetDirectoryName(fileName) ?? string.Empty);
    var rowKey = GetMD5Hash(Path.GetFileName(fileName) ?? string.Empty);

    Console.WriteLine($"Retrieving entity with PartitionKey: {partitionKey}, RowKey: {rowKey}");

    var entityResponse = await tableClient.GetEntityIfExistsAsync<TableEntity>(partitionKey, rowKey);
    if (entityResponse.HasValue) {
        var entity = entityResponse.Value;
        entity.Display();
    } else {
        Console.WriteLine("Entity not found.");
    }
}
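
Because `TableEntity` implements `IDictionary<string, object>`, the dynamic properties can be processed with ordinary dictionary code. The following minimal sketch uses a plain `Dictionary` as a stand-in so that it runs without a storage account. `GroupByDirectory` is a hypothetical helper that relies on the `Directory_Tag` property naming used when inserting the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: splits "Directory_Tag" property names back into groups.
// Properties without an underscore (PartitionKey, Size, ...) end up in group "".
static Dictionary<string, Dictionary<string, object>> GroupByDirectory(
    IDictionary<string, object> entity) {
    return entity
        .GroupBy(kvp => kvp.Key.Contains('_') ? kvp.Key[..kvp.Key.IndexOf('_')] : "")
        .ToDictionary(
            g => g.Key,
            // IndexOf returns -1 when there is no underscore, so the whole name is kept
            g => g.ToDictionary(kvp => kvp.Key[(kvp.Key.IndexOf('_') + 1)..], kvp => kvp.Value));
}

// Stand-in for a retrieved TableEntity
var sampleEntity = new Dictionary<string, object> {
    ["Size"] = 9008211L,
    ["ExifIFD0_Make"] = "motorola",
    ["ExifIFD0_Model"] = "motorola edge 20 lite",
};

var groups = GroupByDirectory(sampleEntity);
Console.WriteLine(groups["ExifIFD0"]["Make"]);  // motorola
Console.WriteLine(groups[""]["Size"]);          // 9008211
```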

Try to retrieve a non-existent record:

In [125]:
await DisplayMetadata(@"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\none.txt");

Retrieving entity with PartitionKey: f102dad1a9aa6d66b0bb60bad230f962, RowKey: e0c3486987e432b4bdd1d966459b5fb1
Entity not found.


Retrieve a non-image file, which has the common metadata only:

In [126]:
await DisplayMetadata(@"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\intro.mp4");

Retrieving entity with PartitionKey: f102dad1a9aa6d66b0bb60bad230f962, RowKey: 06dc0bfdd9942dcf35f4efe7dbdcf110


key,type,value
odata.etag,System.String,"W/""datetime'2025-08-22T15%3A18%3A13.8488137Z'"""
PartitionKey,System.String,f102dad1a9aa6d66b0bb60bad230f962
RowKey,System.String,06dc0bfdd9942dcf35f4efe7dbdcf110
Timestamp,System.DateTimeOffset,2025-08-22 15:18:13Z
OriginalFileName,System.String,D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\intro.mp4
Size,System.Int64,9008211


Retrieve an image file:

In [127]:
await DisplayMetadata(@"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\IMG_20250821_214530323.jpg");

Retrieving entity with PartitionKey: f102dad1a9aa6d66b0bb60bad230f962, RowKey: 209f183beee89ddd16447e0c54187d8c


key,type,value
odata.etag,System.String,"W/""datetime'2025-08-22T15%3A18%3A13.8478182Z'"""
PartitionKey,System.String,f102dad1a9aa6d66b0bb60bad230f962
RowKey,System.String,209f183beee89ddd16447e0c54187d8c
Timestamp,System.DateTimeOffset,2025-08-22 15:18:13Z
ExifIFD0_DateTime,System.String,2025:08:21 21:45:31
ExifIFD0_ImageHeight,System.String,1800 pixels
ExifIFD0_ImageWidth,System.String,4000 pixels
ExifIFD0_Make,System.String,motorola
ExifIFD0_Model,System.String,motorola edge 20 lite
ExifIFD0_Orientation,System.String,"Top, left side (Horizontal / normal)"


## Cleanup

In [128]:
Console.Write("Deleting Azure Table Storage table...");
await tableClient.DeleteAsync();
Console.WriteLine("OK");

Deleting Azure Table Storage table...OK
