# Using Azure Table Storage

## Preparation

Reference the necessary NuGet packages:

In [1]:
#r "nuget: Azure.Data.Tables"
#r "nuget: MetadataExtractor"

Define some constants we will need later:

In [None]:
const string AzureConnectionString = "DefaultEndpointsProtocol=https;AccountName=ztechtest;AccountKey=XXX;EndpointSuffix=core.windows.net";
const string AzureTableName = "files";
const string RootFolderName = @"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025";
var ImageFileExtensions = new[] { ".jpg", ".jpeg" };
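
The connection string above is hard-coded, which is convenient in a notebook but easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the value is supplied through an environment variable. The helper `ResolveConnectionString` and the variable name `AZURE_STORAGE_CONNECTION_STRING` are assumptions and not part of the original notebook. The fallback `UseDevelopmentStorage=true` is the shortcut connection string for the local storage emulator.

```csharp
using System;

// Hypothetical helper: prefer an environment variable over a hard-coded secret
static string ResolveConnectionString(string variableName = "AZURE_STORAGE_CONNECTION_STRING") {
    var value = Environment.GetEnvironmentVariable(variableName);
    // Fall back to the local storage emulator when the variable is missing or empty
    return string.IsNullOrWhiteSpace(value) ? "UseDevelopmentStorage=true" : value;
}

var connectionString = ResolveConnectionString();
Console.WriteLine(connectionString.StartsWith("UseDevelopmentStorage", StringComparison.Ordinal)
    ? "Using local emulator"
    : "Using connection string from environment");
```

The resolved value could then be passed to `TableServiceClient` in place of the constant.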

Import namespaces:

In [3]:
using Azure;
using Azure.Data.Tables;
using System.IO;

## Data structures

Entities have to implement the `ITableEntity` interface, which declares the required properties. Let's define a `FileMetadata` class containing the properties common to all file types:

In [4]:
public class FileMetadata : ITableEntity {

    // Properties required for Azure Table Storage

    public required string PartitionKey { get; set; }

    public required string RowKey { get; set; }

    public DateTimeOffset? Timestamp { get; set; }
    
    public ETag ETag { get; set; }

    // Record type discriminator

    public virtual string RecordType => nameof(FileMetadata);

    // Common properties for all file types

    public required string OriginalFileName { get; set; } 

    public required long Size { get; set; }
    
}

Now define the `ImageFileMetadata` class, which is derived from `FileMetadata` and adds some image-specific properties:

In [5]:
#nullable enable

public class ImageFileMetadata : FileMetadata {

    // Record type discriminator 

    public override string RecordType => nameof(ImageFileMetadata);

    // Properties specific to image files

    public int Width { get; set; }

    public int Height { get; set; }

    public string? CameraModel { get; set; }

    public string? ExposureTime { get; set; }

    public string? FNumber { get; set; } // Aperture value

    public string? Iso { get; set; }

}

## Helper functions

This function computes the MD5 hash of a given string. We use hashes for partition and row keys instead of the raw values, because keys are limited in length (1024 characters) and cannot contain some special characters, such as the backslashes found in Windows paths. Hashing sidesteps both restrictions. MD5 is safe here, because collision attacks are not relevant for this scenario, and it is fast, cheap and short.

In [6]:
static string GetMD5Hash(string s) {
    var inputBytes = System.Text.Encoding.UTF8.GetBytes(s);
    var hashBytes = System.Security.Cryptography.MD5.HashData(inputBytes);
    return Convert.ToHexStringLower(hashBytes);
}
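
To illustrate why the raw values are unsuitable as keys, here is a minimal sketch that checks a string against the key restrictions and compares a raw path with its hash. The helpers `IsSafeTableKey` and `HashKey` and the sample path are made up for illustration; the character list follows the documented restrictions (`/`, `\`, `#`, `?` and control characters) and the length check mirrors the 1024-character limit mentioned above.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical check: no '/', '\', '#', '?' or control characters, and a limited length
static bool IsSafeTableKey(string key) =>
    key.Length <= 1024 && !key.Any(c => c is '/' or '\\' or '#' or '?' || char.IsControl(c));

// Same idea as GetMD5Hash; Convert.ToHexString + ToLowerInvariant also works before .NET 9
static string HashKey(string s) =>
    Convert.ToHexString(MD5.HashData(Encoding.UTF8.GetBytes(s))).ToLowerInvariant();

var rawPath = @"D:\Data\Photos";   // made-up path, for illustration only
var hashed = HashKey(rawPath);

Console.WriteLine(IsSafeTableKey(rawPath));   // False: the path contains backslashes
Console.WriteLine(IsSafeTableKey(hashed));    // True: only hexadecimal digits
Console.WriteLine(hashed.Length);             // 32
```

The hash always has the same short length and uses only the characters `0-9` and `a-f`, regardless of the input.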

This function loads metadata from a generic file:

In [7]:
static FileMetadata ProcessOtherFile(FileInfo file, string partitionKey) => new() {
    PartitionKey = partitionKey,
    RowKey = GetMD5Hash(file.Name),
    OriginalFileName = file.FullName,
    Size = file.Length
};

This function loads metadata from an image file:

In [8]:
static FileMetadata ProcessImageFile(FileInfo file, string partitionKey) {
    // Fill in common metadata first
    var metadata = new ImageFileMetadata {
        PartitionKey = partitionKey,
        RowKey = GetMD5Hash(file.Name),
        OriginalFileName = file.FullName,
        Size = file.Length
    };

    // Read metadata using the MetadataExtractor library
    var tagDirs = MetadataExtractor.ImageMetadataReader.ReadMetadata(file.FullName);
    var tags = tagDirs.SelectMany(dir => dir.Tags.Select(tag => new {
        Directory = dir.Name,
        Tag = tag.Name,
        Value = tag.Description
    }));

    // Extract image specific metadata properties
    var widthString = tags.FirstOrDefault(t => t.Directory == "JPEG" && t.Tag == "Image Width")?.Value ?? "0 pixels";
    var heightString = tags.FirstOrDefault(t => t.Directory == "JPEG" && t.Tag == "Image Height")?.Value ?? "0 pixels";
    metadata.Width = int.TryParse(widthString.Split(' ')[0], out var width) ? width : 0;
    metadata.Height = int.TryParse(heightString.Split(' ')[0], out var height) ? height : 0;
    metadata.CameraModel = tags.FirstOrDefault(t => t.Directory == "Exif IFD0" && t.Tag == "Model")?.Value;
    metadata.ExposureTime = tags.FirstOrDefault(t => t.Directory == "Exif SubIFD" && t.Tag == "Exposure Time")?.Value;
    metadata.FNumber = tags.FirstOrDefault(t => t.Directory == "Exif SubIFD" && t.Tag == "F-Number")?.Value;
    metadata.Iso = tags.FirstOrDefault(t => t.Directory == "Exif SubIFD" && t.Tag == "ISO Speed Ratings")?.Value;

    return metadata;
}
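
The width and height arrive as descriptions such as `1800 pixels`, so the function above takes the first space-separated token and parses it. A minimal sketch of the same parsing step as a standalone helper follows; `ParsePixelCount` is a hypothetical name and is not used elsewhere in this notebook.

```csharp
using System;

// Hypothetical helper: "1800 pixels" -> 1800, anything that cannot be parsed -> 0
static int ParsePixelCount(string description) {
    if (string.IsNullOrWhiteSpace(description)) return 0;
    var firstToken = description.Split(' ', StringSplitOptions.RemoveEmptyEntries)[0];
    return int.TryParse(firstToken, out var value) ? value : 0;
}

Console.WriteLine(ParsePixelCount("1800 pixels"));   // 1800
Console.WriteLine(ParsePixelCount("unknown"));       // 0
Console.WriteLine(ParsePixelCount(""));              // 0
```

Returning 0 for missing or malformed values matches the fallback used in `ProcessImageFile`.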

## Insert Data

Create the table client and ensure the table exists:

In [9]:
Console.Write("Initializing Azure Table Storage client...");

// Create TableServiceClient
var serviceClient = new TableServiceClient(AzureConnectionString);

// Create TableClient for the specified table
var tableClient = serviceClient.GetTableClient(AzureTableName);

// Ensure the table exists
await tableClient.CreateIfNotExistsAsync();

Console.WriteLine("OK");

Initializing Azure Table Storage client...OK


Get metadata for all files and insert it into table storage. We insert the data in batches, which is both faster and cheaper: we are billed per transaction, and a batch also reduces the number of HTTP requests we have to make.

Batches have two limitations:
* A batch can contain up to 100 operations.
* All operations in a batch must target the same partition.
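
The two limits can be illustrated without any Azure calls. The following is a minimal sketch that groups items by partition key and splits each group into chunks of at most 100 items. The helper `SplitIntoBatches` and the sample data are assumptions for illustration. Unlike the streaming loop used in this notebook, it buffers all items in memory, and `Enumerable.Chunk` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: one batch never mixes partitions and never exceeds maxBatchSize items
static IEnumerable<(string PartitionKey, T[] Items)> SplitIntoBatches<T>(
    IEnumerable<T> items, Func<T, string> getPartitionKey, int maxBatchSize = 100) =>
    items
        .GroupBy(getPartitionKey)
        .SelectMany(g => g.Chunk(maxBatchSize).Select(chunk => (g.Key, chunk)));

// Made-up sample: 250 items in partition "a" and 30 items in partition "b"
var items = Enumerable.Range(0, 250).Select(i => (Partition: "a", Id: i))
    .Concat(Enumerable.Range(0, 30).Select(i => (Partition: "b", Id: i)));

foreach (var (partitionKey, chunk) in SplitIntoBatches(items, x => x.Partition)) {
    Console.WriteLine($"{partitionKey}: {chunk.Length}");
}
// Prints "a: 100", "a: 100", "a: 50", "b: 30"
```

Each resulting chunk could be turned into a list of `TableTransactionAction` objects and submitted as one transaction.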

In [10]:
// Prepare to batch insert entities
var batch = new List<TableTransactionAction>();
var partitionKey = string.Empty;
var lastDirectory = string.Empty;
var itemCount = 0;
var batchCount = 0;

// Process all files in the directory 
var rootInfo = new DirectoryInfo(RootFolderName);
var files = rootInfo.GetFiles("*.*", SearchOption.AllDirectories).OrderBy(f => f.FullName);

foreach (var file in files) {
    // Commit batch if it reaches 100 items or if the folder changes
    if (batch.Count == 100 || !lastDirectory.Equals(file.DirectoryName, StringComparison.Ordinal)) {
        if (batch.Count > 0) {
            // Submit the batch of transactions
            Console.WriteLine();
            Console.Write($"Submitting batch of {batch.Count} items...");
            await tableClient.SubmitTransactionAsync(batch);
            batch.Clear();
            Console.WriteLine("OK");
            batchCount++;
        }

        // Update the partition key based on the current folder if it has changed
        if (!lastDirectory.Equals(file.DirectoryName, StringComparison.Ordinal)) {
            lastDirectory = file.DirectoryName ?? string.Empty;
            partitionKey = GetMD5Hash(lastDirectory);
            Console.Write($"Processing folder {lastDirectory} (PK: {partitionKey}):");
        }
    }

    // Process each file and extract metadata
    Console.Write(".");
    var metadata = ImageFileExtensions.Contains(file.Extension.ToLowerInvariant())
        ? ProcessImageFile(file, partitionKey)
        : ProcessOtherFile(file, partitionKey);
    itemCount++;

    // Add action to insert or update the metadata in Azure Table Storage
    batch.Add(new TableTransactionAction(TableTransactionActionType.UpsertMerge, metadata));
}

// Submit the last batch (skipped when no files were found, because there is nothing to submit)
if (batch.Count > 0) {
    Console.WriteLine();
    Console.Write($"Submitting batch of {batch.Count} items...");
    await tableClient.SubmitTransactionAsync(batch);
    batch.Clear();
    Console.WriteLine("OK");
    batchCount++;
}

Console.WriteLine($"Processed {itemCount} items in {batchCount} batches.");

Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\out\01 (PK: 8a7c8a6bb0d5a2d00e4161570b867b3a):................
Submitting batch of 16 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\out\02 (PK: c8abf999a2c820f78645767c56974682):.................
Submitting batch of 17 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\out\03 (PK: 4b239175b95bef714cbca491dbb39316):..............
Submitting batch of 14 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\out\04 (PK: dc95a4b21c8c26332eb1147f8167f2be):................
Submitting batch of 16 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\out\05 (PK: 74cefd8231976f397f67c5473bf99494):................
Submitting batch of 16 items...OK
Processing folder D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\out\06 (PK: 0a7b66c2dcbfc7ce112afee4d9087844):...............
Submitting batch of 15

## Retrieve data

Define a function that retrieves a record from the table.

If we knew the record type in advance, we could pass it as the type argument of the `GetEntityIfExistsAsync<T>` method. We don't know it until we read the `RecordType` property, so we have to use the generic `TableEntity` type. To get a typed object, we would then have to construct the appropriate type manually and fill in its properties; the function below only displays the raw entity.

In [None]:
async Task DisplayMetadata(string fileName) {
    var partitionKey = GetMD5Hash(Path.GetDirectoryName(fileName) ?? string.Empty);
    var rowKey = GetMD5Hash(Path.GetFileName(fileName) ?? string.Empty);

    Console.WriteLine($"Retrieving entity with PartitionKey: {partitionKey}, RowKey: {rowKey}");

    var entityResponse = await tableClient.GetEntityIfExistsAsync<TableEntity>(partitionKey, rowKey);
    if (entityResponse.HasValue) {
        var entity = entityResponse.Value;
        var recordType = entity.GetString("RecordType");
        Console.WriteLine($"Found entity of type: {recordType}");
        entity.Display();
    } else {
        Console.WriteLine("Entity not found.");
    }
}
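
The manual construction step mentioned above could look roughly like the following sketch. It chooses the target type from the `RecordType` discriminator and copies a few properties. The mapper `ToTypedRecord` and the records `FileRecord` and `ImageRecord` are simplified, hypothetical stand-ins for `FileMetadata` and `ImageFileMetadata`. The input is a plain dictionary so the example runs without Azure; it assumes that `TableEntity` can be passed as `IDictionary<string, object>`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical mapper: picks the target type from the "RecordType" discriminator
static FileRecord ToTypedRecord(IDictionary<string, object> entity) {
    // Small local helpers that tolerate missing values
    string? Str(string key) => entity.TryGetValue(key, out var v) ? v as string : null;
    long Num(string key) => entity.TryGetValue(key, out var v) && v is not null ? Convert.ToInt64(v) : 0;

    return Str("RecordType") switch {
        "ImageFileMetadata" => new ImageRecord(
            Str("OriginalFileName") ?? "", Num("Size"),
            (int)Num("Width"), (int)Num("Height"), Str("CameraModel")),
        _ => new FileRecord(Str("OriginalFileName") ?? "", Num("Size"))
    };
}

// Made-up sample values, shaped like the properties stored by this notebook
var sample = new Dictionary<string, object> {
    ["RecordType"] = "ImageFileMetadata",
    ["OriginalFileName"] = "photo.jpg",
    ["Size"] = 123456L,
    ["Width"] = 2400,
    ["Height"] = 1800
};
Console.WriteLine(ToTypedRecord(sample).GetType().Name);   // ImageRecord

// Simplified stand-ins for FileMetadata and ImageFileMetadata (assumed names)
record FileRecord(string OriginalFileName, long Size);
record ImageRecord(string OriginalFileName, long Size, int Width, int Height, string? CameraModel)
    : FileRecord(OriginalFileName, Size);
```

Unknown or missing `RecordType` values fall back to the common record, so entities written by older code would still load.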

Try to retrieve a non-existent record:

In [12]:
await DisplayMetadata(@"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\none.txt");

Retrieving entity with PartitionKey: f102dad1a9aa6d66b0bb60bad230f962, RowKey: e0c3486987e432b4bdd1d966459b5fb1
Entity not found.


Retrieve a non-image file with common metadata only:

In [13]:
await DisplayMetadata(@"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\intro.mp4");

Retrieving entity with PartitionKey: f102dad1a9aa6d66b0bb60bad230f962, RowKey: 06dc0bfdd9942dcf35f4efe7dbdcf110
Found entity of type: FileMetadata


key,type,value
odata.etag,System.String,"W/""datetime'2025-08-22T15%3A05%3A31.8183478Z'"""
PartitionKey,System.String,f102dad1a9aa6d66b0bb60bad230f962
RowKey,System.String,06dc0bfdd9942dcf35f4efe7dbdcf110
Timestamp,System.DateTimeOffset,2025-08-22 15:05:31Z
OriginalFileName,System.String,D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\intro.mp4
RecordType,System.String,FileMetadata
Size,System.Int64,9008211


Retrieve an image file:

In [14]:
await DisplayMetadata(@"D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\IMG_20250821_214530323.jpg");

Retrieving entity with PartitionKey: f102dad1a9aa6d66b0bb60bad230f962, RowKey: 209f183beee89ddd16447e0c54187d8c
Found entity of type: ImageFileMetadata


key,type,value
odata.etag,System.String,"W/""datetime'2025-08-22T15%3A05%3A31.8183478Z'"""
PartitionKey,System.String,f102dad1a9aa6d66b0bb60bad230f962
RowKey,System.String,209f183beee89ddd16447e0c54187d8c
Timestamp,System.DateTimeOffset,2025-08-22 15:05:31Z
CameraModel,System.String,motorola edge 20 lite
ExposureTime,System.String,20003/1000000 sec
FNumber,System.String,f/1.9
Height,System.Int32,1800
Iso,System.String,598
OriginalFileName,System.String,D:\OneDrive-Altairis\VideoProduction\Z-TECH\Main-2025\src\08\06-AzureQueues\IMG_20250821_214530323.jpg


## Cleanup

In [15]:
Console.Write("Deleting Azure Table Storage table...");
await tableClient.DeleteAsync();
Console.WriteLine("OK");

Deleting Azure Table Storage table...OK
