Content Formatting

NEWDAY\N17781 edited this page Mar 25, 2026 · 13 revisions

Diagram notes can contain a lot of raw HTTP data — headers, JSON bodies, tokens, cookies, HTML responses. Without processing, this often results in diagrams that are noisy, unreadable, or too large to render. The content formatting system lets you transform, redact, and reshape this content before it appears in diagrams.

There are two levels of control:

  1. ReportConfigurationOptions.RequestResponsePostProcessor — A single Func<string, string> applied to both requests and responses after formatting. This is the simplest approach and sufficient for most projects.
  2. DiagramsFetcherOptions — Fine-grained control with separate pre- and post-processors for requests and responses independently. For advanced scenarios.

Quick Start: Using ReportConfigurationOptions

The simplest way to process diagram content is via RequestResponsePostProcessor on ReportConfigurationOptions. This single function is applied to both request and response content after the library has formatted it (JSON pretty-printed, headers laid out):

new ReportConfigurationOptions
{
    SpecificationsTitle = "My API Specifications",
    RequestResponsePostProcessor = content => content
        .RedactBearerTokens()
        .RedactAccessTokens()
        .SplitLongWords()
};

This is what most integration guides show. Internally, the library maps this to both RequestPostFormattingProcessor and ResponsePostFormattingProcessor on DiagramsFetcherOptions.


DiagramsFetcherOptions (Advanced)

For fine-grained control over how request/response bodies and headers are formatted in diagram notes, use DiagramsFetcherOptions directly. This is useful when you need different processing for requests vs responses, or when you need to transform the raw body before the library formats it.

var options = new DiagramsFetcherOptions
{
    PlantUmlServerBaseUrl = "https://plantuml.com/plantuml",
    RequestPreFormattingProcessor = content => content,
    RequestPostFormattingProcessor = content => content,
    ResponsePreFormattingProcessor = content => content,
    ResponsePostFormattingProcessor = content => content,
    ExcludedHeaders = ["Authorization", "X-Api-Key"]
};

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| PlantUmlServerBaseUrl | string | "https://plantuml.com/plantuml" | Base URL of the PlantUML server. |
| RequestPreFormattingProcessor | Func<string, string>? | null | Transform raw request body before the library formats it into the PlantUML note. |
| RequestPostFormattingProcessor | Func<string, string>? | null | Transform the formatted request note after the library has formatted it. |
| ResponsePreFormattingProcessor | Func<string, string>? | null | Transform raw response body before the library formats it into the PlantUML note. |
| ResponsePostFormattingProcessor | Func<string, string>? | null | Transform the formatted response note after the library has formatted it. |
| ExcludedHeaders | IEnumerable<string> | [] | HTTP headers to exclude from diagram notes. |
| SeparateSetup | bool | false | When true, HTTP calls made before StartAction() are wrapped in a visual "Setup" partition in the diagram. See Diagram Customisation. |
| HighlightSetup | bool | true | When true (and SeparateSetup is enabled), the setup partition is rendered with a background colour. When false, the partition has no background colour. |

Formatting Pipeline

The formatting pipeline for each request/response is:

Raw body → PreFormattingProcessor → Library formatting (JSON pretty-print, header layout) → PostFormattingProcessor → Diagram note
  • Pre-processor — Runs on the raw body text before the library formats it (JSON pretty-printing, header extraction, etc.). Use this to deserialise, decrypt, decompress, or restructure the raw content so the library's formatter can process it correctly.
  • Post-processor — Runs on the fully formatted note text after the library has laid it out. Use this to redact sensitive values, shorten long strings, strip noise, or wrap long lines. This is the one you'll use most often.

The distinction matters because the library's built-in formatting does several things automatically:

  • Pretty-prints JSON bodies with indentation
  • Lays out headers in [Key=Value] format
  • Handles content-type detection

If the raw body isn't in a format the library recognises (e.g. it's compressed, encrypted, or non-JSON), the formatter can't do its job. That's where pre-processors come in.
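
As a rough mental model, the stages above can be sketched as plain function composition. This is not the library's implementation: PipelineSketch, BuildNote and LibraryFormat are hypothetical names, and the formatter here only pretty-prints JSON with System.Text.Json to stand in for the real formatting step.

```csharp
using System;
using System.Text.Json;

public static class PipelineSketch
{
    // Hypothetical stand-in for the library's formatter:
    // pretty-prints JSON and leaves anything else untouched.
    public static string LibraryFormat(string body)
    {
        try
        {
            using var doc = JsonDocument.Parse(body);
            return JsonSerializer.Serialize(
                doc.RootElement,
                new JsonSerializerOptions { WriteIndented = true });
        }
        catch (JsonException)
        {
            return body; // Not JSON, so there is nothing to pretty-print
        }
    }

    // Applies the stages in the documented order; a null processor is skipped.
    public static string BuildNote(
        string rawBody,
        Func<string, string>? pre,
        Func<string, string>? post)
    {
        var prepared = pre is null ? rawBody : pre(rawBody);
        var formatted = LibraryFormat(prepared);
        return post is null ? formatted : post(formatted);
    }
}
```

Calling BuildNote with a pre-processor that strips a non-JSON prefix and a post-processor that masks a value shows why the split exists: the prefix has to be removed before formatting can succeed, and the mask is applied to the already indented text.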


Pre-Processor Use Cases

Pre-processors operate on the raw body string before the library formats it. They're essential when the raw content needs to be transformed into something the library's JSON pretty-printer can work with, or when the raw format isn't useful in its original form.

Note: Pre-processors are set via DiagramsFetcherOptions (separate RequestPreFormattingProcessor and ResponsePreFormattingProcessor), not via ReportConfigurationOptions. If you only need post-processing, use ReportConfigurationOptions.RequestResponsePostProcessor instead.

Pretty-Printing XML

If your system under test (SUT) communicates with SOAP or XML-based services, the raw body is typically a single long line of XML. The library's formatter won't pretty-print it (it only handles JSON). Use a pre-processor to format it:

// Requires: using System.Xml.Linq;
RequestPreFormattingProcessor = body =>
{
    try { return XDocument.Parse(body).ToString(); }
    catch { return body; } // Not XML, leave as-is
}

Before (raw):

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><GetCustomerResponse><Id>123</Id><Name>John Doe</Name></GetCustomerResponse></soap:Body></soap:Envelope>

After (pre-processed, then shown in diagram):

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetCustomerResponse>
      <Id>123</Id>
      <Name>John Doe</Name>
    </GetCustomerResponse>
  </soap:Body>
</soap:Envelope>

Decompressing Gzip/Deflate Bodies

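Some APIs return compressed bodies. If the HttpClient pipeline hasn't decompressed them before tracking captures the content, the raw body will be unreadable. The example below assumes the captured body is the compressed bytes encoded as Base64 text; adjust the decoding step to match how the body is actually captured in your setup: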

ResponsePreFormattingProcessor = body =>
{
    try
    {
        var bytes = Convert.FromBase64String(body);
        using var input = new MemoryStream(bytes);
        using var gzip = new GZipStream(input, CompressionMode.Decompress);
        using var reader = new StreamReader(gzip);
        return reader.ReadToEnd();
    }
    catch { return body; }
}
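
The section title also mentions Deflate, which the snippet above does not handle. Below is a minimal sketch covering both, under the same assumption that the captured body is Base64 text wrapping the compressed bytes. It checks for the gzip magic bytes and otherwise tries zlib-wrapped deflate via ZLibStream (available from .NET 6). CompressedBodySketch and TryDecompress are hypothetical names; servers that send raw deflate without the zlib wrapper would need DeflateStream instead.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CompressedBodySketch
{
    // Assumption: the body is Base64 text wrapping gzip or zlib-compressed bytes.
    public static string TryDecompress(string body)
    {
        try
        {
            var bytes = Convert.FromBase64String(body);
            using var input = new MemoryStream(bytes);

            // Gzip streams start with the magic bytes 0x1F 0x8B.
            var isGzip = bytes.Length > 2 && bytes[0] == 0x1F && bytes[1] == 0x8B;
            using Stream decompressor = isGzip
                ? (Stream)new GZipStream(input, CompressionMode.Decompress)
                : new ZLibStream(input, CompressionMode.Decompress);

            using var reader = new StreamReader(decompressor, Encoding.UTF8);
            return reader.ReadToEnd();
        }
        catch
        {
            return body; // Not Base64 or not compressed, leave as-is
        }
    }
}
```

It can be wired in as ResponsePreFormattingProcessor = CompressedBodySketch.TryDecompress, so plain JSON bodies fall through the catch block unchanged.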

Decrypting Encrypted Payloads

If a downstream service returns encrypted payloads (e.g. JWE tokens or encrypted JSON fields), the raw body is opaque. A pre-processor can decrypt it so the diagram shows the actual content:

// encryptionService is a placeholder for your own decryption component
ResponsePreFormattingProcessor = body =>
{
    try
    {
        return encryptionService.Decrypt(body);
    }
    catch { return body; } // Not decryptable, leave as-is
}

Decoding Form-Encoded Bodies

POST requests with application/x-www-form-urlencoded content appear as a single line of key=value&key=value pairs. Transform them into a more readable layout:

RequestPreFormattingProcessor = body =>
{
    if (!body.Contains('=') || body.TrimStart().StartsWith('{'))
        return body; // Not form-encoded, or is JSON

    try
    {
        var pairs = body.Split('&')
            .Select(p => p.Split('=', 2))
            .Where(p => p.Length == 2)
            // WebUtility.UrlDecode (System.Net) also decodes '+' as a space, unlike Uri.UnescapeDataString
            .Select(p => $"{WebUtility.UrlDecode(p[0])}: {WebUtility.UrlDecode(p[1])}");
        return string.Join("\n", pairs);
    }
    catch { return body; }
}

Before:

grant_type=authorization_code&code=abc123&redirect_uri=https%3A%2F%2Fapp.example.com%2Fcallback&client_id=my-app

After:

grant_type: authorization_code
code: abc123
redirect_uri: https://app.example.com/callback
client_id: my-app

Extracting Embedded JSON

Some APIs wrap JSON inside another format (e.g. an envelope or a Base64-encoded field). A pre-processor can extract the inner JSON so the library can pretty-print it:

ResponsePreFormattingProcessor = body =>
{
    try
    {
        var envelope = JsonSerializer.Deserialize<JsonElement>(body);
        if (envelope.TryGetProperty("data", out var data) && data.ValueKind == JsonValueKind.String)
        {
            // "data" holds JSON as a string: parse it to confirm it is valid JSON, then return it indented
            var inner = data.GetString()!;
            return JsonSerializer.Serialize(
                JsonSerializer.Deserialize<JsonElement>(inner),
                new JsonSerializerOptions { WriteIndented = true });
        }
        return body;
    }
    catch { return body; }
}

Combining Pre- and Post-Processors

You can use both simultaneously. A common pattern is pre-processing to make the body parseable, then post-processing to redact sensitive values from the formatted output:

var options = new DiagramsFetcherOptions
{
    // First: decode form-encoded token requests so the library can format them
    RequestPreFormattingProcessor = body =>
    {
        if (!body.Contains('=') || body.TrimStart().StartsWith('{')) return body;
        try
        {
            var pairs = body.Split('&')
                .Select(p => p.Split('=', 2))
                .Where(p => p.Length == 2)
                // WebUtility.UrlDecode (System.Net) also decodes '+' as a space, unlike Uri.UnescapeDataString
                .Select(p => $"{WebUtility.UrlDecode(p[0])}: {WebUtility.UrlDecode(p[1])}");
            return string.Join("\n", pairs);
        }
        catch { return body; }
    },

    // Then: redact tokens from the final formatted output
    RequestPostFormattingProcessor = content => content
        .RedactMiddle(new Regex(@"(?<=code: )\S+"))
        .RedactMiddle(new Regex(@"(?<=client_secret: )\S+")),

    ResponsePostFormattingProcessor = content => content
        .RedactMiddle(new Regex("(?<=\"access_token\": \")[^\"]+(?=\")"))
        .RedactMiddle(new Regex("(?<=\"refresh_token\": \")[^\"]+(?=\")"))
};

Tip: Always wrap pre-processors in try-catch blocks. The raw body may not be in the format you expect (e.g. a different content type, an error response, or empty). If the pre-processor throws, the diagram generation for that test will fail. Return the original body in the catch block to fall through gracefully.


Post-Processor Use Cases

Post-processors are the most commonly used. They operate on the fully formatted note text — the final string that will appear in the PlantUML diagram note, including headers and body content already laid out by the library.

Redacting Bearer Tokens

Bearer tokens are long, opaque strings that add significant noise to diagrams without providing useful information. A simple regex replacement shortens them:

RequestResponsePostProcessor = content =>
    Regex.Replace(content, @"Bearer [A-Za-z0-9\-._~+/]+=*", "Bearer ***")

Before:

[Authorization=Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJ...]

After:

[Authorization=Bearer ***]

Redacting JSON Token Fields

Tokens also appear in JSON response bodies (access tokens, refresh tokens, ID tokens). Target them with regex that matches the JSON structure:

private static readonly Regex AccessTokenRegex = new("(?<=\"access_token\": \")[^\"]+(?=\")");
private static readonly Regex RefreshTokenRegex = new("(?<=\"refresh_token\": \")[^\"]+(?=\")");
private static readonly Regex IdTokenRegex = new("(?<=\"id_token\": \")[^\"]+(?=\")");

RequestResponsePostProcessor = content => content
    .RedactMiddle(AccessTokenRegex)
    .RedactMiddle(RefreshTokenRegex)
    .RedactMiddle(IdTokenRegex)

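RedactMiddle is a helper extension method (declare it inside a static class). For matches longer than 50 characters it keeps the first 8 and last 18 characters, so the token stays identifiable for debugging while the bulk is redacted: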

private static string RedactMiddle(this string value, Regex regex) =>
    regex.Replace(value, m =>
        m.Value.Length > 50
            ? m.Value[..8] + "_REDACTED_" + m.Value[^18..]
            : m.Value);

Before:

"access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIn0.Sfl..."

After (the tail is illustrative, since the token above is truncated):

"access_token": "eyJhbGci_REDACTED_KxwRJSMeKKF2QT4"

Redacting Cookies

Cookie headers often contain long session tokens that bloat diagrams:

private static readonly Regex SetCookieRegex = new(@"(?<=\[Set-Cookie=)[^\]]+(?=])");
private static readonly Regex CookieRegex = new(@"(?<=\[Cookie=)[^\]]+(?=])");

RequestResponsePostProcessor = content => content
    .RedactEnding(SetCookieRegex, 50)
    .RedactEnding(CookieRegex, 50)

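RedactEnding keeps only the first length characters (30 by default) of any match longer than 200 characters and appends _RedactedEnding; shorter matches are left untouched: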

private static string RedactEnding(this string value, Regex regex, int length = 30) =>
    regex.Replace(value, m =>
        m.Value.Length > 200
            ? m.Value[..length] + "_RedactedEnding"
            : m.Value);
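
To make the effect concrete, here is a small self-contained sketch of RedactEnding applied to a Cookie header. CookieRedactionSketch and RedactCookies are hypothetical wrapper names; the regex and helper mirror the snippets above, made public so they can be called from a test.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CookieRedactionSketch
{
    // Matches the value inside a formatted [Cookie=...] header.
    private static readonly Regex CookieRegex = new(@"(?<=\[Cookie=)[^\]]+(?=])");

    // Keeps the first `length` characters of any match longer than 200 characters.
    public static string RedactEnding(this string value, Regex regex, int length = 30) =>
        regex.Replace(value, m =>
            m.Value.Length > 200
                ? m.Value[..length] + "_RedactedEnding"
                : m.Value);

    public static string RedactCookies(string note) => note.RedactEnding(CookieRegex, 50);
}
```

For a header value longer than 200 characters, only its first 50 characters survive, followed by _RedactedEnding. Values of 200 characters or fewer pass through unchanged, so short cookies remain fully visible.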

Splitting Long Words

Tokens, Base64-encoded values, and other long unbroken strings can make diagram notes extremely wide, sometimes exceeding PlantUML's rendering limits. Break them across lines:

private static string SplitWordsOverMaxLength(this string value, int maxLength = 200)
{
    var words = value.Split("\n")
        .SelectMany(line => line.Trim().Split(' '))
        .Where(w => !string.IsNullOrWhiteSpace(w));

    foreach (var word in words.Where(w => w.Length > maxLength))
    {
        var chunks = word.Chunk(maxLength).Select(c => new string(c));
        value = value.Replace(word, string.Join("\n", chunks));
    }

    return value;
}

Wrapping Long Embedded Values

Some payloads contain long string values (e.g. serialised JSON inside a JSON field, or EventGrid event payloads) that should be wrapped at a reasonable line length:

private static readonly Regex EventGridValueRegex =
    new("(?<=\"(?:request|response)\": \")[^\"]+");

private static string WrapLongValues(this string value, int maxLineLength = 90) =>
    EventGridValueRegex.Replace(value, match =>
        match.Value.Length <= maxLineLength
            ? match.Value
            : string.Join("\n", match.Value.Chunk(maxLineLength).Select(c => new string(c))));

Redacting Large HTML Responses

Some downstream services (identity providers, OAuth consent pages, etc.) return full HTML pages in their responses. These can be thousands of characters and make diagrams unrenderable. Replace them above a size threshold:

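private const int HtmlMaxCharactersBeforeRedacting = 3_000;

private static string RedactLargeHtmlResponses(this string value)
{
    const string startTag = "<html";
    const string endTag = "</html";

    // Only collapse notes that exceed the threshold and contain an HTML document.
    if (value.Length <= HtmlMaxCharactersBeforeRedacting)
        return value;

    var start = value.IndexOf(startTag, StringComparison.Ordinal);
    if (start < 0)
        return value;

    // Look for the closing tag after the opening one; if it is missing
    // (e.g. a truncated page), leave the note alone instead of throwing.
    var end = value.IndexOf(endTag, start, StringComparison.Ordinal);
    if (end < 0)
        return value;

    // The text after "</html" starts with ">", which completes the closing tag.
    var afterEnd = value[(end + endTag.Length)..];
    return value[..start] + startTag + ">...REDACTED..." + endTag + afterEnd.Trim();
}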

Exposing JWT Claims

Rather than just redacting tokens, you can extract useful information from JWTs and annotate them in the diagram. For example, extracting the auth_level claim:

private static string ExposeAuthLevelsOfAccessTokens(this string value)
{
    // [^\"]+ stops at the closing quote, so the match cannot run past the token
    var regex = new Regex("(\"access_token\": \")([^\"]+)(\",)");
    return regex.Replace(value, match =>
    {
        try
        {
            var token = new JwtSecurityTokenHandler().ReadJwtToken(match.Groups[2].Value);
            var authLevel = token.Claims.FirstOrDefault(x => x.Type == "auth_level");
            return authLevel is null
                ? match.Value
                : match.Value + $" /* [auth_level={authLevel.Value}] */";
        }
        catch
        {
            return match.Value; // Not a JWT, leave as-is
        }
    });
}

Before:

"access_token": "eyJhbGciOiJSUzI1NiJ9.eyJhdXRoX2xldmVsIjoiMiIsInN1YiI6InVzZXIxIn0.sig...",

After (with RedactMiddle also applied; the redacted tail is illustrative):

"access_token": "eyJhbGci_REDACTED_KxwRJSMeKKF2QT4", /* [auth_level=2] */

This gives you the security-relevant metadata at a glance without the raw token noise.


Full Example: Building a Composable Redaction Pipeline

In practice, you'll combine multiple techniques into a fluent processing pipeline. Here's a complete real-world example for an API that deals with authentication tokens, cookies, and identity provider responses:

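using System;
using System.IdentityModel.Tokens.Jwt; // NuGet package: System.IdentityModel.Tokens.Jwt
using System.Linq;
using System.Text.RegularExpressions;

public static class DiagramContentProcessor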
{
    // --- Token patterns ---
    private static readonly Regex AccessTokenRegex =
        new("(?<=\"access_token\": \")[^\"]+(?=\")");
    private static readonly Regex BearerTokenRegex =
        new(@"(?<=Bearer )([^\]]+)");
    private static readonly Regex RefreshTokenRegex =
        new("(?<=\"refresh_token\": \")[^\"]+(?=\")");
    private static readonly Regex IdTokenRegex =
        new("(?<=\"id_token\": \")[^\"]+(?=\")");

    // --- Cookie patterns ---
    private static readonly Regex SetCookieRegex =
        new(@"(?<=\[Set-Cookie=)[^\]]+(?=])");
    private static readonly Regex CookieRegex =
        new(@"(?<=\[Cookie=)[^\]]+(?=])");

    // --- Session / long value patterns ---
    private static readonly Regex SessionDataRegex =
        new("(?<=\"sessionData\": \")[^\"]+(?=\")");

    /// <summary>
    /// The post-processor function — wire this into ReportConfigurationOptions.
    /// </summary>
    public static Func<string, string> PostProcessor => content => content
        .ExposeAuthLevels()
        .RedactMiddle(AccessTokenRegex)
        .RedactMiddle(BearerTokenRegex)
        .RedactMiddle(RefreshTokenRegex)
        .RedactMiddle(IdTokenRegex)
        .RedactEnding(SetCookieRegex, 50)
        .RedactEnding(CookieRegex, 50)
        .RedactEnding(SessionDataRegex, 30)
        .RedactLargeHtml()
        .SplitLongWords();

    // --- Redaction helpers (extension methods) ---

    private static string RedactMiddle(this string value, Regex regex) =>
        regex.Replace(value, m =>
            m.Value.Length > 50
                ? m.Value[..8] + "_REDACTED_" + m.Value[^18..]
                : m.Value);

    private static string RedactEnding(this string value, Regex regex, int length) =>
        regex.Replace(value, m =>
            m.Value.Length > 200
                ? m.Value[..length] + "_RedactedEnding"
                : m.Value);

    private static string ExposeAuthLevels(this string value)
    {
        var regex = new Regex("(\"access_token\": \")([^\"]+)(\",)");
        return regex.Replace(value, match =>
        {
            try
            {
                var token = new JwtSecurityTokenHandler().ReadJwtToken(match.Groups[2].Value);
                var claim = token.Claims.FirstOrDefault(c => c.Type == "auth_level");
                return claim is null ? match.Value : match.Value + $" /* [auth_level={claim.Value}] */";
            }
            catch { return match.Value; }
        });
    }

    private static string RedactLargeHtml(this string value)
    {
        if (value.Length <= 3_000) return value;
        var start = value.IndexOf("<html", StringComparison.Ordinal);
        if (start < 0) return value;
        var end = value.IndexOf("</html", start, StringComparison.Ordinal);
        if (end < 0) return value; // No closing tag: leave as-is rather than throw
        return value[..start] + "<html>...REDACTED...</html" + value[(end + "</html".Length)..].Trim();
    }

    private static string SplitLongWords(this string value, int maxLength = 200)
    {
        foreach (var word in value.Split('\n').SelectMany(l => l.Split(' ')).Where(w => w.Length > maxLength))
            value = value.Replace(word, string.Join("\n", word.Chunk(maxLength).Select(c => new string(c))));
        return value;
    }
}

Wire it up:

// In your test setup / report configuration
new ReportConfigurationOptions
{
    SpecificationsTitle = "My API Specifications",
    RequestResponsePostProcessor = DiagramContentProcessor.PostProcessor,
    ExcludedHeaders = ["X-Request-Id", "X-Correlation-Id"]
}

Order of Operations

The order in which you chain redaction steps matters. Recommended order:

  1. Extract useful metadata from tokens (e.g. ExposeAuthLevels) — do this before redacting, so you can still read the JWT
  2. Redact tokens and secrets (RedactMiddle / RedactEnding) — remove sensitive values
  3. Redact cookies and session data — reduce noise from large cookie headers
  4. Redact large HTML — prevent oversized diagrams from identity provider responses
  5. Split long words — break any remaining long strings so PlantUML can render them

If you redact tokens before extracting claims, the JWT will already be truncated and unreadable.
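
The effect of ordering can be checked with a small self-contained sketch. To avoid a package dependency it decodes the JWT payload by hand as a simplified stand-in for JwtSecurityTokenHandler (no signature validation). OrderSketch and its method names are hypothetical, and the token regexes assume the pretty-printed "access_token": "..." layout used throughout this page.

```csharp
using System;
using System.Text;
using System.Text.Json;
using System.Text.RegularExpressions;

public static class OrderSketch
{
    private static readonly Regex TokenValue = new("(?<=\"access_token\": \")[^\"]+(?=\")");
    private static readonly Regex TokenField = new("(\"access_token\": \")([^\"]+)(\")");

    // Same shape as the RedactMiddle helper shown earlier.
    public static string RedactMiddle(string note) =>
        TokenValue.Replace(note, m =>
            m.Value.Length > 50 ? m.Value[..8] + "_REDACTED_" + m.Value[^18..] : m.Value);

    // Simplified stand-in for JwtSecurityTokenHandler: decodes the payload
    // segment by hand and performs no signature validation.
    public static string ExposeAuthLevel(string note) =>
        TokenField.Replace(note, m =>
        {
            try
            {
                // Convert the base64url payload segment to standard Base64 and restore padding.
                var segment = m.Groups[2].Value.Split('.')[1].Replace('-', '+').Replace('_', '/');
                segment = segment.PadRight(segment.Length + (4 - segment.Length % 4) % 4, '=');
                var json = Encoding.UTF8.GetString(Convert.FromBase64String(segment));
                using var payload = JsonDocument.Parse(json);
                var level = payload.RootElement.GetProperty("auth_level").GetString();
                return m.Value + $" /* [auth_level={level}] */";
            }
            catch
            {
                return m.Value; // Not a readable JWT (for example, already redacted)
            }
        });
}
```

Running ExposeAuthLevel and then RedactMiddle yields a shortened token with the auth_level annotation attached. Reversing the two calls yields only the shortened token, because the payload segment no longer exists by the time the claim is looked up.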


Tips

  • Test your processor by running your test suite and checking the generated PlantUML. If diagrams are still too wide or fail to render, you likely have unredacted long values.
  • Use RedactMiddle for tokens — keeping the start and end visible makes it possible to correlate which token is which across diagram notes.
  • Use RedactEnding for cookies and session data where only the name/start matters.
  • Set a size threshold for HTML redaction (e.g. 3,000 characters) so small HTML fragments still appear but full pages are collapsed.
  • Pre-processors vs post-processors: Use pre-processors only when you need to transform the raw body before the library's JSON pretty-printer runs (e.g. decrypting, decompressing, parsing XML). For everything else, use post-processors — they operate on the final formatted text which is more predictable to regex against.
