Runner's Azure Blob uploads stall through HTTPS proxy — BlobClient not configured with proxy transport #4351

@savitha-qs

Description

When the GitHub Actions runner is configured with HTTPS_PROXY, uploads to Azure Blob Storage (step logs, summaries, artifacts via the Results Service) stall for ~75 seconds and then time out. The runner's ResultsHttpClient creates BlobClient / AppendBlobClient instances without configuring an HttpClientTransport that uses the proxy, so the Azure SDK falls back to its default transport, which may not inherit the runner's proxy configuration correctly.

A standalone .NET 8.0 HttpClient.PutAsync() through the same proxy to the same blob endpoints completes in ~1 second — the issue is specific to how the runner creates Azure SDK blob clients.

Root cause analysis

In src/Sdk/WebApi/WebApi/ResultsHttpClient.cs, the GetBlobClient() and GetAppendBlobClient() methods create blob clients with only retry/timeout options:

private BlobClient GetBlobClient(string url)
{
    var blobUri = ParseSasToken(url);
    var opts = new BlobClientOptions
    {
        Retry =
        {
            MaxRetries = Constants.DefaultBlobUploadRetries,
            NetworkTimeout = TimeSpan.FromSeconds(Constants.DefaultNetworkTimeoutInSeconds)
        }
    };
    return new BlobClient(blobUri.path, new AzureSasCredential(blobUri.sas), opts);
}

No HttpClientTransport is provided in BlobClientOptions, so the Azure SDK uses its default HTTP pipeline. Meanwhile, the runner's own HTTP calls use a custom RunnerWebProxy via VssHttpMessageHandler, but this proxy configuration is not shared with the Azure SDK blob clients.

Suggested fix

Configure the BlobClientOptions.Transport with an HttpClientTransport that uses the runner's proxy settings:

private BlobClient GetBlobClient(string url)
{
    var blobUri = ParseSasToken(url);
    var handler = new HttpClientHandler();
    // Use the runner's proxy, or let the default pick up HTTPS_PROXY env var
    handler.Proxy = WebProxy;  // RunnerWebProxy instance
    handler.UseProxy = true;

    var opts = new BlobClientOptions
    {
        Retry =
        {
            MaxRetries = Constants.DefaultBlobUploadRetries,
            NetworkTimeout = TimeSpan.FromSeconds(Constants.DefaultNetworkTimeoutInSeconds)
        },
        Transport = new HttpClientTransport(handler)
    };
    return new BlobClient(blobUri.path, new AzureSasCredential(blobUri.sas), opts);
}

The same fix is needed for GetAppendBlobClient().
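
Since both methods need the same options, one way to avoid duplicating the handler setup is a small shared factory. The following is only a sketch under assumptions: `ProxiedBlobOptions` and its parameters are hypothetical names that do not exist in the runner, and the caller is assumed to pass in the runner's proxy object (or null) along with the existing retry and timeout constants.

```csharp
using System;
using System.Net;
using System.Net.Http;
using Azure.Core.Pipeline;
using Azure.Storage.Blobs;

// Hypothetical helper (not part of the runner): builds BlobClientOptions whose
// HTTP transport is explicitly wired to a proxy.
public static class ProxiedBlobOptions
{
    public static BlobClientOptions Create(IWebProxy proxy, int maxRetries, TimeSpan networkTimeout)
    {
        var handler = new HttpClientHandler
        {
            // A null Proxy with UseProxy = true leaves the handler on the
            // platform default proxy (HttpClient.DefaultProxy).
            Proxy = proxy,
            UseProxy = true
        };

        var options = new BlobClientOptions
        {
            // Replace the SDK's default transport with one built on our handler.
            Transport = new HttpClientTransport(handler)
        };
        options.Retry.MaxRetries = maxRetries;
        options.Retry.NetworkTimeout = networkTimeout;
        return options;
    }
}
```

GetBlobClient() and GetAppendBlobClient() could then both pass the result of `ProxiedBlobOptions.Create(...)` as their options argument. Creating the options once and reusing them across clients would also avoid allocating a new handler per upload, though whether that fits the runner's lifetime model is untested here.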

Evidence

Standalone .NET repro — works fine (no stall):

$ dotnet run
HTTPS_PROXY = http://10.129.160.8:443
.NET        = 8.0.26

--- PUT https://productionresultssa6.blob.core.windows.net/test-upload ---
HTTP 409 in 1.1s

--- PUT https://results-receiver.actions.githubusercontent.com/test ---
HTTP 404 in 0.2s
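
The source of the standalone repro is not included in this issue. A minimal probe that produces timings of this kind might look like the sketch below; `ProxyPutProbe` and its method names are hypothetical, and the payload is just a zero-filled buffer.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical probe: times a plain HttpClient PUT sent through an explicit proxy.
public static class ProxyPutProbe
{
    // Build a handler that always routes through the given proxy address.
    public static HttpClientHandler CreateHandler(Uri proxyUri) =>
        new HttpClientHandler { Proxy = new WebProxy(proxyUri), UseProxy = true };

    // PUT a zero-filled payload and report the status code and elapsed time.
    public static async Task<(int Status, TimeSpan Elapsed)> TimedPutAsync(
        HttpMessageHandler handler, Uri target, int payloadBytes)
    {
        using var client = new HttpClient(handler, disposeHandler: false);
        using var body = new ByteArrayContent(new byte[payloadBytes]);

        var timer = Stopwatch.StartNew();
        using var response = await client.PutAsync(target, body);
        timer.Stop();

        return ((int)response.StatusCode, timer.Elapsed);
    }
}
```

Calling `TimedPutAsync` with a handler from `CreateHandler(new Uri(Environment.GetEnvironmentVariable("HTTPS_PROXY")))` and each endpoint in turn would give one status/latency pair per request. The status codes depend on the endpoint and are not meaningful beyond showing that a response arrived.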

Same proxy, same endpoints — curl and Python also work:

TEST 1a: curl PUT 1MB to blob → HTTP 409 in 0.3s
TEST 2:  Python PUT 1MB to blob → HTTP 409 in 0.6s

The runner process stalls on the same endpoints. From GCP Secure Web Proxy logs during a real workflow run:

host=productionresultssa6.blob.core.windows.net:443  latency=74.999634s  ua=azsdk-net-Storage.Blobs/12.27.0 (.NET 8.0.25)
host=results-receiver.actions.githubusercontent.com:443  latency=74.999750s  ua=VSServices/2.333.1
host=results-receiver.actions.githubusercontent.com:443  latency=74.998897s  ua=VSServices/2.333.1
host=results-receiver.actions.githubusercontent.com:443  latency=74.999444s  ua=VSServices/2.333.1

Each request stalls for almost exactly 75 seconds (the proxy's tunnel timeout), which suggests the runner establishes the CONNECT tunnel but then fails to send or receive data through it.

Affected operations

The stall affects these runner-internal operations that use BlobClient / AppendBlobClient:

  • UploadStepSummaryAsync — step summary upload
  • UploadResultsStepLogAsync — step log upload
  • UploadResultsJobLogAsync — job log upload
  • UploadResultsDiagnosticLogsAsync — diagnostic log upload

The upload-artifact action itself is Node.js and uses the JavaScript Azure Storage SDK — it is not affected. The stall comes from the runner process's own results/log uploads.

Environment

  • Runner: GitHubActionsRunner-linux-x64/2.333.1
  • Runtime: .NET 8.0.25 (self-contained)
  • Azure SDK: azsdk-net-Storage.Blobs/12.27.0
  • Proxy: GCP Secure Web Proxy (HTTPS forward proxy, CONNECT tunnel, no TLS inspection)
  • Config: HTTPS_PROXY=http://<proxy-ip>:443
  • Platform: Ubuntu 24.04.4 LTS on GKE (actions-runner-controller v0.13.1)

Workaround

Add the affected hostnames to NO_PROXY to bypass the proxy:

NO_PROXY=.blob.core.windows.net,.actions.githubusercontent.com

This is not ideal because it requires direct internet access from the runner nodes, bypassing the proxy's hostname-based egress control.
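
To illustrate why the leading-dot entries above cover every storage account and results host, here is a simplified suffix matcher. It is an assumption-laden sketch: `NoProxyMatcher` is a hypothetical name, and the runner's and .NET's actual NO_PROXY rules may differ in details such as ports, wildcards, and IP ranges.

```csharp
using System;
using System.Linq;

// Hypothetical, simplified NO_PROXY matcher for illustration only.
public static class NoProxyMatcher
{
    // Returns true when the host matches any comma-separated NO_PROXY entry.
    public static bool IsBypassed(string host, string noProxy)
    {
        if (string.IsNullOrWhiteSpace(host) || string.IsNullOrWhiteSpace(noProxy))
        {
            return false;
        }

        return noProxy
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Any(entry => Matches(host, entry));
    }

    private static bool Matches(string host, string entry)
    {
        // Normalize "example.com" and ".example.com" to the same bare domain.
        var domain = entry.TrimStart('.');
        if (domain.Length == 0)
        {
            return false;
        }

        // Match the domain itself or any subdomain of it.
        return host.Equals(domain, StringComparison.OrdinalIgnoreCase)
            || host.EndsWith("." + domain, StringComparison.OrdinalIgnoreCase);
    }
}
```

Under these rules, `productionresultssa6.blob.core.windows.net` and `results-receiver.actions.githubusercontent.com` both match the NO_PROXY value above, while an unrelated host such as `github.com` still goes through the proxy.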
