Test stability - OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration #5283

Closed
alanwest opened this issue Jan 30, 2024 · 2 comments · Fixed by #5304
Labels: bug (Something isn't working), infra (Infra work - CI/CD, code coverage, linters)

Comments

alanwest commented Jan 30, 2024

Multiple test flickers for OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests

OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration_UseOpenMetricsVersionHeader

https://github.com/open-telemetry/opentelemetry-dotnet/actions/runs/7704427638/job/20996720289?pr=5270#step:6:5733

Failed OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration_UseOpenMetricsVersionHeader [22 ms]
  Error Message:
   System.Net.Http.HttpRequestException : Connection refused (localhost:4388)
---- System.Net.Sockets.SocketException : Connection refused
  Stack Trace:
     at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(QueueItem queueItem)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.HttpConnectionWaiter`1.WaitForConnectionAsync(Boolean async, CancellationToken requestCancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.RunPrometheusExporterHttpServerIntegrationTest(Boolean skipMetrics, String acceptHeader) in /home/runner/work/opentelemetry-dotnet/opentelemetry-dotnet/test/OpenTelemetry.Exporter.Prometheus.HttpListener.Tests/PrometheusHttpListenerTests.cs:line 154
   at OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration_UseOpenMetricsVersionHeader() in /home/runner/work/opentelemetry-dotnet/opentelemetry-dotnet/test/OpenTelemetry.Exporter.Prometheus.HttpListener.Tests/PrometheusHttpListenerTests.cs:line 91
--- End of stack trace from previous location ---
----- Inner Stack Trace -----
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|281_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)

OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration_NoOpenMetrics

https://github.com/open-telemetry/opentelemetry-dotnet/actions/runs/7704639557/job/20997296537?pr=5272#step:6:6258

Failed OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration_NoOpenMetrics [1 m 40 s]
  Error Message:
   System.Threading.Tasks.TaskCanceledException : The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
---- System.TimeoutException : The operation was canceled.
-------- System.Threading.Tasks.TaskCanceledException : The operation was canceled.
------------ System.IO.IOException : Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
---------------- System.Net.Sockets.SocketException : The I/O operation has been aborted because of either a thread exit or an application request.
  Stack Trace:
     at System.Net.Http.HttpClient.HandleFailure(Exception e, Boolean telemetryStarted, HttpResponseMessage response, CancellationTokenSource cts, CancellationToken cancellationToken, CancellationTokenSource pendingRequestsCts)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.RunPrometheusExporterHttpServerIntegrationTest(Boolean skipMetrics, String acceptHeader) in D:\a\opentelemetry-dotnet\opentelemetry-dotnet\test\OpenTelemetry.Exporter.Prometheus.HttpListener.Tests\PrometheusHttpListenerTests.cs:line 154
   at OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration_NoOpenMetrics() in D:\a\opentelemetry-dotnet\opentelemetry-dotnet\test\OpenTelemetry.Exporter.Prometheus.HttpListener.Tests\PrometheusHttpListenerTests.cs:line 85
--- End of stack trace from previous location ---
----- Inner Stack Trace -----

----- Inner Stack Trace -----
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
----- Inner Stack Trace -----
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
----- Inner Stack Trace -----
alanwest added the bug (Something isn't working) and infra (Infra work - CI/CD, code coverage, linters) labels Jan 30, 2024
alanwest changed the title from "Test stability - OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration_UseOpenMetricsVersionHeader" to "Test stability - OpenTelemetry.Exporter.Prometheus.Tests.PrometheusHttpListenerTests.PrometheusExporterHttpServerIntegration" Jan 30, 2024
reyang commented Jan 31, 2024

Related to #3292 and #4913.

reyang commented Jan 31, 2024

All of these issues are caused by the same race condition.

The test code starts the listener/server asynchronously and then immediately makes async client calls, without first checking that the service is up and running (that is, ready to handle requests). Depending on timing, those calls can fail randomly.
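
One possible way to close this kind of race is to have the test poll the endpoint until it answers before making the real request. The snippet below is only a minimal sketch of that idea, not the change that fixed this issue: `ReadinessProbe`, `WaitUntilReachableAsync`, the 1 second per-attempt timeout, and the 50 ms retry delay are all hypothetical names and values chosen for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the OpenTelemetry test suite.
public static class ReadinessProbe
{
    // Polls `url` until the server returns any HTTP response or `timeout` elapses.
    // Returns true if a response was received, false if the deadline passed first.
    public static async Task<bool> WaitUntilReachableAsync(
        HttpClient client, string url, TimeSpan timeout)
    {
        var stopwatch = Stopwatch.StartNew();

        while (stopwatch.Elapsed < timeout)
        {
            // Bound each attempt so a listener that accepts the connection but
            // never answers cannot stall the probe for HttpClient.Timeout (100 s by default).
            using var attemptCts = new CancellationTokenSource(TimeSpan.FromSeconds(1));

            try
            {
                using var response = await client.GetAsync(url, attemptCts.Token);
                return true; // Any status code means the listener is handling requests.
            }
            catch (Exception ex) when (ex is HttpRequestException || ex is OperationCanceledException)
            {
                // Connection refused or attempt timed out: the server is not ready yet.
                await Task.Delay(TimeSpan.FromMilliseconds(50));
            }
        }

        return false;
    }
}
```

A test would call `await ReadinessProbe.WaitUntilReachableAsync(client, address, TimeSpan.FromSeconds(10))` after starting the listener and fail fast if it returns `false`. Note that probing a metrics endpoint triggers a scrape, so whether the probe should hit the same URL as the assertion under test is a design choice this sketch leaves open.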
