
sdk: make test_batch_span_processor_scheduled_delay a bit more robust #3938

Conversation

xrmx
Contributor

@xrmx xrmx commented May 30, 2024

Description

It happened that the test failed because the delay fired a few microseconds early:

>       self.assertGreaterEqual((export_time - start_time) * 1e3, 500)
E       AssertionError: 499.9737739562988 not greater than or equal to 500

We should probably revisit all these skips on Windows / PyPy once we have a Python 3.9 baseline and PyPy >= 7.3.12.

Fix #3911

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details of your test configuration.

  • tox -e pypy3-opentelemetry-sdk

Does This PR Require a Contrib Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@xrmx xrmx added the Skip Changelog PRs that do not require a CHANGELOG.md entry label May 30, 2024
@xrmx xrmx requested a review from a team as a code owner May 30, 2024 15:44
@xrmx xrmx force-pushed the flaky-test_batch_span_processor_scheduled_delay branch from 2e6fe14 to 8b2cd41 on June 3, 2024 12:34
@xrmx xrmx requested a review from ocelotl June 3, 2024 12:34
@xrmx xrmx changed the title sdk: make test_batch_span_processor_scheduled_delay less flaky on pypy sdk: skip test_batch_span_processor_scheduled_delay on pypy / windows Jun 3, 2024
Member

@aabmass aabmass left a comment


It happened that tests failed because the delay was fired some microseconds earlier:

>       self.assertGreaterEqual((export_time - start_time) * 1e3, 500)
E       AssertionError: 499.9737739562988 not greater than or equal to 500

For the purposes of this test, I think some microseconds early or late should be perfectly acceptable. Can we instead just update L489 to use assertAlmostEqual() with a reasonable delta?
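To make the suggestion concrete, here is a minimal, self-contained illustration of how assertAlmostEqual with a delta tolerates small jitter in either direction while still rejecting gross outliers. The measured values are the ones quoted in this thread; delta=25 is just an example tolerance, not a value anyone has agreed on:

```python
import unittest

tc = unittest.TestCase()

# 499.97 ms measured vs. 500 ms expected: this failed the strict
# assertGreaterEqual check, but passes with a symmetric 25 ms tolerance.
tc.assertAlmostEqual(499.9737739562988, 500, delta=25)

# A grossly late export (like the 2253 ms PyPy spike reported later in
# this thread) now fails instead of slipping past a >= comparison.
try:
    tc.assertAlmostEqual(2253.103017807007, 500, delta=25)
    spike_rejected = False
except AssertionError:
    spike_rejected = True

print(spike_rejected)  # → True
```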

@aabmass
Member

aabmass commented Jun 11, 2024

Ah, I see you mentioned this in the issue @xrmx. That would be my preferred fix.

@xrmx
Contributor Author

xrmx commented Jun 11, 2024

It happened that tests failed because the delay was fired some microseconds earlier:

>       self.assertGreaterEqual((export_time - start_time) * 1e3, 500)
E       AssertionError: 499.9737739562988 not greater than or equal to 500

For the purposes of this test, I think some microseconds early or late should be perfectly acceptable. Can we instead just update L489 to use assertAlmostEqual() with a reasonable delta?

The first version was:

self.assertGreaterEqual(round((export_time - start_time) * 1e3), 500)

but I was asked to change it.

@aabmass
Member

aabmass commented Jun 11, 2024

Thanks. @ocelotl I see your comment above. I imagine this test can be flaky on any platform since it depends on real-world timing. IMO the PyPy behavior is working as intended.

self.assertGreaterEqual((export_time - start_time) * 1e3, 500)

The other issue: if the actual time is way larger than 500, the test still passes. IMO assertAlmostEqual would be an improvement across the board. Wdyt?

@xrmx xrmx force-pushed the flaky-test_batch_span_processor_scheduled_delay branch from 762f4da to b115232 on June 12, 2024 07:19
@xrmx xrmx requested a review from aabmass June 12, 2024 07:21
@xrmx xrmx force-pushed the flaky-test_batch_span_processor_scheduled_delay branch from b115232 to dcec3b0 on June 12, 2024 07:24
@xrmx
Contributor Author

xrmx commented Jun 12, 2024

Updated following @aabmass's suggestion.

Contributor

@lzchen lzchen left a comment


@ocelotl PTAL

@xrmx xrmx changed the title sdk: skip test_batch_span_processor_scheduled_delay on pypy / windows sdk: make test_batch_span_processor_scheduled_delay a bit more robust Jun 12, 2024
@ocelotl
Contributor

ocelotl commented Jun 12, 2024

The other issue: if the actual time is way larger than 500, the test still passes.

I was thinking the same thing, approved.

@ocelotl
Contributor

ocelotl commented Jun 12, 2024

Well, still not enough:

>       self.assertAlmostEqual((export_time - start_time) * 1e3, 500, places=0)
E       AssertionError: 500.6556510925293 != 500 within 0 places (0.6556510925292969 difference)

@ocelotl
Contributor

ocelotl commented Jun 12, 2024

Ok, two things, in my opinion:

  1. This test has probably been failing randomly since forever, and since forever we have probably just been rerunning it until it passes.
  2. We are now ok with this test having some margin of error.

So:

  1. Let's just automate our manual rerunning of this test case.
  2. Maybe we are ok with an even greater margin of error for this test case. Let's use the greatest margin of error we can accept. We probably want to use the delta argument instead of places.
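For the rerun part, a plugin such as pytest-rerunfailures can automate this via @pytest.mark.flaky(reruns=3). A hand-rolled sketch of the same idea, with purely illustrative names:

```python
import functools
import time


def rerun_on_failure(reruns=3, delay_s=0.0):
    """Retry a flaky test a few times before reporting failure.

    Hand-rolled sketch; in a real suite the pytest-rerunfailures
    plugin provides the same behaviour via @pytest.mark.flaky(reruns=3).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(reruns + 1):
                try:
                    return func(*args, **kwargs)
                except AssertionError:
                    if attempt == reruns:
                        raise  # out of reruns, report the failure
                    time.sleep(delay_s)
        return wrapper
    return decorator


# Usage: a test that only passes on its third attempt.
attempts = []


@rerun_on_failure(reruns=3)
def flaky_timing_test():
    attempts.append(1)
    assert len(attempts) >= 3


flaky_timing_test()
print(len(attempts))  # → 3
```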

@xrmx xrmx force-pushed the flaky-test_batch_span_processor_scheduled_delay branch 2 times, most recently from ce74344 to fc24d14 on June 13, 2024 07:59
@xrmx
Contributor Author

xrmx commented Jun 13, 2024

me: Bumped the delta to 16ms as seen on CI

pypy: hold my beer:
E AssertionError: 2253.103017807007 != 500 within 16 delta (1753.1030178070068 difference)

@xrmx xrmx force-pushed the flaky-test_batch_span_processor_scheduled_delay branch 2 times, most recently from 57be3e0 to eca91f4 on June 13, 2024 14:57
xrmx added 2 commits June 13, 2024 17:34
It happened that the test failed because the delay fired a few
microseconds early:

>       self.assertGreaterEqual((export_time - start_time) * 1e3, 500)
E       AssertionError: 499.9737739562988 not greater than or equal to 500

Use assertAlmostEqual to accept a close enough value (delta=25) and to
reject values that are far too large.

Skip the test on Windows PyPy because of random huge spikes:

E AssertionError: 2253.103017807007 != 500 within 25 delta (1744.1030178070068 difference)

Fix open-telemetry#3911
The last metric collection, after the thread has been notified to shut
down, does not handle a MetricsTimeoutError raised by the submission.
Handle it there too, to match what we do in the usual collection loop.

Seen in CI runs of TestBatchSpanProcessor.test_batch_span_processor_scheduled_delay,
failing with:

opentelemetry-sdk/tests/metrics/test_periodic_exporting_metric_reader.py::TestPeriodicExportingMetricReader::test_metric_timeout_does_not_kill_worker_thread
  _pytest\threadexception.py:73: PytestUnhandledThreadExceptionWarning: Exception in thread OtelPeriodicExportingMetricReader

  Traceback (most recent call last):
    File "C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\threading.py", line 1045, in _bootstrap_inner
      self.run()
    File "C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\threading.py", line 982, in run
      self._target(*self._args, **self._kwargs)
    File "D:\a\opentelemetry-python\opentelemetry-python\opentelemetry-sdk\src\opentelemetry\sdk\metrics\_internal\export\__init__.py", line 522, in _ticker
      self.collect(timeout_millis=self._export_interval_millis)
    File "D:\a\opentelemetry-python\opentelemetry-python\opentelemetry-sdk\tests\metrics\test_periodic_exporting_metric_reader.py", line 87, in collect
      raise self._collect_exception
  opentelemetry.sdk.metrics._internal.exceptions.MetricsTimeoutError: test timeout
@xrmx xrmx force-pushed the flaky-test_batch_span_processor_scheduled_delay branch from 3704cfb to d600e87 on June 13, 2024 15:34
@xrmx
Contributor Author

xrmx commented Jun 13, 2024

Opened #3967 for revisiting the skipped tests on PyPy / Windows.

Today I added another commit to this PR after noticing that in some failing PyPy runs the MetricsTimeoutError was raised during the last collection. Since we already catch it in the loop, I think we should catch it there too. You can find the stack trace here:


opentelemetry-sdk/tests/metrics/test_periodic_exporting_metric_reader.py::TestPeriodicExportingMetricReader::test_metric_timeout_does_not_kill_worker_thread
  _pytest\threadexception.py:73: PytestUnhandledThreadExceptionWarning: Exception in thread OtelPeriodicExportingMetricReader

  Traceback (most recent call last):
    File "C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\threading.py", line 1045, in _bootstrap_inner
      self.run()
    File "C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\threading.py", line 982, in run
      self._target(*self._args, **self._kwargs)
    File "D:\a\opentelemetry-python\opentelemetry-python\opentelemetry-sdk\src\opentelemetry\sdk\metrics\_internal\export\__init__.py", line 522, in _ticker
      self.collect(timeout_millis=self._export_interval_millis)
    File "D:\a\opentelemetry-python\opentelemetry-python\opentelemetry-sdk\tests\metrics\test_periodic_exporting_metric_reader.py", line 87, in collect
      raise self._collect_exception
  opentelemetry.sdk.metrics._internal.exceptions.MetricsTimeoutError: test timeout
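The shape of that fix can be sketched as follows. These are illustrative stand-ins, not the actual SDK source (the real worker loop lives in opentelemetry/sdk/metrics/_internal/export/__init__.py):

```python
class MetricsTimeoutError(Exception):
    """Stand-in for the SDK's MetricsTimeoutError."""


def _ticker(reader):
    # Illustrative sketch of the periodic reader's worker loop.
    while not reader.shutdown:
        try:
            reader.collect()
        except MetricsTimeoutError:
            pass  # the real loop logs the error and keeps going
    # Final collection after shutdown is signalled: previously
    # unprotected, so a timeout here killed the worker thread.
    try:
        reader.collect()
    except MetricsTimeoutError:
        pass


class TimingOutReader:
    shutdown = True  # loop body is skipped; only the final collect runs

    def collect(self):
        raise MetricsTimeoutError("test timeout")


# The worker now survives a timeout raised by the final collection.
_ticker(TimingOutReader())
survived = True
print(survived)  # → True
```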

@ocelotl ocelotl merged commit b50ac84 into open-telemetry:main Jun 13, 2024
269 checks passed