Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] testRandomTimeIntervalRounding Failure #56400

Closed
tvernum opened this issue May 8, 2020 · 9 comments · Fixed by #56433
Closed

[CI] testRandomTimeIntervalRounding Failure #56400

tvernum opened this issue May 8, 2020 · 9 comments · Fixed by #56433
Labels
:Analytics/Aggregations Aggregations Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test-failure Triaged test failures from CI

Comments

@tvernum
Copy link
Contributor

tvernum commented May 8, 2020

Build scan: https://gradle-enterprise.elastic.co/s/plit7g4fdojhi/tests/

Repro line:

./gradlew ':server:test' \
--tests "org.elasticsearch.common.RoundingTests.testRandomTimeIntervalRounding" \
-Dtests.seed=DE09A1CC10383EB0 -Dtests.security.manager=true \
-Dtests.locale=ca-ES -Dtests.timezone=Etc/GMT+1 \
-Dcompiler.java=14 -Druntime.java=11

Reproduces locally?: No

Applicable branches: master

Failure excerpt:

Expected [148 MINUTES in America/Menominee] to \
 round [2008-11-02T07:50:16.470Z] \
    to [2008-11-02T06:56:00Z] but instead rounded \
    to [2008-11-02T07:56:00Z]
at __randomizedtesting.SeedInfo.seed([DE09A1CC10383EB0:7D4B1688B8C503D4]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.elasticsearch.common.RoundingTests.testRandomTimeIntervalRounding(RoundingTests.java:410)
@tvernum tvernum added :Core/Infra/Core Core issues without another label >test-failure Triaged test failures from CI labels May 8, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Core)

@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label May 8, 2020
@tvernum
Copy link
Contributor Author

tvernum commented May 8, 2020

Ping: @nik9000 because this looks related to #56371

@rjernst
Copy link
Member

rjernst commented May 8, 2020

Reproduces locally?: No

Not sure why, but this does reproduce for me.

@nik9000
Copy link
Member

nik9000 commented May 8, 2020 via email

@nik9000 nik9000 added :Analytics/Aggregations Aggregations and removed :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team labels May 8, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 8, 2020
@nik9000
Copy link
Member

nik9000 commented May 8, 2020

This doesn't reproduce for me, but I bet it is variable depending on the tzdb file that is in the distribution. Fun times. I was able to make some tests fail by running the random tests a ton of times. I'm going to fix those and hope that fixes this.

@nik9000
Copy link
Member

nik9000 commented May 8, 2020

OK! It looks like I made some assumptions about the shape of the "current transition list" vs the "future transition rules" that was incorrect. It caused duplicate transitions in my list which oddly, never broke with time unit rounding, but does break with time interval rounding. They do use different strategies, especially around overlaps. Anyway, the tzdb for America/Tijuana has a similar behavior for me locally so I can reproduce this issue. Probably.

I'll open a PR with assertions that'll trip if we hit this.

@martijnvg
Copy link
Member

nik9000 added a commit to nik9000/elasticsearch that referenced this issue May 8, 2020
This fixes a bug in the interval rounding and two test bugs that showed
up when I ran 1000s of iterations of the tests. The bug was that we
could end up with duplicate transitions if we only needed the last
transition from the list of fully defined transitions and the
transitions overlapped with the rules. This happens. Its ok. We had
defence against that if we needed more than one transition but forgot it
in this case. Now we have it and assertions to make sure we catch any
similar mistakes.

One test bug had to do with the randomized test calling
`round(round(min))` because sometimes the first `round` call will return
a time less than min. Specifically if `min` is near daylight savings
time. Anyway, this change fixes it by building an extra-wide prepared
rounding just in case.

The other test bug came up when adding a test for the real bug, namely
that `assertRoundingAtOffset` made an assertion that wasn't true if the
time to round ended up being in an overlap. This just drops that
assertion. We have similar assertions in cases where we *know* what kind
of offsets we're working with in other tests.

Closes elastic#56400
@nik9000
Copy link
Member

nik9000 commented May 8, 2020

I opened a PR to fix it. It looks like about 10% of the tests fail my machine. I'm kind of surprised I didn't bump into this.

nik9000 added a commit that referenced this issue May 8, 2020
This fixes a bug in the interval rounding and two test bugs that showed
up when I ran 1000s of iterations of the tests. The bug was that we
could end up with duplicate transitions if we only needed the last
transition from the list of fully defined transitions and the
transitions overlapped with the rules. This happens. Its ok. We had
defence against that if we needed more than one transition but forgot it
in this case. Now we have it and assertions to make sure we catch any
similar mistakes.

One test bug had to do with the randomized test calling
`round(round(min))` because sometimes the first `round` call will return
a time less than min. Specifically if `min` is near daylight savings
time. Anyway, this change fixes it by building an extra-wide prepared
rounding just in case.

The other test bug came up when adding a test for the real bug, namely
that `assertRoundingAtOffset` made an assertion that wasn't true if the
time to round ended up being in an overlap. This just drops that
assertion. We have similar assertions in cases where we *know* what kind
of offsets we're working with in other tests.

Closes #56400
nik9000 added a commit to nik9000/elasticsearch that referenced this issue May 8, 2020
This fixes a bug in the interval rounding and two test bugs that showed
up when I ran 1000s of iterations of the tests. The bug was that we
could end up with duplicate transitions if we only needed the last
transition from the list of fully defined transitions and the
transitions overlapped with the rules. This happens. Its ok. We had
defence against that if we needed more than one transition but forgot it
in this case. Now we have it and assertions to make sure we catch any
similar mistakes.

One test bug had to do with the randomized test calling
`round(round(min))` because sometimes the first `round` call will return
a time less than min. Specifically if `min` is near daylight savings
time. Anyway, this change fixes it by building an extra-wide prepared
rounding just in case.

The other test bug came up when adding a test for the real bug, namely
that `assertRoundingAtOffset` made an assertion that wasn't true if the
time to round ended up being in an overlap. This just drops that
assertion. We have similar assertions in cases where we *know* what kind
of offsets we're working with in other tests.

Closes elastic#56400
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test-failure Triaged test failures from CI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants