Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] GeoIpDownloaderCliIT.testInvalidTimestamp failure #75221

Open
dimitris-athanasiou opened this issue Jul 12, 2021 · 16 comments · Fixed by #75330
Open

[CI] GeoIpDownloaderCliIT.testInvalidTimestamp failure #75221

dimitris-athanasiou opened this issue Jul 12, 2021 · 16 comments · Fixed by #75330
Assignees
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP medium-risk An open issue or test failure that is a medium risk to future releases Team:Data Management Meta label for data/management team >test-failure Triaged test failures from CI

Comments

@dimitris-athanasiou
Copy link
Contributor

Build scan: https://gradle-enterprise.elastic.co/s/sotnvdenrgysw

Repro line: ./gradlew ':modules:ingest-geoip:internalClusterTest' --tests "org.elasticsearch.ingest.geoip.GeoIpDownloaderCliIT.testInvalidTimestamp" -Dtests.seed=9DC568B1B022BF2 -Dtests.locale=sr -Dtests.timezone=Asia/Seoul -Druntime.java=11

Reproduces locally?: No

Applicable branches: master

Failure history: Failed a few times in the past month, see here

Failure excerpt:

java.lang.AssertionError: 	
Expected: not a collection containing "GeoLite2-ASN.mmdb"	
     but: was <[GeoLite2-ASN.mmdb_LICENSE.txt, GeoLite2-Country.mmdb, GeoLite2-ASN.mmdb_README.txt, GeoLite2-ASN.mmdb_COPYRIGHT.txt, GeoLite2-ASN.mmdb, GeoLite2-City.mmdb]>	
at __randomizedtesting.SeedInfo.seed([9DC568B1B022BF2:12C6D17C9532BB4]:0)	
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)	
at org.junit.Assert.assertThat(Assert.java:956)	
at org.junit.Assert.assertThat(Assert.java:923)	
at org.elasticsearch.ingest.geoip.GeoIpDownloaderIT.lambda$cleanUp$2(GeoIpDownloaderIT.java:109)	
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:973)	
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:946)	
at org.elasticsearch.ingest.geoip.GeoIpDownloaderIT.cleanUp(GeoIpDownloaderIT.java:104)	
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)	
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)	
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)	
at java.lang.reflect.Method.invoke(Method.java:566)	
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)	
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)	
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)	
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)	
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)	
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)	
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)	
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)	
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)	
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)	
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)	
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)	
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)	
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)	
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)	
at java.lang.Thread.run(Thread.java:834)
@dimitris-athanasiou dimitris-athanasiou added :Analytics/Geo Indexing, search aggregations of geo points and shapes >test-failure Triaged test failures from CI labels Jul 12, 2021
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 12, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@gwbrown
Copy link
Contributor

gwbrown commented Jul 14, 2021

Another instance of this: https://gradle-enterprise.elastic.co/s/c2sdhoiwrf2fo

@gwbrown
Copy link
Contributor

gwbrown commented Jul 14, 2021

I'll also note that the repro line for this test does not actually run the test. Running the class from IntelliJ also does not actually run the test. Running ./gradlew :modules:ingest-geoip:check also does not run the test. I remain mystified as to how to actually run this test locally.

probakowski added a commit that referenced this issue Jul 14, 2021
This change adds additional assertion in GeoIpDownloaderIT.testInvalidTimestamp which makes sure that validity checks work both ways (so going out of validity and back) and it should fix race in cleanUp method leading to occasional failures.

Closes #75221
Closes #74358
elasticsearchmachine pushed a commit to elasticsearchmachine/elasticsearch that referenced this issue Jul 14, 2021
This change adds additional assertion in GeoIpDownloaderIT.testInvalidTimestamp which makes sure that validity checks work both ways (so going out of validity and back) and it should fix race in cleanUp method leading to occasional failures.

Closes elastic#75221
Closes elastic#74358
elasticsearchmachine pushed a commit to elasticsearchmachine/elasticsearch that referenced this issue Jul 14, 2021
This change adds additional assertion in GeoIpDownloaderIT.testInvalidTimestamp which makes sure that validity checks work both ways (so going out of validity and back) and it should fix race in cleanUp method leading to occasional failures.

Closes elastic#75221
Closes elastic#74358
elasticsearchmachine added a commit that referenced this issue Jul 15, 2021
… (#75353)

* Make GeoIpDownloaderIT.testInvalidTimestamp more robust (#75330)

This change adds additional assertion in GeoIpDownloaderIT.testInvalidTimestamp which makes sure that validity checks work both ways (so going out of validity and back) and it should fix race in cleanUp method leading to occasional failures.

Closes #75221
Closes #74358

* compilation fix

Co-authored-by: Przemko Robakowski <przemko.robakowski@elastic.co>
@danhermann
Copy link
Contributor

danhermann commented Jul 15, 2021

This test failed again on master in https://gradle-enterprise.elastic.co/s/wzbkq4juyucvs

FWIW, I could not reproduce that one locally.

@danhermann danhermann reopened this Jul 15, 2021
@astefan
Copy link
Contributor

astefan commented Jul 15, 2021

And another one: https://gradle-enterprise.elastic.co/s/apsspqdlx7jqm in 7.14

elasticsearchmachine added a commit that referenced this issue Jul 15, 2021
#75354)

* Make GeoIpDownloaderIT.testInvalidTimestamp more robust (#75330)

This change adds additional assertion in GeoIpDownloaderIT.testInvalidTimestamp which makes sure that validity checks work both ways (so going out of validity and back) and it should fix race in cleanUp method leading to occasional failures.

Closes #75221
Closes #74358

* compilation fix

Co-authored-by: Przemko Robakowski <przemko.robakowski@elastic.co>
@gwbrown
Copy link
Contributor

gwbrown commented Jul 15, 2021

Another one, from a PR against 7.x: https://gradle-enterprise.elastic.co/s/qtxzi5x664h4s

@probakowski
Copy link
Contributor

probakowski commented Jul 15, 2021

It fails frequently - I'm gonna mute it in #75398
I wasn't able to reproduce it locally in few 1000s runs but CI seems to have no problems reproducing.

masseyke pushed a commit to masseyke/elasticsearch that referenced this issue Jul 16, 2021
This change adds additional assertion in GeoIpDownloaderIT.testInvalidTimestamp which makes sure that validity checks work both ways (so going out of validity and back) and it should fix race in cleanUp method leading to occasional failures.

Closes elastic#75221
Closes elastic#74358
@martijnvg martijnvg added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP and removed :Analytics/Geo Indexing, search aggregations of geo points and shapes Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Oct 27, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Oct 27, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@martijnvg martijnvg self-assigned this Oct 27, 2021
@martijnvg
Copy link
Member

I'm also unable to reproduce this locally. The teardown of this test class has also changed recently.
I'm going to enable this test again with trace logging enabled for the geoip package.

@martijnvg
Copy link
Member

The test has been enabled for some time now and hasn't failed. I will close this issue, if for some reason this test fails again then this issue can be re-opened or a new issue can be created.

@andreidan
Copy link
Contributor

Another occurrence in 8.0 https://gradle-enterprise.elastic.co/s/4mua3tcr5q2c4

@andreidan andreidan reopened this Dec 21, 2021
@valeriy42
Copy link
Contributor

Another occurrence in 8.1 https://gradle-enterprise.elastic.co/s/c4jzxbgt6slka

@astefan
Copy link
Contributor

astefan commented Feb 24, 2022

Another failure today on master: https://gradle-enterprise.elastic.co/s/exq2xfl2mgrmy

@dakrone dakrone added >test-failure Triaged test failures from CI :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP and removed >test-failure Triaged test failures from CI :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team labels Jan 5, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Jan 5, 2023
@masseyke
Copy link
Member

This looks like #92888.

@csoulios
Copy link
Contributor

Multiple failures again today:

All failures are on Windows instances and the following error is thrown for all .mmdb databases

[2023-02-23T08:24:58,912][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s1] evicted [0] entries from cache after reloading database [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7d--vNbhSe6TuwH5iuD2Wg\GeoLite2-Country.mmdb]
[2023-02-23T08:24:58,912][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s1] deleting [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7d--vNbhSe6TuwH5iuD2Wg\GeoLite2-Country.mmdb]
[2023-02-23T08:24:58,912][DEBUG][o.e.i.g.DatabaseNodeService] [node_s1] database [GeoLite2-ASN.mmdb] no longer exists, cleaning up...
[2023-02-23T08:24:58,912][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s1] evicted [0] entries from cache after reloading database [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7d--vNbhSe6TuwH5iuD2Wg\GeoLite2-ASN.mmdb]
[2023-02-23T08:24:58,912][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s1] deleting [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7d--vNbhSe6TuwH5iuD2Wg\GeoLite2-ASN.mmdb]
[2023-02-23T08:24:58,913][DEBUG][o.e.i.g.DatabaseNodeService] [node_s1] database [GeoLite2-City.mmdb] no longer exists, cleaning up...
[2023-02-23T08:24:58,913][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s1] evicted [0] entries from cache after reloading database [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7d--vNbhSe6TuwH5iuD2Wg\GeoLite2-City.mmdb]
[2023-02-23T08:24:58,913][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s1] deleting [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7d--vNbhSe6TuwH5iuD2Wg\GeoLite2-City.mmdb]
[2023-02-23T08:24:58,914][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s2] evicted [1] entries from cache after reloading database [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7ykYStW4R6GQZKa0xJ0YxQ\GeoLite2-Country.mmdb]
[2023-02-23T08:24:58,915][INFO ][o.e.i.g.DatabaseReaderLazyLoader] [node_s2] deleting [C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7ykYStW4R6GQZKa0xJ0YxQ\GeoLite2-Country.mmdb]
[2023-02-23T08:24:58,916][ERROR][o.e.i.g.DatabaseNodeService] [node_s2] failed to clean database [GeoLite2-Country.mmdb]
java.nio.file.FileSystemException: C:\Users\jenkins\workspace\platform-support\198\modules\ingest-geoip\build\testrun\internalClusterTest\temp\geoip-databases\7ykYStW4R6GQZKa0xJ0YxQ\GeoLite2-Country.mmdb: The process cannot access the file because it is being used by another process
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) ~[?:?]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[?:?]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:108) ~[?:?]
	at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:275) ~[?:?]
	at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:105) ~[?:?]
	at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:134) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
	at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:134) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
	at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:134) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
	at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:134) ~[lucene-test-framework-9.4.2.jar:9.4.2 858d9b437047a577fa9457089afff43eefa461db - jpountz - 2022-11-17 12:56:39]
	at java.nio.file.Files.delete(Files.java:1152) ~[?:?]
	at org.elasticsearch.ingest.geoip.DatabaseReaderLazyLoader.doClose(DatabaseReaderLazyLoader.java:235) ~[main/:?]
	at org.elasticsearch.ingest.geoip.DatabaseReaderLazyLoader.close(DatabaseReaderLazyLoader.java:224) ~[main/:?]
	at org.elasticsearch.ingest.geoip.DatabaseReaderLazyLoader.close(DatabaseReaderLazyLoader.java:218) ~[main/:?]
	at org.elasticsearch.ingest.geoip.DatabaseNodeService.removeStaleEntries(DatabaseNodeService.java:385) ~[main/:?]
	at org.elasticsearch.ingest.geoip.DatabaseNodeService.checkDatabases(DatabaseNodeService.java:258) ~[main/:?]
	at org.elasticsearch.ingest.geoip.DatabaseNodeService.lambda$initialize$0(DatabaseNodeService.java:161) ~[main/:?]
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListener(ClusterApplierService.java:559) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListeners(ClusterApplierService.java:545) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:504) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:428) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:154) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:850) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:257) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:223) ~[elasticsearch-8.6.3-SNAPSHOT.jar:8.6.3-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
	at java.lang.Thread.run(Thread.java:1589) ~[?:?]

@dakrone dakrone added the medium-risk An open issue or test failure that is a medium risk to future releases label Oct 12, 2023
@joegallo joegallo assigned joegallo and unassigned martijnvg Feb 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP medium-risk An open issue or test failure that is a medium risk to future releases Team:Data Management Meta label for data/management team >test-failure Triaged test failures from CI
Projects
None yet