non fatal npe in warmer #8140

Closed
jillesvangurp opened this issue Oct 17, 2014 · 7 comments

@jillesvangurp
Contributor

I've seen the stacktrace below in our integration tests a couple of times. We're starting Elasticsearch as an embedded node. The error appears to be non-fatal, since our integration tests pass anyway. I've seen this stacktrace twice in the past week but can't reproduce it reliably.

We are running our maven tests concurrently and in randomized order, so there are a lot of integration tests hitting our elasticsearch node all at once right after it starts and reports a green status.

Using elasticsearch 1.4.0 Beta1

17-10-2014T15:56:40+0200 W warmer - [test-node-gstJI] [inbot_users_v27][2] failed to load random access for [_type:usercontact]
org.elasticsearch.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException
at org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2203) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.common.cache.LocalCache.get(LocalCache.java:3937) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache.getAndLoadIfNotPresent(FixedBitSetFilterCache.java:132) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache.access$100(FixedBitSetFilterCache.java:75) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache$FixedBitSetFilterWarmer$1.run(FixedBitSetFilterCache.java:284) ~[elasticsearch-1.4.0.Beta1.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0]
at java.lang.Thread.run(Thread.java:744) [na:1.8.0]
Caused by: java.lang.NullPointerException: null
at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache$2.call(FixedBitSetFilterCache.java:157) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.index.cache.fixedbitset.FixedBitSetFilterCache$2.call(FixedBitSetFilterCache.java:132) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) ~[elasticsearch-1.4.0.Beta1.jar:na]
at org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2197) ~[elasticsearch-1.4.0.Beta1.jar:na]
... 8 common frames omitted

@clintongormley

thanks for reporting @jillesvangurp - we'll take a look

@martijnvg
Member

@jillesvangurp I think I found a small issue: there is a short window of time during which a field is still unset while the warmer that needs it is already running. To confirm this, can you share how you start the embedded node and run your tests (for example, do you wait for green status before running)? Code snippets showing how the node is brought up before the tests run would be even more helpful.
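
To illustrate the kind of window described here, the following is a minimal Java sketch and not Elasticsearch's actual code. The classes `Warmers`, `IndexService`, and `FilterCache` are hypothetical stand-ins invented for this example. The sketch assumes a listener is registered before the field it dereferences has been assigned, so a warm-up that fires in between hits a `NullPointerException`.

```java
import java.util.ArrayList;
import java.util.List;

public class WarmerWindowSketch {

    /** Hypothetical stand-in for a registry that invokes warmer listeners. */
    static class Warmers {
        private final List<Runnable> listeners = new ArrayList<>();

        void addListener(Runnable listener) {
            listeners.add(listener);
        }

        void warm() {
            for (Runnable listener : listeners) {
                listener.run();
            }
        }
    }

    /** Hypothetical stand-in for the dependency that is set late. */
    static class IndexService {
        String describe() {
            return "index service ready";
        }
    }

    /** Hypothetical cache whose listener is reachable before its dependency is set. */
    static class FilterCache {
        private volatile IndexService indexService; // null until setIndexService is called
        String lastResult;

        FilterCache(Warmers warmers) {
            // The listener can be invoked from here on, while indexService is still null.
            warmers.addListener(() -> lastResult = indexService.describe());
        }

        void setIndexService(IndexService service) {
            this.indexService = service;
        }
    }

    /** Returns true if running the warmers throws a NullPointerException. */
    static boolean warmThrowsNpe(Warmers warmers) {
        try {
            warmers.warm();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Warmers warmers = new Warmers();
        FilterCache cache = new FilterCache(warmers);

        // Warming inside the window: the field has not been assigned yet.
        System.out.println("NPE inside the window: " + warmThrowsNpe(warmers));

        // Warming after the field is assigned.
        cache.setIndexService(new IndexService());
        System.out.println("NPE after the field is set: " + warmThrowsNpe(warmers));
    }
}
```

In a real node the warm-up would run on another thread, which is why such a failure would only show up occasionally; the sketch triggers it deterministically by calling `warm()` by hand.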

@jillesvangurp
Contributor Author

Sure no problem:

            String defaultIndexDirectory = "target/data-"+UUID.randomUUID().toString();
            String indexDir = config.getString("estestserver.indexdir",defaultIndexDirectory);
            String logDir = config.getString("estestserver.logdir",defaultIndexDirectory + "/logs");
            String esPort = config.getString("estestserver.port","9299");
            File file = new File(defaultIndexDirectory);
            file.mkdirs();
            LOG.info("using " + file.getAbsolutePath() + " for es data and logs");
            Settings settings = ImmutableSettings.settingsBuilder()
                    .put("name", "test-node-"+RandomStringUtils.randomAlphabetic(5))
                    .put("cluster.name", "linko-dev-cluster-"+RandomStringUtils.randomAlphabetic(5))
                    .put("index.gateway.type", "none")
                    .put("gateway.type", "none")
                    .put("discovery.zen.ping.multicast.ping.enabled", "false")
                    .put("discovery.zen.ping.multicast.enabled", "false")
                    .put("path.data", indexDir)
                    .put("path.logs", logDir)
                    .put("foreground", "true")
                    .put("http.port", esPort)
                    .put("http.cors.enabled", "true")
                    .put("http.cors.allow-origin","/https?:\\/\\/(localhost|kibana.*\\.linko\\.io)(:[0-9]+)?/")
                    .build();

            LOG.info(settings.toDelimitedString(';'));

            NodeBuilder nodeBuilder = NodeBuilder.nodeBuilder()
                    .settings(settings)
                    .loadConfigSettings(false);
            node = nodeBuilder
                    .build();


            // register a shutdown hook
            Runtime.getRuntime().addShutdownHook(new Thread() {
                @Override
                public void run() {
                    node.close();
                }
            });
            node.start();

            // wait until the shards are ready
            node.client().admin().cluster().prepareHealth().setWaitForGreenStatus().execute().actionGet();

@jillesvangurp
Contributor Author

One additional bit of information that I just realized may be relevant here: we have a parent-child relation between user and usercontact. So the exception is happening while it is doing something with the child type. For reference, here's a gist with the full mapping for the index: https://gist.github.com/jillesvangurp/d0cd29573b876f9cc4d3

@jillesvangurp
Contributor Author

Also, we're using TestNG and Surefire. The Elasticsearch node is started in a @BeforeSuite. We use a very large thread count and randomized order to surface any issues related to inter-test dependencies and the stability of our system. This pretty much means all our integration test classes start at the same time. We generally use randomized test data, and there are a lot of calls to /_refresh to ensure indices are committed in each test.

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
        <parallel>classes</parallel>
        <threadCount>50</threadCount>
        <runOrder>random</runOrder>
        <argLine>-Xms1024m -Xmx2048m</argLine>
        <properties>
            <property>
                <name>listener</name>
                <value>io.linko.ng.testutil.TestProgressLogger</value>
            </property>
        </properties>
    </configuration>
</plugin>

@martijnvg
Member

Thanks for the provided information @jillesvangurp

I opened #8168 for this issue. Are you able to verify in your test infrastructure that the non-fatal NPE no longer occurs with this fix?

@jillesvangurp
Contributor Author

No problem. Unfortunately, most of our builds don't trigger this exception, so it is a bit hard for me to confirm. I've only spotted it twice out of dozens of test runs over the past week. If you issue another beta, I'll be able to depend on that at least.

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Oct 22, 2014
… prevent possible NPE.

The IndicesWarmer gets set before the InternalIndexService gets set, which can lead to a small time window where InternalIndexService isn't set

Closes elastic#8140
Closes elastic#8168
martijnvg added a commit that referenced this issue Oct 22, 2014
… prevent possible NPE.

The IndicesWarmer gets set before the InternalIndexService gets set, which can lead to a small time window where InternalIndexService isn't set

Closes #8140
Closes #8168
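
One way to close an ordering window like the one named in the commit message is to expose the listener only after the dependency it needs has been assigned. The Java sketch below shows that ordering under stated assumptions: `Warmers`, `IndexService`, and `FilterCache` are hypothetical names made up for the example, and nothing here is taken from the actual change in #8168.

```java
import java.util.ArrayList;
import java.util.List;

public class WarmerOrderingSketch {

    /** Hypothetical stand-in for a registry that invokes warmer listeners. */
    static class Warmers {
        private final List<Runnable> listeners = new ArrayList<>();

        void addListener(Runnable listener) {
            listeners.add(listener);
        }

        void warm() {
            for (Runnable listener : listeners) {
                listener.run();
            }
        }
    }

    /** Hypothetical stand-in for the dependency the listener needs. */
    static class IndexService {
        String describe() {
            return "index service ready";
        }
    }

    /** Hypothetical cache that registers its listener only once it is fully wired. */
    static class FilterCache {
        private final Warmers warmers;
        private volatile IndexService indexService;
        int warmedCount;

        FilterCache(Warmers warmers) {
            // No listener is registered here, so nothing can call back too early.
            this.warmers = warmers;
        }

        void setIndexService(IndexService service) {
            this.indexService = service;        // 1. assign the dependency
            warmers.addListener(this::warmUp);  // 2. only then expose the listener
        }

        private void warmUp() {
            indexService.describe(); // indexService was assigned before registration
            warmedCount++;
        }
    }

    public static void main(String[] args) {
        Warmers warmers = new Warmers();
        FilterCache cache = new FilterCache(warmers);

        warmers.warm(); // no listener registered yet, so nothing runs
        cache.setIndexService(new IndexService());
        warmers.warm(); // the listener runs with its dependency in place

        System.out.println("warm-ups executed: " + cache.warmedCount);
    }
}
```

The trade-off in this sketch is that a warm-up requested before wiring completes is skipped rather than failing, which is acceptable only if a later warm-up is expected to cover it.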