[BUG] Opensearch Dashboards getting 500 internal server error while fetching search aggregation data #8115

Closed
manasvinibs opened this issue Jun 16, 2023 · 6 comments · Fixed by #8206
Labels
bug Something isn't working untriaged

Comments

@manasvinibs
Member

manasvinibs commented Jun 16, 2023

Describe the bug
The commit referenced below makes version-specific changes for 2.8.0, and it appears to be a breaking change for Dashboards versions above 2.8.0. We are currently seeing functional test failures, possibly related to it, on both Windows and Linux when running against main (3.0.0). We do not see this issue on our 2.x branch, which has not yet been bumped to 2.8.1.

#7514

To Reproduce
Steps to reproduce the behavior:

The error can be reproduced locally: load the sample flight data and open the Dashboard page.

Screenshot 2023-06-16 at 11 20 08 AM
{
	"statusCode": 500,
	"error": "Internal Server Error",
	"message": "search_phase_execution_exception: [response_handler_failure_transport_exception] Reason: java.lang.NullPointerException: Cannot invoke \"org.opensearch.common.io.stream.DelayableWriteable.asSerialized(org.opensearch.common.io.stream.Writeable$Reader, org.opensearch.common.io.stream.NamedWriteableRegistry)\" because the return value of \"org.opensearch.search.query.QuerySearchResult.aggregations()\" is null",
	"attributes": {
		"error": {
			"root_cause": [{
				"type": "response_handler_failure_transport_exception",
				"reason": "java.lang.NullPointerException: Cannot invoke \"org.opensearch.common.io.stream.DelayableWriteable.asSerialized(org.opensearch.common.io.stream.Writeable$Reader, org.opensearch.common.io.stream.NamedWriteableRegistry)\" because the return value of \"org.opensearch.search.query.QuerySearchResult.aggregations()\" is null"
			}],
			"type": "search_phase_execution_exception",
			"reason": "",
			"phase": "fetch",
			"grouped": true,
			"failed_shards": [{
				"shard": 0,
				"index": "opensearch_dashboards_sample_data_flights",
				"node": "91-QImtGTB2vg2_m2Y07HA",
				"reason": {
					"type": "response_handler_failure_transport_exception",
					"reason": "java.lang.NullPointerException: Cannot invoke \"org.opensearch.common.io.stream.DelayableWriteable.asSerialized(org.opensearch.common.io.stream.Writeable$Reader, org.opensearch.common.io.stream.NamedWriteableRegistry)\" because the return value of \"org.opensearch.search.query.QuerySearchResult.aggregations()\" is null",
					"caused_by": {
						"type": "null_pointer_exception",
						"reason": "Cannot invoke \"org.opensearch.common.io.stream.DelayableWriteable.asSerialized(org.opensearch.common.io.stream.Writeable$Reader, org.opensearch.common.io.stream.NamedWriteableRegistry)\" because the return value of \"org.opensearch.search.query.QuerySearchResult.aggregations()\" is null"
					}
				}
			}],
			"caused_by": {
				"type": "null_pointer_exception",
				"reason": "Cannot invoke \"org.opensearch.search.aggregations.InternalAggregations.getSerializedSize()\" because \"reducePhase.aggregations\" is null"
			}
		}
	}
}

Expected behavior
Dashboards should be able to consume the search pipeline API response without any issues.

Opensearch Dashboards version > 2.8.0


@manasvinibs manasvinibs added bug Something isn't working untriaged labels Jun 16, 2023
@manasvinibs manasvinibs changed the title [BUG] Opensearch Dashboards getting 500 internal server error while fetching search pipeline data [BUG] Opensearch Dashboards getting 500 internal server error while fetching search aggregation data Jun 16, 2023
@msfroh
Collaborator

msfroh commented Jun 16, 2023

What is the request being sent from OpenSearch Dashboards to OpenSearch?

This is an exception being thrown by OpenSearch in response to a specific request. With that request, it will be easier to determine why OpenSearch is throwing an exception.

@manasvinibs
Member Author

Request payload

{
	"params": {
		"index": "opensearch_dashboards_sample_data_flights",
		"body": {
			"aggs": {},
			"size": 0,
			"stored_fields": ["*"],
			"script_fields": {
				"hour_of_day": {
					"script": {
						"source": "doc['timestamp'].value.hour",
						"lang": "painless"
					}
				}
			},
			"docvalue_fields": [{
				"field": "timestamp",
				"format": "date_time"
			}],
			"_source": {
				"excludes": []
			},
			"query": {
				"bool": {
					"must": [],
					"filter": [{
						"match_all": {}
					}, {
						"match_phrase": {
							"FlightDelay": {
								"query": true
							}
						}
					}, {
						"range": {
							"timestamp": {
								"gte": "2023-06-15T17:50:18.814Z",
								"lte": "2023-06-16T17:50:18.814Z",
								"format": "strict_date_optional_time"
							}
						}
					}],
					"should": [],
					"must_not": []
				}
			}
		},
		"preference": 1686937797764
	}
}

@AMoo-Miki

@sohami Could this be related to #7514 ?

@ananzh
Member

ananzh commented Jun 21, 2023

@sohami This is the PR that breaks us and causes the issue. I built from scratch with ./gradlew run at the commit just before #7514 and at #7514 itself, and #7514 is where the errors start to appear.

@ananzh
Member

ananzh commented Jun 21, 2023

This is the error on the OpenSearch side:

[2023-06-21T17:21:20,329][INFO ][o.o.a.s.QueryPhaseResultConsumer] [runTask-0] Created ReducedQueryPhase with totalHits: 0 hits, numReducePhases: 1
[2023-06-21T17:21:20,329][WARN ][r.suppressed             ] [runTask-0] path: /opensearch_dashboards_sample_data_flights/_search, params: {ignore_unavailable=true, preference=1687368079458, index=opensearch_dashboards_sample_data_flights, timeout=30000ms, track_total_hits=true}
org.opensearch.action.search.SearchPhaseExecutionException: 
        at org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:664) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:128) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:54) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.threadpool.TaskAwareRunnable.doRun(TaskAwareRunnable.java:78) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:59) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: java.lang.NullPointerException: Cannot invoke "org.opensearch.search.aggregations.InternalAggregations.getSerializedSize()" because "reducePhase.aggregations" is null
        at org.opensearch.action.search.QueryPhaseResultConsumer.reduce(QueryPhaseResultConsumer.java:171) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:137) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:123) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
        ... 8 more
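
The trace shows getSerializedSize() being invoked on reducePhase.aggregations while it is null. Below is a minimal sketch of that failure mode next to a null guard. The classes Aggs, ReducedPhase and ReduceNullGuardSketch are hypothetical stand-ins invented for illustration, not OpenSearch types, and this is not necessarily the change that fixed the issue.

```java
// Hypothetical stand-ins for illustration only; these are not the OpenSearch classes.
final class ReduceNullGuardSketch {

    /** Stand-in for InternalAggregations; only the size accessor is modelled. */
    static final class Aggs {
        private final long serializedSize;

        Aggs(long serializedSize) {
            this.serializedSize = serializedSize;
        }

        long getSerializedSize() {
            return serializedSize;
        }
    }

    /** Stand-in for ReducedQueryPhase; aggregations is null when nothing was reduced. */
    static final class ReducedPhase {
        final Aggs aggregations;

        ReducedPhase(Aggs aggregations) {
            this.aggregations = aggregations;
        }
    }

    /** Mirrors the failing pattern: dereferences aggregations unconditionally. */
    static long unguardedSize(ReducedPhase phase) {
        return phase.aggregations.getSerializedSize();
    }

    /** Guarded variant: treats a missing aggregation result as zero bytes. */
    static long guardedSize(ReducedPhase phase) {
        return phase.aggregations == null ? 0L : phase.aggregations.getSerializedSize();
    }

    public static void main(String[] args) {
        ReducedPhase empty = new ReducedPhase(null);
        System.out.println("guarded size: " + guardedSize(empty));
        try {
            unguardedSize(empty);
        } catch (NullPointerException e) {
            System.out.println("unguarded access threw NullPointerException");
        }
    }
}
```

Whether zero is the right fallback for the circuit breaker accounting is a design question for the maintainers; the sketch only shows where the null check would sit.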

@ananzh
Member

ananzh commented Jun 21, 2023

With the log statements added below, we can see that the size of aggsList is 0:

    public SearchPhaseController.ReducedQueryPhase reduce() throws Exception {
        if (pendingMerges.hasPendingMerges()) {
            throw new AssertionError("partial reduce in-flight");
        } else if (pendingMerges.hasFailure()) {
            throw pendingMerges.getFailure();
        }

        // ensure consistent ordering
        pendingMerges.sortBuffer();

        // consuming data
        final SearchPhaseController.TopDocsStats topDocsStats = pendingMerges.consumeTopDocsStats();
        logger.info("TopDocsStats: " + topDocsStats.toString());
        final List<TopDocs> topDocsList = pendingMerges.consumeTopDocs();
        logger.info("Size of TopDocsList: " + topDocsList.size());
        final List<InternalAggregations> aggsList = pendingMerges.consumeAggs();
        logger.info("Size of AggsList: " + aggsList.size());
        long breakerSize = pendingMerges.circuitBreakerBytes;
        if (hasAggs) {
            breakerSize = pendingMerges.addEstimateAndMaybeBreak(pendingMerges.estimateRamBytesUsedForReduce(breakerSize));
        }

...
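
One way to read the log output is that hasAggs can be true while the list of per-shard aggregation results is empty, so the reduced result ends up null. The sketch below models that combination under stated assumptions: reduceSizes, breakerDelta and EmptyAggsReduceSketch are hypothetical names, per-shard results are reduced to plain sizes, and "an empty list reduces to null" is assumed rather than taken from the OpenSearch source.

```java
import java.util.List;

// Hypothetical model of the observation above; not OpenSearch code.
final class EmptyAggsReduceSketch {

    /**
     * Assumed behaviour: reducing an empty list of per-shard results yields null,
     * otherwise the sizes are summed into a single reduced size.
     */
    static Long reduceSizes(List<Long> perShardSizes) {
        if (perShardSizes.isEmpty()) {
            return null;
        }
        long total = 0L;
        for (long size : perShardSizes) {
            total += size;
        }
        return total;
    }

    /**
     * Adjusts the breaker estimate only when aggregations were requested AND
     * the reduce step actually produced a result.
     */
    static long breakerDelta(boolean hasAggs, List<Long> perShardSizes, long estimatedBytes) {
        Long reduced = reduceSizes(perShardSizes);
        if (!hasAggs || reduced == null) {
            return 0L;
        }
        return reduced - estimatedBytes;
    }

    public static void main(String[] args) {
        // hasAggs is true but no shard returned aggregation results.
        System.out.println(breakerDelta(true, List.of(), 100L));
        // Two shards returned results totalling 80 bytes against an estimate of 100.
        System.out.println(breakerDelta(true, List.of(30L, 50L), 100L));
    }
}
```

Checking the reduced result rather than only the hasAggs flag keeps the breaker adjustment from dereferencing a result that was never produced.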
