
SearchContext maintains a size-length int[] when there are fewer than size results #4156

Closed
markelliot opened this issue Nov 12, 2013 · 2 comments

Comments

@markelliot

Reproduce with:

import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.WildcardQueryBuilder;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.NodeBuilder;
import org.elasticsearch.search.SearchHit;

// Start an embedded node and index a single document into a fresh index.
Node node = NodeBuilder.nodeBuilder().node();
Client client = node.client();

client.prepareIndex("twitter", "twitter")
    .setSource("{ \"user\" : \"kimchy\", \"post_date\" : \"2009-11-15T14:12:12\", \"message\" : \"trying out Elastic Search\" }")
    .execute()
    .get();

// Search with a size far larger than the number of matching documents.
WildcardQueryBuilder query = QueryBuilders.wildcardQuery("message", "trying");
SearchRequestBuilder request = client.prepareSearch("twitter")
        .setSearchType(SearchType.DFS_QUERY_AND_FETCH)
        .setQuery(query)
        .setSize(100000)
        .setScroll("1m");
SearchResponse response = request.execute().get();
for (SearchHit hit : response.hits()) {
    System.out.println(hit.getId());
}

client.close();
node.close();

Observe by setting a breakpoint in SearchContext#docIdsToLoad(int[], int, int): despite the brand-new index containing only one matching document, the int[] is sized at 100,000 to match the requested size.

This leads to unwanted memory pressure, and appears fixable by editing SearchService#shortcutDocIdsToLoad(SearchContext) to size the docIdsToLoad array to the actual number of results.
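A minimal sketch of the idea behind the proposed fix (the class and method below are illustrative stand-ins, not the actual SearchService internals): allocate docIdsToLoad from the actual hit count instead of the requested size.

```java
// Illustrative sketch only: sizes docIdsToLoad to the actual number
// of hits rather than the requested search size.
final class DocIdsToLoadSketch {

    static int[] shortcutDocIdsToLoad(int[] hitDocIds, int requestedSize) {
        // Before the fix: new int[requestedSize] regardless of hit count,
        // i.e. int[100000] even when there is a single hit.
        // After the fix: allocate only what is actually needed.
        int actual = Math.min(hitDocIds.length, requestedSize);
        int[] docIdsToLoad = new int[actual];
        System.arraycopy(hitDocIds, 0, docIdsToLoad, 0, actual);
        return docIdsToLoad;
    }

    public static void main(String[] args) {
        int[] hits = {42};  // a single matching document
        int[] docIdsToLoad = shortcutDocIdsToLoad(hits, 100000);
        System.out.println(docIdsToLoad.length);  // prints 1, not 100000
    }
}
```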

@markelliot
Author

I ran into an issue where an application that makes extensive use of Elasticsearch was making high-rate scroll-type searches with a size set to 100,000, but for many of the searches we were seeing only 0 or 1 results. Regardless of the result size, Elasticsearch was creating int[100000], and eventually OOM'd as a result. We've been running the patch in afb2a5a for about a week now and despite the application continuing to make the same kind of requests, we're no longer experiencing OOMs.
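For a rough sense of scale (a back-of-the-envelope sketch; the concurrency figure below is hypothetical, not measured from the application above), each int[100000] is about 400 KB, so even a modest number of concurrent requests churns through hundreds of megabytes of garbage:

```java
// Back-of-the-envelope arithmetic; the sizes and rates are illustrative.
class AllocationPressure {
    public static void main(String[] args) {
        long bytesPerArray = 100000L * 4;                // int[100000], 4-byte ints ~= 400 KB
        long concurrentSearches = 500;                   // hypothetical concurrency
        long totalBytes = bytesPerArray * concurrentSearches;
        System.out.println(totalBytes / (1024 * 1024) + " MB");  // prints 190 MB
    }
}
```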

It seems like this would help stabilize memory use. I'm new to open-source contribution and to ES, so I'm not sure what the standard process looks like, but I've filled out and submitted the forms described on the contributions page.

Any thoughts around if/when this PR would get incorporated, or something I should be doing in order to help the process move along?

@ghost ghost assigned martijnvg Nov 18, 2013
@martijnvg
Member

@markelliot Thanks for opening and fixing this issue! This looks good and I'll pull it in.

martijnvg pushed a commit that referenced this issue Nov 18, 2013
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015