Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MB-31358: Estimation of memory needed for query with highlighting #1554

Merged
merged 2 commits into from
Feb 8, 2021

Conversation

abhinavdangeti
Copy link
Member

@abhinavdangeti abhinavdangeti commented Feb 5, 2021

  • Check this test ..
func TestEstimate(t *testing.T) {
    q := []byte(`{"from":8996,"highlight":{},"query":{"bool":false,"field":"free_parking"},"size":405}`)
    var sr *SearchRequest
    err := json.Unmarshal(q, &sr)
    if err != nil {
        t.Fatal(err)
    }

    x := MemoryNeededForSearchResult(sr)
    fmt.Println(x)
}
  • For the query, the MemoryNeededForSearchResult API claims that it
    needs .. 6363433784 bytes (that's over 6GB!).
  • Fixing the size calculation estimate when the query requests for
    "fields" to be returned or "highlighting", basing the number of
    results returned on the collector's PreAllocSizeSkipCap.
  • Also removing the factor "numDocMatches" to the document size in
    this scenario.

+ Check this test ..
```
func TestEstimate(t *testing.T) {
    q := []byte(`{"from":8996,"highlight":{},"query":{"bool":false,"field":"free_parking"},"size":405}`)
    var sr *SearchRequest
    err := json.Unmarshal(q, &sr)
    if err != nil {
        t.Fatal(err)
    }

    x := MemoryNeededForSearchResult(sr)
    fmt.Println(x)
}
```

+ For the query, the MemoryNeededForSearchResult API claims that it
  needs .. 6363433784 bytes (that's over 6GB!).
+ Fixing the size calculation estimate when the query requests for
  "fields" to be returned or "highlighting", basing the number of
  results returned on the collector's PreAllocSizeSkipCap.
+ Also removing the factor "numDocMatches" to the document size in
  this scenario.
Copy link
Contributor

@mschoch mschoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code is clearly wrong as it seems end up with something close to (size+from)^2 ...

However, I think this is wrong as well. Stored/highlight is only ever done on the final docs returned (size - from), so even numDocMatches will over estimate it.

Further, why are we adding in a loop, can we not just multiply?

I apologize in advance if I'm missing something fundamental here, the original code feels so wrong it's hard to discern the original intent...

@abhinavdangeti
Copy link
Member Author

Haha you're right.
Although stored/highlight would return just size and not (size - from).

@mschoch
Copy link
Contributor

mschoch commented Feb 6, 2021

@abhinavdangeti yeah you're right, just size.

@abhinavdangeti abhinavdangeti merged commit 36861dd into blevesearch:master Feb 8, 2021
@abhinavdangeti abhinavdangeti deleted the memEst branch February 8, 2021 15:19
@mschoch mschoch added this to the v2.0.2 milestone Feb 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants