Add checks in term and terms queries that input terms are not too long #99818
Hi @romseygeek, I've created a changelog YAML for you.
Pinging @elastic/es-search (Team:Search)
…bug/long-term-queries
server/src/main/java/org/elasticsearch/index/query/AbstractQueryBuilder.java
Thanks @romseygeek for a quick fix, LGTM!
LGTM
 * @param input an input BytesRef
 * @return a String prefix
 */
public static String safeStringPrefix(BytesRef input, int prefixLength) {
Is this method intended to be used elsewhere, or is it only a helper for checkIndexableLength(...)?
In the latter case, maybe make it a private helper method?
Yep, that's a good call, will make this private.
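For readers following the thread, here is a minimal sketch of what a "safe prefix" helper of this kind might look like. It is not the code from this PR: it takes a plain byte[] instead of Lucene's BytesRef so that it runs with the JDK alone, and the class name SafePrefixSketch and the hex fallback are assumptions made for illustration. The idea is that cutting UTF-8 bytes at an arbitrary length can split a multi-byte character, so the helper needs a fallback when the prefix does not decode cleanly.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; the PR's real helper works on a Lucene BytesRef.
class SafePrefixSketch {

    // Returns the first prefixLength bytes of input as a String.
    // If those bytes are not valid UTF-8 (for example, the cut lands in the
    // middle of a multi-byte character), falls back to a hex dump of the bytes.
    static String safeStringPrefix(byte[] input, int prefixLength) {
        int len = Math.min(prefixLength, input.length);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(input, 0, len))
                .toString();
        } catch (CharacterCodingException e) {
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < len; i++) {
                if (i > 0) {
                    hex.append(' ');
                }
                hex.append(String.format("%02x", input[i] & 0xff));
            }
            return hex.toString();
        }
    }

    public static void main(String[] args) {
        byte[] term = "hello world".getBytes(StandardCharsets.UTF_8);
        System.out.println(safeStringPrefix(term, 5));
    }
}
```

Keeping such a helper private, as agreed above, makes sense when its only job is to build a readable error message.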
Should this change be backported to the 8.10 branch?
Yes, good idea.
💚 Backport successful
#99818) (#99863)
Lucene indexes do not allow terms longer than 32k bytes. Any query that contains a
term exceeding this length will by definition match nothing, and can cause cluster
instability by consuming large amounts of heap. Such queries are also almost always
a user error (for example, a terms query that concatenates all its inputs into a
single string rather than splitting them into a JSON array).
This commit adds checks to the Term and Terms query builders that throw an
exception if any input term is longer than the maximum length allowed by the
Lucene IndexWriter.
Fixes #99802
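
The check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the class name TermLengthCheckSketch, the String-based signature, and the error message wording are hypothetical, and the limit is hard-coded to 32766 to mirror Lucene's IndexWriter.MAX_TERM_LENGTH so that the example runs without a Lucene dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration only; the real check lives in AbstractQueryBuilder.
class TermLengthCheckSketch {

    // Assumed to mirror Lucene's IndexWriter.MAX_TERM_LENGTH (in bytes).
    static final int MAX_TERM_LENGTH = 32766;

    // Throws if the UTF-8 encoding of the term is longer than the limit.
    static void checkIndexableLength(String term) {
        int byteLength = term.getBytes(StandardCharsets.UTF_8).length;
        if (byteLength > MAX_TERM_LENGTH) {
            // Only a short prefix goes into the message, so the error itself stays small.
            // (A char-based cut may split a surrogate pair; acceptable for a sketch.)
            String prefix = term.substring(0, Math.min(term.length(), 30));
            throw new IllegalArgumentException(
                "term starting with [" + prefix + "] is " + byteLength
                    + " bytes long, which exceeds the maximum of " + MAX_TERM_LENGTH);
        }
    }

    public static void main(String[] args) {
        // Intended usage: each value is its own term, so each one is short.
        List<String> ids = List.of("id-1", "id-2", "id-3");
        ids.forEach(TermLengthCheckSketch::checkIndexableLength);

        // The user error from the description: every value joined into one giant term.
        String concatenated = "id-12345,".repeat(10_000);
        try {
            checkIndexableLength(concatenated);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast in the query builder means the oversized term is rejected before it is turned into a query that consumes heap and cannot match anything.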