Fix concurrency bug in AbstractStringScriptFieldAutomatonQuery #106678
Pinging @elastic/es-search (Team:Search)
Hi @javanna, I've created a changelog YAML for you.
.../src/main/java/org/elasticsearch/search/runtime/AbstractStringScriptFieldAutomatonQuery.java
Back when we introduced queries against runtime fields, Elasticsearch did not yet support inter-segment concurrency, so it was fine to assume that segments would be searched sequentially. AbstractStringScriptFieldAutomatonQuery used to share a single BytesRefBuilder instance across segments, re-initializing it as each segment started its work. That sharing is no longer safe with inter-segment concurrency. Closes elastic#105911
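To illustrate the idea behind the fix, here is a minimal, self-contained sketch rather than the actual change. It assumes a `StringBuilder` as a stand-in for `BytesRefBuilder` and a simple prefix check as a stand-in for running the automaton; the class and method names are hypothetical. The point it shows is that the scratch buffer is passed into `matches` and allocated once per segment task, so concurrent segment searches never share mutable state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the Elasticsearch implementation.
public class PerSegmentScratchSketch {

    // Stand-in for the automaton check: a value matches if it starts with the prefix.
    // The scratch buffer is supplied by the caller instead of living in a shared field.
    public static boolean matches(List<String> values, String prefix, StringBuilder scratch) {
        for (String value : values) {
            scratch.setLength(0);      // reset the caller-owned buffer
            scratch.append(value);     // copy the value into the buffer
            if (scratch.indexOf(prefix) == 0) {
                return true;
            }
        }
        return false;
    }

    // Simulates inter-segment concurrency: each "segment" is checked on a pool thread
    // and each task allocates its own scratch buffer.
    public static int countMatchingSegments(List<List<String>> segments, String prefix) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (List<String> segment : segments) {
                results.add(executor.submit(() -> matches(segment, prefix, new StringBuilder())));
            }
            int count = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    count++;
                }
            }
            return count;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> segments = List.of(List.of("foo"), List.of("faaa"), List.of("bar", "fob"));
        System.out.println(countMatchingSegments(segments, "fo"));
    }
}
```

Had the buffer been a field shared by all tasks, one thread could reset or overwrite it while another was still reading it, which is the kind of race described above.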
90f73db
to
870faf2
Compare
870faf2
to
4e3c3d2
Compare
assertFalse(query.matches(List.of("faaa"), scratch));
}

public void testConcurrentMatches() {
These concurrency tests are kind of artificial now that they rely on the different matches method... perhaps we could rely on integration tests instead.
If you built a Scorer or something, that could work. That's kind of integration-y. You'd need to make a Lucene index, but sometimes that's life.
Yep, I later realized that we already have coverage for all of this in our script field type tests. Those tests just never exercised concurrency, because we never ended up running them against multiple segments. I have added an addDocument method to the base class that flushes more aggressively; this is OK as we only ever add a few documents. The random flush within RandomIndexWriter needs a minimum of 10 docs, which is never reached in these tests.
With this adjustment, I was able to reproduce the problem and ensure that it is now fixed.
@@ -64,6 +65,13 @@ public abstract class AbstractScriptFieldTypeTestCase extends MapperServiceTestC

protected abstract String typeName();

protected static <T extends IndexableField> void addDocument(RandomIndexWriter iw, Iterable<T> indexableFields) throws IOException {
Could you stick a javadoc on this so a reader can mouse over the method and get a quick sense of what it does?
good idea
I'm full of good ideas. And bad ones.
💚 Backport successful