Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix concurrency bug in AbstractStringScriptFieldAutomatonQuery #106678

Merged
merged 6 commits into from Mar 26, 2024

Conversation

javanna
Copy link
Member

@javanna javanna commented Mar 22, 2024

Back when we introduced queries against runtime fields, Elasticsearch did not support
inter-segment concurrency yet. At the time, it was fine to assume that segments will be
searched sequentially. AbstractStringScriptFieldAutomatonQuery used to have a BytesRefBuilder
instance shared across the segments, which gets re-initialized when each segment starts its work.
This is no longer possible with inter-segment concurrency.

Closes #105911

@javanna javanna added >bug :Search/Search Search-related issues that do not fall into other categories v8.13.1 v8.14.0 labels Mar 22, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Mar 22, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @javanna, I've created a changelog YAML for you.

javanna and others added 3 commits March 25, 2024 18:01
Back when we introduced queries against runtime fields, Elasticsearch did not support
inter-segment concurrency yet. At the time, it was fine to assume that segments will be
searched sequentially. AbstractStringScriptFieldAutomatonQuery used to have a BytesRefBuilder
instance shared across the segments, which gets re-initialized when each segment starts its work.
This is no longer possible with inter-segment concurrency.

Closes elastic#105911
@javanna javanna force-pushed the fix/runtime_fields_automata_concurrency branch from 90f73db to 870faf2 Compare March 25, 2024 17:03
@javanna javanna force-pushed the fix/runtime_fields_automata_concurrency branch from 870faf2 to 4e3c3d2 Compare March 25, 2024 20:22
@javanna javanna requested a review from nik9000 March 25, 2024 20:22
assertFalse(query.matches(List.of("faaa"), scratch));
}

public void testConcurrentMatches() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These concurrency test are kind of artificial now that they rely on the different matches method... we could rely on integration tests perhaps instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you built a Scorer or something that could work. That's kind of integration-y. You'd need to make a lucene index, but sometimes that's life.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I later realized that we already have coverage for all of this in our script field type tests. Only, those did not leverage concurrency as we never ended up running them against multiple segments. I have added an addDocument method to the base class that more aggressively flushes, this is ok as we are only ever adding a few documents. The random flush within RandomIndexWriter needs a minimum number of 10 docs which is never reached in these tests.
With this adjustment, I was able to reproduce the problem and ensure that it is now fixed.

assertFalse(query.matches(List.of("faaa"), scratch));
}

public void testConcurrentMatches() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you built a Scorer or something that could work. That's kind of integration-y. You'd need to make a lucene index, but sometimes that's life.

@@ -64,6 +65,13 @@ public abstract class AbstractScriptFieldTypeTestCase extends MapperServiceTestC

protected abstract String typeName();

protected static <T extends IndexableField> void addDocument(RandomIndexWriter iw, Iterable<T> indexableFields) throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you stick a javadoc on this so a reader can mouse over the method and get a quick sense of what it does?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm full of good ideas. And bad ones.

@javanna javanna added the auto-backport Automatically create backport pull requests when merged label Mar 26, 2024
@javanna javanna merged commit 08d7542 into elastic:main Mar 26, 2024
14 checks passed
@javanna javanna deleted the fix/runtime_fields_automata_concurrency branch March 26, 2024 15:35
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.13

javanna added a commit that referenced this pull request Mar 26, 2024
Back when we introduced queries against runtime fields, Elasticsearch did not support
inter-segment concurrency yet. At the time, it was fine to assume that segments will be
searched sequentially. AbstractStringScriptFieldAutomatonQuery used to have a BytesRefBuilder
instance shared across the segments, which gets re-initialized when each segment starts its work.
This is no longer possible with inter-segment concurrency.

Closes #105911
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.13.1 v8.14.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data race condition in automaton queries
3 participants