
Fix QueryContext race condition #13049

Merged (1 commit) Sep 12, 2022

Conversation

paul-rogers
Contributor

@paul-rogers paul-rogers commented Sep 7, 2022

Fixes #12955, #12739.

Ensures that JDBC makes a copy of the QueryContext stored on a connection when creating statements. Doing so avoids race conditions which produced flaky results in DruidAvaticaHandlerTest.testConcurrentQueries().

An earlier version of this PR first took a hack approach. @gianm correctly pointed out that the problem is really with a flaw in JDBC. A second version explored the alternative of replacing QueryContext. That is a good idea, and expands on work that @FrankChen021 started in #13022, but was overkill for the specific problem in question. The refactoring was moved to a new PR, #13071. In the end, the simplest fix was good enough, and is safe, so this PR presents just that minimal fix.

The fix was verified by running testConcurrentQueries() many times in a loop until it failed due to an OOM. That OOM is a separate issue to be addressed elsewhere.


This PR has:

  • been self-reviewed.
  • been tested in a test Druid cluster.

@imply-cheddar
Contributor

What was the stack trace that this is now fixing? I'm curious as to the source of the NPE...

I say that because races inside of that if should still set the value to something non-null, so a race there shouldn't generate an NPE. I just noticed that mergedParams is not marked volatile, so it could actually be a visibility issue across cores, since a non-volatile update wouldn't force a cache-line update. Just changing the if clause to return merged (something guaranteed to be non-null) instead of falling through to the final return mergedParams would probably achieve the same result. That said, the better fix for this would be to make mergedParams volatile given that it is changing (or, even better, make it an AtomicReference<Map<>> to make it super explicit that we expect multiple threads).

A perhaps even nicer fix would be a DelegatingMap class or something like that which just wraps the chain of Maps and does the resolution at read time instead. In that world we could create it as a final at construction time and just reuse that. But, if such a class doesn't already exist, creating it and proving that it is correct is a lot more difficult than just switching to an AtomicReference.
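The two suggestions above (return the locally built map rather than re-reading the field, and hold the cache in an AtomicReference) can be illustrated with a minimal sketch. ContextSketch and its fields are hypothetical names, not the actual Druid class, and note that the underlying parameter maps here would still need their own protection:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a lazily merged parameter cache.
// AtomicReference makes the cross-thread visibility explicit; null means "stale".
class ContextSketch {
    private final Map<String, Object> defaultParams = new HashMap<>();
    private final Map<String, Object> userParams = new HashMap<>();

    private final AtomicReference<Map<String, Object>> mergedParams =
        new AtomicReference<>();

    synchronized void addUserParam(String key, Object value) {
        userParams.put(key, value);
        mergedParams.set(null); // invalidate the cached merge
    }

    Map<String, Object> getMergedParams() {
        Map<String, Object> merged = mergedParams.get();
        if (merged == null) {
            merged = new HashMap<>(defaultParams);
            merged.putAll(userParams);
            mergedParams.set(merged);
        }
        // Return the locally built map, which is guaranteed non-null, instead of
        // re-reading the field (another thread may have nulled it out meanwhile).
        return merged;
    }
}
```

This narrows the NPE window but, as discussed in this thread, does not make the class fully thread-safe: a mutator can still run concurrently with the merge itself.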

@gianm
Contributor

gianm commented Sep 8, 2022

This enhancement is a back-door that converts QueryContext from immutable to mutable.

QueryContext is definitely mutable even without that: it's got a bunch of methods for adding and removing context parameters. They each call invalidateMergedParams(), which sets mergedParams to null. It's believable that this is what leads to the NPEs: some thread is calling getMergedParams() while some other thread is calling one of the mutator methods. It's also believable it's due to some weirdness due to mergedParams not being volatile, and two threads entering getMergedParams() at the same time even though there are no mutations going on.

Either way, it feels sketchy to patch over it by synchronizing invalidateMergedParams() alone. In particular, there's still two remaining areas where callers need to worry about concurrency safety:

  • nothing intrinsic to QueryContext guarantees that mutators like removeUserParam can run safely concurrently with getMergedParams; it's up to the caller to avoid doing this
  • nothing intrinsic to QueryContext guarantees that there is a happens-before relationship between mutations and calls to getMergedParams; it's up to the caller to ensure this is set up properly

For these reasons the change in this patch feels like it's halfway between two good solutions. In this patch, QueryContext takes on some responsibility for concurrency safety, where previously it had none. But it still leaves a lot up to the caller. I'd rather either leave it all up to the caller, which means dropping this change in favor of one that prevents callers from using the context in such a way, or leave none of it to the caller, which means making QueryContext fully concurrency-safe.

@imply-cheddar
Contributor

Oh man, you are right, QueryContext went and became mutable but has basically zero controls for thread-safety. Even the maps underlying each of the different types of parameters are not thread-safe currently. So it's not just about the creation of the merged map but the general thread-safety of QueryContext itself.

Why did we change it to be mutable in the first place? With every mutation, if we are invalidating the merged map, we are either

  1. Creating a world where we will rebuild the merged map over and over and over again as we mutate and then read and then mutate and then read, OR
  2. The code is set up to do all of the mutations necessary in one place and then is attempting to cascade that down.

In either case, it's actually no different than having an immutable object that effectively copies itself every time a "mutation" happens. For the former, we already end up copying the entire object to rebuild the merged map. For the latter case, if the mutations are all bundled together in one place, it's pretty easy to build that in a Map first and then pass into a brand new QueryContext.

I'm wondering if we shouldn't move QueryContext back to being an immutable object?
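The immutable alternative floated here can be sketched as follows; ImmutableContextSketch and its method names are hypothetical stand-ins, not an actual Druid class:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the immutable design: every "mutation" returns a fresh object,
// so sharing a context across threads is trivially safe.
final class ImmutableContextSketch {
    private final Map<String, Object> params;

    ImmutableContextSketch(Map<String, Object> params) {
        this.params = Map.copyOf(params); // defensive, immutable snapshot
    }

    // Copy-on-write: callers batch their overrides into one map, then derive
    // a new context in a single step, as the comment above suggests.
    ImmutableContextSketch withOverrides(Map<String, Object> overrides) {
        Map<String, Object> merged = new HashMap<>(params);
        merged.putAll(overrides);
        return new ImmutableContextSketch(merged);
    }

    Object get(String key) {
        return params.get(key);
    }
}
```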

@gianm
Contributor

gianm commented Sep 9, 2022

What was the stack trace that this is now fixing? I'm curious as to the source of the NPE...

I saw this one when running the test locally in repeat-until-failure mode:

2022-09-09T14:52:33,089 ERROR [qtp857100926-8122] org.apache.druid.sql.avatica.DruidMeta - Cannot invoke "java.util.Map.get(Object)" because "context" is null
java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "context" is null
	at org.apache.druid.query.QueryContexts.parseInt(QueryContexts.java:482) ~[classes/:?]
	at org.apache.druid.query.QueryContexts.getInSubQueryThreshold(QueryContexts.java:357) ~[classes/:?]
	at org.apache.druid.query.QueryContexts.getInSubQueryThreshold(QueryContexts.java:352) ~[classes/:?]
	at org.apache.druid.sql.calcite.planner.PlannerFactory.buildFrameworkConfig(PlannerFactory.java:149) ~[classes/:?]
	at org.apache.druid.sql.calcite.planner.PlannerFactory.createPlanner(PlannerFactory.java:113) ~[classes/:?]
	at org.apache.druid.sql.DirectStatement.plan(DirectStatement.java:206) ~[classes/:?]
	at org.apache.druid.sql.DirectStatement.execute(DirectStatement.java:181) ~[classes/:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]

The trace is confusing because it happens in a separate thread and is thrown without a wrapper. I added a wrapper, ran the tests until failure again, and got:

2022-09-09T14:58:04,679 ERROR [qtp1261714285-10131] org.apache.druid.sql.avatica.DruidMeta - Failed to execute statement
java.lang.RuntimeException: Failed to execute statement
	at org.apache.druid.sql.avatica.DruidJdbcResultSet.execute(DruidJdbcResultSet.java:121) ~[classes/:?]
	at org.apache.druid.sql.avatica.DruidJdbcStatement.execute(DruidJdbcStatement.java:58) ~[classes/:?]
	at org.apache.druid.sql.avatica.DruidMeta.prepareAndExecute(DruidMeta.java:332) ~[classes/:?]
	at org.apache.calcite.avatica.remote.LocalService.apply(LocalService.java:206) ~[avatica-core-1.17.0.jar:1.17.0]
	at org.apache.calcite.avatica.remote.Service$PrepareAndExecuteRequest.accept(Service.java:927) ~[avatica-core-1.17.0.jar:1.17.0]
	at org.apache.calcite.avatica.remote.Service$PrepareAndExecuteRequest.accept(Service.java:879) ~[avatica-core-1.17.0.jar:1.17.0]
	at org.apache.calcite.avatica.remote.AbstractHandler.apply(AbstractHandler.java:94) ~[avatica-core-1.17.0.jar:1.17.0]
	at org.apache.calcite.avatica.remote.ProtobufHandler.apply(ProtobufHandler.java:46) ~[avatica-core-1.17.0.jar:1.17.0]
	at org.apache.calcite.avatica.server.AvaticaProtobufHandler.handle(AvaticaProtobufHandler.java:126) ~[avatica-server-1.17.0.jar:1.17.0]
	at org.apache.druid.sql.avatica.DruidAvaticaProtobufHandler.handle(DruidAvaticaProtobufHandler.java:61) ~[classes/:?]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.server.Server.handle(Server.java:516) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) ~[jetty-server-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) ~[jetty-io-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) ~[jetty-io-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) ~[jetty-io-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) ~[jetty-util-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) ~[jetty-util-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) ~[jetty-util-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) ~[jetty-util-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) ~[jetty-util-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) ~[jetty-util-9.4.48.v20220622.jar:9.4.48.v20220622]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) ~[jetty-util-9.4.48.v20220622.jar:9.4.48.v20220622]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "context" is null
	at org.apache.druid.query.QueryContexts.parseBoolean(QueryContexts.java:492) ~[classes/:?]
	at org.apache.druid.query.QueryContexts.isTimeBoundaryPlanningEnabled(QueryContexts.java:362) ~[classes/:?]
	at org.apache.druid.sql.calcite.run.NativeSqlEngine.feature(NativeSqlEngine.java:106) ~[classes/:?]
	at org.apache.druid.sql.calcite.planner.PlannerContext.engineHasFeature(PlannerContext.java:400) ~[classes/:?]
	at org.apache.druid.sql.calcite.rel.DruidQuery.toTimeBoundaryQuery(DruidQuery.java:858) ~[classes/:?]
	at org.apache.druid.sql.calcite.rel.DruidQuery.computeQuery(DruidQuery.java:822) ~[classes/:?]
	at org.apache.druid.sql.calcite.rel.DruidQuery.<init>(DruidQuery.java:165) ~[classes/:?]
	at org.apache.druid.sql.calcite.rel.DruidQuery.fromPartialQuery(DruidQuery.java:250) ~[classes/:?]
	at org.apache.druid.sql.calcite.rel.PartialDruidQuery.build(PartialDruidQuery.java:309) ~[classes/:?]
	at org.apache.druid.sql.calcite.rel.DruidQueryRel.toDruidQuery(DruidQueryRel.java:128) ~[classes/:?]
	at org.apache.druid.sql.calcite.rel.DruidRel.runQuery(DruidRel.java:54) ~[classes/:?]
	at org.apache.druid.sql.calcite.planner.DruidPlanner.lambda$planWithDruidConvention$3(DruidPlanner.java:481) ~[classes/:?]
	at org.apache.druid.sql.calcite.planner.PlannerResult.run(PlannerResult.java:62) ~[classes/:?]
	at org.apache.druid.sql.DirectStatement$ResultSet.run(DirectStatement.java:106) ~[classes/:?]
	at org.apache.druid.sql.DirectStatement.execute(DirectStatement.java:181) ~[classes/:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	... 1 more

@paul-rogers
Contributor Author

All good points. More fundamentally, context variables only make sense to set in the planning thread, then only read in the execution threads. It is inherently unstable to change a value in one thread concurrently with reading the context in another.

I'll add more debug code to track down the issue and enforce proper semantics.

@gianm
Contributor

gianm commented Sep 9, 2022

I think I see the problem. DruidConnection (object that represents a JDBC connection on the server side) uses a single QueryContext object to create all DruidJdbcStatements associated with that connection. The query lifecycle code associated with each statement mutates the contexts, which isn't safe because they are shared. I believe the bug was introduced in #12396, since prior to that patch, the context in DruidConnection was an ImmutableMap, and query planning copies the map rather than mutating it in-place; for example, when initializing.

Oh man, you are right, QueryContext went and became mutable but has basically zero controls for thread-safety. Even the maps underlying each of the different types of parameters are not thread-safe currently. So it's not just about the creation of the merged map but the general thread-safety of QueryContext itself.

It looks like the one in BaseQuery used to be a Map<String, Object> prior to #12396. Looking through the diff, there are some places where the old code copied that map before mutating it, and the new code writes to the map in-place. So we could keep it mutable (like the old Map was) but restore the copies in the places where they used to happen. We could also go fully immutable with a little more refactoring.
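The sharing bug diagnosed above, and the defensive-copy fix, can be sketched roughly like this. ConnectionSketch and StatementSketch are illustrative stand-ins, not the real DruidConnection/DruidJdbcStatement API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: the connection keeps an immutable snapshot of its
// context, and each statement gets an independent copy, so per-query
// mutations cannot race with other statements on the same connection.
class ConnectionSketch {
    private final Map<String, Object> context;

    ConnectionSketch(Map<String, Object> context) {
        this.context = Map.copyOf(context); // immutable snapshot
    }

    StatementSketch createStatement() {
        // Defensive copy: the statement may mutate its context freely.
        return new StatementSketch(new ConcurrentHashMap<>(context));
    }
}

class StatementSketch {
    final Map<String, Object> context;

    StatementSketch(Map<String, Object> context) {
        this.context = context;
    }
}
```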

@paul-rogers
Contributor Author

@gianm, @imply-cheddar, the mess goes deeper. The native query "engine" adjusts context values as it proceeds. But the places that do the adjusting don't know whether they are working with "default", "system" or "user" parameters: they just want to call Query.getContext(), set what they need, and write the new values back with Query.withOverriddenContext(Map<String, Object> contextOverride). getContext() returns the merged values, the code parties on top of these values, and then writes them back to the context as a new QueryContext. This loses all information about which values were defaults or system parameters.

I hate to say it, but this design is a complete muddle: it won't work as currently coded. The race condition is the least of our worries: the greater worry is that the logic itself is inconsistent.

There is somewhat the same muddle in the SQL layer that I've been wrestling with around using QueryContext with the new SqlQueryPlus class: sometimes code wants to adjust the context, but which of the three maps should it adjust?

It seems we introduced the "default" and "system" parameters to work around some issue in doing a security check on parameters? If so, then perhaps we should just solve that problem directly, which we can do in the SQL layer (if that is the only place we enforce context security.) The new rules are simple:

  • Keep track of the context the user provides. Use that for security.
  • Maintain a separate "live" context that queries can party on to their heart's content. Don't try to divide up values here, it is just a simple map.

So, the question for this group: do you know of any place, other than SQL context value security, where we need to separate the kinds of parameters?

I'll try the above approach to see if it hits any roadblocks. This approach will, unfortunately, change public APIs, so we may need to devise solutions for extensions not in the Druid code base.
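The two-map approach proposed above might look roughly like this; SqlContextSketch and its method names are hypothetical, not the actual SQL-layer API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposal: keep the user's original parameters (used only
// for authorization checks) separate from one mutable "live" map that the
// query engine can party on without tracking value provenance.
class SqlContextSketch {
    private final Map<String, Object> userParams;  // immutable: security checks
    private final Map<String, Object> liveParams;  // mutable: engine scratch space

    SqlContextSketch(Map<String, Object> userProvided, Map<String, Object> defaults) {
        this.userParams = Map.copyOf(userProvided);
        this.liveParams = new HashMap<>(defaults);
        this.liveParams.putAll(userProvided); // user values win over defaults
    }

    // Security checks see only what the user actually supplied.
    Map<String, Object> userParamsForAuthorization() {
        return userParams;
    }

    // The engine reads and writes the merged map without caring whether a
    // value started as a default, a system setting, or user input.
    Map<String, Object> liveParams() {
        return liveParams;
    }
}
```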

@paul-rogers paul-rogers changed the title Synchronize query context getMergedParams() Remove QueryContext class Sep 10, 2022
@paul-rogers
Contributor Author

Eliminated QueryContext to address the root cause of the muddle we find ourselves in. Let's let the build find issues (there will likely be some).

@paul-rogers paul-rogers changed the title Remove QueryContext class Revise QueryContext class Sep 10, 2022
@paul-rogers paul-rogers mentioned this pull request Sep 11, 2022
4 tasks
@paul-rogers paul-rogers changed the title Revise QueryContext class Fix QueryContext race condition Sep 12, 2022
@paul-rogers
Contributor Author

Revised again to fix only the race condition that @gianm identified. The refactoring was moved to a new PR, #13071. The result is a two-line change.

@gianm
Contributor

gianm commented Sep 12, 2022

This targeted fix looks good to me!

As to the bigger issue of what to do with query contexts generally — looking through the history of #12396, there seems to have been a bunch of approaches evaluated, such as in this thread: #12396 (comment). We can continue that conversation on #13071.

@gianm gianm merged commit 80b97ac into apache:master Sep 12, 2022
clintropolis pushed a commit to clintropolis/druid that referenced this pull request Oct 28, 2022
@kfaraz kfaraz added this to the 24.0.1 milestone Oct 29, 2022
kfaraz pushed a commit that referenced this pull request Oct 29, 2022
Co-authored-by: Paul Rogers <paul-rogers@users.noreply.github.com>
Successfully merging this pull request may close these issues.

FLAKY UT : DruidAvaticaProtobufHandlerTest>DruidAvaticaHandlerTest.testConcurrentQueries:766
5 participants