Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QueryDSL not thread-safe, Alias.$(alias.getId()) returns null #2671

Closed
hohwille opened this issue Oct 23, 2020 · 16 comments · Fixed by #2673
Closed

QueryDSL not thread-safe, Alias.$(alias.getId()) returns null #2671

hohwille opened this issue Oct 23, 2020 · 16 comments · Fixed by #2673
Assignees
Milestone

Comments

@hohwille
Copy link

I have been using QueryDSL in various projects over the years. Now I have two concrete projects that are struggling with a severe bug issue that QueryDSL:
As it seems Alias stuff has a concurrency problem leading to Alias.$ function to return null where it should not.
That bug only occurrs in complex multi-threaded batch processing situations and is not so easy to reproduce.

General code pattern of the affected query was something like this (nothing really exciting or exotic):

FooEntity foo = Alias.alias(FooEntity.class);
BarEntity bar = Alias.alias(BarEntity.class);
JPAQuery<FooEntity> query = new JPAQuery<FooEntity>(getEntityManager()).from(Alias.$(foo));
query.join(Alias.$(bar)).on(Alias.$(bar.getFoo().getId()).eq(Alias.$(foo.getId())));
...

We could even reproduce the bug locally and see in the debugger what is going on:

@jwgmeligmeyling
Copy link
Member

I am about to throw out the loading cache anyway as we’re trying to eliminate Guava as a required dependency. So I will be touching this code any time soon.

I have a question though: whats the purpose of Alias.$ in JPQL queries? Why don’t you just use the static metamodel?

@hohwille
Copy link
Author

I have a question though: whats the purpose of Alias.$ in JPQL queries? Why don’t you just use the static metamodel?

From my case I have very bad experience with such code-generation in large scale teams/projects. Whenever you pull from git you might need re-generation and that slows down your development cycle. If you do that, you should also commit the generated code into git and ensure to regenerate and commit it whenever your entities changed.
Using Alias stuff frees us from all this mess even though it is not as convenient to read and write. So more or less just different flavors with pros and cons.

@hohwille
Copy link
Author

I am about to throw out the loading cache anyway as we’re trying to eliminate Guava as a required dependency. So I will be touching this code any time soon.

Awesome. I was already wondering what amount of performance this proxyCache would actually bring. IMHO absolutely neglectable if thead-safeness is fixed instead.
This leads me to the workaround that we could bypass Alias.alias calls for entities by directly using this code:

protected <A> A createProxy(Class<A> cl, Expression<?> path) {

This way we can fix our current problems until the new QueryDSL version with fix of this issue is released.

@jwgmeligmeyling
Copy link
Member

jwgmeligmeyling commented Oct 23, 2020

From my case I have very bad experience with such code-generation in large scale teams/projects. Whenever you pull from git you might need re-generation and that slows down your development cycle.

True, although proxying adds the same overhead in runtime which is considerably bad for large applications as well 😉

If you do that, you should also commit the generated code into git and ensure to regenerate and commit it whenever your entities changed.

Ideally the metamodel is kept up-to-date using a build plugin and the generated code is not commited. Although: you could decide to commit it to not have this build time penalty.

In fact: you don't need to (re)generate the code at all, you can just program your own static metamodel.

Using Alias stuff frees us from all this mess even though it is not as convenient to read and write. So more or less just different flavors with pros and cons.

Yes I am just surprised to see this obscure API used in conjunction with JPA at all and personally I absolutely dislike the look of it. It seems it was mostly developed as a convenience counter part to querydsl-collections, but even then it has quirks. Only a few methods are actually mapped (mostly getters). And methods that return non-final types can never be mapped. Abstracting away too much from the actual query being generated really turns out to confuse our userbase that turns out to be less and less experienced with the actual dialect queried against.

@jwgmeligmeyling
Copy link
Member

jwgmeligmeyling commented Oct 23, 2020

I was already wondering what amount of performance this proxyCache would actually bring. IMHO absolutely neglectable if thead-safeness is fixed instead.

Creating new proxies costs quite a bit in terms of performance, so it makes sense that the proxies are cached. Caching by class and expressions should be fine, because expressions are immutable and equal. The problem here probably lies in the datastructure not being thread safe. Probably the easiest fix is to replace suspicious maps with their thread safe counterparts.

Alternatively we could add more state to the Aliases themselves (which is then effectively thread local as well).

I'll definitely have to look at the code in more detail to see what has to be done here specifically.

@jwgmeligmeyling
Copy link
Member

FYI the issue becomes reproducible after adding the following Rule to AliasTest (making all tests run themselves a couple of times parallel):

public static final int COUNT = 4;

@Rule
public TestRule concurrencyRule = new TestRule() {
    @Override
    public Statement apply(final Statement base, Description description) {
        return new Statement() {
            @Override
            public void evaluate() throws Throwable {
                final CountDownLatch countDownLatch = new CountDownLatch(COUNT);
                ExecutorService executorService = Executors.newFixedThreadPool(COUNT);
                try {
                    List<Future<?>> results = new ArrayList<Future<?>>(COUNT);
                    for (int i = 0; i < COUNT; i++) {
                        results.add(executorService.submit(new Callable<Object>() {
                            @Override
                            public Object call() throws Exception {
                                try {
                                    countDownLatch.countDown();
                                    countDownLatch.await();
                                    base.evaluate();
                                    return null;
                                } catch (Exception e) {
                                    throw e;
                                } catch (Throwable t) {
                                    throw new Exception(t);
                                }
                            }
                        }));
                    }

                    for (Future<?> result : results) {
                        result.get();
                    }
                } finally{
                    executorService.shutdownNow();
                }
            }
        };
    }
};

@jwgmeligmeyling
Copy link
Member

diff --git a/querydsl-core/src/main/java/com/querydsl/core/alias/AliasFactory.java b/querydsl-core/src/main/java/com/querydsl/core/alias/AliasFactory.java
index b2ee2cf3c..dac65ba4e 100644
--- a/querydsl-core/src/main/java/com/querydsl/core/alias/AliasFactory.java
+++ b/querydsl-core/src/main/java/com/querydsl/core/alias/AliasFactory.java
@@ -13,22 +13,19 @@
  */
 package com.querydsl.core.alias;

-import java.util.concurrent.ExecutionException;
-
-import javax.annotation.Nullable;
-
-import com.google.common.cache.CacheBuilder;
-import com.google.common.cache.CacheLoader;
-import com.google.common.cache.LoadingCache;
-import com.mysema.commons.lang.Pair;
 import com.querydsl.core.QueryException;
-import com.querydsl.core.types.EntityPath;
 import com.querydsl.core.types.Expression;
+import com.querydsl.core.types.Path;
 import com.querydsl.core.types.PathMetadataFactory;
-
 import net.sf.cglib.proxy.Enhancer;
 import net.sf.cglib.proxy.MethodInterceptor;

+import javax.annotation.Nullable;
+import java.util.Collections;
+import java.util.Map;
+import java.util.WeakHashMap;
+import java.util.concurrent.ConcurrentHashMap;
+
 /**
  * {@code AliasFactory} is a factory class for alias creation
  *
@@ -42,30 +39,11 @@ class AliasFactory {

     private final TypeSystem typeSystem;

-    // caches top level paths (class/var as key)
-    private final LoadingCache<Pair<Class<?>,String>, EntityPath<?>> pathCache;
-
-    private final LoadingCache<Pair<Class<?>,Expression<?>>, ManagedObject> proxyCache =
-        CacheBuilder.newBuilder().build(
-            new CacheLoader<Pair<Class<?>,Expression<?>>,ManagedObject>() {
-                @Override
-                public ManagedObject load(Pair<Class<?>, Expression<?>> input) {
-                    return (ManagedObject) createProxy(input.getFirst(), input.getSecond());
-                }
-            });
+    private final ConcurrentHashMap<Class<?>, Map<Expression<?>, ManagedObject>> proxyCache = new ConcurrentHashMap<>();

     public AliasFactory(final PathFactory pathFactory, TypeSystem typeSystem) {
         this.pathFactory = pathFactory;
         this.typeSystem = typeSystem;
-        this.pathCache = CacheBuilder.newBuilder().build(
-            new CacheLoader<Pair<Class<?>, String>, EntityPath<?>>() {
-                @Override
-                public EntityPath<?> load(Pair<Class<?>, String> input) {
-                    return (EntityPath<?>) pathFactory.createEntityPath(
-                            input.getFirst(),
-                            PathMetadataFactory.forVariable(input.getSecond()));
-                }
-            });
     }

     /**
@@ -79,8 +57,9 @@ class AliasFactory {
     @SuppressWarnings("unchecked")
     public <A> A createAliasForExpr(Class<A> cl, Expression<? extends A> expr) {
         try {
-            return (A) proxyCache.get(Pair.<Class<?>, Expression<?>>of(cl, expr));
-        } catch (ExecutionException e) {
+            final Map<Expression<?>, ManagedObject> expressionCache = proxyCache.computeIfAbsent(cl, a -> Collections.synchronizedMap(new WeakHashMap<>()));
+            return (A) expressionCache.computeIfAbsent(expr, e -> (ManagedObject) createProxy(cl, expr));
+        } catch (ClassCastException e) {
            throw new QueryException(e);
         }
     }
@@ -105,14 +84,9 @@ class AliasFactory {
      * @param var variable name for the underlying expression
      * @return alias instance
      */
-    @SuppressWarnings("unchecked")
     public <A> A createAliasForVariable(Class<A> cl, String var) {
-        try {
-            Expression<?> path = pathCache.get(Pair.<Class<?>, String>of(cl, var));
-            return (A) proxyCache.get(Pair.<Class<?>, Expression<?>>of(cl, path));
-        } catch (ExecutionException e) {
-            throw new QueryException(e);
-        }
+        final Path<A> expr = pathFactory.createEntityPath(cl, PathMetadataFactory.forVariable(var));
+        return createAliasForExpr(cl, expr);
     }

     /**

This is a working patch. I'll integrate it in the 5.0.0 release.

@hohwille
Copy link
Author

This is a working patch.

I have to disagree that this will solve our problem. As described our problem arises from the proxy-objects that themselves are not thread-safe but get shared accross threads due to the static character of Alias[Factory] and the fact that a global cache is used. Chaning the cache to ConcurrentHashMap does not affect this problem at all.
Also making the Maps inside the proxy (PropertyAccessInvocationHandler) theadsafe is IMHO not the best solution. If creating the proxy is expensive I assume this is because of its metadata that has to be produced by deep-reflection on the class. That metadata can/should IMHO be cached and reused but still every proxy with its state should be created as new object. Surely just my personal five cents on what I observed. I do not try to dictate what do ;)

@hohwille
Copy link
Author

I found a temporary workaround we will use whilst waiting for a fixed release that I would like to share in case someone else currently runs into the problem and gets blocked:
You sub-class AliasFactory with a new class, say NonCachingAliasFactory (has to be put in the same package due to visibility). Here we can override createAliasForVariable and createAliasForExpr and bypass the caches (directly make use of createProxy). Now there are two options:

  1. Use a dirty hack to replace the aliasFactory instance inside Alias class from QueryDSL (will not work when security manager is in place).
  2. Create your own replacement of Alias by more or less copying Alias to say NonCachingAlias and here use your custom NonCachingAliasFactory instead of AliasFactory.

Note: As stated by @jwgmeligmeyling bypassing the cache will have some performance implications.

@jwgmeligmeyling
Copy link
Member

Also making the Maps inside the proxy (PropertyAccessInvocationHandler) theadsafe is IMHO not the best solution.

Why? Anything the proxy returns is completely immutable and therefore ought to be thread-safe.

If creating the proxy is expensive I assume this is because of its metadata that has to be produced by deep-reflection on the class.

Creating a proxy is intensive because its a byte code generating action in the JVM. Its unrelated to the logic QueryDSL does.

As it appears that Alias.* design is intrinsically flawed, I am considering to remove it from querydsl-core and instead introduce a new artifact querydsl-alias in experimental state as we cannot support "ideas" that were never fully worked out.

@jwgmeligmeyling
Copy link
Member

I created a new patch where PropertyAccessInvocationHandler is made thread-safe as well: #2673

@hohwille
Copy link
Author

hohwille commented Nov 5, 2020

For the record: We have even done a performance-test for plain Alias.alias as currently is compared to our workaround of bypassing the cache that I want to share:
The total milliseconds elapsed by creating 1,000,000 aliases was:

  • with Alias.alias: 16,112 ms
  • with our NonCachingAlias.alias: 22,795 ms

So the difference of 6.5 seconds per 1 million created aliases is meassurable but not significat compared to everything else (actually creating the query, sending to DB, parsing, executing, etc.).

IMHO making things thread-safe by adding synchronized keywords or ConcurrentHashMap may seem the obvious answer. However, the overhead of synchronization for every access may be significantly worse.

@hohwille
Copy link
Author

hohwille commented Nov 5, 2020

Why? Anything the proxy returns is completely immutable and therefore ought to be thread-safe.

I do not want to convince anyone here. But just to aswer the question:
The alias instance will put expressions in a ThreadLocal so by design these instances and their effects are not meant to be shared by the end-users accross threads.
However, feel free to make them thread-safe. We have seen developers unfamiliar with QueryDSL who came to the idea to create a static final constant for the alias and therby share this accross threads. I teached them that this is dead wrong and leads to evil concurrency bugs. With your change even such odd things might work properly...

Anyhow, I was also referring to this post https://groups.google.com/g/querydsl/c/WPX2loA_TGk?pli=1

@hohwille
Copy link
Author

hohwille commented Nov 5, 2020

As it appears that Alias.* design is intrinsically flawed, I am considering to remove it from querydsl-core and instead introduce a new artifact querydsl-alias in experimental state as we cannot support "ideas" that were never fully worked out.

For support this might make sense from a developers and maintenance point of view. However, from the few of your users, you are breaking compatibility and later on remove a feature that was provided. I do not see why this can't be fixed easily (we have already deployed our workaround meanwhile) and it has to be "experimental". The effect may be that you alienate some of your users that relied on this feature in a really large code-base in productive apps...

@jwgmeligmeyling
Copy link
Member

We have seen developers unfamiliar with QueryDSL who came to the idea to create a static final constant for the alias and therby share this accross threads. [...] The alias instance will put expressions in a ThreadLocal so by design these instances and their effects are not meant to be shared by the end-users accross threads.

The latest computation from an alias invocation is indeed passed by a thread local variable. So invoking $(c.getMate().getName()) will fail for $(someExpr) where someExpr is computed from another thread. As long as there is a single interaction it should be fine. So $(someAliasFromAnotherThread.getName()) would actually work. I still see how this is error prone and why this use should indeed be discouraged as long as the implementation relies on a threadlocal.

I am actually not sure why this design was chosen. The expression could just as well be exposed through the proxy as a method, hereby making it completely thread safe...

I teached them that this is dead wrong and leads to evil concurrency bugs. With your change even such odd things might work properly... Anyhow, I was also referring to this post https://groups.google.com/g/querydsl/c/WPX2loA_TGk?pli=1

That thread is from 2003. The issue with the faulty ConstantImpl implementation was resolved in QueryDSL 3.0.0. As stated in that forum post: all expression types ought to be immutable and thus completely thread safe. Alias.* is just a factory for expressions with little to no need for any state. It therefore has the potential to be fully thread-safe as well.

@jwgmeligmeyling
Copy link
Member

jwgmeligmeyling commented Nov 5, 2020

Based on your remarks I've adjusted PR #2673 so that the returned value of ManagedObject#__mappedPath is always prefered over the thread local context, making it safe to constantify proxies in your application. (Mind that it won't work for value types for obvious reasons),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants