Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
Spel Expressions leaks classes [SPR-15460] #20020
SpelCompiler / SpelExpression can leak classes when a compiled expression reverts back to interpreted mode and then once again compiles.
Example attached, prints out the number of loaded classes when the same expression is repeatedly evaluated for just 2 types of root object (simplest case).
One possible solution might be to have the SpelCompiler "store" classes for specific bytecode and check for identical bytecode generated (excepting compiled expression classname), returning the class that matches the bytecode instead of "defining" it again.
Referenced from: commits c1edb3b
Andy Clement commented
The absolute simplest thing to do is just to use a new ChildClassLoader for each expression class. They are only used to load the bytes of the compiled expression, nothing else. This would mean when recompilation occurs and the compiledAst is replaced in the SpelExpression then there would be no other remaining live references to the old class and it is a candidate for GC. Just tried it with your test program (thanks for creating that!) and it works as expected. The number of loaded classes goes up but then comes down on a gc. It is perhaps a little wasteful on classloaders, we could switch expressions to using their own if we see them doing this flip/flop behaviour (falling backed to interpreted from compiled repeatedly) - but still we do have an existing 'slow leak' as using a single classloader for all the compiled expressions is going to grow and grow anyway.
Whether to do more than that perhaps depends on how often this situation comes up in the wild. Even with a caching mechanism as described, you aren't going to get 'full speed' all the time, it is just tackling the leak problem. On encountering a different 'environment' (in this case a change of root context object, but it could be change in the type of a variable the expression references, or similar) evaluation will fall back to interpreter mode for at least 100 evaluations. Then the next evaluation would flip to compiled mode and use the 'cached class' but then the immediately following evaluation, if using a different environment again, will flip it back to interpreted. In that setup only 1 in every 100 runs will be 'compiled speed' - effectively meaning it isn't worth using compiled mode at all - it'll possibly be going slower than just using interpreted mode all the time. The mixed mode is more intended for occasional changes to the environment the expression is being used in but that then settling down for a while.
To get optimal speed in the sample program, you would use two copies of the expression, one for each kind of input object. After 100 evaluations of each they'll each get compiled once and then fly at full speed. But obviously that puts the need to manage expressions in the users hands.
Andy Clement commented
I've committed what I think is a reasonable solution to this for now. I didn't like the idea of too many classloaders so I've made it so that classloaders are used 100 times (to load a compiled expression) before a new one is used. When the 'old one' is orphaned it enables any compiled expressions no longer being referenced from expression objects to be garbage collected. In the sample here it means you don't see it ever increasing, it will periodically drop back as GC occurs. It doesn't mean you couldn't engineer a situation that is still a problem but this is an improvement.
I'm open to doing more work here if we collect up some realistic scenarios that lead to more problems.
Andrew Rowbottom commented
I'm proposing an alternative approach for consideration only (obvious optimizations and wrapper objects ommited for brevity) ..
Now createExpressionClass() could look like:
Plenty of potential upsides and downsides in this approach..