
Memory Leak in PredictionContextCache #499

Closed · mikedehaan opened this issue Mar 20, 2014 · 20 comments

@mikedehaan

After using an ANTLR-generated parser to process several files, the JVM eventually throws an out-of-memory exception. Analyzing the heap shows the PredictionContextCache containing a HashMap with over 1 million items. There needs to be some (hopefully thread-safe) way to clear this cache between runs.

https://github.com/antlr/antlr4/blob/master/runtime/Java/src/org/antlr/v4/runtime/atn/PredictionContextCache.java#L40-66

@mikedehaan
Author

After a bit more research: this issue is caused by two variables that are declared static in the generated Java parser. Changing them to instance variables (and moving the initialization of "decisionToDFA" to the constructor) has solved my issue.

protected final DFA[] _decisionToDFA;
protected final PredictionContextCache _sharedContextCache =
    new PredictionContextCache();

@sharwell
Member

That change would result in an enormous (negative) performance impact when parsing large numbers of files. If you don't want to use the shared PredictionContextCache, you need to initialize your parser with your own.

// keep the globally shared DFA, but give this parser its own context cache
MyParser parser = new MyParser(tokens);
DFA[] decisionToDFA = parser.getInterpreter().decisionToDFA;
parser.setInterpreter(new ParserATNSimulator(
  parser, parser.getATN(), decisionToDFA, new PredictionContextCache()));
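
The generated lexer has the same pair of static caches, so the equivalent wiring on the lexer side would look like this (a sketch under the same assumptions; MyLexer stands in for the generated lexer):

// keep the shared lexer DFA, but use a private context cache
MyLexer lexer = new MyLexer(input);
DFA[] lexerDecisionToDFA = lexer.getInterpreter().decisionToDFA;
lexer.setInterpreter(new LexerATNSimulator(
  lexer, lexer.getATN(), lexerDecisionToDFA, new PredictionContextCache()));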

@sharwell sharwell self-assigned this Mar 20, 2014
@sharwell sharwell added this to the ANTLR 4.2.1 milestone Mar 20, 2014
@parrt
Member

parrt commented Mar 20, 2014

Sam, should we make a function that wipes the DFA cache? Otherwise it grows forever in a long-running server. It'll simply adapt to new input.
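
For reference, the runtime later gained exactly such a hook: ATNSimulator.clearDFA() (since 4.3), overridden by both the parser and lexer simulators. A minimal usage sketch, assuming no other thread is parsing with the same recognizers while it runs:

// drop every cached DFA state; the caches are rebuilt lazily on the next parse
parser.getInterpreter().clearDFA();
lexer.getInterpreter().clearDFA();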

@sharwell
Member

I'm planning to modify the caching mechanisms in a later release. Changing it now would just increase the chances that the future change will break users' applications. Considering that a workaround already exists for server applications and the like, I don't think there's any need to do anything else at this point.

@mikedehaan
Author

Given that I can provide my own implementation of the cache, would it cause a problem if items disappear from the cache? In other words, what happens when get() returns null?

Can I write a SoftReference cache?
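
For illustration, here is a minimal sketch of a self-evicting cache written against the 4.x runtime API of the time. LruPredictionContextCache is a hypothetical name, and the sketch assumes the runtime tolerates get() returning null: on a miss the context is rebuilt and re-added, so eviction should cost speed rather than correctness.

import java.util.LinkedHashMap;
import java.util.Map;

import org.antlr.v4.runtime.atn.PredictionContext;
import org.antlr.v4.runtime.atn.PredictionContextCache;

// Hypothetical LRU variant: the oldest entries are evicted past maxEntries.
// Like the stock cache, this is not thread-safe; use one per parser.
public class LruPredictionContextCache extends PredictionContextCache {
    private final Map<PredictionContext, PredictionContext> lru;

    public LruPredictionContextCache(final int maxEntries) {
        // access-order LinkedHashMap gives us LRU eviction for free
        lru = new LinkedHashMap<PredictionContext, PredictionContext>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(
                    Map.Entry<PredictionContext, PredictionContext> eldest) {
                return size() > maxEntries;
            }
        };
    }

    @Override
    public PredictionContext add(PredictionContext ctx) {
        if (ctx == PredictionContext.EMPTY) return ctx; // mirror the stock cache
        PredictionContext existing = lru.get(ctx);
        if (existing != null) return existing;
        lru.put(ctx, ctx);
        return ctx;
    }

    @Override
    public PredictionContext get(PredictionContext ctx) {
        return lru.get(ctx); // null after eviction; callers re-add
    }

    @Override
    public int size() {
        return lru.size();
    }
}

Such a cache would be plugged in via the same setInterpreter() workaround shown above.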

@mikedehaan
Author

I tried your suggested workaround; however, I'm still getting an out-of-memory exception. I haven't been able to track down exactly why, but I think it has something to do with the DFA array ("decisionToDFA"). The problem with the PredictionContextCache goes away when using my own, but something else is still running unchecked, causing the DFA array to grow.

I'm going to have to stick with making those variables instance rather than static for the time being. I may lose performance, but I won't crash.

Please consider re-opening this issue unless there's something else with your workaround that I'm missing.
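
For comparison, making those members per-instance amounts to the following per-parse setup against the public runtime API (a sketch; MyParser stands in for the generated parser):

// fresh DFA array and context cache for each parse: nothing accumulates
// across files, at the cost of re-learning the DFA every time
MyParser parser = new MyParser(tokens);
ATN atn = parser.getATN();
DFA[] freshDecisionToDFA = new DFA[atn.getNumberOfDecisions()];
for (int i = 0; i < freshDecisionToDFA.length; i++) {
    freshDecisionToDFA[i] = new DFA(atn.getDecisionState(i), i);
}
parser.setInterpreter(new ParserATNSimulator(
    parser, atn, freshDecisionToDFA, new PredictionContextCache()));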

mikedehaan pushed a commit to antlrjavaparser/antlr-java-parser that referenced this issue Jul 23, 2014
The ANTLR team plans to enhance the caching mechanism in the future.
In the meantime, this workaround cleans the cache on each
instantiation. There may be performance implications, but that is certainly
better than crashing.

antlr/antlr4#499
jespersm added a commit to jespersm/groovy that referenced this issue May 1, 2016
@daniellansun

@parrt @sharwell
Hi Terence, Sam
Is it possible to make DFAState.edges hold SoftReference instances and give PredictionContextCache some cache eviction policy (e.g., LRU)?

@sharwell
Member

@danielsun1106 It would be difficult to use a cache eviction policy for PredictionContextCache that only partially clears it. One purpose of the cache is to reduce the number of times these instances get referenced.

The use of SoftReference instances in DFAState.edges would result in a 30-byte memory overhead per edge in the DFA relative to what we have now (54 bytes on 64-bit JVMs where compressed OOPs are not enabled).

@daniellansun

IMO, the increased memory usage from SoftReference would be more acceptable than an OutOfMemoryError :)
As for the cache eviction policy, I tried replacing the default cache with a Guava-based LRU cache while developing the Groovy parser some months ago, but performance did not improve much...

@sharwell
Sam, could you help me confirm whether a PredictionContextCache instance holds any DFA data? In other words, is there any relationship between PredictionContextCache and the DFA? There are two causes of OutOfMemoryError here (the PredictionContextCache and the DFA cache); would any problem arise if we resolved only one of the two? Thanks in advance.

@sharwell
Member

sharwell commented Sep 27, 2016

@danielsun1106 There can easily be hundreds of thousands or even millions of DFAState instances, so the difference could be hundreds of MiB in a running application. A PredictionContext is a key element of an ATNConfig instance, and a DFAState has one or more ATNConfig in it (aside from DFAState.EMPTY). These instances are required for the ability of the parser to resume DFA construction from a point where it previously left off. Without them, if a missing edge is encountered during lookahead, the prediction algorithm would need to rewind the input all the way to the start of the current prediction and recreate all the edges leading to the missing edge.

The PredictionContextCache caches PredictionContext instances to eliminate duplicate instances in the DFA. In the reference release, this cache is a close approximation to the optimal elimination of duplicates. In my optimized fork it is a fully-reduced (optimal) graph.

For reference, are you using the reference release or my fork for your evaluation? The most substantial work in my fork is on memory reduction. It's not uncommon to see 100:1 ratios in memory requirements between the two for large grammars. In many parsing scenarios the memory requirements are so small it doesn't make much difference, but it sounds like you may fall into the group of known exceptions.

@daniellansun

daniellansun commented Sep 27, 2016

@sharwell
While developing the Groovy parser, I used both versions of ANTLR 4 (the reference release and your optimized release); both hit OutOfMemoryError if we do not clear or recreate the cache.

When we apply the two-stage parsing strategy, the reference release hangs, but the optimized release is quite efficient, which is very impressive to me.

It's a pity that with the optimized release we have to deserialize the ATN string to avoid OutOfMemoryError, by calling: new ATNDeserializer().deserialize(GroovyLangLexer._serializedATN.toCharArray())

As a result, the performance of the optimized release (with the two-stage parsing strategy) is almost the same as that of the reference release (without it). I wish the optimized release would provide constructors like the following ones in the future; then we could create the decisionToDFA array from the deserialized ATN and manage the PredictionContextCache ourselves.

public ParserATNSimulator(Parser parser, ATN atn, DFA[] decisionToDFA, PredictionContextCache sharedContextCache)
public LexerATNSimulator(Lexer recog, ATN atn, DFA[] decisionToDFA, PredictionContextCache sharedContextCache)

As for memory usage, how about wrapping the DFA in a DfaWrapper? The number of DFA instances is not very large (about _ATN.getNumberOfDecisions()). Reference code is shown as follows:
PredictionContextLruCache:
daniellansun/groovy-parser@1b1eb48
DfaWrapper (depends on encapsulating the DFA's public fields behind getter and setter methods):

import java.lang.ref.SoftReference;
import java.util.List;

import org.antlr.v4.runtime.atn.ATN;
import org.antlr.v4.runtime.dfa.DFA;
import org.antlr.v4.runtime.dfa.DFAState;

/**
 * The rationale of DfaWrapper is to use a SoftReference so the DFA cache
 * cannot grow forever: if the wrapped DFA instance is GC'ed, we recreate
 * it on demand.
 */
class DfaWrapper extends DFA {
      private volatile SoftReference<DFA> dfaSR;
      private final ATN atn;
      private final int decision;

      public DfaWrapper(ATN atn, int decision) {
             super(atn.getDecisionState(decision), decision);
             this.atn = atn;
             this.decision = decision;
             this.dfaSR = new SoftReference<>(new DFA(atn.getDecisionState(decision), decision));
      }

      public DFA getDFA() {
              DFA dfa = dfaSR.get();
              if (dfa != null) return dfa;

              synchronized (this) {
                   // re-check under the lock; recreate the DFA if it was collected
                   dfa = dfaSR.get();
                   if (dfa == null) {
                            dfa = new DFA(atn.getDecisionState(decision), decision);
                            dfaSR = new SoftReference<>(dfa);
                   }
                   return dfa;
              }
      }

      // delegate DfaWrapper's methods to the wrapped DFA's methods
      public List<DFAState> getStates() {
            return this.getDFA().getStates();
      }
      // ...
}

// in the generated GroovyParser: create the _decisionToDFA array
    static {
            _decisionToDFA = new DfaWrapper[_ATN.getNumberOfDecisions()];
            for (int i = 0; i < _ATN.getNumberOfDecisions(); i++) {
                    _decisionToDFA[i] = new DfaWrapper(_ATN, i);
            }
    }

P.S.: The brand-new Groovy parser repository is hosted at https://github.com/danielsun1106/groovy-parser
(Currently I am using the reference release and plan to migrate to the optimized release later for better performance.)

@hsyuan

hsyuan commented Jul 5, 2020

@parrt @sharwell
We have consistently seen the generated parser swallow a lot of memory in our database system because of the global caches in DFA.states and PredictionContextCache.

public class XXXParser extends Parser {
	static { RuntimeMetaData.checkVersion("4.5.3", RuntimeMetaData.VERSION); }

	protected static final DFA[] _decisionToDFA;
	protected static final PredictionContextCache _sharedContextCache =
		new PredictionContextCache();
...
}

[heap profiler screenshots omitted]

I think switching to a WeakHashMap (with a dummy value) or a Guava weak Interner would solve the memory issue. Although these introduce a little memory overhead because of the Reference objects, a cached object will be garbage-collected once it is no longer strongly referenced, which is better than running out of memory.

Thoughts?
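
A minimal sketch of the weak-interner idea, assuming the runtime tolerates cache misses; WeakInterningPredictionContextCache is a hypothetical name, while Interners.newWeakInterner() is real Guava API:

import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

import org.antlr.v4.runtime.atn.PredictionContext;
import org.antlr.v4.runtime.atn.PredictionContextCache;

// Entries are weakly held, so a context becomes collectable as soon as no
// DFA state references it; the interner is also thread-safe, unlike the
// HashMap inside the stock cache.
public class WeakInterningPredictionContextCache extends PredictionContextCache {
    private final Interner<PredictionContext> interner = Interners.newWeakInterner();

    @Override
    public PredictionContext add(PredictionContext ctx) {
        if (ctx == PredictionContext.EMPTY) return ctx; // mirror the stock cache
        return interner.intern(ctx);
    }

    @Override
    public PredictionContext get(PredictionContext ctx) {
        // the interner cannot be probed without inserting, so treat a
        // lookup as an add; callers only want the canonical instance
        return interner.intern(ctx);
    }
}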

@parrt
Member

parrt commented Jul 5, 2020

@hsyuan Well, I don't think we can really make these DFA states weak refs, and it would be a huge change, which I can't consider at this time.

@parrt
Member

parrt commented Jul 5, 2020

BTW, I'm assuming you know to clear the cache?

@hsyuan

hsyuan commented Jul 5, 2020

Thank you for your quick response, @parrt.

I'm assuming you know to clear the cache?

I am not sure if I understand your question correctly. Clearing the cache is the last thing I'd like to do, because that may impact the performance.

we can't really make these DFA states weak refs I don't think and would be a huge change

By 'huge change', do you mean it would be a breaking change, rather than that a large amount of code would have to change? I see DFA.states is public and PredictionContextCache.cache is protected.

@parrt
Member

parrt commented Jul 5, 2020

I'd graph the memory growth and clear the caches if they get close to too big. I'm not sure why it's growing forever, but it could with a huge grammar (SQL?) and a long-running server. 'Huge change' means possibly breaking, possibly slower, and too much for me to consider doing.
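
A sketch of that graph-and-clear approach between parses (countDfaStates and MAX_DFA_STATES are illustrative names; DFA.states and clearDFA() are real runtime API):

// call between parses, with no other thread using the same interpreter
static long countDfaStates(DFA[] decisionToDFA) {
    long n = 0;
    for (DFA dfa : decisionToDFA) {
        n += dfa.states.size();
    }
    return n;
}

ParserATNSimulator interp = parser.getInterpreter();
if (countDfaStates(interp.decisionToDFA) > MAX_DFA_STATES) {
    interp.clearDFA(); // the DFA is re-learned lazily from subsequent input
}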

@hsyuan

hsyuan commented Jul 5, 2020

not sure why it's growing forever but it could with huge grammar (SQL?) and long running server.

Correct. Tons of huge SQL queries and a long-running server.

@hsyuan

hsyuan commented Jul 5, 2020

Looks like Presto faces similar issue: trinodb/trino#3186

@xuanziranhan

xuanziranhan commented Aug 6, 2023

We have long-running servers with large queries too.
I've tried:

  1. Using a new PredictionContextCache every time, as @sharwell suggested; it didn't help much, but it also didn't hurt performance too much.
  2. Building a fresh DFA array per parser:

     PrestoParser parser = new PrestoParser(tokens);
     DFA[] decisionToDFA = new DFA[parser.getATN().getNumberOfDecisions()];
     for (int i = 0; i < parser.getATN().getNumberOfDecisions(); i++) {
         decisionToDFA[i] = new DFA(parser.getATN().getDecisionState(i), i);
     }
     parser.setInterpreter(new ParserATNSimulator(
         parser, parser.getATN(), decisionToDFA, new PredictionContextCache()));

     The memory leak went away, but performance was affected pretty badly.
  3. Calling clearDFA() periodically, but memory didn't seem to be affected.

Do we consider changing the DFA array to a different structure, and allowing a pluggable cache/structure for DFA states, so that we could plug in an external cache (like Redis) to help?

@sharwell
Member

sharwell commented Aug 7, 2023

@xuanziranhan Typically when the DFA is getting very large, one or more decisions requires long lookahead to resolve. ANTLR only caches the minimum amount of information necessary to make decisions for the actual inputs seen since the last time the DFA was cleared. Restructuring the rules to reduce lookahead in those cases could also reduce the DFA size.

Another option you might try is using my optimized fork of ANTLR. It contains some logic to reduce the size of the cached DFA for some common cases we've seen. It often doesn't matter in practice, but for the edge cases where a grammar happens to fall into a specific pattern which is very bad in the reference version of ANTLR but closely matches the optimizations in my fork, you could see an order of magnitude working set reduction or better.
