optin optimizations

Opt-In optimizations (from v3.1.0)

CSLY comes with some optimizations that can be enabled for parsers. Theses optimizations are not always relevant according to your parser structure. An optimization may produce interesting results for an expression grammar and causes performance losses for a JSON parser (JSON is relatively easy to parse and does not need any special optimization to be fast enough)

Optimizations are enabled with the help of C# attributes on the parser class. Many optimizations can be used simultaneously.

For now 2 optimizations are available.

Use Memoization

The idea of this optimization is to cache intermediary parsing result to limit backtracking, at the cost of memory use. When a grammar does not need too much bactracking this optimization might be a bad fit.

The C# attribute is UseMemoization

[UseMemoization]
public class MyParser {
    // parser definition
}

Broaden token window

Natively CSLY only use one token to choose production rules. With this optimization it can look ahead 2 tokens in some cases which might limit backtracking. Once again this optimization must be tested as it may not be a good choice

This optimization may be dwarfed by the Use Memoization optimization.

The C# attribute is BroadenTokenWindow

[BroadenTokenWindow]
public class MyParser {
    // parser definition
}

sample benchmarking

here are the results of some benchmarking with both optimizations. It clearly shows that optimizations produce different results depending on grammar.

json parser

Method	UseMemoization	BroadenTokenWindow	Mean	Error	StdDev	Gen0	Gen1	Gen2	Allocated
TestJson	False	False	88.18 ms	1.689 ms	1.579 ms	9833.3333	4000.0000	1500.0000	56.26 MB
TestJson	False	True	87.36 ms	1.686 ms	2.524 ms	10000.0000	3500.0000	1333.3333	58.68 MB
TestJson	True	False	152.28 ms	5.975 ms	17.143 ms	12000.0000	4750.0000	2250.0000	68.88 MB
TestJson	True	True	164.18 ms	4.283 ms	12.495 ms	12333.3333	4666.6667	2000.0000	71.29 MB

JSON is a highly structured language. Some tokens like {or [ uniquely denotes structures which leads to limited backtracking. Hence both Use Memoization and [toekn window broadening] do not add any performance. In fact they cause performance loss due to memory consumption (memoization) or processing time (token window)

indented while

Method	Memoize	Broaden	Mean	Error	StdDev	Median	Gen0	Gen1	Gen2	Allocated
TestWhile	False	False	2.864 ms	0.1012 ms	0.2788 ms	2.780 ms	964.8438	304.6875	-	4.2 MB
TestWhile	False	True	2.963 ms	0.1412 ms	0.3816 ms	2.883 ms	949.2188	304.6875	-	4.24 MB
TestWhile	True	False	7.963 ms	0.2302 ms	0.6677 ms	7.777 ms	921.8750	453.1250	218.7500	5.62 MB
TestWhile	True	True	9.692 ms	0.7434 ms	2.1330 ms	9.460 ms	937.5000	421.8750	203.1250	5.65 MB

As for JSON While language is a structured language. So the same as JSON is observed.

parser from #414

Method	UseMemoization	BroadenTokenWindow	Mean	Error	StdDev	Median	Gen0	Gen1	Gen2	Allocated
TestBackTrack	False	False	927.96 ms	33.831 ms	94.307 ms	898.35 ms	183000.0000	39000.0000	7000.0000	978.48 MB
TestBackTrack	False	True	890.39 ms	27.614 ms	79.673 ms	878.01 ms	180000.0000	39000.0000	8000.0000	971.88 MB
TestBackTrack	True	False	32.14 ms	1.680 ms	4.849 ms	31.75 ms	2000.0000	2000.0000	1666.6667	23.81 MB
TestBackTrack	True	True	32.33 ms	1.964 ms	5.698 ms	31.63 ms	1785.7143	1714.2857	1428.5714	23.8 MB

This parser is mainly an expression parser that does not use the expression parsing feature. It causes heavy backtracking because of the many precednece levels. Use Memoization limits the baktracking and toekn window broadening does not add evident benefice.

Menu

Provide feedback

Saved searches

Use saved searches to filter your results more quickly