Failure to parse `ddck` file. #176

zuckerruebe · 2024-01-05T13:25:51Z

As a user I'd like pytrnsys to parse the attached ddck file correctly and replace :unitAssigned correctly.

Background
TRNSYS' deck grammar is context-sensitive. The LABELS "directive" is a prime example of that. The following is correct "TRNSYS syntax":

UNIT 1 TYPE 2
INPUTS 1
0 0
LABELS 6
HELLO
EQUATIONS 1
x = 7

resulting in the six labels HELLO, EQUATIONS, 1, x, = and 7.

Compare this to

UNIT 1 TYPE 2
INPUTS 1
0 0
LABELS 1
HELLO
EQUATIONS 1
x = 7

which would result in the equation x=7. However, it's not possible to write an unambiguous context-free grammar for parsing the above two versions correctly, as the correct parse depends on the context "the number of labels expected" (1 or 6, here).
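To make the role of the count concrete, here is a minimal sketch of a count-driven reader in plain Python. It is not pytrnsys code: `Unit`, `parse_labels_and_equations` and `DECK_TEMPLATE` are hypothetical names, it only models LABELS and EQUATIONS, and it assumes every equation has the simple form `name = value`.

```python
from dataclasses import dataclass, field

# The deck from the examples above, with the label count left open.
DECK_TEMPLATE = """\
UNIT 1 TYPE 2
INPUTS 1
0 0
LABELS {count}
HELLO
EQUATIONS 1
x = 7
"""


@dataclass
class Unit:
    labels: list = field(default_factory=list)
    equations: list = field(default_factory=list)


def parse_labels_and_equations(deck: str) -> Unit:
    """Hypothetical helper: the declared count decides how many tokens to consume."""
    tokens = deck.split()
    unit = Unit()
    i = 0
    while i < len(tokens):
        keyword = tokens[i].upper()
        if keyword == "LABELS":
            count = int(tokens[i + 1])
            start = i + 2
            labels = tokens[start:start + count]
            if len(labels) != count:
                raise ValueError(f"expected {count} labels, found {len(labels)}")
            # Whatever follows is a label, even if it looks like a keyword.
            unit.labels.extend(labels)
            i = start + count
        elif keyword == "EQUATIONS":
            count = int(tokens[i + 1])
            i += 2
            for _ in range(count):
                # Simplifying assumption: each equation is exactly `name = value`.
                name, equals, value = tokens[i:i + 3]
                if equals != "=":
                    raise ValueError(f"expected '=', found {equals!r}")
                unit.equations.append(f"{name} = {value}")
                i += 3
        else:
            # Skip everything this sketch does not model (UNIT, INPUTS, ...).
            i += 1
    return unit


six = parse_labels_and_equations(DECK_TEMPLATE.format(count=6))
one = parse_labels_and_equations(DECK_TEMPLATE.format(count=1))
print(six.labels, six.equations)
print(one.labels, one.equations)
```

The same token stream yields six labels and no equation in one case, and one label plus the equation x = 7 in the other. The only difference is the number read after LABELS, which is exactly the piece of context a context-free grammar cannot use.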

Most grammar compilers out there are for context-free grammars - the library for parsing ddcks which we're using, lark, definitely is - and most programming languages have a context-free grammar. More precisely, lark can accept ambiguous grammars and will then try to generate all possible parses of the input and finally select the most "probable" one as the parse it returns to the user. I don't know how it deems one parse more "probable" than another. We should think of this as an implementation detail.

That's what makes this bug very insidious. Just by changing the version of the lark package we depend upon, the user can get different results. Also rearranging stuff in the ddck file can lead to different results - moving the UNIT with LABELS to the very end of the file might save the day for the example attached. So things are very brittle - and slow, on top of that. Being slow is also a result of lark having to generate all possible parses of the input.

Possible remedies

1. Write your own context-sensitive parser (started here but very early days)
2. Extend the lark grammar to not allow "keywords" like EQUATIONS for LABELS, etc.

1. would have the advantage of accuracy and speed, and the disadvantage of requiring quite some work.
2. would be less work, but still potentially not very fast, and may ultimately not be the most flexible way.

Only 1. would be able to ensure the right number of equations, labels, inputs etc. at parse stage. This is not possible with approach 2.
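As a rough illustration of that limitation, the sketch below imitates approach 2 with a plain regular expression instead of a lark grammar. `KEYWORDS`, `LABEL` and `read_labels_without_count` are hypothetical names, and the keyword list is an assumed, incomplete one.

```python
import re

# Assumed, incomplete list of reserved words; the real grammar would need more.
KEYWORDS = ("UNIT", "TYPE", "INPUTS", "LABELS", "EQUATIONS")

# A label is any whitespace-free token that is not a reserved keyword.
LABEL = re.compile(r"(?!(?:%s)\b)\S+" % "|".join(KEYWORDS), re.IGNORECASE)


def read_labels_without_count(tokens):
    """Hypothetical helper: collect labels until the first keyword, ignoring any count."""
    labels = []
    for token in tokens:
        if not LABEL.fullmatch(token):
            break
        labels.append(token)
    return labels


# Tokens following "LABELS 6" in the first deck of the Background section.
print(read_labels_without_count("HELLO EQUATIONS 1 x = 7".split()))
```

Here the reader stops at EQUATIONS and returns a single label even though six were declared. A keyword-excluding grammar would therefore need a separate check after parsing to catch the mismatch, and it would read the first deck differently from how it is described above.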


zuckerruebe · 2024-01-05T13:38:13Z

@ahobeost FYI

zuckerruebe added this to pytrnsys Kanban Jan 5, 2024

zuckerruebe added the bug label Jan 5, 2024
