Failure to parse ddck file. #176

Open
zuckerruebe opened this issue Jan 5, 2024 · 1 comment
Labels
bug Something isn't working

zuckerruebe commented Jan 5, 2024

As a user, I'd like pytrnsys to parse the attached ddck file correctly and replace :unitAssigned correctly.

Background
TRNSYS' deck grammar is context-sensitive. The LABELS "directive" is a prime example of that. The following is valid TRNSYS syntax:

UNIT 1 TYPE 2
INPUTS 1
0 0
LABELS 6
HELLO
EQUATIONS 1
x = 7

resulting in the six labels HELLO, EQUATIONS, 1, x, = and 7.

Compare this to

UNIT 1 TYPE 2
INPUTS 1
0 0
LABELS 1
HELLO
EQUATIONS 1
x = 7

which would result in the single label HELLO followed by the equation x = 7. However, it's not possible to write an unambiguous context-free grammar that parses both versions correctly, because the correct parse depends on context: the number of labels expected (1 or 6, here).
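To make the dependence on the count concrete, here is a minimal sketch of count-driven label parsing. It assumes whitespace-separated tokens and ignores quoting and comments. `parse_labels` is a hypothetical helper written for this illustration, not part of pytrnsys or lark.

```python
def parse_labels(tokens):
    """Split tokens starting at LABELS into (labels, remaining tokens).

    Hypothetical helper: the declared count alone decides how many of the
    following tokens are labels, whatever those tokens look like.
    """
    if len(tokens) < 2 or tokens[0] != "LABELS":
        raise ValueError("expected 'LABELS <count>'")
    count = int(tokens[1])
    labels = tokens[2:2 + count]
    if len(labels) != count:
        raise ValueError(f"expected {count} labels, found {len(labels)}")
    return labels, tokens[2 + count:]


# The tails of the two decks above, differing only in the declared count.
deck_six = "LABELS 6 HELLO EQUATIONS 1 x = 7".split()
deck_one = "LABELS 1 HELLO EQUATIONS 1 x = 7".split()

print(parse_labels(deck_six))  # all six tokens become labels, nothing remains
print(parse_labels(deck_one))  # one label, the EQUATIONS block remains
```

The same token sequence is split differently depending only on the number after LABELS. A context-free rule cannot express this.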

Most parser generators out there target context-free grammars, and most programming languages have a context-free grammar. lark, the library we use for parsing ddcks, is definitely one of these. More precisely, lark can accept ambiguous grammars: it will then try to generate all possible parses of the input and select the most "probable" one as the parse it returns to the user. I don't know how it deems one parse more "probable" than another. We should think of this as an implementation detail.
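As a rough illustration of where the ambiguity comes from, the toy model below enumerates every split that a count-blind rule of the form "LABELS, a number, one or more words" would accept. It is plain Python and does not reproduce lark's actual algorithm. `candidate_parses` is a hypothetical name, and the sketch assumes only an EQUATIONS block can follow the labels.

```python
def candidate_parses(tokens):
    """Yield every (labels, rest) split a count-blind grammar would accept."""
    body = tokens[2:]  # skip "LABELS" and the count, which the rule ignores
    for n in range(1, len(body) + 1):
        labels, rest = body[:n], body[n:]
        # Simplifying assumption: what follows the labels is either nothing
        # or the start of an EQUATIONS block.
        if not rest or rest[0] == "EQUATIONS":
            yield labels, rest


tokens = "LABELS 6 HELLO EQUATIONS 1 x = 7".split()
for labels, rest in candidate_parses(tokens):
    print(labels, rest)
```

Both the "one label plus an equation" reading and the "six labels" reading come out as valid. A parser that ignores the count has to pick between them by some other rule.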

That's what makes this bug so insidious. Just changing the version of the lark package we depend on can give the user different results. Rearranging the contents of the ddck file can also change the outcome: moving the UNIT with LABELS to the very end of the file might save the day for the attached example. So things are very brittle and, on top of that, slow, because lark has to generate all possible parses of the input.

Possible remedies

  1. Write your own context-sensitive parser (started here but very early days)
  2. Extend the lark grammar to not allow "keywords" like EQUATIONS for LABELS, etc.

Option 1 would have the advantage of accuracy and speed, and the disadvantage of requiring quite some work.
Option 2 would be less work, but would still potentially not be very fast and may ultimately not be the most flexible approach.

Only option 1 would be able to enforce the right number of equations, labels, inputs etc. at the parse stage. This is not possible with option 2.
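For option 2, one conceivable way to keep keywords out of labels is a negative lookahead in the label pattern. The sketch below shows the idea with Python's `re` module only. The keyword list is a deliberately incomplete, hypothetical selection, case-insensitive keyword matching is an assumption, and how such a pattern would be wired into the existing lark grammar is not shown.

```python
import re

# Hypothetical, incomplete keyword list chosen for this example.
KEYWORDS = ("UNIT", "INPUTS", "LABELS", "EQUATIONS")

# A label is a run of non-whitespace characters that is not exactly a keyword.
# The inner (?!\S) makes sure only whole-token keywords are rejected, so
# "EQUATIONS2" still counts as a label.
LABEL = re.compile(
    r"(?!(?:%s)(?!\S))\S+" % "|".join(KEYWORDS),
    re.IGNORECASE,  # assumption: keywords are matched case-insensitively
)


def is_label(token):
    """Return True if the token may be used as a label under this rule."""
    return LABEL.fullmatch(token) is not None


print(is_label("HELLO"))       # True
print(is_label("EQUATIONS"))   # False
print(is_label("EQUATIONS2"))  # True
```

Under such a rule the first example from the Background section would no longer yield six labels, since EQUATIONS is rejected as a label. This is one way in which this approach is less flexible than a count-driven parser.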

@zuckerruebe zuckerruebe added the bug Something isn't working label Jan 5, 2024
@ahobeost FYI
