advice requested on implementing a language "feature" #145

Open
terryheidelberg opened this issue Sep 25, 2022 · 1 comment

terryheidelberg commented Sep 25, 2022

  • parglare version: 0.15
  • Python version: 3.8.10
  • Operating System: Ubuntu 20.04.5 LTS / Linux 5.11.0-41-lowlatency x86_64

Description

I am trying to translate an old dialect of BCPL into a more recent version.
One of the quirks of this language is that certain language elements may be
omitted in certain situations. My current approach is to "inject"
the missing (omitted) elements into the input stream, but that doesn't look
possible with the current implementation of parglare custom recognizers,
as input_str is of type str and thus immutable.
Is that correct?
If so, is there another way of attacking this problem?
Thanks.

Here is the relevant doc extract, edited from the old-dialect manual:
Insertion of missing symbols during parse:

 (1) The symbol DO is inserted between pairs of items if they appear on the same line and 
 if the first is from the set of items which may end an expression, namely:
                )     element     ]
            
 and the second is from the set of items which must start a command, namely:
                TEST FOR IF UNLESS UNTIL WHILE GOTO
                RESULTIS CASE DEFAULT BREAK RETURN 
                FINISH SWITCHON   [


 (2) The compiler inserts a semicolon between adjacent items if they appear on 
 different lines and if the first is from the set of symbols which may end a command, namely:
                BREAK RETURN FINISH REPEAT
                )     element     ]
    
 and the second is from the set of items which may start a command,  namely:
                 TEST FOR IF UNLESS UNTIL WHILE GOTO
                 SWITCHON   (   RV   element
                 RESULTIS CASE DEFAULT BREAK RETURN
                 FINISH    [

where "element" in the above text means:
element : character_constant | string_constant | number | Identifier | "TRUE" | "FALSE" ;

igordejanovic (Owner) commented

I think there are two viable approaches:

  1. Preprocessing of the input before parsing and inserting required text - this might be tricky if context-free recognition is required to check for the insertion conditions.
  2. Use custom recognizers to return "virtual tokens".
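
A minimal sketch of the first approach, assuming that a token-level pass is enough to check the insertion conditions. It is not part of parglare. The function name `insert_missing`, the lexical rules in `ELEMENT_RE` (no string escapes, no comments), and the `OTHER_RESERVED` set are assumptions made for illustration; the reserved-word list in particular is incomplete and would have to be taken from the old-dialect manual.

```python
import re

# Sets copied from the manual extract; "element" is handled by is_element().
ENDS_EXPRESSION = {")", "]"}
ENDS_COMMAND = {"BREAK", "RETURN", "FINISH", "REPEAT", ")", "]"}
MUST_START_COMMAND = {
    "TEST", "FOR", "IF", "UNLESS", "UNTIL", "WHILE", "GOTO", "RESULTIS",
    "CASE", "DEFAULT", "BREAK", "RETURN", "FINISH", "SWITCHON", "[",
}
MAY_START_COMMAND = MUST_START_COMMAND | {"(", "RV"}

# ASSUMPTION: illustrative, incomplete list of other reserved words, so that
# words such as THEN are not mistaken for identifiers.
OTHER_RESERVED = {"LET", "BE", "AND", "DO", "THEN", "OR", "TO", "INTO",
                  "VALOF", "REPEATWHILE", "REPEATUNTIL", "LV"}
RESERVED = ENDS_COMMAND | MAY_START_COMMAND | OTHER_RESERVED

# ASSUMPTION: simplified lexical rules (strings, characters, numbers, names).
ELEMENT_RE = re.compile(r'"[^"\n]*"|\'[^\'\n]*\'|\d+|[A-Za-z][A-Za-z0-9]*')
TOKEN_RE = re.compile(r'\n|[ \t]+|' + ELEMENT_RE.pattern + r'|.')


def is_element(tok):
    """True for constants, identifiers, TRUE and FALSE; False for reserved words."""
    return ELEMENT_RE.fullmatch(tok) is not None and tok not in RESERVED


def insert_missing(src):
    """Return src with DO and ';' inserted according to rules (1) and (2)."""
    out = []
    prev = None              # previous significant (non-blank) token
    prev_end = 0             # index in `out` just after that token
    newline_between = False  # was there a line break since `prev`?
    for match in TOKEN_RE.finditer(src):
        tok = match.group()
        if tok.isspace():
            newline_between = newline_between or tok == "\n"
            out.append(tok)
            continue
        if prev is not None:
            if newline_between:
                # Rule (2): different lines, command end then command start.
                if ((prev in ENDS_COMMAND or is_element(prev))
                        and (tok in MAY_START_COMMAND or is_element(tok))):
                    out.insert(prev_end, ";")
            elif ((prev in ENDS_EXPRESSION or is_element(prev))
                    and tok in MUST_START_COMMAND):
                # Rule (1): same line, expression end then command start.
                out.insert(prev_end, " DO")
        out.append(tok)
        prev, prev_end, newline_between = tok, len(out), False
    return "".join(out)


print(insert_missing("IF x = 0 RETURN"))
print(insert_missing("x := 1\ny := 2"))
```

The output of `insert_missing` could then be handed to the parser unchanged, so the grammar only has to describe the fully explicit form of the language.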

I think the second option would be "less hacky". The problem is that parglare currently expects recognizers to return a slice of the input. In this case that slice would be empty, which parglare takes to mean that no token was recognized. See here. As a workaround, you can try to inherit from the list type and make it evaluate to True in a boolean context even when it is empty (search for the __bool__ dunder method). Then you return an empty instance of your new list when the token is not in the input but the conditions for its insertion are satisfied. This makes the test in parglare pass, so the token is treated as non-empty, while its length is still 0, which is important for the parser when advancing the position.

This is just an idea, I haven't tested it.
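
A minimal sketch of the truthy empty list described above. The names `VirtualToken` and `recognize_do` are hypothetical, the `(input_str, pos)` signature is an assumption that should be checked against the parglare documentation for the version in use, and the insertion condition is simplified to a preceding `)` or `]` on the same line (the "element" case is left out). Whether parglare accepts such a return value is untested here as well.

```python
import re


class VirtualToken(list):
    """An empty list that is still truthy, so an `if result:` test passes."""

    def __bool__(self):
        return True


DO_RE = re.compile(r"DO\b")
# Items that must start a command, from rule (1) of the manual extract.
STARTS_COMMAND_RE = re.compile(
    r"[ \t]*(?:(?:TEST|FOR|IF|UNLESS|UNTIL|WHILE|GOTO|RESULTIS|CASE|DEFAULT"
    r"|BREAK|RETURN|FINISH|SWITCHON)\b|\[)"
)


def recognize_do(input_str, pos):
    """Hypothetical recognizer for DO: a real match, a virtual token, or None."""
    real = DO_RE.match(input_str, pos)
    if real:
        return real.group()          # DO is actually present in the input
    before = input_str[:pos].rstrip(" \t")
    if (before and before[-1] in ")]"
            and STARTS_COMMAND_RE.match(input_str, pos)):
        return VirtualToken()        # DO is missing but may be inserted here
    return None                      # no DO, and no insertion allowed


token = recognize_do("IF (x) RETURN", 7)
print(bool(token), len(token))
```

The point of the subclass is that truthiness and length are decoupled: the token counts as recognized, yet consuming it advances the position by zero characters.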
