asm-blocks parser #837

Open
anton-trunov opened this issue Sep 16, 2024 · 1 comment
Labels
scope: asm Tact-embedded assembly functions scope: parser

Comments

@anton-trunov
Member

The parser should produce ASTs for asm-blocks. This puts us in control of the error messages and lets us implement a type-checker for asm-blocks later.

@anton-trunov anton-trunov added scope: parser scope: asm Tact-embedded assembly functions labels Sep 16, 2024
@anton-trunov anton-trunov added this to the v1.6.0 milestone Sep 16, 2024
@novusnota
Member

novusnota commented Sep 20, 2024

TL;DR: Most of the work had to be scrapped, because a reasonably reliable AST of TVM asm cannot be produced without a hand-written parser that also does type checking and partial evaluation of the Fift-asm. Also see UPD2.

I wrote a significant chunk of the Ohm parser code, plus parts of the later-stage hacks needed to support defining new words (instructions). That includes active words, which affect parsing in a context-sensitive way: an active word takes the word that follows it as its argument!

But then I realized that:

  1. Shadowing is allowed (and happens silently), meaning that even built-in words can be redefined at ANY point during parsing, so my updates to grammar.ohm itself are wasted on that. Moreover, it's possible to define new words that affect the predictive parsing (the active words described above).
  2. forget can remove words. It can even forget forget!
  3. word, (word) and (word-prefix-find) change the subsequent parsing depending on the previous entry on the stack: the first one, word, either uses that entry as the character to parse until, or consumes the word ahead. And mind you, they are used a lot throughout Fift's .fif libraries, including Fift.fif and Asm.fif.
  4. Adding to the 2nd point, there's (create), which can create new words based on stack contents, and (forget), which can remove words based on stack contents.
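
To make points 1 and 2 concrete, here is a minimal toy sketch in TypeScript. It is not real Fift and not the Tact parser: `parseWithDictionary`, `Entry` and `ToyNode` are hypothetical names, and the only assumption taken from the list above is that `forget` consumes the following word and removes it from the dictionary. The point is that the same token means different things before and after the dictionary changes, which a fixed grammar cannot express.

```typescript
// Toy model only: real Fift words and their semantics are far richer.
type Entry = { active: boolean }; // active words take the next token as an argument
type ToyNode =
  | { kind: "word"; name: string }
  | { kind: "active"; name: string; arg: string }
  | { kind: "unknown"; name: string };

function parseWithDictionary(source: string, dict: Map<string, Entry>): ToyNode[] {
  const tokens = source.split(/\s+/).filter((t) => t.length > 0);
  const nodes: ToyNode[] = [];
  for (let i = 0; i < tokens.length; i++) {
    const name = tokens[i];
    const entry = dict.get(name);
    if (entry === undefined) {
      // The word is not (or no longer) defined at this point of the input.
      nodes.push({ kind: "unknown", name });
    } else if (!entry.active) {
      nodes.push({ kind: "word", name });
    } else {
      // Active word: consume the following token as its argument.
      i += 1;
      const arg = i < tokens.length ? tokens[i] : "";
      nodes.push({ kind: "active", name, arg });
      // Assumed simplification: `forget X` deletes X from the dictionary.
      if (name === "forget") dict.delete(arg);
    }
  }
  return nodes;
}

const dict = new Map<string, Entry>([
  ["NOP", { active: false }],
  ["forget", { active: true }],
]);
const kinds = parseWithDictionary("NOP forget forget forget NOP", dict).map((n) => n.kind);
console.log(kinds.join(",")); // prints "word,active,unknown,word"
```

The second `forget forget` pair is never seen as a pair: once `forget` has forgotten itself, the third `forget` token is just an unknown word.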

A workaround for some of the issues above is a recursive-descent parser with a dictionary that keeps track of the currently defined words. It would also have to keep its own stack, with type checking of the items pushed onto it and all that stuff. Otherwise we're left with just a slight expansion of what the current parser can do and better recognition of built-in words in Fift-asm.
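
A rough sketch of what such a parser could look like, in TypeScript. Everything here is an assumption made for illustration: the names `parse`, `parseSequence`, `State` and `StackItem`, the two-type stack model, and the idea of a dictionary that maps each word to the stack item types it pops. The stack is only tracked at the top level, since block bodies are not executed while they are being parsed.

```typescript
type Ast =
  | { kind: "int"; value: number }
  | { kind: "word"; name: string }
  | { kind: "block"; body: Ast[] };

type StackItem = "int" | "block";

interface State {
  tokens: string[];
  pos: number;
  // Hypothetical signature table: word -> item types it pops, topmost first.
  dict: Map<string, StackItem[]>;
  stack: StackItem[];
}

function parseSequence(st: State, depth: number): Ast[] {
  const items: Ast[] = [];
  while (st.pos < st.tokens.length) {
    const tok = st.tokens[st.pos++];
    if (tok === "}") {
      if (depth === 0) throw new Error("unexpected }");
      return items; // closes the block opened by the caller
    }
    if (tok === "{") {
      const body = parseSequence(st, depth + 1); // recursive descent into the block
      if (depth === 0) st.stack.push("block");
      items.push({ kind: "block", body });
    } else if (/^-?\d+$/.test(tok)) {
      if (depth === 0) st.stack.push("int");
      items.push({ kind: "int", value: Number(tok) });
    } else {
      const pops = st.dict.get(tok);
      if (pops === undefined) throw new Error(`unknown word: ${tok}`);
      if (depth === 0) {
        // Approximate type check against the tracked stack.
        for (const expected of pops) {
          const actual = st.stack.pop();
          if (actual !== expected) {
            throw new Error(`${tok}: expected ${expected}, got ${actual ?? "empty stack"}`);
          }
        }
      }
      items.push({ kind: "word", name: tok });
    }
  }
  if (depth > 0) throw new Error("missing }");
  return items;
}

function parse(source: string, dict: Map<string, StackItem[]>): Ast[] {
  const tokens = source.split(/\s+/).filter((t) => t.length > 0);
  return parseSequence({ tokens, pos: 0, dict, stack: [] }, 0);
}
```

With a toy dictionary such as `new Map([["times", ["int", "block"]]])`, the input `{ 1 } 3 times` parses into three nodes, while `3 times` is rejected because no block is on the tracked stack. Words whose effect on the stack or on the input depends on runtime values are exactly where this approximation stops working.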

UPD: Thinking of prohibiting the forget word in the first place, and of prohibiting shadowing of the {, }, ({), (}), [ and ] words, for the sanity of the parser. The 3rd and 4th issues are the real blockers here: we can't really parse if the stack is unknown.

UPD2: So, let's introduce just the minor update here and produce very primitive ASTs that don't account for types or anything, as those can come out really wrong considering the 3rd point above (even if we get rid of the first two by restricting the capabilities).
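
For a sense of scale, a "very primitive AST" could be as small as the TypeScript sketch below: a flat list of whitespace-separated items with source positions and no types. This is only an illustration; `AsmItem` and `parsePrimitive` are hypothetical names rather than the shape Tact actually uses, and the sketch ignores string literals that contain spaces.

```typescript
// Hypothetical node shape: no types, no stack effects, just text and position.
type AsmItem = { kind: "asm_item"; text: string; start: number; end: number };

function parsePrimitive(source: string): AsmItem[] {
  const items: AsmItem[] = [];
  const re = /\S+/g; // every run of non-whitespace characters is one item
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    items.push({
      kind: "asm_item",
      text: m[0],
      start: m.index,
      end: m.index + m[0].length, // end offset is exclusive
    });
  }
  return items;
}
```

The positions are enough to point error messages at the right place, which covers the first goal of the issue, while leaving type checking out entirely.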

novusnota added a commit that referenced this issue Sep 20, 2024
Note that while it's possible to support the syntax via Ohm without
constructing ASTs, it's impossible to support AST generation without
outsourcing that part of the parser to a recursive-descent one, which
would also keep track of the dictionary of words/instructions and the TVM
stack, at least approximating it. [See this for more
info](#837 (comment)).
@novusnota novusnota removed this from the v1.6.0 milestone Sep 23, 2024
anton-trunov pushed a commit that referenced this issue Sep 24, 2024
…855)

Note that while it's possible to support the syntax via Ohm without
constructing ASTs, it's impossible to support AST generation without
outsourcing that part of the parser to a recursive-descent one, which
would also keep track of the dictionary of words/instructions and the TVM
stack, at least approximating it. [See this for more
info](#837 (comment)).

* test: add Fift libraries from TON Blockchain monorepo

And apply a small fix to the parser: we cannot prohibit shadowing or
removal of instructions because both are used in the Fift library and
beyond.

However, that means that out-of-place `{` and `}` are possible when
shadowing them. See the comment in `embed-fift-fif.tact` about such a
case, and why it's ok to move forward with it.