Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement syntax for importing rules #292

Closed
wants to merge 2 commits into from
Closed

Conversation

Mingun
Copy link
Member

@Mingun Mingun commented Jun 11, 2022

Draft that implements syntax for importing rules. Does not implement actual importing. Design questions should be discussed in the #239 first.

I open this PR so anyone can experiment with implementing importing.

@reverofevil
Copy link

Could you please set the branch with ranges as base to this PR, so that it shows only the relevant diff?

@@ -46,16 +46,42 @@
// ---- Syntactic Grammar -----

Grammar
= __ topLevelInitializer:(@TopLevelInitializer __)? initializer:(@Initializer __)? rules:(@Rule __)+ {
= __ imports:(@Import __)* topLevelInitializer:(@TopLevelInitializer __)? initializer:(@Initializer __)? rules:(@Rule __)+ {
Copy link

@reverofevil reverofevil Jun 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports, top-level initializer and initializer have to be in strict order, and it might be not that obvious, what it is. Can we just consider all of the things mentioned in this rule as top-level statements, and parse them in no particular order?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, it is possible, although I would still not use such flexibility and would not recommend that anyone do this. Current order reflects order in which statements will be executed:

  • importing grammars (in specified order)
  • running global initializer
  • running per-parse initializer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the imported grammars effectively go at the end of the AST? You usually don't want them to be the default start rule, and if you do, you write a pass-through rule as your first rule.

Copy link
Member Author

@Mingun Mingun Jun 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current PR defines only syntax, not semantics. You are free to choose what to do with that AST. Currently AST only contains information, that it referenced to some external grammar. The pass that will deal with that information is to be written.

One of possible implementation of imports (which, for example, I've made for myself almost ten years ago) is to include them into one big AST; of course it is better to logically include new rules at the end of AST.

src/parser.pegjs Outdated
topLevelInitializer,
initializer,
rules,
location: location()
};
}

Import
= 'import' __
'{' __ rules:ImportedRule|.., __ ',' __| (__ ',')? __ '}' __
Copy link

@reverofevil reverofevil Jun 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine a situation where you have a grammar that defines an expression rule (for example, this exact grammar) that imports another grammar with expression rule (ecmascript grammar for actions). Even though reexporting it as EcmaExpression might be fine, prefixing every imported name might be quite tedious.

I think it would be best just to reuse the whole import rule from JS, and emit it verbatim into generated file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be best just to reuse the whole import rule from JS, and emit it verbatim into generated file.

I deliberately chose to support only that subset of JS import clause where the rule names are explicitly indicated. This will allow us to make basic checks (e.g. that the rules we refer to exist, or that we do not have two rules with the same name from different imports -- you will need to rename one of them using as)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that all rules are exported from every .peggy file? There's no way to mark something private, or to explicitly export?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the question, that should be investigated; yes, without annotations we have no way to mark what we should export, and what not (at least, inside the grammar).

@Mingun
Copy link
Member Author

Mingun commented Jun 11, 2022

Could you please set the branch with ranges as base to this PR, so that it shows only the relevant diff?

Unfortunately, this is impossible, because branch ranges in my repository, and this PR against peggy repository. Of course, GitHub does not allow me to set it as a base. But you can select, what commits you want to see on the Files changed tab.

@hildjj
Copy link
Contributor

hildjj commented Jul 4, 2022

I think we should do a release with #291 in it, then finish the design and implementation of this. This might be big enough to warrant a 3.0, in which case we could slide whatever breaking change we decide on in #302 into that release as well.

… in the compiler:

```
import { rule1, rule2 as alias2, ..., ruleN } from '<string with path to the .peggy file>';
```

Import clauses expected before (top) initializer code block.

All import clauses appeared in the `grammar` AST node in the `imports` node property.
This property contains array of AST `import` nodes with the`rules` and `path` properties.
`rules` contains array of the `imported_rule` AST nodes.
@hildjj
Copy link
Contributor

hildjj commented Dec 12, 2023

I think #417 is kind of the inverse of this approach. it defines semantics, but no syntax. One of the problems I've got with the import syntax is that either the core peggy code needs to know how to read files and do path munging, or you need to pass in primitives that do so. In node, you'd use fs and path. In the browser, you'd use fetch and URL. In the browser you'd either need to know the fully-qualified URL of the input document, or only allow imports of fully-qualified URLs.

I wonder if we actually need syntax for this, or if #417's approach of passing in an array in the same shape that GrammarError.format() takes will be enough?

@hildjj hildjj mentioned this pull request Jan 8, 2024
@hildjj
Copy link
Contributor

hildjj commented Jan 27, 2024

Fixed in #456

@hildjj hildjj closed this Jan 27, 2024
@Mingun Mingun deleted the import branch January 27, 2024 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants