Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference parser stores TermReference sigil inside its name #142

Closed
zbraniecki opened this issue Jun 4, 2018 · 10 comments
Closed

Reference parser stores TermReference sigil inside its name #142

zbraniecki opened this issue Jun 4, 2018 · 10 comments
Labels

Comments

@zbraniecki
Copy link
Collaborator

zbraniecki commented Jun 4, 2018

Extracting from projectfluent/fluent.js#206 -

Reference parser stores term reference sigil in its name:

{
    "type": "TermReference",
    "name": "-term0",
}

Variables don't carry the variable sigil, and Comments don't carry the # sigil either.

In the EBNF it clearly is represented as:

Comment            ::= ("#" comment_line)+
TermIdentifier      ::= "-" identifier
VariableIdentifier ::= "$" identifier

In JS, Python, PHP and Rust if a bit of information is encoded by the AST node type, it is not stored in any node field's content.

I believe that in the same vein, TermReference should be stored as either:

{
    "type": "TermReference",
    "name": "term0",
}

or:

{
    "type": "TermReference",
    "identifier": {
        "type": "Identifier",
        "name": "term0"
    },
}

Out of the two, I believe the former is easier to work with, while the latter is closer to EBNF.

@stasm
Copy link
Contributor

stasm commented Jun 5, 2018

This is a question of whether the dash - is a sigil or part of the term's identifier. I lean towards the latter. Interestingly, in case of $variables, I'd say $ is a sigil :)

@stasm
Copy link
Contributor

stasm commented Jun 5, 2018

BTW I filed #144 with a suggestion to simplify the AST wrt. storing identifiers.

@Pike
Copy link
Contributor

Pike commented Jun 5, 2018

The train of thought I followed here is what I expect to be more helpful in syntax highlighting: Giving the leading - in a term identifier a different color than the tail or the same color. I wish I could use color here, but gfm doesn't allow that.

My perception is that having the same color for the - is going to make things easier to understand, and get right.

Which leaves me with the same idea that stas has, - isn't a sigil. Also, yeah, $ probably is. In my book because the names the programmer hands don't include the $.

@stasm
Copy link
Contributor

stasm commented Jun 5, 2018

Should we consider changing the MessageContext API to take $-prefixed variables?

@Pike
Copy link
Contributor

Pike commented Jun 5, 2018

Should we consider changing the MessageContext API to take $-prefixed variables?

I'd say no.

@zbraniecki
Copy link
Collaborator Author

I agree with the syntax coloring UX described by @Pike.
But I was under the impression that we have spans to denote Node's start/end, and we don't have to rely on Nodes field value for that.

Was my assumption wrong?

As to what @stasm said - I believe this approach was designed when we wanted to have terms to be regular identifiers in AST and only treat them separately in tooling.
With AST separating them now, I feel like keeping both - separate Node, and its identifying sigil as part of the field's property we're making them different from all other Nodes in FTL and from other programming languages that I know.

Do you know of a precedence for such behavior? Do you have a strong reason for holding this position?

@stasm
Copy link
Contributor

stasm commented Jun 5, 2018

Our AST keeps them separate because by design it does a little bit of static analysis. In a regular programming language we'd only have Identifier nodes rather than MessageReferece, TermReference and VariableReference, and we'd let the runtime and the static analysis tools figure out the meaning of those identifiers. Fluent doesn't have a runtime and the syntax and the parser need to compensate for it. Which is why I don't mind the slight redundancy here.

@zbraniecki
Copy link
Collaborator Author

In a regular programming language we'd only have Identifier nodes rather than MessageReferece, TermReference and VariableReference

Interesting. That is not aligned with my opinion (I believe those should be separate nodes), but I'm not going to pursue this change any further :)

Feel free to close if you're not interested in changes I suggested.

@zbraniecki
Copy link
Collaborator Author

I bumped into @Manishearth in a hallways and asked him for feedback on this. Manish - do you have any thoughts on our AST design here? You can find the ebnf here: https://github.com/projectfluent/fluent/blob/master/spec/fluent.ebnf

@Manishearth
Copy link

I've never seen sigils be stored alongside identifiers in an AST (in a lex tree, sure, but this seems to be an AST)

If there is a difference between TermReference and MessageReference at a syntactic level, they should probably be stored as separate nodes without any sigils. OTOH if they're similar enough that it makes sense to store them as the same kind of node, that works too, and you can use the - sigil to differentiate. I think the middle ground of having separate nodes and also using the - sigil to differentiate is a bit weird.

(If I'm understanding the situation correctly)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants