[QUESTION] Parsing comments/documentation and attach it to function declarations #79
That's a good problem to think through. Here is what I'd do: Don't model whitespace and comments as tokens. Instead, follow the Roslyn model where those are considered trivia. Trivia is associated with a token. A token can have leading and trailing trivia. Trailing trivia only goes until the end of line, all subsequent trivia is considered leading trivia for the following token. Therefore, only the first token on a line can have leading trivia. Comments at the bottom of a file are leading trivia of the synthetic end-of-file token. Why this way? Because it follows how we think of comments:
Let's look at a few examples:

// Comment 1
function Foo(a: int, // Comment 2
             b: int) // Comment 3
{
    let x = a /* Comment 4 */ + /* Comment 5 */ b
    /* Comment 6 */ let y = x // Comment 7
    // Comment 8
}
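
The attachment rule described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual lexer: `TriviaKind`, `Trivia`, `Token`, and `TriviaLexer` are hypothetical names, and tokenization is reduced to "a run of letters/digits or a single other character" to keep the focus on where trivia ends up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified types for illustration only.
enum TriviaKind { Whitespace, EndOfLine, Comment }

record Trivia(TriviaKind Kind, string Text);

record Token(string Text, List<Trivia> Leading, List<Trivia> Trailing);

static class TriviaLexer
{
    public static List<Token> Lex(string text)
    {
        var tokens = new List<Token>();
        var pos = 0;
        while (true)
        {
            // Everything in front of a token is its leading trivia.
            var leading = ReadTrivia(text, ref pos, stopAfterEndOfLine: false);
            if (pos >= text.Length)
            {
                // The synthetic end-of-file token owns whatever trivia is left.
                tokens.Add(new Token("", leading, new List<Trivia>()));
                return tokens;
            }
            var start = pos;
            if (char.IsLetterOrDigit(text[pos]))
                while (pos < text.Length && char.IsLetterOrDigit(text[pos])) pos++;
            else
                pos++; // single-character punctuation token
            var tokenText = text.Substring(start, pos - start);
            // Trailing trivia stops after the first end of line.
            var trailing = ReadTrivia(text, ref pos, stopAfterEndOfLine: true);
            tokens.Add(new Token(tokenText, leading, trailing));
        }
    }

    static List<Trivia> ReadTrivia(string text, ref int pos, bool stopAfterEndOfLine)
    {
        var result = new List<Trivia>();
        while (pos < text.Length)
        {
            var start = pos;
            TriviaKind kind;
            if (text[pos] == '\n' || text[pos] == '\r')
            {
                // Treat "\r\n" as a single end-of-line trivia.
                if (text[pos] == '\r' && pos + 1 < text.Length && text[pos + 1] == '\n') pos++;
                pos++;
                kind = TriviaKind.EndOfLine;
            }
            else if (char.IsWhiteSpace(text[pos]))
            {
                while (pos < text.Length && char.IsWhiteSpace(text[pos])
                       && text[pos] != '\n' && text[pos] != '\r') pos++;
                kind = TriviaKind.Whitespace;
            }
            else if (StartsWith(text, pos, "//"))
            {
                while (pos < text.Length && text[pos] != '\n' && text[pos] != '\r') pos++;
                kind = TriviaKind.Comment;
            }
            else if (StartsWith(text, pos, "/*"))
            {
                var end = text.IndexOf("*/", pos + 2, StringComparison.Ordinal);
                pos = end < 0 ? text.Length : end + 2;
                kind = TriviaKind.Comment;
            }
            else break; // start of a real token

            result.Add(new Trivia(kind, text.Substring(start, pos - start)));
            if (stopAfterEndOfLine && kind == TriviaKind.EndOfLine) break;
        }
        return result;
    }

    static bool StartsWith(string text, int pos, string value) =>
        string.CompareOrdinal(text, pos, value, 0, value.Length) == 0;
}
```

Under this rule, in the example above Comment 1 becomes leading trivia of `function`, Comment 2 trailing trivia of the comma, Comment 4 trailing trivia of `a`, Comment 6 leading trivia of the second `let`, and Comment 8 leading trivia of the closing brace.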
It's worth noting that comments are just one kind of trivia. Others are:
Rough sketch:

partial class SyntaxToken
{
    public ImmutableArray<SyntaxTrivia> LeadingTrivia { get; }
    public ImmutableArray<SyntaxTrivia> TrailingTrivia { get; }
}

class SyntaxTrivia
{
    public SyntaxKind Kind { get; }
    public string Text { get; }
}

Now add a few APIs to
With that, your problem becomes pretty straightforward. When declaring a function, you do something like this:

void BindFunctionDeclaration(FunctionDeclarationSyntax syntax)
{
    var comments = new List<string>();
    var leadingTrivia = syntax.GetLeadingTrivia();
    foreach (var trivia in leadingTrivia)
    {
        if (trivia.Kind == SyntaxKind.DocumentationCommentTrivia)
        {
            // Get text without the three leading slashes
            var text = trivia.Text.Substring(3).Trim();
            comments.Add(text);
        }
    }
    var summary = string.Join(Environment.NewLine, comments);
    ...
    var symbol = new FunctionSymbol(..., summary);
    ...
}

Does this make sense?
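
Since the question is about JSDoc-style comments rather than `///` lines, the text-cleaning step would need adapting. The following is only a sketch under assumptions: the types are simplified stand-ins, `GetLeadingTrivia` is assumed to delegate to the node's first token, and it is assumed that the lexer emits a whole `/** ... */` block as a single `DocumentationCommentTrivia`. `DocComments` and `CleanJsDoc` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical stand-ins for the real syntax types.
enum SyntaxKind { WhitespaceTrivia, EndOfLineTrivia, DocumentationCommentTrivia, FunctionKeyword }

record SyntaxTrivia(SyntaxKind Kind, string Text);

record SyntaxToken(SyntaxKind Kind, string Text, IReadOnlyList<SyntaxTrivia> LeadingTrivia);

// For this sketch a node only needs to know its tokens in source order.
record FunctionDeclarationSyntax(IReadOnlyList<SyntaxToken> Tokens)
{
    // Assumption: a node's leading trivia is that of its first token.
    public IReadOnlyList<SyntaxTrivia> GetLeadingTrivia() =>
        Tokens.Count > 0 ? Tokens[0].LeadingTrivia : Array.Empty<SyntaxTrivia>();
}

static class DocComments
{
    // Collects the cleaned text of every documentation comment in front of the declaration.
    public static string GetSummary(FunctionDeclarationSyntax syntax)
    {
        var parts = new List<string>();
        foreach (var trivia in syntax.GetLeadingTrivia())
        {
            if (trivia.Kind == SyntaxKind.DocumentationCommentTrivia)
                parts.Add(CleanJsDoc(trivia.Text));
        }
        return string.Join(Environment.NewLine, parts);
    }

    // Strips the "/**" opener, the "*/" closer, and the "*" at the start of each line.
    public static string CleanJsDoc(string text)
    {
        if (text.Length >= 5
            && text.StartsWith("/**", StringComparison.Ordinal)
            && text.EndsWith("*/", StringComparison.Ordinal))
        {
            text = text.Substring(3, text.Length - 5);
        }

        var lines = new List<string>();
        foreach (var raw in text.Split('\n'))
        {
            var line = raw.Trim(); // also removes a trailing '\r'
            if (line.StartsWith("*", StringComparison.Ordinal))
                line = line.Substring(1).Trim();
            if (line.Length > 0)
                lines.Add(line);
        }
        return string.Join(Environment.NewLine, lines);
    }
}
```

Tags such as `@param` are left in the cleaned text here; parsing them into structured metadata would be a separate step on top of this.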
To all of you parser developers! I'm making a parser for a custom language. This language uses JSDoc as a way of adding function documentation and metadata. My idea was to tokenize the input with the following types of comments:
Then I would strip them from the list...
Now the code tokens can be parsed normally, as intended. In the parser, I want to be able to check at any time whether there is a token preceding a declaration and attach it to the declaration model as metadata, for example...
Even though the tokens are no longer next to each other (I stripped the original list), I was thinking of keeping them linked using a pointer to the previous token, set during the tokenization process.
My problem is that I may be breaking the model structure here, as the tokens will have access to each other and will need a TryMatchPrevious method. That doesn't really make sense, because a token shouldn't have match functionality. On the other hand, I can just put the function in the Parser and have its signature be:
TryMatchPrevious(kindToMatch, out var token);
or even
TryMatchPrevious(startToken, kindToMatch, out var token);
What do you think of this approach? Am I overthinking it? Is this too much? Is there a simpler way of implementing this?