Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for trailing line comments #991

Open
mojavelinux opened this issue Jul 5, 2014 · 27 comments
Open

Add support for trailing line comments #991

mojavelinux opened this issue Jul 5, 2014 · 27 comments
Assignees
Milestone

Comments

@mojavelinux
Copy link
Member

Add support for inline comments that appear at the end of a line (trailing comments). This is a feature common to most programming languages. Technical authors expect to be able to use comments in documentation in the same way.

Example:

normal content   // skipped comment

To avoid breaking existing content, a compatibility flag should be added to the processor that allows the feature to be switched off.

@mojavelinux mojavelinux added this to the v1.6.0 milestone Jul 5, 2014
@mojavelinux mojavelinux self-assigned this Jul 5, 2014
@mojavelinux mojavelinux modified the milestones: v1.6.0, v1.7.0 Dec 21, 2015
@janicesignalfx
Copy link

+1 but would this require using an escape character for urls? or would it be controlled by whitespace?

@elextr
Copy link

elextr commented Mar 21, 2018

There are two solutions, either require "// " as the token, or wait for the "proper" parser that would know its inside a URL after recognising the scheme eg https: and not recognise // as a comment.

@mojavelinux
Copy link
Member Author

My idea was to require a leading space (and be the last // in the line), which would address the issue at hand.

@mojavelinux
Copy link
Member Author

we could even enforce a space on both sides.

@elextr
Copy link

elextr commented Mar 21, 2018

Last on line is not so good, that means significant lookahead is required in my lexer/your regex.

@mojavelinux
Copy link
Member Author

I was not planning to use regexp. probably a seek from end of string kind of thing. but I'm not sure yet.

@elextr
Copy link

elextr commented Mar 22, 2018

I guess it has to scan for the end of line to terminate the comment anyway, so long as it doesn't happen twice every comment.

This will have to be excluded from several of the delimited blocks since // is a comment in many programming languages.

@mojavelinux
Copy link
Member Author

It would not be applied to verbatim blocks (or comment blocks for that matter since it would be pointless).

@elextr
Copy link

elextr commented Mar 25, 2018

I was thinking about this some more, in the following cases a decision is needed:

Does the following comment eat all the whitespace before it?

blah blah blah              // comment

I would say yes, but it does not consume the eol after it.

Is it allowed on directives that usually occupy a whole line?

ifdef::abc[]   // comment

and is it allowed on other whole-line markup or only on lines ending in text? For example:

----// comment on listing
code
----

I would say only on lines ending in text (and inline markup) which is a simple rule (for both humans and processors) but rules out both of the above. But in both cases having to put the comment on the line before is not a major imposition.

@janicesignalfx
Copy link

Putting aside any technical reasoning and just going by what seems intuitive to a user, supporting it on any of the block delimeters seems wrong to me but I'd be inclined to expect it to work on things like conditionals. I'm not entirely sure why those cases are different, but they feel that way to me. I'm not sure what's gained by removing the whitespace before a comment as the comment is not part of the output document; I'd be inclined to leave it as is for that reason (but, again, am not sure why it matters for this case)

@elextr
Copy link

elextr commented Mar 25, 2018

I'm not entirely sure why those cases are different, but they feel that way to me.

From my point of view the distinction is indeed somewhat technical, remember I'm working on another Asciidoc implementation which uses what @mojavelinux has been calling a "proper" parser, and part of that is getting a more formal definition of the language, but the fact that it feels different from a user point of view is interesting.

Both constructs are currently defined to occupy a whole line and are terminated by the end of line.

The difference (from my point of view) is that the directive could be terminated by the ] not the end of line (albeit with issues, see below), so allowing the comment is possible, whereas the block delimiter is only terminated by end of line since it can have as many -es as the user wants, and it would be a big change to the language definition to allow comments on the line.

But what is this?

ifdef::abc[do not use ] // is used for comments]

Is it:

  1. a directive containing do not use and a comment that happens to have ] in it, remember comments can have any characters in them, or
  2. a directive containing the text do not use ] // is used for comments, and
  3. what if the first ] is backslash escaped?

Having complex rules about interactions between comments and other constructs starts to make the language difficult for users, so I would push for retaining the simple "block directives and delimiters must occupy the whole line" rule and disallow inline comments on those constructs. Those are already block constructs, so using a line comment before them won't have further structuring impact.

@janicesignalfx
Copy link

Ah, I see your point. As long as the rules are clearly documented I an live with them.

FYI, the place where I most want this ability is in my variables file; I would prefer to use this structure:

//comment naming a section
//comment describing/providing more info about section
//another section-wide comment
:var1: // note about var1
:var2:
:var3: // note about var3
:var4:

It's a lot cleaner than:

//comment naming a section
//comment describing/providing more info about section
//another section-wide comment
// note about var1
:var1:
:var2:
// note about var3
:var3: 
:var4:

@elextr
Copy link

elextr commented Mar 26, 2018

A big part of language definition is to make edge cases have a consistent, useful and simple definition.

@janicesignalfx, thats another problematical one, you are good at this :)

Attribute definitions are also whole line constructs. Your examples had no values. What about:

:attr: with a value // comment

is that an attribute with value with a value or value with a value // comment, or more difficult:

:attr: with a pass:[ // ] value

Attribute values cannot be parsed until substitution, consider for example:

[subs=-macros]
blah {attr} blah

would be expected to give literally blah with a pass:[ // ] value blah whereas:

[subs=+macros]
blah {attr} blah

would be expected to give blah with a // value blah. (Note I suspect Asciidoctor has a bug here, see #2630)

This means that the parser can't rely on the existence of the pass (or any other markup) to protect the // from being recognised as a comment in the definition of attr. So attribute values couldn't contain // if comments are allowed.

@janicesignalfx
Copy link

:) I'm really good at finding weird bugs too.

Yeah, some of those would have values - I was just being lazy, thinking what I did was enough to convey the idea. And that's exactly why I was asking about URLs above - because I put most of them in variables.

So I guess I'm left with asking where support would be added. It seems like we're ruling out nearly every place I could imagine wanting to use them.

Re: this specific case, would the use of an escape character be sufficient? so:

:var1: foo // I am a comment

where {var1} -> "foo"

vs

:var2: foo \// do not interpret me as a comment

where {var2} -> "foo // do not interpret me as a comment"

I'm not sure of the exact behavior for pass[{varN}] but couldn't the intent be maintained if it's strictly defined as above?

@elextr
Copy link

elextr commented Mar 26, 2018

And that's exactly why I was asking about URLs above - because I put most of them in variables.

The suggestion by @mojavelinux that there be whitespace around the // should protect URLs

It seems like we're ruling out nearly every place I could imagine wanting to use them.

Sorry, I also had the initial reaction of "thats a good idea", its just that further thought raised these corner cases.

The use of backslash escaping is certainly possible from a formalism point of view, but I would allow it in any cases where a comment is allowed, eg in normal text as well

C++ uses // for comments, vs
C++ uses \// for comments

produces output

C++ uses
C++ uses // for comments

@mojavelinux sorry to have sort of hijacked your issue, but hopefully the discussion has provided some useful info when it comes up for implementation.

@janicesignalfx
Copy link

Oh, I'd want it to work generally if at all. I just wanted to make sure it solved this particular problem because I don't fully understand things like the passthrough macro in any kind of real technical detail.

@mojavelinux
Copy link
Member Author

Fantastic input. Thanks to both of you.

I think we need to look at this using the rules of programming languages (since that establishes precedence and will be a familiar anchor point for understanding). In programming languages, a trailing line comment is absolute. There is no escaping it and context doesn't matter. This is even true for YAML.

That would leave us with an initial set of rules for trailing line comments in AsciiDoc as follows:

  1. The line comment must be prefixed by at least one whitespace character (to avoid breaking URLs, but also for aesthetics). This is consistent with how line comments work in YAML.
  2. The line comment cannot be followed by a forward slash (to be consistent with how line comments work today)
  3. The line comment will not be recognized on lines with restricted characters, such as block delimiters or inside of verbatim blocks.
  4. Once the line comment is removed, trailing whitespace is again stripped.

Let's continue to refine these rules by building on this rule set.

The one thing we're missing is the idea of a string. In programming languages, a line comment inside a string is ignored. That's most similar to our inline passthrough. However, I think we have other ways to escape, such as {blank}// or pass:[//] (which leverages the leading space requirement).

@elextr
Copy link

elextr commented Oct 28, 2018

  1. agree

  2. not sure I agree, I don't see where would extra slashes be a problem? Once its a comment its a comment, and any characters are allowed, and no further parsing shall be entered into until end of comment. Note that as a leading space is required it can't be confused with a block comment delimiter or a non-trailing line comment.

  3. agree

  4. agree

Maybe there should be a subs=comments to allow comments in blocks containing code, as you say // is a common comment token.

@mojavelinux
Copy link
Member Author

As per (2), that's always been a rule in AsciiDoc, so we'd be breaking something with pretty little value. I don't think it's worth trying to change because it's rarely, if ever used anyway. The reason it was done is because it has to do with how comment blocks are detected. Call it a parsing quirk, as far as I can tell. Though it does have the benefit that it allows us to avoid looking ahead too many characters to differentiate between a line command a comment block delimiter.

@mojavelinux
Copy link
Member Author

Maybe there should be a subs=comments to allow comments in blocks containing code, as you say // is a common comment token.

Perhaps. Though, really, verbatim blocks should be verbatim. And since we encourage verbatim content to be externalized over the long term, it makes even less sense to modify it. I think commenting around the verbatim block should be sufficient. I'll keep thinking about it, though.

@elextr
Copy link

elextr commented Oct 28, 2018

The reason it was done is because it has to do with how comment blocks are detected. Call it a parsing quirk, as far as I can tell.

To be clear I am talking about the formal definition.

Really quirks that depend on specific implementations should not be propagated. Thats not to say that existing implementations need to change, but a formal definition should not forcing a specific implementation quirk on all future implementations, especially for something like this which, as you correctly say, will rarely get hit.

A previous post on this issue suggested a space after the // and thinking about it some more, requiring a space after the // for trailing comments would be more symmetrical with the existing line comment syntax and would remove this point. Effectively line comments and trailing comments could be the same thing, ie (BOL|ws)//ws which is good since to users they use the same syntax, so they should be the same thing.

@mojavelinux
Copy link
Member Author

requiring a space after the // for trailing comments would be more symmetrical with the existing line comment syntax and would remove this point

I like it.

@mojavelinux mojavelinux changed the title Add support for trailing inline comments Add support for trailing line comments Oct 28, 2018
@reitzig
Copy link

reitzig commented Dec 20, 2023

@mojavelinux This has been sitting here for a while. Since I just stumbled over this, any chance to get it from M3 to v2.x as an opt-in feature? 🙏

Use case: I use editorconfig-checker (with a limit on line length) and I kind of want this to work:

Have a lot at
    link:https://www.amazon.de/-/en/Rick-Roll-T-Shirt/dp/B09HP9BTW2/ref=sr_1_2?keywords=rick+roll+shirt&qid=1703070363&sr=8-2[this site] // editorconfig-checker-disable-line
and despair!

This workaround is not nearly as nice, impairing reading/editing flow considerably:

Have a lot at
// editorconfig-checker-disable
    link:https://www.amazon.de/-/en/Rick-Roll-T-Shirt/dp/B09HP9BTW2/ref=sr_1_2?keywords=rick+roll+shirt&qid=1703070363&sr=8-2[this site] 
// editorconfig-checker-enable
and despair!

@mojavelinux
Copy link
Member Author

Since I just stumbled over this, any chance to get it from M3 to v2.x as an opt-in feature?

No, this won't go into Asciidoctor until it is resolved in the AsciiDoc Language project. We are working on it there, but there are a lot of open questions. You are welcome to get involved in the conversation.

From this point forward, Asciidoctor will only implement syntax that's defined (or at least proposed) by the AsciiDoc Language project. So it has to be reconciled there first. So this issue will continue to sit indefinitely.

@reitzig
Copy link

reitzig commented Dec 20, 2023

Makes sense! I wasn't aware that there even was a separate project for the syntax itself! 👍️

Can you point me towards an an upstream issue I can follow, or where to create it? Can't find the spot starting from asciidoc.org.

Edit: Ah, this is the place, right? gitlab.eclipse.org/eclipse/asciidoc-lang Don't see this issue there, though, as of yet.

@mojavelinux
Copy link
Member Author

I wasn't aware that there even was a separate project for the syntax itself!

This is stated in the docs here: https://docs.asciidoctor.org/asciidoc/latest/#about-this-documentation

Can't find the spot starting from asciidoc.org.

If you click on Specifications, it will lead you to: https://asciidoc-wg.eclipse.org/. However, it should take you to https://asciidoc-wg.eclipse.org/projects/. I'll submit an update for that.

There isn't yet a dedicated issue for line comments, but rather one in which line comments are discussed. See https://gitlab.eclipse.org/eclipse/asciidoc-lang/asciidoc-lang/-/issues/26#note_1092203. But that's part of the open question. We need to sort out what issue we even want to follow because it's not yet clear when this processing would be done. Regardless, the discussion needs to happen in that project.

@daobrien
Copy link

Allow me to add my 👍 here. I'll also look at the doc/spec page.

I work as a technical editor and being able to drop comments all over the place is critical. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants