New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace terms in AST Tokens by terms from HTML specification #4448

Closed
romani opened this Issue Jun 12, 2017 · 11 comments

Comments

@romani
Member

romani commented Jun 12, 2017

documentation was changed at #3218

https://www.w3.org/TR/html4/
beginning/closing tag -> start/end tag
singleton tag -> empty tag

Reason: it is better to keep terminology of specification and to break before it become public.

Expected fix:

  • rename in javadoc grammar (
    singletonTag --> emptyTag ;
    OPEN --> START; CLOSE --> END;
    xxxxOpen --> xxxxStart; xxxxClose --> xxxxEnd;
    *_OPEN -> _START; *_CLOSE -> _END
    )
  • renames is AST trees in xdoc files.
@Vladlis

This comment has been minimized.

Show comment
Hide comment
@Vladlis

Vladlis Jun 13, 2017

Member

@romani
Should this issue be a part of GSoC project https://github.com/checkstyle/checkstyle/projects/4 ?

Member

Vladlis commented Jun 13, 2017

@romani
Should this issue be a part of GSoC project https://github.com/checkstyle/checkstyle/projects/4 ?

@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani Jun 17, 2017

Member

@Vladlis , yes, it should be part of GSoC project.

Member

romani commented Jun 17, 2017

@Vladlis , yes, it should be part of GSoC project.

@rnveach rnveach added the GSoC2017 label Jun 17, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jul 18, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jul 18, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jul 19, 2017

@ps-sp

This comment has been minimized.

Show comment
Hide comment
@ps-sp

ps-sp Jul 19, 2017

Collaborator

singletonElement can be renamed to voidElement as seen here: https://www.w3.org/TR/html51/syntax.html#void-elements

Collaborator

ps-sp commented Jul 19, 2017

singletonElement can be renamed to voidElement as seen here: https://www.w3.org/TR/html51/syntax.html#void-elements

@Vladlis

This comment has been minimized.

Show comment
Hide comment
@Vladlis

Vladlis Jul 19, 2017

Member

@ps-sp yes, I think it should be

Member

Vladlis commented Jul 19, 2017

@ps-sp yes, I think it should be

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jul 19, 2017

@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani Jul 22, 2017

Member

@ps-sp ,

singletonElement can be renamed to voidElement as seen here

yes, it has to be renamed. BUT in description I mentioned it as:
singleton tag -> empty tag, strange ...... looks like it is bad name too that we used non from spec - http://checkstyle.sourceforge.net/writingjavadocchecks.html#HTML_Code_In_Javadoc_Comments ...see "also known as".

We HAVE to follow specification as much as possible/reasonable.

please use term "void", please recheck writingjavadocchecks.html and substitute all singletons/empty/.... terms to void.

Member

romani commented Jul 22, 2017

@ps-sp ,

singletonElement can be renamed to voidElement as seen here

yes, it has to be renamed. BUT in description I mentioned it as:
singleton tag -> empty tag, strange ...... looks like it is bad name too that we used non from spec - http://checkstyle.sourceforge.net/writingjavadocchecks.html#HTML_Code_In_Javadoc_Comments ...see "also known as".

We HAVE to follow specification as much as possible/reasonable.

please use term "void", please recheck writingjavadocchecks.html and substitute all singletons/empty/.... terms to void.

@ps-sp

This comment has been minimized.

Show comment
Hide comment
@ps-sp

ps-sp Jul 23, 2017

Collaborator

@romani @Vladlis

TOPIC: To support incorrect tags in singletonTag rule or not.

FYI: self-closing tags are used to refer to tags which don't have an end token and end with SLASH_CLOSE . As per our current grammar both singletonTag rule and singletonElement qualify as self-closing.

Looking at the standard at https://www.w3.org/TR/html5/syntax.html#parsing, this is how things apparently work in relation with self-closing tags (like <p/>, <li/>;) :

1. When < is encountered in data-state, switch is supposed to be made to tag-open-state. (these states resemble to the modes in ANTLR grammar)

2. In tag-open-state, after processing ASCII characters (uppercase or lowercase) switch is to be made to tag-name-state.

3. In tag-name-state when / is encountered, switch is made to self-closing-start-tag-state and sets the self-closing flag as soon as > is encountered.

NOW THE CATCH:

The way in which the standard specifies to acknowledge-self-closing-flag. If you go through that documentation (or look at the note in this section on html.spec.whatwg.org ) you would realize that only void-elements and only foreign-elements are allowed to have self-closing flag set. Having any other self-closing tag is parse error. The way parse error has been defined on https://www.w3.org/TR/html5/syntax.html#parse-error:

This specification defines the parsing rules for HTML documents, whether they are syntactically correct or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined (that's the processing rules described throughout this specification), but user agents, while parsing an HTML document, may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification.


I was able to find the way the parser is supposed to behave in such cases on: https://html.spec.whatwg.org/#parse-error-non-void-html-element-start-tag-with-trailing-solidus, which too is official doc page. Quoting from that page:

This error occurs if the parser encounters a start tag for an element that is not in the list of void elements or is not a part of foreign content (i.e., not an SVG or MathML element) that has a U+002F (/) code point right before the closing U+003E (>) code point. The parser behaves as if the U+002F (/) is not present.


So, in form of singletonTag, we are supporting something that specification suggests should either be ignored or result in parse error.

Javadoc tool doesn't properly support self-closing tag either

Example can be seen here. As can be seen in the logs provided both the self-closing tags provided in javadoc yield errors. But, the site rendered has working links as you can check for yourself by executing the javadoc tool on that file. Closing the tags properly makes the errors go away.

SOLUTION 1:
No need to separately support singletonTag and the / in singletonTags can be ignored as specification recommends and inputs like <p/>, <li/> ... can be included in corresponding ...TagOpen rules.

SOLUTION 2:
Continue supporting singletonTags as we do. Though I don't think they should be called voidTags since void-elements is used to refer to only particular tags. We can call them selfClosingTag or otherSelfClosingTag or something like that. But again, supporting these is against the standard and the javadoc tool.

Collaborator

ps-sp commented Jul 23, 2017

@romani @Vladlis

TOPIC: To support incorrect tags in singletonTag rule or not.

FYI: self-closing tags are used to refer to tags which don't have an end token and end with SLASH_CLOSE . As per our current grammar both singletonTag rule and singletonElement qualify as self-closing.

Looking at the standard at https://www.w3.org/TR/html5/syntax.html#parsing, this is how things apparently work in relation with self-closing tags (like <p/>, <li/>;) :

1. When < is encountered in data-state, switch is supposed to be made to tag-open-state. (these states resemble to the modes in ANTLR grammar)

2. In tag-open-state, after processing ASCII characters (uppercase or lowercase) switch is to be made to tag-name-state.

3. In tag-name-state when / is encountered, switch is made to self-closing-start-tag-state and sets the self-closing flag as soon as > is encountered.

NOW THE CATCH:

The way in which the standard specifies to acknowledge-self-closing-flag. If you go through that documentation (or look at the note in this section on html.spec.whatwg.org ) you would realize that only void-elements and only foreign-elements are allowed to have self-closing flag set. Having any other self-closing tag is parse error. The way parse error has been defined on https://www.w3.org/TR/html5/syntax.html#parse-error:

This specification defines the parsing rules for HTML documents, whether they are syntactically correct or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined (that's the processing rules described throughout this specification), but user agents, while parsing an HTML document, may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification.


I was able to find the way the parser is supposed to behave in such cases on: https://html.spec.whatwg.org/#parse-error-non-void-html-element-start-tag-with-trailing-solidus, which too is official doc page. Quoting from that page:

This error occurs if the parser encounters a start tag for an element that is not in the list of void elements or is not a part of foreign content (i.e., not an SVG or MathML element) that has a U+002F (/) code point right before the closing U+003E (>) code point. The parser behaves as if the U+002F (/) is not present.


So, in form of singletonTag, we are supporting something that specification suggests should either be ignored or result in parse error.

Javadoc tool doesn't properly support self-closing tag either

Example can be seen here. As can be seen in the logs provided both the self-closing tags provided in javadoc yield errors. But, the site rendered has working links as you can check for yourself by executing the javadoc tool on that file. Closing the tags properly makes the errors go away.

SOLUTION 1:
No need to separately support singletonTag and the / in singletonTags can be ignored as specification recommends and inputs like <p/>, <li/> ... can be included in corresponding ...TagOpen rules.

SOLUTION 2:
Continue supporting singletonTags as we do. Though I don't think they should be called voidTags since void-elements is used to refer to only particular tags. We can call them selfClosingTag or otherSelfClosingTag or something like that. But again, supporting these is against the standard and the javadoc tool.

@Vladlis

This comment has been minimized.

Show comment
Hide comment
@Vladlis

Vladlis Jul 24, 2017

Member

So, in form of singletonTag, we are supporting something that specification suggests should either be ignored or result in parse error.

and also SVG or MathML, right?

Example can be seen here. As can be seen in the logs provided both the self-closing tags provided in javadoc yield errors.

You are referring html5 doc whereas javadoc tool supports only html4 until java9.

SOLUTION 1:
No need to separately support singletonTag and the / in singletonTags can be ignored as specification recommends and inputs like <p/>, <li/> ... can be included in corresponding ...TagOpen rules.

Won't this make inputs like <p/>some text</p> possible? It is unacceptable.

I think we need to continue supporting singletonTag and singletonElement so we don't need to support all tags from SVG and MathML. And yes, we cannot rename singletonElement to voidTag because it is not always a void tag. We could make a separate rule voidTag with only void tags inside, but in html5 these tags are also self-closing and we are going to support html5 here #3332.
So I think that second option is what we need, but renaming should be done in scope of #3332.

Member

Vladlis commented Jul 24, 2017

So, in form of singletonTag, we are supporting something that specification suggests should either be ignored or result in parse error.

and also SVG or MathML, right?

Example can be seen here. As can be seen in the logs provided both the self-closing tags provided in javadoc yield errors.

You are referring html5 doc whereas javadoc tool supports only html4 until java9.

SOLUTION 1:
No need to separately support singletonTag and the / in singletonTags can be ignored as specification recommends and inputs like <p/>, <li/> ... can be included in corresponding ...TagOpen rules.

Won't this make inputs like <p/>some text</p> possible? It is unacceptable.

I think we need to continue supporting singletonTag and singletonElement so we don't need to support all tags from SVG and MathML. And yes, we cannot rename singletonElement to voidTag because it is not always a void tag. We could make a separate rule voidTag with only void tags inside, but in html5 these tags are also self-closing and we are going to support html5 here #3332.
So I think that second option is what we need, but renaming should be done in scope of #3332.

@Vladlis

This comment has been minimized.

Show comment
Hide comment
@Vladlis

Vladlis Jul 24, 2017

Member

Example can be seen here. As can be seen in the logs provided both the self-closing tags provided in javadoc yield errors.

You are referring html5 doc whereas javadoc tool supports only html4 until java9.

Sorry, tags in example are not supported as self-closing in html5 either, only void, SVG and MathML tags. Looks like we have another issue - singletonTag should not allow anything apart from HTML_TAG_NAME, because <p/> is not correct as well as the others listed in this rule. Please create a new issue if you agree.

Member

Vladlis commented Jul 24, 2017

Example can be seen here. As can be seen in the logs provided both the self-closing tags provided in javadoc yield errors.

You are referring html5 doc whereas javadoc tool supports only html4 until java9.

Sorry, tags in example are not supported as self-closing in html5 either, only void, SVG and MathML tags. Looks like we have another issue - singletonTag should not allow anything apart from HTML_TAG_NAME, because <p/> is not correct as well as the others listed in this rule. Please create a new issue if you agree.

@ps-sp

This comment has been minimized.

Show comment
Hide comment
@ps-sp

ps-sp Jul 26, 2017

Collaborator

and also SVG or MathML, right?

SVG and MathML tags can be self-closing right ?

Won't this make inputs like

some text

possible? It is unacceptable.

I agree. But just fyi, Inputs like <b/> bold </b> result in bold texts in javadoc.

I think we need to continue supporting singletonTag and singletonElement so we don't need to support all tags from SVG and MathML.

The topic says to support singletonTag rule or not. But if you see all I am trying to say that we shouldn't support tags that are not supposed to be self-closing i.e these elements in JavadocParser.g4#L785-#L799 in singletonTag rule. We of course do want to support tags that can rightfully be self-closing. Also, at the time at which I wrote the comment I somehow incorrectly thought that <somevoidelementhtmltag/> would be supported by singletonElement rule and so we can completely delete singletonTag and hence the confusion. But of course I made it clear in that comment itself that we should not support only those tags which are not allowed to be self-closing. (but it shall be supported by the alternative in singletonTag at JavadocParser.g4#L784 )

I have updated the TOPIC header in previous comment now.

Please create a new issue if you agree.

Issue #4827

Collaborator

ps-sp commented Jul 26, 2017

and also SVG or MathML, right?

SVG and MathML tags can be self-closing right ?

Won't this make inputs like

some text

possible? It is unacceptable.

I agree. But just fyi, Inputs like <b/> bold </b> result in bold texts in javadoc.

I think we need to continue supporting singletonTag and singletonElement so we don't need to support all tags from SVG and MathML.

The topic says to support singletonTag rule or not. But if you see all I am trying to say that we shouldn't support tags that are not supposed to be self-closing i.e these elements in JavadocParser.g4#L785-#L799 in singletonTag rule. We of course do want to support tags that can rightfully be self-closing. Also, at the time at which I wrote the comment I somehow incorrectly thought that <somevoidelementhtmltag/> would be supported by singletonElement rule and so we can completely delete singletonTag and hence the confusion. But of course I made it clear in that comment itself that we should not support only those tags which are not allowed to be self-closing. (but it shall be supported by the alternative in singletonTag at JavadocParser.g4#L784 )

I have updated the TOPIC header in previous comment now.

Please create a new issue if you agree.

Issue #4827

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jul 26, 2017

ps-sp added a commit to ps-sp/checkstyle that referenced this issue Jul 26, 2017

romani added a commit that referenced this issue Jul 27, 2017

Issue #4448: Updated token names and rule names in javadoc grammar in…
… accordance with the HTML specification
@rnveach

This comment has been minimized.

Show comment
Hide comment
@rnveach

rnveach Jul 27, 2017

Member

@Vladlis @ps-sp Is there more to do in this issue?

Member

rnveach commented Jul 27, 2017

@Vladlis @ps-sp Is there more to do in this issue?

@Vladlis

This comment has been minimized.

Show comment
Hide comment
@Vladlis

Vladlis Jul 27, 2017

Member

@rnveach
I suppose the rest of questions is moved to the other existing/new issues, so I am closing this one.

Member

Vladlis commented Jul 27, 2017

@rnveach
I suppose the rest of questions is moved to the other existing/new issues, so I am closing this one.

@Vladlis Vladlis closed this Jul 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment