[PATCH] Tokenizer doesn't recognize some valid HTML attributes (Origin: bugzilla #687301) #4936

Closed
doxygen opened this Issue Jul 2, 2018 · 0 comments

doxygen commented Jul 2, 2018

status RESOLVED severity normal in component general for ---
Reported in version 1.8.2-SVN on platform Other
Assigned to: Dimitri van Heesch

Original attachment names and IDs:

On 2012-11-01 01:02:20 +0000, mason malone wrote:

Created attachment 227769
Fix html attribute parsing

There are many valid characters that can appear in HTML attribute names that Doxygen doesn't allow, notably the hyphen. This means you can't use data attributes (which always take the form data-foo="bar") in HTML, and that can be pretty annoying since data attributes are frequently used to pass data to JavaScript apps.

The attached patch modifies doctokenizer.l to be more liberal in what characters are allowed in attribute names. The regular expression for HTMLATTID was derived from this:
http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#attributes-0
"Attributes have a name and a value. Attribute names must consist of one or more characters other than the space characters, U+0000 NULL, U+0022 QUOTATION MARK ("), U+0027 APOSTROPHE ('), U+003E GREATER-THAN SIGN (>), U+002F SOLIDUS (/), and U+003D EQUALS SIGN (=) characters, the control characters, and any characters that are not defined by Unicode."
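To make the quoted rule concrete, here is a minimal Python sketch of it. This is not the pattern from the attached patch (which changes the flex rule HTMLATTID in doctokenizer.l); the names ATTR_NAME and is_valid_attr_name are made up for illustration, and "characters not defined by Unicode" is approximated as code points in the unassigned category "Cn".

```python
import re
import unicodedata

# Excluded by the quoted rule: space, the control characters
# (C0 range, which also covers tab/LF/FF/CR and NULL, plus DEL and C1),
# the two quote characters, '>', '/', and '='.
ATTR_NAME = re.compile(r"[^ \x00-\x1f\x7f-\x9f\"'>/=]+")


def is_valid_attr_name(name: str) -> bool:
    """Return True if name satisfies the quoted attribute-name rule."""
    if not ATTR_NAME.fullmatch(name):
        return False
    # Approximation: treat unassigned code points as "not defined by Unicode".
    return all(unicodedata.category(ch) != "Cn" for ch in name)


if __name__ == "__main__":
    for candidate in ["data-foo", "class", "data foo", "a=b", "on>click", ""]:
        print(repr(candidate), is_valid_attr_name(candidate))
```

Under this rule "data-foo" is accepted, while names containing a space, "=", or ">" are rejected, as is the empty string.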

On 2012-11-17 10:06:09 +0000, Dimitri van Heesch wrote:

I don't mind adding the '-', but allowing even more characters will probably lead to cases where text will suddenly be parsed as an attribute.

Besides that, using arbitrary names for attributes is not part of the HTML standard. For instance, the 4.01 standard lists only these as valid:
http://www.w3.org/TR/REC-html40/index/attributes.html

On 2012-11-18 11:07:25 +0000, Dimitri van Heesch wrote:

Changed version 'latest' to '1.8.2-SVN' so I can remove 'latest' as an option as it is a moving target.

On 2012-12-26 16:09:10 +0000, Dimitri van Heesch wrote:

This bug was previously marked ASSIGNED, which means it should be fixed in
doxygen version 1.8.3. Please verify if this is indeed the case. Reopen the
bug if you think it is not fixed and please include any additional information
that you think can be relevant.

@doxygen doxygen closed this Jul 2, 2018
