Custom Attributes and IDs Names #51

Open
1 of 12 tasks
tajmone opened this issue Feb 14, 2024 · 0 comments
Labels
💡 enhancement New feature or request


tajmone commented Feb 14, 2024

  • attributes:
    • define in variables: a RegEx variable for valid attribute names.
      • replace RegExs matching attributes with new variable.
    • in contexts that deal with attributes:
      • handle invalid attribute names, scoping them as invalid.
      • should we also scope names with capital letters as deprecated? (i.e. since they violate best practices)
  • custom IDs:
    • define in variables: a RegEx variable for valid ID names.
    • define in variables: a RegEx variable for XML-compliant ID names.
    • in contexts that deal with custom IDs:
      • handle invalid ID names, scoping them as invalid.
      • should we also scope as deprecated those ID names which are valid but do not follow the XML convention? (i.e. since they violate best practices and might not work with some output formats and backends)

We need to improve how the syntax handles attributes and IDs, so that it can capture valid and invalid attribute and ID names accordingly.

The best approach is to define custom variables that can then be reused in the RegExs that handle matching in the various contexts.

In other words, the RegExs that handle them should be able to account for, and intercept, invalid attribute and ID names and scope them as invalid, in order to warn the user about the problem.

Custom Attributes

Custom attribute names follow a fairly simple set of rules. A name must:

  • be at least one character long,
  • begin with a word character (A-Z, a-z, 0-9, or _), and
  • only contain word characters and hyphens.
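
As a rough illustration of the three rules above, here is a minimal Python sketch of a reusable name fragment that gets composed into larger patterns. The names `ATTRIBUTE_NAME`, `ATTRIBUTE_ENTRY`, `ATTRIBUTE_REFERENCE` and `is_valid_attribute_name` are hypothetical, and Python is used only to make the idea testable; the fragment would still have to be ported to the syntax definition's own variables.

```python
import re

# Hypothetical reusable fragment mirroring the rules above: starts with a
# word character (A-Z, a-z, 0-9, _), then only word characters and hyphens.
ATTRIBUTE_NAME = r"[A-Za-z0-9_][A-Za-z0-9_-]*"

# Two illustrative contexts built from the same fragment:
# an attribute entry (":name: value") and an attribute reference ("{name}").
ATTRIBUTE_ENTRY = re.compile(r"^:(" + ATTRIBUTE_NAME + r"):(?:\s|$)")
ATTRIBUTE_REFERENCE = re.compile(r"\{(" + ATTRIBUTE_NAME + r")\}")


def is_valid_attribute_name(name: str) -> bool:
    """Return True if the whole string is a valid attribute name."""
    return re.fullmatch(ATTRIBUTE_NAME, name) is not None
```

Keeping the fragment in one place means a correction to the naming rule propagates to every pattern that embeds it, which is the point of using a variable.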

An important point to note here is that attribute names are stored in lowercase, and it's best practice to use lowercase lettering only:

Although uppercase characters are permitted in an attribute name, the name is converted to lowercase before being stored. For example, URL-REPO and URL-Repo are treated as url-repo when a document is loaded or converted. A best practice is to only use lowercase letters in the name and avoid starting the name with a number.

I'm not sure whether we ought to scope attribute names that contain uppercase letters as deprecated, either to warn users of a potential naming conflict (in case they were relying on differences in letter case to keep attributes distinct) or simply to flag the violation of best practices. Need to think about it.
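
To make the potential conflict concrete, here is a small Python sketch of the lowercase conversion described in the quote and of how spellings that differ only in letter case end up as the same stored name. The helper names (`stored_name`, `has_uppercase`, `case_collisions`) are hypothetical, and a RegEx-driven syntax could not perform this cross-document check itself; it only illustrates what a deprecated scope would be hinting at.

```python
from collections import defaultdict


def stored_name(name: str) -> str:
    """Return the name as it would be stored, i.e. converted to lowercase."""
    return name.lower()


def has_uppercase(name: str) -> bool:
    """Return True if the name would change when converted to lowercase."""
    return name != name.lower()


def case_collisions(names):
    """Group spellings that collapse to the same stored (lowercase) name."""
    groups = defaultdict(set)
    for name in names:
        groups[stored_name(name)].add(name)
    # Keep only stored names reached through more than one spelling.
    return {
        stored: sorted(spellings)
        for stored, spellings in groups.items()
        if len(spellings) > 1
    }
```

For example, passing `URL-REPO`, `URL-Repo` and `author` would report the first two as colliding on `url-repo`.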

Custom IDs

Custom IDs, on the other hand, are subject to different rules depending on how they are defined.

There are three ways to define a custom ID:

  • shorthand hash (#) syntax
  • legacy anchor ([[]]) syntax
  • longhand (id=) syntax

In the first two cases the rules are somewhat simpler, although it's still unclear to me whether they are the same as those for attribute names:

When the ID is defined using the shorthand hash syntax or the anchor syntax, the acceptable characters is more limited (for example, spaces are not permitted). Regardless, it’s not advisable to exploit the ability to use any characters the AsciiDoc syntax allows. The reason to be cautious is because the ID is passed through to the output, and not all output formats afford the same latitude. For example, XML is far more restrictive about which characters are permitted in an ID value.

As for the longhand id= syntax, things are more complicated, since AsciiDoc doesn't impose restrictions but strongly advises sticking to the XML naming convention:

AsciiDoc does not restrict the set of characters that can be used for an ID when the ID is defined using the named id attribute. All the language requires in this case is that the value be non-empty.

As for the best practices:

To ensure portability of your IDs, it’s best to conform to a universal standard. The standard we recommend following is a Name value as defined by the XML specification. At a high level, the first character of a Name must be a letter, colon, or underscore and the optional following characters must be a letter, colon, underscore, hyphen, period, or digit. You should not use any space characters in an ID. Starting the ID with a digit is less likely to be problematic, but still best to avoid. It’s best to use lowercase letters whenever possible as this solves portability problem when using case-insensitive platforms.

When the AsciiDoc processor auto-generates IDs for section titles and discrete headings, it adheres to this standard.
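
The high-level description of an XML Name quoted above can be approximated with a short pattern. The following Python sketch is an ASCII-only simplification: the XML specification also admits many non-ASCII characters in a Name, which are deliberately left out here. `XML_ID_NAME` and `is_xml_compliant_id` are hypothetical names.

```python
import re

# ASCII-only approximation (an assumption, not the full XML Name production):
# first character is a letter, colon, or underscore; following characters
# may be a letter, colon, underscore, hyphen, period, or digit.
XML_ID_NAME = r"[A-Za-z_:][A-Za-z0-9_:.-]*"


def is_xml_compliant_id(value: str) -> bool:
    """Return True if the whole ID matches the simplified XML Name pattern."""
    return re.fullmatch(XML_ID_NAME, value) is not None
```

Under this simplification `_section-1.2` and `install:linux` pass, while `1st` (leading digit) and `my id` (space) do not.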

Basically it's a huge mess to deal with, as far as our RegEx-driven syntax goes.

We need to take into account that the syntax must be able to handle any valid construct, regardless of whether it follows best practices or not; we can't have the syntax break down on a well-formed construct.

Probably the best approach here is to go for the broad case, i.e. accept anything that is valid. Ideally, in the future, we could reach a point where the syntax first handles XML-abiding names, and then treats any name which is valid (but failed the XML RegEx) as deprecated, just to call attention to the fact that it violates best practices (we can't really use invalid if it's acceptable).
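
The two-tier idea could look roughly like the Python sketch below, shown for the longhand id= case, where the only hard requirement is a non-empty value. The function name `classify_longhand_id`, the returned labels, and the ASCII-only XML pattern are all assumptions made for illustration; in the real syntax this would be an ordering of match rules rather than a function.

```python
import re

# ASCII-only approximation of an XML Name (assumption; the XML spec also
# allows many non-ASCII characters).
XML_ID_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.-]*")


def classify_longhand_id(value: str) -> str:
    """Classify an ID defined with id=, trying the strictest rule first."""
    if value == "":
        # The only hard requirement for id= is a non-empty value.
        return "invalid"
    if XML_ID_NAME.fullmatch(value):
        # Follows the recommended XML naming convention.
        return "valid"
    # Accepted by AsciiDoc, but violates best practices.
    return "deprecated"
```

Trying the XML pattern first and falling back to the permissive rule keeps every well-formed construct matched, while still leaving room to flag the non-portable ones.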


@tajmone tajmone added the 💡 enhancement New feature or request label Feb 14, 2024
tajmone added a commit that referenced this issue Feb 15, 2024
* Define syntax variable "attribute_name" with reusable RegEx
  for attribute names (see Issue #51).
* Add basic syntax support for preprocessor directives
  (ifdef/ifndef/ifeval/include).
  Malformed directives are being gracefully handled by forcibly
  popping out of the context in case of a premature EOL or missing
  brackets, but advanced elements like "include/ifeval" parameters
  within brackets are not yet being scoped.
* Update "AsciiDoc Dark" to color preprocessor directives.