MySQL: Tokenize quoted schema object names, and escape characters, uniquely #1555

Merged
merged 2 commits into from Oct 27, 2020

@kurtmckee kurtmckee commented Sep 27, 2020

Changes in this patch:

  • Name.Quoted and Name.Quoted.Escape are introduced as non-standard tokens
  • HTML and LaTeX formatters were confirmed to provide default formatting
    if they encounter these two non-standard tokens. They also add style
    classes based on the token name, like "n-Quoted" (HTML) or "nQuoted"
    (LaTeX) so that users can add custom styles for these.
  • Removed "\`" and "\\" as schema object name escapes. These are relics
    of the previous regular expression for backtick-quoted names; MySQL
    does not treat them as escape sequences. This behavior was confirmed in
    the MySQL documentation and by running queries in MySQL Workbench.
  • Prevent "123abc" from being treated as an integer followed by a schema
    object name. MySQL allows leading numbers in schema object names as long
    as 0-9 are not the only characters in the schema object name.
  • Add ~10 more unit tests to validate behavior.
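
To make the tokenization rules above concrete, here is a minimal standalone sketch. It is not the Pygments `MySqlLexer` implementation: the `tokenize` helper, the `DEFAULT_RULES` table, and the character class are illustrative assumptions, and token names are plain strings rather than Pygments token objects. It only mirrors the behavior described in the list: a doubled backtick is the sole escape inside a quoted name, a backslash is ordinary text, and "123abc" is one name rather than an integer followed by a name.

```python
import re

# Assumed set of characters allowed in an unquoted name (illustrative only).
NAME_CHARS = r"0-9a-z$_\u0080-\uffff"

# Hypothetical rule table; the real lexer rules may differ.
DEFAULT_RULES = [
    # Digits are an integer only when no name character follows them.
    (re.compile(rf"[0-9]+(?![{NAME_CHARS}])", re.IGNORECASE), "Number.Integer"),
    # Unquoted names may start with digits, e.g. 123abc.
    (re.compile(rf"[{NAME_CHARS}]+", re.IGNORECASE), "Name"),
    (re.compile(r"\s+"), "Whitespace"),
]


def tokenize(text):
    """Yield (token_name, value) pairs for a tiny subset of MySQL."""
    pos = 0
    in_quoted = False
    while pos < len(text):
        if in_quoted:
            if text.startswith("``", pos):
                # A doubled backtick is the only escape in a quoted name.
                yield "Name.Quoted.Escape", "``"
                pos += 2
            elif text[pos] == "`":
                # A single backtick closes the quoted name.
                yield "Name.Quoted", "`"
                pos += 1
                in_quoted = False
            else:
                # Everything up to the next backtick, backslashes included.
                end = text.find("`", pos)
                end = len(text) if end == -1 else end
                yield "Name.Quoted", text[pos:end]
                pos = end
        elif text[pos] == "`":
            yield "Name.Quoted", "`"
            pos += 1
            in_quoted = True
        else:
            for pattern, token in DEFAULT_RULES:
                match = pattern.match(text, pos)
                if match:
                    yield token, match.group()
                    pos = match.end()
                    break
            else:
                yield "Error", text[pos]
                pos += 1
```

Calling `list(tokenize("123abc"))` yields a single `Name` token, while `list(tokenize("123"))` yields a single `Number.Integer` token.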

Closes #1551

@kurtmckee kurtmckee commented Sep 27, 2020

These checks are not identical to what's in tox. I'll fix the regex linter warning and open a ticket about the tox/GitHub checks disparity.

Also, add tests that confirm correct behavior. No tests failed before
or after removing the '$' match in the regex, but regexlint no longer
complains.

Removing the '$' match probably only works because Pygments appends a
newline to the input text, so a bare integer literal is always followed
by at least one character.
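
The newline dependence can be checked with plain `re`. The two patterns below are assumed, simplified shapes of the integer rule with and without the '$' alternative, not copies of the lexer source, and `matches_integer` is a hypothetical helper. Pygments lexers have an `ensurenl` option (enabled by default) that appends the trailing newline; the sketch imitates that by adding "\n" by hand.

```python
import re

# Assumed set of characters allowed in an unquoted name (illustrative only).
NAME_CHARS = r"0-9a-z$_\u0080-\uffff"

# Assumed shapes of the integer rule, before and after dropping '$'.
WITH_DOLLAR = re.compile(rf"[0-9]+(?=[^{NAME_CHARS}]|$)", re.IGNORECASE)
WITHOUT_DOLLAR = re.compile(rf"[0-9]+(?=[^{NAME_CHARS}])", re.IGNORECASE)


def matches_integer(pattern, text):
    """Return the integer literal matched at the start of text, or None."""
    match = pattern.match(text)
    return match.group() if match else None


# Without a trailing newline, only the '$' variant accepts a bare integer.
bare_with_dollar = matches_integer(WITH_DOLLAR, "123")
bare_without_dollar = matches_integer(WITHOUT_DOLLAR, "123")

# With the newline appended, the lookahead has a character to inspect.
newline_without_dollar = matches_integer(WITHOUT_DOLLAR, "123\n")

# A name with leading digits is rejected by the integer rule either way.
name_without_dollar = matches_integer(WITHOUT_DOLLAR, "123abc\n")
```

If a lexer were run with the trailing newline disabled, a pattern like `WITHOUT_DOLLAR` would stop matching an integer at the very end of the input, which is the caveat raised above.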
@Anteru Anteru requested review from Anteru and birkenfeld Sep 28, 2020
Anteru approved these changes Sep 28, 2020
@Anteru Anteru added this to the 2.7.3 milestone Oct 27, 2020
@Anteru Anteru merged commit a72957f into pygments:master Oct 27, 2020
11 checks passed
@Anteru Anteru commented Oct 27, 2020

As usual, thanks for the good work!

@kurtmckee kurtmckee deleted the mysql-tokens branch Mar 3, 2021