CHARACTER SET isn't recognized on MySQL lexer #975
Comments
I am working to overhaul the MySQL lexer. I will try to improve this behavior, but note that "CHARACTER" and "SET" in the [...] I will consider this closed in my PR if CHARACTER and SET share the same token type.
Fixes pygments#975, pygments#1063, pygments#1453

Changes include:

Documentation
-------------

* Note in the lexer docstring that Oracle MySQL is the target syntax.
  MariaDB syntax is not a target (though there is significant overlap).

Unit tests
----------

* Add 140 unit tests for MySQL.

Literals
--------

* Hexadecimal/binary/date/time/timestamp literals are supported.
* Integer mantissas are supported for scientific notation.
* In-string escapes are now tokenized properly.
* Support the "unknown" constant.

Comments
--------

* Optimizer hints are now supported, and keywords are recognized and
  tokenized as preprocessor instructions.
* Remove nested multi-line comment support, which is no longer supported
  in MySQL.

Variables
---------

* Support the '@' prefix for variable names.
* Lift restrictions on characters in unquoted variable names.
  (MySQL does not impose a restriction on lead characters.)
* Support single/double/backtick-quoted variable names, including escapes.
* Support the '@@' prefix for system variable names.
* Support '?' as a variable so people can demonstrate prepared statements.

Keywords
--------

* Keyword / data type / function are now in a separate, auto-updating file.
* Support 25 additional data types (including spatial and JSON types).
* Support 460 additional MySQL keywords.
* Support 372 MySQL functions. Explicit function support resolves a bug
  that causes non-function items to be treated as functions simply
  because they have a trailing opening parenthesis.
* Support exceptions for the 'SET' keyword, which is both a datatype and
  a keyword depending on context.

Schema object names
-------------------

* Support Unicode in MySQL schema object names.
* Support parsing of backtick-quoted schema object name escapes.
  (Escapes do not produce a distinct token type at this time.)

Operators
---------

* Remove non-operator characters from the list of operators.
* Remove non-punctuation characters from the list of punctuation.
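The function-related fix in the commit message above can be illustrated with a small sketch. This is not the lexer's code: the names `KNOWN_FUNCTIONS`, `naive_functions`, and `explicit_functions` are hypothetical, and the function list is a tiny illustrative subset. It contrasts a heuristic that treats any name before `(` as a function with a lookup against an explicit list.

```python
import re

# Tiny illustrative subset of MySQL function names (assumption, not the real list).
KNOWN_FUNCTIONS = {"CONCAT", "COUNT", "NOW", "LOWER"}

# A name followed, after optional whitespace, by an opening parenthesis.
NAME_BEFORE_PAREN = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def naive_functions(sql):
    """Heuristic of the kind the commit message describes as buggy:
    every name before "(" is reported as a function."""
    return [m.group(1) for m in NAME_BEFORE_PAREN.finditer(sql)]


def explicit_functions(sql):
    """Report a name before "(" only if it is in the explicit function list."""
    return [name for name in naive_functions(sql) if name.upper() in KNOWN_FUNCTIONS]


sql = "INSERT INTO t (name) VALUES (LOWER('A'));"
print(naive_functions(sql))     # ['t', 'VALUES', 'LOWER']
print(explicit_functions(sql))  # ['LOWER']
```

With the naive rule, the table name `t` and the keyword `VALUES` are both reported as functions; the explicit list keeps only `LOWER`.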
* Overhaul the MySQL lexer

  Fixes #975, #1063, #1453

  Changes include: the same list as in the commit message above.

* Cleanup items based on feedback

* Remove an unnecessary optional newline lookahead for single-line comments
(Original issue 1271 created by dereckson on 2016-07-29T00:20:08.320023+00:00)
As a sample, here are the instructions for Etherpad encoding:
Expected behavior: CHARACTER SET and CONVERT TO CHARACTER SET receive the same highlighting
Actual behavior: CHARACTER receives the class k (keyword), while SET receives the class kt (keyword type)
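To make the mismatch concrete, here is a minimal sketch of one way a lexer could choose between the two classes. It is not the Pygments implementation. It assumes a hypothetical rule: `SET` is a data type (class `kt`) only when it is directly followed by an opening parenthesis, as in a column definition such as `SET('a', 'b')`, and an ordinary keyword (class `k`) everywhere else. The function name `classify`, the small keyword list, and the sample statements are illustrative and are not taken from the report.

```python
import re

# Illustrative subset; a real lexer would use a much larger keyword list.
KEYWORDS = {"ALTER", "DATABASE", "TABLE", "CHARACTER", "CONVERT", "TO", "SET"}

WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify(sql):
    """Return (word, css_class) pairs for the known keywords found in sql."""
    result = []
    for match in WORD.finditer(sql):
        word = match.group(0)
        upper = word.upper()
        if upper not in KEYWORDS:
            continue  # identifiers and other words are ignored in this sketch
        followed_by_paren = sql[match.end():].lstrip().startswith("(")
        if upper == "SET" and followed_by_paren:
            result.append((word, "kt"))  # SET(...) used as a column data type
        else:
            result.append((word, "k"))   # ordinary keyword, e.g. CHARACTER SET
    return result


print(classify("ALTER DATABASE mydb CHARACTER SET utf8;"))
print(classify("CREATE TABLE t (c SET('a', 'b'));"))
```

Under this assumed rule, both `CHARACTER` and `SET` in the first statement get class `k`, which matches the expected behavior, while `SET` in the column definition still gets `kt`.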