This is an unofficial, incomplete ANTLR4-based parser for the textual representation of query trees that the MS SQL query optimizer builds and uses during its execution.
For any T-SQL statement, a batch of query trees for specific optimization steps can be obtained in textual form, as part of the diagnostic information on the standard output, by enabling the following trace flags (source):
- 8605 - Shows the (converted) input tree, laying out the logical operations implied by the query
- 8606 - Shows query trees for intermediate steps in the processing of the (converted) input tree (input tree, simplified tree, join-collapsed tree, trees before and after "project normalization")
- 8607 - Shows output tree composed from physical operators (not supported at this time)
- 8612 - Shows extra information for certain logical operators, like cardinality estimations
Note that trace flag 3604 must also be enabled so that this diagnostic information is redirected from the error log to the standard output.
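One way to enable these flags for a single statement is the T-SQL `QUERYTRACEON` query hint. The sketch below is only an illustration and is not part of this project: the helper name `with_trace_flags` is hypothetical, it assumes the statement has no `OPTION` clause of its own, and it assumes the server accepts these flags through the hint. `RECOMPILE` is added on the assumption that a cached plan would otherwise skip optimization and print nothing.

```python
# Trace flags named in this README. 3604 redirects the output to the client.
REDIRECT_FLAG = 3604
# 8607 is left out by default because its output is not supported by the parser.
DEFAULT_TREE_FLAGS = (8605, 8606, 8612)


def with_trace_flags(statement: str, flags=DEFAULT_TREE_FLAGS) -> str:
    """Append an OPTION clause enabling the given trace flags for one statement.

    Assumption: `statement` is a single T-SQL statement without its own
    OPTION clause. This helper does not parse T-SQL in any way.
    """
    body = statement.strip().rstrip(";")
    hints = ["RECOMPILE"]  # force a fresh compilation so the optimizer runs
    hints += [f"QUERYTRACEON {flag}" for flag in (REDIRECT_FLAG, *flags)]
    return f"{body}\nOPTION ({', '.join(hints)});"


print(with_trace_flags("SELECT name FROM sys.objects;"))
```

The same flags can alternatively be switched on for the whole session with `DBCC TRACEON`, which avoids rewriting the statement.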
This is a toy project I work on in my spare time, a byproduct of an effort to learn how the query optimizer works. It is not intended to be used in practical scenarios; that is a non-goal at this time.
As of now, the grammar is vastly incomplete and not well-defined, and, due to the lack of any official documentation or specification, everything is based on assumptions inferred directly from the available diagnostic tools and public knowledge.
T-SQL permits database object identifiers (called delimited identifiers) to contain virtually any character beyond alphanumerics ([a-zA-Z0-9]) and the underscore ([_]), provided that the identifier is enclosed in brackets (see documentation). For example:
CREATE FUNCTION dbo.[ &'""<>!@#$%^&*() ... dbo.[ abcd ]] IsDet IsNonDet IsNonDet IsDet IsDet ]
Query trees containing such objects are currently not supported by the parser because, unfortunately, these identifiers are printed unescaped in the diagnostic output:
LogOp_Project COL: Expr1000 [ Card=0 ]
    LogOp_ConstTableGet (1) [empty] [ Card=0 ]
    AncOp_PrjList
        AncOp_PrjEl COL: Expr1000
            ScaOp_Udf dbo. &'""<>!@#$%^&*() ... dbo.[ abcd ] IsDet IsNonDet IsNonDet IsDet IsDet IsNonDet
Supporting such identifiers would make the grammar definition significantly more complex than it is now, and I do not plan to support them at this time.
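The problem can be illustrated with a small Python sketch. It is not code from this project, and all function names in it are hypothetical. It relies on the T-SQL rule that a closing bracket inside a bracket-delimited identifier is written as `]]`, and it models the `ScaOp_Udf` line in a deliberately simplified way (a name followed by space-separated properties), which is an assumption about the dump format. With brackets the identifier can be read back exactly; without them, two different inputs yield the same line.

```python
def quote_identifier(name: str) -> str:
    """Bracket-delimit a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def read_quoted_identifier(text: str, start: int = 0):
    """Read one bracket-delimited identifier beginning at text[start].

    Returns (name, index just past the closing bracket).
    """
    if start >= len(text) or text[start] != "[":
        raise ValueError("expected '['")
    chars = []
    i = start + 1
    while i < len(text):
        if text[i] == "]":
            if text[i + 1 : i + 2] == "]":  # doubled bracket is a literal ']'
                chars.append("]")
                i += 2
                continue
            return "".join(chars), i + 1
        chars.append(text[i])
        i += 1
    raise ValueError("unterminated identifier")


def dump_unescaped(name: str, props) -> str:
    """Simplified, assumed model of how the dump prints a UDF node."""
    return f"ScaOp_Udf dbo.{name} " + " ".join(props)


# Escaped form: the name survives a round trip, even with ']' and keywords in it.
name = "f ] IsDet"
line = f"ScaOp_Udf dbo.{quote_identifier(name)} IsDet"
parsed, end = read_quoted_identifier(line, line.index("["))
assert parsed == name and line[end:] == " IsDet"

# Unescaped form: a function "f" with two properties and a function "f IsDet"
# with one property print identically, so a grammar cannot tell them apart.
assert dump_unescaped("f", ["IsDet", "IsDet"]) == dump_unescaped("f IsDet", ["IsDet"])
```

A grammar for the unescaped form would have to guess where the name ends, for example by treating the longest trailing run of known property keywords as properties, and that guess is wrong whenever the name itself ends with such a keyword.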
This project is licensed under the MIT license. See LICENSE for details.