42 changes: 42 additions & 0 deletions .github/BUG_FIXING_GUIDE.md
@@ -0,0 +1,42 @@
# Bug Fixing Guide for SqlScriptDOM

This guide provides a summary of the typical workflow for fixing a bug in the SqlScriptDOM parser, based on practical experience. For a more comprehensive overview of the project structure and code generation, please refer to the main [Copilot / AI instructions for SqlScriptDOM](copilot-instructions.md).

## Summary of the Bug-Fixing Workflow

The process of fixing a bug, especially one that involves adding new syntax, follows these general steps:

1. **Grammar Modification**:
    * Identify the correct grammar rule to modify in the `SqlScriptDom/Parser/TSql/*.g` files.
    * Apply the necessary changes to all relevant `.g` files, from the version where the syntax was introduced up to the latest version (e.g., `TSql130.g` through `TSql170.g` and `TSqlFabricDW.g`).

2. **Abstract Syntax Tree (AST) Update**:
    * If the new syntax requires a new AST node, edit `SqlScriptDom/Parser/TSql/Ast.xml`; a new enum member goes in the file that defines the enum. For example, adding a new operator like `NOT LIKE` required adding a `NotLike` member to the `BooleanComparisonType` enum in `SqlScriptDom/Parser/TSql/BooleanComparisonType.cs`.

3. **Script Generation Update**:
    * Update the script generator to handle the new AST node or enum. This typically involves modifying files in `SqlScriptDom/ScriptDom/SqlServer/ScriptGenerator/`. For the `NOT LIKE` example, this meant adding an entry to the `_booleanComparisonTypeGenerators` dictionary in `SqlScriptGeneratorVisitor.CommonPhrases.cs`.

4. **Build the Project**:
    * After making code changes, run a build to regenerate the parser and ensure everything compiles correctly:
      ```bash
      dotnet build
      ```

5. **Add a Unit Test**:
    * Create a new `.sql` file in `Test/SqlDom/TestScripts/` that contains the specific syntax for the new test case.

6. **Define the Test Case**:
    * Add a new `ParserTest` entry to the appropriate `Only<version>SyntaxTests.cs` files (e.g., `Only130SyntaxTests.cs`). This entry points to your new test script and defines the expected number of parsing errors for each SQL Server version.

7. **Generate and Verify Baselines**:
    This is a critical, multi-step process:
    * **a. Create Placeholder Baseline Files**: Before running the test, create empty or placeholder baseline files in the corresponding `Test/SqlDom/Baselines<version>/` directories. The filename must match the test script's filename.
    * **b. Run the Test to Get the Generated Script**: Run the specific test that you just added. It is *expected to fail* because the placeholder baseline will not match the script generated by the parser.
      ```bash
      # Example filter for running a specific test
      dotnet test --filter "FullyQualifiedName~YourTestMethodName"
      ```
    * **c. Update the Baseline Files**: Copy the "Actual" output from the test failure log. This is the correctly formatted script generated from the AST. Paste this content into all the baseline files you created in step 7a.
    * **d. Re-run the Tests**: Run the same test command again. This time, the tests should pass, confirming that the generated script matches the new baseline.
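
Step 7a can be scripted. The following is a minimal sketch, not part of the repository: `create_placeholder_baselines` is a hypothetical helper, and the version list passed to it is an assumption that should match the versions your test entry targets.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): create empty placeholder baselines
# for one test script in each Baselines<version> directory.
# Usage: create_placeholder_baselines <repo-root> <script-name> <version>...
create_placeholder_baselines() {
    local root="$1" script="$2"
    shift 2
    local version dir
    for version in "$@"; do
        dir="$root/Test/SqlDom/Baselines$version"
        mkdir -p "$dir"
        # Leave an existing baseline untouched so real content is never lost.
        if [ ! -e "$dir/$script" ]; then
            : > "$dir/$script"
            echo "created $dir/$script"
        fi
    done
}

# Example (run from the repo root):
# create_placeholder_baselines . CreateEventSessionNotLikePredicate.sql 130 140 150 160 170
```

Skipping files that already exist keeps the helper safe to re-run after step 7c has filled in the real baselines.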

By following these steps, you can ensure that new syntax is correctly parsed, represented in the AST, generated back into a script, and fully validated by the testing framework.
64 changes: 60 additions & 4 deletions .github/copilot-instructions.md
@@ -1,9 +1,65 @@
# Copilot / AI instructions for SqlScriptDOM

ScriptDom is a library for parsing and generating T-SQL scripts. It is primarily used by DacFx to build database projects, perform schema comparisons, and generate scripts for deployment.

T-SQL syntax definitions are defined in the .g files in SqlScriptDom/Parser/TSql/. The file names map to SQL Server versions, e.g. TSql170.g corresponds to the syntax definitions for SQL Server 2025, TSql160.g to SQL Server 2022, etc. Syntax for Azure SQL Database should always be based on the latest SQL Server version.
## Key points (quick read)
- Grammar files live in: `SqlScriptDom/Parser/TSql/` — each file corresponds to a SQL Server version (e.g. `TSql170.g` for 170 / SQL Server 2025).
- Grammar format: ANTLR v2. Generated C# lexer/parser code is produced during the build (see `GenerateFiles.props`).
- Build & tests: use the .NET SDK pinned in `global.json`. Typical commands from repo root:
  - `dotnet build -c Debug`
  - `dotnet test Test/SqlDom/UTSqlScriptDom.csproj -c Debug`
- To regenerate parser/token/AST sources explicitly, build the main project (generation targets are hooked into its build):
  - `dotnet build SqlScriptDom/Microsoft.SqlServer.TransactSql.ScriptDom.csproj -c Debug`
  - (or) `dotnet msbuild SqlScriptDom/Microsoft.SqlServer.TransactSql.ScriptDom.csproj -t:"GLexerParserCompile;GSqlTokenTypesCompile;CreateAST" -p:Configuration=Debug` (the target list is quoted so that the shell does not split the command at `;`)

## Why files are generated and where
- `SqlScriptDom/GenerateFiles.props` contains the MSBuild targets invoked during the library build:
  - `GSqlTokenTypesCompile` / `GLexerParserCompile` -> run ANTLR and post-process outputs (powershell/sed scripts)
  - `CreateAST` -> runs AstGen tool (from `tools/AstGen`) to generate AST visitor/fragment classes
  - `GenerateEverything` -> runs ScriptGenSettingsGenerator and TokenListGenerator
- The Antlr binary is downloaded to the path defined in `Directory.Build.props` (`AntlrLocation`) when the build runs (via the `InstallAntlr` target).
- Generated C# files are written to `$(CsGenIntermediateOutputPath)` (under `obj/...` by default). Do not hand-edit generated files — change the .g grammar or post-processing scripts instead.

## Important files and folders (read these first)
- `SqlScriptDom/Parser/TSql/*.g` — ANTLR v2 grammar files (TSql80..TSql170 etc.). Example: `TSql170.g` defines new-170 syntax.
- `SqlScriptDom/GenerateFiles.props` and `Directory.Build.props` — define code generation targets and antlr location.
- `SqlScriptDom/ParserPostProcessing.sed`, `LexerPostProcessing.sed`, `TSqlTokenTypes.ps1` — post-processing for generated C# sources and tokens.
- `tools/` — contains code generators used during build: `AstGen`, `ScriptGenSettingsGenerator`, `TokenListGenerator`.
- `Test/SqlDom/` — unit tests, baselines and test scripts. See `Only170SyntaxTests.cs`, `TestScripts/`, and `Baselines170/`.

## Developer workflow & conventions (typical change cycle)
1. Add/modify grammar rule(s) in the correct `TSql*.g` (pick the _version_ the syntax belongs to).
2. If tokens or token ordering change, update `TSqlTokenTypes.g` (and the sed/ps1 post-processors if necessary).
3. Rebuild the ScriptDom project to regenerate parser and AST (`dotnet build` will run generation). Use the targeted msbuild targets if you only want generation.
4. Add tests:
   - Put the input SQL in `Test/SqlDom/TestScripts/` (filename is case sensitive and used as an embedded resource).
   - Add/confirm baseline output in `Test/SqlDom/Baselines<version>/` (the UT project embeds these baselines as resources).
   - Update the appropriate `Only<version>SyntaxTests.cs` (e.g., `Only170SyntaxTests.cs`) by adding a `ParserTest170("MyNewTest.sql", ...)` entry. See `ParserTest.cs` and `ParserTestOutput.cs` for helper constructors and verification semantics.
5. Run `dotnet test Test/SqlDom/UTSqlScriptDom.csproj -c Debug` and iterate until tests pass.
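
When step 1 touches several grammar versions, it can help to confirm that every targeted `.g` file actually received the edit before rebuilding. The sketch below is one way to do that; `check_grammar_pattern` is a hypothetical helper, and the search string and file list in the example are assumptions taken from the `NOT LIKE` change.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): report which grammar files contain
# a fixed string. Returns non-zero if any listed file lacks it.
# Usage: check_grammar_pattern <grammar-dir> <fixed-string> <file>...
check_grammar_pattern() {
    local dir="$1" pattern="$2"
    shift 2
    local file missing=0
    for file in "$@"; do
        # -F treats the pattern as a literal string, not a regex.
        if grep -qF -- "$pattern" "$dir/$file"; then
            echo "ok $file"
        else
            echo "MISSING $file"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run from the repo root):
# check_grammar_pattern SqlScriptDom/Parser/TSql 'BooleanComparisonType.NotLike' \
#     TSql130.g TSql140.g TSql150.g TSql160.g TSql170.g TSqlFabricDW.g
```

A file that does not exist is also reported as `MISSING`, since `grep` fails on it.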

## Testing details and how tests assert correctness
- Tests run a full parse -> script generator -> reparse round-trip. Baseline comparison verifies pretty-printed generated scripts exactly match the stored baseline.
- Expected parse errors (where applicable) are verified by number and exact error messages; test helpers live in `ParserTest.cs`, `ParserTestOutput.cs`, and `ParserTestUtils.cs`.
- If a test fails due to mismatch in generated script, compare the generated output (the test harness logs it) against the baseline to spot formatting/structure differences.
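
Because the baseline comparison is exact, a plain unified diff is often enough to spot the difference. A minimal sketch, assuming you have saved the generated script from the test log into a file yourself; `compare_baseline` and the file names are hypothetical and not part of the test harness.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): show a unified diff between a stored
# baseline and a generated script that was saved from the test log.
# Usage: compare_baseline <baseline-file> <actual-file>
compare_baseline() {
    local baseline="$1" actual="$2"
    # diff exits 0 when the files are identical; any other status
    # (difference or unreadable file) is treated as a mismatch here.
    if diff -u -- "$baseline" "$actual"; then
        echo "baseline matches"
    else
        echo "baseline differs (see diff above)" >&2
        return 1
    fi
}

# Example (actual.sql is a file you create from the "Actual" log output):
# compare_baseline Test/SqlDom/Baselines130/CreateEventSessionNotLikePredicate.sql actual.sql
```

Whitespace is deliberately not ignored, since a whitespace-only difference would still fail the exact-match check described above.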

## Bug Fixing and Baseline Generation
For a practical guide on fixing bugs, including the detailed workflow for generating test baselines, see the [Bug Fixing Guide](BUG_FIXING_GUIDE.md).

## Editing generated outputs, debugging generation
- Never edit generated files permanently (they live under `obj/...`/CsGenIntermediateOutputPath). Instead change:
  - `.g` grammar files
  - post-processing scripts (`*.ps1`/`*.sed`)
  - AST XML in `SqlScriptDom/Parser/TSql/Ast.xml` if AST node shapes need to change (used by `tools/AstGen`).
- To see antlr output/errors, force verbose generation by setting MSBuild property `OutputErrorInLexerParserCompile=true` on the command line (e.g. `dotnet msbuild -t:GLexerParserCompile -p:OutputErrorInLexerParserCompile=true`).
- If the antlr download fails during build, manually download `antlr-2.7.5.jar` (for non-Windows) or `.exe` (for Windows) and place it at the location defined in `Directory.Build.props` or override `AntlrLocation` when invoking msbuild.


The grammar files are in ANTLR v2 format. C# code is generated from these grammar files as part of the build process.
## Patterns & code style to follow (examples you will see)
- Grammar rule pattern: `ruleName returns [Type vResult = this.FragmentFactory.CreateFragment<Type>()] { ... } : ( alternatives ) ;` — this pattern initializes an AST fragment via FragmentFactory.
- Parser-generated code frequently uses `Match(<token>, CodeGenerationSupporter.<Symbol>)` and `ThrowParseErrorException("SQLxxxx", ...)` for diagnostics.
- The codebase prefers using the factory and fragment visitors for AST creation and script generation. Look at `ScriptDom/SqlServer/ScriptGenerator` for script generation patterns.

For each new syntax definition, ScriptDom needs to be able to parse it successfully, and roundtrip back to the original script via the script generator.
## Grammar Gotchas & Common Pitfalls
- **Operator vs. Function-Style Predicates:** Be careful to distinguish between standard T-SQL operators (like `NOT LIKE`, `>`, `=`) and the function-style predicates used in some contexts (like `package.equals(...)` in `CREATE EVENT SESSION`). For example, `NOT LIKE` in an event session's `WHERE` clause is a standard comparison operator, not a function call. Always verify the exact T-SQL syntax before modifying the grammar.
- **Logical `NOT` vs. Compound Operators:** The grammar handles the logical `NOT` operator (e.g., `WHERE NOT (condition)`) in a general way, often in a `booleanExpressionUnary` rule. This is distinct from compound operators like `NOT LIKE` or `NOT IN`, which are typically parsed as a single unit within a comparison rule. Don't assume that because `NOT` is supported, `NOT LIKE` will be automatically supported in all predicate contexts.

Changes need to have accompanying tests in Only170SyntaxTests.cs or the one for its respective version. The test framework should already verify the parser and script generator; you just need to add the test scripts to TestScripts and corresponding Baselines folder. Older syntaxes should be supported unless explicitly stated otherwise.
5 changes: 5 additions & 0 deletions SqlScriptDom/Parser/TSql/BooleanComparisonType.cs
@@ -66,5 +66,10 @@ public enum BooleanComparisonType
/// The distinct predicate, IS NOT DISTINCT FROM.
/// </summary>
IsNotDistinctFrom = 12,

/// <summary>
/// The NOT LIKE predicate
/// </summary>
NotLike = 13,
}
}
1 change: 1 addition & 0 deletions SqlScriptDom/Parser/TSql/CodeGenerationSupporter.cs
@@ -556,6 +556,7 @@ internal static class CodeGenerationSupporter
internal const string Level3 = "LEVEL_3";
internal const string Level4 = "LEVEL_4";
internal const string Library = "LIBRARY";
internal const string Like = "LIKE";
internal const string LifeTime = "LIFETIME";
internal const string Linux = "LINUX";
internal const string List = "LIST";
18 changes: 8 additions & 10 deletions SqlScriptDom/Parser/TSql/TSql130.g
@@ -7289,16 +7289,14 @@ eventDeclarationComparisonPredicate [BooleanComparisonExpression vParent, EventS
BooleanComparisonType vType = BooleanComparisonType.Equals;
ScalarExpression eventValue;
}
: vType = comparisonOperator eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;

dropEventDeclarationList [AlterEventSessionStatement vParent]
: (vType = comparisonOperator | {LA(2) == Like}? tNot:Not tLike:Like { vType = BooleanComparisonType.NotLike; }) eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;dropEventDeclarationList [AlterEventSessionStatement vParent]
{
EventSessionObjectName vDropEventDeclaration;
}
18 changes: 8 additions & 10 deletions SqlScriptDom/Parser/TSql/TSql140.g
@@ -7675,16 +7675,14 @@ eventDeclarationComparisonPredicate [BooleanComparisonExpression vParent, EventS
BooleanComparisonType vType = BooleanComparisonType.Equals;
ScalarExpression eventValue;
}
: vType = comparisonOperator eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;

dropEventDeclarationList [AlterEventSessionStatement vParent]
: (vType = comparisonOperator | {LA(2) == Like}? tNot:Not tLike:Like { vType = BooleanComparisonType.NotLike; }) eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;dropEventDeclarationList [AlterEventSessionStatement vParent]
{
EventSessionObjectName vDropEventDeclaration;
}
18 changes: 8 additions & 10 deletions SqlScriptDom/Parser/TSql/TSql150.g
@@ -8227,16 +8227,14 @@ eventDeclarationComparisonPredicate [BooleanComparisonExpression vParent, EventS
BooleanComparisonType vType = BooleanComparisonType.Equals;
ScalarExpression eventValue;
}
: vType = comparisonOperator eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;

dropEventDeclarationList [AlterEventSessionStatement vParent]
: (vType = comparisonOperator | {LA(2) == Like}? tNot:Not tLike:Like { vType = BooleanComparisonType.NotLike; }) eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;dropEventDeclarationList [AlterEventSessionStatement vParent]
{
EventSessionObjectName vDropEventDeclaration;
}
18 changes: 8 additions & 10 deletions SqlScriptDom/Parser/TSql/TSql160.g
@@ -8252,16 +8252,14 @@ eventDeclarationComparisonPredicate [BooleanComparisonExpression vParent, EventS
BooleanComparisonType vType = BooleanComparisonType.Equals;
ScalarExpression eventValue;
}
: vType = comparisonOperator eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;

dropEventDeclarationList [AlterEventSessionStatement vParent]
: (vType = comparisonOperator | {LA(2) == Like}? tNot:Not tLike:Like { vType = BooleanComparisonType.NotLike; }) eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;dropEventDeclarationList [AlterEventSessionStatement vParent]
{
EventSessionObjectName vDropEventDeclaration;
}
18 changes: 8 additions & 10 deletions SqlScriptDom/Parser/TSql/TSql170.g
@@ -8278,16 +8278,14 @@ eventDeclarationComparisonPredicate [BooleanComparisonExpression vParent, EventS
BooleanComparisonType vType = BooleanComparisonType.Equals;
ScalarExpression eventValue;
}
: vType = comparisonOperator eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;

dropEventDeclarationList [AlterEventSessionStatement vParent]
: (vType = comparisonOperator | {LA(2) == Like}? tNot:Not tLike:Like { vType = BooleanComparisonType.NotLike; }) eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;dropEventDeclarationList [AlterEventSessionStatement vParent]
{
EventSessionObjectName vDropEventDeclaration;
}
18 changes: 8 additions & 10 deletions SqlScriptDom/Parser/TSql/TSqlFabricDW.g
@@ -8252,16 +8252,14 @@ eventDeclarationComparisonPredicate [BooleanComparisonExpression vParent, EventS
BooleanComparisonType vType = BooleanComparisonType.Equals;
ScalarExpression eventValue;
}
: vType = comparisonOperator eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;

dropEventDeclarationList [AlterEventSessionStatement vParent]
: (vType = comparisonOperator | {LA(2) == Like}? tNot:Not tLike:Like { vType = BooleanComparisonType.NotLike; }) eventValue = eventDeclarationValue
{
vSourceDeclaration.Value = vSource;
vParent.FirstExpression = vSourceDeclaration;
vParent.ComparisonType = vType;
vParent.SecondExpression = eventValue;
}
;dropEventDeclarationList [AlterEventSessionStatement vParent]
{
EventSessionObjectName vDropEventDeclaration;
}
@@ -239,6 +239,9 @@ protected void GenerateByPassword(Literal password)
new KeywordGenerator(TSqlTokenType.MultiplyEquals) }},
{ BooleanComparisonType.RightOuterJoin, new List<TokenGenerator>() {
new KeywordGenerator(TSqlTokenType.RightOuterJoin) }},
{ BooleanComparisonType.NotLike, new List<TokenGenerator>() {
new KeywordGenerator(TSqlTokenType.Not, true),
new KeywordGenerator(TSqlTokenType.Like) }},
{ BooleanComparisonType.IsDistinctFrom, new List<TokenGenerator>() {
new KeywordGenerator(TSqlTokenType.Is, true),
new KeywordGenerator(TSqlTokenType.Distinct, true),
20 changes: 20 additions & 0 deletions Test/SqlDom/Baselines130/CreateEventSessionNotLikePredicate.sql
@@ -0,0 +1,20 @@
CREATE EVENT SESSION [test_not_like] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(
ACTION (sqlserver.sql_text)
WHERE ([sqlserver].[like_i_sql_unicode_string] ([sqlserver].[sql_text], N'%foo%')
AND [sqlserver].[client_app_name] NOT LIKE N'SQLAgent%')
)
ADD TARGET package0.event_file
(
SET filename = N'test_not_like.xel'
)
WITH (
MAX_MEMORY = 4096 KB,
EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY = 30 SECONDS,
MAX_EVENT_SIZE = 0 KB,
MEMORY_PARTITION_MODE = NONE,
TRACK_CAUSALITY = OFF,
STARTUP_STATE = OFF
);
20 changes: 20 additions & 0 deletions Test/SqlDom/Baselines140/CreateEventSessionNotLikePredicate.sql
@@ -0,0 +1,20 @@
CREATE EVENT SESSION [test_not_like] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(
ACTION (sqlserver.sql_text)
WHERE ([sqlserver].[like_i_sql_unicode_string] ([sqlserver].[sql_text], N'%foo%')
AND [sqlserver].[client_app_name] NOT LIKE N'SQLAgent%')
)
ADD TARGET package0.event_file
(
SET filename = N'test_not_like.xel'
)
WITH (
MAX_MEMORY = 4096 KB,
EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY = 30 SECONDS,
MAX_EVENT_SIZE = 0 KB,
MEMORY_PARTITION_MODE = NONE,
TRACK_CAUSALITY = OFF,
STARTUP_STATE = OFF
);