github · aaronpowell · May 19, 2026 · May 16, 2026 · May 18, 2026
@@ -335,6 +335,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
 | [spring-boot-testing](../skills/spring-boot-testing/SKILL.md)<br />`gh skills install github/awesome-copilot spring-boot-testing` | Expert Spring Boot 4 testing specialist that selects the best Spring Boot testing techniques for your situation with Junit 6 and AssertJ. | `references/assertj-basics.md`<br />`references/assertj-collections.md`<br />`references/context-caching.md`<br />`references/datajpatest.md`<br />`references/instancio.md`<br />`references/mockitobean.md`<br />`references/mockmvc-classic.md`<br />`references/mockmvc-tester.md`<br />`references/restclienttest.md`<br />`references/resttestclient.md`<br />`references/sb4-migration.md`<br />`references/test-slices-overview.md`<br />`references/testcontainers-jdbc.md`<br />`references/webmvctest.md` |
 | [sql-code-review](../skills/sql-code-review/SKILL.md)<br />`gh skills install github/awesome-copilot sql-code-review` | Universal SQL code review assistant that performs comprehensive security, maintainability, and code quality analysis across all SQL databases (MySQL, PostgreSQL, SQL Server, Oracle). Focuses on SQL injection prevention, access control, code standards, and anti-pattern detection. Complements SQL optimization prompt for complete development coverage. | None |
 | [sql-optimization](../skills/sql-optimization/SKILL.md)<br />`gh skills install github/awesome-copilot sql-optimization` | Universal SQL performance optimization assistant for comprehensive query tuning, indexing strategies, and database performance analysis across all SQL databases (MySQL, PostgreSQL, SQL Server, Oracle). Provides execution plan analysis, pagination optimization, batch operations, and performance monitoring guidance. | None |
+| [sql-server-table-reconciliation](../skills/sql-server-table-reconciliation/SKILL.md)<br />`gh skills install github/awesome-copilot sql-server-table-reconciliation` | Use when: comparing SQL Server tables across instances, data migration validation, ETL verification, row mismatch detection, schema drift, reconciliation report, production vs staging comparison. Uses mssql-python driver with Apache Arrow for fast columnar data transfer and comparison. | `scripts/reconcile.py` |
 | [ssma-console](../skills/ssma-console/SKILL.md)<br />`gh skills install github/awesome-copilot ssma-console` | Use when: SSMA console operations — create project, generate assessment report, convert schema, migrate data, Oracle to SQL Server migration, schema conversion, data migration | None |
 | [structured-autonomy-generate](../skills/structured-autonomy-generate/SKILL.md)<br />`gh skills install github/awesome-copilot structured-autonomy-generate` | Structured Autonomy Implementation Generator Prompt | None |
 | [structured-autonomy-implement](../skills/structured-autonomy-implement/SKILL.md)<br />`gh skills install github/awesome-copilot structured-autonomy-implement` | Structured Autonomy Implementation Prompt | None |

@@ -0,0 +1,158 @@
+---
+name: sql-server-table-reconciliation
+description: "Use when: comparing SQL Server tables across instances, data migration validation, ETL verification, row mismatch detection, schema drift, reconciliation report, production vs staging comparison. Uses mssql-python driver with Apache Arrow for fast columnar data transfer and comparison."
+---
+
+# SQL Server Table Reconciliation
+
+Compare identical tables across two SQL Server instances using Python with `mssql-python` driver and Apache Arrow. Detect missing rows, column mismatches, schema drift, and produce a reconciliation report.
+
+## Workflow
+
+1. Collect connection details for source and target
+2. Identify primary key / composite key
+3. Detect schema differences
+4. Extract data via Arrow for efficient columnar transfer
+5. Compare rows and columns
+6. Generate reconciliation report
+
+## Collect Inputs
+
+| Parameter | Required | Description |
+|-----------|----------|-------------|
+| Source server | Yes | Source SQL Server (e.g. `prod-server.database.windows.net`) |
+| Source database | Yes | Source database name |
+| Target server | Yes | Target SQL Server (e.g. `staging-server.database.windows.net`) |
+| Target database | Yes | Target database name |
+| Tables | Yes | Comma-separated `schema.table` names, or `schema.*` wildcard (e.g. `dbo.Orders,dbo.Items` or `dbo.*`) |
+| Auth mode | Yes | `sql` (user/password) or `entra` (Azure AD/token) |
+| Primary key | Auto-detect | Column(s) forming the row identity. Auto-detect from metadata if not provided. |
+| Columns to compare | All | Subset of columns, or all non-PK columns |
+| Chunk size | `100000` | Rows per batch for large tables |
+| Output format | `console` | `console`, `csv`, `parquet`, or `json` |
+
+## Bundled Script
+
+The reconciliation logic is provided as a standalone script at `scripts/reconcile.py`. Invoke it with the appropriate arguments based on user inputs:
+
+```bash
+python scripts/reconcile.py \
+    --source-server <source_server> \
+    --source-database <source_database> \
+    --target-server <target_server> \
+    --target-database <target_database> \
+    --tables "<table_spec>" \
+    --auth <sql|entra> \
+    --chunk-size <chunk_size> \
+    --output <console|csv|json>
+```
+
+### Optional arguments
+
+| Argument | Description |
+|----------|-------------|
+| `--primary-key` | Comma-separated PK column(s). Omit to auto-detect. |
+| `--columns` | Comma-separated columns to compare. Omit to compare all non-PK columns. |
+
+### Example invocations
+
+Single table with SQL auth:
+
+```bash
+python scripts/reconcile.py \
+    --source-server prod-server.database.windows.net \
+    --source-database ProdDB \
+    --target-server staging-server.database.windows.net \
+    --target-database StagingDB \
+    --tables "dbo.Orders" \
+    --auth sql \
+    --output console
+```
+
+Wildcard with Entra auth and CSV output:
+
+```bash
+python scripts/reconcile.py \
+    --source-server prod-server.database.windows.net \
+    --source-database ProdDB \
+    --target-server staging-server.database.windows.net \
+    --target-database StagingDB \
+    --tables "dbo.*" \
+    --auth entra \
+    --output csv
+```
+
+### Prerequisites
+
+Install required packages before running:
+
+```bash
+pip install mssql-python pyarrow pandas
+```
+
+## Comparison Rules
+
+- **Normalize types before comparing**: cast decimals to same precision, trim strings, normalize datetime to UTC
+- **NULL handling**: `NULL == NULL` is considered a match (both sides missing = no diff)
+- **Ignore row order**: always compare by PK join, never positional
+- **Large tables**: chunk extraction with `OFFSET/FETCH` or `ROW_NUMBER()` partitioning
+
+## Hash-Based Optimization (for large tables)
+
+When table has >1M rows, generate a hash pre-check:
+
+```sql
+SELECT {pk_cols},
+       HASHBYTES('SHA2_256', CONCAT_WS('|', col1, col2, ...)) AS row_hash
+FROM {table}
+```
+
+Compare hashes first; only fetch full rows for mismatched hashes. This reduces data transfer significantly.
+
+## Report Format
+
+```
+Reconciling dbo.EMPLOYEES...
+Reconciling dbo.DEPARTMENTS...
+Reconciling dbo.JOBS...
+
+--- dbo.EMPLOYEES ---
+  Source: 107  Target: 107
+  Missing: 0  Extra: 0  Mismatches: 0
+  Result: ✓ IDENTICAL
+
+--- dbo.DEPARTMENTS ---
+  Source: 27  Target: 27
+  Missing: 0  Extra: 0  Mismatches: 3
+  Result: ✗ DIFFERENCES FOUND
+
+--- dbo.JOBS ---
+  Source: 19  Target: 19
+  Missing: 0  Extra: 0  Mismatches: 0
+  Result: ✓ IDENTICAL
+
+=== Summary: 2 passed, 1 failed, 0 skipped / 3 tables ===
+```
+
+When a single table is provided, include full detail (schema drift, sample rows, mismatches). When multiple tables, use the compact per-table format above with full detail only for tables with `FAIL` status.
+
+## Performance Considerations
+
+| Scenario | Strategy |
+|----------|----------|
+| < 100K rows | Single Arrow fetch, in-memory pandas compare |
+| 100K–1M rows | Chunked extraction (100K batches), streaming comparison |
+| > 1M rows | Hash pre-check → only fetch mismatched rows |
+| Wide tables (100+ cols) | Compare PK + hash first, drill into specific columns on mismatch |
+| Network-constrained | Use Arrow columnar format (10-50x smaller than row-by-row) |
+
+## Constraints
+
+- Always use `mssql-python` driver (not pyodbc, pymssql)
+- Always use Apache Arrow via cursor (`cursor.arrow()`) for data extraction
+- Connection MUST use connection string format, not keyword arguments (kwargs like `encrypt=True` throw errors)
+- Never compare without identifying PK first — ask user if auto-detect fails
+- Handle connection failures gracefully with retry logic
+- **Never hardcode credentials** in generated scripts — use `os.environ` / `getpass` (env vars: `MSSQL_USER`, `MSSQL_PASSWORD`)
+- Do not print credentials in output or logs
+- Use parameterized queries (`?` placeholders) for metadata lookups — never f-string interpolate user input into SQL