Skip to content

fuwasegu/php-sql-snapshot

Repository files navigation

PHP SQL Snapshot

PHPUnit extension for semantic SQL query testing using AST (Abstract Syntax Tree) comparison. Test SQL queries by their meaning, not their formatting.

The Problem

Traditional SQL testing using string comparison fails when queries are semantically identical but formatted differently:

// These queries are semantically identical
$expected = "SELECT id, name FROM users WHERE active = 1 AND role = 'admin'";
$actual   = "SELECT name, id FROM users WHERE role = 'admin' AND active = 1";

// String comparison fails!
$this->assertEquals($expected, $actual); // FAILS

The Solution

This library parses SQL queries into AST and compares them semantically, ignoring:

  • Column order in SELECT clauses
  • Condition order in WHERE clauses (within AND groups)
  • Whitespace and formatting differences
  • Quote styles around identifiers
use PhpSqlSnapshot\PHPUnit\SqlAssertionTrait;

class MyDatabaseTest extends TestCase
{
    use SqlAssertionTrait;

    public function testQueryGeneration(): void
    {
        $expected = "SELECT id, name FROM users WHERE active = 1 AND role = 'admin'";
        $actual = "SELECT name, id FROM users WHERE role = 'admin' AND active = 1";

        // AST comparison succeeds!
        $this->assertMySqlEquals($expected, $actual);
    }
}

Features

  • Database Agnostic: Supports MySQL and PostgreSQL
  • Order Independent: Ignores column and condition ordering
  • Format Independent: Ignores whitespace, newlines, and formatting
  • Type Safe: Strongly typed PHP 8.2+ API
  • PHPUnit Native: Works seamlessly with PHPUnit assertions
  • Clear Error Messages: Detailed diff output when queries don't match
  • Comprehensive Dialect Support: JSON functions, CTEs, UNION, subqueries, and more

Requirements

  • PHP 8.2 or higher (tested on 8.2, 8.3, 8.4, 8.5)
  • PHPUnit 10.0 or higher

Installation

Install via Composer:

composer require --dev fuwasegu/php-sql-snapshot

Usage

Basic Usage

<?php
use PHPUnit\Framework\TestCase;
use PhpSqlSnapshot\PHPUnit\SqlAssertionTrait;

class MyDatabaseTest extends TestCase
{
    use SqlAssertionTrait;

    public function testMySqlQuery(): void
    {
        $expected = "SELECT id, name FROM users WHERE active = 1";
        $actual = "SELECT name, id FROM users WHERE active = 1";

        $this->assertMySqlEquals($expected, $actual);
    }

    public function testPostgreSqlQuery(): void
    {
        $expected = "SELECT id, name FROM users WHERE active = true";
        $actual = "SELECT name, id FROM users WHERE active = true";

        $this->assertPostgreSqlEquals($expected, $actual);
    }

    public function testWithCustomMessage(): void
    {
        $expected = "SELECT * FROM users";
        $actual = "SELECT id, name FROM users";

        $this->assertMySqlEquals(
            $expected,
            $actual,
            'Query generation failed for user list'
        );
    }
}

Available Assertions

assertSqlEquals(string $expected, string $actual, string $databaseType, string $message = '')

Generic SQL assertion that requires specifying the database type:

$this->assertSqlEquals($expected, $actual, 'mysql');
$this->assertSqlEquals($expected, $actual, 'postgresql');

assertMySqlEquals(string $expected, string $actual, string $message = '')

MySQL-specific assertion:

$this->assertMySqlEquals($expected, $actual);
$this->assertMySqlEquals($expected, $actual, 'Custom failure message');

assertPostgreSqlEquals(string $expected, string $actual, string $message = '')

PostgreSQL-specific assertion:

$this->assertPostgreSqlEquals($expected, $actual);
$this->assertPostgreSqlEquals($expected, $actual, 'Custom failure message');

Use Cases

Perfect for testing:

  • Query Builders: Assert that your query builder generates correct SQL
  • ORMs: Verify ORM-generated queries match expectations
  • Database Migrations: Ensure migration queries are correct
  • Dynamic SQL Generation: Test SQL assembled from user input or configuration
  • Code Refactoring: Ensure refactored query logic produces identical SQL

Dialect Support

This library supports both standard SQL and database-specific dialect features.

MySQL Dialect Features

Feature Status Example
Backtick identifiers ✅ Supported SELECT `id` FROM `users`
LIMIT offset, count ✅ Supported LIMIT 10, 20
IF() function ✅ Supported IF(active = 1, "yes", "no")
IFNULL() function ✅ Supported IFNULL(name, "unknown")
GROUP_CONCAT() ✅ Supported GROUP_CONCAT(name)
DATE_FORMAT() ✅ Supported DATE_FORMAT(created_at, "%Y-%m-%d")
CONCAT_WS() ✅ Supported CONCAT_WS("-", first_name, last_name)
ON DUPLICATE KEY UPDATE ✅ Supported INSERT ... ON DUPLICATE KEY UPDATE
JSON functions (JSON_EXTRACT, JSON_OBJECT, etc.) ✅ Supported JSON_EXTRACT(data, "$.name")
JSON operators (->, ->>) ✅ Supported data->"$.name", data->>"$.name"
WITH clause (CTE) ✅ Supported WITH cte AS (SELECT ...) SELECT * FROM cte
WITH RECURSIVE ✅ Supported WITH RECURSIVE cte AS (...)
REPLACE INTO ❌ Not supported -
DDL statements (CREATE, ALTER, etc.) ❌ Not supported -

PostgreSQL Dialect Features

Feature Status Example
Double quote identifiers ✅ Supported SELECT "id" FROM "users"
RETURNING clause ✅ Supported INSERT ... RETURNING id
DISTINCT ON ✅ Supported DISTINCT ON (category)
:: cast operator ✅ Supported id::text
ILIKE operator ✅ Supported name ILIKE '%test%'
|| concatenation ✅ Supported first_name || ' ' || last_name
Regex operators (~, ~*, !~, !~*) ✅ Supported name ~ '^[A-Z]'
Array literals ✅ Supported ARRAY[1, 2, 3]
Array subscript ✅ Supported tags[1]
JSON/JSONB operators (->, ->>, #>, #>>) ✅ Supported data->'name', data#>'{address,city}'
JSON/JSONB functions (json_agg, jsonb_build_object, etc.) ✅ Supported jsonb_build_object('id', id)
JSON operators (@>, ?, ?|, ?&) ✅ Supported tags @> '["admin"]', data ? 'name'
WITH clause (CTE) ✅ Supported WITH cte AS (SELECT ...) SELECT * FROM cte
WITH RECURSIVE ✅ Supported WITH RECURSIVE cte AS (...)
WITH ... MATERIALIZED ✅ Supported WITH cte AS MATERIALIZED (...)
WITH ... NOT MATERIALIZED ✅ Supported WITH cte AS NOT MATERIALIZED (...)
DDL statements (CREATE, ALTER, etc.) ❌ Not supported -

Common SQL Features

Feature MySQL PostgreSQL Example
LIMIT ... OFFSET LIMIT 10 OFFSET 20
Boolean literals WHERE active = true
IS NULL / IS NOT NULL WHERE deleted_at IS NULL
IN / NOT IN WHERE role IN ('admin', 'editor')
BETWEEN / NOT BETWEEN WHERE age BETWEEN 18 AND 65
LIKE / NOT LIKE WHERE name LIKE '%test%'
CASE WHEN CASE WHEN active = 1 THEN 'yes' END
JOINs (INNER, LEFT, RIGHT) FROM users u JOIN orders o ON ...
Aggregate functions COUNT(*), MAX(), MIN(), AVG(), SUM()
HAVING clause GROUP BY role HAVING COUNT(*) > 5
UNION / UNION ALL SELECT ... UNION SELECT ...
Subqueries / Nested queries FROM (SELECT ...) AS sub, WHERE id IN (SELECT ...)
WITH clause (CTE) WITH cte AS (SELECT ...) SELECT * FROM cte
WITH RECURSIVE WITH RECURSIVE cte AS (...) SELECT * FROM cte

Parser Architecture

The library uses greenlion/php-sql-parser for both MySQL and PostgreSQL, which supports standard SQL syntax. Database-specific parsers are kept separate to allow for future enhancements of dialect-specific features.

⚠️ Security Warning

This library is designed for TESTING PURPOSES ONLY.

  • DO NOT use it to validate or sanitize user input in production code
  • DO NOT use it as a SQL injection prevention mechanism
  • DO NOT pass untrusted user input directly to the parsers

This library parses SQL queries for comparison in automated tests. It is not designed for security-critical operations.

What Gets Normalized?

The library normalizes the following aspects of SQL queries:

1. SELECT Clause Ordering

// These are considered equal
"SELECT id, name, email FROM users"
"SELECT email, id, name FROM users"

2. WHERE Clause Condition Ordering (within AND groups)

// These are considered equal
"SELECT * FROM users WHERE active = 1 AND role = 'admin' AND age > 18"
"SELECT * FROM users WHERE role = 'admin' AND age > 18 AND active = 1"

3. Whitespace and Formatting

// These are considered equal
"SELECT id FROM users WHERE active = 1"
"SELECT   id   FROM   users   WHERE   active = 1"
"SELECT id\nFROM users\nWHERE active = 1"

4. Identifier Quoting

// These are considered equal
"SELECT id FROM users"
"SELECT `id` FROM `users`"
"SELECT \"id\" FROM \"users\""

Important Notes

  • OR Logic: The library preserves OR precedence. Conditions separated by OR are NOT reordered.
  • JOIN Clauses: Currently, JOIN order is preserved (not normalized).
  • Subqueries: Subqueries are compared recursively with the same normalization rules.

Complex Examples

Multi-Condition Queries

// These queries are considered equal
$expected = "SELECT u.id, u.name, u.email
             FROM users u
             WHERE u.active = 1
             AND u.age >= 18
             AND u.role = 'admin'
             ORDER BY u.name ASC";

$actual = "SELECT u.email, u.id, u.name
           FROM users u
           WHERE u.role = 'admin'
           AND u.active = 1
           AND u.age >= 18
           ORDER BY u.name";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

Queries with Functions

// Aggregate functions - column order doesn't matter
$expected = "SELECT COUNT(*), MAX(age), MIN(age), AVG(score) FROM users";
$actual   = "SELECT AVG(score), MIN(age), MAX(age), COUNT(*) FROM users";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

Complex WHERE with AND/OR

// AND conditions are reordered, but OR conditions are preserved
$expected = "SELECT id FROM users
             WHERE active = 1
             AND (role = 'admin' OR role = 'editor')";

$actual = "SELECT id FROM users
           WHERE (role = 'admin' OR role = 'editor')
           AND active = 1";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

Queries That Should Fail

// Different values - should NOT match
$expected = "SELECT id FROM users WHERE age > 18";
$actual   = "SELECT id FROM users WHERE age > 21";

$this->assertMySqlEquals($expected, $actual); // ❌ FAILS with clear diff

UNION Queries

// UNION with reordered columns
$expected = "SELECT id, name FROM users
             UNION
             SELECT id, name FROM admins";

$actual = "SELECT name, id FROM users
           UNION
           SELECT name, id FROM admins";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// UNION ALL with WHERE clause reordering
$expected = "SELECT id FROM users WHERE active = 1 AND role = 'admin'
             UNION ALL
             SELECT id FROM admins WHERE active = 1 AND role = 'admin'";

$actual = "SELECT id FROM users WHERE role = 'admin' AND active = 1
           UNION ALL
           SELECT id FROM admins WHERE role = 'admin' AND active = 1";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// UNION with different query order
$expected = "SELECT id FROM users
             UNION SELECT id FROM admins
             UNION SELECT id FROM moderators";

$actual = "SELECT id FROM moderators
           UNION SELECT id FROM users
           UNION SELECT id FROM admins";

// ✅ PASSES - UNION queries are automatically sorted for comparison
$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// UNION with ORDER BY and LIMIT
$expected = "SELECT id FROM users UNION SELECT id FROM admins ORDER BY id LIMIT 10";
$actual = "SELECT id FROM users UNION SELECT id FROM admins ORDER BY id LIMIT 10";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

Nested / Subqueries

// Nested subquery in FROM clause
$expected = "SELECT * FROM (SELECT id, name FROM users WHERE active = 1) AS active_users";
$actual = "SELECT * FROM (SELECT name, id FROM users WHERE active = 1) AS active_users";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// Nested subquery in WHERE clause
$expected = "SELECT id FROM users
             WHERE id IN (SELECT user_id FROM orders WHERE total > 100)";

$actual = "SELECT id FROM users
           WHERE id IN (SELECT user_id FROM orders WHERE total > 100)";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// Double nested subqueries
$expected = "SELECT * FROM (
                 SELECT * FROM (
                     SELECT id FROM users WHERE active = 1
                 ) AS inner_sub
             ) AS outer_sub";

$actual = "SELECT * FROM (
               SELECT * FROM (
                   SELECT id FROM users WHERE active = 1
               ) AS inner_sub
           ) AS outer_sub";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

WITH Clause (Common Table Expressions)

// Simple CTE
$expected = "WITH active_users AS (SELECT id, name FROM users WHERE active = 1)
             SELECT * FROM active_users";

$actual = "WITH active_users AS (SELECT name, id FROM users WHERE active = 1)
           SELECT * FROM active_users";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// Multiple CTEs
$expected = "WITH
               admins AS (SELECT id FROM users WHERE role = 'admin'),
               editors AS (SELECT id FROM users WHERE role = 'editor')
             SELECT * FROM admins UNION SELECT * FROM editors";

$actual = "WITH
             editors AS (SELECT id FROM users WHERE role = 'editor'),
             admins AS (SELECT id FROM users WHERE role = 'admin')
           SELECT * FROM editors UNION SELECT * FROM admins";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// Recursive CTE
$expected = "WITH RECURSIVE numbers AS (
               SELECT 1 AS n
               UNION ALL
               SELECT n + 1 FROM numbers WHERE n < 5
             )
             SELECT * FROM numbers";

$actual = "WITH RECURSIVE numbers AS (
             SELECT 1 AS n
             UNION ALL
             SELECT n + 1 FROM numbers WHERE n < 5
           )
           SELECT * FROM numbers";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// PostgreSQL MATERIALIZED hint
$expected = "WITH cte AS MATERIALIZED (SELECT id FROM users)
             SELECT * FROM cte";

$actual = "WITH cte AS MATERIALIZED (SELECT id FROM users)
           SELECT * FROM cte";

$this->assertPostgreSqlEquals($expected, $actual); // ✅ PASSES

JSON Functions

// MySQL JSON operators
$expected = "SELECT data->\"$.name\" FROM users WHERE data->>\"$.active\" = 'true'";
$actual = "SELECT data->\"$.name\" FROM users WHERE data->>\"$.active\" = 'true'";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES

// MySQL JSON functions
$expected = "SELECT JSON_EXTRACT(data, \"$.name\"), JSON_OBJECT(\"id\", id) FROM users";
$actual = "SELECT JSON_OBJECT(\"id\", id), JSON_EXTRACT(data, \"$.name\") FROM users";

$this->assertMySqlEquals($expected, $actual); // ✅ PASSES - column order normalized

// PostgreSQL JSONB operators
$expected = "SELECT data->'name', data#>'{address,city}' FROM users WHERE tags @> '[\"admin\"]'";
$actual = "SELECT data#>'{address,city}', data->'name' FROM users WHERE tags @> '[\"admin\"]'";

$this->assertPostgreSqlEquals($expected, $actual); // ✅ PASSES

// PostgreSQL JSON functions
$expected = "SELECT json_agg(name), jsonb_build_object('id', id, 'name', name) FROM users";
$actual = "SELECT jsonb_build_object('id', id, 'name', name), json_agg(name) FROM users";

$this->assertPostgreSqlEquals($expected, $actual); // ✅ PASSES

Error Messages

When queries don't match, you get detailed error messages:

Failed asserting that two SQL queries are semantically equal.

Expected SQL:
  SELECT id, name FROM users WHERE active = 1

Actual SQL:
  SELECT id, name FROM users WHERE active = 2

Differences:
  1. [root.where[0].children[1].value] Value mismatch: expected "1", got "2"

Limitations

Current limitations (may be addressed in future versions):

  • DDL Statements: CREATE, ALTER, DROP statements are not supported (focus is on DML: SELECT, INSERT, UPDATE, DELETE)
  • OR Condition Ordering: Conditions within OR clauses are NOT reordered (by design, as OR precedence matters)
  • GROUP BY Ordering: GROUP BY column order is preserved (not normalized)
  • JOIN Normalization: JOIN order is not currently normalized
  • Complex Expressions: Very complex expressions may not normalize perfectly
  • Nested UNION in Subqueries: UNION inside subqueries requires recursive parsing (future enhancement)

What IS supported:

  • SELECT, INSERT, UPDATE, DELETE statements
  • WITH clauses (CTEs) including RECURSIVE and MATERIALIZED
  • UNION and UNION ALL with query reordering
  • JSON functions and operators (MySQL and PostgreSQL)
  • Subqueries in FROM, WHERE, and other clauses
  • Complex WHERE conditions with AND/OR logic
  • All standard SQL aggregate functions and operators

Architecture

The library uses greenlion/php-sql-parser for both MySQL and PostgreSQL:

  • Pure PHP: No C extensions required
  • Standard SQL: Supports ANSI SQL syntax used by both MySQL and PostgreSQL
  • Unified AST: Both parsers are wrapped to provide a consistent AST format
  • Normalization: AST is normalized before comparison to ignore formatting differences

Development

Running Tests

composer install
vendor/bin/phpunit

CI/CD

The GitHub Actions workflow runs tests across multiple PHP versions:

  • PHP 8.2, 8.3, 8.4, 8.5: Full test suite including both MySQL and PostgreSQL tests (301 tests, 422 assertions)

This ensures the library works correctly across all supported PHP versions.

Test Coverage:

  • Parser tests (MySQL/PostgreSQL dialect features, JSON, WITH, UNION)
  • Normalizer tests (column/condition ordering, AST normalization)
  • Comparator tests (semantic comparison, diff generation)
  • Integration tests (real-world complex queries)

Total CI jobs: 4 (PHP versions) + PHPStan + PHP CS Fixer = 6 jobs

Project Structure

php-sql-snapshot/
├── src/
│   ├── Parser/           # SQL parsers
│   ├── Normalizer/       # AST normalization
│   ├── Comparator/       # AST comparison
│   ├── PHPUnit/          # PHPUnit integration
│   └── Exception/        # Exception classes
└── tests/                # Unit tests

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License

Credits

This library uses:

Related Projects

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages