[10.x] Add toRawSql, dumpRawSql() and ddRawSql() to Query Builders #47507

Merged
merged 3 commits into from Jun 28, 2023

Contributor

@tpetry tpetry commented Jun 20, 2023

For quite some time, many developers have requested to get SQL queries with merged bindings from the query builder (#38027, #39053, #39551, #45705, #45189).

They want to have something like this:

User::where('email', 'foo@example.com')->ddRawSql();
// "SELECT * FROM users WHERE email = 'foo@example.com'"

Instead of:

User::where('email', 'foo@example.com')->dd();
// "SELECT * FROM users WHERE email = ?"
// [
//  0 => "foo@example.com"
// ]

Prior Implementations

All prior implementations had the same issues that prevented them from being merged:

  1. The bindings had been inserted into the query as strings without any further processing. Each of those generated queries was vulnerable to SQL injection attacks.
  2. They replaced every question mark with a value, which breaks on raw calls, e.g. ->whereRaw("description = 'foo?'")

Improved Implementation

These new ways of generating SQL queries with embedded bindings are available:

$sql = User::where('email', 'foo@example.com')
  ->toRawSql(); // $sql = "SELECT * FROM users WHERE email = 'foo@example.com'"

User::where('email', 'foo@example.com')
  ->dumpRawSql(); // "SELECT * FROM users WHERE email = 'foo@example.com'"

User::where('email', 'foo@example.com')
  ->ddRawSql(); // "SELECT * FROM users WHERE email = 'foo@example.com'"

$sql = DB::connection()->getQueryGrammar()->makeRawSql(
  'SELECT * FROM users WHERE email = ?',
  'foo@example.com',
); // $sql = "SELECT * FROM users WHERE email = 'foo@example.com'"

1. SQL Injections

I've built an extension for the database layer that can escape any values for safe embedding into SQL queries (#46558) that is already merged into Laravel 10.x. Based on that code, any binding is escaped before being injected into the SQL query:

User::where('name', "Robert'; drop table users; --")->ddRawSql();
// "SELECT * FROM users WHERE name = 'Robert\'; drop table users; --'"
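To make the idea of escaping a binding by type concrete, here is a minimal standalone sketch. The function name escapeForSql is hypothetical and not part of Laravel. It assumes standard SQL quote doubling for strings, whereas the real implementation from #46558 relies on the database driver, so the exact output (for example \' on MySQL) can differ.

```php
<?php

/**
 * Hypothetical helper (not Laravel's API): turn a PHP value into a SQL
 * literal that can be embedded into a query string.
 */
function escapeForSql(mixed $value): string
{
    if ($value === null) {
        return 'null';
    }

    if (is_bool($value)) {
        return $value ? '1' : '0';
    }

    if (is_int($value) || is_float($value)) {
        return (string) $value;
    }

    if (is_string($value)) {
        // Null bytes cannot be represented safely in a string literal.
        if (str_contains($value, "\0")) {
            throw new RuntimeException('Strings with null bytes cannot be escaped.');
        }

        // Assumption: standard SQL escaping, a single quote is doubled.
        return "'" . str_replace("'", "''", $value) . "'";
    }

    throw new RuntimeException('This value type cannot be escaped.');
}

echo escapeForSql("Robert'; drop table users; --"), PHP_EOL;
```

The escaped value stays inside one string literal, so the injected statement is never executed as SQL.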

2. Ambiguous Question Marks

Simple search-and-replace operations are not enough to reliably generate a raw SQL statement. With raw expressions, anyone can embed more question marks into a SQL query that are clearly not placeholders:

User::whereRaw("abc = 'Hello World?'")->where('name', 'Robert')->dd();
// "SELECT * FROM users WHERE abc = 'Hello WorldRobert' AND name = ?"

But this can also be solved relatively easily. The generated SQL string with placeholders is parsed by a very simple LL(1) parser (just 20 lines) that watches for string escape sequences and only replaces question marks that are not inside string literals:

  • The occurrence of ' starts a string literal -> no question marks will be replaced
  • The occurrence of '' and \' marks escaped quotes -> they are copied
  • The occurrence of ' ends a string literal -> question marks will be replaced again.

That way, the generated raw SQL strings have no problem with question marks in string literals:

User::whereRaw("abc = 'Hello World?'")->where('name', 'Robert')->ddRawSql();
// "SELECT * FROM users WHERE abc = 'Hello World?' AND name = 'Robert'"
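The scanning rules listed above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the code merged in this PR: the name substitutePlaceholders is hypothetical, and the bindings are assumed to be already escaped SQL fragments.

```php
<?php

/**
 * Hypothetical helper: replace "?" placeholders that are outside of
 * single-quoted string literals. $bindings must already be escaped.
 */
function substitutePlaceholders(string $sql, array $bindings): string
{
    $result = '';
    $inString = false;
    $length = strlen($sql);

    for ($i = 0; $i < $length; $i++) {
        $char = $sql[$i];
        $pair = $char . ($sql[$i + 1] ?? '');

        if ($inString && ($pair === "''" || $pair === "\\'")) {
            // Escaped quote inside a literal: copy both characters.
            $result .= $pair;
            $i++;
        } elseif ($char === "'") {
            // A quote starts or ends a string literal.
            $inString = ! $inString;
            $result .= $char;
        } elseif (! $inString && $pair === '??') {
            // Doubled question marks are exempt from replacement.
            $result .= $pair;
            $i++;
        } elseif (! $inString && $char === '?') {
            // A real placeholder: consume the next binding, if any.
            $result .= array_shift($bindings) ?? '?';
        } else {
            $result .= $char;
        }
    }

    return $result;
}

echo substitutePlaceholders(
    "select * from users where abc = 'Hello World?' and name = ?",
    ["'Robert'"]
), PHP_EOL;
```

One lookahead character is enough to tell an escaped quote from a closing quote, which is why such a small scanner suffices.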

3. Executability of Raw SQL Queries

A query generated by these new functions should be ready to copy and paste into any query tool and execute without problems. This is guaranteed for every database except PostgreSQL, because PostgreSQL has special operators containing a question mark that must be doubled because of how PDO handles placeholders:

User::where('json', '?', 'abc')->dd(); // json object contains key "abc"
// "SELECT * FROM users WHERE json ?? 'abc'"

The query cannot be executed because Laravel (correctly!) doubles the question mark (doubled ones are exempt from replacement 😉). To make these special queries copyable as well, any PostgreSQL operator that (1) contains a question mark and (2) is included in Laravel's operator information is decoded again:

User::where('json', '?', 'abc')->ddRawSql(); // json object contains key "abc"
// "SELECT * FROM users WHERE json ? 'abc'"
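The decoding step described above could look roughly like the following sketch. The function name decodePostgresOperators and the operator list passed to it are assumptions for illustration. Note that this simple version also rewrites a doubled question mark that happens to appear inside a string literal.

```php
<?php

/**
 * Hypothetical helper: turn doubled question marks back into the original
 * PostgreSQL operators after the bindings have been substituted.
 */
function decodePostgresOperators(string $sql, array $operators): string
{
    foreach ($operators as $operator) {
        // Only operators containing a question mark were doubled.
        if (! str_contains($operator, '?')) {
            continue;
        }

        // Example: the encoded form "??|" is replaced by the operator "?|".
        $encoded = str_replace('?', '??', $operator);
        $sql = str_replace($encoded, $operator, $sql);
    }

    return $sql;
}

echo decodePostgresOperators(
    "select * from users where json ?? 'abc'",
    ['=', '?', '?|', '?&']
), PHP_EOL;
```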

Final

This implementation solves every problem with generating raw SQL strings that is known to date (including the PostgreSQL one I added myself).

I am open to a different naming for these new methods.

@tpetry
Contributor Author

tpetry commented Jun 20, 2023

The failing test is in a different module from the one I have been working on, so it is not relevant to this PR.

@bert-w
Contributor

bert-w commented Jun 23, 2023

This opens the doors for devs to do stuff like DB::raw($query->toRawSql()); which now depends on your (complex) escaping logic.

Meanwhile I can oneline this output for simple ? replacements (ignoring edgecases) which works in most cases when you quickly need to debug a query:

$sql = Str::replaceArray('?', $query->getBindings(), $query->toSql());

Thirdly there are query logs for your RDBMS of choice and there's also a nice Laravel debugbar :) (although only useful after execution of the query).

@tpetry
Contributor Author

tpetry commented Jun 23, 2023

This opens the doors for devs to do stuff like DB::raw($query->toRawSql()); which now depends on your (complex) escaping logic.

Sure. You can always do dumb stuff, but it will still be safe.

Meanwhile I can oneline this output for simple ? replacements (ignoring edgecases) which works in most cases when you quickly need to debug a query:

$sql = Str::replaceArray('?', $query->getBindings(), $query->toSql());

As you said, there are edge cases that will fail on some queries.

Thirdly there are query logs for your RDBMS of choice and there's also a nice Laravel debugbar :) (although only useful after execution of the query).

You can also think of this implementation as the foundation for these packages. Debugbar and many other packages can use the included implementation, which already handles the edge cases, so they don't all have to implement them again and again.

@boris-glumpler
Contributor

I'm looking to create a raw query that could then be passed to something like pgsql2shp. @tpetry, could you please clarify if this would work? From your description above it's not immediately apparent to me if Postgres is fully supported. Thanks!

@tpetry
Contributor Author

tpetry commented Jun 26, 2023

I don't know what this command does or how it is affected by a SQL query.

This implementation can generate a raw SQL query for all databases supported by Laravel: MySQL, MariaDB, PostgreSQL, SQLite and SQL Server.

@boris-glumpler
Contributor

This implementation can generate a raw SQL query for all databases supported by Laravel: MySQL, MariaDB, PostgreSQL, SQLite and SQL Server.

Alright, that should work! pgsql2shp is a PostGIS binary. It creates a shapefile for you based on the raw query you pass to it.

@taylorotwell taylorotwell merged commit 830efbe into laravel:10.x Jun 28, 2023
16 checks passed
@poisa
Contributor

poisa commented Jun 30, 2023

I've had this on a big project a few years ago, where we built a home-grown solution. Just a heads-up (and a possible feature request): if you're planning to log your queries, make sure you add some conditional logic to redact personal or risky information, so that you don't log things like passwords or any data that might violate GDPR.

@ARehmanMahi

In which Laravel version is this going to be available? I'm on v10.14.1 and still don't see the changes.

@bojanvmk

bojanvmk commented Jul 9, 2023

@ARehmanMahi I guess it's not released yet, since this was merged a few hours after v10.14.1 was released.

@AmirHossein

AmirHossein commented Jul 17, 2023

@tpetry Some drivers throw the following error on this line: "The database driver's grammar implementation does not support escaping values."
I am not sure about side effects, but it could be fixed by calling setConnection() on the grammar property:

public function toRawSql()
{
    return $this->grammar->setConnection($this->getConnection())->substituteBindingsIntoRawSql(
        $this->toSql(), $this->connection->prepareBindings($this->getBindings())
    );
}

and

public function getRawQueryLog()
{
    return array_map(fn (array $log) => [
        'raw_query' => $this->queryGrammar->setConnection($this)->substituteBindingsIntoRawSql(
            $log['query'],
            $this->prepareBindings($log['bindings'])
        ),
        'time' => $log['time'],
    ], $this->getQueryLog());
}

@tpetry
Contributor Author

tpetry commented Jul 17, 2023

It works for all database drivers provided by Laravel. If you're using a custom one, that driver has to make the needed changes.

@Rydgel

Rydgel commented Jul 18, 2023

@AmirHossein Same problem on MySQL with multiple databases. What needs to be done is not documented, since everything else works fine.

@tpetry
Contributor Author

tpetry commented Jul 18, 2023

When the connection creates the grammar, it has to also pass itself to the grammar:

protected function getDefaultQueryGrammar(): QueryGrammar
{
    $grammar = new QueryGrammar();

    // Give the grammar access to the connection so it can escape values.
    if (method_exists($grammar, 'setConnection')) {
        $grammar->setConnection($this);
    }

    return $grammar;
}

@Rydgel

Rydgel commented Jul 18, 2023

Thanks, I didn't notice I was using a custom MySQL package for the connection. We will make the change above in their project.

@zepoua

zepoua commented Jan 5, 2024

How can this be used with create, update, or delete statements?
