Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[10.x] database expressions with grammar-specific formatting #44784

Merged
merged 5 commits into from Feb 2, 2023

Conversation

tpetry
Copy link
Contributor

@tpetry tpetry commented Oct 30, 2022

Last year I proposed PR #40115 to change database expressions to be grammar-enhanced. But my PR was changing too much, so I redid the whole concept to change as least as possible to be easier to merge for the next Laravel major release.

What is the problem?

Laravel's database implementation provides us a good way of working with multiple databases while abstracting away the inner workings. We don't have to think about small syntax differences when using a query builder or how column names are escaped to not interfere with reserved keywords.

However, when we want to use more database functionality we have to fall back to raw database code and writing database-specific code to e.g. quote column names or column aliases:

DB::table('table')
    ->when(isPostgreSQL(), fn ($query) => $query->select(DB::raw('coalesce("user", "admin") AS "value"')))
    ->when(isMySQL(), fn ($query) => $query->select(DB::raw('coalesce(`user`, `admin`) AS `value`')))

How can this be solved?

Currently the Expression class returned by DB::raw is just a container to indicate that the value should not be further processed. My implementation changes that approach a little bit by declaring an expression contract DB::raw uses and custom classes can implement:

interface Expression
{
    /**
     * Get the value of the expression.
     *
     * @param  \Illuminate\Database\Grammar  $grammar
     * @return scalar
     */
    public function getValue(Grammar $grammar);
}

Following that idea, the operations normally done by raw expressions and statements can be created as reusable expression classes. The former example could be written like this, which is totally database-agnostic for the one using those classes instead of raw expressions:

DB::table('table')
    ->select(new Alias(new Coalesce(['user', 'admin']), 'count'));


class Alias implements Expression
{
    public function __construct(
    public readonly Expression|string $expression,
    public readonly string $name,
    ) { }

    public function getValue(Grammar $grammar): string
    {
        return match ($grammar->isExpression($this->expression)) {
            true => "{$grammar->getValue($this->expression)} as {$grammar->wrap($this->name)}",
            false => $grammar->wrap("{$this->name} as {$this->name}"),
        };
    }
}

class Coalesce implements Expression
{
    public function __construct(
        public readonly array $expressions,
    ) { }

    public function getValue(Grammar $grammar): string
    {
        $expressions = array_map(function ($expression) use($grammar): string {
            return match ($grammar->isExpression($expression)) {
                true => $grammar->getValue($expression),
                false => $grammar->wrap($expression),
            };
        }, $this->expressions);
        $expressions = implode(', ', $expressions);

        return "coalesce({$expressions})";
    }
}

But isn't it much work to write those classes?

Kind of. But keep in mind these have to be written only once and you don't have to be the one. Like with every package created by the community anyone can start and write those SQL expressions for anyone to use. When this PR will be merged into 10.x I will start by writing the most common ones at tpetry/laravel-query-expressions.

But I don't like the syntax

That's ok as I don't propose any syntax, I propose a system of how this can work. You can start a package that will provide a macro-able facade to do the same with static function calls as long as those calls return expression objects:

Expr::alias(Expr::coalesce(['user', 'admin']), 'count')

An implementation even doesn't have to use this wrapping. I can imagine a parser could transform a special little syntax to create those object chains from a string:

DB::table('table')
    ->select(exprParse('[user], [admin] ~> coalesce ~> alias:count'));

As said, this PR is not about providing syntax. It is about adding a building block packages can use to build something great

Outlook

In future PRs I would like to work on more things useful for database expressions:

  • Access to the database type:. Some functions are implemented differently by the databases. When the Grammar would provide methods to identify it as e.g. a MySQL grammar, the expression could use a different function for regex replacement than for e.g. PostgreSQL.
  • Value bindings: Right now request-provided values can't be added to expressions as the value could contain a SQL injection. When the Grammar would provide method to quote() a value these could be safely injected into an expression.
  • Database version: Some features are easily done on recent database versions as critical features have been added while a lot of boilerplate is needed for older releases. When the Grammar would provide the database type and version a grammar class could use best implementation for every database and version.

@Wulfheart
Copy link

Wulfheart commented Oct 30, 2022

This would be really nice in my opinion! This would make the DB expressions more typesafe.

@michaeldyrynda
Copy link
Contributor

This is pretty slick, nicely done.

Two things:

First, is the new 'admin' in your example a typo, or some new PHP 8 syntax?

Second, should we preserve the __toString() in src/Illuminate/Database/Query/Expression.php for backwards compatibility?

@tpetry
Copy link
Contributor Author

tpetry commented Oct 31, 2022

First, is the new 'admin' in your example a typo, or some new PHP 8 syntax?

That is a typo. I simplified the example to make it easier to understand and missed the new keyword.

Second, should we preserve the __toString() in src/Illuminate/Database/Query/Expression.php for backwards compatibility?

That's not easily possible. We could do that but then we have two different expression implementations that need to be handled differently. And I guess almost no one really extended the Expression class.

@deleugpn
Copy link
Contributor

deleugpn commented Nov 2, 2022

I can see how this PR seems like a good compromise. It basically expands the Expression class to be able to take a Grammar as dependency in order to get help from the Grammar to generate the underlying DB expression.
Personally I don't like any potential syntax that will arise from this due to how one expression needs to wrap another one, but at least that's no longer the Framework's concern and will be handled exclusively outside of the framework.
Another thing that I think is extremely edge-case is the need to write a single raw expression compatible with multiple databases. Sure, things like Laravel Nova may theoretically benefit from such a feature, but truth is most applications will target only one database and when doing raw expressions, doing the raw expression of your database will likely yield a better syntax.

In summary, I don't particularly like the code that will be written as a result of this feature, but this feature in itself isn't that much harmful to the framework. I just wonder whether there's a simpler approach to provide the Grammar to the Expression class with less changes, but that requires digging deeper...

@taylorotwell
Copy link
Member

Can you provide the breaking changes and the upgrade steps we would need to include in the 10.x upgrade guide if this is merged?

@tpetry
Copy link
Contributor Author

tpetry commented Nov 3, 2022

Sure.

Breaking Changes

The Illuminate\Database\Query\Expression class now implements the Illuminate\Contracts\Database\Query\Expression contract. An expression can now return any scalar value instead of the former requirement to return strings.

Upgrade Guide

If you use DB::raw or create instances from Illuminate\Database\Query\Expression you don't have to change anything. All changes are transparent to you. But if you do extend the Illuminate\Database\Query\Expression class you have to change your implementation. The SQL code you returned in the __toString method needs to be moved to the getValue method.

@morloderex
Copy link
Contributor

morloderex commented Nov 18, 2022

@tpetry It might also be worth noting that because of the removal from __toString() method towards getValue() that any expression cannot be converted to a string using typecasting anymore due to the now missing Grammar context in that method.

I wonder if we could add that back somehow? I am however not even sure if doing (string) new \Illuminate\Database\Query\Expression('some expression') is something we want to support?

@tpetry
Copy link
Contributor Author

tpetry commented Nov 18, 2022

I am not sure whether that counts as a breaking change for users of Laravel, as only the Laravel core is transforming those expressions into strings. I did choose against both ways to get the string for an expression because it increases the complexity of handling those expressions. One way is more straightforward than two ways. And as I said, I don't expect anyone having extended those classes. I only found the idea of doing it in a tweet which I used as an idea to work on this PR.

@nevadskiy
Copy link
Contributor

@tpetry That is fantastic. What do you think about adding a separate GrammarAwareInterface that contains a single setGrammar method? I think that would allow to keep the original toString method and avoid the breaking change.

@tpetry
Copy link
Contributor Author

tpetry commented Jan 3, 2023

As said before, I found no one really extending the Expression class. Therefore adding two ways to handle them would add a lot more complexity to Eloquent than just doing a breaking change in a major release for something maybe 1 of 1.000 have done.

@driesvints driesvints changed the base branch from master to 10.x January 5, 2023 13:43
@taylorotwell
Copy link
Member

I think this look fairly good to me if we can get the conflicts fixed @tpetry 👍

@tpetry
Copy link
Contributor Author

tpetry commented Jan 23, 2023

@taylorotwell Done. Rebased it on the newest 10.x commit.

src/Illuminate/Database/Grammar.php Outdated Show resolved Hide resolved
src/Illuminate/Database/Query/Expression.php Outdated Show resolved Hide resolved
@taylorotwell
Copy link
Member

Thanks! 👍

@willrowe
Copy link
Contributor

@tpetry I'm curious, in what case would an expression ever return anything but a SQL string from getValue? I thought it was supposed to represent a SQL expression.

@tpetry
Copy link
Contributor Author

tpetry commented Jan 25, 2023

An integer or float is still representable as string. And it was completely valid to use them before, so I doesnt want to break this.

@willrowe
Copy link
Contributor

@tpetry I know it allowed those types previously, I was just wondering if there was an example of a situation in which something other than a string would need to be returned.

@taylorotwell
Copy link
Member

Hey @tpetry - I've finally finished my review of this. Thanks for your patience!

I wanted to get some further elaboration on your "Outlook" section regarding adding bindings / quoted values from the expression. What was your thoughts there on how that would work. While this PR definitely provides an improvement over our current expression capabilities, I can see people quickly asking for the ability for their expressions to be able to safely add bindings to the query. Any thoughts on how that would look or be implemented?

@tpetry
Copy link
Contributor Author

tpetry commented Jan 27, 2023

I did that in my old PR. Basically we have to the grammar. In the future (in my opinion) the query grammar instance should contain a database connection object (PDO). That way it could provide escaping of raw content based on the current connection settings (the charset encoding is important with mysql to prevent sql injections!) or export the version number of the database (to use newer sql features when available).

I could work backport this one tomorrow or tuesday if this merged. And we‘ll check it for 10.x or better plan it for 11.x.

@taylorotwell
Copy link
Member

taylorotwell commented Jan 27, 2023

@tpetry now that I have a pretty decent grasp of this PR we can just try to expose the PDO connection to the grammar with another commit on this PR if you don't mind?

@tpetry
Copy link
Contributor Author

tpetry commented Jan 27, 2023

I can do that, on tomorrow. If those changes are then too much for you, feel free to just revert the additional commit.

@tpetry
Copy link
Contributor Author

tpetry commented Jan 27, 2023

While thinking about it, I would like to delay that change about linking the grammar with the connection. I don‘t want to break the 10.x release. Let me work on the grammar thing in a backward compatible way within L10. I guess other db providers (like mine, staudenmeier or singlestore) may break with such a change? Its a little bit late for that in the 10.x process. I‘ll find a way shortly after the 10.x release for grammars to opt-in for that feature if they want support it.

@tpetry
Copy link
Contributor Author

tpetry commented Jan 28, 2023

@taylorotwell Sorry, I did write the last message misleading last night. I am ok with merging this PR. Adding the connection to grammars I would like to work on shortly after the 10.x release to do it in a safe backwards-compatible way. I don‘t want to break other non-core drivers so shortly before the final release.

@taylorotwell taylorotwell merged commit 18dfa03 into laravel:10.x Feb 2, 2023
crynobone added a commit to crynobone/framework that referenced this pull request Feb 3, 2023
Signed-off-by: Mior Muhammad Zaki <crynobone@gmail.com>
driesvints pushed a commit that referenced this pull request Feb 3, 2023
Signed-off-by: Mior Muhammad Zaki <crynobone@gmail.com>
browner12 added a commit to browner12/framework that referenced this pull request Feb 16, 2023
The `Expression` class no longer provides a `__toString()` method, so we cannot cast it to a string here.

We'll now use the `->getValue()` method and pass the current database connection's query grammar.

laravel#44784
taylorotwell pushed a commit that referenced this pull request Feb 16, 2023
The `Expression` class no longer provides a `__toString()` method, so we cannot cast it to a string here.

We'll now use the `->getValue()` method and pass the current database connection's query grammar.

#44784
chu121su12 pushed a commit to chu121su12/framework that referenced this pull request Feb 25, 2023
The `Expression` class no longer provides a `__toString()` method, so we cannot cast it to a string here.

We'll now use the `->getValue()` method and pass the current database connection's query grammar.

laravel#44784
@danrichards
Copy link

Was there a reason for not just passing in an extension of Expression that implements your own __toString() method.

https://www.php.net/manual/en/class.stringable.php

I might be missing something?

@joelharkes
Copy link
Contributor

@tpetry Why did you not add the classes like Alias in the PR? i would be nice if they were generally available to anyone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet