Predicates that reference a column on the right side of a left outer join cannot be used #621

Closed
sgrif opened this Issue Feb 4, 2017 · 0 comments

sgrif commented Feb 4, 2017

A minimal script to reproduce with Diesel 0.10.0:

```rust
#[macro_use] extern crate diesel;
#[macro_use] extern crate diesel_codegen;

use diesel::prelude::*;

table! {
    users {
        id -> Integer,
    }
}

table! {
    posts {
        id -> Integer,
        user_id -> Integer,
    }
}

#[derive(Identifiable, Associations)]
#[has_many(posts)]
struct User {
    id: i32,
}

#[derive(Identifiable, Associations)]
#[belongs_to(User)]
struct Post {
    id: i32,
    user_id: i32,
}

fn main() {
    users::table.left_outer_join(posts::table)
        .filter(posts::id.is_null());
}
```

This affects the usage of any predicate in the query DSL. Ultimately the problem comes down to the fact that we are writing impls like:

```rust
impl<T, QS> SelectableExpression<QS> for NotNull<T> where
    T: SelectableExpression<QS>
```

Since the second parameter of SelectableExpression defaults to the SQL type of the expression, and that changes to be nullable if we're on the right side of a left outer join, the where clause resolves to the wrong SQL type.
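
To make the defaulting problem concrete, here is a minimal standalone sketch. It does not use Diesel itself: `Integer`, `Bool`, `Nullable`, `Posts`, `UsersLeftJoinPosts`, `PostsId`, `IsNull`, and `usable_on` are hypothetical stand-ins, and the trait shape is only an approximation of what is described above.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for SQL type markers.
pub struct Integer;
pub struct Bool;
pub struct Nullable<T>(PhantomData<T>);

pub trait Expression {
    type SqlType;
}

// The second parameter defaults to the expression's own SQL type.
pub trait SelectableExpression<QS, ST = <Self as Expression>::SqlType>: Expression {}

// Hypothetical query sources: `posts` alone, and `users LEFT OUTER JOIN posts`.
pub struct Posts;
pub struct UsersLeftJoinPosts;

// Stand-in for the `posts::id` column.
pub struct PostsId;

impl Expression for PostsId {
    type SqlType = Integer;
}

// On `posts` alone the SQL type is unchanged, so the default parameter applies.
impl SelectableExpression<Posts> for PostsId {}

// On the right side of a left outer join the SQL type becomes nullable.
impl SelectableExpression<UsersLeftJoinPosts, Nullable<Integer>> for PostsId {}

// Stand-in for a predicate such as `is_null()`.
pub struct IsNull<T>(pub T);

impl<T: Expression> Expression for IsNull<T> {
    type SqlType = Bool;
}

// `T: SelectableExpression<QS>` silently means
// `T: SelectableExpression<QS, T::SqlType>`, so it only holds when the
// query source leaves the SQL type of `T` unchanged.
impl<T, QS> SelectableExpression<QS> for IsNull<T> where
    T: Expression + SelectableExpression<QS>
{
}

// The body is trivial; the real check is whether the bound is satisfied
// at compile time.
pub fn usable_on<E, QS>() -> bool
where
    E: Expression + SelectableExpression<QS>,
{
    true
}
```

In this sketch `usable_on::<IsNull<PostsId>, Posts>()` compiles, while `usable_on::<IsNull<PostsId>, UsersLeftJoinPosts>()` is rejected, because the only impl of `PostsId` for the join uses `Nullable<Integer>` rather than the defaulted `Integer`.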

The fix here is to change the second parameter of `SelectableExpression` to be an associated type, not a type parameter, either by going forward with #6 or with a smaller change.

sgrif added this to the 0.10 milestone Feb 4, 2017

sgrif added the bug label Feb 11, 2017

sgrif added a commit that referenced this issue Feb 15, 2017

Ensure aggregate functions enforce the column is from the right table
While working on #621, I noticed that these impls were incorrect and
could be used to compile an incorrect query. I've corrected the impls
and added the appropriate compile-fail test.

I'm not sure if this was just an oversight or if I intentionally did
this to avoid nullability somewhere. The latter is no longer relevant
since we always make these expressions nullable now.

sgrif added a commit that referenced this issue Feb 15, 2017

Use associated types for `SelectableExpression`
The `SelectableExpression` trait serves two purposes for us. The first
and most important role it fills is to ensure that columns from tables
that aren't in the from clause cannot be used. The second way that we
use it is to make columns which are on the right side of a left outer
join nullable.

There were two reasons that we used a type parameter instead of an
associated type. The first was to make it so that `(Nullable<X>,
Nullable<Y>)` could be treated as `Nullable<(X, Y)>`. We did this
because the return type of `users.left_outer_join(posts)` should be
`(User, Option<Post>)`, not `(User, Post)` where every field of `Post`
is an `Option`.

Since we now provide a `.nullable()` method in the core DSL, I think we
can simply require calling that method explicitly if you want that tuple
conversion to occur. I think that the most common time that conversion
will even be used is when the default select clause is used, where we
can just handle it for our users automatically.

The other reason that we went with a type parameter originally was that
it was easier, since we can provide a default value for a type parameter
but not an associated type. This turned out to actually be a drawback,
as it led to #104. This PR actually brings back aspects of that issue,
which I'll get to in a moment.

It's expected that any expression which implements
`SelectableExpression<QS>` have a `T: SelectableExpression<QS>` bound
for each of its parts. The problem is, the missing second parameter is
defaulting to `T::SqlType`, which means we are implicitly saying that
this bound only applies for `QS` which does not change the SQL type
(anything except a left outer join). This ultimately led to #621.

However, with our current structure, it is impossible to fix #621
without re-introducing at least some aspects of #104. In a comment on
#104, I said that we didn't need to worry about `1 + NULL`, because we
didn't
implement add for any nullable types. However, I'm not sure I considered
joins when I made that statement. The statement applied to joins
previously because of that implicit "sql type doesn't change"
constraint. This commit removes that constraint, meaning #104 will be
back at least when the nullability comes from being on the right side of
a left join.

I don't think this is a serious enough issue that we need to immediately
address it, as the types of queries which would cause the issue still
just don't happen in practice. We should come up with a long term plan
for it, though. Ultimately the nullability of a field really only
matters in the select clause. Since any operation on null returns null,
and you basically want null to act as false in the where clause, it
doesn't matter there.
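
As a rough illustration of that argument, the sketch below models SQL values as `Option`, with `None` standing in for `NULL`. The function names are hypothetical, and only addition and equality are modelled.

```rust
// `None` stands in for SQL NULL.

// Arithmetic on NULL yields NULL.
pub fn sql_add(a: Option<i32>, b: Option<i32>) -> Option<i32> {
    Some(a? + b?)
}

// Comparison with NULL yields NULL rather than true or false.
pub fn sql_eq(a: Option<i32>, b: Option<i32>) -> Option<bool> {
    Some(a? == b?)
}

// A where clause keeps a row only when the predicate is true,
// so a NULL predicate drops the row just like false does.
pub fn keeps_row(predicate: Option<bool>) -> bool {
    predicate == Some(true)
}
```

Under this model the nullability of an operand never needs to surface in the where clause: the predicate's result is only ever collapsed to "keep" or "drop".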

So one partial step we could take is to break this out into two separate
traits. One for the "make sure this is valid given the from clause", and
one for the "make this nullable sometimes" case and only constrain on
the first one in the where clause. We could then re-add the "sql type
doesn't change" constraint on the problem cases, which will bring back
aspects of #621, but only for select clauses which is a smaller problem.

I'm not sure if I ultimately want to go the two traits route or not. If
nothing else, the problem cases are much more obvious with this commit.
Anywhere that has `type SqlTypeForSelect = Self::SqlType` is likely a
problem case when joins are involved. This will make it easier to find
all the places to apply a solution when I come up with one that I'm
happy with.
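
A minimal standalone sketch of the associated-type shape, assuming the same hypothetical stand-ins as the issue's example (`PostsId`, `IsNull`, `Posts`, `UsersLeftJoinPosts`). `SqlName` and `select_type` are helpers invented here only to make the chosen type observable; they are not part of Diesel.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for SQL type markers.
pub struct Integer;
pub struct Bool;
pub struct Nullable<T>(PhantomData<T>);

// Helper invented for this sketch: reports which SQL type was picked.
pub trait SqlName {
    fn name() -> String;
}
impl SqlName for Integer {
    fn name() -> String { "Integer".to_string() }
}
impl SqlName for Bool {
    fn name() -> String { "Bool".to_string() }
}
impl<T: SqlName> SqlName for Nullable<T> {
    fn name() -> String { format!("Nullable<{}>", T::name()) }
}

pub trait Expression {
    type SqlType;
}

// The SQL type for a given query source is now an associated type,
// so a bound of `T: SelectableExpression<QS>` says nothing about it.
pub trait SelectableExpression<QS>: Expression {
    type SqlTypeForSelect;
}

pub struct Posts;
pub struct UsersLeftJoinPosts;
pub struct PostsId;

impl Expression for PostsId {
    type SqlType = Integer;
}
impl SelectableExpression<Posts> for PostsId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<UsersLeftJoinPosts> for PostsId {
    type SqlTypeForSelect = Nullable<Integer>;
}

pub struct IsNull<T>(pub T);

impl<T: Expression> Expression for IsNull<T> {
    type SqlType = Bool;
}

// The predicate is usable on any source where its argument is.
impl<T, QS> SelectableExpression<QS> for IsNull<T>
where
    T: SelectableExpression<QS>,
{
    // The pattern flagged above as a likely problem case with joins.
    type SqlTypeForSelect = Self::SqlType;
}

pub fn select_type<E, QS>() -> String
where
    E: SelectableExpression<QS>,
    <E as SelectableExpression<QS>>::SqlTypeForSelect: SqlName,
{
    <<E as SelectableExpression<QS>>::SqlTypeForSelect as SqlName>::name()
}
```

Here `IsNull<PostsId>` is accepted on the join, which is the behavior #621 asks for. The price is that the composite expression's own `SqlTypeForSelect` no longer reflects whether its argument became nullable.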

Fixes #621.

sgrif closed this in #709 Feb 16, 2017

sgrif added a commit that referenced this issue Feb 26, 2017

Split `SelectableExpression` into two traits
The change in #709 had the side effect of re-introducing #104.
With the design that we have right now, nullability isn't propagating
upwards. This puts the issue of "expressions aren't validating that the
types of their arguments haven't become nullable, and thus nulls are
slipping in where they shouldn't be" at odds with "we can't use complex
expressions in filters for joins because the SQL type changed".

This semi-resolves the issue by restricting when we care about
nullability. Ultimately the only time it really matters is when we're
selecting data, as we need to enforce that the result goes into an
`Option`. For places where we don't see the bytes in Rust (filter,
order, etc), `NULL` is effectively `false`.

This change goes back to fully fixing #104, but brings back a small
piece of #621. I've changed everything that is a composite expression to
only be selectable if the SQL type hasn't changed. This means that you
won't be able to do things like
`users.left_outer_join(posts).select(posts::id + 1)`, but you will be
able to use whatever you want in `filter`.
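
The split can be sketched in standalone Rust roughly as follows. The name `ValidInFromClause` is hypothetical (this message does not name the new trait), and `PostsId`, `IntLiteral`, `Add`, `can_filter`, and `can_select` are stand-ins invented for the example.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for SQL type markers.
pub struct Integer;
pub struct Nullable<T>(PhantomData<T>);

pub trait Expression {
    type SqlType;
}

// Hypothetical name: "is this valid given the from clause?"
pub trait ValidInFromClause<QS>: Expression {}

// "What SQL type does this have when selected from `QS`?"
pub trait SelectableExpression<QS>: ValidInFromClause<QS> {
    type SqlTypeForSelect;
}

pub struct Posts;
pub struct UsersLeftJoinPosts;

// Stand-in for the `posts::id` column.
pub struct PostsId;
impl Expression for PostsId {
    type SqlType = Integer;
}
impl ValidInFromClause<Posts> for PostsId {}
impl ValidInFromClause<UsersLeftJoinPosts> for PostsId {}
impl SelectableExpression<Posts> for PostsId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<UsersLeftJoinPosts> for PostsId {
    type SqlTypeForSelect = Nullable<Integer>;
}

// Stand-in for an integer literal such as `1`.
pub struct IntLiteral;
impl Expression for IntLiteral {
    type SqlType = Integer;
}
impl<QS> ValidInFromClause<QS> for IntLiteral {}
impl<QS> SelectableExpression<QS> for IntLiteral {
    type SqlTypeForSelect = Integer;
}

// Stand-in for `lhs + rhs`.
pub struct Add<L, R>(pub L, pub R);
impl<L: Expression, R: Expression> Expression for Add<L, R> {
    type SqlType = L::SqlType;
}

// Filters only need every part to be valid for the from clause.
impl<L, R, QS> ValidInFromClause<QS> for Add<L, R>
where
    L: ValidInFromClause<QS>,
    R: ValidInFromClause<QS>,
{
}

// A composite expression is selectable only if the SQL type of each
// part is unchanged on this query source.
impl<L, R, QS> SelectableExpression<QS> for Add<L, R>
where
    L: Expression + SelectableExpression<QS, SqlTypeForSelect = <L as Expression>::SqlType>,
    R: Expression + SelectableExpression<QS, SqlTypeForSelect = <R as Expression>::SqlType>,
{
    type SqlTypeForSelect = Self::SqlType;
}

// Trivial bodies; the bounds do the checking at compile time.
pub fn can_filter<E, QS>() -> bool
where
    E: ValidInFromClause<QS>,
{
    true
}

pub fn can_select<E, QS>() -> bool
where
    E: SelectableExpression<QS>,
{
    true
}
```

In this sketch `can_filter::<Add<PostsId, IntLiteral>, UsersLeftJoinPosts>()` compiles, while `can_select` with the same type arguments is rejected, since `PostsId` selects as `Nullable<Integer>` on the join.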

This change is also to support what I think will fix the root of all
these issues. The design of "Here's the SQL type on this query source"
is just fundamentally not what we need. There is only one case where the
type changes, and that is to become nullable when it is on the right side of
a left join, the left side of a right join, or either side of a full
join.

One of the changes that #709 made was to require that you explicitly
call `.nullable()` on a tuple if you wanted to get `Option<(i32,
String)>` instead of `(Option<i32>, Option<String>)`. This has worked
out fine, and isn't a major ergonomic pain. The common case is just to
use the default select clause anyway. So I want to go further down this
path.
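
To show the difference between the two shapes in plain Rust values, here is a small sketch. The function name `nullable_tuple` is hypothetical, and Diesel performs the equivalent step at the type level rather than on values like this.

```rust
// Collapse a row of individually nullable columns into one optional tuple.
// If any column is NULL, the whole tuple is treated as absent.
pub fn nullable_tuple(row: (Option<i32>, Option<String>)) -> Option<(i32, String)> {
    match row {
        (Some(id), Some(name)) => Some((id, name)),
        _ => None,
    }
}
```

For a left outer join with no matching row on the right side, every right-hand column is `NULL`, so the collapsed form is simply `None`.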

The longer term plan is to remove `SqlTypeForSelect` entirely, and *not*
implement `SelectableExpression` for columns on the nullable side of a
join. We will then provide these two blanket impls:

```rust
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<T> where T: SelectableExpression<Right>,
{}

impl<Left, Right, Head, Tail> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<Cons<Head, Tail>> where
        Nullable<Head>: SelectableExpression<LeftOuterJoin<Left, Right>>,
        Nullable<Tail>: SelectableExpression<LeftOuterJoin<Left, Right>>,
{}
```

(Note: Those impls overlap. Providing them as blanket impls would
require rust-lang/rust#40097. Providing them as
non-blanket impls would require us to mark `Nullable` and possibly
`Cons` as `#[fundamental]`)

The end result will be that nullability naturally propagates as we want
it to. Given `sql_function!(lower, lower_t, (x: Text) -> Text)`, doing
`select(lower(posts::name).nullable())` will work. `lower(posts::name)`
will fail because `posts::name` doesn't impl `SelectableExpression`.
`lower(posts::name.nullable())` will fail because while
`SelectableExpression` will be met, the SQL type of the argument isn't
what's expected. Putting `.nullable()` at the very top level naturally
follows SQL's semantics here.
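
The three cases can be checked with a standalone sketch that models only the first of the two planned impls (so the overlap noted above does not arise). Everything here is a hypothetical stand-in: `NullableSql` is the SQL type marker, `Nullable` is the wrapper that `.nullable()` would produce, and `Lower` plays the role of `lower(...)`.

```rust
use std::marker::PhantomData;

// Hypothetical SQL type markers.
pub struct Text;
pub struct NullableSql<ST>(PhantomData<ST>);

pub trait Expression {
    type SqlType;
}
pub trait SelectableExpression<QS>: Expression {}

pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<Left, Right>(PhantomData<(Left, Right)>);

// Expression wrapper standing in for the result of `.nullable()`.
pub struct Nullable<T>(pub T);
impl<T: Expression> Expression for Nullable<T> {
    type SqlType = NullableSql<T::SqlType>;
}

// First planned impl: anything selectable from the right side is
// selectable from the join once it has been wrapped as nullable.
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>> for Nullable<T>
where
    T: SelectableExpression<Right>,
{
}

// Stand-in for `posts::name`. Deliberately no impl for the join.
pub struct PostsName;
impl Expression for PostsName {
    type SqlType = Text;
}
impl SelectableExpression<Posts> for PostsName {}

// Stand-in for `lower(x)`, which requires a non-nullable `Text` argument.
pub struct Lower<T>(pub T);
impl<T: Expression<SqlType = Text>> Expression for Lower<T> {
    type SqlType = Text;
}
impl<T, QS> SelectableExpression<QS> for Lower<T>
where
    T: SelectableExpression<QS> + Expression<SqlType = Text>,
{
}

// Trivial body; the bound does the checking at compile time.
pub fn can_select<E, QS>() -> bool
where
    E: SelectableExpression<QS>,
{
    true
}
```

With these definitions, `can_select::<Nullable<Lower<PostsName>>, LeftOuterJoin<Users, Posts>>()` compiles. Both `Lower<PostsName>` and `Lower<Nullable<PostsName>>` are rejected on the join, for the two reasons given above.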

sgrif added a commit that referenced this issue Mar 3, 2017

Split `SelectableExpression` into two traits