Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need to change Column (Nullable ...) to NullableColumn ... #97

Open
tomjaguarpaw opened this issue Aug 14, 2015 · 23 comments

Comments

Projects
None yet
6 participants
@tomjaguarpaw
Copy link
Owner

commented Aug 14, 2015

I realised that Column (Nullable ...) doesn't make sense. Not only does it allow the nonsensical type Column (Nullable (Nullable ...)) it also means that the type Column a -> Column a -> Column PGBool is uncorrect. It should be Column a -> Column a -> Column (Nullable PGBool). This is too painful to contemplate!

The correct fix is to have two type constructors, Column and NullableColumn. This will break a lot of code. I propose the following

  • 0.5: Add NullableColumn a as type alias for Column (Nullable a)
  • 0.6: Deprecate Nullable
  • 0.7: Remove Nullable

As the major stakeholders, I would appreciate opinions from

@bergmark

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2015

I think this would be pretty hairy for us to work around...

As for .==: We solved this by not allowing comparisons of Nullable and using a .==? for them instead.

I also don't see what's nonsensical about Nullable (Nullable a). You can't distinguish between Nullable Null and Null but I thought that was the reason Maybe isn't used for these columns. We do the same thing in Fay.

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented Aug 14, 2015

I think this would be pretty hairy for us to work around...

Can you say more about why?

We solved this by not allowing comparisons of Nullable

How did you achieve that?

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented Aug 14, 2015

I see you achieved it thus: (.==) :: ShowConstant a => Column a -> Column a -> Column Bool

@bergmark

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2015

Can you say more about why?

Sure,

We currently only use two type aliases, one for the haskell values and one opaleye, e.g.:

data PersonP a b = Person { name :: a, gender :: b }
type PersonO = Person String (Nullable Gender)
type PersonH = Person String (Maybe Gender)

A normal query for a person gives us To Column PersonO with Nullable already inside the type alias, we'd have to pull that out somehow and the To type family may not be usable anymore.

A left join gives us MapTo Column (OtherType, To Nullable PersonO) = (To Column OtherType, To Column (To Nullable PersonO)), this would also need to change somehow.

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented Aug 14, 2015

I can see that would be problematic.

@bergmark

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2015

Since Opaleye has PGOrd already we could have PGEq a :: Column a -> Column a -> PGBool. We went through this migration and It was a good thing to do. Type errors were easy to fix, first consider whether the postgres semantics for null equality is what we want and then change to .==?, possibly calling nullable on one of the operands.

@k0001

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2015

I agree with @tomjaguarpaw that Column a vs NullableColumn a is a better represenation than Column a vs Column (Nullable a).

As an additional data point, I'm currently working on a thin layer on top of Opaleye to provide a more robust API in scenarios like this one: I'm trying to make the distinction between Column (Nullable a) and Column a as explicit as possible, yet allow some operations (say, something like .==) to work correctly both in nullable and not nullable columns, but there are many cases where implementing instances for those overloaded operations is inconvenient or impossible because of overlapping instances between Column (Nullable a) and Column a.

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented Sep 10, 2015

Renzo, I suggest you go ahead and define newtype NullableColumn a = NullableColumn (Column a) somewhere in your own library and report back how it goes! You may have to write a bunch of boilerplate to convert to existing Haskell functions but it's probably worthwhile nonetheless.

@k0001

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2015

Good idea Tom, I might just do that. Thanks.
Il 10/set/2015 06:27 PM, "tomjaguarpaw" notifications@github.com ha
scritto:

Renzo, I suggest you go ahead and define newtype NullableColumn a =
NullableColumn (Column a) somewhere in your own library and report back
how it goes! You may have to write a bunch of boilerplate to convert to
existing Haskell functions but it's probably worthwhile nonetheless.


Reply to this email directly or view it on GitHub
#97 (comment)
.

@k0001

This comment has been minimized.

Copy link
Contributor

commented Sep 20, 2015

I wanted to mention that in opaleye-sot, the thin API layer on top of opaleye I'm working on, I've made the switch to this representation. I'm quite happy with the results so far, having this explicit distinction between what can and what can't be nullable brings the benefits of Int vs Maybe Int to the DSL. This distinction makes it much easier to reason about what can and can't be NULL.

To avoid any confusions between Opeleye's current Column and this other representation that forbids having a Column (Nullable a), I've temporarily renamed the Column proposed here to Kol, and the NullableColumn proposed here to Koln. You can see my work on Kol and Koln in https://github.com/k0001/opaleye-sot/blob/master/src/lib/Opaleye/SOT/Internal.hs

Keep in mind that said code is a work in process, and that in that very same module a lot of type-level programming is going on for purposes unrelated to the implementation of Kol and Koln, so you can ignore that. If you are interested in learning about my goals with opaleye-sot, you can take a look at some of the examples in the tutorial I'm working on: https://github.com/k0001/opaleye-sot/blob/master/src/lib/Opaleye/SOT/Tutorial.hs

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented Sep 20, 2015

Thanks for the progress report. Glad to hear it's working out well.

@ocharles

This comment has been minimized.

Copy link
Contributor

commented May 5, 2016

I think there's one more oddity that you might want to consider here. Functions like restrict actually have domain NullableColumn Bool rather than just Column Bool. However, I believe NULL does have a special meaning here - it's the same as false. Likewise for LEFT JOIN.

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented May 5, 2016

I'm not so worried about that. Anyone who wants to restrict or join on a genuinely nullable condition is mad.

@ocharles

This comment has been minimized.

Copy link
Contributor

commented May 5, 2016

Err, no they aren't.

SELECT * X LEFT JOIN Y ON (X.A = Y.A) LEFT JOIN Z ON (Y.A = Z.A)

Y.A can be null.

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented May 5, 2016

That's something else. The return value of the equality cannot be NULL.

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented May 5, 2016

Actually I rescind my claim of madness in the join case. Due to the use of indexes I can see that you might want to join on nullable conditions. I don't think you ever want to restrict on one though.

@ocharles

This comment has been minimized.

Copy link
Contributor

commented May 5, 2016

Surely any arguments for indexes applies equally well for WHERE clauses then. But I have no examples there, but I do have examples with joins (the above is just an alpha-renaming & simplification of a real query we have).

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented May 5, 2016

OK you convinced me. I'm the mad one :)

@xnmp

This comment has been minimized.

Copy link

commented May 4, 2017

Is there any progress on this at all, and would it solve the problem of having to to manually type annotate left joins? Doing that (and also having to type annotate any usage of runQuery) is by far the single most annoying thing about Opaleye for me.

the type of left join could just change to

leftJoin :: (D.Default U.Unpackspec (Column a) (Column a),
              D.Default U.Unpackspec (Column b) (Column b),
          => Query (Column a)  -- ^ Left query
          -> Query (Column b)  -- ^ Right query
          -> ((Column a, Column b) -> Column T.PGBool) -- ^ Condition on which to join
          -> Query (Column a, NullableColumn b) -- ^ Left join

With of course appropriate modifications to dealing with records and tuples. (Column User ... = User Column ...), and Column (a,b) = (Column a, Column b).
Since I only write select queries I'm sacrificing type safety so that I don't have to annotate by getting rid of nullable completely, with:

null2 :: Column a
null2 = O.Column (HPQ.ConstExpr HPQ.NullLit)

isNull2 :: Column a -> Column Bool
isNull2 = O.unOp HPQ.OpIsNull

@tomjaguarpaw tomjaguarpaw changed the title Need to change Column (Nullable ...) to NullableCollumn ... Need to change Column (Nullable ...) to NullableColumn ... Jun 3, 2017

@tstat

This comment has been minimized.

Copy link
Contributor

commented Nov 3, 2017

@bergmark @ocharles @hesselink @dbp @c-lewis back in June I implemented the change from Column (Nullable a) to Column (n :: Nullability) a in this PR: #305. I haven't really revisited this issue (or the opaleye codebase) since then, so I may need to refresh myself on all of the possible implications of this change, but it seems like an obvious improvement to me for a couple of reasons.

  1. Nullable (Nullable a) and Nullable a are equivalent in that the possible return values from postgres are still a, or NULL, that is, it is impossible to get a Just Nothing back from such a query, but they are distinct at the type level.
  2. With the Nullable a construction we cannot represent arbitrary non-nullable types parametrically, This causes the type restrictions in some functions to be nonsensical. For instance, consider the functions for manipulating JSON. I have pasted one of these functions below:
(.#>) :: (PGIsJson a)
      => Column (C.Nullable a) -- ^
      -> Column (T.PGArray T.PGText) -- ^ path
      -> Column (C.Nullable a)

We make the first argument nullable so that we can ensure the return column is a nullable value of type a without running the risk of returning a nested nullable. Restricting the type of the first argument to be nullable is not actually meaningful, it is just a trick to make sure that the return value is nullable, but not a nested nullable. As another example of this same issue, I introduced joinNullable :: Nullable (Nullable a) -> Nullable a in my PR for adding indexing on arrays, as index :: (C.PGIntegral n) => Column (T.PGArray a) -> Column n -> Column (C.Nullable a) would return a nested Nullable if the input array has nullable values. So we are having to do some type dancing that is not actually meaningful. The Column (n :: Nullability) a approach solves this issue since we can represent arbitrary non-nullable types parametrically. d2c94d1#diff-7eeb6cd9196c52b6b8e2d6da8c6a0265L261

We could also solve issues like #91 with this approach.

So to summarize: I don't see how Nullable a gives us any distinct information when compared to a Column (n :: Nullability) a approach, yet we are introducing arbitrary type restrictions and doing small type dances to satisfy them. So I propose we move to a Column (n :: Nullability) a approach.

How do you all feel about this proposal?

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented Nov 4, 2017

How do you all feel about this proposal?

My feelings on the matter are exactly the same as before:

The idea is sound[1] and basically The Right Thing™ but it's an enormous change and the practical benefit over the status quo is small. This means that the time before I'll make this change is measured in months and, to be quite realistic, I may never get round to it at all.

-- #305 (comment)

To be clear, "enormous" in the above means in terms of the breakage it will cause clients, rather than in the amount of work it would take to implement.

I think it's worth talking through some of the issues and seeing how people feel about them. For example every column would have to be tagged with its nullability, not just nullable ones. We'll see
Column NonNullable PGType everywhere, instead of just Column PGType. We could shorten this to Column NN PGType, but it's still more text than before. Personally I think it's easily a price worth paying, but does everyone else?

@tstat

This comment has been minimized.

Copy link
Contributor

commented Nov 4, 2017

I think it's worth talking through some of the issues and seeing how people feel about them.

Yes, I agree; that is certainly my intention with this post. I know that @bergmark and @ocharles had issues with this proposal when you initially made it. I suspect some of these potential problems might be alleviated since this issue was originally proposing separate constructors.

For example every column would have to be tagged with its nullability, not just nullable ones. We'll see
Column NonNullable PGType everywhere, instead of just Column PGType. We could shorten this to Column NN PGType, but it's still more text than before. Personally I think it's easily a price worth paying, but does everyone else?

I think the specific issue you raise of verbosity is a legitimate as well, especially with new type variables to type classes, however, I will point out that we can shorten it beyond Column NN PGType and simply use Column PGType and NullableColumn PGType by declaring a couple of type aliases.

0ec5bcd#diff-a04b9a32b4efa5892b2d9309cb17af57R15

@tomjaguarpaw

This comment has been minimized.

Copy link
Owner Author

commented Nov 5, 2017

I know that @bergmark and @ocharles had issues with this proposal when you initially made it.

I think the idea of using a type parameter instead of two separate type constructors probably addresses their points.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.