Skip to content
This repository has been archived by the owner on May 4, 2019. It is now read-only.

[WIP/RFC] functional operations on nullables as collections #115

Closed
wants to merge 4 commits into from
Closed

[WIP/RFC] functional operations on nullables as collections #115

wants to merge 4 commits into from

Conversation

TotalVerb
Copy link

@TotalVerb TotalVerb commented Jun 12, 2016

This implements Scala-style map and filter. I did not implement Scala's fold semantics, because I consider that to be a mistake, since it is inconsistent with the regular foldr and foldl, which I did keep. Since map is not type-stable, I also implemented broadcast as a type-stable map. broadcast also preserves the broadcasting capability of the version for arrays.

I also implemented x[] and x[1] to replace get(x). Finally, I added in the iteration interface, which allows imperative-style for loops as well as functional-style higher order functions.

John presented several options for propagation in this old mailing list post, which I quote:

As I see it, progagation can work in a couple of ways:

(1) Always off: you can't propagate nullability without explicitly constructing a Nullable object.
(2) Opt-in at the call-site: you can propagate nullability by using something like map(f, Nullable).
(3) Opt-in near the call-site: you can propagate nullability within a block of code using something like @propagate begin x = Nullable(1) + Nullable(2); sin(x) end
(4) Opt-in at the definition site: you can propagate nullability by annotating a function as propagating in the same way that we annotate functions as vectorizable.
(5) Always on: you can't prevent propagation without explicitly checking for nullability before calling any function.

The implementation of broadcast here doubles as an implementation of (2). This is simply a nice result of the algebraic properties of broadcasting. With new broadcast syntactic sugar, (2) is no longer very verbose. I think alternative syntaxes like ? are hence no longer required.


Notes: (optional, sorry for the long PR)

  • I avoided implementing the Haskell-style monadic functions, because I don't think Julia has any monads and so that would be a big departure. The Scala-style functions feel more typical in a Julia environment.
  • I haven't optimized for performance yet. Obviously some functions will need to be specialized. But since 0.5 allows specialization on functions, I am sure that this syntax can be made as fast as any manual lowering.
  • Over the course of this PR I discovered some further nasty things with nullable, such as how get(Nullable(), 1) doesn't work (it expects a Union{} second argument?!) and how equality comparison is a bit broken (try Nullable() == Nullable()). I am new to Nullables so please excuse if there is a reason for this behaviour, but I found it heavily unintuitive.
  • This is only turned on on v0.5- right now. It can probably be extended to 0.4 (I don't see why not) but performance will be terrible and the shorthand . broadcast syntax will not be available.
  • I would prefer to deprecate get(u) in favour of u[]. The former syntax seems strange to me, as all other uses of get in Julia include a default value. I don't know if this is a popular viewpoint; my background in nullables comes more from Lisp (where (cons x nil) can be used as Nullable(x)) and Scala than from numerical computing languages.
  • Mathematically, I'm treating nullables akin to infinite-dimensional square arrays here. (More precisely, a Nullable type is an union of infinite-order tensor power [defining "infinite" here as "arbitrarily large and finite", since actual infinite-order tensor powers probably behave more strangely] of finite-dimensional vector spaces, of course disallowing infinite objects. The reason for this long-winded analogy is that it's basically a SquareArray{T, ∞}, which provides a baseline for what should and shouldn't work.) This is perfectly consistent algebraically but the algebra is unimportant. It's only a guide for what kind of functionality should be implemented.
  • While the mathematics makes sense to me, I couldn't find any papers on treating Option values as infinite-order tensor powers of finite-dimensional vector spaces, and I wouldn't be surprised if this was not a description in use at all. So if this turns out to be absolutely incoherent, then let's abandon the analogy. I just thought that it is quite nice for explaining the parallel for broadcast and for indexing, as identical operations.

@coveralls
Copy link

coveralls commented Jun 12, 2016

Coverage Status

Coverage decreased (-77.6%) to 4.422% when pulling c8a4bd9 on TotalVerb:fw/nullableops into 9842b91 on JuliaStats:master.

@TotalVerb
Copy link
Author

I have never seen such a large coverage decrease! [I turned the other tests off for faster testing locally, and forgot to turn them back on again.]

@coveralls
Copy link

coveralls commented Jun 12, 2016

Coverage Status

Coverage decreased (-0.2%) to 81.882% when pulling 0457087 on TotalVerb:fw/nullableops into 9842b91 on JuliaStats:master.

@coveralls
Copy link

coveralls commented Jun 12, 2016

Coverage Status

Coverage increased (+0.1%) to 82.178% when pulling d69b31d on TotalVerb:fw/nullableops into 9842b91 on JuliaStats:master.

@hayd
Copy link

hayd commented Jun 13, 2016

It feels a little strange this being is in NullableArrays rather than base (or, if it really can't get into base/0.5, its own package)... 👍 to this functionality.

Edit: IMO The map, getindex are more obviously good for base, I suspect the broadcast stuff is distinct from that... and more controversial (?).

@vchuravy
Copy link

Already map over Nullable is controversial in Base. JuliaLang/julia#9446

importall Base
import Base: promote_op, LinearFast

# conceptually, nullable types can be considered infinite-order tensor powers of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment really helps understanding the code. Personally, I rather find it hard to get. :-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so attached either. I'll leave this comment out.

# warning: not type-stable
map{T}(f, u::Nullable{T}) = u.isnull ? Nullable{Union{}}() : Nullable(f(u.value))

# multi-argument map doesn't broadcast, so not very useful, but no harm having
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left out the comments

@nalimilan
Copy link
Member

I agree this should go into Base (base/nullable.jl), not NullableArrays. We're rather trying to move other essential features into Base.

There are many points to discuss in the description, but let's try to keep focused on this PR. I think we should keep get for now, I don't really see the problem with it: we could even add get(x::AbstractArray, i) (with no default) for consistency, even if it would be redundent with getindex. Anyway that's a minor point which could be tackled later. The get(Nullable(), 1) failure sounds like a bug to file separately. As regards ==, see the arithmetic operators issues I linked to at JuliaLang/julia#16889 (comment).

Regarding type-inference issues, they are dependent on fixing broadcast and map in Base. See e.g. JuliaLang/julia#4883.

@vchuravy I think with the new .( syntax it's less controversial. At least I now find this approach interesting.

@TotalVerb
Copy link
Author

TotalVerb commented Jun 13, 2016

Thanks for the comments. I see that NullableArrays is not the right place to submit this PR. After addressing @nalimilan's comments [thanks!], I'll move this PR to Base.

@TotalVerb
Copy link
Author

Thank you @nalimilan for the review comments! Much appreciated. I will close this PR after Travis passes and move it to Base later tomorrow.

@coveralls
Copy link

Coverage Status

Coverage decreased (-6.04%) to 76.006% when pulling 21160c1 on TotalVerb:fw/nullableops into 9842b91 on JuliaStats:master.

@TotalVerb
Copy link
Author

Travis passing now (except for unrelated, as far as I can tell, failure on nightly). Will move PR to Base.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants