Get dict value as a nullable #13055

Open
dbeach24 opened this issue Sep 10, 2015 · 33 comments
Labels
domain:missing data Base.missing and related functionality

Comments

@dbeach24
Contributor

I need a good way to both test for a value in a dictionary and get that value, without resorting to one of the following fallbacks, all of which are somewhat undesirable:

  1. get(dict, key, specialnullval) -- problematic because I must either use a separate type to represent null (inefficient due to the type-variadic return), or I must have some special way to represent null within the value type of the dictionary.
  2. haskey(dict, key) followed by dict[key] -- problematic because it looks up the key twice when it exists.
  3. try k = dict[key] catch ... -- problematic because it relies on exceptions when the key is not present.
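For concreteness, the three fallbacks above might look like the following. This is only a sketch in current Julia syntax (not v0.4); the dictionary `d`, the keys, and the sentinel `-1` are made up for illustration.

```julia
d = Dict("a" => 1, "b" => 2)

# 1. get with a sentinel: only safe if -1 can never be a legitimate value.
v1 = get(d, "zzz", -1)

# 2. haskey followed by indexing: the key is looked up twice when it exists.
v2 = haskey(d, "a") ? d["a"] : nothing

# 3. try/catch: a missing key raises a KeyError, which is caught here.
v3 = try
    d["zzz"]
catch err
    err isa KeyError || rethrow()
    nothing
end
```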

It seems that all of this can be solved by making use of Julia v0.4's Nullable type, with something like the following:

function getnull{K,V}(h::Dict{K,V}, key)
    index = ht_keyindex(h, key)
    return (index<0) ? Nullable{V}() : Nullable{V}(h.vals[index]::V)
end

I am proposing that this be included as part of base/dict.jl, possibly under a different/better name.

Is there some other/better way to do this?

@JeffBezanson
Sponsor Member

Very reasonable idea.

@yuyichao
Contributor

Ref #12157. Not necessarily a dup of that but should be solved by the proposal there.

@dbeach24
Contributor Author

@yuyichao Thanks for the heads up. I read through #12157, but personally I would side with Stefan: I don't care for the complexity and the syntax. IMO, this seems like a more direct way to solve this (admittedly more narrow) problem.

@ScottPJones
Contributor

Does it have to be one or the other (this or #12157)? I think this would be a very nice capability to have, no matter what.

@davidagold
Contributor

If ? could be made to parse slightly differently, then perhaps the getnull method proposed above could be named get? and suffixing method names with an ? could indicate in general that the method returns a Nullable object.

@tkelman
Contributor

tkelman commented Sep 11, 2015

Don't think that would play nicely with ternary syntax. Though resolving #4233 (nicest approach proposed so far would be a general macro @if for compile-time conditionals and replacing @unix? a : b with that) would help free up ? a bit.

@davidagold
Contributor

Indeed, hence conditioning the suggestion on getting ? to parse differently. I think that if at least one whitespace character were required between the condition and the ternary operator, then names could include ? at the end (or elsewhere in them, for that matter) without parsing as conditionals.

Fun aside, it is currently possible to use the ? character itself as a name, but the only way I've found the result to be at all functional is if one defines ? to be a callable object, since evidently the parser does not interpret ? as the ternary operator if it is the very first symbol in an expression:

julia>  ? = x -> x + 5
(anonymous function)

julia>  ?(5)
10

@tkelman
Contributor

tkelman commented Sep 11, 2015

#6286 #1910

@mbauman
Sponsor Member

mbauman commented Sep 11, 2015

I think this could be spelled get with just two arguments (and no default value).

julia> methods(get, Tuple{Associative,Any})
0-element Array{Any,1}

julia> methods(get, Tuple{Associative,Vararg{Any}})
5-element Array{Any,1}:
 get(t::ObjectIdDict, key::ANY, default::ANY) at dict.jl:326
 get{K,V}(h::Dict{K,V}, key, default) at dict.jl:722
 get{K}(wkh::WeakKeyDict{K,V}, key, default) at dict.jl:834
 get(::Base.EnvHash, k::AbstractString, def) at env.jl:80
 get{K,V}(pq::Base.Collections.PriorityQueue{K,V,O<:Base.Order.Ordering}, key, deflt) at collections.jl:227

julia> methods(get, Tuple{Any,Any})
1-element Array{Any,1}:
 get{T}(x::Nullable{T}, y) at nullable.jl:33

@davidagold
Contributor

Ahah, thank you for these resources. I'll be curious to see if requiring whitespace for the ternary operator and thereby freeing up ? for use in identifiers ever becomes a thing.

@toivoh
Contributor

toivoh commented Sep 11, 2015

I think when the original idea to allow ? in identifiers came up, Stefan said that "you can pry the ternary operator out of my cold, dead hands". So for better or worse, I don't think that will ever be a thing.

@davidagold
Contributor

Hmm. Is requiring whitespace between a condition and the ternary operator tantamount to nixing the latter altogether? I understand that such a requirement may be a particularly annoying gotcha, but beyond that I must be missing something.

@johnmyleswhite
Member

I think @StefanKarpinski may be softening his stance given his growing appreciation of the centrality of nullable types in many practical settings.

@ScottPJones
Contributor

Too bad, the ternary operator is one of the things that is always confusing to new programmers, and uses up two ASCII characters (a very limited resource).
I think at least requiring spaces around the ? and : is not such a terrible thing.

@jakebolewski
Member

Death to the ternary operator.

@ScottPJones
Contributor

It really does seem out of place, with some of the rest of the Julia syntax being easy to read/understand compared to C/C++/Java/JS world. Maybe I need to define something like @eitheror(boolean, a, b) (macro, just to be able to get conditional evaluation of a or b, since ifelse(boolean, a, b) evaluates both).
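A rough sketch of what such a macro could look like, in current Julia syntax. The name `@eitheror` comes from the comment above; the helper `noisy` and the `calls` log are made up here purely to show which branches get evaluated.

```julia
# Hypothetical macro: expands to an ordinary if/else expression,
# so only the selected branch is evaluated.
macro eitheror(cond, a, b)
    quote
        if $(esc(cond))
            $(esc(a))
        else
            $(esc(b))
        end
    end
end

calls = String[]                          # records which branches ran
noisy(tag, x) = (push!(calls, tag); x)    # logs its tag, then returns x

r1 = @eitheror(true, noisy("a", 1), noisy("b", 2))   # runs only the "a" branch
r2 = ifelse(true, noisy("a", 1), noisy("b", 2))      # runs both "a" and "b"
```

Because `ifelse` is an ordinary function, both of its value arguments are evaluated before the call, whereas the macro expands to a real conditional.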

@jiahao
Member

jiahao commented Sep 11, 2015

See #1910 for a discussion of ? as part of an identifier name, and #5936 for a more general discussion of allowable code points in identifiers.

@tbreloff

-1000000. Must... use... ternary... operator...

@quinnj
Member

quinnj commented Sep 11, 2015

I've said it before, but I'd still be in favor of a if x then y else z one line construct; even giving up the ternary operator for it.

@ScottPJones
Contributor

Actually, you can already write a one-line if construct in Julia; the ternary operator is just extra, unnecessary syntax.
result = (if cond ; a ; else ; b ; end)
I should have thought of it earlier.

@hayd
Member

hayd commented Sep 11, 2015

or without the ;s

result = if cond a else b end

Though I think python's a if cond else b reads nicer.

@ScottPJones
Contributor

I just wasn't sure if it would always work without the ;'s, because of operator precedence issues, but
you're right, and it probably works in most cases, and is nicer to read.

Julia also has some rough edges because of things like a, b meaning tuple construction, even without parentheses.
I dislike Python's syntax though, because you don't know that a might not be evaluated until you've read the if part.

@eschnett
Contributor

By extension, such a function would also make sense for arrays.

This suggests a syntax that extends the [] operator as well, as in one of these:

x = d[key?]
x = d[?key]
x = d[key]?
x = d?[key]

Regarding parsing -- it might be a nightmare to implement, but all except the third one are unambiguous (i.e. do not conflict with the current ?: operator).

@ScottPJones
Contributor

I'd worry that the 1st and 4th might have problems, what happens when key is an expression?
x = d[key1?[key2]]
Also, what if ? were allowed in identifiers? Would d[? key] then require a space between the [? operator and the name?

@eschnett
Contributor

The syntax ?[ is still illegal today, as expressions cannot begin with an opening bracket [.

Of course, if you're changing how ? is parsed, then all bets are off.

@mbauman
Sponsor Member

mbauman commented Sep 14, 2015

julia> b?[1]:2
1-element Array{Int64,1}:
 1

That would attempt to make a Range if the parsing were changed.

Even if we allow ? in identifiers, we could still prevent them from being the start of an identifier. That would allow an [? operator. Of course, if we start adding ? operators, when do we stop? Are we going to have dot-operators and question-operators? Field access: obj.?field? Function call lifting: f(?x)? Arithmetic: Nullable(1) +? Nullable(2)?

@davidagold
Contributor

I'll clarify my original remarks. There are at least a few kinds of situations I can think of in which a method returns a Nullable object. The first is in the case of lifted operators -- methods that are defined for non-Nullable arguments and are systematically assigned a corresponding behavior over Nullable arguments. The second concerns objects such as NullableArrays whose dedicated functionality specifically concerns Nullables. The third is in the case such as above, in which an object of a type not specifically dedicated to work with Nullables -- in this case Dict -- is given a method that returns Nullable for one reason or another.

I don't think ? suffixing is useful in the first two cases, where it should already be clear from context that the methods employed will return Nullables. My point in suggesting the suffix for situations such as get? is that it helps to make clear at a glance that, in this context, a collection that normally does not return Nullable objects will do so because of how the author intends to handle missing indices. It's just additional clarity that helps in reading the code. If it's unlikely that such parsing of ? will be available, then I think Matt's two-arg method signature solution would work just fine.

dbeach24 added a commit to dbeach24/julia that referenced this issue Aug 24, 2016
This returns a Nullable value corresponding to the key (or null).
Includes specialization for dictionaries.
Fixes JuliaLang#13055
@dbeach24
Contributor Author

This issue has been open for almost a year now.

@JeffBezanson I hope this idea still seems as "reasonable" as it did last year. Please let me know if not.

Without straying into the larger questions regarding special syntax for Nullable types and/or functions that return Nullables, I went ahead and implemented the fix I initially proposed. (See #18211)

Could someone please review/decide?

dbeach24 added a commit to dbeach24/julia that referenced this issue Aug 24, 2016
This returns a Nullable value corresponding to the key (or null).
Includes specialization for dictionaries.
Fixes JuliaLang#13055
dbeach24 added a commit to dbeach24/julia that referenced this issue Aug 30, 2016
This returns a Nullable value corresponding to the key (or null).
Includes specialization for various dictionary types.
Fixes JuliaLang#13055
dbeach24 added a commit to dbeach24/julia that referenced this issue Aug 30, 2016
This returns a Nullable value corresponding to the key (or null).
Includes specialization for various dictionary types.
(Fix bug in priority queue test.)
Fixes JuliaLang#13055
@nalimilan
Member

FWIW, here's a review of how several languages handle looking up a non-existent key in a dict (see in particular Appendix C):
https://codewords.recurse.com/issues/one/option-and-null-in-dynamic-languages

@nalimilan nalimilan added the domain:missing data Base.missing and related functionality label Sep 6, 2016
@KristofferC
Sponsor Member

Nullable is removed from Base. Not moving this to Nullables.jl because it seems like this needs to be a Base feature in order for it to be useful.

@nalimilan
Member

The issue itself is still relevant though. See PR #25131 for a similar change (merging tryparse and parse). I think we could now support this via get(Union{Some, Void}, dict, key).

@nalimilan nalimilan reopened this Jan 7, 2018
@StefanKarpinski
Sponsor Member

If we were to have that, perhaps it would be best called tryget(dict, key) which would by definition return Union{Some{eltype(dict)[2]}, Nothing} with nothing indicating absence. That would be a non-breaking API addition since it would leave the existing get function as is.
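As a rough illustration of the idea, here is one way a `tryget` could be written on top of the public three-argument `get`. Nothing here is part of Base: `tryget` is the name floated above, and the marker type `NotFound` is made up for this sketch, which assumes the dictionary never stores a `NotFound` value.

```julia
# Hypothetical private marker type used as the default for `get`.
struct NotFound end

# Hypothetical helper: returns Some(value) when the key is present and
# `nothing` when it is absent, using a single call to `get`.
function tryget(d::AbstractDict, key)
    v = get(d, key, NotFound())
    return v isa NotFound ? nothing : Some(v)
end

d = Dict("a" => 1, "b" => nothing)

ra = tryget(d, "a")   # key present: the value wrapped in Some
rb = tryget(d, "b")   # key present with value `nothing`: still wrapped in Some
rc = tryget(d, "c")   # key absent: plain `nothing`
```

Wrapping in `Some` is what lets a caller tell a stored `nothing` apart from a missing key; `something(ra)` unwraps the value.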

@henriquebecker91
Contributor

I can open a new issue for this, but this seems the closest one to what I want.

I want to add to Base a method with the following signature (calling this now gives an error, so this is not breaking):

get!(f::Function, collection, key, default)

Return the new value stored for the given key, which is the application of f
to collection[key], or if no mapping for the key is present, f(default).

Basically, this avoids the double lookup problem that sometimes pops up on Discourse (see: https://discourse.julialang.org/t/avoiding-double-lookup/78636). People end up using ht_keyindex (which is not part of the public interface) to avoid double hashing. The problem only happens if the key has a value and the new value needs the old value to be computed.

My reasoning for applying f to default is that:

  1. f(default) may be a heavy computation that you may not want to do unless it is strictly necessary.
  2. The user can always pass a sentinel value as default and deal with it in a custom way inside f.

Another name, like update!, may also be used instead, as get/get! do not change the Dict when the key exists.
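To make the proposed semantics concrete, here is a minimal sketch under the alternative name `update!`. It is hypothetical and not a Base function. It is written against the generic dictionary interface, so it still performs more than one lookup; a real implementation for Dict would presumably reuse the slot index internally.

```julia
# Hypothetical `update!`: stores f(d[key]) if the key exists, otherwise
# f(default), and returns the newly stored value.
# This generic version illustrates the semantics only; it does not avoid
# the repeated lookup.
function update!(f, d::AbstractDict, key, default)
    newval = haskey(d, key) ? f(d[key]) : f(default)
    d[key] = newval
    return newval
end

counts = Dict("a" => 1)
ua = update!(x -> x + 1, counts, "a", 0)   # key present: stores f(1)
ub = update!(x -> x + 1, counts, "b", 0)   # key absent: stores f(0)
```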
