Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WeakKeyDict should not convert keys as otherwise it can drop them #24941

Closed
mauro3 opened this issue Dec 6, 2017 · 16 comments · Fixed by #28198
Closed

WeakKeyDict should not convert keys as otherwise it can drop them #24941

mauro3 opened this issue Dec 6, 2017 · 16 comments · Fixed by #28198
Milestone

Comments

@mauro3
Copy link
Contributor

mauro3 commented Dec 6, 2017

Consider:

julia> wkh = WeakKeyDict{Vector{Int}, Any}()
WeakKeyDict{Array{Int64,1},Any} with 0 entries

julia> v = Float64[1:10^5;];

julia> wkh[v] = 1
1

julia> gc()

julia> keys(wkh)
Base.KeyIterator for a WeakKeyDict{Array{Int64,1},Any} with 0 entries

I would argue that this is not the intended behavior and instead an error should be thrown on wkh[v] = 1. Although, maybe there is a use for this which escapes me, e.g. hash consing, see #3002.

Open questions:

  • is whether this strict behavior should also be enforced for getindex, i.e. no key conversion on getters either.
  • one inconsistency with non-conversion is that k1==k2 does not imply wkh[k1]==wkh[k2]. Not sure whether this is bad or not (or worse than above example).

I'm not sure whether this issue should be classified as breaking (I guess that depends whether the behavior is a bug or a feature).

I have a branch in which I changed this here, which I could turn into a PR. Alternatively, if this is a feature and not a bug, a test should be added. Could above example be used or is that too fragile due to gc()?

@mauro3
Copy link
Contributor Author

mauro3 commented Jul 11, 2018

This should get the collections label and probably the bug label.

@StefanKarpinski
Copy link
Sponsor Member

WeakKeyDict should really be WeakKeyIdDict, anything else is pretty broken.

@StefanKarpinski StefanKarpinski added the triage This should be discussed on a triage call label Jul 17, 2018
@StefanKarpinski
Copy link
Sponsor Member

I realize it's very late in the game, but is it fundamentally broken for WeakKeyDict not to do egal-based key comparison? @JeffBezanson

@JeffBezanson
Copy link
Sponsor Member

No, I don't think it's fundamentally broken. The weakness controls when keys get deleted, and the comparison controls which keys can be found. For example if [1] is a key, other [1] arrays will match it but the key will only be removed when the original [1] is freed. That might not be the behavior you want, but it doesn't seem totally broken to me. But on the whole I agree WeakKeyIdDict makes a bit more sense.

@mauro3
Copy link
Contributor Author

mauro3 commented Jul 17, 2018

I totally agree that the ID-version makes more sense.

With the non-ID version, I cannot imagine a situation when conversion of the key upon insertion is the desired outcome. Do you have an example when that could be useful and not a bug? (I'm ok with comparison doing a "conversion")

@StefanKarpinski
Copy link
Sponsor Member

Ok, the current version isn't "fundamentally broken" in the sense that it crashes or gives incorrect behavior, but I cannot imagine a situation where someone is using a WeakKeyDict and a WeakKeyIdDict (or WeakIdDict?) would not actually be more correct. Which I think is roughly the same as what @mauro3 is saying.

@mlhetland
Copy link
Contributor

A recent data point in favor of that here.

@mauro3
Copy link
Contributor Author

mauro3 commented Jul 18, 2018

I implemented the using-object-id change in PR #28161, maybe this can help the triage-call tomorrow.

@JeffBezanson
Copy link
Sponsor Member

See #3002 --- Distributed uses the ==-based hashing behavior of WeakKeyDict. For now we should maybe just remove the conversion, and add WeakKeyIdDict later.

@mauro3
Copy link
Contributor Author

mauro3 commented Jul 19, 2018

I found three occurences of WeakKeyDict in stdlib (and none in base/):

As per Jeff's comment, the first two need ==-based hashing. The third looks suspiciously similar. So, if julia itself only needs the == version of WeakKeyDict, then the Id version should probably go to the DataStructures.jl package. However, procrastinating I coded a WeakKeyDict+WeakKeyIdDict over in PR #28182, so you can see how a possible implementation would look like.

It would probably be good though to add tests which test the ==-needing features of above three WeakKeyDict occurrences (as my PR #28161 shows, changing to ===-hashing does not lead to any test failures in their tests).

@StefanKarpinski
Copy link
Sponsor Member

Thanks for the investigation, @mauro3! The conclusion from discussion on the triage call is that we should just make the minimal change here and make WeakKeyDict not do autoconversion anymore.

@Keno Keno added this to the 0.7 milestone Jul 19, 2018
@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Jul 19, 2018
@mauro3
Copy link
Contributor Author

mauro3 commented Jul 19, 2018

Cool, I can prepare a PR tomorrow. Did you discuss whether the no-conversion applies to getters as well?

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 19, 2018

We said getters should continue to convert – it's just setters that should type-assert.

@mauro3
Copy link
Contributor Author

mauro3 commented Jul 19, 2018

get! is a bit odd then:

a = [1]
wkd = WeakKeyDict(a=>1)
get!(wkd, [1.0], 1) # works
get!(wkd, [2.0], 1) # errors

but I guess that is ok. Or maybe more consistent to throw on the first too? (The latter would be easier to implement ;-)

@StefanKarpinski
Copy link
Sponsor Member

I would favor erroring on the first one too.

@mauro3
Copy link
Contributor Author

mauro3 commented Jul 20, 2018

For those following along at home and still wondering when the == behavior is needed. It's only used here where it stores RemoteRefs which are compared/hashed with the methods here. The other two WeakKeyDicts (here and here) take objects which fall back to === comparison/hashing. Sadly I couldn't come up with a test case for the RemoteRefs case which tests the behavior needing ==.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants