Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specs for clojure.set #70

Closed
eerohele opened this issue Nov 1, 2018 · 26 comments
Closed

Specs for clojure.set #70

eerohele opened this issue Nov 1, 2018 · 26 comments

Comments

@eerohele
Copy link
Contributor

eerohele commented Nov 1, 2018

A while back I wrote specs for the functions in clojure.set after being bit by (set/union #{:a} [:b :c]).

Would you be interested in merging that code into this project? If so, I can create a pull request if you want. If you'd rather not, I totally understand.

I think the specs are in pretty good shape otherwise, except that I haven't been able to decide whether set functions should allow nil args or not. While they behave pretty well when given nil args, I think that part is more or less undefined.

The specs currently allow nil args, but I'm willing to disallow them if you think that'd be better.

@borkdude
Copy link
Owner

borkdude commented Nov 1, 2018 via email

@slipset
Copy link
Collaborator

slipset commented Nov 1, 2018

@eerohele Because we'd like to have this repo in a state that lets the whole or part of it be included into clojure.core, we need the contributors to have signed the Clojure CA.
It might seem like you have not done so?

@borkdude
Copy link
Owner

borkdude commented Nov 1, 2018

Snippet from chat in #clojure-spec on Slack.

borkdude [4:22 PM]
Is nil considered valid input for Clojure set functions? 0 arity returns empty set, but 1 arity with nil returns nil

In other words should a spec for those functions account for nil input or throw
seancorfield [4:45 PM]
@borkdude Could you show some code? I'm not sure what you're asking.
borkdude [4:50 PM]
(set/union) vs (set/union nil), they differ in return type
seancorfield [4:55 PM]
set/union is only defined for arguments that are sets tho', right?
(and nil is not a set -- so it's "garbage-in, garbage-out" here?)
So I guess the answer to your question is "No, a spec for those functions should not allow nil as input".
borkdude [5:16 PM]
If we can agree for union etc. that 1) :ret should be set? then we must either 2) reject nil inputs in the :args spec, or 3) assert that the implementation is wrong (edited)
alexmiller [5:54 PM]
don’t know
borkdude [5:55 PM]
would we find it useful for fdefs detect nil (as an instance of non-sets) inputs for set fns? (edited)
or can we say this is undefined territory and assume the simpler spec
alexmiller [5:57 PM]
I think right now I would treat both inputs and output as nilable
as existing code may be relying on the behavior of those things
nils are often used interchangeably with empty collections. it seems unlikely to me that there is not some code relying on this either for input or output

@borkdude
Copy link
Owner

borkdude commented Nov 1, 2018

@eerohele Given the above, I think treating input and output as nilable sets makes sense.

@eerohele
Copy link
Contributor Author

eerohele commented Nov 1, 2018

@slipset I haven't, but I have no problem with it. I'll take care of it.

@borkdude All right — that means I probably won't have to make all that many changes. Thanks for doing the legwork on that.

I'll go over the style guide, make any necessary fixes, and submit a PR for speculative.set.

@jafingerhut
Copy link

Note: In the clojure.data/diff implementation there are calls to at least clojure.set/union, and perhaps a couple of other clojure.set functions, with sequences as arguments (the return value of the clojure.core/keys function IIRC), so not a set and not nil. I believe the implementation of clojure.set/union for the ways they are called from clojure.data/diff always return correct results (i.e. duplicates in the return value are only harmful for performance, not correctness of the clojure.data/diff results). Not sure if that is considered a bug or something that an 'official' spec for clojure.set/union should allow. https://dev.clojure.org/jira/browse/CLJ-1087 I am an interested observer of such detailed questions, too, not a policy maker

@jafingerhut
Copy link

Here is a Clojure JIRA ticket with a patch that proposes specs for clojure.set/union and a couple of other clojure.set functions, but those proposed specs may be not what is desired by the Clojure core team. https://dev.clojure.org/jira/browse/CLJ-2287

@borkdude
Copy link
Owner

borkdude commented Nov 1, 2018

@jafingerhut I think that usage of union is similar or equal to the usage of into?

cljs.user=> (require '[clojure.set :as set])
nil
cljs.user=> (def a {:a 1 :b 2})
#'cljs.user/a
cljs.user=> (def b {:b 2 :c 3})
#'cljs.user/b
cljs.user=> (set/union (keys a) (keys b))
(:c :b :a :b)
cljs.user=> (into (keys a) (keys b))
(:c :b :a :b)
cljs.user=>

So the question remains, why did Stu not did that... maybe he forgot a call to as-set-value around those keys? It works now, but did he intend to write it like this? The rest of the set functions in that namespace are called with sets.

@jafingerhut
Copy link

I agree that question remains. Sorry, I don't have the answer.

@eerohele
Copy link
Contributor Author

eerohele commented Nov 2, 2018

When I researched the behavior of clojure.set functions when passed non-set args a while back, I stumbled upon this comment by Alex Miller:

This was done somewhat intentionally for performance reasons - rather than validate or transform, just assume they are sets - their behavior when passed non-sets is considered undefined. However, I think this is worth revisiting in light of where we are today vs then. (https://groups.google.com/d/msg/clojure/HtK4pqsr--8/KbWvq8CFCQAJ)

In light of that comment, it seems to me that doing something like (set/union (keys a) (keys b)) is an error. But then again, I guess you could argue the same for (set/union #{:a} nil).

However, I'm inclined to agree that there's bound to be a somewhat significant amount of code that relies on set functions being able to accept nil args and that it might be better to allow them.

Here's another consideration regarding clojure.set: if you clone the Day of Datomic repo, enable instrumentation with the specs I wrote and evaluate the namespace, you get this error (with Expound):

https://gist.github.com/eerohele/7689259057ed976aba26e4fd1a83792c

In other words, somewhere in Datomic's code base there's a call to set/map-invert that passes in something other than a clojure.lang.IPersistentMap. I can't be sure whether that's an error in the calling code or something that the specs should account for. Perhaps map? is not the appropriate predicate here? associative? won't work, either, but is there something else we should rather use?

@borkdude
Copy link
Owner

borkdude commented Nov 2, 2018

My thoughts:

  • Let's assume data.diff's call to union meant to pass sets instead of seqs, until we have enough evidence it's not the case.
  • The init argument for reduce in map-invert is a map, so the result is supposed to be a map. We can assume that the input also is something map-like, but map? may not cut it for Datomic. Was the input an EntityMap maybe? Can you verify the type?

@eerohele
Copy link
Contributor Author

eerohele commented Nov 2, 2018

Let's assume data.diff's call to union meant to pass sets instead of seqs, until we have enough evidence it's not the case.

Sounds reasonable to me.

Can you verify the type?

I could look into it, but I don't currently know how to do that. It might be possible to create a custom spec error printer that prints the type, but I'm not sure.

@borkdude
Copy link
Owner

borkdude commented Nov 2, 2018

I could look into it, but I don't currently know how to do that.

You can write a different predicate than map? which prints the type as a side effect.

@eerohele
Copy link
Contributor Author

eerohele commented Nov 2, 2018

That's a good point. I'll do it when I get the chance.

@eerohele
Copy link
Contributor Author

eerohele commented Nov 2, 2018

It's a java.util.HashMap.

I didn't know map-invert works on HashMap instances, but it makes sense, I guess:

(set/map-invert (java.util.HashMap. {:foo :bar}))
;=> {:bar :foo}

So maybe we could do something like this?

(defn hash-map?
  [x]
  (instance? java.util.HashMap x))

(spec/def ::map-like (some-fn map? hash-map?))

Then use ::map-like(or ideally come up with a better name) for the set specs?

@borkdude
Copy link
Owner

borkdude commented Nov 2, 2018

map-invert works by handling the input as a reducible coll of pairs that can be destructured on the 0-th and 1-th index. So any coll that has pairs that can be called with (nth pair 0) and (nth pair 1) will do. nth also handles java.util.Map$Entry.

Given that

(instance? java.util.Map$Entry (first (doto (java.util.HashMap.) (.put 1 2))))

explains why map-invert also works for java.util.HashMap.

So a special case for HashMap is probably not enough.

This also works:

(clojure.set/map-invert [[:a 1] [:b 2]])
{1 :a, 2 :b}

@borkdude
Copy link
Owner

borkdude commented Nov 2, 2018

@eerohele One more thought. I think we can assume for now that a reducible collection of MapEntry is the only valid input for map-invert.

I just pushed a spec for map-entry: https://github.com/slipset/speculative/blob/master/src/speculative/core.cljc#L70

Here is the spec for a reducible coll:
https://github.com/slipset/speculative/blob/master/src/speculative/core.cljc#L120

You can combine those two to spec the input for map-invert.

@jafingerhut
Copy link

How far down Hyrum's law do people want to go here? I'm guessing "any instances of Java objects that can be passed to function X as arguments, that returns a Java object that satisfies the properties of an object that we think X ought to return" could get pretty convoluted, and must in general be computationally undecidable.

@borkdude
Copy link
Owner

borkdude commented Nov 3, 2018

@jafingerhut Can you be more specific? Is a reducible of map-entry too Hyrum for you?

@jafingerhut
Copy link

When Clojure 1.9 came out, some ns forms that were permitted before became illegal, because of specs. Not many, but a few.

It seems plausible to ask whether some of the calls in Datomic and/or clojure.data/diff should similarly become spec errors for functions in clojure.set. Sure, given their authors, one should question carefully whether they become errors or are allowed, but it seems worth asking.

@borkdude
Copy link
Owner

borkdude commented Nov 3, 2018

@jafingerhut Posted some questions in #46.

For map-invert a ::speculative/reducible-coll of ::speculative/map-entry still might make most sense, given this example (by @mfikes):

(set/map-invert (filter (comp even? key) {1 7 2 9 3 4 4 12}))
;; => {9 2, 12 4}

@borkdude
Copy link
Owner

borkdude commented Nov 16, 2018

@eerohele
Copy link
Contributor Author

eerohele commented Nov 17, 2018

Some additional findings regarding java.util.Map args in clojure.set functions that take map args:

  • set/project allows them
  • set/rename-keys and set/rename allow them in the second arg but throw a ClassCastException in the first arg
  • set/index allows them
  • set/join throws a ClassCastException in the first arg but allows them in the second and third args

@borkdude
Copy link
Owner

borkdude commented Nov 17, 2018

@eerohele

  • set/project leads via clojure.core/select-keys to clojure.core/find which uses the java Map interface. I think we do need to support a java map in find, select-keys.
  • set/rename-keys: it seems unlikely to me that anyone would want to pass a kmap arg that's not a Clojure map
  • set/index: same as set/project
  • set/join: seems unlikely to me that anyone would want to pass km that's not a Clojure map. Further, same as set/project.

@slipset
Copy link
Collaborator

slipset commented Dec 1, 2018

merged

@slipset slipset closed this as completed Dec 1, 2018
@borkdude
Copy link
Owner

borkdude commented Feb 21, 2019

@eerohele

Some additional findings regarding java.util.Map args in clojure.set functions that take map args:

  • set/join throws a ClassCastException in the first arg but allows them in the second and third args

Using generative testing I found some cases where set/join throws with java maps:

(set/join [{:a 1} {:a 1}] [(java.util.HashMap. {:b 2}) (java.util.HashMap. {:b 2})]) ;; works
(set/join [{:a 1} {:a 1}] [(java.util.HashMap. {:b 2})]) ;; fails
(set/join [{:a 1} {:a 1}] [{:b 2}]) ;; works

so in general set/join doesn't work well with java.util.Map. Therefore I excluded them from the set/join spec for now (in a soon to be merged PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants