Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sel (and $) return a list most of the times, but sometimes a single element #368

Open
jaromil opened this issue Jun 27, 2017 · 6 comments
Open

Comments

@jaromil
Copy link

jaromil commented Jun 27, 2017

If a dataset selection returns only a single element the type of the result changes and breaks most functions accepting it, generating a confusing error that may be hard to guess.
By wrapping the returned type of sel with a coll? check it is possible to convert it into a single element list so that in any case the calling function can assume that the returned value will always be a list.
Example:

(defn wrap
  "wrap a single element into a collection, safety measure for dataset
  operations on a columns that return a single element"
  [ele] (if (coll? ele) ele (list ele)))

Can we integrate this into the sel function?

@mikera
Copy link
Contributor

mikera commented Jun 27, 2017

Sounds like there is a bug here to be fixed but we should think carefully about how to fix it, i.e. exactly what dimensionality the inputs and outputs of the problematic functions expect. 0D, 1D, 2D, ND etc..

Can you share the exact code you were having problems with?

@jaromil
Copy link
Author

jaromil commented Jun 28, 2017

hi there and congrats for Vectorz :^) We need to verify if this is only related to the old incanter (1.5.7 "stable") or also the new one. Simple way to replicate the problem:

(type ($ :1 (to-dataset [{:1 1 :2 2}])))
; java.lang.Long
(type ($ :1 (to-dataset [{:1 1 :2 2} {:1 3 :2 4}])))
; clojure.lang.LazySeq

I'm suggesting that sel shall return always a LazySeq. If confirmed I'm happy to make a PR.
EDIT: just tested and this bug occurs also on the latest develop branch.

@jaromil
Copy link
Author

jaromil commented Aug 30, 2017

Any ideas on how to proceed on this? It is a corner-case for calls to sel returning only one element, the main issue being developer's experience. Currently the sel documentation is rather ambiguous:

  If the column or row is specified as an atomic object (index or name), then
  the result will be returned as a list (only values from selected column or row).

@mikera
Copy link
Contributor

mikera commented Aug 30, 2017

I think it should be consistent for any length of column.... i.e. any selection of a column should always return a vector, even if it only has one element. I think this is worth a breaking change, it is incredibly annoying to have to write special case code for stuff like this.

It should be possible to use core.matrix's slice or get-column to do this.

@jaromil
Copy link
Author

jaromil commented Aug 30, 2017

My assessment is that this won't be a breaking change if the solutions already adopted by developers affected is similar to the wrap function I published above. I am a bit confused by the complexity of the sel multi-method in incanter-core... can you tell if this behavior is coming from core.matrix/select itself? from its API documentation it seems so https://mikera.github.io/core.matrix/doc/clojure.core.matrix.html#var-select as the first example is not a vector.

@mikera
Copy link
Contributor

mikera commented Aug 30, 2017

I recall some of this was refactored in the work a couple of years ago during the project to port Incanter to use core.matrix, but probably hasn't been fully cleaned up. In most cases, Incanter functions should be fairly lightweight wrappers over core.matrix operations, perhaps adding some extra optional arguments and functionality.

From the first clojure.core.matrix/select example:

(select [[1 2][3 4]] 0 0) ;=> 1

It is important to note that this is actually selecting on two dimensions! The first 0 selects the first row, the second 0 selects the first element in that row, so the final dimensionality is 0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants