Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange ds/head and ds/tail behaviour in v7 #343

Closed
alex314159 opened this issue Jan 6, 2023 · 6 comments
Closed

Strange ds/head and ds/tail behaviour in v7 #343

alex314159 opened this issue Jan 6, 2023 · 6 comments

Comments

@alex314159
Copy link

Hi Chris,

some weird behaviour with these functions in v7, see below:

(def x (ds/->dataset (repeat 1000 {:a 1 :b 2})))
=> #'jasmine.var/x

(ds/select-rows (ds/select-rows x (range 10)) [0 1]) ;;expected behaviour
=> _unnamed [2 2]:

| :a | :b |
|---:|---:|
|  1 |  2 |
|  1 |  2 |

(ds/select-rows (ds/tail x 5) [0 1]) ;;unexpected
=> _unnamed [2 2]:

| :a | :b |
|---:|---:|
|  1 |  2 |
|  1 |  2 |
|  1 |  2 |
|  1 |  2 |
|  1 |  2 |

(ds/select-rows (ds/head x 5) [0 1]) ;;unexpected
=> _unnamed [2 2]:

| :a | :b |
|---:|---:|
|  1 |  2 |
|  1 |  2 |
|  1 |  2 |
|  1 |  2 |
|  1 |  2 |

Also note the docstring for ds/head and ds/tail is wrong as it says "Get the first n row of a dataset. Equivalent to `(select-rows ds (range n)). Arguments are reversed, however, so this can be used in ->> operators." (the arguments are not reversed, and it's better that way IMHO!)

Thanks,

@alex314159
Copy link
Author

And maybe a better example of the issue:

(def x (ds/->dataset {:a (range 1000) :b (range 1000)}))
=> #'jasmine.var/x
(ds/select-rows (ds/tail x 5) [0 1])
Error printing return value (IndexOutOfBoundsException) at ham_fisted.ChunkedList/sublistCheck (ChunkedList.java:216).
End index out of range: end-index(5), n-elems(2)

@cnuernber
Copy link
Collaborator

That is the error I get - index out of bounds - it was because the head/tail operations put print-index-range metadata on the returned dataset and the select-rows operation didn't clear the metadata (which in general it should not). So when you go to print the third, 2 row dataset it has print-index-range of (range 5) from the head operation :-).

@cnuernber
Copy link
Collaborator

The fix is really two-fold. First, select-rows should clear specifically print-index-range. Second, printing itself should never cause that exception so it needs to always filter print-index-range to be in the range of the current dataset row indexes.

One thing you may really like - select-rows now accepts negative indexes which will index from the end python-style.

@cnuernber
Copy link
Collaborator

New release is up :-)

@alex314159
Copy link
Author

thanks!!

@harold
Copy link
Contributor

harold commented Jan 6, 2023

Wow! Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants