Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tag retrieval: Inclusive and Exclusive stop rule #495

Merged
merged 17 commits into from
Dec 12, 2020

Conversation

achilleas-k
Copy link
Member

@achilleas-k achilleas-k commented Dec 10, 2020

This PR implements the slicing rules described in https://github.com/G-Node/nix/blob/master/docs/slicing_and_indices.md.
Slicing is by default excludes the right endpoint now. The endpoint can be included by specifying stop_rule=nix.SliceMode.Inclusive in the tagged_data() method of a Tag or MultiTag.

To clarify, the new Exclusive rules work as follows:

  • The first element in the DataView returned by tagged_data is the value at the last index of each dimension that is less than or equal to the Tag's position.
  • The last element in the DataView returned by tagged_data is the value at the last index of each dimension that is less than the Tag's position+extent.

Using Inclusive changes the second behaviour of the last element to be the same as for the first element:

  • The last element in the DataView returned by tagged_data is the value at the last index of each dimension that is less than or equal to the Tag's position+extent.

Note that when the value of position+extent falls between two dimension ticks, the behaviour is the same for both cases. For example, if a dimension has integer ticks [0, 1, 2, 3, 4...] and position + extent = 2.2, then the last index of the DataView will always be 2.

This is covered in the tests and described by diagrams in comments for future reference (because it always takes me a while to figure out what I'm doing and this will definitely help going forward).

The PR also includes a bit of a rewrite of the code for calculating the slices which simplifies and merges the methods for Tag and MultiTag.

Closes #484.
Closes #494.

Will be used to specify the slice mode when retrieving data through
tags:
- Inclusive: Include the end point of the range if it's == pos+ext.
- Exclusive: Do not include the end point of the range.
The tagged_data() method allows defining the SliceMode which affects how
the right (stop) endpoint behaves with the 'stop_rule' argument.  This
affects the value that's returned when the final position (pos+ext)
matches exactly a tick on the corresponding dimension:
If Exclusive, the previous position is returned.
If Inclusive, the matching position is returned.

The index_of() methods of the Dimensions take a boolean `leq` which
defines whether to return the exact match when found (less or equal), or
the previous (strictly less).
When sl=0 it was treated as None which returns the full data instead of
just the first element.
Improving clarity and readability.
Bit of a rewrite of the function to make it simpler and more readable.
@codecov-io
Copy link

Codecov Report

Merging #495 (d355871) into master (089ad5d) will increase coverage by 0.56%.
The diff coverage is 99.20%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #495      +/-   ##
==========================================
+ Coverage   77.84%   78.41%   +0.56%     
==========================================
  Files          57       57              
  Lines        8762     8991     +229     
==========================================
+ Hits         6821     7050     +229     
  Misses       1941     1941              
Impacted Files Coverage Δ
nixio/multi_tag.py 78.98% <89.47%> (-3.44%) ⬇️
nixio/tag.py 85.58% <97.14%> (-0.43%) ⬇️
nixio/__init__.py 100.00% <100.00%> (ø)
nixio/data_view.py 85.18% <100.00%> (ø)
nixio/dimensions.py 93.20% <100.00%> (+0.08%) ⬆️
nixio/test/test_multi_tag.py 100.00% <100.00%> (ø)
nixio/test/test_tag.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 089ad5d...d355871. Read the comment docs.

nixio/dimensions.py Outdated Show resolved Hide resolved
nixio/tag.py Outdated Show resolved Hide resolved
nixio/tag.py Outdated
stop = start + 1
elif idx < len(extent):
ext = extent[idx]
stop = self._pos_to_idx(pos+ext, units[idx], dim, stop_rule) + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the +1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stop element of a slice isn't inclusive ever. Slice(0, 30) is equivalent to [0:30] which returns 30 elements, the last one being index 29. _pos_to_idx() on the pos+ext value returns the last_index, the index of the last value that should be included in the return slice, so the slice stop value should be last_index + 1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, thanks

nixio/dimensions.py Outdated Show resolved Hide resolved
@jgrewe
Copy link
Member

jgrewe commented Dec 11, 2020

I agree with the last position handling, but I am not sure about the start position handling. As I see it, we can only define whether it is less or less or equal. For numpy like slicing we should have greater or equal for the start position. Or am I mistaken?

@achilleas-k
Copy link
Member Author

I agree with the last position handling, but I am not sure about the start position handling. As I see it, we can only define whether it is less or less or equal. For numpy like slicing we should have greater or equal for the start position. Or am I mistaken?

You're not wrong. I read the document a few times and still did it wrong. I made the start position follow leq rules as well.

One test for each dimension type that uses simple dimension ticks or
intervals (no scaling, integer positions) and runs through a whole range
of (pos, ext) combinations.

Adds offsets (0.1 and 0.9) to position and extent to test rounding up
(for start) and rounding down (for stop).
Same functionality as Tag.
Rewrote _calc_data_slices the same way we did with
Tag._calc_data_slices.
_calc_data_slices() moved to BaseTag.  Tag calls it directly.  MultiTag
extracts the position and extent from its array at the given index and
calls the same method.
The index_of() methods now take an optional mode argument (of enum type
IndexMode) that defines how the position is rounded (leq, geq) and if
the previous position should be returned if an exact match is found
(less).

The SetDimension now also implements index_of() to apply these rules.

The tagged_data() method still only accepts either Inlcusive or
Exclusive SliceModes that define which IndexMode is used for the stop
rule (the right endpoint). The start rule (left endpoint) always follows
the geq rule.
Adapted from new Tag tests.
30 was a bit too much and slowed down tests.
10 should be enough.
Testing index modes and out of bounds conditions for errors.
@achilleas-k
Copy link
Member Author

Added an index_of() method for SetDimension too. All the index_of() methods take an IndexMode argument now which can be Less, LEQ, or GEQ, and the method does rounding and bounds checking accordingly.

Added tests to the dimensions for all of these.

Copy link
Member

@jgrewe jgrewe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, thank you!

@jgrewe jgrewe merged commit d4c2dfd into G-Node:master Dec 12, 2020
@achilleas-k achilleas-k deleted the tag-retrieval branch December 13, 2020 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New Tag data retrieval behaviour (inclusive vs exclusive) (Multi)Tag extent clarification
3 participants