-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tag retrieval: Inclusive and Exclusive stop rule #495
Conversation
1110bef
to
395a553
Compare
Will be used to specify the slice mode when retrieving data through tags: - Inclusive: Include the end point of the range if it's == pos+ext. - Exclusive: Do not include the end point of the range.
The tagged_data() method allows defining the SliceMode which affects how the right (stop) endpoint behaves with the 'stop_rule' argument. This affects the value that's returned when the final position (pos+ext) matches exactly a tick on the corresponding dimension: If Exclusive, the previous position is returned. If Inclusive, the matching position is returned. The index_of() methods of the Dimensions take a boolean `leq` which defines whether to return the exact match when found (less or equal), or the previous (strictly less).
When sl=0 it was treated as None which returns the full data instead of just the first element.
Improving clarity and readability.
Bit of a rewrite of the function to make it simpler and more readable.
395a553
to
d355871
Compare
Codecov Report
@@ Coverage Diff @@
## master #495 +/- ##
==========================================
+ Coverage 77.84% 78.41% +0.56%
==========================================
Files 57 57
Lines 8762 8991 +229
==========================================
+ Hits 6821 7050 +229
Misses 1941 1941
Continue to review full report at Codecov.
|
nixio/tag.py
Outdated
stop = start + 1 | ||
elif idx < len(extent): | ||
ext = extent[idx] | ||
stop = self._pos_to_idx(pos+ext, units[idx], dim, stop_rule) + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the +1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stop
element of a slice isn't inclusive ever. Slice(0, 30)
is equivalent to [0:30]
which returns 30 elements, the last one being index 29. _pos_to_idx()
on the pos+ext
value returns the last_index
, the index of the last value that should be included in the return slice, so the slice stop value should be last_index + 1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, thanks
I agree with the last position handling, but I am not sure about the start position handling. As I see it, we can only define whether it is less or less or equal. For numpy like slicing we should have greater or equal for the start position. Or am I mistaken? |
You're not wrong. I read the document a few times and still did it wrong. I made the start position follow |
One test for each dimension type that uses simple dimension ticks or intervals (no scaling, integer positions) and runs through a whole range of (pos, ext) combinations. Adds offsets (0.1 and 0.9) to position and extent to test rounding up (for start) and rounding down (for stop).
Same functionality as Tag. Rewrote _calc_data_slices the same way we did with Tag._calc_data_slices.
_calc_data_slices() moved to BaseTag. Tag calls it directly. MultiTag extracts the position and extent from its array at the given index and calls the same method.
The index_of() methods now take an optional mode argument (of enum type IndexMode) that defines how the position is rounded (leq, geq) and if the previous position should be returned if an exact match is found (less). The SetDimension now also implements index_of() to apply these rules. The tagged_data() method still only accepts either Inlcusive or Exclusive SliceModes that define which IndexMode is used for the stop rule (the right endpoint). The start rule (left endpoint) always follows the geq rule.
Adapted from new Tag tests.
30 was a bit too much and slowed down tests. 10 should be enough.
Testing index modes and out of bounds conditions for errors.
d355871
to
ab70aa5
Compare
Added an Added tests to the dimensions for all of these. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, thank you!
This PR implements the slicing rules described in https://github.com/G-Node/nix/blob/master/docs/slicing_and_indices.md.
Slicing is by default excludes the right endpoint now. The endpoint can be included by specifying
stop_rule=nix.SliceMode.Inclusive
in thetagged_data()
method of a Tag or MultiTag.To clarify, the new Exclusive rules work as follows:
tagged_data
is the value at the last index of each dimension that is less than or equal to the Tag'sposition
.tagged_data
is the value at the last index of each dimension that is less than the Tag'sposition+extent
.Using Inclusive changes the second behaviour of the last element to be the same as for the first element:
tagged_data
is the value at the last index of each dimension that is less than or equal to the Tag'sposition+extent
.Note that when the value of
position+extent
falls between two dimension ticks, the behaviour is the same for both cases. For example, if a dimension has integer ticks[0, 1, 2, 3, 4...]
andposition + extent = 2.2
, then the last index of the DataView will always be2
.This is covered in the tests and described by diagrams in comments for future reference (because it always takes me a while to figure out what I'm doing and this will definitely help going forward).
The PR also includes a bit of a rewrite of the code for calculating the slices which simplifies and merges the methods for Tag and MultiTag.
Closes #484.
Closes #494.