Tag retrieval: Inclusive and Exclusive stop rule #495

achilleas-k · 2020-12-10T14:27:40Z

This PR implements the slicing rules described in https://github.com/G-Node/nix/blob/master/docs/slicing_and_indices.md.
Slicing is by default excludes the right endpoint now. The endpoint can be included by specifying stop_rule=nix.SliceMode.Inclusive in the tagged_data() method of a Tag or MultiTag.

To clarify, the new Exclusive rules work as follows:

The first element in the DataView returned by tagged_data is the value at the last index of each dimension that is less than or equal to the Tag's position.
The last element in the DataView returned by tagged_data is the value at the last index of each dimension that is less than the Tag's position+extent.

Using Inclusive changes the second behaviour of the last element to be the same as for the first element:

The last element in the DataView returned by tagged_data is the value at the last index of each dimension that is less than or equal to the Tag's position+extent.

Note that when the value of position+extent falls between two dimension ticks, the behaviour is the same for both cases. For example, if a dimension has integer ticks [0, 1, 2, 3, 4...] and position + extent = 2.2, then the last index of the DataView will always be 2.

This is covered in the tests and described by diagrams in comments for future reference (because it always takes me a while to figure out what I'm doing and this will definitely help going forward).

The PR also includes a bit of a rewrite of the code for calculating the slices which simplifies and merges the methods for Tag and MultiTag.

Closes #484.
Closes #494.

Will be used to specify the slice mode when retrieving data through tags: - Inclusive: Include the end point of the range if it's == pos+ext. - Exclusive: Do not include the end point of the range.

The tagged_data() method allows defining the SliceMode which affects how the right (stop) endpoint behaves with the 'stop_rule' argument. This affects the value that's returned when the final position (pos+ext) matches exactly a tick on the corresponding dimension: If Exclusive, the previous position is returned. If Inclusive, the matching position is returned. The index_of() methods of the Dimensions take a boolean `leq` which defines whether to return the exact match when found (less or equal), or the previous (strictly less).

When sl=0 it was treated as None which returns the full data instead of just the first element.

Improving clarity and readability.

Bit of a rewrite of the function to make it simpler and more readable.

codecov-io · 2020-12-10T22:17:58Z

Codecov Report

Merging #495 (d355871) into master (089ad5d) will increase coverage by 0.56%.
The diff coverage is 99.20%.

@@            Coverage Diff             @@
##           master     #495      +/-   ##
==========================================
+ Coverage   77.84%   78.41%   +0.56%     
==========================================
  Files          57       57              
  Lines        8762     8991     +229     
==========================================
+ Hits         6821     7050     +229     
  Misses       1941     1941

Impacted Files	Coverage Δ
nixio/multi_tag.py	`78.98% <89.47%> (-3.44%)`	⬇️
nixio/tag.py	`85.58% <97.14%> (-0.43%)`	⬇️
nixio/__init__.py	`100.00% <100.00%> (ø)`
nixio/data_view.py	`85.18% <100.00%> (ø)`
nixio/dimensions.py	`93.20% <100.00%> (+0.08%)`	⬆️
nixio/test/test_multi_tag.py	`100.00% <100.00%> (ø)`
nixio/test/test_tag.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 089ad5d...d355871. Read the comment docs.

nixio/dimensions.py

nixio/tag.py

jgrewe · 2020-12-11T09:37:22Z

nixio/tag.py

+                stop = start + 1
+            elif idx < len(extent):
+                ext = extent[idx]
+                stop = self._pos_to_idx(pos+ext, units[idx], dim, stop_rule) + 1


why the +1?

The stop element of a slice isn't inclusive ever. Slice(0, 30) is equivalent to [0:30] which returns 30 elements, the last one being index 29. _pos_to_idx() on the pos+ext value returns the last_index, the index of the last value that should be included in the return slice, so the slice stop value should be last_index + 1.

got it, thanks

nixio/dimensions.py

jgrewe · 2020-12-11T09:43:46Z

I agree with the last position handling, but I am not sure about the start position handling. As I see it, we can only define whether it is less or less or equal. For numpy like slicing we should have greater or equal for the start position. Or am I mistaken?

achilleas-k · 2020-12-11T13:59:34Z

I agree with the last position handling, but I am not sure about the start position handling. As I see it, we can only define whether it is less or less or equal. For numpy like slicing we should have greater or equal for the start position. Or am I mistaken?

You're not wrong. I read the document a few times and still did it wrong. I made the start position follow leq rules as well.

One test for each dimension type that uses simple dimension ticks or intervals (no scaling, integer positions) and runs through a whole range of (pos, ext) combinations. Adds offsets (0.1 and 0.9) to position and extent to test rounding up (for start) and rounding down (for stop).

Same functionality as Tag. Rewrote _calc_data_slices the same way we did with Tag._calc_data_slices.

_calc_data_slices() moved to BaseTag. Tag calls it directly. MultiTag extracts the position and extent from its array at the given index and calls the same method.

The index_of() methods now take an optional mode argument (of enum type IndexMode) that defines how the position is rounded (leq, geq) and if the previous position should be returned if an exact match is found (less). The SetDimension now also implements index_of() to apply these rules. The tagged_data() method still only accepts either Inlcusive or Exclusive SliceModes that define which IndexMode is used for the stop rule (the right endpoint). The start rule (left endpoint) always follows the geq rule.

Adapted from new Tag tests.

30 was a bit too much and slowed down tests. 10 should be enough.

Testing index modes and out of bounds conditions for errors.

achilleas-k · 2020-12-11T17:25:55Z

Added an index_of() method for SetDimension too. All the index_of() methods take an IndexMode argument now which can be Less, LEQ, or GEQ, and the method does rounding and bounds checking accordingly.

Added tests to the dimensions for all of these.

jgrewe

nice, thank you!

achilleas-k force-pushed the tag-retrieval branch from 1110bef to 395a553 Compare December 10, 2020 15:42

achilleas-k added 9 commits December 10, 2020 18:10

New enum: SliceMode

e55cb14

Will be used to specify the slice mode when retrieving data through tags: - Inclusive: Include the end point of the range if it's == pos+ext. - Exclusive: Do not include the end point of the range.

[TestTag] Test new SliceMode

85a3b32

[TestTag] Adjust older tests to new behaviour

9909430

[DataView] Fix data read bug for sl=0

ef09e8f

When sl=0 it was treated as None which returns the full data instead of just the first element.

[Tag] _calc_data_slice: Refactor and comments

a4ac5e6

Improving clarity and readability.

[Tag] _calc_data_slices: General stop fix

490eaf3

[Tag] Fixed and clearer exception message

a3aee75

[Tag] Simplify _calc_data_slices

dcaf49a

Bit of a rewrite of the function to make it simpler and more readable.

achilleas-k force-pushed the tag-retrieval branch from 395a553 to d355871 Compare December 10, 2020 17:11

jgrewe reviewed Dec 11, 2020

View reviewed changes

achilleas-k added 8 commits December 11, 2020 18:21

[MultiTag] Support stop_rule

8888836

Same functionality as Tag. Rewrote _calc_data_slices the same way we did with Tag._calc_data_slices.

Unify _calc_data_slices for Tag and MultiTag

ce1bfd2

_calc_data_slices() moved to BaseTag. Tag calls it directly. MultiTag extracts the position and extent from its array at the given index and calls the same method.

[TestMultiTag] Adjust older tests to new behaviour

a070945

[TestMultiTag] Tests for new SliceMode

1ff41b5

Adapted from new Tag tests.

[TestTag,TestMultiTag] Reduce signal/tick count

f458997

30 was a bit too much and slowed down tests. 10 should be enough.

[TestDimensions] Test all modes and OOB conditions

ab70aa5

Testing index modes and out of bounds conditions for errors.

achilleas-k force-pushed the tag-retrieval branch from d355871 to ab70aa5 Compare December 11, 2020 17:22

jgrewe approved these changes Dec 12, 2020

View reviewed changes

jgrewe merged commit d4c2dfd into G-Node:master Dec 12, 2020

achilleas-k deleted the tag-retrieval branch December 13, 2020 23:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tag retrieval: Inclusive and Exclusive stop rule #495

Tag retrieval: Inclusive and Exclusive stop rule #495

achilleas-k commented Dec 10, 2020 •

edited

Loading

codecov-io commented Dec 10, 2020

jgrewe Dec 11, 2020

achilleas-k Dec 11, 2020

jgrewe Dec 11, 2020

jgrewe commented Dec 11, 2020

achilleas-k commented Dec 11, 2020

achilleas-k commented Dec 11, 2020

jgrewe left a comment

Tag retrieval: Inclusive and Exclusive stop rule #495

Tag retrieval: Inclusive and Exclusive stop rule #495

Conversation

achilleas-k commented Dec 10, 2020 • edited Loading

codecov-io commented Dec 10, 2020

Codecov Report

jgrewe Dec 11, 2020

Choose a reason for hiding this comment

achilleas-k Dec 11, 2020

Choose a reason for hiding this comment

jgrewe Dec 11, 2020

Choose a reason for hiding this comment

jgrewe commented Dec 11, 2020

achilleas-k commented Dec 11, 2020

achilleas-k commented Dec 11, 2020

jgrewe left a comment

Choose a reason for hiding this comment

achilleas-k commented Dec 10, 2020 •

edited

Loading