Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reader: treating empty string range as expected. #2883

Merged
merged 2 commits into from Feb 23, 2022

Conversation

KiterLuc
Copy link
Contributor

In a lot of places in the code, an empry string range means default.
This leads to a lot of confusion as the user can also pass in empty
string ranges and an empty string is a valid coordinate. This change
removes the assumption that empty strings are default in a few
functions in the dimension class and relies only on std::string_view
compare. They are:

  • Dimension::covered
  • Dimension::overlap
  • Dimension::overlap_ratio
  • Dimension::relevant_ranges
  • Dimension::covered_vec

To fix the default range behavior for strings, two functions needed to
be adjusted at the subarray level: compute_relevant_fragments and
compute_relevant_fragment_tile_overlap. For the first one, the fix was
trivial, simply default the relevant fragment bitmap to 1 for default
ranges and skip the computation. For the tile overlap however, it meant
that the is_default values need to be passed in through the rtree to
the domain as the overlap ratio computation for all dimensions is done
at that level.


TYPE: IMPROVEMENT
DESC: Reader: treating empty string range as expected.

In a lot of places in the code, an empry string range means default.
This leads to a lot of confusion as the user can also pass in empty
string ranges and an empty string is a valid coordinate. This change
removes the assumption that empty strings are default in a few
functions in the dimension class and relies only on std::string_view
compare. They are:
- Dimension::covered<char>
- Dimension::overlap<char>
- Dimension::overlap_ratio<char>
- Dimension::relevant_ranges<char>
- Dimension::covered_vec<char>

To fix the default range behavior for strings, two functions needed to
be adjusted at the subarray level: compute_relevant_fragments and
compute_relevant_fragment_tile_overlap. For the first one, the fix was
trivial, simply default the relevant fragment bitmap to 1 for default
ranges and skip the computation. For the tile overlap however, it meant
that the is_default values need to be passed in through the rtree to
the domain as the overlap ratio computation for all dimensions is done
at that level.

---
TYPE: IMPROVEMENT
DESC: Reader: treating empty string range as expected.
@shortcut-integration
Copy link

This pull request has been linked to Shortcut Story #14611: Unordered reader missing data.

Comment on lines 834 to 835
const auto r1_after_r2 = r1_start > mbr_end;
const auto mbr_after_r1 = mbr_start > r1_end;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(seems prior to your changes)
curious naming, r1, r2, when there is essentially only 'r', and no r2 anywhere in routine I can see...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, removed.

Copy link
Contributor

@dhoke4tdb dhoke4tdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some questions/observations that came to mind while looking at the changes.

Comment on lines +671 to +672
if (r1_default[d])
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appears this will now exclude a 'default' range from consideration of any overlap and in calculating ratio,
is that desireable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in this case, if it's default, it means that we include everything in this dimension, which means overlap ration should be 1, which would leave ratio unaltered.

Comment on lines +501 to +502
std::vector<bool> is_default(subarray.size(), false);
auto tile_overlap = rtree_.get_tile_overlap(subarray, is_default);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there certainty here that the NDRange& subarray is entirely NON-default, and/or does it not matter?

Copy link
Contributor Author

@KiterLuc KiterLuc Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is certainty. This all gets called from Array::get_max_buffer_size, and there is a check in there to not allow var size dimensions. For fixed size dimensions, even if the dimension might be default, there is still values in there representing the non empty domain so we are good.

@Shelnutt2 Shelnutt2 merged commit 14f1c14 into dev Feb 23, 2022
@Shelnutt2 Shelnutt2 deleted the lr/reader-empty-range-fix/ch14611 branch February 23, 2022 15:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants