Optimize Subarray::compute_relevant_fragments #2216

Merged
joe-maley merged 1 commit into dev from jpm/opt-compute_relevant_fragments
Apr 20, 2021

joe-maley
Contributor

This patch refactors `Subarray::compute_relevant_fragments` to reduce its time
complexity, at the cost of occasionally marking fragments as relevant when they
are not (false positives). The tile overlap computation remains correct because
it finds no overlap within falsely relevant fragments.

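Currently, this routine has a time complexity of:
`<# dimensions>[product]<# dimension ranges> * <# fragments>`
That is, the product over all dimensions of the number of ranges on each
dimension, multiplied by the number of fragments.

This patch changes the time complexity to:
`<# dimensions>[summation]<# dimension ranges> * <# fragments>`
That is, the sum over all dimensions of the number of ranges on each dimension,
multiplied by the number of fragments.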

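To make the two expressions above concrete, here is a minimal sketch that counts
the overlap checks under each scheme. The helper names `checks_flattened` and
`checks_per_dimension` are hypothetical and only illustrate the arithmetic; they
are not part of TileDB.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: overlap checks needed when every flattened
// (cross-product) range is tested against every fragment.
std::uint64_t checks_flattened(
    const std::vector<std::uint64_t>& ranges_per_dim,
    std::uint64_t num_fragments) {
  assert(!ranges_per_dim.empty());
  std::uint64_t product = 1;
  for (std::uint64_t n : ranges_per_dim)
    product *= n;
  return product * num_fragments;
}

// Hypothetical helper: overlap checks needed when each dimension's ranges
// are tested independently against every fragment.
std::uint64_t checks_per_dimension(
    const std::vector<std::uint64_t>& ranges_per_dim,
    std::uint64_t num_fragments) {
  assert(!ranges_per_dim.empty());
  std::uint64_t sum = 0;
  for (std::uint64_t n : ranges_per_dim)
    sum += n;
  return sum * num_fragments;
}
```

For a subarray with 100 ranges on each of 3 dimensions and 10 fragments, the
first expression gives 10,000,000 checks and the second gives 3,000.
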
Currently, a single relevant-fragment bytemap is computed by checking each
fragment's ND-range for overlap with the ND-range of every flattened range id.

This patch maintains a separate relevant-fragment bytemap for each dimension,
where each bytemap is computed by checking each fragment's ND-range for overlap
against each of the subarray's ranges on that dimension. The final
relevant-fragment bytemap is the logical AND of the per-dimension bytemaps.
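
The per-dimension approach described above can be sketched roughly as follows.
This is an illustration only, not the code in this patch: the `Range` and
`NDRange` aliases, the inclusive integer ranges, and the function
`relevant_fragments` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Assumption: a 1D range is an inclusive [low, high] pair of integers.
using Range = std::pair<std::int64_t, std::int64_t>;
// Assumption: an ND-range holds one 1D range per dimension.
using NDRange = std::vector<Range>;

// True when two inclusive 1D ranges share at least one point.
bool overlaps(const Range& a, const Range& b) {
  return a.first <= b.second && b.first <= a.second;
}

// Hypothetical sketch. `subarray_ranges[d]` lists the ranges requested on
// dimension d; `fragments[f]` is the ND-range covered by fragment f.
std::vector<std::uint8_t> relevant_fragments(
    const std::vector<std::vector<Range>>& subarray_ranges,
    const std::vector<NDRange>& fragments) {
  const std::size_t dim_num = subarray_ranges.size();
  // Start with every fragment marked relevant, then AND in each dimension.
  std::vector<std::uint8_t> relevant(fragments.size(), 1);
  for (std::size_t d = 0; d < dim_num; ++d) {
    // Bytemap for dimension d: 1 if the fragment overlaps any range on d.
    std::vector<std::uint8_t> dim_bytemap(fragments.size(), 0);
    for (std::size_t f = 0; f < fragments.size(); ++f) {
      assert(fragments[f].size() == dim_num);
      for (const Range& r : subarray_ranges[d]) {
        if (overlaps(fragments[f][d], r)) {
          dim_bytemap[f] = 1;
          break;
        }
      }
    }
    // Logical AND of this dimension's bytemap into the final bytemap.
    for (std::size_t f = 0; f < fragments.size(); ++f)
      relevant[f] = static_cast<std::uint8_t>(relevant[f] & dim_bytemap[f]);
  }
  return relevant;
}
```

Each fragment is compared against every range of every dimension at most once,
which is where the summation in the complexity expression comes from.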

The important consideration is that the input to this routine is a start/end
index pair on the flattened range ids. To reduce risk, this change does not
modify the interface of `Subarray::compute_relevant_fragments`. To compute in
ND-space instead of flattened space, we must convert the input start/end indexes
to ND-coordinates. The ND-space bounded by the start/end coordinates is not
guaranteed to contain every range between the flattened ("total order")
start/end indexes. To handle this, we expand the coordinates until they cover
all input ranges. The expansion may include ranges outside the input, which may
produce false-positive relevant fragments.
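
As a rough illustration of the conversion and expansion described above, the
sketch below assumes a row-major total order (the last dimension varies
fastest) and at least one range per dimension. The names `to_coords` and
`expanded_coords` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Coords = std::vector<std::uint64_t>;

// Converts a flattened range id into per-dimension range indexes, assuming
// a row-major total order (the last dimension varies fastest).
Coords to_coords(
    std::uint64_t flat_idx, const std::vector<std::uint64_t>& ranges_per_dim) {
  Coords coords(ranges_per_dim.size(), 0);
  for (std::size_t d = ranges_per_dim.size(); d-- > 0;) {
    coords[d] = flat_idx % ranges_per_dim[d];
    flat_idx /= ranges_per_dim[d];
  }
  return coords;
}

// Returns {start, end} per-dimension range indexes whose bounding box
// covers every flattened id in [flat_start, flat_end].
std::pair<Coords, Coords> expanded_coords(
    std::uint64_t flat_start,
    std::uint64_t flat_end,
    const std::vector<std::uint64_t>& ranges_per_dim) {
  assert(flat_start <= flat_end);
  Coords start = to_coords(flat_start, ranges_per_dim);
  Coords end = to_coords(flat_end, ranges_per_dim);
  // Find the most significant dimension where the coordinates differ.
  std::size_t d = 0;
  while (d < start.size() && start[d] == end[d])
    ++d;
  // Every less significant dimension may wrap around between start and
  // end, so widen it to cover all of its ranges.
  for (std::size_t j = d + 1; j < start.size(); ++j) {
    start[j] = 0;
    end[j] = ranges_per_dim[j] - 1;
  }
  return {start, end};
}
```

With 3 ranges on the first dimension and 4 on the second, flattened ids 2
through 5 map to coordinates (0, 2) and (1, 1). The expanded box spans (0, 0)
to (1, 3), which covers 8 ranges instead of the 4 requested; the 4 extra ranges
are the source of possible false positives.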


TYPE: IMPROVEMENT
DESC: Optimize Subarray::compute_relevant_fragments

@joe-maley joe-maley force-pushed the jpm/opt-compute_relevant_fragments branch from eb35d3e to 738282d Compare April 20, 2021 12:52
@joe-maley joe-maley force-pushed the jpm/opt-compute_relevant_fragments branch from 738282d to 2b146f8 Compare April 20, 2021 13:22
@joe-maley joe-maley force-pushed the jpm/opt-compute_relevant_fragments branch from 2b146f8 to 09edff0 Compare April 20, 2021 14:29
@joe-maley joe-maley merged commit 0b8d52f into dev Apr 20, 2021
@joe-maley joe-maley deleted the jpm/opt-compute_relevant_fragments branch April 20, 2021 15:33
@github-actions
Contributor

The backport to release-2.2 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release-2.2 release-2.2
# Navigate to the new working tree
cd .worktrees/backport-release-2.2
# Create a new branch
git switch --create backport-2216-to-release-2.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick --mainline 1 0b8d52fe6f5e2492cc6daa0472cc9720664d57d9
# Push it to GitHub
git push --set-upstream origin backport-2216-to-release-2.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release-2.2

Then, create a pull request where the base branch is release-2.2 and the compare/head branch is backport-2216-to-release-2.2.

KiterLuc pushed a commit that referenced this pull request Apr 25, 2021
Co-authored-by: Joe Maley <joe@tiledb.com>