Fix fragment consolidation to allow using absolute URIs. #5135

Merged Jul 15, 2024 (8 commits)

shaunrd0
Contributor

This fixes fragment consolidation to allow using absolute URIs. I ran into this while adding a TileDB-Go binding for tiledb_array_consolidate_fragments in SC-49723 by passing URIs directly from the FragmentInfo APIs.


TYPE: BUG
DESC: Fix fragment consolidation to allow using absolute URIs.

@shaunrd0 shaunrd0 requested review from ypatia and KiterLuc June 21, 2024 14:29
@shaunrd0 shaunrd0 force-pushed the smr/sc-49934/relative-consolidation-uris branch from 5c5d147 to 7fa89f3 Compare June 21, 2024 16:28
Member

@ypatia ypatia left a comment

Good catch!

@ypatia
Member

ypatia commented Jun 21, 2024

@KiterLuc this is a fix in consolidate_fragments needed for shipping Fragment list consolidation of tiledb:// arrays. Do you think we can get this in the upcoming 2.24.1 patch release? Thanks.

@KiterLuc
Contributor

> @KiterLuc this is a fix in consolidate_fragments needed for shipping Fragment list consolidation of tiledb:// arrays. Do you think we can get this in the upcoming 2.24.1 patch release? Thanks.

Sorry 2.24.1 has already sailed :(

@shaunrd0 shaunrd0 force-pushed the smr/sc-49934/relative-consolidation-uris branch from 7fa89f3 to ce20790 Compare July 1, 2024 14:44
@ypatia
Member

ypatia commented Jul 2, 2024

> > @KiterLuc this is a fix in consolidate_fragments needed for shipping Fragment list consolidation of tiledb:// arrays. Do you think we can get this in the upcoming 2.24.1 patch release? Thanks.
>
> Sorry 2.24.1 has already sailed :(

Adding a backport tag in case we go for 2.24.2

@@ -362,7 +362,7 @@ Status FragmentConsolidator::consolidate_fragments(
   NDRange union_non_empty_domains;
   std::unordered_set<std::string> to_consolidate_set;
   for (auto& uri : fragment_uris) {
-    to_consolidate_set.emplace(uri);
+    to_consolidate_set.emplace(URI(uri).last_path_part());
Contributor

We should be more rigorous here and validate that the removed part is either the array URI for old fragment style or array_uri + "__fragments". Let's add coverage for all.

Contributor Author

Should relative fragment URIs not be allowed? I'm just thinking for that case neither of these checks would pass. The removed part of the URI would be the CWD if we do something like URI("__1719935189448_1719935189448_2eebea0af3ad535fe64ab72eee433790_22").last_path_part() for example.

Contributor

The old way of using the API should still work.
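The thread above turns on one point: the set of fragments to consolidate should be keyed by fragment name, so that a bare name and a full URI for the same fragment land on the same entry. Below is a minimal sketch of that idea, not the code in this PR. It works on plain strings, and `last_path_part` and `fragment_names` are hypothetical stand-ins; the real `URI::last_path_part()` operates on a `URI` object, which (as noted above) resolves a relative name against the current working directory first.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for URI::last_path_part(): returns whatever follows
// the final '/', or the whole string when there is no '/'.
std::string last_path_part(const std::string& uri) {
  const auto pos = uri.find_last_of('/');
  return pos == std::string::npos ? uri : uri.substr(pos + 1);
}

// Collects fragment names so that relative and absolute inputs for the same
// fragment collapse to a single entry.
std::unordered_set<std::string> fragment_names(
    const std::vector<std::string>& fragment_uris) {
  std::unordered_set<std::string> names;
  for (const auto& uri : fragment_uris) {
    names.emplace(last_path_part(uri));
  }
  return names;
}
```

Passing both `file:///tmp/array/__fragments/__1_1_abc_22` and `__1_1_abc_22` yields one entry. Keying on the name alone does not verify that an absolute URI actually belongs to this array, which is the gap the reviewer asks to close.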

@ypatia ypatia requested a review from KiterLuc July 12, 2024 12:57
@shaunrd0 shaunrd0 force-pushed the smr/sc-49934/relative-consolidation-uris branch from fe1e779 to a5e55c1 Compare July 12, 2024 13:07

// Normalizes path separators for windows paths.
std::string array_uri = URI(array_name).to_string();
// Check for valid URI based on array format version.
Contributor

Use array_for_reads->array_directory().get_fragments_dir(write_version);

Contributor Author

Much better, thanks!

@shaunrd0 shaunrd0 force-pushed the smr/sc-49934/relative-consolidation-uris branch from 7ba2202 to 3981759 Compare July 15, 2024 12:31
// Check for valid URI based on array format version.
auto fragments_dir = array_for_reads->array_directory().get_fragments_dir(
frag_id.array_format_version());
if (!fragment_uri.contains(fragments_dir)) {
Contributor

uri != fragments_dir.join_path(uri.last_path_part().c_str())

Contributor

Sorry to be nitpicky 🙃

Contributor Author

No worries 😆 done
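To illustrate why the reviewer prefers an exact comparison over `contains`, here is a small sketch under stated assumptions. It uses plain strings rather than the library's `URI` class, and `last_path_part_of`, `join_path`, and `is_valid_fragment_uri` are hypothetical helpers. Accepting a bare fragment name is also an assumption, based on the comment that the old way of using the API should still work; the merged code may handle that case differently.

```cpp
#include <string>

// Hypothetical stand-in for URI::last_path_part().
std::string last_path_part_of(const std::string& uri) {
  const auto pos = uri.find_last_of('/');
  return pos == std::string::npos ? uri : uri.substr(pos + 1);
}

// Hypothetical stand-in for URI::join_path(): joins with a single '/'.
std::string join_path(const std::string& dir, const std::string& name) {
  if (dir.empty() || dir.back() == '/') {
    return dir + name;
  }
  return dir + "/" + name;
}

// Accepts a bare fragment name (assumed to be the old, relative usage) or an
// absolute URI that is exactly fragments_dir + "/" + fragment name.
bool is_valid_fragment_uri(
    const std::string& uri, const std::string& fragments_dir) {
  const std::string name = last_path_part_of(uri);
  if (uri == name) {
    return true;  // No directory component: treated as a relative name.
  }
  return uri == join_path(fragments_dir, name);
}
```

With `fragments_dir` set to `file:///tmp/array/__fragments`, the URI `file:///tmp/array/__fragments/sub/__1_1_abc_22` contains the fragments directory yet fails the exact comparison, so a substring test would accept it while this check rejects it.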

@KiterLuc KiterLuc merged commit a7e2a07 into dev Jul 15, 2024
61 checks passed
@KiterLuc KiterLuc deleted the smr/sc-49934/relative-consolidation-uris branch July 15, 2024 14:49
shaunrd0 added a commit that referenced this pull request Jul 16, 2024
This fixes fragment consolidation to allow using absolute URIs. I ran
into this while adding a TileDB-Go binding for
`tiledb_array_consolidate_fragments` in
[SC-49723](https://app.shortcut.com/tiledb-inc/story/49723/add-tiledb-go-binding-for-tiledb-array-consolidate-fragments)
by passing URIs directly from the [FragmentInfo
APIs.](https://github.com/TileDB-Inc/TileDB-Go/pull/322/files#diff-c37e5f4dd452918ab7c467ff243d69310dbd1b238b7b717a98871f80cf0fab70R156)

---
TYPE: BUG
DESC: Fix fragment consolidation to allow using absolute URIs.

(cherry picked from commit a7e2a07)
KiterLuc pushed a commit that referenced this pull request Jul 17, 2024