Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: create SparseRepoData from byte slices #624

Merged
merged 3 commits into from
Apr 29, 2024

Conversation

aochagavia
Copy link
Contributor

@aochagavia aochagavia commented Apr 25, 2024

This is a breaking change, because it introduces a lifetime parameter for SparseRepoData. If that's undesirable, maybe we could create an inner type (e.g. SparseRepoData2) wrapped by two outer types (e.g. MemmappedSparseRepoData and ByteSliceSparseRepoData). We ended up using Bytes with a self-referential borrow to get rid of the lifetime entirely, so now the change is no longer breaking 🎉.

I also took the liberty to switch from stable to unstable sorting, which has better performance characteristics. If you'd rather leave things as they are now, I can drop the commit.

Fixes #619

@baszalmstra baszalmstra changed the title feat: create SparseRepoData from byte slices feat!: create SparseRepoData from byte slices Apr 25, 2024
@baszalmstra
Copy link
Collaborator

I also took the liberty to switch from stable to unstable sorting, which has better performance characteristics. If you'd rather leave things as they are now, I can drop the commit.

Did you profile this with a real-world scenario? Most repodata.json is already sorted but just on a slightly different key therefor I didn't think this would matter much.

Im also a little hesitant to add randomness to the repodata.json. I don't think the order of the records in the file matters much but I am not completely sure.

This is a breaking change, because it introduces a lifetime parameter for SparseRepoData.

Look at the changes, can we use a 'static lifetime for the memory mapped data, which would still allow use to pass it around in Arc<SparseRepoData<'static>>s?

@aochagavia
Copy link
Contributor Author

Look at the changes, can we use a 'static lifetime for the memory mapped data, which would still allow use to pass it around in Arc<SparseRepoData<'static>>s?

That's indeed possible (in fact, I had to manually specify the 'static lifetime somewhere and it worked). On the other hand, it might be more user friendly to keep SparseRepoData as it is now and introduce an additional ByteSliceRepoData<'a>. Both structs would internally use the same code, and users would be spared the pain of updating and having to specify lifetimes everywhere (which might even be confusing for people). What do you think?

Did you profile this with a real-world scenario? Most repodata.json is already sorted but just on a slightly different key therefor I didn't think this would matter much.

I haven't profiled it. It's just that the stable version allocates half the size of self, according to the docs, so it feels "more correct". The docs even recommend using sort_unstable if you can get away with it (unless you really need a stable sort, that is). Also good to know, the sort is deterministic (it uses a constant random seed).

I don't think it really matters in practice, so let me know if you'd rather have me drop the commit 😉

@baszalmstra
Copy link
Collaborator

Both structs would internally use the same code, and users would be spared the pain of updating and having to specify lifetimes everywhere (which might even be confusing for people). What do you think?

Lets do that, it feels more ergonomic.

I don't think it really matters in practice, so let me know if you'd rather have me drop the commit.

I'm also fairly certain the order doesn't matter. So let's just leave it in. 🙉

@aochagavia aochagavia changed the title feat!: create SparseRepoData from byte slices feat: create SparseRepoData from byte slices Apr 26, 2024
@aochagavia
Copy link
Contributor Author

Rebased and figured out how to make the change non-breaking. This is now ready for review!

Copy link
Collaborator

@baszalmstra baszalmstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One tiny little comment but otherwise LGTM

crates/rattler_repodata_gateway/Cargo.toml Outdated Show resolved Hide resolved
Co-authored-by: Bas Zalmstra <zalmstra.bas@gmail.com>
@baszalmstra baszalmstra enabled auto-merge (squash) April 29, 2024 10:13
@baszalmstra baszalmstra enabled auto-merge (squash) April 29, 2024 10:14
@baszalmstra baszalmstra merged commit 526cd23 into conda:main Apr 29, 2024
14 checks passed
@baszalmstra baszalmstra mentioned this pull request Apr 26, 2024
baszalmstra added a commit to baszalmstra/rattler that referenced this pull request Apr 29, 2024
Fixes conda#619

---------

Co-authored-by: Bas Zalmstra <zalmstra.bas@gmail.com>
tdejager pushed a commit that referenced this pull request Apr 30, 2024
## 🤖 New release
* `rattler_networking`: 0.20.3 -> 0.20.4 (✓ API compatible changes)
* `rattler_lock`: 0.22.3 -> 0.22.4 (✓ API compatible changes)
* `rattler_repodata_gateway`: 0.19.9 -> 0.19.10 (✓ API compatible
changes)
* `rattler_index`: 0.19.8 -> 0.19.9 (✓ API compatible changes)
* `rattler`: 0.23.1 -> 0.23.2
* `rattler_package_streaming`: 0.20.6 -> 0.20.7

<details><summary><i><b>Changelog</b></i></summary><p>

## `rattler_networking`
<blockquote>

##
[0.20.4](rattler_networking-v0.20.3...rattler_networking-v0.20.4)
- 2024-04-30

### Other
- bump py-rattler 0.5.0
([#629](#629))
</blockquote>

## `rattler_lock`
<blockquote>

##
[0.22.4](rattler_lock-v0.22.3...rattler_lock-v0.22.4)
- 2024-04-30

### Added
- adds pypi indexes to the lock-file
([#626](#626))
</blockquote>

## `rattler_repodata_gateway`
<blockquote>

##
[0.19.10](rattler_repodata_gateway-v0.19.9...rattler_repodata_gateway-v0.19.10)
- 2024-04-30

### Added
- create SparseRepoData from byte slices
([#624](#624))
</blockquote>

## `rattler_index`
<blockquote>

##
[0.19.9](rattler_index-v0.19.8...rattler_index-v0.19.9)
- 2024-04-30

### Other
- release ([#625](#625))
</blockquote>

## `rattler`
<blockquote>

##
[0.23.2](rattler-v0.23.1...rattler-v0.23.2)
- 2024-04-30

### Other
- updated the following local packages: rattler_networking
</blockquote>

## `rattler_package_streaming`
<blockquote>

##
[0.20.7](rattler_package_streaming-v0.20.6...rattler_package_streaming-v0.20.7)
- 2024-04-30

### Other
- updated the following local packages: rattler_networking
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/MarcoIeni/release-plz/).

Signed-off-by: Bas <4995967+baszalmstra@users.noreply.github.com>
Co-authored-by: Bas <4995967+baszalmstra@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: create SparseRepoData from memory
2 participants