Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Simplified Bytes (internal) #1099

Merged
merged 1 commit into from
Jun 25, 2022
Merged

Simplified Bytes (internal) #1099

merged 1 commit into from
Jun 25, 2022

Conversation

jorgecarleitao
Copy link
Owner

This PR decouples Bytes from everything except std.

This ports @jhorstmann's PR in arrow-rs. Although not as relevant here since we use Vec anyways, it enable us to potentially move Bytes to its own crate.

I am not fan of using Arc instead of Box - the struct itself does not need an Arc, will try to address this in a future PR.

Relevant thread in rust-users: https://users.rust-lang.org/t/implementing-a-potentially-foreign-allocated-potentially-mutable-memory-region/77388

@jorgecarleitao jorgecarleitao added the testing PRs that only increase coverage label Jun 24, 2022
@jorgecarleitao
Copy link
Owner Author

jorgecarleitao commented Jun 24, 2022

@jhorstmann I had to require Send + Sync to the owner, or Bytes would not be Send + Sync. unsafe impl Send for Bytes in arrow-rs may be unsound if we do not add Send + Sync to it.

@codecov
Copy link

codecov bot commented Jun 24, 2022

Codecov Report

Merging #1099 (0220531) into main (10b57c3) will increase coverage by 0.02%.
The diff coverage is 96.87%.

@@            Coverage Diff             @@
##             main    #1099      +/-   ##
==========================================
+ Coverage   81.23%   81.25%   +0.02%     
==========================================
  Files         367      366       -1     
  Lines       35357    35336      -21     
==========================================
- Hits        28721    28713       -8     
+ Misses       6636     6623      -13     
Impacted Files Coverage Δ
src/buffer/bytes.rs 98.11% <95.65%> (+23.97%) ⬆️
src/ffi/array.rs 89.20% <100.00%> (-0.04%) ⬇️
src/io/ipc/read/array/boolean.rs 89.58% <0.00%> (-8.34%) ⬇️
src/array/fixed_size_binary/mod.rs 87.37% <0.00%> (-0.51%) ⬇️
src/io/ipc/read/reader.rs 96.18% <0.00%> (+0.38%) ⬆️
src/array/utf8/mod.rs 82.74% <0.00%> (+0.95%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 10b57c3...0220531. Read the comment docs.

@ritchie46
Copy link
Collaborator

ritchie46 commented Jun 24, 2022

So we remove 1 layer of indirection to the data? And only decide at Drop what to do with the Vec we created? That saves a lot of branches!

P.S. which Arc are you referring to?

@jorgecarleitao
Copy link
Owner Author

So we remove 1 layer of indirection to the data? And only decide at Drop what to do with the Vec we created? That saves a lot of branches!

I believe so - although it is unclear whether the compiler was not optimizing it already.

P.S. which Arc are you referring to?

The one at Foreign(Arc<dyn RefUnwindSafe + Send + Sync>) - this arc is only here due to some constraint in the FFI that imo can go away.

@ritchie46
Copy link
Collaborator

I believe so - although it is unclear whether the compiler was not optimizing it already

I don't think it can. It could be either branch, so it has to check. I think.

The one at Foreign(Arc<dyn RefUnwindSafe + Send + Sync>) - this arc is only here due to some constraint in the FFI that imo can go away.

Right, and if we remove it, the clone call goes over FFI go that smart atomic pointer?

// This line is technically outside the assumptions of `Vec::from_raw_parts`, since
// `ptr` was not allocated by `Vec`. However, one of the invariants of this struct
// is that we do not expose this region as a `Vec`; we only use `Vec` on it to provide
// immutable access to the region (via `Vec::deref` to `&[T]`).
// MIRI does not complain, which seems to agree with the line of thought.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could also use Box<[T]> and call Vec::from_raw_parts only on get_vec on native allocated data.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this but in get_vec we would be creating a temporary Vec and returning &mut Vec, which does not compile :(

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah.. :(

@jorgecarleitao
Copy link
Owner Author

Right, and if we remove it, the clone call goes over FFI go that smart atomic pointer?

We do not know about smart pointers in FFI - in this case the idea is that we can Arc the struct containing the ref counter of the FFI struct (i.e. instead of Arc<Arc<, we use Box<Arc<, just like we did for Array.

@jorgecarleitao jorgecarleitao merged commit 88f05bb into main Jun 25, 2022
@jorgecarleitao jorgecarleitao deleted the simpler_bytes branch June 25, 2022 03:54
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
testing PRs that only increase coverage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants