-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-7775: Don't let safe code arbitrarily transmute readers and writers #6256
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? Then could you also rename pull request title in the following format?
See also: |
The rust code could really use some auditing of the unsafe code, a quick glance tells me that there are both unnecessary unsafe and undefined behaviour in the crate. arrow/rust/arrow/src/array/array.rs Lines 210 to 211 in 2a54de9
Packing in arrow/rust/arrow/src/array/array.rs Line 305 in 2a54de9
Undefined behaviour if Undefined behaviour if the caller passes parameters that are out of bounds. arrow/rust/arrow/src/ipc/gen/Schema.rs Lines 50 to 63 in 2a54de9
This transmute to and from values which aren't specified in the enum definition. There are most likely more, those are just the first things I saw after grepping for unsafe
|
Hi @sunchao @liurenjie1024 @sadikovi, pulling you in the loop as you're more familiar with the parquet codebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch! Could you state the motivation for the change? Was there an outage or did your library/project panic when using the API? Are you changing the code because you don't want to use unsafe
?
I think the change needs a discussion, there are a few things that stood out:
- having column reader/writer imports in data_type.rs - data types are used everywhere, I would prefer not having dependency on column reader/writer there if possible.
- I would not want to panic, and instead update API to return an option so users can resolve None cases themselves.
- I don't remember if we have unit tests for this method, would be good to add.
See the inline comments.
make_type!( | ||
BoolType, | ||
Type::BOOLEAN, | ||
BoolColumnReader, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about this. We generate column readers and writers when making types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It generates the unpacking functions needed to unpack the enum. I don't know of another, better way of doing this so 🤷♂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this is just an odd place to put them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marwes@eb1ba90 is the only alternative I know which I would argue is a lot more verbose for no gain.
rust/parquet/src/column/reader.rs
Outdated
ColumnReader::ByteArrayColumnReader(r) => unsafe { mem::transmute(r) }, | ||
ColumnReader::FixedLenByteArrayColumnReader(r) => unsafe { mem::transmute(r) }, | ||
} | ||
T::get_column_reader(col_reader).expect("Column type mismatch") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your fix does not change the outcome - we can still fail by passing the wrong type, except we would directly panic now. The original function implementation is actually "safe" despite using unsafe
- there is a strict list of physical types that are allowed to be used.
Actually, I would probably return an Option<ColumnReaderImpl> instead. Same for writing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your fix does not change the outcome - we can still fail by passing the wrong type, except we would directly panic now.
Sure, but that is much better than undefined behaviour.
The original function implementation is actually "safe" despite using unsafe - there is a strict list of physical types that are allowed to be used.
That is not enough to make it safe. To transmute a ColumnReader<A>
into a ColumnReader<B>
the representations of A
and B
must be identical. A
and B
are both empty structs in this case
arrow/rust/parquet/src/data_type.rs
Line 405 in a9ef828
pub struct $name {} |
In any case it can be implemented with safe code which also has the added benefit of actually panicking if you do the wrong thing instead of continuing silently.
Actually, I would probably return an Option instead. Same for writing.
That would force callers to unwrap instead which seems redundant. Propagating the error seems unnecessarily complicated as well as it should just be a programming error (should not be visible to users of the crate).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We never transmute ColumnReader<A>
into ColumnReader<B>
. We transmute ColumnReader_A
into ColumnReaderImpl<A>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but mistakes happen and now it will panic instead of causing undefined behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- there is no "we never do X" in a public interface.
- the code should immediately panic if someone uses it in the wrong way. This should not be an
Option
because it is a programing error. - failing as undefined bahavior is worse than failing by panic. (why are even having this discussion in rust...)
- changing the return type is a backward incompatible change... WTF?
@@ -23,6 +23,8 @@ use std::mem; | |||
use byteorder::{BigEndian, ByteOrder}; | |||
|
|||
use crate::basic::Type; | |||
use crate::column::reader::{ColumnReader, ColumnReaderImpl}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could result in a circular dependency, we should not be doing that, or we should at least factor out the interfaces and structs there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust handles circular, inter crate dependencies fine 🤷♂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still does not mean we should be doing it.
Transmuting is wildly unsafe, it should only be done if absolutely necessary (it is not, see this safe implementation).
Not, but with this it has a chance to crash instead of causing undefined behavior which is vastly preferred.
Yes.
It could be changed, I guess, to have these methods on their own trait. But I believe that will just make that new trait need need to be propagate to all/most of the placed that
It is always possible to explicitly match on the enum if one wants that but sure it could be changed, I just wanted to avoid breaking the API.
The methods are generated by a macro and there is really just one way to implement the method so any test would just be identical to the implementation which seems redundant. The methods are used throughout the rest of the crate anyway so they are implicitly used. |
I am not convinced, to be honest. I understand the motivation to get rid of transmute and agree that we should remove unsafe and transmute, but I don't agree with the proposed approach, I think there is a better way to handle this case. For example, having a method on ColumnReader enum and check the type compatibility on (type, reader) pair - you would know exactly what user is trying to do so we can assert on it and fail early, similar to what you did in this patch. I would offer a new "safe" API that would return a Result/Option and convert the old one to unwrap the result and report a meaningful error message. I would also try finding (if possible) the case when the current method with transmute fails and come up with the solution that fixes it. Again, this is just my opinion and suggestions. I would like to hear what you and other reviewers think. |
rust/parquet/src/column/writer.rs
Outdated
ColumnWriter::ByteArrayColumnWriter(r) => unsafe { mem::transmute(r) }, | ||
ColumnWriter::FixedLenByteArrayColumnWriter(r) => unsafe { mem::transmute(r) }, | ||
} | ||
T::get_column_writer(col_writer).expect("Column type mismatch") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you improve the error message, it is just too generic? Would be good to have something like Failed to convert column writer into a typed column writer for <data type> type
or something similar. Thanks!
I might be missing something, but if I understand it that would still need From what I can gather, the |
LGTM The fact that @Marwes had to break module dependencies to make it safe hints that either there may be design problems in the code base, or there is some lack of clarity of what modules can depend on others or not. Regardless, tighter coupling is IMO less important than an unsafe public interface. So, yes, there is work to further reduce dependencies, but IMO it can be left to a future PR specifically for that. I would also consider panics as better than undefined behavior, and, AFAI can tell, this patch does not break backward compatibility, which I am pleasantly surprised. This PR also reduces code by leveraging macros. Finally, IMO As a user, I decide if I use this lib or just use the c++/java/etc counterparts. The main trade-off between others and rust is the safety provided by the compiler in both memory management, undefined behavior. Currently, public functions to read and write data, e.g.
which IMO puts parquet-rs' interface on the same safety level as its c++ counterpart. Which is a way of saying:
Then, while going through this issue here, I read:
Isn't it obvious? providing an unsafe public interface on a language whose main differentiator is being safe is a major reason to not rely on this library in the first place. Since this is the de-facto lib for handling parquet, this is a likely reason to not use rust or parquet in the first place. Finally, please consider that how the feedback on this PR was delivered likely drove @Marwes away from further contributing to this project. Specifically,
regardless of their specific track-record in rust projects (which in this particular case is quite significant), we may want to re-evaluate how we approach PRs... |
Sorry for chiming in late. Overall I like the PR and agree that we should reduce Also @Marwes , could you create a JIRA for this? we can't merge the PR without a JIRA. |
rust/parquet/src/column/reader.rs
Outdated
} | ||
T::get_column_reader(col_reader).unwrap_or_else(|| { | ||
panic!( | ||
"Failed to convert column reader into a typed column reader for `{}` typ", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace typ
with type
? same for others.
Marwes@eb1ba90 There could be another trait, but it still has to be public since the functions that call these functions/methods are public. From how |
Can we define a private module in the |
That would work, but docs would still show trait DataType : ExtractReaderWriter { } If the methods should be hidden from users then |
@jorgecarleitao I would argue the problem comes from the language limitation of not having proper inheritance rather than the project design limitation, otherwise, we could have used generics and cast. However, we do have a few things that could have been done differently and much better. Correct me if I am wrong, but the approach in PR still throws an error at runtime, there are no compile time checks. My main point is providing a safe API vs not having By the way, I still think it would be a good idea to add a test case or two. |
I think @sadikovi 's approach should also work here. Something like: impl ColumnReader {
fn get_type(&self) -> Type {
match self {
ColumnReader::BoolColumnReader(_) => Type::BOOLEAN,
...
_ => unreachable!("Impossible"),
}
}
}
...
pub fn get_typed_column_reader<T: DataType>(
col_reader: ColumnReader,
) -> ColumnReaderImpl<T> {
if col_reader.get_type() != T::get_physical_type() {
panic!("type doesn't match!")
}
...
} while it doesn't eliminate the unsafe usage, the gun is in developers' hand now and only they can shoot their own foot. The API is safe in this regard I guess? With that said, I'm fine with either approach and I do like to see this PR being merged. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline. LGTM.
@Marwes the build has failed. Could you rebase so I can merge this? |
This code were quite clearly unsafe (which seems to be known, judging from the comment about it). As the functions appeared to callable indirectly from other public functions I changed these to just panic instead.
Merged. Thanks @Marwes . BTW: I still think it is a good idea to hide these reader/writer getters from public though. Perhaps we can fix that in a follow up PR. |
This removes or corrects many instances of unsafe code in the rust crates. This is by no means a complete fix and some of the fixes do not entirely fix the issues with the particular `unsafe` (see comments), but in all instances it should put the code in a better place than before. Based on #6256 Closes #6395 from Marwes/unsafe_rust_2 Authored-by: Markus Westerlind <markus.westerlind@distilnetworks.com> Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
This code were quite clearly unsafe (which seems to be known, judging
from the comment about it). As the functions appeared to callable
indirectly from other public functions I changed these to just panic
instead to keep the API.