-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid calling shutdown after failed write of AsyncWrite #11415
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @joroKr21 -- this change makes sense to me
cc @devinjdangelo as I think you are the expert in this area
@@ -50,7 +50,7 @@ pub(crate) async fn serialize_rb_stream_to_object_store( | |||
mut data_rx: Receiver<RecordBatch>, | |||
serializer: Arc<dyn BatchSerializer>, | |||
mut writer: WriterType, | |||
) -> std::result::Result<(WriterType, u64), (WriterType, DataFusionError)> { | |||
) -> std::result::Result<(WriterType, u64), (Option<WriterType>, DataFusionError)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not for this PR, but this signature is getting pretty gnarly. Maybe it is time to try and encapsulate some of this logic into structs 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the rationale would be hard to discern only from the code. Could you please update the documentation to reflect the change of what is returned on error?
Specifically, I think it is that if there was an error on write, the writer is dropped so we don't accidentally write to it or try and close it. For any error not involving the writer, the writer is returned as well
cc @metesynnada as I remember at some point maybe you had a usecase for calling shutdown
on a writer after error 🤔 I may be misremembering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I will update the docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it would be good for this PR to describe in docs what is what in return megatype and later we can factor this out into separate strict type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #11443 to track cleaning up the megatype
cd531e1
to
d6e11ad
Compare
d6e11ad
to
112d801
Compare
@@ -50,7 +50,7 @@ pub(crate) async fn serialize_rb_stream_to_object_store( | |||
mut data_rx: Receiver<RecordBatch>, | |||
serializer: Arc<dyn BatchSerializer>, | |||
mut writer: WriterType, | |||
) -> std::result::Result<(WriterType, u64), (WriterType, DataFusionError)> { | |||
) -> std::result::Result<(WriterType, u64), (Option<WriterType>, DataFusionError)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #11443 to track cleaning up the megatype
…pache#250) (apache#11415) in `serialize_rb_stream_to_object_store`
…pache#250) (apache#11415) in `serialize_rb_stream_to_object_store`
…pache#250) (apache#11415) in `serialize_rb_stream_to_object_store`
in
serialize_rb_stream_to_object_store
Which issue does this PR close?
Rationale for this change
Avoid "
async fn
resumed after completion" panics caused by trying to shutdown an errored writer.See e.g.
BufWriter
inobject_store
which holds on to an underlyingFuture
and doesn't transition the state on error. (https://github.com/apache/arrow-rs/blob/master/object_store/src/buffered.rs#L376-L379)I tried to figure out from the documentation of
AsyncWrite
whether we should be able to callpoll_shutdown
afterpoll_write
failed but I didn't manage to figure it out so I think it's better to be defensive here.What changes are included in this PR?
We make the
WriterType
returned on error optional and omit it when the error originated from the writer itself.Are these changes tested?
No, not sure how to reproduce it outside of real-world conditions, let alone test it 😢
Are there any user-facing changes?
No user-facing changes.