Skip to content

fix: try again to fix Miri in ParquetOpener#21680

Merged
mbutrovich merged 2 commits intoapache:mainfrom
mbutrovich:fix_miri_again
Apr 16, 2026
Merged

fix: try again to fix Miri in ParquetOpener#21680
mbutrovich merged 2 commits intoapache:mainfrom
mbutrovich:fix_miri_again

Conversation

@mbutrovich
Copy link
Copy Markdown
Contributor

@mbutrovich mbutrovich commented Apr 16, 2026

Which issue does this PR close?

Rationale for this change

#21663 removed the nested async block in PushDecoderStreamState::transition to fix a Stacked Borrows violation under miri. However, miri still flags the same violation because &mut self is still a single opaque borrow — inlining the async block alone doesn't let miri split the disjoint field borrows across the .await yield point.

Comet CI reproduces this reliably: https://github.com/apache/datafusion-comet/actions/runs/24518967597/job/71671004017?pr=3916

What changes are included in this PR?

Change PushDecoderStreamState::transition to take self by value instead of &mut self. With &mut self, the generated future stores a mutable reference, and when unfold pins and polls it, miri sees the &mut self as a single opaque borrow that conflicts across the .await yield point. With owned self, the future owns the state directly — no reference means no Stacked Borrows conflict. The struct fields are all heap-allocated or reference-counted, so the move is just pointer-sized copies, not a deep copy.

Are these changes tested?

Existing tests cover this code path. The fix is validated by miri passing in CI (the same test that currently fails: test_nested_types_extract_missing_struct_names_missing_field). We'll run Comet CI against this branch first to confirm the miri violation is resolved before merging.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the datasource Changes to the datasource crate label Apr 16, 2026
mbutrovich added a commit to mbutrovich/datafusion-comet that referenced this pull request Apr 16, 2026
@mbutrovich
Copy link
Copy Markdown
Contributor Author

Testing the fix in Comet CI: apache/datafusion-comet#3960

@mbutrovich mbutrovich self-assigned this Apr 16, 2026
@mbutrovich mbutrovich marked this pull request as ready for review April 16, 2026 17:22
@mbutrovich mbutrovich requested a review from Dandandan April 16, 2026 17:49
@mbutrovich
Copy link
Copy Markdown
Contributor Author

Miri is green in Comet CI with the latest commit: https://github.com/apache/datafusion-comet/actions/runs/24522903075/job/71685242268?pr=3960

let result = self.project_batch(&batch);
timer.stop();
return Some(result);
// Release the borrow on baseline_metrics before moving self
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Copy Markdown
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mbutrovich IMO it is good to go.

@mbutrovich mbutrovich added this pull request to the merge queue Apr 16, 2026
Merged via the queue into apache:main with commit 93ae1b8 Apr 16, 2026
31 checks passed
@mbutrovich mbutrovich deleted the fix_miri_again branch April 16, 2026 18:17
/// - [`Finished`](DecodeResult::Finished) – signals end-of-stream (`None`).
async fn transition(&mut self) -> Option<Result<RecordBatch>> {
///
/// Takes `self` by value (rather than `&mut self`) so the generated future
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Makes esne to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

datasource Changes to the datasource crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stacked Borrows violation in PushDecoderStreamState::transition

3 participants