Skip to content

[arrow-ipc]: dictionary builders for delta - doc fix and integration tests for nested types#9853

Merged
alamb merged 5 commits intoapache:mainfrom
albertlockett:albert/delta-dict-minor-improvements
May 3, 2026
Merged

[arrow-ipc]: dictionary builders for delta - doc fix and integration tests for nested types#9853
alamb merged 5 commits intoapache:mainfrom
albertlockett:albert/delta-dict-minor-improvements

Conversation

@albertlockett
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

  • Closes #NNN.

Rationale for this change

In #9600 we added capability to call finish_preserve_values on any array builder, and it will propagate the choice to preserve dictionary values down into the nested builders. I thought it would be good to extend the integration tests we have in https://github.com/apache/arrow-rs/blob/main/arrow-ipc/tests/test_delta_dictionary.rs to cover the use cases, which was to call the method on builders with nested child builders (such as StructBuilder and ListBuilder).

While reviewing the PR for #9600 I also noticed a small issue with the docs related to using the StreamWriter with delta dictionaries, notably that we set up some options to use delta dictionaries but don't pass the options into the StreamWriter constructor:
https://docs.rs/arrow-ipc/58.1.0/arrow_ipc/writer/struct.StreamWriter.html#example---efficient-delta-dictionaries

What changes are included in this PR?

  • adds cases to the integration tests for delta dictionaries covering ListBuilder and StructBuilder
  • small docs correction for StreamWriter

Are these changes tested?

it is simply docs & tests

Are there any user-facing changes?

No

@github-actions github-actions Bot added the arrow Changes to the arrow crate label Apr 29, 2026
@albertlockett albertlockett changed the title Delta Dictionary Builder doc fix and integration tests [arrow-ipc]: dictionary builders for delta - doc fix and integration tests for nested types Apr 29, 2026
@albertlockett
Copy link
Copy Markdown
Contributor Author

@JakeDern would you mind having a look at this?

@JakeDern
Copy link
Copy Markdown
Contributor

@JakeDern would you mind having a look at this?

Looks good to me, thanks for adding the tests!

@alamb
Copy link
Copy Markdown
Contributor

alamb commented May 1, 2026

FYI @adamreichold

Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @albertlockett and @JakeDern

Comment thread arrow-ipc/src/tests/delta_dictionary.rs Outdated
RecordBatch::try_new(schema.clone(), vec![Arc::new(array) as ArrayRef]).unwrap()
}

/// build batches where the dictionary array is nested within a struct array
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worth also mentioning that the dict array is the first array?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, can't hurt. added in e5d900c

@alamb alamb merged commit 70e4069 into apache:main May 3, 2026
26 checks passed
@alamb
Copy link
Copy Markdown
Contributor

alamb commented May 3, 2026

Thanks @albertlockett

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants