[C#] Slice value buffers when writing sliced list view arrays in IPC format #41231

Closed
adamreeve opened this issue Apr 16, 2024 · 1 comment

Comments

@adamreeve
Contributor

Describe the enhancement requested

Similar to #41225, but for list view arrays. We can slice the values buffer and compute shifted offsets and sizes to reduce IPC file sizes when writing sliced list view arrays.
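As a rough illustration of the idea (not the code from the eventual C# change), the sketch below models a list view array with plain Python lists. The function name `compact_list_view_slice` and its parameters are hypothetical, and validity bitmaps (nulls) are not modelled. It assumes that only the value range actually referenced by the sliced lists needs to be written, and that offsets may appear in any order.

```python
from typing import List, Sequence, Tuple


def compact_list_view_slice(
    offsets: Sequence[int],
    sizes: Sequence[int],
    values: Sequence[int],
    slice_offset: int,
    slice_length: int,
) -> Tuple[List[int], List[int], List[int]]:
    """Return (offsets, sizes, values) covering only the sliced lists.

    Hypothetical helper: illustrates the shifting arithmetic only.
    """
    window_offsets = list(offsets[slice_offset:slice_offset + slice_length])
    window_sizes = list(sizes[slice_offset:slice_offset + slice_length])

    # List view offsets need not be ordered, so every non-empty list in the
    # window has to be scanned to find the referenced value range.
    non_empty = [(o, s) for o, s in zip(window_offsets, window_sizes) if s > 0]
    if not non_empty:
        # No values are referenced at all: write an empty values buffer.
        return [0] * len(window_sizes), window_sizes, []

    start = min(o for o, _ in non_empty)
    end = max(o + s for o, s in non_empty)

    # Shift offsets so they index into the trimmed values buffer.
    # Sizes are unchanged; empty lists are given offset 0.
    shifted = [
        o - start if s > 0 else 0
        for o, s in zip(window_offsets, window_sizes)
    ]
    return shifted, window_sizes, list(values[start:end])


# Four lists stored out of order: [14, 15], [10, 11], [16, 17], [12, 13]
values = [10, 11, 12, 13, 14, 15, 16, 17]
offsets = [4, 0, 6, 2]
sizes = [2, 2, 2, 2]

# Keep only the last two lists: [16, 17] and [12, 13]
new_offsets, new_sizes, new_values = compact_list_view_slice(
    offsets, sizes, values, slice_offset=2, slice_length=2
)
```

In this example the two retained lists reference values 2 through 7 of the original buffer, so six values would be written instead of eight, and the offsets become 4 and 0 relative to the trimmed buffer.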

Component(s)

C#

CurtHagenlocher added a commit that referenced this issue Apr 19, 2024
…ay to IPC format (#41255)

### Rationale for this change

Reduces IPC file sizes when writing sliced list view arrays.

### What changes are included in this PR?

Updates `ArrowStreamWriter` so it only writes the required range of values for a list view array, and adjusts the offset values accordingly.

### Are these changes tested?

Yes, this is covered by existing tests and I've also added a new test to verify the behaviour with list view arrays that have unordered offsets.

### Are there any user-facing changes?

Yes, this might reduce IPC file sizes for users writing sliced data.
* GitHub Issue: #41231

Lead-authored-by: Adam Reeve <adreeve@gmail.com>
Co-authored-by: Curt Hagenlocher <curt@hagenlocher.org>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
@CurtHagenlocher CurtHagenlocher added this to the 16.1.0 milestone Apr 19, 2024
@CurtHagenlocher
Contributor

Issue resolved by pull request #41255

raulcd pushed a commit that referenced this issue Apr 29, 2024
…ay to IPC format (#41255)

tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…ew array to IPC format (apache#41255)

tolleybot pushed a commit to tmct/arrow that referenced this issue May 4, 2024
…ew array to IPC format (apache#41255)

rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
…ew array to IPC format (apache#41255)

vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…ew array to IPC format (apache#41255)
