Align decimals (Int128) during IPC writes to enable zero-copy reads #16007

Open
nameexhaustion opened this issue May 2, 2024 · 0 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@nameexhaustion
Collaborator

Description

Decimal (Int128) buffers are currently not aligned when written to IPC, so reads cannot be zero-copy and we have to pay a copy when reading decimals.
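As a rough illustration of what aligning during writes involves, the writer would pad before each Int128 buffer so that its offset is a multiple of the required alignment. The sketch below is not the Polars writer: `padding_for` is a hypothetical helper, and the 16-byte figure is an assumption (the actual alignment of `i128` depends on the target and compiler version). It also assumes the file is later mapped at a base address that is itself suitably aligned.

```rust
/// Hypothetical helper (not part of Polars): number of padding bytes to
/// write so that `offset + padding` is a multiple of `alignment`.
/// `alignment` must be a power of two.
fn padding_for(offset: usize, alignment: usize) -> usize {
    assert!(alignment.is_power_of_two());
    (alignment - (offset % alignment)) % alignment
}

/// Assumed alignment target for Int128 values, for illustration only.
const ASSUMED_INT128_ALIGNMENT: usize = 16;

/// Returns the offset at which an Int128 buffer would start after padding.
fn aligned_start(offset: usize) -> usize {
    offset + padding_for(offset, ASSUMED_INT128_ALIGNMENT)
}
```

With these assumptions, a buffer that would otherwise start at byte 8 gets 8 bytes of padding and starts at byte 16, while one already at byte 32 needs none.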

Reproducible Example

The copy occurs here:

} else {
    let mut values = vec![P::default(); num_rows];
    unsafe {
        std::ptr::copy_nonoverlapping(
            bytes.as_ptr(),
            values.as_mut_ptr() as *mut u8,
            bytes.len(),
        )
    };

To observe the copy, set a breakpoint or add a panic on that line, then write and read a decimal column through an IPC file:

import polars as pl
from decimal import Decimal

df = pl.Series("x", [Decimal("1.0")], dtype=pl.Decimal(18, 2)).to_frame()
df.write_ipc(path := ".env/data.ipc")
pl.read_ipc(path)
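For context, the decision between a zero-copy view and the copy shown in the Rust snippet above can be sketched as follows. This is a simplified, standalone illustration rather than the actual reader code: `view_or_copy` is a hypothetical name, and it is specialised to `i128` instead of the generic `P` used in the original.

```rust
use std::borrow::Cow;
use std::mem::{align_of, size_of};

/// Hypothetical helper (not the Polars reader): borrow `bytes` as `&[i128]`
/// when the pointer is suitably aligned, otherwise fall back to a copy.
fn view_or_copy(bytes: &[u8]) -> Cow<'_, [i128]> {
    assert_eq!(bytes.len() % size_of::<i128>(), 0);
    let num_rows = bytes.len() / size_of::<i128>();

    if bytes.as_ptr() as usize % align_of::<i128>() == 0 {
        // Zero-copy path.
        // SAFETY: the pointer is aligned for i128, the length is a whole
        // number of i128 values, and every bit pattern is a valid i128.
        Cow::Borrowed(unsafe {
            std::slice::from_raw_parts(bytes.as_ptr() as *const i128, num_rows)
        })
    } else {
        // Copy path, mirroring the snippet from the issue.
        let mut values = vec![0i128; num_rows];
        // SAFETY: `values` owns exactly `bytes.len()` bytes and the two
        // regions do not overlap.
        unsafe {
            std::ptr::copy_nonoverlapping(
                bytes.as_ptr(),
                values.as_mut_ptr() as *mut u8,
                bytes.len(),
            )
        };
        Cow::Owned(values)
    }
}
```

If the writer aligned decimal buffers, the first branch would be taken and no allocation would be needed on read.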

That said, I suspect this combination of decimal and IPC usage isn't very common, and even then you'd need to be processing a very large amount of data to observe memory issues, so to me this is a low (or maybe even goal) priority.

@nameexhaustion nameexhaustion added the enhancement New feature or an improvement of an existing feature label May 2, 2024