v0.4.3: Python bindings for streaming EXISTS() filter

@bintocher bintocher released this 26 Mar 22:11
· 18 commits to main since this release

New Python functions

  • ExistsIndex.from_values(["7", "9"]) — create EXISTS index from explicit value list
  • QvdTable.filter_by_values(col, values) — filter rows by column values, returns new QvdTable
  • QvdTable.subset_rows(indices) — create sub-table from row indices
  • read_qvd_filtered(path, col, index, select=[], chunk_size=65536) — streaming filtered read with column selection (2.5x faster than Qlik Sense)
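To make the idea behind a streaming filtered read concrete, here is a minimal pure-Python sketch. It does not use the qvd package and is not how the library is implemented. The function name filtered_read, the dict-per-row representation, and the reading of an empty select as "keep all columns" are assumptions made for illustration only.

```python
from itertools import islice

def filtered_read(rows, col, wanted, select=(), chunk_size=65536):
    """Hypothetical stand-in for a streaming EXISTS-style filtered read.

    rows   : any iterable of dict rows (assumed representation)
    col    : name of the column to test
    wanted : values that count as a match
    select : columns to keep; empty is assumed to mean all columns
    """
    wanted = set(wanted)          # set lookup plays the role of the EXISTS index
    it = iter(rows)
    kept = []
    while True:
        # Pull at most chunk_size rows at a time, so unmatched rows
        # are discarded chunk by chunk instead of being accumulated.
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        for row in chunk:
            if row[col] in wanted:
                kept.append({k: row[k] for k in select} if select else dict(row))
    return kept

# Toy data: 100 rows whose %Action_ID cycles through "0".."9".
rows = ({"%Action_ID": str(i % 10), "%Client_ID": i} for i in range(100))
result = filtered_read(rows, "%Action_ID", ["7", "9"],
                       select=["%Client_ID"], chunk_size=16)
print(len(result), "rows kept")
```

Only the matching rows and the selected columns are retained, which is the property the streaming read in this release is described as providing.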

Example

import qvd

# Create EXISTS index from explicit values
idx = qvd.ExistsIndex.from_values(["7", "9"])

# Streaming filtered read — only matching rows loaded into memory
table = qvd.read_qvd_filtered(
    "large_table.qvd",
    "%Action_ID",
    idx,
    select=["%Client_ID", "Date_BK", "%Action_ID"]
)
print(f"{table.num_rows} rows x {table.num_cols} cols")

# Save result
table.save("filtered.qvd")
table.save_as_parquet("filtered.parquet", compression="zstd")

Or filter an already-loaded table:

table = qvd.read_qvd("data.qvd")
filtered = table.filter_by_values("%Action_ID", ["7", "9"])
filtered.save("output.qvd")
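One way to think about how filter_by_values and subset_rows relate is that a value filter amounts to computing the matching row indices and then taking those rows. The sketch below shows that relationship on a plain dict of columns. It is an assumption for illustration, not a statement about how the library implements either method, and the helper names and sample data are hypothetical.

```python
def matching_indices(column, values):
    """Return the positions in `column` whose value is in `values`."""
    wanted = set(values)
    return [i for i, v in enumerate(column) if v in wanted]

def subset_rows(table, indices):
    """Build a new column-oriented table containing only the given rows."""
    return {name: [col[i] for i in indices] for name, col in table.items()}

# Toy column-oriented table (made-up values).
table = {
    "%Action_ID": ["7", "3", "9", "7"],
    "%Client_ID": ["a", "b", "c", "d"],
}

idx = matching_indices(table["%Action_ID"], ["7", "9"])
filtered = subset_rows(table, idx)
print(idx)
print(filtered)
```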