Insights: pola-rs/polars
Overview
1 Release published by 1 person
-
py-1.24.0 Python Polars 1.24.0
published
Mar 2, 2025
65 Pull requests merged by 11 people
-
perf: Box large enum variants
#21657 merged
Mar 8, 2025 -
fix: Casting Struct to String panics if n_chunks > 1
#21656 merged
Mar 8, 2025 -
fix: Fix `Future attached to different loop` error on `read_database_uri`
#21641 merged
Mar 7, 2025 -
chore: Update rustc
#21647 merged
Mar 7, 2025 -
build: Bump ring from 0.17.11 to 0.17.13
#21649 merged
Mar 7, 2025 -
perf: Improve join performance for new-streaming engine
#21620 merged
Mar 7, 2025 -
refactor(rust): Remove `once_cell` in favor of `std` equivalents
#21639 merged
Mar 7, 2025 -
perf: Pre-fill caches
#21646 merged
Mar 7, 2025 -
perf: Optimize only a single cache input
#21644 merged
Mar 7, 2025 -
refactor(rust): Remove unused flag
#21642 merged
Mar 7, 2025 -
fix: Fix deadlock in cache + hconcat
#21640 merged
Mar 7, 2025 -
perf: Collect parquet statistics in one contiguous buffer
#21632 merged
Mar 7, 2025 -
fix: Properly handle phase transitions in row-wise sinks
#21600 merged
Mar 7, 2025 -
refactor: Support object from chunks
#21636 merged
Mar 7, 2025 -
docs(python): Document `read_().lazy()` antipattern
#21623 merged
Mar 6, 2025 -
refactor(rust): Rename `block_on_potential_spawn` to `block_in_place_on`
#21627 merged
Mar 6, 2025 -
feat: Enable new streaming memory sinks by default
#21589 merged
Mar 6, 2025 -
ci: Push versioned docs on workflow dispatch
#21630 merged
Mar 6, 2025 -
ci: Fail docs early
#21629 merged
Mar 6, 2025 -
ci: Check major/minor in docs
#21626 merged
Mar 6, 2025 -
ci: Add docs workflow
#21624 merged
Mar 6, 2025 -
fix: Always use global registry for object
#21622 merged
Mar 6, 2025 -
feat: Cloud support for new-streaming scans and sinks
#21621 merged
Mar 6, 2025 -
fix: Check enum categories when reading csv
#21619 merged
Mar 6, 2025 -
feat: Add len method to arr
#21618 merged
Mar 6, 2025 -
refactor(rust): Introduce `Writeable` and `AsyncWriteable`
#21599 merged
Mar 6, 2025 -
test: Add test for 21581
#21617 merged
Mar 6, 2025 -
docs: Update Polars Cloud interactive workflow examples
#21609 merged
Mar 6, 2025 -
perf: Update Cargo.lock (mainly for zstd 1.5.7)
#21612 merged
Mar 6, 2025 -
fix: Unspecialized prefiltering on nullable arrays
#21611 merged
Mar 6, 2025 -
refactor: Remove even more parquet multiscan handling
#21601 merged
Mar 5, 2025 -
fix(python): Release the gil on explain
#21607 merged
Mar 5, 2025 -
fix: Take into account scalar/partitioned columns in DataFrame::split_chunks
#21606 merged
Mar 5, 2025 -
feat: Closeable files on unix
#21588 merged
Mar 5, 2025 -
fix: Bad null handling in unordered row encoding
#21603 merged
Mar 5, 2025 -
docs(python): Add a `Plotnine` example to the visualization docs
#21597 merged
Mar 5, 2025 -
fix: Fix deadlock in new streaming CSV / NDJSON sinks
#21598 merged
Mar 5, 2025 -
fix: Bad view index in BinaryViewBuilder
#21590 merged
Mar 4, 2025 -
perf: Don't maintain order when maintain_order=False in new streaming sinks
#21586 merged
Mar 4, 2025 -
refactor(rust): Add freeze_reset to the builders
#21587 merged
Mar 4, 2025 -
refactor: Remove multiscan handling from new streaming parquet source
#21584 merged
Mar 4, 2025 -
feat: Add new `PartitionMaxSize` sink
#21573 merged
Mar 4, 2025 -
refactor(rust): Add opt_gather and extend_nulls to builders
#21582 merged
Mar 4, 2025 -
refactor(rust): Avoid downloading full parquet when initializing new streaming parquet source
#21580 merged
Mar 4, 2025 -
feat(rust): Implement `unpack_dtypes()` functionality with unit tests
#21574 merged
Mar 4, 2025 -
feat(python,rust): Support engine callback for `LazyFrame.profile`
#21534 merged
Mar 4, 2025 -
feat: Dispatch new-streaming CSV negative slice to separate node
#21579 merged
Mar 4, 2025 -
fix: Fix CSV count with comment prefix skipped empty lines
#21577 merged
Mar 4, 2025 -
refactor: Prepare skeleton for partitioning sinks
#21536 merged
Mar 3, 2025 -
fix: New streaming IPC enum scan
#21570 merged
Mar 3, 2025 -
perf: Pre-sort groups in group-by-dynamic
#21569 merged
Mar 3, 2025 -
refactor(rust): Add SeriesBuilder and DataFrameBuilder
#21567 merged
Mar 3, 2025 -
feat: Add NDJSON source to new streaming engine
#21562 merged
Mar 3, 2025 -
fix: Several aspects related to ParquetColumnExpr
#21563 merged
Mar 3, 2025 -
docs: Add cloud api reference to Ref guide
#21566 merged
Mar 3, 2025 -
fix: Don't hit parquet::pre-filtered in case of pre-slice
#21565 merged
Mar 3, 2025 -
ci: bump crate-ci/typos from 1.29.0 to 1.30.0 in the ci group
#21542 merged
Mar 3, 2025 -
build: bump the rust group across 1 directory with 10 updates
#21561 merged
Mar 3, 2025 -
feat(python): Support passing `token` in `storage_options` for GCP cloud
#21560 merged
Mar 3, 2025 -
Python Polars 1.24
#21555 merged
Mar 2, 2025 -
feat: Add lossy decoding to `read_csv` for non-utf8 encodings
#21433 merged
Mar 2, 2025 -
docs: Fix typo
#21554 merged
Mar 2, 2025 -
fix: Categorical min/max panicking when string cache is enabled
#21552 merged
Mar 2, 2025 -
docs(python): Correct typos and grammar in Python docstrings
#21524 merged
Mar 1, 2025
3 Pull requests opened by 3 people
-
perf: Add `.list.struct_field` to obtain fields in lists of structs
#21556 opened
Mar 2, 2025 -
docs: Add polars-streaming-csv-decompression IO plugin
#21628 opened
Mar 6, 2025 -
perf: Allow elementwise functions in recursive lowering
#21653 opened
Mar 7, 2025
28 Issues closed by 9 people
-
PanicException during concat_str with Struct and String
#21650 closed
Mar 8, 2025 -
Expression dependency, injection, and introspection: List columns referenced by an expression.
#21655 closed
Mar 8, 2025 -
comm_subplan_elim with concat can be very slow to optimize and slow to run
#21637 closed
Mar 7, 2025 -
consolidate _utils and utils in py-polars/polars
#21625 closed
Mar 7, 2025 -
Hang (deadlock?) with horizontal concat and filter
#21616 closed
Mar 7, 2025 -
add df methods that are just like `.item()` but indicate datatype
#21635 closed
Mar 6, 2025 -
Add compat_level to sink_ipc
#19506 closed
Mar 6, 2025 -
New streaming quietly ignores failing source when sinking
#21527 closed
Mar 6, 2025 -
`pl.read_csv()` does not enforce `pl.Enum()` columns in schema argument
#20251 closed
Mar 6, 2025 -
Add 'snappy' compression option for `write_ipc` method
#21594 closed
Mar 6, 2025 -
Unable to deserialize `PyLazyFrame` and `PyExpr` back to Python from Rust since py-polars 1.18
#21605 closed
Mar 5, 2025 -
Deadlock in LazyFrame.explain() when using scan_csv on S3. Author thinks it's the GIL
#21482 closed
Mar 5, 2025 -
PanicException when combining filter, negative shift, and literal columns
#21581 closed
Mar 5, 2025 -
PanicException when printing/storing DataFrame after GroupBy with new Streaming engine
#21593 closed
Mar 5, 2025 -
Native Hex String Encoding in Polars
#21592 closed
Mar 4, 2025 -
More string sub-types to better cover the arrow standard
#21591 closed
Mar 4, 2025 -
Support GPU execution engine in LazyFrame profiler
#20039 closed
Mar 4, 2025 -
CSV select(len) does not count empty lines when comment prefix specified
#21576 closed
Mar 4, 2025 -
`scan_ipc` broken for Enum on new streaming engine
#21564 closed
Mar 3, 2025 -
Add NDJSON source for new-streaming engine
#21356 closed
Mar 3, 2025 -
Parquet `parallel=prefiltered` does not support `pre_slice`s
#21558 closed
Mar 3, 2025 -
Cannot pass GCP access token in storage_options (Google Cloud)
#13138 closed
Mar 3, 2025 -
[scan_parquet] Add the token to storage_options directly when trying to access to the GCS
#21417 closed
Mar 3, 2025 -
Simple operation gives the max hexadecimal for U32
#21540 closed
Mar 2, 2025 -
Documentation for polars.Expr.is_in / polars.Expr.ise seems broken.
#21539 closed
Mar 2, 2025 -
Missing thirdparty license information in PyPI Wheel Package
#21398 closed
Mar 2, 2025
26 Issues opened by 22 people
-
StructFieldNotFoundError not raised when accessing an invalid field in a list of structs
#21658 opened
Mar 8, 2025 -
.unique().collect(new_streaming=True) gives inconsistent results
#21654 opened
Mar 7, 2025 -
set_ascii_tables=True outputs non-ASCII characters
#21652 opened
Mar 7, 2025 -
resolve large enum variants
#21651 opened
Mar 7, 2025 -
Remove `engine="old-streaming"`
#21648 opened
Mar 7, 2025 -
Add `LazyFrame.branch()`
#21645 opened
Mar 7, 2025 -
Polars plugins should be referenced by .venv relative path
#21643 opened
Mar 7, 2025 -
Doing an empty aggregation may panic depending on the groups and the number of rows
#21634 opened
Mar 6, 2025 -
Can't sort dataframe by multiple columns if any sort column is list or array
#21633 opened
Mar 6, 2025 -
lazy optimization to move filter on enum values before casting to enum
#21615 opened
Mar 5, 2025 -
improve `null values in "by" column are not yet supported in "interpolate_by" expression` error message
#21614 opened
Mar 5, 2025 -
Add `to_binary` methods for integer types
#21613 opened
Mar 5, 2025 -
pl.Expr.get(0) behaves differently from pl.Expr.first()
#21610 opened
Mar 5, 2025 -
ppc64cle installer on conda-forge
#21608 opened
Mar 5, 2025 -
dt.offset_by fails if intermediate timestamp is non-existent
#21604 opened
Mar 5, 2025 -
df.write_parquet with partition_by should have a option to exclude partition columns in the output file
#21602 opened
Mar 5, 2025 -
Pow operator takes the type of a literal base over the type of the column exponent
#21596 opened
Mar 5, 2025 -
pl.lit no longer preserve tzinfo for datetime object
#21595 opened
Mar 4, 2025 -
What should happen when `ColumnNameOrSelector` receives an expression?
#21585 opened
Mar 4, 2025 -
Flaky test categorical_min_max
#21583 opened
Mar 4, 2025 -
PanicException when calling pl.repeat in group_by of empty frame
#21578 opened
Mar 4, 2025 -
Mistyped return type annotation for dtypes
#21575 opened
Mar 4, 2025 -
`scan_delta` does not accept `pathlib.Path`
#21572 opened
Mar 3, 2025 -
Silent failures with deeply nested rolling operations in lazy mode
#21571 opened
Mar 3, 2025 -
Include field names in field-related error messages
#21568 opened
Mar 3, 2025 -
`.get().dt.truncate()` throws SchemaMismatch error
#21553 opened
Mar 1, 2025
40 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
feat(rust,python): Add a config option to specify the default engine to attempt to use during lazyframe `collect` calls
#20717 commented on
Mar 7, 2025 • 10 new comments -
feat(python): Support pre_execution_query parameter from connectorx
#21288 commented on
Mar 6, 2025 • 4 new comments -
fix: Support List/Array in search_sorted(), and fix edge case where length 1 Series of array couldn't use index_of()
#21266 commented on
Mar 3, 2025 • 1 new comment -
Tracking issue for missing methods in `arr` namespace
#21302 commented on
Mar 6, 2025 • 0 new comments -
Error when writing Parquet from scan_csv with encoded files (iso-8859-1 to utf-8)
#20834 commented on
Mar 6, 2025 • 0 new comments -
Enable `new_streaming` under `collect_async` (python)
#21531 commented on
Mar 6, 2025 • 0 new comments -
Panic in CSV serializer
#20273 commented on
Mar 6, 2025 • 0 new comments -
Add cast from pl.List(pl.UInt8) to pl.Binary
#21549 commented on
Mar 6, 2025 • 0 new comments -
Serialization / ipc_stream roundtrip errors with ComputeError: out-of-spec: ExpectedBuffer
#21163 commented on
Mar 6, 2025 • 0 new comments -
Set bit-size of int dtype of .hash()
#4111 commented on
Mar 7, 2025 • 0 new comments -
pl.len() high memory usage
#21469 commented on
Mar 7, 2025 • 0 new comments -
read_database since 1.21 with async oracledb throwing "got Future <Future pending> attached to a different loop"
#21263 commented on
Mar 7, 2025 • 0 new comments -
Cannot call `.to_arrow()` on series on StructArray with offset
#19612 commented on
Mar 7, 2025 • 0 new comments -
`failed to determine supertype of datetime[ns, UTC] and datetime[ns]` for when-then-otherwise expression that yields NULL
#12959 commented on
Mar 7, 2025 • 0 new comments -
Move `assert_series_equal` and `assert_frame_equal` implementation to Rust
#21388 commented on
Mar 7, 2025 • 0 new comments -
rust-polars SQL read and write
#3540 commented on
Mar 7, 2025 • 0 new comments -
Tracking issue for the new streaming engine
#20947 commented on
Mar 8, 2025 • 0 new comments -
`group_by().agg()` may cause panic when the number of rows in the original data frame is 15 or more
#19337 commented on
Mar 8, 2025 • 0 new comments -
fix(rust): Too-strict SQL UDF schema validation
#20202 commented on
Mar 7, 2025 • 0 new comments -
fix: Allow init from BigQuery Arrow data containing ExtensionType cols with irrelevant metadata
#21492 commented on
Mar 4, 2025 • 0 new comments -
Update with a single value with object dtype column PanicException
#21547 commented on
Mar 1, 2025 • 0 new comments -
`scan_parquet` push down hive filtering breaks on categorical columns
#21532 commented on
Mar 1, 2025 • 0 new comments -
row-wise `is_in` computations do not propagate nulls
#21485 commented on
Mar 2, 2025 • 0 new comments -
assert_series_equal and assert_frame_equal are inconsistent
#18389 commented on
Mar 3, 2025 • 0 new comments -
Tracking issue for Polars Cloud
#21487 commented on
Mar 3, 2025 • 0 new comments -
Categorical hash only hashes underlying physical
#21533 commented on
Mar 3, 2025 • 0 new comments -
`hash()` list of string PanicException
#21523 commented on
Mar 3, 2025 • 0 new comments -
`scan_parquet` + `filter` on `S3` with Hive schema `pl.Date` breaks
#21526 commented on
Mar 3, 2025 • 0 new comments -
`with_columns(dict)` silently ignores dictionary values
#17879 commented on
Mar 3, 2025 • 0 new comments -
Allow to use `repeat_by` with lists
#21151 commented on
Mar 3, 2025 • 0 new comments -
Add binary string slicing
#21514 commented on
Mar 4, 2025 • 0 new comments -
Parquet scanner doesn't do predicate pushdown for categoricals/enums
#18868 commented on
Mar 4, 2025 • 0 new comments -
`rolling` and `agg` on time with an empty `DataFrame` throws an `InvalidOperationError`
#21457 commented on
Mar 5, 2025 • 0 new comments -
`scan_parquet` runs forever on AWS airflow - version 1.17
#20330 commented on
Mar 5, 2025 • 0 new comments -
Typing error with SchemaDict
#14468 commented on
Mar 5, 2025 • 0 new comments -
Cache HTTP requests
#15649 commented on
Mar 5, 2025 • 0 new comments -
write_database is Significantly slower than .to_pandas().to_sql()
#7852 commented on
Mar 5, 2025 • 0 new comments -
`polars` and `polars-lts-cpu` as package dependencies
#12880 commented on
Mar 5, 2025 • 0 new comments -
Allow arithmetic between `Time` and `Duration` columns
#7972 commented on
Mar 5, 2025 • 0 new comments -
Request for skim function in R
#20204 commented on
Mar 5, 2025 • 0 new comments