Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

polars join panics if an empty list of columns to join is passed. #12732

Open
maxim-uvarov opened this issue May 2, 2024 · 3 comments
Open
Assignees
Labels
🐛 bug Something isn't working dataframe issues related to the dataframe implementation needs-triage An issue that hasn't had any proper look

Comments

@maxim-uvarov
Copy link
Contributor

Describe the bug

When polars join is used in scripts, there are situations where the empty list of columns will be passed as arguments. Currently, polars panics in such cases, but it would be better to provide an informative message indicating that the columns list is empty.

❯ [[a]; [0] [1]] | polars into-df | polars join ([[a]; [0] [1]] | polars into-df) [] []
thread 'plugin runner (primary)' panicked at /Users/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-core-0.39.2/src/chunked_array/ops/sort/arg_sort_multiple.rs:125:15:
index out of bounds: the len is 0 but the index is 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error:   × Failed to receive response to plugin call from `polars`
   ╭─[entry #3:1:35]
 1 │ [[a]; [0] [1]] | polars into-df | polars join ([[a]; [0] [1]] | polars into-df) [] []
   ·                                   ─────┬─────
   ·                                        ╰── while waiting for this operation to complete
   ╰────
  help: try restarting the plugin with `plugin use 'polars'`

How to reproduce

[[a]; [0] [1]] | polars into-df | polars join ([[a]; [0] [1]] | polars into-df) [] []

Expected behavior

to inform the user that the empty list of columns was passed

Screenshots

No response

Configuration

key value
version 0.93.0
major 0
minor 93
patch 0
branch
commit_hash
build_os macos-aarch64
build_target aarch64-apple-darwin
rust_version rustc 1.77.2 (25ef9e3d8 2024-04-09)
rust_channel 1.77.2-aarch64-apple-darwin
cargo_version cargo 1.77.2 (e52e36006 2024-03-26)
build_time 2024-05-01 04:01:20 +00:00
build_rust_channel release
allocator mimalloc
features dataframe, default, sqlite, system-clipboard, trash, which
installed_plugins explore, image, inc, polars

Additional context

No response

@maxim-uvarov maxim-uvarov added the needs-triage An issue that hasn't had any proper look label May 2, 2024
@sholderbach sholderbach added 🐛 bug Something isn't working dataframe issues related to the dataframe implementation labels May 2, 2024
@ayax79
Copy link
Contributor

ayax79 commented May 2, 2024

A couple of thoughts with this one:

  • The behavior of polars join is going to be changing shortly as I move things to lazy only
  • Given polar's love of panicking, this is going to be an issue with lazy functions. There won't be a way to easily guard against this. What probably needs to happen is that the plugin needs a panic handler.

@ayax79 ayax79 self-assigned this May 14, 2024
@ayax79
Copy link
Contributor

ayax79 commented May 14, 2024

Full stack:

󰎔 ❯ : [[a]; [0] [1]] | polars into-df | polars join ([[a]; [0] [1]] | polars into-df) [] [] | polars collect
thread 'plugin runner (primary)' panicked at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-core-0.39.2/src/chunked_array/ops/sort/arg_sort_multiple.rs:125:15:
index out of bounds: the len is 0 but the index is 0
stack backtrace:
   0: rust_begin_unwind
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panicking.rs:72:14
   2: core::panicking::panic_bounds_check
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panicking.rs:208:5
   3: polars_core::chunked_array::ops::sort::arg_sort_multiple::encode_rows_vertical_par_unordered_broadcast_nulls
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-core-0.39.2/src/chunked_array/ops/sort/arg_sort_multiple.rs:125:15
   4: polars_ops::frame::join::prepare_keys_multiple
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-ops-0.39.2/src/frame/join/mod.rs:482:9
   5: polars_ops::frame::join::DataFrameJoinOps::_join_impl
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-ops-0.39.2/src/frame/join/mod.rs:268:24
   6: polars_ops::frame::join::DataFrameJoinOps::_join_impl
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-ops-0.39.2/src/frame/join/mod.rs:153:24
   7: <polars_lazy::physical_plan::executors::join::JoinExec as polars_lazy::physical_plan::executors::executor::Executor>::execute::{{closure}}
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.39.2/src/physical_plan/executors/join.rs:148:22
   8: polars_lazy::physical_plan::state::ExecutionState::record
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.39.2/src/physical_plan/state.rs:120:21
   9: <polars_lazy::physical_plan::executors::join::JoinExec as polars_lazy::physical_plan::executors::executor::Executor>::execute
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.39.2/src/physical_plan/executors/join.rs:86:9
  10: polars_lazy::frame::LazyFrame::collect
             at /Users/jack.wright/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.39.2/src/frame/mod.rs:672:9
  11: nu_plugin_polars::dataframe::values::nu_lazyframe::NuLazyFrame::collect
             at /Users/jack.wright/src/nushell/nushell/crates/nu_plugin_polars/src/dataframe/values/nu_lazyframe/mod.rs:56:9
  12: <nu_plugin_polars::dataframe::lazy::collect::LazyCollect as nu_plugin::plugin::command::PluginCommand>::run
             at /Users/jack.wright/src/nushell/nushell/crates/nu_plugin_polars/src/dataframe/lazy/collect.rs:67:29
  13: nu_plugin::plugin::serve_plugin_io::{{closure}}::{{closure}}::{{closure}}
             at /Users/jack.wright/src/nushell/nushell/crates/nu-plugin/src/plugin/mod.rs:465:21
  14: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/core/src/panic/unwind_safe.rs:272:9
  15: std::panicking::try::do_call
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:554:40
  16: ___rust_try
  17: std::panicking::try
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panicking.rs:518:19
  18: std::panic::catch_unwind
             at /rustc/25ef9e3d85d934b27d9dada2f9dd52b1dc63bb04/library/std/src/panic.rs:142:14
  19: nu_plugin::plugin::serve_plugin_io::{{closure}}::{{closure}}
             at /Users/jack.wright/src/nushell/nushell/crates/nu-plugin/src/plugin/mod.rs:462:33
  20: nu_plugin::plugin::serve_plugin_io::{{closure}}::{{closure}}
             at /Users/jack.wright/src/nushell/nushell/crates/nu-plugin/src/plugin/mod.rs:495:21
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Error:   × Failed to receive response to plugin call from `polars`
   ╭─[entry #5:1:89]
 1 │ [[a]; [0] [1]] | polars into-df | polars join ([[a]; [0] [1]] | polars into-df) [] [] | polars collect
   ·                                                                                         ───────┬──────
   ·                                                                                                ╰── while waiting for this operation to complete
   ╰────
  help: try restarting the plugin with `plugin use 'polars'`

@ayax79
Copy link
Contributor

ayax79 commented May 14, 2024

Given that this happens because of a polars panic and it now happens when collecting the dataframe, it is impossible to improve the error message. What I was able to do is ensure the panic does not kill the plugin instance, thereby causing all cached state to be lost. Fix in #12866

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 bug Something isn't working dataframe issues related to the dataframe implementation needs-triage An issue that hasn't had any proper look
Projects
None yet
Development

No branches or pull requests

3 participants