with_context not present for pl.DataFrame #14775

Open
h4ck4l1 opened this issue Feb 29, 2024 · 5 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@h4ck4l1

h4ck4l1 commented Feb 29, 2024

Description

Hello polars team,
Big fan. I just noticed that LazyFrames have the method with_context while eager DataFrames do not.
I just wanted to know the reason behind that.
Thank you!

h4ck4l1 added the enhancement label Feb 29, 2024
@h4ck4l1
Author

h4ck4l1 commented Feb 29, 2024

It makes no difference to me, though, as it's only a matter of calling .lazy() before using .with_context. I am just curious.

@cmdlineluser
Contributor

I think .with_context existed as a workaround for the lack of a horizontal concat on LazyFrames.

That was recently added:

I recall reading in one of the previous issues that .with_context may end up being deprecated now?

@mcrumiller
Contributor

mcrumiller commented Feb 29, 2024

> I recall reading in one of the previous issues that .with_context may end up being deprecated now?

I don't think it should, with_context allows for aggregate operations on different sized frames, which is nice:

>>> import polars as pl
>>> ldf1 = pl.LazyFrame({"a": [1, 2, 3]})
>>> ldf2 = pl.LazyFrame({"b": [1, 2, 3, 4]})
>>> ldf1.with_context(ldf2).select(pl.all().sum()).collect()
shape: (1, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 6   ┆ 10  │
└─────┴─────┘

@h4ck4l1
Author

h4ck4l1 commented Feb 29, 2024

Oh, I didn't know that. So it was a workaround.
Bro, doesn't .with_context have its own unique advantages?
Let me give you an example where I am kind of struggling without .with_context on DataFrames/LazyFrames.

Say I want to build a mask from a different DataFrame and plot it directly:

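import numpy as np
import polars as pl

# Frame A: 1000 sequential ids with a random string column.
A = (
    pl.DataFrame({
        "id": np.arange(1000),
        "some_random_strcol": np.random.choice(a=["a", "b", "c"], size=1000),
    })
)

# Frame B: 500 distinct ids sampled from A's id range, sorted by id.
B = (
    pl.DataFrame({
        "id": np.random.choice(1000, size=500, replace=False),
        "some_random_strcol": np.random.choice(a=["a", "b", "c"], size=500),
    })
    .sort(by="id")
)

# Expose B's columns (suffixed with "_b") to A's query,
# then flag which ids of A also appear in B.
(
    A
    .lazy()
    .with_context(
        B.lazy().select(pl.all().name.suffix("_b"))
    )
    .select(
        pl.col("id"),
        pl.col("id").is_in(pl.col("id_b")).cast(pl.Int8).alias("new_col"),
    )
    .collect()
    # Plotting needs a plotting backend installed alongside Polars.
    .plot.line(
        x="id",
        y="new_col",
    )
)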

Is there a different alternative? I learnt .with_context in a Udemy course, and it's quite helpful when I want to use another DataFrame's columns for a quick visualization (though they didn't teach that).
The two frames have different sizes, so .with_context removes the need to select/extract Series from both frames and operate on them.

This is why I was curious as to why eager frames don't have .with_context. It's such a unique and novel feature compared to pandas.

@h4ck4l1
Author

h4ck4l1 commented Feb 29, 2024

> I recall reading in one of the previous issues that .with_context may end up being deprecated now?
>
> I don't think it should, with_context allows for aggregate operations on different sized frames, which is nice:
>
> >>> import polars as pl
> >>> ldf1 = pl.LazyFrame({"a": [1, 2, 3]})
> >>> ldf2 = pl.LazyFrame({"b": [1, 2, 3, 4]})
> >>> ldf1.with_context(ldf2).select(pl.all().sum()).collect()
> shape: (1, 2)
> ┌─────┬─────┐
> │ a   ┆ b   │
> │ --- ┆ --- │
> │ i64 ┆ i64 │
> ╞═════╪═════╡
> │ 6   ┆ 10  │
> └─────┴─────┘

Yes, exactly. And .with_context should also be extended to pl.DataFrame if there are no downsides to it.


3 participants