Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass down backend to lazy series and enable re_named_captures/2 usage #896

Merged
merged 7 commits into from Apr 21, 2024

Conversation

philss
Copy link
Member

@philss philss commented Apr 17, 2024

This is a way to enable the usage of backend functions for series inside the context of Query (in the LazySeries implementation). With that, we can use re_named_captures/2 in the functions that are using lazy series.

This is related to the following discussion: #895 (comment)

The idea is to simplify and validate the dtypes for lazy series.
The idea is to enable using it in case of any requirement.
@@ -73,10 +74,21 @@ defmodule Explorer.Backend.LazyFrame do
@impl true
def pull(df, column) do
dtype_for_column = df.dtypes[column]
series_backend = get_series_backend(df.data.original_data)

data = LazySeries.new(:column, [column], dtype_for_column)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we should probably pass the backend as a required argument here. We need to make sure we always pass it forward, otherwise something like re_named_captures(s.foo <> s.bar, ~S"...") won't work if we forget to pass the backend inside? Would this be too large of a change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm passing down based on the args - see the new/3 function at LazySeries -, which will mostly contain a lazy series. The only exceptions are from_list/2 and from_binary/2. I honestly don't know how to solve this, because there is no point of reference to get the backend. I was thinking about getting the default backend for them. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the default in c708f46


Logger.warning(
"cannot get backend from LazySeries. Using the default one: #{inspect(backend)}"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this be the case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this function I couldn't find a way to reproduce the exception. But if a series operation that accepts scalar values for all its arguments is called, it will call "from_list", and it will raise here. But this is unlikely to happen IMHO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log*

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we raise instead then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it makes sense! I will update it later today :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Co-authored-by: José Valim <jose.valim@dashbit.co>
@philss philss merged commit 8e95c47 into main Apr 21, 2024
4 checks passed
@philss philss deleted the ps-pass-down-backend-to-lazy-series branch April 21, 2024 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants