Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inital stab at lazy data frame scan for R #164

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

hannes
Copy link
Member

@hannes hannes commented May 8, 2024

duckdb:::sql('from iris')

now "just" works

CC @krlmlr

@hannes
Copy link
Member Author

hannes commented May 8, 2024

Needs still tests and some protect stuff I guess

Copy link
Collaborator

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice!

return nullptr;
}
if (TYPEOF(df) == PROMSXP) {
df = Rf_eval(df, rho);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpp11 has a global protection mechanism, but how to control the lifetime?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed lifetime is a problematic issue. For example, once someone creates a view of this and then deletes the df and gc runs we might be in trouble.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, support for lifetime control has landed via duckdb/duckdb#11761, which is why we now have conflicts here?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already had support for lifetime control, but it was broken in a couple ways

We use the ExternalDependency class for lifetime control, but this wasn't being kept alive long enough, which this PR aimed to address.

We changed the prototype of the replacement scan, we changed the const string &table_name into ReplacementScanInput

I have a follow up PR scheduled that removes the newly added table_ref &TableRef from this struct though
Instead we will collect replacement scans inside QueryRelation::Bind (virtual overload) and push these replacement scans as CTEs onto the parsed QueryNode stored inside the QueryRelation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Thijs. May I get back to you for help with lifetime control after that follow-up PR has made it into the R package? I don't find that PR in https://github.com/duckdb/duckdb/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-desc+author%3ATishj.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tishj: how would I go about controlling the lifetime of the bound object in this PR?

src/register.cpp Show resolved Hide resolved
return nullptr;
}
// TODO: do utf conversion
auto table_function = make_uniq<TableFunctionRef>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps subclass TableFunctionRef ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants