think about ways to let the user bias towards memory-efficient impls/algos #194

Open
dhalperi opened this issue Aug 22, 2014 · 2 comments

@dhalperi
Member

In general, we should offer techniques that are slower but that we believe will use much less memory.

One simple idea would be to replace every in-memory Java stateful operator (GroupBy, Join) with a Store followed by a QueryScan. This way all stateful operations [1,2] happen inside the backend DBMS.
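
A minimal sketch of the idea, not Myria code: SQLite (from the Python standard library) stands in for the backend DBMS, and the function and table names (`group_sum_in_memory`, `group_sum_via_store_and_scan`, `groupby_input`) are hypothetical. The first function keeps the aggregation state in process memory; the second materializes the input ("Store") and lets the database do the grouping ("QueryScan").

```python
import sqlite3
from typing import Iterable, List, Tuple

Row = Tuple[str, int]


def group_sum_in_memory(rows: Iterable[Row]) -> List[Row]:
    """Stateful in-process GroupBy: the dict grows with the number of groups."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


def group_sum_via_store_and_scan(rows: Iterable[Row],
                                 conn: sqlite3.Connection) -> List[Row]:
    """Same result, but the grouping state lives inside the database."""
    # "Store": stream the input tuples into a temporary relation.
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS groupby_input (k TEXT, v INTEGER)")
    conn.execute("DELETE FROM groupby_input")
    conn.executemany("INSERT INTO groupby_input VALUES (?, ?)", rows)
    # "QueryScan": read back the result of a query the DBMS evaluates.
    cursor = conn.execute(
        "SELECT k, SUM(v) FROM groupby_input GROUP BY k ORDER BY k")
    return cursor.fetchall()
```

Both functions return the same rows for the same input; the second trades speed for letting the DBMS decide how to spill to disk.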

[1] At a cursory level, it seems hard to push StatefulApply and UserDefinedAggregate into the database: we would need a way to compile them into window functions and the DBMS-specific UDA language. However, @billhowe has ideas, at least about UDAs for Postgres.
[2] The other potential challenge is pushing Myria semantics into the database expression language, e.g. floating-point vs. integer division. I believe we can handle this by inserting cast operators where appropriate.
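
To illustrate the cast-insertion point in [2], here is a small sketch using SQLite as a stand-in backend. The helper `divide_sql` is hypothetical, and which division semantics Myria actually requires is not settled here; the sketch only shows that the same expression can yield integer or floating-point division depending on whether a cast is inserted.

```python
import sqlite3


def divide_sql(numerator: str, denominator: str, floating: bool) -> str:
    """Build a SQL division expression, optionally forcing float division."""
    if floating:
        # Casting one operand to REAL makes the backend divide in floating point.
        return f"CAST({numerator} AS REAL) / {denominator}"
    # Without a cast, two integer operands use the backend's integer division.
    return f"{numerator} / {denominator}"


conn = sqlite3.connect(":memory:")
plain = conn.execute(
    f"SELECT {divide_sql('7', '2', floating=False)}").fetchone()[0]
casted = conn.execute(
    f"SELECT {divide_sql('7', '2', floating=True)}").fetchone()[0]
print(plain, casted)  # prints "3 3.5" on SQLite
```

Other backends may differ in their default division behavior, so the generated casts would likely need to be chosen per DBMS.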

@dhalperi
Member Author

Another good idea is to let the user supply hints about physical operators. E.g., they might choose a SymmetricHashJoin vs. a MergeJoin vs. a RightHashJoin.
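
One way such a hint could be resolved, as a rough sketch only: the `JoinImpl` enum, the `choose_join` function, and the choice of default are all assumptions made for illustration and do not describe an existing Myria interface.

```python
from enum import Enum
from typing import Optional


class JoinImpl(Enum):
    """Physical join operators a user hint may name."""
    SYMMETRIC_HASH_JOIN = "SymmetricHashJoin"
    MERGE_JOIN = "MergeJoin"
    RIGHT_HASH_JOIN = "RightHashJoin"


def choose_join(hint: Optional[str],
                default: JoinImpl = JoinImpl.SYMMETRIC_HASH_JOIN) -> JoinImpl:
    """Return the join named by the hint, or the default if no hint is given."""
    if hint is None:
        return default
    try:
        return JoinImpl(hint)
    except ValueError:
        valid = [impl.value for impl in JoinImpl]
        raise ValueError(
            f"unknown join hint {hint!r}; expected one of {valid}") from None
```

Rejecting unknown hints loudly, rather than silently falling back to the default, keeps a misspelled hint from hiding a memory-hungry plan.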

@billhowe
Contributor

+1

Seems like an easy and relatively elegant way of getting out-of-core
processing for (almost) any plan.

And, as we move forward with making use of materialized results, all these
extra stored copies of data will not go to waste. If I keep running small
variants of the same program over and over, the partitionings we need will
tend to already be created.

