think about ways to let the user bias towards memory-efficient impls/algos #194

dhalperi · 2014-08-22T20:07:42Z

In general, we should offer slower techniques that we expect to use much less memory.

One simple idea would be to replace every stateful operator that runs in memory in Java (GroupBy, Join) with a Store followed by a QueryScan. This way all stateful operations [1,2] happen inside the backend DBMS.
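To make the rewrite concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `Op` class, the `push_stateful_to_dbms` function, and the `tmp_N` table names are not Myria's actual classes or naming scheme. It walks a toy plan tree and replaces each GroupBy or Join with a QueryScan whose inputs have first been written out by Store operators.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List


@dataclass
class Op:
    """Toy plan node (hypothetical; not a real Myria operator class)."""
    name: str
    children: List["Op"] = field(default_factory=list)
    detail: str = ""


# Operators assumed to keep state in memory, per the discussion above.
STATEFUL = {"GroupBy", "Join"}


def push_stateful_to_dbms(op, temp_ids=None):
    """Return a copy of the plan with stateful operators moved into the DBMS.

    Each input of a stateful operator is materialized with a Store, and the
    operator itself becomes a QueryScan over the stored tables. The children
    of the QueryScan model "must finish first" dependencies.
    """
    if temp_ids is None:
        temp_ids = count()
    children = [push_stateful_to_dbms(c, temp_ids) for c in op.children]
    if op.name not in STATEFUL:
        return Op(op.name, children, op.detail)
    stores, tables = [], []
    for child in children:
        table = f"tmp_{next(temp_ids)}"  # hypothetical temp-table name
        tables.append(table)
        stores.append(Op("Store", [child], table))
    query = f"{op.name}({op.detail}) over {', '.join(tables)}"
    return Op("QueryScan", stores, query)


plan = Op("Apply", [Op("Join", [Op("Scan", detail="R"),
                                Op("Scan", detail="S")], "R.x = S.x")])
rewritten = push_stateful_to_dbms(plan)
```

In this sketch the Apply stays where it is, while the Join turns into a QueryScan fed by two Stores. Generating the actual SQL text for the QueryScan is left out.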

[1] At a cursory level, it seems hard to push StatefulApply and UserDefinedAggregate into the database -- we would need a way to compile them into window functions and the DBMS-specific UDA language. However, @billhowe has ideas, at least about UDAs for Postgres.
[2] The other potential challenge might be pushing Myria semantics into the database expression language, for example floating-point vs. integer division. I believe we should be able to do this via appropriate insertion of cast operators.
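As a small illustration of the cast idea in [2], the sketch below uses SQLite (via Python's standard `sqlite3` module) as a stand-in for the backend DBMS. The `divide_sql` helper is a hypothetical name, and the sketch only covers one direction: forcing floating-point division when both operands are integers. Which semantics Myria actually wants for a given expression is not decided here.

```python
import sqlite3


def divide_sql(left, right, want_float):
    """Render `left / right` as SQL text (hypothetical helper).

    When floating-point division is wanted, cast the left operand so the
    database does not fall back to integer division for integer inputs.
    """
    if want_float:
        return f"CAST({left} AS REAL) / {right}"
    return f"{left} / {right}"


conn = sqlite3.connect(":memory:")
# Without a cast, SQLite divides two integers as integers.
plain = conn.execute("SELECT " + divide_sql("7", "2", want_float=False)).fetchone()[0]
# With the inserted cast, the same expression is evaluated in floating point.
casted = conn.execute("SELECT " + divide_sql("7", "2", want_float=True)).fetchone()[0]
conn.close()
```

Here `plain` comes back as 3 and `casted` as 3.5. Other backends may need a different target type name in the cast.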

dhalperi · 2014-08-22T20:08:22Z

Another good idea is to let the user supply hints about physical operations. E.g., maybe they choose a SymmetricHashJoin vs a MergeJoin vs a RightHashJoin.
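One way such hints could be resolved is sketched below in Python. The `choose_join` function, the shape of the `hints` mapping, and the choice of default are all assumptions made for illustration; only the three operator names come from this thread.

```python
# Join implementations mentioned in this thread.
JOIN_IMPLS = ("SymmetricHashJoin", "MergeJoin", "RightHashJoin")


def choose_join(join_id, hints=None, default="SymmetricHashJoin"):
    """Pick a physical join for one logical join (hypothetical helper).

    `hints` maps a logical join identifier to the implementation the user
    asked for. Joins without a hint get `default`, which is an assumed
    default rather than a statement about the optimizer's real behavior.
    """
    hint = (hints or {}).get(join_id)
    if hint is None:
        return default
    if hint not in JOIN_IMPLS:
        raise ValueError(f"unknown join implementation: {hint!r}")
    return hint


user_hints = {"join_1": "MergeJoin"}
first = choose_join("join_1", user_hints)   # the hint is honored
second = choose_join("join_2", user_hints)  # no hint, so the default applies
```

Rejecting unknown names up front keeps a mistyped hint from being silently ignored.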

billhowe · 2014-08-22T20:19:25Z

+1

Seems like an easy and relatively elegant way of getting out-of-core
processing for (almost) any plan.

And, as we move forward with making use of materialized results, all these
extra stored copies of data will not go to waste. If I keep running small
variants of the same program over and over, the partitionings we need will
tend to already be created.
