[ZEPPELIN-3545] save all tables to ResourcePool #3024

Savalek · 2018-06-15T10:55:52Z

What is this PR for?

Currently, if a paragraph's output contains more than one table, ResourcePool saves only the last table.
It would be desirable for ResourcePool to store all of the tables.
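
To make the difference concrete, here is a minimal sketch of how a "last table wins" result can arise, assuming every table is stored under one fixed resource name. `ToyResourcePool`, its key layout, and the `table` / `table_N` names are hypothetical stand-ins for illustration, not Zeppelin's actual ResourcePool API.

```python
class ToyResourcePool:
    """Toy stand-in for a resource pool (NOT Zeppelin's ResourcePool API).

    Resources are kept in a dict keyed by (note_id, paragraph_id, name).
    """

    def __init__(self):
        self._resources = {}

    def put(self, note_id, paragraph_id, name, value):
        self._resources[(note_id, paragraph_id, name)] = value

    def get(self, note_id, paragraph_id, name):
        return self._resources[(note_id, paragraph_id, name)]

    def names(self, note_id, paragraph_id):
        return sorted(
            name
            for (nid, pid, name) in self._resources
            if (nid, pid) == (note_id, paragraph_id)
        )


def save_last_only(pool, note_id, paragraph_id, tables):
    # Every table goes under the same name, so each put overwrites the previous one.
    for table in tables:
        pool.put(note_id, paragraph_id, "table", table)


def save_all(pool, note_id, paragraph_id, tables):
    # Adding the result index to the name keeps each table addressable.
    for index, table in enumerate(tables):
        pool.put(note_id, paragraph_id, f"table_{index}", table)
```

With three tables in one paragraph, `save_last_only` leaves a single entry holding the third table, while `save_all` leaves three entries, which is also where the memory concern raised later in this thread comes from.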

What type of PR is it?

Improvement

What is the Jira issue?

ZEPPELIN-3545

Screenshots

Questions:

Do the license files need an update? no
Are there breaking changes for older versions? no
Does this need documentation? no

zjffdu · 2018-06-21T04:00:47Z

Thanks @Savalek for this contribution, but I don't think putting all tables into ResourcePool makes sense, as it would occupy a lot of memory. I plan to introduce paragraph-level properties (ZEPPELIN-3348) so that users can control whether the interpreter result is put into ResourcePool.

mebelousov · 2018-06-21T05:58:44Z

@zjffdu thank you for working on improving ResourcePool.

Could you please share your vision of how it would work? For example, suppose a paragraph has 5 table results. How would the user define which of them are added to ResourcePool?

zjffdu · 2018-06-21T06:24:07Z

I plan to introduce a paragraph property that indicates whether the result should be put into ResourcePool. I think most of the time people don't want to save the result into ResourcePool, so it doesn't make sense to save it there by default. The following is what I imagine.

%spark(saveToResourcePool=true)

...
spark code
...
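
As a rough illustration of the opt-in behaviour described above, the sketch below reads a `key=value` property string and saves only when `saveToResourcePool` is explicitly `true`. The helper names `parse_props` and `should_save`, and the comma-separated property format, are assumptions made for this example and are not taken from the Zeppelin code base.

```python
def parse_props(props_text):
    """Parse 'k1=v1,k2=v2' into a dict of strings (hypothetical property format)."""
    props = {}
    for item in props_text.split(","):
        if not item.strip():
            continue  # tolerate empty input and stray commas
        key, _, value = item.partition("=")
        props[key.strip()] = value.strip()
    return props


def should_save(props):
    # Default is False: the result is stored only when explicitly requested.
    return props.get("saveToResourcePool", "false").lower() == "true"
```

Under these assumptions, a paragraph with no properties would skip ResourcePool entirely, so existing notes would not pay the memory cost unless the user asks for it.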

Regarding your scenario of multiple tables, I am not sure what the exact scenario is, but at least we could introduce more fine-grained properties to control that. It would be better if you shared your real scenario, so that we can see which approach is better.

mebelousov · 2018-06-25T14:24:26Z

@zjffdu
I support adding only selected table results to Resource Pool.
Since a paragraph can have multiple results, I propose adding result-level properties.

zjffdu · 2018-06-26T00:39:28Z

@mebelousov There are many options for specifying which results should be stored in the resource pool.
e.g.

%spark(saveToResourcePool=1,2,4)

Or

%spark(1.saveToResourcePool=true, 2.saveToResourcePool=true, 4.saveToResourcePool=true)

We can discuss further which approach is best; the key point here is to allow users to customize it via paragraph-level properties.
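
To compare the two candidate syntaxes, here is a small sketch that turns each header form into a set of 1-based result indices. The regular expression and the function names (`header_props`, `indices_from_list`, `indices_from_prefixed`) are hypothetical; neither syntax is an existing Zeppelin feature as far as this thread states.

```python
import re

# Hypothetical pattern for a paragraph header such as "%spark(key=value, ...)".
HEADER = re.compile(r"^%(?P<interp>[\w.]+)\((?P<props>.*)\)\s*$")


def header_props(header):
    """Return the raw text between the parentheses of a paragraph header."""
    match = HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a paragraph header with properties: {header!r}")
    return match.group("props")


def indices_from_list(header):
    """First form: saveToResourcePool=1,2,4 (one property, comma-separated indices)."""
    key, _, value = header_props(header).partition("=")
    if key.strip() != "saveToResourcePool":
        return set()
    return {int(part) for part in value.split(",") if part.strip()}


def indices_from_prefixed(header):
    """Second form: N.saveToResourcePool=true (one property per result index)."""
    selected = set()
    for prop in header_props(header).split(","):
        key, _, value = prop.partition("=")
        index, _, name = key.strip().partition(".")
        if name == "saveToResourcePool" and value.strip().lower() == "true":
            selected.add(int(index))
    return selected
```

One design point the sketch surfaces: in the first form the comma separates indices, while in the second it separates properties, so the first form would be ambiguous if other properties shared the same header. The second form is more verbose but parses unambiguously.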

[ZEPPELIN-3545] save all tables to ResourcePool

2209e8d

Savalek closed this Jun 27, 2018

Savalek deleted the ZEPPELIN-3545 branch January 11, 2019 09:19
