Leverage disk for storing query states #7

DifferentSC · 2015-09-23T14:54:09Z

To process multiple queries on one machine, we need to store the state of each query. However, memory may not be large enough to hold all of the states. This can happen with queries such as online ML, which need to store many parameters.

The basic approach is to store a query's state on disk while the query is inactive, then reload it into memory and process when data comes in. However, reading from disk is slow, so processing time can increase. Because of that, we need to address the questions below.
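To make the basic approach concrete, here is a minimal sketch, not an actual implementation from this project. It assumes states are picklable, that query IDs are safe to use as file names, and that "inactive" can be approximated by least-recently-used order. The class name `SpillingStateStore` and its `capacity` parameter are hypothetical.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class SpillingStateStore:
    """Keeps at most `capacity` query states in memory and spills the rest to disk.

    Hypothetical sketch: eviction is least-recently-used, which is only one
    possible stand-in for "the query is inactive".
    """

    def __init__(self, capacity, directory=None):
        self.capacity = capacity
        self.directory = directory or tempfile.mkdtemp(prefix="query_state_")
        self.in_memory = OrderedDict()  # query_id -> state, least recent first
        self.disk_reads = 0  # counts reloads, to make the disk cost visible

    def _path(self, query_id):
        return os.path.join(self.directory, f"{query_id}.state")

    def _evict_if_needed(self):
        # Write the least recently used states to disk until we fit again.
        while len(self.in_memory) > self.capacity:
            victim_id, victim_state = self.in_memory.popitem(last=False)
            with open(self._path(victim_id), "wb") as f:
                pickle.dump(victim_state, f)

    def put(self, query_id, state):
        self.in_memory[query_id] = state
        self.in_memory.move_to_end(query_id)
        self._evict_if_needed()

    def get(self, query_id, default=None):
        if query_id in self.in_memory:
            self.in_memory.move_to_end(query_id)
            return self.in_memory[query_id]
        path = self._path(query_id)
        if not os.path.exists(path):
            return default
        # Slow path: the state was spilled, so reload it from disk.
        with open(path, "rb") as f:
            state = pickle.load(f)
        self.disk_reads += 1
        self.in_memory[query_id] = state
        self._evict_if_needed()
        return state
```

Every call to `get` that misses memory pays one disk read, which is the processing-time cost described above.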

Which data should stay in memory? Which criteria (query SLA, data arrival frequency, ...) should we use?
Can we apply batched processing in some cases? That is, buffer multiple data items and update the state in one go once the state is available in memory. This can reduce the number of disk reads/writes, but it will delay the state update.
Can we predict the data arrival time and apply pre-fetching for some queries?
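The batched-processing question can be sketched as follows. This is only an illustration under stated assumptions: `BatchedUpdater`, `load_state`, `save_state`, `update`, and `batch_size` are hypothetical names, and the load/save callables stand in for whatever disk access the system would really use.

```python
from collections import defaultdict


class BatchedUpdater:
    """Buffers incoming data per query and applies it to the state in one go.

    Hypothetical sketch: one load and one save per batch instead of one per
    data item, at the price of a delayed state update.
    """

    def __init__(self, load_state, save_state, update, batch_size):
        self.load_state = load_state    # query_id -> state (e.g. a disk read)
        self.save_state = save_state    # (query_id, state) -> None (e.g. a disk write)
        self.update = update            # (state, event) -> new state
        self.batch_size = batch_size
        self.pending = defaultdict(list)  # query_id -> buffered events

    def on_event(self, query_id, event):
        self.pending[query_id].append(event)
        if len(self.pending[query_id]) >= self.batch_size:
            self.flush(query_id)

    def flush(self, query_id):
        events = self.pending.pop(query_id, [])
        if not events:
            return
        state = self.load_state(query_id)   # single read for the whole batch
        for event in events:
            state = self.update(state, event)
        self.save_state(query_id, state)    # single write for the whole batch
```

With `batch_size=3`, three data items cost one read and one write rather than three of each, but the state stays stale until the third item arrives. A real system would likely also flush on a timeout so that the delay stays bounded.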

DifferentSC · 2015-12-16T07:51:56Z

This issue has been split up under the SSM category, so I am closing this issue.

DifferentSC mentioned this issue Sep 25, 2015

Implement on-demand query state loading #10

Closed

DifferentSC closed this as completed Dec 16, 2015
