Leverage disk for storing query states #7

Closed
DifferentSC opened this issue Sep 23, 2015 · 1 comment

Comments

@DifferentSC
Contributor

To process multiple queries on one machine, we need to store the state of each query. However, memory may not be large enough to hold all of the states. This can happen with queries such as online ML, which need to store many parameters.

The basic approach is to write a query's state to disk while the query is inactive, then reload it into memory and process when data comes in. However, reading from disk is slow, so processing time can increase. Because of that, we need to address the questions below.
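To make the basic approach concrete, here is a minimal sketch, not the project's implementation. It assumes states are picklable, query IDs are safe to use as file names, and "inactive" is approximated by least-recently-used order. The class name `SpillingStateStore` and all of its methods are hypothetical.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class SpillingStateStore:
    """Keeps at most `capacity` query states in memory and spills the rest to disk."""

    def __init__(self, capacity, directory=None):
        self.capacity = capacity
        # Assumption: one file per query in a scratch directory.
        self.directory = directory or tempfile.mkdtemp(prefix="query-state-")
        self.in_memory = OrderedDict()  # query_id -> state, least recently used first
        self.disk_reads = 0             # counter to observe the cost of reloads

    def _path(self, query_id):
        return os.path.join(self.directory, f"{query_id}.state")

    def get(self, query_id, default=None):
        """Return the state of a query, reloading it from disk if it was spilled."""
        if query_id in self.in_memory:
            self.in_memory.move_to_end(query_id)
            return self.in_memory[query_id]
        path = self._path(query_id)
        if os.path.exists(path):
            with open(path, "rb") as f:
                state = pickle.load(f)
            os.remove(path)
            self.disk_reads += 1
        else:
            state = default
        self.put(query_id, state)
        return state

    def put(self, query_id, state):
        """Store a state in memory, spilling the least recently used ones if needed."""
        self.in_memory[query_id] = state
        self.in_memory.move_to_end(query_id)
        while len(self.in_memory) > self.capacity:
            victim, victim_state = self.in_memory.popitem(last=False)
            with open(self._path(victim), "wb") as f:
                pickle.dump(victim_state, f)
```

With `capacity=2`, putting states for `q1`, `q2`, and `q3` spills `q1` to disk. A later `get("q1")` reloads it and spills `q2` instead. LRU is only a placeholder policy here; which criteria to use is one of the open questions.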

  • Which data should stay in memory? Which criteria (query SLA, data incoming frequency, ...) should we use?
  • Can we apply batched processing in some cases? That is, accumulate multiple data items and update the state in one go once the state is available in memory. This can reduce the number of disk reads and writes, but it delays the state update.
  • Can we predict the data arrival time and apply pre-fetching for some queries?
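
The batching idea could look roughly like the sketch below, under some assumptions. The `load_state` and `save_state` callbacks stand in for whatever disk read and write the system ends up using, and `update` is the per-event state transition. `BatchedUpdater` and its parameters are hypothetical names, not an existing API.

```python
from collections import defaultdict


class BatchedUpdater:
    """Buffers events per query and applies them in one load/save cycle."""

    def __init__(self, load_state, save_state, update, batch_size):
        self.load_state = load_state  # query_id -> state (stand-in for a disk read)
        self.save_state = save_state  # (query_id, state) -> None (stand-in for a disk write)
        self.update = update          # (state, event) -> new state
        self.batch_size = batch_size
        self.pending = defaultdict(list)  # query_id -> events not yet applied

    def on_event(self, query_id, event):
        """Queue an event and flush once the batch for this query is full."""
        self.pending[query_id].append(event)
        if len(self.pending[query_id]) >= self.batch_size:
            self.flush(query_id)

    def flush(self, query_id):
        """Apply all queued events with a single load and a single save."""
        events = self.pending.pop(query_id, [])
        if not events:
            return
        state = self.load_state(query_id)
        for event in events:
            state = self.update(state, event)
        self.save_state(query_id, state)
```

With `batch_size=3`, three events for the same query cost one read and one write instead of three of each. The state stays stale until the batch fills, so a real version would probably also need a time-based flush to bound the delay. That part is an assumption and is not sketched here.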
@DifferentSC
Contributor Author

This issue has been split into the SSM category, so I am closing it.
