To process multiple queries on one machine, we need to store the state of each query. However, memory may not be large enough to hold all of the states. This can happen with queries such as online ML, which need to store many parameters.
The basic approach is to store a query's state on disk while the query is inactive, then reload it into memory and process when data arrives. However, reading from disk is slow, so processing time can increase. Because of that, we need to address the questions below.
Which states should stay in memory? Which criteria (query SLA, data arrival frequency, ...) should we use to decide?
Can we apply batched processing in some cases? That is, buffer multiple data items and update the state in one go once the state is in memory. This can reduce the number of disk reads and writes, but it delays the state update.
Can we predict the data arrival time and pre-fetch the state for some queries?
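To make the discussion concrete, here is a minimal sketch of the spill-and-reload approach combined with batched updates. Everything in it is an assumption for illustration, not an existing API of this project: the `SpillingStateStore` class, its `capacity` and `batch_size` parameters, the least-recently-used eviction rule (a stand-in for whichever criterion we end up choosing), and the use of `pickle` files named after the query id (so ids are assumed to be filesystem-safe).

```python
import os
import pickle
import tempfile
from collections import OrderedDict, defaultdict


class SpillingStateStore:
    """Hypothetical store: keeps at most `capacity` states in memory, rest on disk."""

    def __init__(self, capacity, spill_dir=None, batch_size=1):
        self.capacity = capacity
        self.spill_dir = spill_dir or tempfile.mkdtemp()
        self.batch_size = batch_size
        self.resident = OrderedDict()     # query_id -> state, oldest use first
        self.pending = defaultdict(list)  # query_id -> buffered data items
        self.disk_reads = 0               # counter, only to observe the effect

    def _path(self, query_id):
        return os.path.join(self.spill_dir, f"{query_id}.state")

    def _load(self, query_id, initial_state):
        """Return the state, reloading it from disk if it was spilled."""
        if query_id in self.resident:
            self.resident.move_to_end(query_id)  # mark as recently used
            return self.resident[query_id]
        path = self._path(query_id)
        if os.path.exists(path):
            with open(path, "rb") as f:
                state = pickle.load(f)
            self.disk_reads += 1
        else:
            state = initial_state
        self.resident[query_id] = state
        # Spill least recently used states until we fit in memory again.
        while len(self.resident) > self.capacity:
            victim, victim_state = self.resident.popitem(last=False)
            with open(self._path(victim), "wb") as f:
                pickle.dump(victim_state, f)
        return state

    def flush(self, query_id, update, initial_state=0):
        """Apply all buffered items for one query and return its new state."""
        state = self._load(query_id, initial_state)
        for item in self.pending.pop(query_id, []):
            state = update(state, item)
        self.resident[query_id] = state
        return state

    def on_event(self, query_id, item, update, initial_state=0):
        """Buffer an item; apply right away only if that needs no disk read
        or if the batch for this query is full."""
        self.pending[query_id].append(item)
        in_memory = query_id in self.resident
        if in_memory or len(self.pending[query_id]) >= self.batch_size:
            self.flush(query_id, update, initial_state)
```

With `batch_size=1` this degenerates to the basic approach (one reload per item for an inactive query). A larger `batch_size` trades update latency for fewer disk reads, which is the trade-off raised above. Pre-fetching could be layered on top by calling `_load` ahead of the predicted arrival time.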