Use cached user states in MySQL layer #169

Open
eozturk1 opened this issue Mar 9, 2022 · 5 comments

@eozturk1
Contributor

eozturk1 commented Mar 9, 2022

get_user_state currently does not use the cache; it queries the requested data (according to the filters) directly from MySQL.

We should consider caching user states and implementing a filtering mechanism. For instance, user states that are in the cache and match the filter can be used directly. For the rest, we'd need to go to MySQL storage, without re-requesting the ones already in the cache.
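
A minimal sketch of the proposed filtering step, assuming a simple in-memory map as the cache. `ValueState` here is a simplified, hypothetical stand-in for the real type, and `split_by_cache` is an illustrative name, not an existing function in the project:

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a stored user state.
#[derive(Clone, Debug, PartialEq)]
pub struct ValueState {
    pub username: String,
    pub epoch: u64,
    pub value: Vec<u8>,
}

/// Splits `requested` into cached states that satisfy `filter` and the
/// usernames that still have to be fetched from MySQL.
pub fn split_by_cache<F>(
    cache: &HashMap<String, ValueState>,
    requested: &[String],
    filter: F,
) -> (Vec<ValueState>, Vec<String>)
where
    F: Fn(&ValueState) -> bool,
{
    let mut hits = Vec::new();
    let mut misses = Vec::new();
    for name in requested {
        match cache.get(name) {
            // Cached and matching the filter: use it directly.
            Some(state) if filter(state) => hits.push(state.clone()),
            // Not cached, or cached but not matching: ask storage.
            _ => misses.push(name.clone()),
        }
    }
    (hits, misses)
}
```

Only the `misses` would then be sent to MySQL. Note that this treats a cached entry that matches the filter as authoritative, which is not safe for every filter.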

@eozturk1 eozturk1 self-assigned this Mar 9, 2022
@slawlor
Contributor

slawlor commented Mar 9, 2022

So the big problem here (and the reason we don't have caching atm in this call) is that most of the queries we do are

.storage
  .get_user_state(&uname, ValueStateRetrievalFlag::LeqEpoch(epoch))
  .await

in other words, "Retrieve the user's value state where the epoch is <= this target epoch". However, if a new state is entered in the DB and we add a cache, how would we detect a cache miss? We'd have to go through the DB anyway to know if there is a more up-to-date entry. In the bulk lookup generation, we'll really want to pass a vector of user IDs, which we can also do with a single query (generally). So something like

.storage
  .get_users_states(users, ValueStateRetrievalFlag::LeqEpoch(epoch))
  .await

where users is a vector of the AkdLabels to retrieve for.
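
To make the cache-miss problem concrete, here is a small sketch that models only the epochs of one user's states. `latest_leq_epoch` is a hypothetical helper that mirrors the "epoch <= target" semantics described above; it is not part of the project's API:

```rust
/// Returns the largest epoch in `epochs` that is <= `target`, mirroring the
/// semantics described for `ValueStateRetrievalFlag::LeqEpoch`.
pub fn latest_leq_epoch(epochs: &[u64], target: u64) -> Option<u64> {
    epochs.iter().copied().filter(|&e| e <= target).max()
}
```

If the cache holds epochs 1 and 3 for a user while the DB has since gained epoch 4, a lookup with target 5 gives 3 from the cache but 4 from the DB. Both are valid "<= 5" answers, so nothing in the cache alone signals that it is stale.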

@slawlor
Contributor

slawlor commented Mar 9, 2022

Caching helps when we're doing specific get operations, but this type of filtered scan is near impossible to serve from a cache.

@eozturk1
Contributor Author

eozturk1 commented Mar 9, 2022

Just ideating... Could we use the same MySQL query but exclude the value states already in the cache, using a temp table similar to the one created for batch_get?

@slawlor
Contributor

slawlor commented Mar 10, 2022

So yeah, I think I understand what you mean. We would indeed take the batch of user IDs and utilize a temp table over a select * join (in MySQL, that is). And we'd just need the single epoch argument. Something like

SELECT * FROM `users` WHERE `epoch` <= :epoch AND `user` IN (SELECT * FROM `temp_table`)

Or something like that
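
A query of this shape returns every matching row for each user. Assuming the caller only wants the most recent state per user (an assumption, in line with the LeqEpoch lookups discussed earlier), the result would still need to be reduced. A rough in-memory sketch of that reduction, with a hypothetical `Row` type and function name:

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical row shape: one stored state for a user at an epoch.
#[derive(Clone, Debug, PartialEq)]
pub struct Row {
    pub user: String,
    pub epoch: u64,
}

/// For each requested user, keeps the highest epoch that is <= `target`.
/// `users` plays the role of the temp table in the query.
pub fn latest_per_user(
    rows: &[Row],
    users: &HashSet<String>,
    target: u64,
) -> HashMap<String, u64> {
    let mut latest = HashMap::new();
    for row in rows {
        if row.epoch <= target && users.contains(&row.user) {
            let entry = latest.entry(row.user.clone()).or_insert(row.epoch);
            if row.epoch > *entry {
                *entry = row.epoch;
            }
        }
    }
    latest
}
```

In practice this reduction could probably be pushed into the SQL itself; the sketch only shows the intended result.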

@slawlor
Contributor

slawlor commented Nov 24, 2022

I think this might be addressed with #269 now, because it's handling additions and management of elements in the transaction log. We could definitely add more tests around this logic, but it should synchronize the queries and make sure the most up-to-date elements are picked. I still think we need to go to the DB, however, and can't trust the cache to have enough information, but we can use the transaction log to fill in information that is potentially from the future or not yet fully committed.
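
One way to picture that combination, purely as an illustration: the DB answers the query first, and pending entries from the transaction log are then overlaid on its result. The function below is hypothetical and does not describe what #269 actually implements; it models each state by its epoch only.

```rust
/// Combines the epoch returned by the DB with epochs of pending,
/// not-yet-committed entries, keeping the highest one that is <= `target`.
/// `db_epoch` is assumed to already satisfy the `<= target` filter.
pub fn merge_with_log(db_epoch: Option<u64>, pending: &[u64], target: u64) -> Option<u64> {
    // Best candidate from the transaction log, if any qualifies.
    let from_log = pending.iter().copied().filter(|&e| e <= target).max();
    // Take whichever of the two sources is more recent.
    db_epoch.into_iter().chain(from_log).max()
}
```

The DB remains the source of truth here; the log can only supply a newer entry, never suppress one the DB returned.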
