
Bug: cache returns old data #26

Closed
mercmobily opened this issue Jun 5, 2014 · 25 comments

Comments

@mercmobily

Hi,

The fact that Memory returns subcollections, combined with the way Cache works, means that results stay cached even after the underlying data has changed.
Here is a session from my Chrome console:

require(['dojo/_base/declare', 'hotplate/hotDojoExtras/dstore/Memory', 'hotplate/hotDojoExtras/dstore/Cache'], function( declare, Memory, Cache ){ D = declare, M = Memory, C = Cache } );

M1 = D( [M, C], { idProperty: 'id' } )
m = new M1()
m.setData( [ {id: 0}, {id: 1}, {id: 2}, {id: 3}, {id: 4}, {id: 5}, {id: 6} ] )
> undefined

m.fetch()
> [ Object, Object, Object, Object, Object, Object, Object]

m.sort().fetch()
> [ Object, Object, Object, Object, Object, Object, Object]

m.put( { id: 8 } )
> Object {id: 8, constructor: function, getInherited: function, isInstanceOf: function, inherited: function…}

m.sort().fetch()
> [ Object, Object, Object, Object, Object, Object, Object]
// WRONG There should be 8 results here!!! 

m.sort({ p: 1 } ).fetch()
> [ Object, Object, Object, Object, Object, Object, Object, Object]
// CORRECT -- because it's for a different query

It's entirely possible I am using the Cache wrong.
Slightly related: #25 and SitePen/dgrid#950

@brandonpayton
Contributor

Hi @mercmobily,

This is expected behavior as Cache caches subcollection results. It does appear that we need a way to evict subcollection results from the cache. I'll leave this issue open until I have an answer about whether we will add support for that.

I am surprised to see Cache used with Memory as its typical use is with a store like dstore/Rest that does not already have all the data in memory. Why not use just Memory? That would allow you to make the same query multiple times and get updated results each time.
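
For illustration, the kind of composition I have in mind looks something like this -- a minimal sketch following the same declare([...]) pattern used above, with a placeholder target URL:

require(['dojo/_base/declare', 'dstore/Rest', 'dstore/Cache'], function (declare, Rest, Cache) {
    // A REST-backed store whose query results are cached in memory,
    // so repeating the same query does not hit the server again.
    var CachedRest = declare([ Rest, Cache ]);
    var store = new CachedRest({ target: '/api/things/' });

    store.sort('name').fetch(); // first call: request goes to the server
    store.sort('name').fetch(); // same query again: served from the cache
});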

@brandonpayton
Contributor

One reason I can think of for using Cache with Memory is that the same query made by multiple components can share the same query results in memory. Perhaps Cache should track cached subcollections so their results are updated as individual objects are updated.

@mercmobily
Author

Hi,

This is expected behavior as Cache caches subcollection results.

I know. It's the expected result, but it's not really a "good" result...

It does appear that we need a way to evict subcollection results from the cache. I'll leave this issue open until I have an answer about whether we will add support for that.

OK. Evicting might be a solution.

I am surprised to see Cache used with Memory as its typical use is with a store like dstore/Rest that does not already have all the data in memory. Why not use just Memory? That would allow you to make the same query multiple times and get updated results each time.

I know, this was just the easiest way to test it. The problem persists with a Rest store, though; it was just harder to demonstrate (you need a functioning Rest store, etc.).

Merc.

@mercmobily
Author

Hi,

One reason I can think of for using Cache with Memory is that the same query made by multiple components can share the same query results in memory.

See my other message. That is a good use case, but I wasn't as evolved as that :D

Perhaps Cache should track cached subcollections so their results are updated as individual objects are updated.

I tried that. But it gets tricky because you can have queries like store.range(0,10).range(0,2). OK, it's a strange use-case, but it is possible. Cache would need to recursively update sub-entries -- which feels like madness to me. (But maybe I am missing something.)
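
To make the nesting concrete, this is the shape I mean (purely illustrative):

// Each call produces another cached subcollection:
var outer = store.range(0, 10);   // subcollection of store
var inner = outer.range(0, 2);    // subcollection of a subcollection
// Keeping both fresh after a put() on `store` would mean walking the
// whole tree of cached subcollections and re-running every query.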

@brandonpayton
Contributor

This is expected behavior as Cache caches subcollection results.

I know. It's the expected result, but it's not really a "good" result...

@mercmobily, what would a good result look like? In my mind, either results are cached, or they are not. I might be missing something.

@mercmobily
Author

This is expected behavior as Cache caches subcollection results.

I know. It's the expected result, but it's not really a "good" result...

@mercmobily, what would a good result look like? In my mind, either results are cached, or they are not. I might be missing something.

Sorry, my fault. I didn't mean it to sound vague or nasty. I meant to say that if the main store changes its elements, then Cache really should be smart enough to return the right results.
The more I think about it, the more I am convinced that evicting is indeed the only possible way -- at least in the current implementation of Cache.

I am trying to think of a common use-case: I have a cached Rest store. The app makes a first query, and from then on the results for that specific query are cached. If I then add an element and want to display the data, the query is no longer cached, so I will hit the server with another query to find out the new element's placement. What I wrote in #23, in the section "Observed, CACHED Rest store" (aiming at avoiding a refresh and keeping the cache "fresh"), is not even doable anymore, because you'd have to reposition the element in the collection that holds the cached query, which realistically isn't going to happen.

Which takes us to #25 -- do we really want to cache this way?

@brandonpayton
Contributor

Hi @mercmobily, I didn't think you were being mean at all. I asked what "good" meant because I didn't understand where you were coming from.

In addition to evicting subcollections, we may want to consider updating Cache to track cached subcollections so they are kept up to date with every add, put, and remove seen on the client side.

@mercmobily
Author

I am looking into this now.

@mercmobily
Author

@brandonpayton:

In addition to evicting subcollections, we may want to consider updating Cache to track cached subcollections so they are kept up to date with every add, put, and remove seen on the client side.

I am afraid you can't. I only discovered this after implementing it, with a recursive function and everything.

You cannot reliably do that because changing an element is likely to also change its placement. For example if you are sorting by lastName, and you do a put() on the main store, you don't want to update the cached stores because you would end up with unsorted data...

The update only really works for deletion (which is the only operation that does the same thing across the board). For put(), you would actually have to re-run the query -- which would be 100% pointless, because you might as well clear the cache and have it rebuilt when needed.
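
To illustrate why a simple in-place update isn't enough, here is a made-up example (the data and the cached array are hypothetical):

// Hypothetical cached result of store.sort('lastName').fetch():
var cached = [
    { id: 1, lastName: 'Adams' },
    { id: 2, lastName: 'Mobily' },
    { id: 3, lastName: 'Zyp' }
];

// Mirroring a put() that changes lastName on id 1 by simply
// replacing the cached object in place...
cached[0] = { id: 1, lastName: 'Smith' };

// ...leaves the cached query out of order: Smith, Mobily, Zyp.
// To stay correct, the sort -- i.e. the query -- would have to be re-run.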

I am afraid I am still wondering about #25. I can totally see the point of creating subcollections for query caching -- it's neat. But we have to accept that anything other than a delete means zapping the cache. Which might be OK, but it's a pretty important assumption.

@mercmobily
Author

BTW I discovered another problem with Memory.js: this.data and this.storage.fullData end up being different things when creating subCollections using Store._createSubCollection... I am putting a bug report together, although I am not sure if there is a reason to have this.data (again, maybe I am missing something)

@mercmobily
Author

About my previous comment, I have just figured out that it's not a bug -- it's the obvious way it should be: there is the full data (in storage), and then there is the partial, per-query data in data (which obviously isn't copied over when creating a subCollection).

This design makes it basically impossible for Cache.js to keep data up to date.

@brandonpayton
Contributor

I am afraid you can't. I only discovered this after implementing it, with a recursive function and everything.

You cannot reliably do that because changing an element is likely to also change its placement. For example if you are sorting by lastName, and you do a put() on the main store, you don't want to update the cached stores because you would end up with unsorted data...

Observable should maintain the correct placement if your store has a queryEngine that provides a client-side equivalent of how the server applies sort and filter queries. There may be additional complexities to tracking a subcollection in Cache.
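
As a rough sketch of what client-side placement could look like (the helper below is illustrative, not an existing dstore API):

// Return a copy of a cached, sorted result array with the updated object
// re-positioned according to a comparator equivalent to the server's sort.
function reposition(results, updated, compare) {
    // drop the stale copy of the object, if present
    results = results.filter(function (item) {
        return item.id !== updated.id;
    });
    // find the index where the updated object belongs under the sort
    var index = 0;
    while (index < results.length && compare(results[index], updated) <= 0) {
        index++;
    }
    results.splice(index, 0, updated);
    return results;
}

// e.g.: reposition(cachedResults, updatedObject,
//     function (a, b) { return a.lastName < b.lastName ? -1 : 1; });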

About my previous comment, I have just figured out that it's not a bug -- it's the obvious way it should be: there is the full data (in storage), and then there is the partial, per-query data in data (which obviously isn't copied over when creating a subCollection).

This design makes it basically impossible for Cache.js to keep data up to date.

Between #25 and #26, it seems like we are talking about three issues:

  1. A Memory subcollection is based on its parent collection's data, which may not be up-to-date with store.storage.fullData (Is it actually good to create a new collection after sort(), range()? #25)
  2. Cache provides no way to evict cached subcollections (Bug: cache returns old data #26)
  3. Cache does not track cached subcollections so they become unnecessarily stale (Bug: cache returns old data #26)

Is that a correct understanding?

@mercmobily
Author

OK...

Observable should maintain the correct placement if your store has a queryEngine that provides a client-side equivalent of how the server applies sort and filter queries. There may be additional complexities to tracking a subcollection in Cache.

True. Sorry, I knew this, it just escaped me for a second.

Between #25 and #26, it seems like we are talking about three issues:

OK let's see...

  1. A Memory subcollection is based on its parent collection's data, which may not be up-to-date with store.storage.fullData (Is it actually good to create a new collection after sort(), range()? #25)

Correct.

  2. Cache provides no way to evict cached subcollections (Bug: cache returns old data #26)

Or to update their contents (which can only be done by re-running the queries).

  3. Cache does not track cached subcollections so they become unnecessarily stale (Bug: cache returns old data #26)

Yes.

Love it when people sum it up. I wish I could have done that with my first
ticket, without so much blabber.

Merc.

@mercmobily
Author

About this one:

Observable should maintain the correct placement if your store has a queryEngine that provides a client-side equivalent of how the server applies sort and filter queries. There may be additional complexities to tracking a subcollection in Cache.

True, but this means that when "updating the cache", you effectively need to re-run every single query, rather than just updating the item in the cached query result. Now, that could become resource intensive...

@mercmobily
Author

So, the possible solutions are:

  • Evicting from the cache when the master store is changed (a rough sketch of this option follows the list).

Pros: no need to worry about pretty much anything, as the cache is effectively zapped.
Cons: if used with a REST store, it effectively means that every time we do a put(), the next query will hit the REST store again.

  • Updating sub-stores by hand

Pros: cache doesn't get zapped. When using a REST store, you keep querying the cache
Cons: updating the cache means re-running the queryEngine on each one of the sub-stores, and it needs to be recursive (for sub-sub stores etc.)
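
A rough sketch of the eviction option, assuming a Cache-like mixin that keeps its cached results in a private map (nothing here reflects the actual dstore/Cache internals):

require(['dojo/_base/declare'], function (declare) {
    var EvictOnWrite = declare(null, {
        _cachedQueries: {},

        put: function (object, options) {
            // any write invalidates every cached query result
            this._cachedQueries = {};
            return this.inherited(arguments);
        },

        add: function (object, options) {
            this._cachedQueries = {};
            return this.inherited(arguments);
        },

        remove: function (id) {
            this._cachedQueries = {};
            return this.inherited(arguments);
        }
    });
});

This would be mixed over the master store, so cached subcollections simply get rebuilt the next time their queries are run.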

The old implementation of stores didn't have this issue because we only ever cached the data -- any query would then be run on the cache if available. It was a simpler implementation, although obviously much less powerful.

In my framework, I do this:

  • I preload all of the data into the cache when the app starts up
  • I update the cache on the fly when there is a comet update to that store

Basically, this way there is never a need to hit the server for queries, since the "local" data is always up to date.
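
In rough code, the idea is the following (the comet channel and its message format come from my framework, so they are purely illustrative):

// Warm the local store once at startup:
store.fetch();

// Keep it warm as server-side changes arrive over comet:
cometChannel.on('change', function (message) {
    if (message.type === 'remove') {
        store.remove(message.id);
    } else {
        store.put(message.object); // covers both add and update
    }
});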

I wonder if I can still do this with this new architecture. If we evict, the "no server queries" idea goes out of the window. If we update, we need to make sure that every sub-store is indeed updated (which can be tricky and time-consuming).

@mercmobily
Author

Brandon, am I right when I state that lazy querying + range specified in fetch() would fix all of these problems?
Kris mentioned that you were addressing some of these issues in https://github.com/brandonpayton/dstore/compare/fetch-query-results, but I am trying to look at the diff without much luck ( brandonpayton/dstore@SitePen:master...fetch-query-results )

Let me see:

Lazy querying

If the memory store did lazy querying, just like the REST one does, we would end up with the following situation (a rough sketch follows the list):

  • There would only ever be one data collection.
  • The returned sub-collections would be collections with the same data, but with different entries in the queryLog
  • When fetch()ing, the queries would be applied one after the other until the actual result is reached
  • We would be able to "inject" data into the master store, or change it, or whatever, without ever worrying about sub-collections being out of sync
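
Something like this, where the queryLog shape is my guess rather than the actual structure:

// No data is materialised here; each call only records what was asked for:
var collection = store.sort('name').range(0, 25);
// collection.queryLog would conceptually hold:
//   [ { type: 'sort', args: ['name'] },
//     { type: 'range', args: [0, 25] } ]

// Only fetch() walks the log and applies each entry, in order, to the
// single master data array -- which can change freely in the meantime.
collection.fetch();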

Range in fetch()

This would ensure that situations like store.range(50,100).sort('name').range(0, 25).sort('surname') wouldn't become a problem. Basically, any query becomes reproducible (both in REST and in Memory) even with lazy querying.
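
In other words, something along these lines (I am assuming a fetchRange-style signature here; the exact name and arguments may differ):

// The chained query stays lazy and reproducible...
var byName = store.sort('name');
// ...and the paging window is only supplied when data is actually needed:
var firstPage = byName.fetchRange({ start: 0, end: 25 });
var nextPage  = byName.fetchRange({ start: 25, end: 50 });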

Brandon, is this 100% correct? How far is https://github.com/brandonpayton/dstore/compare/fetch-query-results from this end result?

Please note that the fact that the cache returns old data is a pretty serious issue, which made me switch off any caching in my application (while waiting for this to be addressed).

@kriszyp
Contributor

kriszyp commented Jun 23, 2014

I just merged the fetch-query-results branch, can you verify that this issue is fixed?

@mercmobily
Author

My goodness, these changes are absolutely huge. It must have been a
truckload of work. I am going to study the current code and test things
out.

@mercmobily
Author

Do you have a branch of dgrid (dev-0.4) which incorporates these changes? I
am running a patched version with beforeId rather than before, for
example, but looking at the API changes I think dgrid now needs more
changes (now that range() is gone). I was just wondering if this had
already been done!

@kriszyp
Contributor

kriszyp commented Jun 24, 2014

I was using https://github.com/kriszyp/dgrid/tree/fetch-query-results to test. I doubt the conversion to the latest API is complete, but I did enough to get it to work.

@mercmobily
Author

Sorry I am being slow. I just moved house (!) and it's been a little hectic. I can start working again tomorrow, and will get things tested out. Thanks for your patience...!

@kfranqueiro
Contributor

dgrid's dev-0.4 branch is now updated to work with the latest dstore changes.

@mercmobily
Author

Thanks Ken!

@kriszyp
Contributor

kriszyp commented Jul 16, 2014

Have you had a chance to test this? Can we close this?

@mercmobily
Author

Yesterday, actually. And yes, totally fixed on my end!
