-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache Rewrite #49
Cache Rewrite #49
Conversation
filesystem/src/main/java/com/dropbox/android/external/fs3/filesystem/FileSystemImpl.kt
Outdated
Show resolved
Hide resolved
Wow this is incredible! Thank you so much for all the hard work. I left one comment about cache loader otherwise looks good from first glance. After the holidays we will take this through a more detailed review |
|
||
@Test | ||
fun `get(key, loader) returns existing value when an entry with the associated key is absent initially but present after executing the loader`() = | ||
runBlocking { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to use TestCoroutineScope.runBlockingTest
here to advance the delays, but it doesn't seem to wait for the coroutines to finish when I'm launching with a different dispatcher: launch(Dispatchers.Default)
which I need to be able to test concurrency. Is there a better way to test this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yigit has strategies for testing async coroutines maybe he can chime in
* the new value, the existing value won't be replaced by the new value. Instead the existing | ||
* value will be returned. | ||
*/ | ||
fun get(key: Key, loader: () -> Value): Value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as i remember, we never use this loader in Store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The FileSystemImpl.kt was using the CacheLoader implementation from guava so I added this as suggested by @digitalbuddha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you document what happens if loader throws an exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets also document that loader needs to be quick (i.e it's not a suspend function).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@digitalbuddha mentioned the loader could be expensive (currently used by filesystem
module).
I also don't think we should make the loader suspend
as we would have to also make the get(...)
API itself suspend
in order to execute it since the cache
module does not depend on the coroutines library.
Can we document that the loader will be executed on the caller's thread
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we want to be a replacement for Store3, I figured we should respect this api in case there are consumers of the cache only
cache/src/main/kotlin/com/dropbox/android/external/cache4/CacheBuilder.kt
Outdated
Show resolved
Hide resolved
cache/src/main/kotlin/com/dropbox/android/external/cache4/CacheBuilder.kt
Outdated
Show resolved
Hide resolved
} | ||
|
||
override fun get(key: Key, loader: () -> Value): Value { | ||
synchronized(this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets sync on an explicit object instead of this since this
can be accessed outside the cache. I realize it is internal but it still means that can be accessed within the library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not super granular, but I guess we can improve it in the future to be key based if we want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we verify if old behavior blocked per key or not? Ideally we should be granular since an app might have a single store used among many features
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I'll add key-based locks then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added key-based synchronization with new tests.
*/ | ||
internal data class CacheEntry<Key, Value>( | ||
val key: Key, | ||
@Volatile var value: Value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as i can read, these are always accessed inside a lock which will put a memory barrier hence you should not need volatile for these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I’m not sure I follow. These entries live in the ConcurrentHashMap and a couple of synchronized LinkedHashSet but we are updating these values directly (eg when replacing the value for an existing key) instead of recreating and putting a new entry into the map. And currently only get(key: Key, () -> Value): Value is protected by an explicit synchronized block to prevent calling the potentially expensive loader function multiple times concurrently. So I think those mutable fields in the CacheEntry should be the only things we need to protect from concurrent access and hence the volatile. Am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although using access to ConcurrentHashMap
as a memory barrier should be enough, we do not write Entry
back into the map after updating it so I don't believe we relay on it. If we wanted to drop Volatile
here we would probably need to make these changes.
Whatever we end up with here (either volatile or using the map as a barrier), we should add some documentation here on why we're reusing entries (to be efficient) and document the concurrency policy if it's not obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using map.put(...)
as a barrier feels a bit subtle. Should we use an explicit lock to synchronize all updates to the CacheEntry
objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've studied this a bit more and now I believe explicit lock or volatile
shouldn't be necessary :)
Will remove volatile
and add documentation about reuse.
thanks for doing this btw, looks pretty good! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow nice PR. Thanks for doing all the work. I reviewed the interface and took a higher level look at the implementation and assumed that the tests covered most of the functionality. let me know if there's anything you'd like me to look at more closely.
cache/src/main/kotlin/com/dropbox/android/external/cache4/Cache.kt
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,37 @@ | |||
package com.dropbox.android.external.cache4 | |||
|
|||
interface Cache<in Key, Value> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Key
should be non-nullable, and in order to not have ambiguous get
Value
should be non-nullable too. Lets change to Cache<in Key: Any, Value: Any>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I've also updated the generic types in Store interface to be bound by Any
.
* the new value, the existing value won't be replaced by the new value. Instead the existing | ||
* value will be returned. | ||
*/ | ||
fun get(key: Key, loader: () -> Value): Value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you document what happens if loader throws an exception
cache/src/main/kotlin/com/dropbox/android/external/cache4/Cache.kt
Outdated
Show resolved
Hide resolved
cache/src/main/kotlin/com/dropbox/android/external/cache4/CacheBuilder.kt
Outdated
Show resolved
Hide resolved
* the new value, the existing value won't be replaced by the new value. Instead the existing | ||
* value will be returned. | ||
*/ | ||
fun get(key: Key, loader: () -> Value): Value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets also document that loader needs to be quick (i.e it's not a suspend function).
*/ | ||
internal data class CacheEntry<Key, Value>( | ||
val key: Key, | ||
@Volatile var value: Value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although using access to ConcurrentHashMap
as a memory barrier should be enough, we do not write Entry
back into the map after updating it so I don't believe we relay on it. If we wanted to drop Volatile
here we would probably need to make these changes.
Whatever we end up with here (either volatile or using the map as a barrier), we should add some documentation here on why we're reusing entries (to be efficient) and document the concurrency policy if it's not obvious.
} | ||
|
||
override fun get(key: Key, loader: () -> Value): Value { | ||
synchronized(this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not super granular, but I guess we can improve it in the future to be key based if we want to.
if (existingEntry != null) { | ||
// cache entry found | ||
existingEntry.recordWrite(nowNanos) | ||
existingEntry.value = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can add cacheEntries[key] = existingEntry
to create a memory barrier using the map access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason not to make the loader suspend
rather than blocking? Could be follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no CoroutineScope
to execute the suspend
function. If we made the get(key, loader)
API itself suspend
we will need to use Mutex
which is also part of kotlinx-coroutines-core
for synchronization. Or do we want to introduce coroutines library dependency to cache
?
cache/src/test/kotlin/com/dropbox/android/external/cache4/CacheLoaderTest.kt
Outdated
Show resolved
Hide resolved
… entry reuse and thread-safty assumption.
- Update docs. - Update Key, Value, Input, Output generic types to be bound by Any. - Replace spy with TestLoader for asserting invocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe all comments have been addressed except whether loader should be suspend
.
Again thanks for the review!
@@ -0,0 +1,37 @@ | |||
package com.dropbox.android.external.cache4 | |||
|
|||
interface Cache<in Key, Value> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I've also updated the generic types in Store interface to be bound by Any
.
cache/src/main/kotlin/com/dropbox/android/external/cache4/Cache.kt
Outdated
Show resolved
Hide resolved
cache/src/main/kotlin/com/dropbox/android/external/cache4/Cache.kt
Outdated
Show resolved
Hide resolved
cache/src/main/kotlin/com/dropbox/android/external/cache4/CacheBuilder.kt
Outdated
Show resolved
Hide resolved
cache/src/test/kotlin/com/dropbox/android/external/cache4/CacheLoaderTest.kt
Outdated
Show resolved
Hide resolved
} | ||
|
||
override fun get(key: Key, loader: () -> Value): Value { | ||
synchronized(this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added key-based synchronization with new tests.
cache/src/main/kotlin/com/dropbox/android/external/cache4/CacheEntry.kt
Outdated
Show resolved
Hide resolved
return keyBasedLocks.getLock(key).withLock { | ||
action() | ||
}.also { | ||
keyBasedLocks.remove(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need some sort of reference counting to make this work. you would have:
thread 1 takes lock.
thread 2 gets lock from map and waits
thread 1 finishes and removes key
thread 3 comes in and does not find a lock in map. proceeds with a new lock
thread 2 resumes on old lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTTOMH I can't think of a better way than a to add a counter to each lock entry in the map and have a global lock for adding/removing entries (that is aside from using Guava's ConcurrentHashMultiset which defeats the porpuse of rewriting this cache)
val LockMapSynchronizer = Any()
val keyBasedLocks = HashMap<Key, Pair<Lock, Int>>
private fun <Key> MutableMap<Key, Pair<Lock, Int>>.getLock(key: Key): Lock {
synchronzied(lockMapSynchronizder) {
val entry = get(key)
if (entry != null) {
entry.second++
return entry.first
}
map.put(key, Pair(ReentrantLock(), AtomicInt(1)))
return get(key).first
}
}
private fun <Key> MutableMap<Key, Pair<Lock, Int>>.removeLock(key: Key) {
synchronzied(lockMapSynchronizder) {
val entry = get(key) ?: return
entry.second--
if (entry.second == 0) {
remove(key)
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternatively, we can take an alternative approach and not use the key as the synchronizer. we can instead use lock striping: create an array of 8/16/254/whatever entries of Lock
. hash the key and grab the lock that matches the key's hash (mod size of array). if you lazy init this array you'll need to sync that too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does guava do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell Guava doesn't synchronize loader execution by key, but relies on key hash and Segments to guard concurrent get
in general but on a less granular level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need some sort of reference counting to make this work. you would have:
thread 1 takes lock.
thread 2 gets lock from map and waits
thread 1 finishes and removes key
thread 3 comes in and does not find a lock in map. proceeds with a new lock
thread 2 resumes on old lock
Thanks for this failing case. I'll try to implement the reference counting lock map as suggested and add more tests.
* otherwise associates a new [Lock] with the specified [key] in the map and returns the new lock. | ||
*/ | ||
private fun <Key> MutableMap<Key, Lock>.getLock(key: Key): Lock { | ||
val lock: Lock? = get(key) ?: put(key, ReentrantLock()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at the very least this would need to be putIfAbsent
val lock: Lock? = get(key) ?: put(key, ReentrantLock()) | |
val lock: Lock? = get(key) ?: putIfAbsent(key, ReentrantLock()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Android putIfAbsent
is only available on API 24+, so I implemented the same logic here by checking get(key)
first.
cache/src/test/kotlin/com/dropbox/android/external/cache4/testutil/ConcurrencyTestRule.kt
Outdated
Show resolved
Hide resolved
cache/src/test/kotlin/com/dropbox/android/external/cache4/testutil/ConcurrencyTestRule.kt
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're really getting close to done with this! Thank you for keeping at it. After taking a deeper look and taking memory issues into account I found 2 more issues that needs to be fixed. I think these are the final 2 issues
cache/src/main/kotlin/com/dropbox/android/external/cache4/RealCache.kt
Outdated
Show resolved
Hide resolved
cache/src/main/kotlin/com/dropbox/android/external/cache4/RealCache.kt
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I found one more bug. Caching is hard
cache/src/main/kotlin/com/dropbox/android/external/cache4/RealCache.kt
Outdated
Show resolved
Hide resolved
I thought I knew what I signed up for when I decided to do this... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the hard work!
@yigit I unblocked only needs your ok now |
merging tonight unless any qualms |
Could you resolve conflicts and then I'll merge. Thank you |
* master: fix dokka, update comments, integrate persister (MobileNativeFoundation#56) Switch POM packaging format to jar for cache, filesystem and store. (MobileNativeFoundation#54) Add an optional param to `Multicaster` to keep the fetches alive even if all downstreams are closed. (MobileNativeFoundation#40) Replace Store.fetch() on fresh() methods in ReadMe (MobileNativeFoundation#52) Properly implement multicast closing (MobileNativeFoundation#48) # Conflicts: # store/src/main/java/com/dropbox/android/external/store4/Store.kt # store/src/main/java/com/dropbox/android/external/store4/StoreBuilder.kt
Resolved conflicts. As a result of making the
|
Great job!! We will be doing our first alpha release today along with a blog post 😁 |
Resolves #14
This is a rewrite of the cache module in Kotlin. The new implementation supports sized-based and time-based (time-to-live and time-to-idle) evictions.
Cache and Builder APIs
Note that currently only APIs required by
RealStore
are supported. Please let know if more APIs are required.Usage
Tests
./gradlew cache:test
100% Test Coverage:
Migration
All usages have been switched to use the new cache implementation.
I've also converted the remaining Java files to Kotlin, except
MultiParserTest.java
which is commented out.Will update
cache/README.md
in a followup.