Caching Library Discussions #8

matzuk · 2021-11-18T17:59:28Z

Discussions about Caching Library.
First of all, thank you for your work! It's a great base for mobile system design interviews.
But, I have some suggestions/comments. I hope these things will help you. Let's go!

When I tried to design this library (before looking at your solution, of course), I came up with the following picture:

Let me describe the main points:

Cache - the library entry point that a user interacts with.
CacheDispatcher - the same as you've described in your article.
Job, Worker - the same. Honestly, I absolutely agree with your approach.
CacheRepository - the first difference. The Repository pattern is one of the most recognizable patterns in mobile development for designing interaction with the data layer. I suggest keeping this pattern because CacheRepository hides the implementation of the cache.
CacheHeapClient, CacheFSClient - our requirements include working with the heap and the FS, so it's better to define separate entities that hide their implementations. I also have the following common advice in mind: "When you design a system, pay attention to keywords like features (use cases), policies, services, filesystem, cache, etc. It's better to show these entities on your scheme, if only to follow the SOLID principles."
CacheJournal - the same as your concept. We can also highlight that a DB is a good choice for a Single Source of Truth and a synchronization manager. From here, we can continue with the statement that SQLite (like other DBs) is optimized for concurrent access, follows the ACID principles, and so on.
EvictionPolicyService - works with the Journal and the Clients (Heap, FS).
Finally, I highlighted that we work with the FS, DB, and others only through wrappers, which makes it possible to change the implementation later.
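
To make the wrapper idea concrete, here is a minimal Java sketch of how a repository could hide the heap and FS clients. The type names follow the list above, but every method signature is an assumption made for illustration, keys are assumed to be valid file names, and promotion of FS hits into the heap is left out to keep it short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical storage abstraction; the methods are assumptions, not the article's API.
interface CacheClient {
    Optional<byte[]> read(String key);
    void write(String key, byte[] value);
}

// Heap-backed client: a plain in-memory map.
final class CacheHeapClient implements CacheClient {
    private final Map<String, byte[]> items = new HashMap<>();

    @Override public Optional<byte[]> read(String key) {
        return Optional.ofNullable(items.get(key));
    }

    @Override public void write(String key, byte[] value) {
        items.put(key, value);
    }
}

// Filesystem-backed client: one file per key inside a directory.
final class CacheFSClient implements CacheClient {
    private final Path dir;

    CacheFSClient(Path dir) { this.dir = dir; }

    @Override public Optional<byte[]> read(String key) {
        Path file = dir.resolve(key);
        try {
            return Files.exists(file) ? Optional.of(Files.readAllBytes(file)) : Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public void write(String key, byte[] value) {
        try {
            Files.write(dir.resolve(key), value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// The repository is the only type the rest of the library talks to.
final class CacheRepository {
    private final CacheClient heap;
    private final CacheClient fs;

    CacheRepository(CacheClient heap, CacheClient fs) {
        this.heap = heap;
        this.fs = fs;
    }

    // Write to both tiers.
    void put(String key, byte[] value) {
        heap.write(key, value);
        fs.write(key, value);
    }

    // Check the heap first, then fall back to the filesystem.
    Optional<byte[]> get(String key) {
        Optional<byte[]> hit = heap.read(key);
        return hit.isPresent() ? hit : fs.read(key);
    }
}
```

Because callers only see CacheRepository, either client could be swapped for another implementation (for example, a DB-backed one) without touching the rest of the library.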


weeeBox · 2021-11-18T18:14:46Z

Thanks, @matzuk!

Good point on the repository pattern (although it's less popular among iOS engineers).
Are you advocating for storing cache items in the filesystem? I found this approach much more complex compared to database BLOB storage - mostly due to synchronization issues when you try to update the journal and files separately.
Why would you use "Service" in EvictionPolicyService? In my experience, "service" would most likely describe "API Service" (network requests) or Android Service (a piece of background work). The same could be applied to "Client". Maybe, I'm just missing something.
SOLID principles are tricky in that it's really easy to over-engineer your approach just for the sake of "clean" design. I believe it's OK to introduce some coupling if it makes the system simpler and more performant. For example, the eviction policy can know about the Journal database and use SQL prepared statements instead of reading all items and selecting them in a for-loop. There's another design principle to keep in mind: YAGNI ("You aren't gonna need it").

Please, let me know your thoughts!

matzuk · 2021-11-19T03:03:52Z

@weeeBox my thoughts =)

Why would you use "Service" in EvictionPolicyService? In my experience, "service" would most likely describe "API Service" (network requests) or Android Service (a piece of background work). The same could be applied to "Client". Maybe, I'm just missing something.

Hm. I used the word "Service" because I assume EvictionPolicy can be some kind of separate technical business logic that is executing somewhere (maybe in Android Service =)). It's not a typical domain object like a UseCase or an Interactor.
I usually use words such as "UseCase", "Service", "Client" and others to highlight the kind of object. For me, it simplifies further reading and maintenance, and it helps to understand the purpose of the object. When I read "CacheEviction", my very first question is "What does it mean? What is it? Maybe it's a data model, maybe something else". When I read "BlaBlaService", I know that it does some technical business logic related to BlaBla.
So, I agree it's very individual and speculative. Maybe it makes sense to use another set of keywords. The most important thing here is an agreement inside the team.

Are you advocating for storing cache items in the filesystem? I found this approach much more complex compared to database BLOB storage - mostly due to synchronization issues when you try to update the journal and files separately.

I lean towards this idea because we also have to deal with heap memory. We can't keep everything in the heap even during one session because the heap has memory limits and we can hit an OutOfMemoryError. All of this leads to working with heap memory through a separate Entity/Client that controls it. It also means we can't put everything into the DB, which is why it makes sense to separate all these things (FS, Heap, DB) from the beginning.
Maybe, I am overengineering here, I admit =)
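
As an illustration of controlling heap usage through a separate client, here is a small Java sketch built on `LinkedHashMap` in access order. The class name `BoundedHeapCache` is hypothetical, and the limit is counted in entries, which is an assumption: a real library would more likely track bytes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical heap client with a hard cap on the number of entries.
final class BoundedHeapCache<K, V> {
    private final LinkedHashMap<K, V> items;

    BoundedHeapCache(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry at the head.
        this.items = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each insertion; returning true drops the eldest entry.
                return size() > maxEntries;
            }
        };
    }

    // A read counts as an access and moves the entry to the tail.
    synchronized V get(K key) { return items.get(key); }

    synchronized void put(K key, V value) { items.put(key, value); }

    synchronized int size() { return items.size(); }
}
```

With a cap of two entries, putting "a" and "b", reading "a", and then putting "c" drops "b", since "b" is the least recently used entry at that point.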

weeeBox · 2021-11-19T04:07:13Z

Hm. I used the word "Service" because I assume EvictionPolicy can be some kind of separate technical business logic that is executing somewhere (maybe in Android Service =))

Here are my thoughts on Android services:

It's really hard to find a good use case for running EvictionPolicy in a service, especially if the service runs in a separate memory space and requires IPC for communication.
Google is moving away from background services and favors deferred jobs over foreground services.
A service (like any other Android component) needs to be registered in AndroidManifest, and there's a hard limit on how many components you can register for each app. Believe it or not, we reached that limit at Google Play Services 😅.

I lean towards this idea because we also have to deal with heap memory. We can't keep everything in the heap even during one session because the heap has memory limits and we can hit an OutOfMemoryError. All of this leads to working with heap memory through a separate Entity/Client that controls it. It also means we can't put everything into the DB, which is why it makes sense to separate all these things (FS, Heap, DB) from the beginning.

I advocated for a simpler high-level diagram for the following reasons:

You don't have enough time during the interview to draw many boxes.
Your diagram should be as simple as possible, and you should only add implementation details if your interviewer asks for them.
I would not include design or architectural patterns in the high-level diagram since it should be a 30,000-foot view without implementation details.

p.s. this is just my opinion and it's not necessarily correct. Please, let me know your thoughts!

weeeBox · 2021-11-19T04:11:46Z

A side note: I have submitted the article draft to ProAndroidDev. It's not perfect but it's good enough. We can have more discussions and refine the design later on.

matzuk · 2021-11-19T07:32:50Z

Yep, sure, I am against using Android Services too. My point was about only using the word "Service" =)
I agree with your points about simplicity. I think it's okay to change some details during an interview.

chipbk10 · 2021-11-23T10:09:01Z

@weeeBox: how do you restore the last Memory-Cache the next time CachingLibrary is used?

chipbk10 · 2021-11-23T14:19:11Z

@weeeBox: can we simply use a dictionary or a map for Memory-Cache? Or could we use a Heap built according to the eviction strategy? Why do we have to use a self-balancing tree here? Could you please be more specific about what this data structure looks like and how it works?

matzuk · 2021-11-30T12:10:51Z

@weeeBox there are some comments from @chipbk10

weeeBox · 2021-12-07T03:00:09Z

@weeeBox: how do you restore the last Memory-Cache the next time CachingLibrary is used?

Most likely, you don't want to do anything special, since accessing the cached elements would put them into memory. If the user wants to "warm up" the cache, they can access certain items shortly after the initialization.
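
A minimal Java sketch of that behavior, assuming a plain map as a stand-in for the persistent (journal/DB-backed) storage. The names `ReadThroughCache`, `warmUp`, and `isInMemory` are hypothetical and only illustrate the idea that a read promotes an item into memory.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-tier cache: memory is empty after a restart, persistent data survives.
final class ReadThroughCache {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> persistent; // stand-in for DB-backed storage

    ReadThroughCache(Map<String, String> persistent) {
        this.persistent = persistent;
    }

    // A memory miss falls back to persistent storage and promotes the value.
    String get(String key) {
        String value = memory.get(key);
        if (value == null) {
            value = persistent.get(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    // "Warming up" is nothing more than reading the keys the caller cares about.
    void warmUp(List<String> keys) {
        for (String key : keys) {
            get(key);
        }
    }

    boolean isInMemory(String key) {
        return memory.containsKey(key);
    }
}
```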

@weeeBox: can we simply use a dictionary or a map for Memory-Cache? Or could we use a Heap built according to the eviction strategy? Why do we have to use a self-balancing tree here? Could you please be more specific about what this data structure looks like and how it works?

A self-balancing tree allows in-order traversal using custom iterators. So, for example, you can get the largest element, delete it, and repeat until the cache size is adjusted. A regular map won't work since it gives no O(log n) guarantee for selecting the largest element. A heap won't work since lookup operations would take linear time.
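
One possible way to sketch this in Java is with `TreeSet`, which is backed by a red-black tree. This is an illustration under assumptions, not necessarily the structure from the article: the class `SizeOrderedCache` is hypothetical, entries are ordered by size, only sizes are tracked (values are omitted), and a `HashMap` is added next to the tree for key lookup.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical cache index that evicts the largest entries first.
final class SizeOrderedCache {
    private static final class Entry {
        final String key;
        final int size;
        Entry(String key, int size) { this.key = key; this.size = size; }
    }

    private final Map<String, Entry> byKey = new HashMap<>();
    // Ordered by size, then by key so that equal sizes do not collide.
    private final TreeSet<Entry> bySize = new TreeSet<>(
            Comparator.comparingInt((Entry e) -> e.size).thenComparing(e -> e.key));
    private final long maxBytes;
    private long totalBytes = 0;

    SizeOrderedCache(long maxBytes) { this.maxBytes = maxBytes; }

    void put(String key, int size) {
        remove(key); // replace any previous entry for this key
        Entry entry = new Entry(key, size);
        byKey.put(key, entry);
        bySize.add(entry);
        totalBytes += size;
        // Take the largest element, delete it, repeat until the size fits.
        while (totalBytes > maxBytes && !bySize.isEmpty()) {
            Entry largest = bySize.pollLast(); // O(log n)
            byKey.remove(largest.key);
            totalBytes -= largest.size;
        }
    }

    boolean remove(String key) {
        Entry entry = byKey.remove(key);
        if (entry == null) {
            return false;
        }
        bySize.remove(entry); // O(log n) in a tree
        totalBytes -= entry.size;
        return true;
    }

    boolean contains(String key) { return byKey.containsKey(key); }

    long totalBytes() { return totalBytes; }
}
```

For comparison, `java.util.PriorityQueue` documents linear time for `remove(Object)` and `contains(Object)`, which is the heap drawback mentioned above.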
