Caching Library Discussions #8

Open
matzuk opened this issue Nov 18, 2021 · 9 comments

Comments

@matzuk

matzuk commented Nov 18, 2021

Discussions about Caching Library.
First of all, thank you for your work! It's a great base for mobile system design interviews.
That said, I have some suggestions and comments that I hope will help. Let's go!

When I tried to design this library (before looking at your solution, of course), I came up with the following picture:
[image]
Let me describe the main points:

  1. Cache - is a library with which a user interacts
  2. CacheDispatcher - the same as you've described in your article
  3. Job, Worker - the same. Honestly, I absolutely agree with your approach.
  4. CacheRepository - the first difference. Pattern "Repository" is one of the most recognizable patterns in mobile development when the interaction with the data layer is being designed. I suggest keeping this pattern because CacheRepository hides the implementation of a cache.
  5. CacheHeapClient, CacheFSClient. Our requirements include working with the heap and the FS, so it's better to define separate entities for them that hide the implementation. I also have the following general advice in mind: "When you design a system, pay attention to keywords like different features (use-cases), policies, services, filesystem, cache, etc. It's better to show these entities on your scheme, if only to follow the SOLID principles".
  6. CacheJournal is the same as your concept. We can also highlight that a DB is a good choice as a Single Source of Truth and a synchronization manager. From here, we can continue with the statement that SQLite (like other DBs) is optimized for concurrent work, follows the ACID principles, and so on.
  7. EvictionPolicyService is working with a Journal and Clients (Heap, FS).
  8. Last point. I highlighted that we work with the FS, the DB, and other dependencies only through wrappers, which makes it possible to change the implementation later.
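
To make points 4, 5 and 8 concrete, here is a minimal Java sketch of a repository that hides two storage clients behind one interface. Apart from the name CacheRepository, everything here is an assumption made for illustration: the CacheClient interface, the MapCacheClient stand-in (a real CacheFSClient would do file I/O instead), and the read-through promotion in get are not taken from the article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical storage abstraction shared by the heap and filesystem clients. */
interface CacheClient {
    Optional<byte[]> read(String key);
    void write(String key, byte[] value);
}

/** Stand-in client backed by a plain map (illustration only, not thread-safe). */
final class MapCacheClient implements CacheClient {
    private final Map<String, byte[]> items = new HashMap<>();

    @Override
    public Optional<byte[]> read(String key) {
        return Optional.ofNullable(items.get(key));
    }

    @Override
    public void write(String key, byte[] value) {
        items.put(key, value);
    }
}

/** Repository that hides which client actually serves a request. */
final class CacheRepository {
    private final CacheClient heap;
    private final CacheClient disk;

    CacheRepository(CacheClient heap, CacheClient disk) {
        this.heap = heap;
        this.disk = disk;
    }

    /** Tries the heap first, then the slower client; a slow hit is copied into the heap. */
    Optional<byte[]> get(String key) {
        Optional<byte[]> hit = heap.read(key);
        if (hit.isPresent()) {
            return hit;
        }
        Optional<byte[]> stored = disk.read(key);
        stored.ifPresent(value -> heap.write(key, value));
        return stored;
    }

    /** Writes through to both clients. */
    void put(String key, byte[] value) {
        heap.write(key, value);
        disk.write(key, value);
    }
}
```

Callers only see CacheRepository, so swapping MapCacheClient for a file-backed or database-backed client would not change the calling code.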
@weeeBox
Owner

weeeBox commented Nov 18, 2021

Thanks, @matzuk!

  • Good point on the repository pattern (although it's less popular among iOS engineers).
  • Are you advocating for storing cache items in the filesystem? I found this approach much more complex compared to database BLOB storage - mostly due to synchronization issues when you try to update the journal and files separately.
  • Why would you use "Service" in EvictionPolicyService? In my experience, "service" would most likely describe "API Service" (network requests) or Android Service (a piece of background work). The same could be applied to "Client". Maybe, I'm just missing something.
  • SOLID principles are tricky in that it's really easy to over-engineer your approach just for the sake of "clean" design. I believe it's OK to introduce some coupling if it makes the system simpler and more performant. For example, the eviction policy can know about the Journal database so that it can use SQL prepared statements instead of reading all items and selecting them in a for-loop. There's another design principle to keep in mind: YAGNI.

Please, let me know your thoughts!

@matzuk
Author

matzuk commented Nov 19, 2021

@weeeBox my thoughts =)

Why would you use "Service" in EvictionPolicyService? In my experience, "service" would most likely describe "API Service" (network requests) or Android Service (a piece of background work). The same could be applied to "Client". Maybe, I'm just missing something.

Hm. I used the word "Service" because I assume EvictionPolicy can be some kind of separate technical business logic that is executing somewhere (maybe in Android Service =)). It's not a typical domain object like a UseCase or an Interactor.
I tend to use words such as "UseCase", "Service", "Client" and others to highlight the kind of object. For me, it simplifies further reading and maintenance, and it helps to understand the purpose of the object. When I read "CacheEviction", my very first question is "What does it mean? What is it? Maybe it's a data model, maybe something else". When I read "BlaBlaService", I know that it does some technical business logic related to BlaBla.
So, I agree it's very individual and subjective. Maybe it makes sense to use another set of keywords. The most important thing here is agreement inside the team.

Are you advocating for storing cache items in the filesystem? I found this approach much more complex compared to database BLOB storage - mostly due to synchronization issues when you try to update the journal and files separately.

I lean towards this idea because we also have to deal with heap memory. We can't keep everything in the heap, even during one session, because the heap has memory limits and we can hit an OutOfMemory error. All of this leads to working with heap memory through a separate Entity/Client that controls it. So we can't put everything into the DB anyway, and that's why it makes sense to separate all these things (FS, Heap, DB) from the beginning.
Maybe, I am overengineering here, I admit =)
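
One way to picture a heap client that controls its own memory use is a byte-budgeted map. This is only a sketch under assumptions: the class name BoundedHeapClient, the byte budget, and the least-recently-used order (via java.util.LinkedHashMap in access-order mode) are illustrative choices, not something taken from the article.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical heap client that never holds more than maxBytes of values. */
final class BoundedHeapClient {
    private final long maxBytes;
    private long usedBytes = 0;

    // accessOrder = true keeps the least recently used entry first in iteration order.
    private final LinkedHashMap<String, byte[]> items =
            new LinkedHashMap<>(16, 0.75f, true);

    BoundedHeapClient(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Returns the value or null; a hit marks the entry as recently used. */
    byte[] get(String key) {
        return items.get(key);
    }

    /** Stores the value, then drops least recently used entries until the budget fits. */
    void put(String key, byte[] value) {
        byte[] previous = items.put(key, value);
        if (previous != null) {
            usedBytes -= previous.length;
        }
        usedBytes += value.length;

        Iterator<Map.Entry<String, byte[]>> it = items.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    boolean contains(String key) {
        return items.containsKey(key);
    }

    long usedBytes() {
        return usedBytes;
    }
}
```

Note that a single value larger than the budget is evicted right after it is stored; a real client would probably reject it up front. The class is also not thread-safe.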

@weeeBox
Owner

weeeBox commented Nov 19, 2021

Hm. I used the word "Service" because I assume EvictionPolicy can be some kind of separate technical business logic that is executing somewhere (maybe in Android Service =))

Here are my thoughts on Android services:

  • It's really hard to find a good use-case for running EvictionPolicy in a service: especially if the service runs in a separate memory space and requires IPC for communication.
  • Google is moving away from using background services and favors deferred jobs instead of foreground services.
  • A service (like any other Android component) needs to be registered in AndroidManifest, and there's a hard limit on how many components you can register for each app. Believe it or not, we reached that limit at Google Play Services 😅.

I lean towards this idea because we also have to deal with heap memory. We can't keep everything in the heap, even during one session, because the heap has memory limits and we can hit an OutOfMemory error. All of this leads to working with heap memory through a separate Entity/Client that controls it. So we can't put everything into the DB anyway, and that's why it makes sense to separate all these things (FS, Heap, DB) from the beginning.

I advocated for a simpler high-level diagram for the following reasons:

  • You don't have enough time during the interview to draw many boxes.
  • Your diagram should be as simple as possible, and you only add implementation details if your interviewer asks for them.
  • I would not include design or architectural patterns in the high-level diagram, since it should be a 30,000-foot view without implementation details.

p.s. this is just my opinion and it's not necessarily correct. Please, let me know your thoughts!

@weeeBox
Owner

weeeBox commented Nov 19, 2021

A side note: I have submitted the article draft to ProAndroidDev. It's not perfect but it's good enough. We can have more discussions and refine the design later on.

@matzuk
Author

matzuk commented Nov 19, 2021

Yep, sure, I am against using Android Services too. My point was only about using the word "Service" =)
I agree with your points about simplicity. I think it's okay to change some details during an interview.

@chipbk10

@weeeBox : how do you restore the last Memory-Cache the next time CachingLibrary is used?

@chipbk10

@weeeBox : can we simply use a dictionary or a map for the Memory-Cache? Or could we use a heap built around the eviction strategy? Why do we have to use a self-balancing tree here? Could you please be more specific about what this data structure looks like and how it works?

@matzuk
Author

matzuk commented Nov 30, 2021

@weeeBox there are some comments from @chipbk10

@weeeBox
Owner

weeeBox commented Dec 7, 2021

@weeeBox : how do you restore the last Memory-Cache the next time CachingLibrary is used?

Most likely, you don't want to do anything special since accessing the cached elements would put them into the memory. If the user wants to "warm-up" the cache - they can access certain items shortly after the initialization.

@weeeBox : can we simply use a dictionary or a map for the Memory-Cache? Or could we use a heap built around the eviction strategy? Why do we have to use a self-balancing tree here? Could you please be more specific about what this data structure looks like and how it works?

A self-balancing tree allows in-order traversal using custom iterators. So for example, you can get the largest element, delete it, and repeat until the cache size is adjusted. A regular map won't work since you won't have a log n guarantee for selecting the largest element. A heap won't work since lookup operations would take linear time.
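
A minimal sketch of that eviction loop, assuming items are ordered by size and using java.util.TreeSet (a red-black tree) as the self-balancing tree. The names SizeOrderedIndex, Item and trimTo are hypothetical, and the key-based lookup path is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical index that keeps cache items sorted by size. */
final class SizeOrderedIndex {

    /** A cache entry reduced to what the eviction loop needs. */
    static final class Item {
        final String key;
        final long sizeBytes;

        Item(String key, long sizeBytes) {
            this.key = key;
            this.sizeBytes = sizeBytes;
        }
    }

    // Ordered by size; ties are broken by key so that equally sized items stay distinct.
    private final TreeSet<Item> bySize = new TreeSet<>((a, b) -> {
        int bySizeOrder = Long.compare(a.sizeBytes, b.sizeBytes);
        return bySizeOrder != 0 ? bySizeOrder : a.key.compareTo(b.key);
    });

    private long totalBytes = 0;

    /** Inserts in O(log n). Assumes each key is added at most once. */
    void add(String key, long sizeBytes) {
        if (bySize.add(new Item(key, sizeBytes))) {
            totalBytes += sizeBytes;
        }
    }

    /** Removes the largest items, O(log n) each, until the total fits into maxBytes. */
    List<String> trimTo(long maxBytes) {
        List<String> evicted = new ArrayList<>();
        while (totalBytes > maxBytes && !bySize.isEmpty()) {
            Item largest = bySize.pollLast();
            totalBytes -= largest.sizeBytes;
            evicted.add(largest.key);
        }
        return evicted;
    }

    long totalBytes() {
        return totalBytes;
    }
}
```

For example, with items a (10 bytes), b (30 bytes) and c (20 bytes), trimTo(35) evicts only b and leaves 30 bytes in the index. A different eviction policy would only need a different comparator.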
