
Collection caching #114

@joepio


Collections are currently dynamic resources: they are computed in full every time a user requests them. That works, but it comes at a performance cost, since the DB must be queried on every request.

How should we cache this? How does caching interact with the get_extended_resource function? How do we invalidate the cache? Let's discuss some considerations.

Collections can be sorted and filtered by adding query params. These naturally change the dynamic properties such as members and total_count, so each query-param combination should be cached separately.
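One way to keep the query-param combinations separate is to key the cache on the full collection subject including its query string. A minimal sketch (all names here are illustrative, not the actual atomic-server API):

```rust
use std::collections::HashMap;

// Hypothetical cache keyed by the collection subject *including* query params,
// so `/collections/people?filter=john` and `/collections/people?sort=name`
// are stored as distinct entries.
struct CollectionCache {
    // key: subject + query string, value: cached member subjects
    entries: HashMap<String, Vec<String>>,
}

impl CollectionCache {
    fn new() -> Self {
        CollectionCache { entries: HashMap::new() }
    }

    fn get(&self, subject_with_query: &str) -> Option<&Vec<String>> {
        self.entries.get(subject_with_query)
    }

    fn insert(&mut self, subject_with_query: &str, members: Vec<String>) {
        self.entries.insert(subject_with_query.to_string(), members);
    }
}

fn main() {
    let mut cache = CollectionCache::new();
    cache.insert("/collections/people?filter=john", vec!["/person/1".into()]);
    cache.insert(
        "/collections/people?sort=name",
        vec!["/person/2".into(), "/person/1".into()],
    );
    // Two different query strings are two independent cache entries.
    assert!(cache.get("/collections/people?filter=john").is_some());
    assert_ne!(
        cache.get("/collections/people?filter=john"),
        cache.get("/collections/people?sort=name")
    );
    println!("per-query caching ok");
}
```

The trade-off is that the same underlying member set may be cached many times under different sort orders, which makes invalidation (below) touch more entries.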

Since all changes should be done using Commits, we can perform cache invalidation while handling Commits. But how does the Commit handler know which resources should be invalidated? For example, say I remove the firstName property with the value john from some person Resource. The person previously appeared in the collection of people named john, and that collection should now be invalidated.

Invalidation approaches

Invalidate when any attribute of a resource changes

When a Collection iterates over its members, it adds the subject of the collection (including query params) to a K/V incomingLinks store, where each key is a subject and each value is an array of subjects that link to it. When a Commit is applied to resource X, the handler looks up X in incomingLinks and invalidates every subject in its value. The downside: this will invalidate many collections that, when re-run, could very well produce exactly the same members.
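The incomingLinks approach above can be sketched as follows. This is a rough illustration with made-up names, using an in-memory map where the real store would likely be a persistent K/V tree:

```rust
use std::collections::{HashMap, HashSet};

// Sketch of the incomingLinks store: each key is a subject, each value is
// the set of collection subjects (including query params) that iterated
// over it and therefore depend on it.
struct IncomingLinks {
    links: HashMap<String, HashSet<String>>,
}

impl IncomingLinks {
    fn new() -> Self {
        IncomingLinks { links: HashMap::new() }
    }

    // Called while a collection is being computed: record that `collection`
    // depends on `member`.
    fn register(&mut self, member: &str, collection: &str) {
        self.links
            .entry(member.to_string())
            .or_insert_with(HashSet::new)
            .insert(collection.to_string());
    }

    // Called from the Commit handler when resource `subject` changes:
    // drain and return every dependent collection to invalidate.
    fn collections_to_invalidate(&mut self, subject: &str) -> Vec<String> {
        self.links
            .remove(subject)
            .map(|set| set.into_iter().collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut index = IncomingLinks::new();
    index.register("/person/john", "/collections/people?filter=john");
    index.register("/person/john", "/collections/people?sort=name");
    // A Commit touching /person/john invalidates both cached collections,
    // even if re-running them would yield identical members.
    let stale = index.collections_to_invalidate("/person/john");
    assert_eq!(stale.len(), 2);
    println!("invalidated: {:?}", stale);
}
```

Note that this only captures "member changed" dependencies; a resource that newly *matches* a collection's filter was never a member, so it has no incomingLinks entry, and those collections would need a separate invalidation path.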

Use TPF index / cache

#14

If we build an index for all values, most of the expensive part is solved. That just leaves sorting, which is still expensive.
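To make the cost split concrete, here is a toy (property, value) → subjects index, not the actual TPF implementation: the filter becomes a single lookup, but ordering the resulting members still touches every one of them.

```rust
use std::collections::{BTreeMap, HashSet};

// Illustrative triple-pattern-style index: (property, value) -> subjects.
struct ValueIndex {
    index: BTreeMap<(String, String), HashSet<String>>,
}

impl ValueIndex {
    fn new() -> Self {
        ValueIndex { index: BTreeMap::new() }
    }

    fn add(&mut self, property: &str, value: &str, subject: &str) {
        self.index
            .entry((property.to_string(), value.to_string()))
            .or_insert_with(HashSet::new)
            .insert(subject.to_string());
    }

    // Cheap: the filter part of a collection query is one map lookup.
    fn find(&self, property: &str, value: &str) -> Vec<String> {
        let mut members: Vec<String> = self
            .index
            .get(&(property.to_string(), value.to_string()))
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default();
        // The part the index does not solve: sorting is O(n log n) over
        // the members. Here we sort by subject as a stand-in for sorting
        // by an arbitrary property.
        members.sort();
        members
    }
}

fn main() {
    let mut idx = ValueIndex::new();
    idx.add("firstName", "john", "/person/2");
    idx.add("firstName", "john", "/person/1");
    assert_eq!(idx.find("firstName", "john"), vec!["/person/1", "/person/2"]);
    println!("indexed lookup ok");
}
```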
