This repository has been archived by the owner on Feb 29, 2024. It is now read-only.

Caching POA #1454

Merged
merged 8 commits on May 10, 2019

103 changes: 103 additions & 0 deletions — docs/design/013-caching/README.md
# Caching of data from ledger

Currently, whenever a credential definition or schema is needed, it is fetched from the ledger.
This operation may take multiple seconds and slows down the use of credentials.
Caching also enables the use of anoncreds in situations where the user has no internet coverage (e.g. using a passport credential at a foreign airport).

## Goals and ideas

* Allow users to cache credential definitions and schemas.
* Use the local wallet for the cache: although this data is public, possession of a particular credential definition or schema reveals possession of the corresponding credential.
* Provide a higher-level API for fetching this data so it is easier to use.
* Caching should be transparent to the user.
* Enable purging of old (no longer needed) data.

## Public API

Note: in all calls, `pool_handle` may be removed if a DID resolver is implemented.

```Rust
/// Gets the credential definition JSON for the specified credential definition id.
/// If the data is present in the cache, the cached data is returned.
/// Otherwise the data is fetched from the ledger and stored in the cache for future use.
///
/// #Params
/// command_handle: command handle to map callback to caller context.
/// pool_handle: pool handle (created by open_pool_ledger).
/// wallet_handle: wallet handle (created by open_wallet).
/// submitter_did: DID of the submitter stored in secured Wallet.
/// id: identifier of credential definition.
/// cb: Callback that takes command result as parameter.
#[no_mangle]
pub extern fn indy_get_cred_def(command_handle: IndyHandle,
                                pool_handle: IndyHandle,
                                wallet_handle: IndyHandle,
                                submitter_did: *const c_char,
                                id: *const c_char,
                                cb: Option<extern fn(command_handle_: IndyHandle,
                                                     err: ErrorCode,
                                                     cred_def_json: *const c_char)>) -> ErrorCode {
}

/// Gets the schema JSON for the specified schema id.
/// If the data is present in the cache, the cached data is returned.
/// Otherwise the data is fetched from the ledger and stored in the cache for future use.
///
/// #Params
/// command_handle: command handle to map callback to caller context.
/// pool_handle: pool handle (created by open_pool_ledger).
/// wallet_handle: wallet handle (created by open_wallet).
/// submitter_did: DID of the submitter stored in secured Wallet.
/// id: identifier of schema.
/// cb: Callback that takes command result as parameter.
#[no_mangle]
pub extern fn indy_get_schema(command_handle: IndyHandle,
                              pool_handle: IndyHandle,
                              wallet_handle: IndyHandle,
                              submitter_did: *const c_char,
                              id: *const c_char,
                              cb: Option<extern fn(command_handle_: IndyHandle,
                                                   err: ErrorCode,
                                                   schema_json: *const c_char)>) -> ErrorCode {
}
```

**Reviewer** (on `submitter_did`):

  1. I suggest removing the `submitter_did` param, as it can be hard-coded or auto-generated for each request.
  2. Do we also need to return a cache timestamp?
  3. I suggest adding an `options_json` param that would allow the caller to:
     * skip caching
     * provide a required freshness

**Author:** Skipping caching makes sense, in the meaning of "do not store the retrieved value in the cache". As for freshness, I think a schema or credential definition cannot be updated (without a change of id), so freshness is not needed. For the same reason I do not see a need for a cache timestamp.

**Reviewer:** In my understanding, CredDef and Schema can be updated: a schema can receive backward-compatible changes, and keys in a CredDef can be rotated in some cases. Also, some entities we plan to cache, such as a DID Doc, can be updated much more often. @ashcherbakov What do you think?
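The get-or-fetch behavior described in the doc comments above can be sketched as follows. This is a minimal illustration, assuming an in-memory map in place of the wallet and a closure in place of the ledger call; `get_cached_or_fetch` is a hypothetical helper, not part of the proposed API:

```rust
use std::collections::HashMap;

/// Illustrative cache lookup: the map stands in for the wallet, the closure
/// stands in for the ledger request (hypothetical names, not libindy internals).
fn get_cached_or_fetch(
    cache: &mut HashMap<String, String>,
    id: &str,
    fetch_from_ledger: impl FnOnce(&str) -> String,
) -> String {
    if let Some(json) = cache.get(id) {
        return json.clone(); // cache hit: no ledger round-trip
    }
    let json = fetch_from_ledger(id); // cache miss: go to the ledger
    cache.insert(id.to_string(), json.clone()); // store for future use
    json
}

fn main() {
    let mut cache: HashMap<String, String> = HashMap::new();
    let first = get_cached_or_fetch(&mut cache, "cred-def-1", |id| format!("{{\"id\":\"{}\"}}", id));
    // The second call is served from the cache; the ledger closure is never invoked.
    let second = get_cached_or_fetch(&mut cache, "cred-def-1", |_| unreachable!());
    assert_eq!(first, second);
    println!("cache hit ok");
}
```

This is also where the offline scenario from the goals falls out naturally: once the entry is in the cache, the ledger closure is never reached.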

## Storing of the data into wallet

Data would be stored under a dedicated cache record type so that it is kept separate and easy to manage.
The `schema_id` or `cred_def_id` would be used as the id of the wallet record.
This way the data can be fetched very efficiently and also easily deleted when needed.
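A minimal sketch of this keying scheme, assuming hypothetical record type names (`cache_cred_def`, `cache_schema`) and a plain map in place of the wallet:

```rust
use std::collections::HashMap;

// (record_type, record_id) acts as the wallet key in this sketch.
// The type names below are illustrative assumptions, not the actual ones.
type Key = (String, String);

fn cache_key(record_type: &str, id: &str) -> Key {
    (record_type.to_string(), id.to_string())
}

fn main() {
    let mut wallet: HashMap<Key, String> = HashMap::new();

    // Cached entries live under their own record type, separate from other wallet data.
    wallet.insert(cache_key("cache_cred_def", "cred-def-id-1"), "{}".to_string());
    wallet.insert(cache_key("cache_schema", "schema-id-1"), "{}".to_string());

    // Fetching is a direct key access: no scanning needed.
    assert!(wallet.contains_key(&cache_key("cache_schema", "schema-id-1")));

    // Deleting all cached credential definitions is a filtered removal by type.
    wallet.retain(|(record_type, _), _| record_type.as_str() != "cache_cred_def");
    assert_eq!(wallet.len(), 1);
    println!("wallet cache ok");
}
```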

## Purging the cache

Several methods may be implemented for purging the cached data:

#### Purge all
**Reviewer:** I believe we need to decide which exact option to use, or provide an API for configuring the cache. For example, we could add behavior configuration to the `indy_init()` method.

**Reviewer:** To define the purge behavior, I also suggest looking at the HTTP cache options:

Cache-Control: max-age=&lt;seconds&gt;
Cache-Control: max-stale[=&lt;seconds&gt;]
Cache-Control: min-fresh=&lt;seconds&gt;
Cache-Control: no-cache
Cache-Control: no-store
Cache-Control: no-transform
Cache-Control: only-if-cached

In this case we can define exact behavior for each entity.

**Reviewer:** @dkulic @jovfer @ashcherbakov @dhh1128 What do you think of the above?

**Author:** I am not sure I follow your thoughts on defining exact behavior for each entity. Do you mean attaching a specific caching policy to each entity, so that the purging process may selectively purge only entities that satisfy their own policy? If so, that seems too complicated...

**@vimmerru** (Mar 7, 2019)**:** @dkulic The idea is very simple:

  1. When the app asks for some entity, it can explicitly say how long it wants that entity to stay in the cache.
  2. On start, the app can call a purge method that removes all outdated records.

**Author:** That sounds doable. :)

**Reviewer:** Also, purge could take a `force` param that removes all cache records.

**Author:** I like the timeout idea, because you may want to lock some items into the cache forever (e.g. so you can use your passport at a foreign airport, without internet).

**Author:** After some thought, I realized I do not like the idea of purging on start and specifying a TTL for every entity. Normally you would want data to stay in the cache for a long time, so instead you may choose the max age of the data when making a request. I would modify it like this:

  1. When requesting data, the app may pass cache-control options (e.g. max-age, no-cache, no-store, only-if-cached).
  2. The purge method would take an option for specifying max-age (or similar), so that older data is deleted.

Advantages:
* Very simple to implement.
* Only one method would be needed (e.g. `indy_purge_cache`).

Disadvantages:
* Low selectivity of purging
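
One way a single `indy_purge_cache` method could behave, sketched under the assumption of a max-age option as discussed above (the `Entry` struct, the `purge_cache` helper, and the second-based clock are all illustrative, not the actual API):

```rust
use std::collections::HashMap;

// A cached entry tagged with the time it was stored (seconds; illustrative clock).
struct Entry {
    json: String,
    stored_at: u64,
}

// Hypothetical purge behavior: drop every record older than `max_age` seconds;
// `max_age == 0` is treated here as a "force" purge of everything.
fn purge_cache(cache: &mut HashMap<String, Entry>, now: u64, max_age: u64) {
    cache.retain(|_, entry| max_age > 0 && now.saturating_sub(entry.stored_at) <= max_age);
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert("old".to_string(), Entry { json: "{}".to_string(), stored_at: 100 });
    cache.insert("fresh".to_string(), Entry { json: "{}".to_string(), stored_at: 990 });

    // Keep only entries at most 60 seconds old at time 1000.
    purge_cache(&mut cache, 1000, 60);
    assert!(cache.contains_key("fresh"));
    assert!(!cache.contains_key("old"));

    // max_age == 0: force-purge everything.
    purge_cache(&mut cache, 1000, 0);
    assert!(cache.is_empty());
    println!("purge ok");
}
```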

#### Purge all of one type

Advantages:
* Very simple to implement.
* Only one method per cache type is needed (e.g. `indy_purge_cred_def_cache`).

Disadvantages:
* Low selectivity of purging

#### LRU mechanism limited by size/number of cached data

Advantages:
* Bounds the amount of wallet data.
* Only useful data is kept; older, unneeded data is purged automatically.

Disadvantages:
* Complex to implement.
* Every fetch from the cache also requires updating a last-used timestamp.

Additionally, some data could be locked in so it is always present (useful for the no-internet scenario).
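
The LRU idea with locked-in entries can be sketched as follows; this illustrates the bookkeeping only (the names and structure are assumptions, and a real implementation would sit on top of the wallet rather than an in-memory map):

```rust
use std::collections::HashMap;

struct LruEntry {
    json: String,
    last_used: u64, // updated on every read, as the disadvantage above notes
    pinned: bool,   // locked-in entries survive eviction (offline scenario)
}

struct LruCache {
    entries: HashMap<String, LruEntry>,
    capacity: usize,
    clock: u64, // logical clock standing in for real timestamps
}

impl LruCache {
    fn new(capacity: usize) -> Self {
        LruCache { entries: HashMap::new(), capacity, clock: 0 }
    }

    fn get(&mut self, id: &str) -> Option<String> {
        self.clock += 1;
        let clock = self.clock;
        self.entries.get_mut(id).map(|e| {
            e.last_used = clock; // every read updates the last-used timestamp
            e.json.clone()
        })
    }

    fn put(&mut self, id: &str, json: &str, pinned: bool) {
        self.clock += 1;
        // When full, evict the least recently used *unpinned* entry.
        if self.entries.len() >= self.capacity && !self.entries.contains_key(id) {
            if let Some(victim) = self
                .entries
                .iter()
                .filter(|(_, e)| !e.pinned)
                .min_by_key(|(_, e)| e.last_used)
                .map(|(k, _)| k.clone())
            {
                self.entries.remove(&victim);
            }
        }
        self.entries.insert(
            id.to_string(),
            LruEntry { json: json.to_string(), last_used: self.clock, pinned },
        );
    }
}

fn main() {
    let mut cache = LruCache::new(2);
    cache.put("passport", "{}", true); // pinned: always available offline
    cache.put("a", "{}", false);
    cache.put("b", "{}", false); // evicts "a", the least recently used unpinned entry
    assert!(cache.get("passport").is_some());
    assert!(cache.get("a").is_none());
    assert!(cache.get("b").is_some());
    println!("lru ok");
}
```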