# Various ADRs regarding JMAP and blobStore performance enhancements (#170)

**src/adr/0012-jmap-partial-reads.md**

# 12. Projections for JMAP Messages

Date: 2019-10-09

## Status

Proposed

Adoption needs to be backed by some performance tests.

## Context

JMAP core RFC 8620 requires that the server respond with only the properties requested by the client.

James currently computes all of the properties regardless of their cost, and regardless of whether the client asked for them.

We can clearly save latency and resources by avoiding reading or computing expensive properties that were not explicitly requested by the client.

## Decision

Introduce two new data structures representing JMAP messages:
- One with only metadata
- One with metadata + headers

Given the properties requested by the client, the most appropriate message data structure will be computed, as sketched below, on top of
existing message storage APIs, which should remain unchanged.
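
A minimal sketch of how the read level could be derived from the requested properties. The names (`ReadLevel`, the property sets, `neededReadLevel`) are hypothetical and only illustrate the decision, not an actual James API:

```java
import java.util.HashSet;
import java.util.Set;

public class MessageProjectionChooser {

    // The two new data structures, plus the existing full read as a fallback.
    enum ReadLevel {
        METADATA,   // flags, size, internal date, ids...
        HEADERS,    // METADATA plus parsed headers
        FULL        // existing behaviour: read the whole message
    }

    // Hypothetical property groupings, for illustration only.
    private static final Set<String> METADATA_PROPERTIES =
        Set.of("id", "blobId", "threadId", "mailboxIds", "size", "keywords", "date");
    private static final Set<String> HEADER_PROPERTIES =
        Set.of("subject", "from", "to", "cc", "bcc", "replyTo", "headers");

    // Choose the cheapest data structure able to answer the request.
    static ReadLevel neededReadLevel(Set<String> requestedProperties) {
        if (METADATA_PROPERTIES.containsAll(requestedProperties)) {
            return ReadLevel.METADATA;
        }
        Set<String> metadataAndHeaders = new HashSet<>(METADATA_PROPERTIES);
        metadataAndHeaders.addAll(HEADER_PROPERTIES);
        if (metadataAndHeaders.containsAll(requestedProperties)) {
            return ReadLevel.HEADERS;
        }
        return ReadLevel.FULL; // body, attachments, preview... need a full read
    }
}
```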

Some performance tests will be run in order to evaluate the improvements.

## Consequences

GetMessages with a limited set of requested properties will no longer necessarily result in a full database message read. We
thus expect a significant improvement, for instance when only metadata is requested.

If the improvement is less than 5%, the code will not be added to the codebase and the proposal will get the status 'rejected'.

## References

- /get method: https://tools.ietf.org/html/rfc8620#section-5.1
- [JIRA](https://issues.apache.org/jira/browse/JAMES-2919)
**src/adr/0013-precompute-jmap-preview.md**

# 13. Precompute JMAP Email preview

Date: 2019-10-09

## Status

Proposed

Adoption needs to be backed by some performance tests.

## Context

JMAP messages have a handy preview property holding the first 256 characters of meaningful text of a message.

This property is often displayed in JMAP clients' message listings, so it is queried a lot.

Currently, to get the preview, James retrieves the full message body, parses it using MIME parsers, removes HTML, and keeps the meaningful text, as sketched below.
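
A minimal sketch of that computation, assuming jsoup for HTML removal; the MIME traversal locating the body part is omitted, and all names are hypothetical rather than the actual James code:

```java
import org.jsoup.Jsoup;

public class PreviewComputer {
    private static final int MAX_PREVIEW_LENGTH = 256;

    // Strip the HTML tags, keep the meaningful text, truncate to 256 characters.
    static String computePreview(String bodyContent) {
        String text = Jsoup.parse(bodyContent).text().trim();
        return text.length() <= MAX_PREVIEW_LENGTH
            ? text
            : text.substring(0, MAX_PREVIEW_LENGTH);
    }
}
```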

## Decision

We should pre-compute message previews.

A MailboxListener will compute the preview and store it in a MessagePreviewStore.

We should have Cassandra and memory implementations.

When the preview is precomputed, the "preview" property can be served as metadata for those messages.

When the preview is not precomputed, we should compute it on demand and save the result for later (see the sketch below).

We should provide a webAdmin task allowing the projection to be rebuilt. Computing and storing in MessagePreviewStore
is idempotent, so the task can be run on a live system without any concurrency problem.
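
A hedged sketch of the read-through behaviour: MessagePreviewStore comes from this ADR, but the exact signatures, the use of String ids, and the `PreviewReader` helper are assumptions for illustration. A MailboxListener would call `store` when a message is added:

```java
import java.util.Optional;
import java.util.function.Supplier;

interface MessagePreviewStore {
    void store(String messageId, String preview);     // idempotent
    Optional<String> retrieve(String messageId);
}

class PreviewReader {
    private final MessagePreviewStore store;

    PreviewReader(MessagePreviewStore store) {
        this.store = store;
    }

    // Serve precomputed previews cheaply; on a miss, fall back to the
    // expensive computation and persist the result for later requests.
    String preview(String messageId, Supplier<String> expensiveComputation) {
        return store.retrieve(messageId)
            .orElseGet(() -> {
                String preview = expensiveComputation.get();
                store.store(messageId, preview);   // safe to redo: idempotent
                return preview;
            });
    }
}
```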

Some performance tests will be run in order to evaluate the improvements.

## Consequences

We expect a huge performance enhancement for JMAP clients relying on preview for listing mails.

If the improvement is less than 5%, the code will not be added to the codebase and the proposal will get the status 'rejected'.

## References

- https://jmap.io/server.html#1-emails The JMAP server guide states that the preview needs to be quick to retrieve

- A similar decision was taken at FastMail: https://fastmail.blog/2014/12/15/dec-15-putting-the-fast-in-fastmail-loading-your-mailbox-quickly/

- [JIRA](https://issues.apache.org/jira/browse/JAMES-2919)
**src/adr/0014-blobstore-storage-policies.md**

# 14. Add storage policies for BlobStore

Date: 2019-10-09

## Status

Proposed

Adoption needs to be backed by some performance tests, as well as by measuring how the data repartition between Cassandra and object storage shifts.

## Context

James exposes a simple BlobStore API for storing raw data. However, such raw data often varies in size and access pattern.

As an example:

- Mailbox message headers are expected to be small and frequently accessed
- Mailbox message bodies are expected to range from small to big but are infrequently accessed
- DeletedMessageVault message headers are expected to be small and infrequently accessed

Also, the various implementations of BlobStore have different strengths:

- CassandraBlobStore is efficient for small blobs and offers low latency. However it is known to be expensive for big blobs, and Cassandra storage is expensive.
- The Object Storage blob store is good at storing big blobs, but it induces higher latencies than Cassandra for small blobs, for a cost gain that isn't worth it.

Thus, significant performance and cost improvements could be unlocked by using the right blob store for the right blob.

## Decision

Introduce StoragePolicies at the level of the BlobStore API.

The proposed policies include:

- SizeBasedStoragePolicy: the blob's underlying storage medium will be chosen depending on its size.
- LowCostStoragePolicy: the blob is expected to be saved in low-cost storage. Access is expected to be infrequent.
- PerformantStoragePolicy: the blob is expected to be saved in performant storage. Access is expected to be frequent.

A HybridBlobStore will replace the current UnionBlobStore and will choose between the Cassandra and ObjectStorage implementations depending on the policy, as sketched below.
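
A minimal sketch of the policy dispatch, assuming a simplified blob store interface; the size threshold and all names other than HybridBlobStore and the policies (which come from this ADR) are illustrative assumptions:

```java
import java.util.Optional;

// Simplified stand-in for the real blob store implementations.
interface Blobs {
    String save(byte[] data);
    Optional<byte[]> read(String blobId);
}

enum StoragePolicy { SIZE_BASED, LOW_COST, PERFORMANT }

class HybridBlobStore {
    private static final int SIZE_THRESHOLD = 32 * 1024; // assumed cut-off, to be tuned

    private final Blobs cassandra;      // low latency, expensive storage
    private final Blobs objectStorage;  // cheap storage, higher latency

    HybridBlobStore(Blobs cassandra, Blobs objectStorage) {
        this.cassandra = cassandra;
        this.objectStorage = objectStorage;
    }

    String save(byte[] data, StoragePolicy policy) {
        switch (policy) {
            case LOW_COST:
                return objectStorage.save(data);
            case PERFORMANT:
                return cassandra.save(data);
            default: // SIZE_BASED
                return data.length <= SIZE_THRESHOLD
                    ? cassandra.save(data)
                    : objectStorage.save(data);
        }
    }

    // Reads query both stores, which is why no migration is required.
    Optional<byte[]> read(String blobId) {
        return cassandra.read(blobId)
            .or(() -> objectStorage.read(blobId));
    }
}
```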

DeletedMessageVault, BlobExport & MailRepository will rely on LowCostStoragePolicy. Other BlobStore users will rely on SizeBasedStoragePolicy.
> **Member:** Do we really need to use the policy-based blob store for all our blob store uses? I'm not sure LowCostStoragePolicy and PerformantStoragePolicy are really worth it.
>
> **Contributor (author):** LowCostStoragePolicy -> for DeletedMessageVault (mail headers).
>
> **Contributor (author):** PerformantStoragePolicy might not be worth it though...
>
> **Member:** I mean, why configure DeletedMessageVault with a particular blob store policy when we probably only ever want to store its data in Swift?
>
> **Member:** Because having to declare what we expect in terms of storage policy is better than having to know that store X has this specific behavior.

Some performance tests will be run in order to evaluate the improvements.

## Consequences

We expect small, frequently accessed blobs to be located in Cassandra, allowing ObjectStorage to be used mainly for large, costly blobs.

If the improvement is less than 5%, the code will not be added to the codebase and the proposal will get the status 'rejected'.

We expect more data to be stored in Cassandra. We need to quantify this for adoption.

As reads will query both blobStores, no migration is required to use this composite blobStore on top of an existing implementation;
however, we will benefit from the performance enhancements only for newly stored blobs.

## References

- [JIRA](https://issues.apache.org/jira/browse/JAMES-2921)
**src/adr/0015-objectstorage-blobid-list.md**

# 15. Persist BlobIds to avoid persisting the same blobs several times within ObjectStorage

Date: 2019-10-09

## Status

Proposed

Adoption needs to be backed by some performance tests.

## Context

A given mail is often written to the blob store by different components, and mail traffic is heavily duplicated (several recipients receiving similar emails, same attachments). This causes a given blob to often be persisted several times.

Cassandra was the first implementation of the blobStore. Cassandra is a heavily write-optimized NoSQL database, so one can assume writes on top of it to be fast. Thus we assumed we could always overwrite blobs.

This usage pattern was also adopted for BlobStore on top of ObjectStorage.

However, writing to Object Storage:
- Takes time
- Is billed by most cloud providers

Thus choosing a strategy that avoids writing the same blob twice is desirable.

However, the ObjectStorage (OpenStack Swift) `exist` method was not efficient enough to be a real cost and performance saver.

## Decision

Rely on a StoredBlobIdsList API to know which blobs are persisted in object storage. Provide a Cassandra implementation of it.
Located in blob-api for convenience, this is not a top-level API: it is intended to be used by some blobStore implementations
(here only ObjectStorage). We will provide a CassandraStoredBlobIdsList in the blob-cassandra project so that Guice products combining
object storage and Cassandra can define a binding to it.

- When saving a blob with a precomputed blobId, we can check the existence of the blob in storage, possibly avoiding the expensive save.
- When saving a blob too big to precompute its blobId, once the blob has been streamed using a temporary random blobId, the copy operation can be avoided and the temporary blob can be removed directly.

Cassandra is probably faster doing "write every time" than "read before write", so we should not use the stored blob projection for it. A sketch of the ObjectStorage case follows.
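
A hedged sketch of how an ObjectStorage DAO could use the StoredBlobIdsList API from this ADR for the precomputed-blobId case; the exact signatures and the `DeduplicatingObjectStorageDao` name are assumptions for illustration:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical shape of the API proposed by this ADR.
interface StoredBlobIdsList {
    boolean exists(String blobId);   // backed by Cassandra
    void add(String blobId);         // recorded only after a successful save
}

class DeduplicatingObjectStorageDao {
    private final StoredBlobIdsList storedIds;

    DeduplicatingObjectStorageDao(StoredBlobIdsList storedIds) {
        this.storedIds = storedIds;
    }

    String save(byte[] data) {
        String blobId = sha256Hex(data);     // precomputed, content-addressed id
        if (storedIds.exists(blobId)) {
            return blobId;                   // skip the expensive object storage write
        }
        putToObjectStorage(blobId, data);
        storedIds.add(blobId);               // persist the id only once the blob is saved
        return blobId;
    }

    private void putToObjectStorage(String blobId, byte[] data) {
        // PUT to the Swift container (omitted)
    }

    private static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```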

Some performance tests will be run in order to evaluate the improvements.

## Consequences
> **Member:** I wonder how it will interact with the blob garbage collection we plan to implement.
>
> **Contributor (author):** Good question! And I have an answer for it: as you plan to add a generation component to the blob id, blobs will differ across generations, and this id list will thus be specific to a generation. We can safely remove entries from it once we know no new blobs will be generated with the old generation.
>
> However, the ADRs for blob deletion have not been presented yet, so detailing the interactions is hard.
>
> Another argument: blob deletion acts at the BlobStore level, while this proposed implementation will be at the ObjectStorageBlobDAO level, which should limit conflicts.

We expect to reduce the amount of writes to the object storage. This is expected to bring:
- reduced operational costs on cloud providers
- performance improvements
- latency reduction under load

As id persistence in StoredBlobIdsList will be done once the blob is successfully saved, inconsistencies in StoredBlobIdsList
will lead to duplicated saved blobs, which is the current behaviour.

If the improvement is less than 5%, the code will not be added to the codebase and the proposal will get the status 'rejected'.

## References

- Previous optimization proposal using blob existence checks before persisting. This work was done using the ObjectStorage `exist` method and was proven not efficient enough: https://github.com/linagora/james-project/pull/2011 (V2)

- [JIRA](https://issues.apache.org/jira/browse/JAMES-2921)