Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve blob store efficiency #2011

Closed

Conversation

chibenwa
Copy link
Member

@chibenwa chibenwa commented Dec 7, 2018

Here is a list of improvements proposed:

  • [v1] : when we have the byte array at hand we can directly compute the blobId and save and expensive rename...
  • [v2]: We could perform read before writes to avoid saving a blob that already exists
  • [v3]: We can position the data length when doable
  • [v4]: Allow Store to choose between byte[]and InputStream as an underlying representation. Note that byte[] makes more sense given our implementation of MimeMessageStore

Here is the master run we should refer to:

rr_base

Here is [v1]

rr_v1

Here is [v2]

rr_v2

Here is [v3] (not present in the changeset)

rr_v3

Here is [v4]

rr_v4


Conclusion:

  • Moving swift blobs is expensive. Significant improvements come from exploiting the byte[] format of our data (what v1 & v4 does). This leads to drastic latency reduction (p99) as well as to a boost of throughtput.

Note that this higher throughput triggers some unwanted Cassandra/reactor concerns....

  • [v3] Position content-length has a neglictible impact...

  • [v2] Read before writes have a slightly negative impact. Thus it was not included in this changeset.

@chibenwa chibenwa added the perf Contributes some performance enhencements label Dec 7, 2018
@chibenwa chibenwa added this to the Sprint 1 - Robusta beans milestone Dec 7, 2018
@rouazana
Copy link

rouazana commented Dec 7, 2018

why is v4 introducing errors?

@chibenwa
Copy link
Member Author

chibenwa commented Dec 7, 2018

Errors are either 400 without cause (null) or the error i posted in the channel.

Higher loads might also create issues

@chibenwa
Copy link
Member Author

test this please

Copy link

@mbaechler mbaechler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correcly, all this allows, given a specific context, to choose whether we want to save blobs from raw arrays or inputstreams.
It happens to benefit to performance for the swift use case, great, but it's just an accident.
What if we don't have raw array for important use cases in the future ?
If move is slow, can't we ensure the payload is always passed with the BlobId so that we have opportunities to compute it at the best time possible in the server ?

@@ -31,6 +32,14 @@ public Payload write(InputStream is) {
return Payloads.newInputStreamPayload(is);
}

@Override
public Payload write(byte[] bytes) {
if (bytes.length == 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need this weird case ?

Copy link
Member Author

@chibenwa chibenwa Dec 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in that case writing the byte array fails.

But if I do handle it as an InputStream it works.

@chibenwa
Copy link
Member Author

test this please

ManageableMailQueueContract.browsingShouldNotAffectDequeue:496 expected:<"name[1]"> but was:<"name[2]">

@chibenwa
Copy link
Member Author

Can be considered 🍏

@chibenwa chibenwa force-pushed the improve-blob-store-efficiency-v4 branch from 12a6884 to 97d266c Compare December 11, 2018 07:04
@chibenwa
Copy link
Member Author

chibenwa commented Dec 11, 2018

If I understand correcly, all this allows, given a specific context, to choose whether we want to save blobs from raw arrays or inputstreams.

Very true.

What if we don't have raw array for important use cases in the future ?

Well, we will not benefit from the performance improvement for that feature. (mailbox export I am thinking to you)

I don't understand why we have to pay the huge price of move when we do not need it.

If move is slow, can't we ensure the payload is always passed with the BlobId so that we have opportunities to compute it at the best time possible in the server ?

I don't understand this.

the payload is always passed with the BlobId the big deal is that the BlobId depends on the payload.

If the payload is not yet known, (ie not yet fully read InputStream) then so do the blobId.

Not sure we do want to consume the stream locally to compute the blobId before passing it to swift....

Yet I believe I did prove here that move do have a huge impact.

@rouazana
Copy link

test this please

@chibenwa
Copy link
Member Author

chibenwa commented Jan 4, 2019

Status: InputStream::reset seems a dead end as I don't succeed to do anything valuable with this.

We could merge V1 & V3 as it has a positive impact, no? And let the v4 part for later...

@blackheaven blackheaven force-pushed the improve-blob-store-efficiency-v4 branch from 97d266c to 5544345 Compare April 23, 2019 08:41
}

public static class InputStreamToSave implements ValueToSave {
private final FixedLengthInputStream inputStream;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be removed soon, no?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time to remove this?

@blackheaven blackheaven force-pushed the improve-blob-store-efficiency-v4 branch from 5544345 to 2b08491 Compare May 14, 2019 12:59
@blackheaven
Copy link

[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [INFO] Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.123 s - in JUnit Vintage
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [INFO] 
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [INFO] Results:
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [INFO] 
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [ERROR] Failures: 
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenImapDeleteMessage:473->exportAndGetFileLocationFromLastMail:732 1 expectation failed.
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] JSON path status doesn't match.
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] Expected: is "completed"
[d9a57c484c3c3d1f73528d96d515014d415d6bd6]   Actual: failed
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] 
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenImapDeletedMailbox:498->exportAndGetFileLocationFromLastMail:732 1 expectation failed.
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] JSON path status doesn't match.
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] Expected: is "completed"
[d9a57c484c3c3d1f73528d96d515014d415d6bd6]   Actual: failed
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] 
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenJmapDeleteMessage:450->exportAndGetFileLocationFromLastMail:732 1 expectation failed.
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] JSON path status doesn't match.
[d9a57c484c3c3d1f73528d96d515014d415d6bd6] Expected: is "completed"
[d9a57c484c3c3d1f73528d96d515014d415d6bd6]   Actual: failed

test this please

@blackheaven
Copy link

test this please

1 similar comment
@blackheaven
Copy link

test this please

Copy link
Member Author

@chibenwa chibenwa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you validate the performance enhencements?

The gatling runs are quite outdated...

@blackheaven
Copy link

Can you validate the performance enhencements?

The gatling runs are quite outdated...

I'd like to, but benchs are broken

@blackheaven
Copy link

[ERROR]   SpamAssassinContract.deletingSpamMessageShouldNotImpactSpamAssassinState:557 » ConditionTimeout

test this please

@blackheaven
Copy link

[ERROR]   SpamAssassinContract.imapCopiesToSpamMailboxShouldBeConsideredAsSpam:194 » ConditionTimeout

test this please

@blackheaven
Copy link


[bbbfd140e76985ec5085680f539684f7f486e7ab] [INFO] Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 63.794 s - in JUnit Vintage
[bbbfd140e76985ec5085680f539684f7f486e7ab] [INFO] 
[bbbfd140e76985ec5085680f539684f7f486e7ab] [INFO] Results:
[bbbfd140e76985ec5085680f539684f7f486e7ab] [INFO] 
[bbbfd140e76985ec5085680f539684f7f486e7ab] [ERROR] Failures: 
[bbbfd140e76985ec5085680f539684f7f486e7ab] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenImapDeleteMessage:473->exportAndGetFileLocationFromLastMail:732 1 expectation failed.
[bbbfd140e76985ec5085680f539684f7f486e7ab] JSON path status doesn't match.
[bbbfd140e76985ec5085680f539684f7f486e7ab] Expected: is "completed"
[bbbfd140e76985ec5085680f539684f7f486e7ab]   Actual: failed
[bbbfd140e76985ec5085680f539684f7f486e7ab] 
[bbbfd140e76985ec5085680f539684f7f486e7ab] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenImapDeletedMailbox:498->exportAndGetFileLocationFromLastMail:732 1 expectation failed.
[bbbfd140e76985ec5085680f539684f7f486e7ab] JSON path status doesn't match.
[bbbfd140e76985ec5085680f539684f7f486e7ab] Expected: is "completed"
[bbbfd140e76985ec5085680f539684f7f486e7ab]   Actual: failed
[bbbfd140e76985ec5085680f539684f7f486e7ab] 
[bbbfd140e76985ec5085680f539684f7f486e7ab] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenJmapDeleteMessage:450->exportAndGetFileLocationFromLastMail:732 1 expectation failed.
[bbbfd140e76985ec5085680f539684f7f486e7ab] JSON path status doesn't match.
[bbbfd140e76985ec5085680f539684f7f486e7ab] Expected: is "completed"
[bbbfd140e76985ec5085680f539684f7f486e7ab]   Actual: failed

test this please

@blackheaven blackheaven force-pushed the improve-blob-store-efficiency-v4 branch 2 times, most recently from c6993dd to 30dc0b9 Compare July 1, 2019 13:42
@blackheaven
Copy link

[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] [ERROR] Failures: 
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenImapDeleteMessage:490->exportAndGetFileLocationFromLastMail:748 1 expectation failed.
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] JSON path status doesn't match.
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] Expected: is "completed"
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35]   Actual: failed
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] 
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenImapDeletedMailbox:515->exportAndGetFileLocationFromLastMail:748 1 expectation failed.
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] JSON path status doesn't match.
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] Expected: is "completed"
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35]   Actual: failed
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] 
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] [ERROR]   DeletedMessagesVaultTest.vaultExportShouldExportZipContainsVaultMessagesToShareeWhenJmapDeleteMessage:467->exportAndGetFileLocationFromLastMail:748 1 expectation failed.
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] JSON path status doesn't match.
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35] Expected: is "completed"
[30dc0b9f04d5bd3fc7121aedf8af8a972a356d35]   Actual: failed

test this please

@blackheaven
Copy link

[088123b1cc97da4b48bc277a04f27604971615be] docker: Error response from daemon: Cannot link to a non running container: /rabbitmq-improve-blob-store-efficiency-v4-c0239dda-158a-405c-9829-430270953a63 AS /james-server-cassandra-rabbitmq-improve-blob-store-efficiency-v4-c0239dda-158a-405c-9829-430270953a63/rabbitmq.

test this please

@blackheaven
Copy link

Before:

master

After:

benwa

@blackheaven blackheaven force-pushed the improve-blob-store-efficiency-v4 branch from 088123b to fc5dd8a Compare July 4, 2019 09:09
@blackheaven blackheaven closed this Jul 4, 2019
@blackheaven blackheaven force-pushed the improve-blob-store-efficiency-v4 branch from fc5dd8a to 2548451 Compare July 4, 2019 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf Contributes some performance enhencements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants