# CoreCrypto Transactions - Performance Improvements

## Current context
After the [initial event performance investigation](../event-performance-investigation/PerformanceImprovements.ipynb), we have cut processing time.
However, decrypting events still takes ~37% of event processing time. It is the largest single cost that is common to all messages.

By decryption, we mean calls like `proteusClient.decryptMessage(session, data)` or `mlsClient.decryptMessage(groupId, message)`, which internally
write data to the filesystem to store the new cryptographic state of the relevant Proteus session or MLS group.

Writing to the filesystem is... _s l o w_.

## Improvements

The CoreCrypto team has been working on improving the performance of the decryption process by introducing a transaction API.
The theory is that by using transactions, we can reduce the number of filesystem operations: we decrypt multiple messages and defer persistence to a single file write operation at the end.
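To make the idea concrete, here is a minimal sketch of per-message persistence versus one write per batch. Everything in it (`CountingStore`, `decryptStep`, `decryptOneByOne`, `decryptInTransaction`) is a hypothetical stand-in invented for illustration. It is not the CoreCrypto API and does no real cryptography; it only counts how often the state would be written.

```kotlin
// Hypothetical stand-in for the on-disk crypto state; NOT the CoreCrypto API.
class CountingStore {
    var writes = 0
        private set
    var persistedState = 0
        private set

    // Simulates one filesystem write of the session/group state.
    fun persist(state: Int) {
        persistedState = state
        writes++
    }
}

// Toy "decryption": every message advances the state by one step.
fun decryptStep(state: Int, message: String): Int = state + 1

// Before: each decrypted message persists the new state immediately.
fun decryptOneByOne(store: CountingStore, messages: List<String>) {
    var state = store.persistedState
    for (message in messages) {
        state = decryptStep(state, message)
        store.persist(state)
    }
}

// After: the state stays in memory for the whole batch and is persisted once.
fun decryptInTransaction(store: CountingStore, messages: List<String>) {
    if (messages.isEmpty()) return
    val finalState = messages.fold(store.persistedState) { state, message ->
        decryptStep(state, message)
    }
    store.persist(finalState)
}
```

For a batch of 2000 messages, the first function calls `persist` 2000 times and the second calls it once, while both end in the same persisted state. That difference in write count is the effect the benchmark tries to measure.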

Adapting to the new transaction format required extensive changes to the way we handle events, so here we are looking for a before/after comparison.

## Data collection

Data was collected by running the [PocIntegrationTest](../../../tango-tests/src/integrationTest/kotlin/PocIntegrationTest.kt) test suite before and after the changes to decrypt multiple messages in a single transaction. The results are in the neighbouring [proteus_benchmark_summary.csv](./proteus_benchmark_summary.csv) file.

### Conditions of the test
The tests are completely end-to-end, with a mocked network layer.
This means that the test simulates a user logging in, doing the initial sync, etc., so we have a complete, realistic client setup.
We begin measuring the time only after this initial setup is done, and we are ready to start processing events.
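As a rough sketch of that measurement boundary, the helper below runs the setup untimed and times only the processing step. `timeAfterSetup` and its parameters are hypothetical names for illustration; the actual test suite may structure this differently. `measureTimeMillis` comes from the Kotlin standard library.

```kotlin
import kotlin.system.measureTimeMillis

// Hypothetical helper: runs the setup untimed, then times only the processing.
fun <T> timeAfterSetup(setup: () -> T, process: (T) -> Unit): Long {
    // Login, initial sync, etc. happen here and are NOT part of the measurement.
    val context = setup()
    // Only the event processing itself is measured, in milliseconds.
    return measureTimeMillis { process(context) }
}
```

A call would look like `timeAfterSetup(setup = { createClient() }, process = { client -> client.processEvents(events) })`, where `createClient` and `processEvents` are again placeholders.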

Let's load and explore the collected data:

In [77]:
%use kandy
%use dataframe

@file:DependsOn("org.jetbrains.kotlinx:kotlinx-datetime:0.6.1")

val proteusBenchmarkSummary = DataFrame.readCsv("proteus_benchmark_summary.csv", delimiter = ',')
proteusBenchmarkSummary

| Scenario | Before (ms) | After (ms) |
|---|---|---|
| 2000 messages from 1 conversation | 10218 | 6046 |
| 1000 messages across 5 conversations | 5025 | 2920 |
| 5000 messages across 6 conversations | 24768 | 13717 |


In [78]:
import org.jetbrains.letsPlot.core.spec.plotson.scale

fun plotRow(row: DataRow<Line_104_jupyter._DataFrameType2>) = plot {
    bars {
        layout.title = "Time to process ${row.Scenario}, lower is better"
        y(listOf(row.`Before (ms)`, row.`After (ms)`), name = "ms")
        x(listOf("Before", "With Transactions"))
        fillColor("x") {
            scale = continuous(Color.GREEN..Color.LIGHT_GREEN)
        }
    }
}
plotRow(proteusBenchmarkSummary[0])

In [79]:
plotRow(proteusBenchmarkSummary[1])

In [80]:
plotRow(proteusBenchmarkSummary[2])

## Analysis
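The three graphs look very similar, which suggests two things:
1. Processing time scales roughly linearly with the number of messages: about 5 ms per message before the change and about 3 ms after, in all three scenarios.
2. The new approach is consistently faster, regardless of how the messages are spread across conversations.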
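The first point can be checked by dividing each total by its message count. The sketch below hard-codes the numbers from the benchmark summary; the `Scenario` class and the `msPerMessage` helper are names introduced only for this illustration.

```kotlin
// Numbers copied from proteus_benchmark_summary.csv; the message counts come
// from the scenario names.
data class Scenario(val name: String, val messages: Int, val beforeMs: Int, val afterMs: Int)

val scenarios = listOf(
    Scenario("2000 messages from 1 conversation", 2000, 10218, 6046),
    Scenario("1000 messages across 5 conversations", 1000, 5025, 2920),
    Scenario("5000 messages across 6 conversations", 5000, 24768, 13717),
)

// Average time spent per message, in milliseconds.
fun msPerMessage(totalMs: Int, messages: Int): Double = totalMs.toDouble() / messages

// One (scenario name, ms/message before, ms/message after) triple per scenario.
val perMessage = scenarios.map {
    Triple(it.name, msPerMessage(it.beforeMs, it.messages), msPerMessage(it.afterMs, it.messages))
}
```

The per-message cost stays within a narrow band across scenarios (roughly 4.95 to 5.11 ms before, 2.74 to 3.02 ms after), which is what a roughly linear relationship would look like.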

In [81]:
// Fraction of the original time saved per scenario (0.42 means 42% less time).
val improvementPercentages = proteusBenchmarkSummary.map { row ->
    1 - (row.`After (ms)`.toDouble() / row.`Before (ms)`.toDouble())
}
plot {
    points {
        x(improvementPercentages, "Improvement %")
        x.axis.limits = 0.0..1.0
        y(proteusBenchmarkSummary.map { row -> row.Scenario }, "Scenario")
        color = Color.LIGHT_GREEN
    }
}

In [82]:
import java.text.NumberFormat

val formatter = NumberFormat.getPercentInstance()
formatter.maximumFractionDigits = 2
formatter.format(improvementPercentages.average())

42,45 %

## Results

So yeah, it looks like a quick ~42% average improvement across the board (roughly 41% to 45% depending on the scenario). Yay :)

Kudos to everyone involved.