<h1>
    Introduction
    </h1>
 <hr>

<h3>
    Naive Tx Validation
</h3>
<hr>

<hr>

<center>
    <h1> Individual Component Benchmarks </h1>
    </center>
    <hr>

<h4>Understanding these reports:</h4>
                <p>The plot on the top displays the average time per iteration for this benchmark. The shaded region
                    shows the estimated probability of an iteration taking a certain amount of time, while the line
                    shows the mean. Click on the plot for a larger view showing the outliers.</p>
                <p>The plot on the bottom shows the linear regression calculated from the measurements. Each point
                    represents a sample, though here it shows the total time for the sample rather than time per
                    iteration. The line is the line of best fit for these measurements.</p>
                <p>See <a href="https://bheisler.github.io/criterion.rs/book/user_guide/command_line_output.html#additional-statistics">the
                        documentation</a> for more details on the additional statistics.</p>
                        <hr>
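
As a rough illustration of what the regression plot summarizes, the sketch below fits a line through the origin to (iterations, total time) samples. It assumes the through-the-origin least-squares form, slope = sum(x*y) / sum(x*x). The function name `fit_slope` and the sample values are hypothetical and are not taken from the benchmark data.

```rust
/// Least-squares slope of a line through the origin: sum(x*y) / sum(x*x).
/// `samples` holds (iteration count, total time in ns) pairs.
fn fit_slope(samples: &[(f64, f64)]) -> f64 {
    let sum_xy: f64 = samples.iter().map(|(x, y)| x * y).sum();
    let sum_xx: f64 = samples.iter().map(|(x, _)| x * x).sum();
    sum_xy / sum_xx
}

fn main() {
    // Hypothetical samples in which each iteration costs about 100 ns.
    let samples = [(10.0, 1_000.0), (20.0, 2_000.0), (30.0, 3_000.0)];
    println!("estimated time per iteration: {} ns", fit_slope(&samples));
}
```

The slope plays the role of the "time per iteration" estimate, which is why the Slope and Mean rows in the tables below are usually close to each other.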

<h3>
    Cloning the Mempool ReadHandle
</h3>
<hr>

<center>
    <p>
        For highly concurrent transaction validation, the first step is to build a Left-Right Mempool. For simplicity in this naive tx validation benchmark,
        we use an evmap (see the <a href="https://docs.rs/evmap/10.0.2/evmap/index.html">evmap documentation</a> for more information on this particular implementation of the
        left-right pattern).
    </p>
    <p>
        We then add 100k txs to the mempool to ensure there is a large number of txs to benchmark against. A big part of concurrent tx validation is cloning
        the ReadHandle from the evmap, which is embedded in the Mempool struct. For this use case the mempool looks like this:
    </p>
</center>

```rust
pub struct Mempool {
    pub r: evmap::ReadHandle<usize, String>,
    w: evmap::WriteHandle<usize, String>,
    nonce: usize
}
```

This struct holds an ```evmap::ReadHandle<usize, String>``` and its counterpart, the ```evmap::WriteHandle<usize, String>```.
One might wonder why the value is a String instead of a Tx or a generic. There are two reasons:
<ol>
    <li>This is just a naive benchmark.</li>
    <li>The value type stored in the evmap must implement the ```ShallowCopy``` trait, which in turn means all of the Tx struct fields must also be ```ShallowCopy```. For the sake of the benchmark, the cost of deserializing a String Tx is small enough to ignore for now (as you will see), so although it is not a perfect optimization, it works for these benchmarks.</li>
</ol>

<p>We will then need to clone this ```ReadHandle``` many times (30-50 is the typical optimum) so that we can concurrently get and validate transactions across threads.</p>
<hr>
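
The pattern of handing each validator thread its own reader can be sketched with the standard library alone. The example below is a stand-in, not the evmap API: an `Arc<HashMap<usize, String>>` plays the role of the cloned `ReadHandle`, and the names `count_hits` and `n_readers`, as well as the sample entries, are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Spawns `n_readers` threads, each holding its own cloned handle to the
/// shared map, and returns how many lookups succeeded across all threads.
/// `Arc` is only a stand-in here for a cloned evmap `ReadHandle`.
fn count_hits(map: Arc<HashMap<usize, String>>, n_readers: usize) -> usize {
    let handles: Vec<_> = (0..n_readers)
        .map(|i| {
            // Cloning the handle copies a pointer, not the map contents.
            let reader = Arc::clone(&map);
            // Each thread looks up the id equal to its own index.
            thread::spawn(move || usize::from(reader.get(&i).is_some()))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    // Hypothetical mempool contents: ids 0..10 mapped to placeholder strings.
    let map: HashMap<usize, String> = (0..10).map(|id| (id, format!("tx-{}", id))).collect();
    println!("hits: {}", count_hits(Arc::new(map), 4));
}
```

Because every thread owns its handle, no thread waits on another to read, which is the property the benchmark below tries to price.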

Our first benchmark here, as an individual component, measures the cost of cloning the ```evmap::ReadHandle<usize, String>```.
<hr>

This benchmark starts by calling the ```create_100000_txs()``` function, which creates a large ```Mempool``` and returns it. The result is stored in a variable called ```mempool```, and the benchmark then runs against it.

The ```create_100000_txs()``` function:

```rust
fn create_100000_txs() -> Mempool {
    let mut mempool = Mempool::new();
    (0..100000).for_each(|_| {
        let tx = Tx::random();
        mempool.add(&tx);
    });
    mempool.refresh();
    
    mempool
}
```

and the ```clone_large_read_handle``` benchmark:

```rust
fn clone_large_read_handle(c: &mut Criterion) {
    let mempool = create_100000_txs();
    c.bench_function("clone_large_mempool_reader", |b| b.iter(|| {
        let _ = mempool.r.clone();
     }));
}
```

Below are the results of the benchmark described above.
<hr>

<center>
    <h3> Average Time Per Iteration </h3>
<div>
    <img src="./clone_large_mempool_reader/report/pdf_small.svg">
</div>
</center>    
<hr>
<center>
    <h3> Linear Regression </h3>
<div class="col-1-2">
        <img src="./clone_large_mempool_reader/report/regression_small.svg">
    </div>
</center>

<table>
    <thead>
        <tr>
            <th></th>
            <th title="0.95 confidence level" class="ci-bound">Lower bound</th>
            <th>Estimate</th>
            <th title="0.95 confidence level" class="ci-bound">Upper bound</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Slope</td>
            <td class="ci-bound">102.61 ns</td>
            <td>103.05 ns</td>
            <td class="ci-bound">103.55 ns</td>
        </tr>
        <tr>
            <td>R&#xb2;</td>
            <td class="ci-bound">0.9620564</td>
            <td>0.9643576</td>
            <td class="ci-bound">0.9614913</td>
        </tr>
        <tr>
            <td>Mean</td>
            <td class="ci-bound">102.75 ns</td>
            <td>103.12 ns</td>
            <td class="ci-bound">103.54 ns</td>
        </tr>
        <tr>
            <td title="Standard Deviation">Std. Dev.</td>
            <td class="ci-bound">1.3888 ns</td>
            <td>2.0189 ns</td>
            <td class="ci-bound">2.6851 ns</td>
        </tr>
        <tr>
            <td>Median</td>
            <td class="ci-bound">102.42 ns</td>
            <td>102.69 ns</td>
            <td class="ci-bound">103.07 ns</td>
        </tr>
        <tr>
            <td title="Median Absolute Deviation">MAD</td>
            <td class="ci-bound">1.0263 ns</td>
            <td>1.5254 ns</td>
            <td class="ci-bound">1.8137 ns</td>
        </tr>
    </tbody>
</table>

<hr>
<div id="footer">
    <p>This report was generated by
        <a href="https://github.com/bheisler/criterion.rs">Criterion.rs</a>, a statistics-driven benchmarking
        library in Rust.</p>
</div>

<hr>
<h3> Deserializing the Tx String into Tx Struct </h3>
<hr>

As mentioned previously, in these benchmarks the Tx is stored in the evmap as a serialized string. As a result, to extract the fields and, for example, recreate the payload or validate the signature, we must first deserialize the Tx. To gauge how expensive this process is, we benchmark the cost of deserializing the Tx.

First, we create a single function that takes the String representing the Tx and uses ```serde``` (via ```serde_json```) to deserialize it.

```rust
fn deserialize_tx(tx: String) -> Tx {
    serde_json::from_str(&tx).unwrap()
}
```

We then need a tx to deserialize, so for simplicity and reuse we have a function that creates one random Tx and serializes it into a string before returning it:

```rust
fn create_1_tx() -> String {
    let tx = Tx::random();
    serde_json::to_string(&tx).unwrap()
}
```

We can now create the benchmark:

```rust
fn deserialize_tx_benchmark(c: &mut Criterion) {
    let tx = create_1_tx();
    c.bench_function("deserialize_tx", |b| b.iter(|| deserialize_tx(tx.clone())));
}
```

See the results below:

<center>
    <h3> Average Time Per Iteration </h3>
<div>
    <img src="./deserialize_tx/report/pdf_small.svg">
</div>
</center>    
<hr>
<center>
    <h3> Linear Regression </h3>
<div class="col-1-2">
        <img src="./deserialize_tx/report/regression_small.svg">
    </div>
</center>

<table>
                        <thead>
                            <tr>
                                <th></th>
                                <th title="0.95 confidence level" class="ci-bound">Lower bound</th>
                                <th>Estimate</th>
                                <th title="0.95 confidence level" class="ci-bound">Upper bound</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td>Slope</td>
                                <td class="ci-bound">8.2246 us</td>
                                <td>8.2873 us</td>
                                <td class="ci-bound">8.3609 us</td>
                            </tr>
                            <tr>
                                <td>R&#xb2;</td>
                                <td class="ci-bound">0.8550906</td>
                                <td>0.8612448</td>
                                <td class="ci-bound">0.8528208</td>
                            </tr>
                            <tr>
                                <td>Mean</td>
                                <td class="ci-bound">8.3445 us</td>
                                <td>8.4358 us</td>
                                <td class="ci-bound">8.5458 us</td>
                            </tr>
                            <tr>
                                <td title="Standard Deviation">Std. Dev.</td>
                                <td class="ci-bound">294.64 ns</td>
                                <td>512.54 ns</td>
                                <td class="ci-bound">734.65 ns</td>
                            </tr>
                            <tr>
                                <td>Median</td>
                                <td class="ci-bound">8.2228 us</td>
                                <td>8.2868 us</td>
                                <td class="ci-bound">8.4133 us</td>
                            </tr>
                            <tr>
                                <td title="Median Absolute Deviation">MAD</td>
                                <td class="ci-bound">218.33 ns</td>
                                <td>309.95 ns</td>
                                <td class="ci-bound">377.71 ns</td>
                            </tr>
                        </tbody>
                    </table>


<hr>
As the report above shows, deserializing the Tx structure, though small in isolation, does add some expense: roughly 80x the cost of simply cloning the mempool reader. Since this occurs as part of every single tx validation (hundreds of thousands of times per second, across multiple threads), it adds up. Over the course of 100k Tx validations, this process would add roughly 843 milliseconds, or close to 1 second, and that is outside a single-threaded context. In a single-threaded context, we would expect it to take even longer. By removing the need to deserialize the Tx struct on every validation, we could increase performance significantly.
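
The two figures quoted above follow directly from the mean estimates in the tables. A minimal sketch of that arithmetic, using a hypothetical helper named `total_ms`:

```rust
/// Total time in milliseconds for `n` operations at `mean_ns` nanoseconds each.
fn total_ms(mean_ns: f64, n: u64) -> f64 {
    mean_ns * n as f64 / 1_000_000.0
}

fn main() {
    let clone_ns = 103.12; // mean cost of cloning the reader
    let deser_ns = 8_435.8; // mean cost of deserializing one Tx (8.4358 us)

    println!("deserialize / clone ratio: {:.1}x", deser_ns / clone_ns);
    println!("100k deserializations: {:.2} ms", total_ms(deser_ns, 100_000));
}
```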

<hr>
<h3> Recreating the Tx Payload to be Verified </h3>
<hr>

To validate the signature we must have a perfect replica of the message that is signed. To do this, we must have a known payload that can be hashed and signed. To facilitate this process, the Tx struct has a public ```get_payload()``` method:

```rust
pub fn get_payload(&self) -> String {
    format!(
        "{:x?}{:x?}{:x?}{:x?}{:x?}{:x?}{:x?}{:x?}",
        self.id, self.pk, self.to, self.amt, self.code, self.nonce, self.fee, self.data
        )
}
```
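
To see what the `{:x?}` format specifier produces, here is a small sketch using a hypothetical, reduced `DemoTx` struct. The field names and types are assumptions for illustration only; the real `Tx` definition is not shown in this report.

```rust
/// Hypothetical, reduced stand-in for the real Tx struct.
struct DemoTx {
    id: String,
    amt: u128,
    nonce: u64,
    data: Vec<u8>,
}

impl DemoTx {
    /// Builds a deterministic payload string. `{:x?}` is Debug formatting
    /// with integers rendered in lowercase hexadecimal.
    fn get_payload(&self) -> String {
        format!("{:x?}{:x?}{:x?}{:x?}", self.id, self.amt, self.nonce, self.data)
    }
}

fn main() {
    let tx = DemoTx { id: "abc".to_string(), amt: 255, nonce: 16, data: vec![1, 171] };
    // Strings keep their Debug quotes; integers and byte vectors are hex-encoded.
    println!("{}", tx.get_payload());
}
```

Because the output depends only on the field values, the signer and the verifier can rebuild byte-identical payloads independently.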

We can then create a tx by calling the same ```create_1_tx()``` function as above and deserializing it with the ```deserialize_tx()``` function, before measuring the cost of
creating the payload:

```rust
fn create_payload_benchmark(c: &mut Criterion) {
    let tx = deserialize_tx(create_1_tx());
    c.bench_function("create_payload", |b| b.iter(|| tx.clone().get_payload()));
}
```

Below are the performance statistics for this benchmark:

<center>
    <h3> Average Time Per Iteration </h3>
<div>
    <img src="./create_payload/report/pdf_small.svg">
</div>
</center>    
<hr>
<center>
    <h3> Linear Regression </h3>
<div class="col-1-2">
        <img src="./create_payload/report/regression_small.svg">
    </div>
</center>

<table>
    <thead>
        <tr>
            <th></th>
            <th title="0.95 confidence level" class="ci-bound">Lower bound</th>
            <th>Estimate</th>
            <th title="0.95 confidence level" class="ci-bound">Upper bound</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Slope</td>
            <td class="ci-bound">4.9597 us</td>
            <td>5.0028 us</td>
            <td class="ci-bound">5.0546 us</td>
        </tr>
        <tr>
            <td>R&#xb2;</td>
            <td class="ci-bound">0.8056514</td>
            <td>0.8127644</td>
            <td class="ci-bound">0.8025360</td>
        </tr>
        <tr>
            <td>Mean</td>
            <td class="ci-bound">4.9944 us</td>
            <td>5.0331 us</td>
            <td class="ci-bound">5.0741 us</td>
        </tr>
        <tr>
            <td title="Standard Deviation">Std. Dev.</td>
            <td class="ci-bound">164.65 ns</td>
            <td>204.92 ns</td>
            <td class="ci-bound">244.80 ns</td>
        </tr>
        <tr>
            <td>Median</td>
            <td class="ci-bound">4.9249 us</td>
            <td>4.9499 us</td>
            <td class="ci-bound">4.9942 us</td>
        </tr>
        <tr>
            <td title="Median Absolute Deviation">MAD</td>
            <td class="ci-bound">95.036 ns</td>
            <td>141.31 ns</td>
            <td class="ci-bound">191.93 ns</td>
        </tr>
    </tbody>
</table>

<hr>

Like deserialization, though less expensive, this process can be significantly optimized (and is, in the fully integrated protocol) through smart Tx and message design. In this benchmark it takes an average of 5.03 microseconds. As a one-off function call that is not much to sweat over, but when trying to validate hundreds of thousands of Txs per second, it adds up. Further, this measurement was not taken in a single-threaded context, which is how our highly concurrent validation protocol behaves in reality. We would expect even poorer performance in a single-threaded context.

<hr>
<h3> Converting Payload to Hashed Message </h3>
<hr>

As mentioned in the previous benchmark, an ECDSA signature signs a specific message. In the ```secp256k1``` crate used in this benchmark, that message must take the form of a ```secp256k1::Message``` struct. After deserializing the Tx and recreating the payload, we must therefore recreate the ```Message``` that was signed.

The ```Message``` struct has a method called ```from_hashed_data<H: ThirtyTwoByteHash + hashes::Hash>(data: &[u8])```, which takes a byte slice (```&[u8]```), hashes it with ```H```, and builds the message from the digest.

This is the method we are benchmarking, and as such, we can build the benchmark:

```rust
fn recreate_tx_message(c: &mut Criterion) {
    let payload = deserialize_tx(create_1_tx()).get_payload();
    let payload = payload.as_bytes();
    c.bench_function("recreate_tx_message", |b| b.iter(|| {
        // Return the message so the call is not optimized away.
        Message::from_hashed_data::<sha256::Hash>(payload)
    }));
}
```

The performance of this benchmark can be seen below:

<center>
    <h3> Average Time Per Iteration </h3>
<div>
    <img src="./recreate_tx_message/report/pdf_small.svg">
</div>
</center>    
<hr>
<center>
    <h3> Linear Regression </h3>
<div class="col-1-2">
        <img src="./recreate_tx_message/report/regression_small.svg">
    </div>
</center>

<table>
    <thead>
        <tr>
            <th></th>
            <th title="0.95 confidence level" class="ci-bound">Lower bound</th>
            <th>Estimate</th>
            <th title="0.95 confidence level" class="ci-bound">Upper bound</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Slope</td>
            <td class="ci-bound">1.2808 us</td>
            <td>1.2867 us</td>
            <td class="ci-bound">1.2933 us</td>
        </tr>
        <tr>
            <td>R&#xb2;</td>
            <td class="ci-bound">0.9489330</td>
            <td>0.9517718</td>
            <td class="ci-bound">0.9481080</td>
        </tr>
        <tr>
            <td>Mean</td>
            <td class="ci-bound">1.2913 us</td>
            <td>1.3001 us</td>
            <td class="ci-bound">1.3120 us</td>
        </tr>
        <tr>
            <td title="Standard Deviation">Std. Dev.</td>
            <td class="ci-bound">23.251 ns</td>
            <td>53.911 ns</td>
            <td class="ci-bound">84.517 ns</td>
        </tr>
        <tr>
            <td>Median</td>
            <td class="ci-bound">1.2865 us</td>
            <td>1.2898 us</td>
            <td class="ci-bound">1.2931 us</td>
        </tr>
        <tr>
            <td title="Median Absolute Deviation">MAD</td>
            <td class="ci-bound">13.412 ns</td>
            <td>21.545 ns</td>
            <td class="ci-bound">28.312 ns</td>
        </tr>
    </tbody>
</table>

<hr>

As this is an external (and widely used) crate, it's no surprise that performance is already optimized to a degree; however, it can likely still be improved. Again, given the highly concurrent nature of the VRRB validation protocol, we would expect this performance to decline in a single-threaded context.

<hr>
<h3> Verifying Signature </h3>
<hr>

The final individual component of the naive Tx validation is verifying the signature, using the sig in the Tx and the recreated message.

For this, we use the built-in ```secp256k1::ecdsa::Signature::verify()``` method, which verifies a single signature. This can be significantly improved upon (2-6x+ faster) using batched verification (see <a href="https://cse.iitkgp.ac.in/~abhij/publications/ECDSA-SP-ACNS2014.pdf">Karati and Das, 2014</a>).

With that said, the single signature verification benchmark below:

```rust
fn signature_validation_benchmark(c: &mut Criterion) {
    let tx = deserialize_tx(create_1_tx());
    let payload = tx.get_payload();
    let message = Message::from_hashed_data::<sha256::Hash>(payload.as_bytes());
    c.bench_function("signature_validation", |b| b.iter(|| {
        tx.sig.verify(&message, &tx.pk)
    }));
}
```

results in the performance below:

<center>
    <h3> Average Time Per Iteration </h3>
<div>
    <img src="./signature_validation/report/pdf_small.svg">
</div>
</center>    
<hr>
<center>
    <h3> Linear Regression </h3>
<div class="col-1-2">
        <img src="./signature_validation/report/regression_small.svg">
    </div>
</center>

<table>
    <thead>
        <tr>
            <th></th>
            <th title="0.95 confidence level" class="ci-bound">Lower bound</th>
            <th>Estimate</th>
            <th title="0.95 confidence level" class="ci-bound">Upper bound</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Slope</td>
            <td class="ci-bound">62.582 us</td>
            <td>63.155 us</td>
            <td class="ci-bound">63.824 us</td>
        </tr>
        <tr>
            <td>R&#xb2;</td>
            <td class="ci-bound">0.7700924</td>
            <td>0.7775110</td>
            <td class="ci-bound">0.7674143</td>
        </tr>
        <tr>
            <td>Mean</td>
            <td class="ci-bound">63.259 us</td>
            <td>63.973 us</td>
            <td class="ci-bound">64.830 us</td>
        </tr>
        <tr>
            <td title="Standard Deviation">Std. Dev.</td>
            <td class="ci-bound">2.4126 us</td>
            <td>4.0472 us</td>
            <td class="ci-bound">5.7409 us</td>
        </tr>
        <tr>
            <td>Median</td>
            <td class="ci-bound">62.307 us</td>
            <td>62.748 us</td>
            <td class="ci-bound">63.232 us</td>
        </tr>
        <tr>
            <td title="Median Absolute Deviation">MAD</td>
            <td class="ci-bound">1.2532 us</td>
            <td>1.7383 us</td>
            <td class="ci-bound">2.1482 us</td>
        </tr>
    </tbody>
</table>

<hr>

This is by far the most expensive operation we've seen so far, clocking in at an average of 63 microseconds per iteration. In a single-threaded context it is even more expensive, at roughly 81 microseconds. Improving the speed of signature verification through batch verification is therefore very important: a 2-6x+ speedup would bring verification down to between ~15 and ~40 microseconds. In a single-threaded context, when combined with all other operations (once optimized), each thread would then be able to validate between ~20,000 and ~80,000 tps. With the optimal number of validator threads in a validator unit being between 30 and 50, a single validator node would be able to locally validate between ~600k and ~4 million tps.

<hr>
<h3> Full Naive Validation Process </h3>
<hr>

Putting it all together, we build a benchmark that runs the full naive validation process (deserialize tx, recreate payload, recreate message, validate signature).

```rust
fn full_validation_benchmark(c: &mut Criterion) {
    let tx = create_1_tx();
    c.bench_function("full_validation", |b| b.iter(|| {
        let tx = deserialize_tx(tx.clone());
        let payload = tx.get_payload();
        let message = Message::from_hashed_data::<sha256::Hash>(payload.as_bytes());
        tx.sig.verify(&message, &tx.pk)
    }));
}
```

We see the below performance:

<center>
    <h3> Average Time Per Iteration </h3>
<div>
    <img src="./full_validation/report/pdf_small.svg">
</div>
</center>    
<hr>
<center>
    <h3> Linear Regression </h3>
<div class="col-1-2">
        <img src="./full_validation/report/regression_small.svg">
    </div>
</center>

<table>
    <thead>
        <tr>
            <th></th>
            <th title="0.95 confidence level" class="ci-bound">Lower bound</th>
            <th>Estimate</th>
            <th title="0.95 confidence level" class="ci-bound">Upper bound</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Slope</td>
            <td class="ci-bound">181.56 us</td>
            <td>182.32 us</td>
            <td class="ci-bound">183.16 us</td>
        </tr>
        <tr>
            <td>R&#xb2;</td>
            <td class="ci-bound">0.9585883</td>
            <td>0.9607744</td>
            <td class="ci-bound">0.9580322</td>
        </tr>
        <tr>
            <td>Mean</td>
            <td class="ci-bound">182.16 us</td>
            <td>183.10 us</td>
            <td class="ci-bound">184.29 us</td>
        </tr>
        <tr>
            <td title="Standard Deviation">Std. Dev.</td>
            <td class="ci-bound">3.0403 us</td>
            <td>5.4803 us</td>
            <td class="ci-bound">8.1534 us</td>
        </tr>
        <tr>
            <td>Median</td>
            <td class="ci-bound">180.96 us</td>
            <td>181.75 us</td>
            <td class="ci-bound">182.69 us</td>
        </tr>
        <tr>
            <td title="Median Absolute Deviation">MAD</td>
            <td class="ci-bound">2.2361 us</td>
            <td>3.0043 us</td>
            <td class="ci-bound">4.0312 us</td>
        </tr>
    </tbody>
</table>
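
To put the full-validation mean in context, the sketch below adds up the mean estimates of the four component benchmarks and compares the total with the 183.10 us mean in the table above. The helper name `sum_means` is hypothetical; the numbers are copied from the tables in this report.

```rust
/// Sums the mean estimates (in microseconds) of a list of named components.
fn sum_means(components: &[(&str, f64)]) -> f64 {
    components.iter().map(|(_, mean_us)| mean_us).sum()
}

fn main() {
    // Mean estimates (microseconds) from the component tables.
    let components = [
        ("deserialize_tx", 8.4358),
        ("create_payload", 5.0331),
        ("recreate_tx_message", 1.3001),
        ("signature_validation", 63.973),
    ];
    let full_validation_us = 183.10;

    let sum = sum_means(&components);
    println!("sum of components: {:.3} us", sum);
    println!("full / sum ratio:  {:.2}x", full_validation_us / sum);
}
```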

<hr>

Under this naive framework, the whole performs worse than the sum of its parts: the full pipeline averages 183.10 us, while the component means (8.44 us + 5.03 us + 1.30 us + 63.97 us) add up to roughly 78.7 us. Even so, we are still able to achieve solid performance in this context. Since the protocol calls for a highly concurrent approach, it makes sense to gain a better understanding of the single thread context of the above benchmark:

```rust
fn full_validation_single_thread_benchmark(c: &mut Criterion) {
    c.bench_function("full_validation_1_thread", |b| b.iter(|| {
        std::thread::spawn(move || {
            let tx = create_1_tx();
            let tx = deserialize_tx(tx);
            let payload = tx.get_payload();
            let message = Message::from_hashed_data::<sha256::Hash>(payload.as_bytes());
            tx.sig.verify(&message, &tx.pk)
        }).join().unwrap()
    }));
}
```

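As expected, the single-threaded approach underperforms, taking roughly 94% longer (a mean of 355.14 us versus 183.10 us). Note that the measured closure also includes spawning and joining the thread and the call to ```create_1_tx()```. The results are shown below: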

<center>
    <h3> Average Time Per Iteration </h3>
<div>
    <img src="./full_validation_1_thread/report/pdf_small.svg">
</div>
</center>    
<hr>
<center>
    <h3> Linear Regression </h3>
<div class="col-1-2">
        <img src="./full_validation_1_thread/report/regression_small.svg">
    </div>
</center>

<table>
    <thead>
        <tr>
            <th></th>
            <th title="0.95 confidence level" class="ci-bound">Lower bound</th>
            <th>Estimate</th>
            <th title="0.95 confidence level" class="ci-bound">Upper bound</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Slope</td>
            <td class="ci-bound">340.17 us</td>
            <td>346.24 us</td>
            <td class="ci-bound">352.77 us</td>
        </tr>
        <tr>
            <td>R&#xb2;</td>
            <td class="ci-bound">0.5180761</td>
            <td>0.5302300</td>
            <td class="ci-bound">0.5161875</td>
        </tr>
        <tr>
            <td>Mean</td>
            <td class="ci-bound">348.38 us</td>
            <td>355.14 us</td>
            <td class="ci-bound">362.42 us</td>
        </tr>
        <tr>
            <td title="Standard Deviation">Std. Dev.</td>
            <td class="ci-bound">28.830 us</td>
            <td>36.229 us</td>
            <td class="ci-bound">42.630 us</td>
        </tr>
        <tr>
            <td>Median</td>
            <td class="ci-bound">337.78 us</td>
            <td>344.72 us</td>
            <td class="ci-bound">349.29 us</td>
        </tr>
        <tr>
            <td title="Median Absolute Deviation">MAD</td>
            <td class="ci-bound">18.134 us</td>
            <td>25.566 us</td>
            <td class="ci-bound">33.735 us</td>
        </tr>
    </tbody>
    </table>

<hr>

In the context of the highly concurrent validation protocol, this would mean that a local node could validate only roughly 141k tps (consistent with 50 threads at ~355 us per validation). That is still significantly higher throughput, with the benefits of a synchronous, decentralized, fault-tolerant network; however, improving on this naive approach is not only possible, but could be significant. Further, when combined into a validator unit function that more closely resembles the actual VRRB protocol, we find that we are able to achieve fairly significant increases over the numbers above.
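
The throughput estimate above can be reproduced with a one-line projection. The sketch below assumes perfect scaling across threads and takes 50 threads (the upper end of the 30-50 range mentioned earlier) as the thread count; `projected_tps` is a hypothetical helper, not part of the benchmark code.

```rust
/// Transactions per second for `threads` workers that each need
/// `latency_us` microseconds per validation (assumes perfect scaling).
fn projected_tps(latency_us: f64, threads: u32) -> f64 {
    1_000_000.0 / latency_us * threads as f64
}

fn main() {
    // 355.14 us is the mean of the threaded benchmark; 50 threads is assumed.
    println!("{:.0} tps", projected_tps(355.14, 50));
}
```

Real throughput would depend on core count and contention, so this is best read as an upper bound for the naive approach.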

To measure this, we build a small module:

```rust
#![allow(dead_code, unused_imports)]
use crate::mempool::Mempool;
use crate::tx::Tx;
use secp256k1::rand::prelude::*;
use secp256k1::rand::thread_rng;
use secp256k1::{generate_keypair, Message};
use secp256k1::hashes::sha256;
use secp256k1::PublicKey;
use secp256k1::ecdsa::Signature;
use serde::{Serialize, Deserialize};
use std::str::FromStr;
use std::thread::JoinHandle;
use std::ops::Add;
use rand;
use rand::RngCore;
use evmap::ReadGuard;
use std::io::prelude::*;
use crossbeam_channel::unbounded;
use std::io::{BufWriter, Write};

fn create_100000_txs() -> Mempool {
    let mut mempool = Mempool::new();
    (0..100000).for_each(|_| {
        let tx = Tx::random();
        mempool.add(&tx);
    });
    mempool.refresh();
    mempool
}

fn deserialize_tx(tx: String) -> Tx {
    serde_json::from_str(&tx).unwrap()
}

// TODO: add sender/receiver to below function to have confirmations sent to separate thread
// to handle writes.
#[allow(unused_must_use)]
fn validate_full_mempool(
    mempool: &mut Mempool, 
    n_readers: usize, 
    batch: bool, 
    batch_size: usize,
    timed: bool,
    sx: crossbeam_channel::Sender<std::collections::HashSet<String>>
) -> std::io::Result<()> {
    let start = std::time::Instant::now();
    let end = start + std::time::Duration::from_millis(1000);
    let handles: Vec<_> = (0..n_readers).map(|i| {
        let r = mempool.r.clone();
        let thread_sender = sx.clone();
        std::thread::spawn(move || {
            let mut confs = std::collections::HashSet::new();
            let mut rng = thread_rng();
            let mut count: u32 = 0;
            while std::time::Instant::now() < end {
                if i == 0 {
                    let id = rng.gen_range(0, 100_000 / n_readers);
                    if let Some(v) = r.get_one(&id) {
                        let mut tx = deserialize_tx(v.to_string());
                        let sig = &tx.sig;
                        let payload = tx.get_payload();
                        let message = Message::from_hashed_data::<sha256::Hash>(payload.as_bytes());
                        let _ = sig.verify(&message, &tx.pk);
                        tx.conf += 1;
                        confs.insert(serde_json::to_string(&tx).unwrap());
                        count += 1;

                        if batch {
                            if confs.len() == batch_size {
                                let _ = thread_sender.send(confs.clone());
                                confs.clear();
                            }

                        } else if !batch && !timed {
                            let _ = thread_sender.send(confs.clone());
                            confs.clear()
                        }
                        
                    };
                } else {
                    let id = rng.gen_range((i * 100_000 / n_readers) + 1, (i + 1) * (100_000 / n_readers));
                    if let Some(v) = r.clone().get_one(&id) {
                        let v = v.clone();
                        let mut tx = deserialize_tx(v.to_string());
                        let payload = tx.get_payload();
                        let message = Message::from_hashed_data::<sha256::Hash>(payload.as_bytes());
                        let _ = tx.sig.verify(&message, &tx.pk);
                        tx.conf += 1;
                        confs.insert(serde_json::to_string(&tx).unwrap());
                        count += 1;

                        if batch {
                            if confs.len() == batch_size {
                                let _ = thread_sender.send(confs.clone());
                                confs.clear();
                            }

                        } else if !batch && !timed {
                            let _ = thread_sender.send(confs.clone());
                            confs.clear()
                        }
                    };
                }
            }

            if timed {

            }

            (confs, count)
            
        })
    }).collect();

    let values: Vec<(std::collections::HashSet<String>, u32)> = handles.into_iter().map(|h| {
        h.join().unwrap()
    }).collect();

    let total = values.clone().into_iter().fold(0, |acc, x| acc + x.1);

    if timed {
        let confs = values.into_iter().map(|x| x.0).collect::<Vec<std::collections::HashSet<String>>>();
        let confs = confs.into_iter().flatten().collect::<std::collections::HashSet<String>>();
        let mut f = BufWriter::new(std::fs::File::create("C:/Bench/confs.json").unwrap());
        let write_start = std::time::Instant::now();
        timed_writes(confs.clone(), &mut f);
        let end = write_start.elapsed();
        println!("Write Conf Length: {} in {:?}", confs.len(), end);
    }

    println!("Total Validations: {}", total);

    Ok(())
}

fn single_write(
    confs: &mut std::collections::HashSet<String>, 
    tx: std::collections::HashSet<String>, 
    file: &mut BufWriter<std::fs::File>
) -> std::io::Result<()> {
    confs.extend(tx);
    file.write(&serde_json::to_vec(confs)?)?;
    Ok(())
}

fn batched_writes(
    confs: &mut std::collections::HashSet<String>, 
    tx_list: std::collections::HashSet<String>,
    file: &mut BufWriter<std::fs::File>
) -> std::io::Result<()> {
    confs.extend(tx_list);
    file.write(&serde_json::to_vec(confs)?)?;
    Ok(())
}

fn timed_writes(
    tx_list: std::collections::HashSet<String>,
    file: &mut BufWriter<std::fs::File>
) -> std::io::Result<()> {
    file.write(&serde_json::to_vec(&tx_list)?)?;
    Ok(())
}
```
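
The per-thread id ranges in `validate_full_mempool` could also be derived from a single helper, which would remove the need for a separate branch for reader 0. The sketch below is one hypothetical way to do that. `shard_range` is not part of the module above, and it assigns contiguous half-open ranges rather than reproducing the exact bounds used there.

```rust
use std::ops::Range;

/// Splits the ids `0..total` into `n_readers` contiguous half-open ranges
/// and returns the range owned by reader `i`. The last reader also takes
/// any remainder when `total` is not divisible by `n_readers`.
fn shard_range(i: usize, n_readers: usize, total: usize) -> Range<usize> {
    let width = total / n_readers;
    let start = i * width;
    let end = if i + 1 == n_readers { total } else { start + width };
    start..end
}

fn main() {
    // Print the ranges owned by the first three of 50 readers.
    for i in 0..3 {
        println!("reader {} -> {:?}", i, shard_range(i, 50, 100_000));
    }
}
```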

These functions are then combined into a test function:

```rust
#[cfg(test)]
mod tests {
    use crate::*;
    use std::sync::mpsc::channel;
    #[test]
    fn test_full_validate() {
        println!("Warming up, creating BufWriter file disk writing");
        let mut f = BufWriter::new(std::fs::OpenOptions::new().write(true).append(false).create(true).open("C:/Bench/confs.json").unwrap());
        println!("Warming up, creating 100k random txs, and setting in mempool....");
        let mut mempool = create_100000_txs();
        let (sx, rx) = unbounded();
        println!("Spinning up validator unit thread(s)");
        let validator_handle = std::thread::spawn(move || {
            let _ = validate_full_mempool(&mut mempool, 50, true, 300, false, sx.clone());
        });
        println!("Starting disk writing loop");
        let mut conf_set = std::collections::HashSet::new();
        let write_handle = std::thread::spawn(move || {

            while let Ok(conf) = rx.recv() {
                conf_set.extend(conf);
            }
            
            f.write(&serde_json::to_vec(&conf_set).unwrap()).unwrap();
            let reader = std::io::BufReader::new(std::fs::OpenOptions::new().write(false).read(true).open("C:/Bench/confs.json").unwrap());
            let conf_set = reader.bytes().map(|b| b.unwrap()).collect::<Vec<u8>>();
            let conf_set = serde_json::from_slice::<std::collections::HashSet<String>>(&conf_set).unwrap();
            println!("Wrote {:?} unique, validated txs to disk", conf_set.len());
        });
        
        validator_handle.join().unwrap();
        write_handle.join().unwrap();
    }
}
```

The above instance uses 50 readers over a mempool of 100k randomly generated Txs to validate, and uses batched writes (every 300 validations) to disk. 
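
The batching behaviour can be isolated in a small std-only sketch. It uses `std::sync::mpsc` as a stand-in for `crossbeam_channel`, and the function `run_batched` and its integer "items" are hypothetical. It shows two properties of this design: a batch is sent only once it is full, and the receiving loop ends when every sender has been dropped.

```rust
use std::sync::mpsc;
use std::thread;

/// Processes `n_items` hypothetical items on a worker thread, sending one
/// batch over the channel each time `batch_size` items have accumulated.
/// Returns (items received by the writer side, items left unsent).
fn run_batched(n_items: usize, batch_size: usize) -> (usize, usize) {
    let (sender, receiver) = mpsc::channel::<Vec<usize>>();

    let worker = thread::spawn(move || {
        let mut batch = Vec::with_capacity(batch_size);
        for item in 0..n_items {
            batch.push(item);
            if batch.len() == batch_size {
                // Hand the full batch to the writer and start a new one.
                sender.send(std::mem::take(&mut batch)).unwrap();
            }
        }
        // `sender` is dropped here; a partial batch is never sent.
        batch.len()
    });

    // This iterator ends once every sender has been dropped.
    let received: usize = receiver.iter().map(|b| b.len()).sum();
    let leftover = worker.join().unwrap();
    (received, leftover)
}

fn main() {
    let (received, leftover) = run_batched(1_000, 300);
    println!("received {} items, {} left unsent", received, leftover);
}
```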

Below is the output when running ```cargo test -- --nocapture```:
<hr>

```
    Finished test [optimized + debuginfo] target(s) in 0.15s
     Running unittests (target\debug\deps\validator-e4d9a8362c51a686.exe)

running 1 test
Warming up, creating BufWriter file disk writing
Warming up, creating 100k random txs, and setting in mempool....
Spinning up validator unit thread(s)
Starting disk writing loop
Total Validations: 272624
Wrote 92831 unique, validated txs to disk
test tests::test_full_validate ... ok

```

<hr>
You may notice a discrepancy between the number of unique writes to disk and the number of total validations. Two things account for this. First, once the validator threads shut down, the senders are dropped and the receiver in the disk-writing thread can no longer receive messages, so any batch that has not yet reached the batch size is never sent. Second, and more importantly, there are only 100k Txs in the mempool, and we randomly select and validate them in this example, so there are duplicated confirmations.
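
The effect of duplicate draws can be estimated with a short calculation. The sketch below treats the run as uniform random sampling with replacement from the whole mempool, which is a simplifying assumption: it ignores the per-thread id ranges and the unsent partial batches. The helper name `expected_unique` is hypothetical.

```rust
/// Expected number of distinct items seen after `draws` uniform random
/// draws (with replacement) from a pool of `pool` items.
fn expected_unique(pool: f64, draws: f64) -> f64 {
    pool * (1.0 - (1.0 - 1.0 / pool).powf(draws))
}

fn main() {
    // 100k txs in the mempool, 272,624 total validations in the run above.
    println!("{:.0}", expected_unique(100_000.0, 272_624.0));
}
```

Under these assumptions the estimate lands a little above 93,000 unique txs, in the same range as the 92,831 reported by the test run.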

<hr>
For the full source code, see <a href=