updating the blog links
animeshtrivedi committed Jan 18, 2018
1 parent 7345c87 commit 22a729451f1209d7260f584c367e3a6ae8cadc4b
Showing 2 changed files with 8 additions and 9 deletions.
@@ -143,7 +143,7 @@ Consequently, to improve the runtime of the sorting benchmark and to make good u

<div style="text-align: justify">
<p>
- An overview of the Crail shuffler is provided in the <a href="http://www.crail.io/docs">documentation section</a>. The main difference between the Crail shuffler and the Spark built-in shuffler lies in the way data from the network is processed in a reduce task. The Spark shuffler is based on TCP sockets, thus, many CPU instructions are necessary to bring the data from the networking interface to the buffer inside Spark. In contrast, the Crail shuffler shares shuffle data through the Crail file system, and therefore data is transferred directly via DMA from the network interface to the Spark shuffle buffer within the JVM.
+ An overview of the Crail shuffler is provided in the <a href="{{ site.base }}/docs">documentation section</a>. The main difference between the Crail shuffler and the Spark built-in shuffler lies in the way data from the network is processed in a reduce task. The Spark shuffler is based on TCP sockets, thus, many CPU instructions are necessary to bring the data from the networking interface to the buffer inside Spark. In contrast, the Crail shuffler shares shuffle data through the Crail file system, and therefore data is transferred directly via DMA from the network interface to the Spark shuffle buffer within the JVM.
</p>
</div>

@@ -157,7 +157,7 @@ During the map phase, the Crail shuffler organizes each key range in a set of Cr
</p>

<p>
- As illustrated in the <a href="http://www.crail.io/docs">documentation section</a>, the Crail shuffler allows applications to use their own custom serializer and sorter. The recommended serializer for Spark is Kryo, which is a generic serializer. Being generic, however, comes at a cost. Specifically, Kryo requires more type information to be stored along with the serialized data than a custom serializer would need, and also the parsing is more complex for a generic serializer. On top of this, Kryo also comes with its own buffering, introducing additional memory copies. In our benchmark, we use a custom serializer that takes advantage of the fact that the data consists of fixed size key/value pairs. The custom serializer further avoids extra buffering and directly interfaces with Crail file system streams when reading and writing data.
+ As illustrated in the <a href="{{ site.base }}/docs">documentation section</a>, the Crail shuffler allows applications to use their own custom serializer and sorter. The recommended serializer for Spark is Kryo, which is a generic serializer. Being generic, however, comes at a cost. Specifically, Kryo requires more type information to be stored along with the serialized data than a custom serializer would need, and also the parsing is more complex for a generic serializer. On top of this, Kryo also comes with its own buffering, introducing additional memory copies. In our benchmark, we use a custom serializer that takes advantage of the fact that the data consists of fixed size key/value pairs. The custom serializer further avoids extra buffering and directly interfaces with Crail file system streams when reading and writing data.
</p>
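<p>
Because every record has a fixed size, such a serializer reduces to a raw byte copy. The following sketch illustrates the idea under assumed names (<code>FixedSizeSerializer</code> and the TeraSort-style 10-byte key / 90-byte value layout are illustrative; this is not Crail's actual serializer API):
</p>

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative sketch, not Crail's actual serializer interface: since every
// record is a fixed-size key/value pair, serialization is a plain copy of the
// raw bytes into the stream -- no type tags, no length fields, and no
// intermediate buffering as a generic serializer like Kryo would introduce.
public class FixedSizeSerializer {
    static final int KEY_SIZE = 10;   // TeraSort-style 10-byte key (assumed)
    static final int VALUE_SIZE = 90; // 90-byte value, 100 bytes per record

    public static void write(DataOutputStream out, byte[] key, byte[] value)
            throws IOException {
        out.write(key, 0, KEY_SIZE);     // no header, just the key bytes
        out.write(value, 0, VALUE_SIZE); // followed by the value bytes
    }

    public static void read(DataInputStream in, byte[] key, byte[] value)
            throws IOException {
        in.readFully(key, 0, KEY_SIZE);
        in.readFully(value, 0, VALUE_SIZE);
    }
}
```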
<p>
As with serialization, the Spark built-in sorter is a general purpose TimSort that can sort arbitrary collections of comparable objects. In our benchmark, we instruct the Crail shuffler to use a Radix sorter instead. The Radix sorter cannot be applied to arbitrary objects but works well for keys of a fixed byte length. The standard pipeline of a reduce task is to first deserialize the data and then sort it. In the particular configuration of the Crail shuffler, we turn these two steps around and first sort the data and deserialize it later. This is possible because the data is read into a contiguous off-heap buffer that can be sorted almost in-place.
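A radix sort over fixed-length byte keys can be sketched as follows. This is a minimal illustration of the idea, assuming records stored back-to-back in one flat byte array with the key in the first bytes of each record; it is not Crail's actual sorter implementation, which works on off-heap buffers:

```java
// Illustrative LSD radix sort over fixed-length records laid out contiguously
// in one byte array -- a sketch of the sort-before-deserialize idea, not
// Crail's actual sorter. The key is the first keyLen bytes of each record.
public class ByteRadixSort {
    public static byte[] sort(byte[] buf, int recLen, int keyLen) {
        int n = buf.length / recLen;
        byte[] src = buf, dst = new byte[buf.length];
        // One stable counting-sort pass per key byte, least-significant first.
        for (int d = keyLen - 1; d >= 0; d--) {
            int[] count = new int[257];
            for (int i = 0; i < n; i++)
                count[(src[i * recLen + d] & 0xFF) + 1]++;
            for (int b = 0; b < 256; b++)      // prefix sums: start offsets
                count[b + 1] += count[b];
            for (int i = 0; i < n; i++) {      // scatter whole records
                int b = src[i * recLen + d] & 0xFF;
                System.arraycopy(src, i * recLen, dst, count[b]++ * recLen, recLen);
            }
            byte[] tmp = src; src = dst; dst = tmp; // ping-pong buffers
        }
        return src; // holds the records ordered by key
    }
}
```

Because the passes only compare raw key bytes, no deserialization is needed before sorting, which is exactly why the sort-first, deserialize-later order works.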
@@ -188,8 +188,8 @@ In this blog post, we have shown that Crail successfully manages to translate th

All the components required to run the sorting benchmark using Spark/Crail are open source. Here is some guidance on how to run the benchmark:

- * Build and deploy Crail using the instructions at [crail.io/doc](http://www.crail.io/documentation#crail)
- * Enable the Crail shuffler for Spark by building Spark-IO using the instructions at [crail.io/doc](http://www.crail.io/documentation#spark)
+ * Build and deploy Crail using the instructions at <a href="{{ site.base }}/documentation#crail">documentation</a>
+ * Enable the Crail shuffler for Spark by building Spark-IO using the instructions at <a href="{{ site.base }}/documentation#spark">documentation</a>
* Configure the DRAM storage tier of Crail so that all the shuffle data fits into the DRAM tier.
* Build the sorting benchmark using the instructions on [GitHub](https://github.com/zrlio/crail-terasort)
* Make sure you have the custom serializer and sorter specified in spark-defaults.conf
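Such a `spark-defaults.conf` fragment might look like the sketch below. The property keys and class names here are assumptions recalled from the crail-terasort project, not verbatim; the GitHub README linked above has the authoritative names:

```
# Illustrative only -- check the crail-terasort README for the exact
# property keys and class names; these values are placeholders.
spark.crail.shuffle.serializer  com.ibm.crail.terasort.serializer.F22Serializer
spark.crail.shuffle.sorter      com.ibm.crail.terasort.sorter.CrailShuffleNativeRadixSorter
```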
@@ -203,6 +203,5 @@ All the components required to run the sorting benchmark using Spark/Crail are o
-i /terasort-input-1280g -o /terasort-output-1280g
```

- Have questions or comments? Feel free to discuss below or [contact](http://www.crail.io/people/) us directly.
-
+ Have questions or comments? Feel free to discuss at the dev mailing list at <a href="mailto:dev@crail.apache.org">dev@crail.apache.org</a>
<hr/>
@@ -97,7 +97,7 @@ and SPDK:
```
<div style="text-align: justify">
<p>
- For sequential operations in Crail, metadata fetching is inlined with data operations as described in the <a href="http://www.crail.io/blog/2017/08/crail-memory.html">DRAM</a> blog. This is possible as long as the data transfer has a lower latency than the metadata RPC, which is typically the case. As a consequence, our NVMf storage tier reaches the same throughput as the native SPDK benchmark (device limit).
+ For sequential operations in Crail, metadata fetching is inlined with data operations as described in the <a href="{{ site.base }}/blog/2017/08/crail-memory.html">DRAM</a> blog. This is possible as long as the data transfer has a lower latency than the metadata RPC, which is typically the case. As a consequence, our NVMf storage tier reaches the same throughput as the native SPDK benchmark (device limit).
</p>
</div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/throughput.svg" width="550"/></div>
@@ -106,7 +106,7 @@ For sequential operations in Crail, metadata fetching is inlined with data opera

<div style="text-align: justify">
<p>
- Let us look at the sequential read and write throughput for buffered and direct streams and compare them to a buffered Crail stream on DRAM. All benchmarks are single thread/client performed against 8 storage nodes with 4 drives each, cf. configuration above. In this benchmark we use 32 outstanding operations for the NVMf storage tier buffered stream experiments by using a buffer size of 16MB and a slice size of 512KB, cf. <a href="http://www.crail.io/blog/2017/07/crail-memory.html">part I</a>. The buffered stream reaches line speed at a transfer size of around 1KB and shows only slightly slower performance when compared to the DRAM tier buffered stream. However we are only using 2 outstanding operations with the DRAM tier to achieve these results. Basically for sizes smaller than 1KB the buffered stream is limited by the copy speed to fill the application buffer. The direct stream reaches line speed at around 128KB with 128 outstanding operations. Here no copy operation is performed for transfer size greater than 512Byte (sector size). The command to run the Crail buffered stream benchmark:
+ Let us look at the sequential read and write throughput for buffered and direct streams and compare them to a buffered Crail stream on DRAM. All benchmarks are single thread/client performed against 8 storage nodes with 4 drives each, cf. configuration above. In this benchmark we use 32 outstanding operations for the NVMf storage tier buffered stream experiments by using a buffer size of 16MB and a slice size of 512KB, cf. <a href="{{ site.base }}/blog/2017/07/crail-memory.html">part I</a>. The buffered stream reaches line speed at a transfer size of around 1KB and shows only slightly slower performance when compared to the DRAM tier buffered stream. However we are only using 2 outstanding operations with the DRAM tier to achieve these results. Basically for sizes smaller than 1KB the buffered stream is limited by the copy speed to fill the application buffer. The direct stream reaches line speed at around 128KB with 128 outstanding operations. Here no copy operation is performed for transfer size greater than 512Byte (sector size). The command to run the Crail buffered stream benchmark:
</p>
</div>
```
@@ -133,7 +133,7 @@ Random read latency is limited by the flash technology and we currently see arou

<div style="text-align: justify">
<p>
- In this paragraph we show how Crail can leverage flash memory when there is not sufficient DRAM available in the cluster to hold all the data. As described in the <a href="http://www.crail.io/overview/">overview</a> section, if you have multiple storage tiers deployed in Crail, e.g. the DRAM tier and the NVMf tier, Crail by default first uses up all available resources of the faster tier. Basically a remote resource of a faster tier (e.g. remote DRAM) is preferred over a slower local resource (e.g., local flash), motivated by the fast network. This is what we call horizontal tiering.
+ In this paragraph we show how Crail can leverage flash memory when there is not sufficient DRAM available in the cluster to hold all the data. As described in the <a href="{{ site.base }}/overview/">overview</a> section, if you have multiple storage tiers deployed in Crail, e.g. the DRAM tier and the NVMf tier, Crail by default first uses up all available resources of the faster tier. Basically a remote resource of a faster tier (e.g. remote DRAM) is preferred over a slower local resource (e.g., local flash), motivated by the fast network. This is what we call horizontal tiering.
</p>
</div>
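<p>
The horizontal tiering policy can be sketched as a toy allocation loop. All names here are made up for illustration (this is not Crail's API): tiers are ordered fastest-first across the whole cluster, and a block is taken from the fastest tier that still has free capacity anywhere, so even remote DRAM wins over local flash.
</p>

```java
import java.util.List;

// Toy sketch of horizontal tiering block allocation (illustrative names,
// not Crail's API): pick the fastest tier with free capacity cluster-wide.
public class HorizontalTiering {
    public static class Tier {
        final String name;
        long freeBlocks;
        public Tier(String name, long freeBlocks) {
            this.name = name;
            this.freeBlocks = freeBlocks;
        }
    }

    // Tiers are passed fastest-first, e.g. [DRAM, NVMf].
    public static Tier allocate(List<Tier> tiersFastestFirst) {
        for (Tier t : tiersFastestFirst) {
            if (t.freeBlocks > 0) {
                t.freeBlocks--;   // consume one block from this tier
                return t;
            }
        }
        return null; // all tiers exhausted
    }
}
```

Once the faster tier is exhausted, allocation simply falls through to the next tier, which matches the behavior described above: DRAM is filled first, then the NVMf tier absorbs the overflow.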
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/crail_tiering.png" width="500" vspace="10"/></div>
