fixing site.base everywhere
animeshtrivedi committed Jan 18, 2018
1 parent 91bb3af commit 166bec8d678b93d70a0b1425b14096f90f234f6e
Showing 4 changed files with 20 additions and 20 deletions.
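For context, `{{ site.base }}` is a Liquid expression that Jekyll resolves when the site is built. `base` is not one of Jekyll's built-in settings (those are `url` and `baseurl`), so it is a custom key in the site's `_config.yml` that becomes available as `site.base`. The sketch below is illustrative only; the configuration value shown is an assumption and is not part of this commit.

```html
<!-- Assumed _config.yml entry (hypothetical value, the actual crail.io configuration may differ):
     base: "http://crail.io"
-->

<!-- Before: host hardcoded into every image tag -->
<div style="text-align:center"><img src="http://crail.io/img/blog/sort/terasort_pipeline.png" width="490"></div>

<!-- After: host resolved from the site configuration at build time -->
<div style="text-align:center"><img src="{{ site.base }}/img/blog/sort/terasort_pipeline.png" width="490"></div>
```

With the base URL defined in one place, moving the blog to a different host or switching to https only requires editing `_config.yml` instead of touching every post.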
@@ -108,7 +108,7 @@ A Spark sorting job consists of two phases. The first phase is a mapping or cla
</div>

<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/sort/terasort_pipeline.png" width="490"></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/sort/terasort_pipeline.png" width="490"></div>
<br><br>

### Using Vanilla Spark
@@ -120,7 +120,7 @@ The first question we are interested in is to what extent such a sorting benchma
</div>

<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/sort/net_vanilla.svg" /></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/sort/net_vanilla.svg" /></div>
<br><br>

<div style="text-align: justify">
@@ -130,7 +130,7 @@ The poor network usage matches with the general observation we made in our previ
</div>

<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/sort/cpu_network.svg"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/sort/cpu_network.svg"/></div>
<br><br>

<div style="text-align: justify">
@@ -148,7 +148,7 @@ An overview of the Crail shuffler is provided in the <a href="http://www.crail.i
</div>

<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/sort/shuffle_rdma.png" width="470"></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/sort/shuffle_rdma.png" width="470"></div>
<br><br>

<div style="text-align: justify">
@@ -173,13 +173,13 @@ The figure below shows the overall performance of Spark/Crail vs Spark/Vanilla o
</div>

<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/sort/performance_overall.png" width="470"></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/sort/performance_overall.png" width="470"></div>
<br><br>

One key question of interest is the network usage of the Crail shuffler during the sorting benchmark. In the figure below, we show the data rate at which the different reduce tasks fetch data from the network. Each point in the figure corresponds to one reduce task. In our configuration, we run 3 Spark executors per node and 5 Spark cores per executor. Thus, 1920 reduce tasks are running concurrently (out of 6400 reduce tasks in total), generating cluster-wide all-to-all traffic of about 70 Gbit/s per node during that phase.

<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/sort/multiread.svg"></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/sort/multiread.svg"></div>
<br><br>
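As a quick sanity check of the concurrency figures quoted above, the 1920 concurrently running reduce tasks follow directly from the per-node configuration:

```latex
% Concurrent reduce tasks per node and the implied cluster size
\[
  3~\text{executors/node} \times 5~\text{cores/executor} = 15~\text{concurrent tasks per node},
  \qquad
  \frac{1920~\text{concurrent tasks}}{15~\text{tasks/node}} = 128~\text{nodes}.
\]
```

At 15 concurrent fetchers per node, the reported ~70 Gbit/s per node corresponds to roughly 4.7 Gbit/s of fetch bandwidth per reduce task on average.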

In this blog post, we have shown that Crail successfully translates the raw network performance into actual workload-level gains. The exercise with TeraSort as an application validates the design decisions we made in Crail. Stay tuned for more results with different workloads and hardware configurations.
@@ -60,7 +60,7 @@ One challenge with file read/write operations is to avoid blocking in case block
</p>
</div>
<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-memory/anatomy.png" width="420"></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-memory/anatomy.png" width="420"></div>
<br>
<div style="text-align: justify">
<p>
@@ -98,8 +98,8 @@ The figure below illustrates the sequential write (top) and read (bottom) perfor
</p>
</div>
<br>
<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-memory/write.svg" width="550"/></div>
<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-memory/read.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-memory/write.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-memory/read.svg" width="550"/></div>
<br><br>
<div style="text-align: justify">
<p>
@@ -110,8 +110,8 @@ Note that both figures show single-client performance numbers. With Crail being
</p>
</div>

<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-memory/crail-groupby.svg" width="550"/></div>
<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-memory/spark-groupby.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-memory/crail-groupby.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-memory/spark-groupby.svg" width="550"/></div>

### Random Read Latency

@@ -128,7 +128,7 @@ Typically, distributed storage systems are either built for sequential access to
The figure below illustrates the latencies of get() operations for different key/value sizes and compares them to the latencies we obtained with RAMCloud for the same type of operations (measured using RAMCloud's C and Java APIs). RAMCloud is a low-latency key/value store implemented using RDMA. RAMCloud actually provides durable storage by asynchronously replicating data onto backup devices. However, at any point in time all the data is held in DRAM, and read requests are served from DRAM directly. To the best of our knowledge, RAMCloud is the fastest key/value store that is (a) available open source and (b) can be deployed in practice as a storage platform for applications. Other similar RDMA-based storage systems we looked at, such as FaRM or HERD, are either not open source or do not provide a clean separation between storage system, API, and clients.
</p>
</div>
<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-memory/latency.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-memory/latency.svg" width="550"/></div>

<div style="text-align: justify">
<p>
@@ -142,7 +142,7 @@ The latency advantages of Crail are beneficial also at the application level. Th
</p>
</div>

<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-memory/cdf-broadcast-128-read.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-memory/cdf-broadcast-128-read.svg" width="550"/></div>

<div style="text-align: justify">
<p>
@@ -80,7 +80,7 @@ The main take away from this plot is that the time it takes to perform a random
</p>
</div>

<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-nvmf/latency.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/latency.svg" width="550"/></div>
<br>

<div style="text-align: justify">
@@ -100,7 +100,7 @@ and SPDK:
For sequential operations in Crail, metadata fetching is inlined with data operations as described in the <a href="http://www.crail.io/blog/2017/08/crail-memory.html">DRAM</a> blog. This is possible as long as the data transfer has a lower latency than the metadata RPC, which is typically the case. As a consequence, our NVMf storage tier reaches the same throughput as the native SPDK benchmark (device limit).
</p>
</div>
<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-nvmf/throughput.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/throughput.svg" width="550"/></div>

### Sequential Throughput

@@ -117,7 +117,7 @@ The direct stream benchmark:
./bin/crail iobench -t readAsync -s <size> -k <iterations> -b 128 -w 32 -f /tmp.dat
```

<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-nvmf/throughput2.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/throughput2.svg" width="550"/></div>

### Random Read Latency

@@ -127,7 +127,7 @@ Random read latency is limited by the flash technology and we currently see arou
</p>
</div>

<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-nvmf/latency2.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/latency2.svg" width="550"/></div>

### Tiering DRAM - NVMf

@@ -136,15 +136,15 @@ Random read latency is limited by the flash technology and we currently see arou
In this paragraph we show how Crail can leverage flash memory when there is not sufficient DRAM available in the cluster to hold all the data. As described in the <a href="http://www.crail.io/overview/">overview</a> section, if you have multiple storage tiers deployed in Crail, e.g., the DRAM tier and the NVMf tier, Crail by default first uses up all available resources of the faster tier. Basically, a remote resource of a faster tier (e.g., remote DRAM) is preferred over a slower local resource (e.g., local flash), a choice motivated by the fast network. This is what we call horizontal tiering.
</p>
</div>
<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-nvmf/crail_tiering.png" width="500" vspace="10"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/crail_tiering.png" width="500" vspace="10"/></div>
<br>
<div style="text-align: justify">
<p>
In the following 200G Terasort experiment we gradually limit the DRAM resources in Crail while adding more flash to the Crail NVMf storage tier. Note that here Crail is used for both input/output and shuffle data. The figure shows that by putting all the data in flash we increase the sorting time by only around 48% compared to the configuration where all the data resides in DRAM. Considering the cost of DRAM and the advances in technology described above, we believe cheaper NVM storage can replace DRAM for most applications with only a minor performance decrease. Also note that even with 100% of the data in NVMe, Spark/Crail is still faster than vanilla Spark with all the data in memory. The vanilla Spark experiment uses Alluxio for input/output and RamFS for the shuffle data.
</p>
</div>

<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-nvmf/tiering.svg" width="550"/></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-nvmf/tiering.svg" width="550"/></div>

To summarize, in this blog we have shown that the NVMf storage backend for Crail -- due to its efficient user-level implementation -- offers latencies and throughput very close to the hardware speed. The Crail NVMf storage tier can be used conveniently in combination with the Crail DRAM tier to either save cost or to handle situations where the available DRAM is not sufficient to store the working set of a data processing workload.

@@ -34,7 +34,7 @@ As described in <a href="/blog/2017/08/crail-memory.html">part I</a>, Crail data
</p>
</div>

<div style="text-align:center"><img src ="http://crail.io/img/blog/crail-metadata/rpc.png" width="480"></div>
<div style="text-align:center"><img src ="{{ site.base }}/img/blog/crail-metadata/rpc.png" width="480"></div>
<br>

<div style="text-align: justify">
