From 7e861e03f4112443376f47e13ffea0321215cfd7 Mon Sep 17 00:00:00 2001
From: Ricky Costa
Date: Fri, 1 Apr 2022 12:39:40 -0400
Subject: [PATCH 1/4] altered emoji and title font sizes to match other readmes

---
 src/deepsparse/benchmark_model/README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/deepsparse/benchmark_model/README.md b/src/deepsparse/benchmark_model/README.md
index 789a23255e..f721be7180 100644
--- a/src/deepsparse/benchmark_model/README.md
+++ b/src/deepsparse/benchmark_model/README.md
@@ -14,11 +14,11 @@ See the License for the specific language governing permissions and
limitations under the License.
-->

-# Benchmarking ONNX Models 📜
+## 📜 Benchmarking ONNX Models

`deepsparse.benchmark` is a command-line (CLI) tool for benchmarking the DeepSparse Engine with ONNX models. The tool will parse the arguments, download/compile the network into the engine, generate input tensors, and execute the model depending on the chosen scenario. By default, it will choose a multi-stream or asynchronous mode to optimize for throughput.

-## Quickstart
+### Quickstart

After `pip install deepsparse`, the benchmark tool is available on your CLI. For example, to benchmark a dense BERT ONNX model fine-tuned on the SST2 dataset where the model path is the minimum input required to get started, run:

```bash
deepsparse.benchmark zoo:nlp/text_classification/bert-base/pytorch/huggingface/sst2/base-none
```
__ __
-## Usage
+### Usage

In most cases, good performance will be found in the default options so it can be as simple as running the command with a SparseZoo model stub or your local ONNX model. However, if you prefer to customize benchmarking for your personal use case, you can run `deepsparse.benchmark -h` or with `--help` to view your usage options:

@@ -100,7 +100,7 @@ Output of the JSON file:

![alt text](./img/json_output.png)

-## Sample CLI Argument Configurations
+### Sample CLI Argument Configurations

To run a sparse FP32 MobileNetV1 at batch size 16 for 10 seconds for throughput using 8 streams of requests:

From 487493523594992c918aec1383b872f792984523 Mon Sep 17 00:00:00 2001
From: Ricky Costa
Date: Fri, 1 Apr 2022 12:41:47 -0400
Subject: [PATCH 2/4] altered emoji and title font sizes to match other readmes

---
 src/deepsparse/benchmark_model/README.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/deepsparse/benchmark_model/README.md b/src/deepsparse/benchmark_model/README.md
index f721be7180..c67133744e 100644
--- a/src/deepsparse/benchmark_model/README.md
+++ b/src/deepsparse/benchmark_model/README.md
@@ -100,7 +100,7 @@ Output of the JSON file:

![alt text](./img/json_output.png)

-### Sample CLI Argument Configurations
+#### Sample CLI Argument Configurations

To run a sparse FP32 MobileNetV1 at batch size 16 for 10 seconds for throughput using 8 streams of requests:

@@ -114,21 +114,21 @@ To run a sparse quantized INT8 6-layer BERT at batch size 1 for latency:

```bash
deepsparse.benchmark zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_6layers-aggressive_96 --batch_size 1 --scenario sync
```
__ __
-## Inference Scenarios ⚡⚡
+### ⚡ Inference Scenarios

-### Synchronous (Single-stream) Scenario
+#### Synchronous (Single-stream) Scenario

Set by the `--scenario sync` argument, the goal metric is latency per batch (ms/batch).
This scenario submits a single inference request at a time to the engine, recording the time taken for a request to return an output. This mimics an edge deployment scenario. The latency value reported is the mean of all latencies recorded during the execution period for the given batch size.

-### Asynchronous (Multi-stream) Scenario
+#### Asynchronous (Multi-stream) Scenario

Set by the `--scenario async` argument, the goal metric is throughput in items per second (i/s). This scenario submits `--num_streams` concurrent inference requests to the engine, recording the time taken for each request to return an output. This mimics a model server or bulk batch deployment scenario. The throughput value reported comes from measuring the number of finished inferences within the execution time and the batch size.

-### Example Benchmarking Output of Synchronous vs. Asynchronous
+#### Example Benchmarking Output of Synchronous vs. Asynchronous

**BERT 3-layer FP32 Sparse Throughput**

From 988f768c6c60156107ae471d3eee62873dc1fd86 Mon Sep 17 00:00:00 2001
From: Ricky Costa
Date: Mon, 4 Apr 2022 13:33:45 -0400
Subject: [PATCH 3/4] fix yaml code block indentation

---
 README.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index b719fd6094..91696a09cf 100644
--- a/README.md
+++ b/README.md
@@ -97,15 +97,17 @@ To look up arguments run: `deepsparse.server --help`.

**⭐ Multiple Models ⭐**

To serve multiple models in your deployment you can easily build a `config.yaml`. In the example below, we define two BERT models in our configuration for the question answering task:
-    models:
+```yaml
+models:
    - task: question_answering
    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
    batch_size: 1
-    alias: question_answering/dense
+    alias: question_answering/base
    - task: question_answering
    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant-aggressive_95
    batch_size: 1
-    alias: question_answering/sparse_quantized
+    alias: question_answering/pruned_quant
+```

Finally, after your `config.yaml` file is built, run the server with the config file path as an argument:

From 15cebcacaa66a49b0aa7e88d25f3b037c67c18ed Mon Sep 17 00:00:00 2001
From: Ricky Costa
Date: Mon, 4 Apr 2022 13:50:59 -0400
Subject: [PATCH 4/4] aligned indentation 2nd time

---
 README.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 91696a09cf..54e8bfa51f 100644
--- a/README.md
+++ b/README.md
@@ -100,13 +100,13 @@ To serve multiple models in your deployment you can easily build a `config.yaml`
```yaml
models:
    - task: question_answering
-    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
-    batch_size: 1
-    alias: question_answering/base
+      model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
+      batch_size: 1
+      alias: question_answering/base
    - task: question_answering
-    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant-aggressive_95
-    batch_size: 1
-    alias: question_answering/pruned_quant
+      model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant-aggressive_95
+      batch_size: 1
+      alias: question_answering/pruned_quant
```

Finally, after your `config.yaml` file is built, run the server with the config file path as an argument:
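For readers who want to see what the synchronous scenario in these benchmark README changes actually measures, below is a minimal sketch, separate from the patch set above, of the single-stream loop that `deepsparse.benchmark --scenario sync` reports as ms/batch. It assumes the documented `deepsparse.compile_model` / `Engine.run` Python API; the local `./model.onnx` path and the `[batch, 3, 224, 224]` input shape are hypothetical placeholders, not values taken from the patches.

```python
# Sketch of the synchronous (single-stream) scenario: one request at a time,
# latency recorded per request, mean latency reported as ms/batch.
# Assumptions: a local ONNX model at ./model.onnx with one float32 input of
# shape [batch, 3, 224, 224] -- both are hypothetical placeholders.
import time

import numpy as np
from deepsparse import compile_model

batch_size = 1
engine = compile_model("./model.onnx", batch_size=batch_size)

# One random input tensor matching the assumed model signature.
inputs = [np.random.rand(batch_size, 3, 224, 224).astype(np.float32)]

# Warm up the engine before timing.
for _ in range(10):
    engine.run(inputs)

# Submit single requests for ~10 seconds and record each latency.
latencies_ms = []
deadline = time.perf_counter() + 10.0
while time.perf_counter() < deadline:
    start = time.perf_counter()
    engine.run(inputs)
    latencies_ms.append((time.perf_counter() - start) * 1000.0)

print(f"mean latency: {np.mean(latencies_ms):.2f} ms/batch")
print(f"throughput:   {len(latencies_ms) * batch_size / 10.0:.2f} items/sec")
```

For real measurements, the `deepsparse.benchmark` CLI shown in the patches remains the supported entry point; the loop above only makes the reported latency metric concrete.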