
## 🔌 DeepSparse Server

The DeepSparse Server allows you to serve models and pipelines from the terminal. The server runs on top of the popular FastAPI web framework and Uvicorn web server. Install the server using the following command:

```bash
pip install deepsparse[server]
```

### Single Model

Once installed, the following example CLI command is available for running inference with a single BERT model:

```bash
deepsparse.server \
--task question_answering \
--model_path "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none"
--model_path "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni"
```

To look up arguments run: `deepsparse.server --help`.
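
Once the server is up, you can send it HTTP requests from any client. Below is a minimal Python sketch; the host/port (`localhost:5543`) and the `/predict` route are assumptions about the server defaults, so confirm them against `deepsparse.server --help` and the interactive `/docs` page that FastAPI serves:

```python
import requests

# Assumed default host, port, and route; adjust to match your running server
url = "http://localhost:5543/predict"

payload = {
    "question": "What's my name?",
    "context": "My name is Snorlax",
}

response = requests.post(url, json=payload)
print(response.json())
```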

### Multiple Models

To serve multiple models from a single deployment, build a `config.yaml` file. In the example below, we define two BERT models for the question answering task:

```yaml
models:
    - task: question_answering
      model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
      batch_size: 1
      alias: question_answering/base
    - task: question_answering
      model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
      batch_size: 1
      alias: question_answering/pruned_quant
```

Finally, after your `config.yaml` file is built, run the server with the config file path passed as an argument:
```bash
deepsparse.server --config_file config.yaml
```

See [Getting Started with the DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) for more info.
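
With multiple models configured, each one is served over HTTP as well. The sketch below assumes the server derives a route from each model's `alias` (for example `/predict/question_answering/base`); the exact route layout is described in the server guide linked above and on the running server's `/docs` page:

```python
import requests

payload = {"question": "What's my name?", "context": "My name is Snorlax"}

# Assumed alias-derived routes; confirm against the running server's /docs page
for alias in ["question_answering/base", "question_answering/pruned_quant"]:
    url = f"http://localhost:5543/predict/{alias}"
    print(alias, requests.post(url, json=payload).json())
```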

## 📜 DeepSparse Benchmark

The benchmark tool is available from the CLI for running expressive model benchmarks on the DeepSparse Engine with minimal parameters.
```
deepsparse.benchmark [-h] [-b BATCH_SIZE] [-shapes INPUT_SHAPES]
                     [-ncores NUM_CORES] [-s {async,sync}] [-t TIME]
                     [-nstreams NUM_STREAMS] [-pin {none,core,numa}]
                     [-q] [-x EXPORT_PATH]
                     model_path
```

[Getting Started with CLI Benchmarking](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark_model) includes examples of select inference scenarios:
- Synchronous (Single-stream) Scenario
- Asynchronous (Multi-stream) Scenario
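
To make the synchronous scenario concrete, here is a rough Python sketch of the per-request latency measurement that the single-stream case corresponds to. It uses the `compile_model` and `generate_random_inputs` helpers shown later in this README; the `mobilenetv2-7.onnx` file is downloaded in the Custom ONNX example below, and any ONNX model should work in its place:

```python
import time

from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

# Any local ONNX file works here; this one is downloaded in the example below
onnx_filepath = "mobilenetv2-7.onnx"
batch_size = 1

engine = compile_model(onnx_filepath, batch_size)
inputs = generate_random_inputs(onnx_filepath, batch_size)

# Single-stream (synchronous) scenario: requests are issued one at a time
# and per-request latency is measured
latencies = []
for _ in range(100):
    start = time.perf_counter()
    engine.run(inputs)
    latencies.append(time.perf_counter() - start)

print(f"mean latency: {1000 * sum(latencies) / len(latencies):.2f} ms")
```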


## 👩‍💻 NLP Inference Example

```python
from deepsparse.transformers import pipeline

# SparseZoo model stub or path to ONNX file
onnx_filepath="zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98"
model_path = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni"

qa_pipeline = pipeline(
task="question-answering",
model_path=model_path,
)

my_name = qa_pipeline(question="What's my name?", context="My name is Snorlax")
```

NLP Tutorials:
- [Getting Started with Hugging Face Transformers 🤗](https://github.com/neuralmagic/deepsparse/tree/main/examples/huggingface-transformers)

Tasks Supported:
- [Token Classification: Named Entity Recognition](https://neuralmagic.com/use-cases/sparse-named-entity-recognition/)
- [Text Classification: Multi-Class](https://neuralmagic.com/use-cases/sparse-multi-class-text-classification/)
- [Text Classification: Binary](https://neuralmagic.com/use-cases/sparse-binary-text-classification/)
- [Text Classification: Sentiment Analysis](https://neuralmagic.com/use-cases/sparse-sentiment-analysis/)
- [Question Answering](https://neuralmagic.com/use-cases/sparse-question-answering/)
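
The same `pipeline` helper covers the other tasks listed above. A minimal sketch for sentiment analysis, assuming `"sentiment-analysis"` is an accepted task name; the model location below is a placeholder, so swap in a real SparseZoo stub or a local model directory:

```python
from deepsparse.transformers import pipeline

# Task name assumed; see the tutorials above for the supported task strings.
# The model location is a placeholder, not a real SparseZoo stub.
sentiment_pipeline = pipeline(
    task="sentiment-analysis",
    model_path="path/to/sentiment/model",
)

print(sentiment_pipeline("DeepSparse makes CPU inference fast"))
```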

## 🦉 SparseZoo ONNX vs. Custom ONNX Models

DeepSparse can accept ONNX models from two sources:

- **SparseZoo ONNX**: our open-source collection of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) hosts inference-optimized models, trained on repeatable sparsification recipes using state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml).

- **Custom ONNX**: your own ONNX model, which can be dense or sparse. Plug in your model to compare performance with other solutions.

```bash
> wget https://github.com/onnx/models/raw/main/vision/classification/mobilenet/model/mobilenetv2-7.onnx
```

```python
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

onnx_filepath = "mobilenetv2-7.onnx"
batch_size = 16

# Generate random sample input for the model
inputs = generate_random_inputs(onnx_filepath, batch_size)

# Compile the model and run inference
engine = compile_model(onnx_filepath, batch_size)
outputs = engine.run(inputs)
```
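
The same engine API can be pointed at a SparseZoo stub instead of a local file. A minimal sketch, assuming `compile_model` resolves and downloads SparseZoo stubs automatically (verify against the DeepSparse docs), using the question answering stub shown earlier in this README:

```python
from deepsparse import compile_model

# SparseZoo stub used elsewhere in this README; assumed to be resolved and
# downloaded automatically by compile_model
model_stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni"
batch_size = 1

engine = compile_model(model_stub, batch_size)
print(engine)  # inspect the compiled engine
```
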
Compatibility/Support Notes:
- ONNX version 1.5-1.7
- ONNX opset version 11+
- ONNX IR version has not been tested at this time

The [GitHub repository](https://github.com/neuralmagic/deepsparse) includes package APIs along with examples to quickly get started benchmarking and inferencing sparse models.


## Scheduling Single-Stream, Multi-Stream, and Elastic Inference

The DeepSparse Engine offers up to three types of inferences based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md).

1 ⚡ Single-stream scheduling: the latency/synchronous scenario, requests execute serially. [default]

Use Case: It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets.

2 ⚡ Multi-stream scheduling: the throughput/asynchronous scenario, requests execute in parallel.

PRO TIP: The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count and where requests need to be made asynchronously without time to batch them.
3 ⚡ Elastic scheduling: requests execute in parallel, but not multiplexed on individual NUMA nodes.

Use Case: A workload that might benefit from the elastic scheduler is one in which multiple requests need to be handled simultaneously, but where performance is hindered when those requests have to share an L3 cache.
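
The scheduler can also be chosen programmatically when compiling a model. A minimal sketch, assuming `compile_model` accepts a `scheduler` argument and that `"multi_stream"` is one of the accepted values (see the Inference Types doc linked above for the exact names):

```python
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

onnx_filepath = "mobilenetv2-7.onnx"  # downloaded in the Custom ONNX example above
batch_size = 1

# The scheduler argument and its value are assumptions; check the scheduler docs
engine = compile_model(onnx_filepath, batch_size, scheduler="multi_stream")

inputs = generate_random_inputs(onnx_filepath, batch_size)
outputs = engine.run(inputs)
```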

## 🧰 CPU Hardware Support

Specific support for some algorithms varies across microarchitectures; see the [DeepSparse documentation](https://docs.neuralmagic.com/deepsparse/) for a detailed support table.

## Resources

#### Libraries
- [DeepSparse](https://docs.neuralmagic.com/deepsparse/)
- [SparseML](https://docs.neuralmagic.com/sparseml/)
- [SparseZoo](https://docs.neuralmagic.com/sparsezoo/)
- [Sparsify](https://docs.neuralmagic.com/sparsify/)

#### Versions
- [DeepSparse](https://pypi.org/project/deepsparse) | stable
- [DeepSparse-Nightly](https://pypi.org/project/deepsparse-nightly/) | nightly (dev)
- [GitHub](https://github.com/neuralmagic/deepsparse/releases) | releases

#### Info
- [Blog](https://www.neuralmagic.com/blog/)
- [Resources](https://www.neuralmagic.com/resources/)


## Community

Contribute with code, examples, integrations, and documentation as well as bug reports and feature requests! [Learn how here.](https://github.com/neuralmagic/deepsparse/blob/main/CONTRIBUTING.md)

For user help or questions about DeepSparse, sign up or log in to our **[Deep Sparse Community Slack](https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ)**. We are growing the community member by member and happy to see you there. Bugs, feature requests, or additional questions can also be posted to our [GitHub Issue Queue.](https://github.com/neuralmagic/deepsparse/issues) You can get the latest news, webinar and event invites, research papers, and other ML Performance tidbits by [subscribing](https://neuralmagic.com/subscribe/) to the Neural Magic community.

For more general questions about Neural Magic, complete this [form.](http://neuralmagic.com/contact/)
