This repository was archived by the owner on Jun 3, 2025. It is now read-only.
Merged
41 commits
a071ac5
add megasparse dir
InquestGeronimo Apr 5, 2022
1d7712a
edited readme to support DS integration
InquestGeronimo Apr 5, 2022
9db5ca8
small edit to docstring
InquestGeronimo Apr 5, 2022
47bfdbb
edited settings.py module so only 2 models are loaded.
InquestGeronimo Apr 5, 2022
9eea303
added more context to the readme about adding more models.
InquestGeronimo Apr 5, 2022
03bc62e
fixed image
InquestGeronimo Apr 5, 2022
dc09979
added different host to streamlit client as default
InquestGeronimo Apr 5, 2022
d1ec856
quality check edits
InquestGeronimo Apr 5, 2022
5f26aae
quality check commit
InquestGeronimo Apr 5, 2022
970e9dd
passing copyright quality check
InquestGeronimo Apr 5, 2022
a7fe2fb
content edits
InquestGeronimo Apr 5, 2022
f0e2e50
rename dir to sparseserver-ui
InquestGeronimo Apr 8, 2022
9195b90
added new config file for quick start
InquestGeronimo Apr 8, 2022
7714dce
edited multipipelineclient in settings.py
InquestGeronimo Apr 8, 2022
29ada34
changed name of config file
InquestGeronimo Apr 8, 2022
dee99e5
changed name of config file
InquestGeronimo Apr 8, 2022
9f3fdda
edited model stubs
InquestGeronimo Apr 8, 2022
a1a0a5b
edited readme
InquestGeronimo Apr 8, 2022
d53f792
added dependency pins
InquestGeronimo Apr 8, 2022
e0a7080
Merge branch 'main' into megasparse
InquestGeronimo Apr 8, 2022
e4facd3
Merge branch 'main' into megasparse
InquestGeronimo Apr 8, 2022
8286403
changed server pin
InquestGeronimo Apr 8, 2022
7aa50f9
Merge branch 'megasparse' of github.com:InquestGeronimo/deepsparse in…
InquestGeronimo Apr 8, 2022
62bbcf9
Merge branch 'main' into megasparse
InquestGeronimo Apr 8, 2022
4de8d88
edited model choice logic
InquestGeronimo Apr 8, 2022
d854ef5
Merge branch 'megasparse' of github.com:InquestGeronimo/deepsparse in…
InquestGeronimo Apr 8, 2022
c9ab2a8
altered front-end features
InquestGeronimo Apr 9, 2022
f4d041e
style update
InquestGeronimo Apr 9, 2022
95f895a
renamed samples file
InquestGeronimo Apr 9, 2022
deefbe5
renamed samples file
InquestGeronimo Apr 9, 2022
8951944
added variant descriptions
InquestGeronimo Apr 11, 2022
dceb0bc
edited variant descriptions
InquestGeronimo Apr 11, 2022
684b46f
style changes
InquestGeronimo Apr 11, 2022
14ea603
style changes
InquestGeronimo Apr 11, 2022
4ecec22
edited samples.py module
InquestGeronimo Apr 11, 2022
11f3f4d
style changes
InquestGeronimo Apr 11, 2022
476dfc6
added new pic
InquestGeronimo Apr 11, 2022
bc81378
edited README
InquestGeronimo Apr 11, 2022
ead19b8
edited README
InquestGeronimo Apr 11, 2022
96b2c76
edited README
InquestGeronimo Apr 11, 2022
4f50de8
updated readme cdms
InquestGeronimo Apr 11, 2022
88 changes: 88 additions & 0 deletions examples/sparseserver-ui/README.md
@@ -0,0 +1,88 @@
<!--
Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+) <sup><samp>[**NEURAL MAGIC**](https://neuralmagic.com)</samp></sup> 
![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)

███████╗██████╗ █████╗ ██████╗ ███████╗███████╗ ███████╗███████╗██████╗ ██╗ ██╗███████╗██████╗ ██╗ ██╗ ██╗
██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔════╝██╔════╝ ██╔════╝██╔════╝██╔══██╗██║ ██║██╔════╝██╔══██╗ ██║ ██║ ██║
███████╗██████╔╝███████║██████╔╝███████╗█████╗ ███████╗█████╗ ██████╔╝██║ ██║█████╗ ██████╔╝ ██║ ██║ ██║
╚════██║██╔═══╝ ██╔══██║██╔══██╗╚════██║██╔══╝ ╚════██║██╔══╝ ██╔══██╗╚██╗ ██╔╝██╔══╝ ██╔══██╗ ██║ ██║ ██║
███████║██║ ██║ ██║██║ ██║███████║███████╗ ███████║███████╗██║ ██║ ╚████╔╝ ███████╗██║ ██║ ██╗ ╚██████╔ ██║
╚══════╝╚═╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚══════╝ ╚══════╝╚══════╝╚═╝ ╚═╝ ╚═══╝ ╚══════╝╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝


*** A Streamlit app for deploying the DeepSparse Server ***
![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00
F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)![#00F](https://via.placeholder.com/15/00F/000000?text=+)


## <div>`INTRO`</div>

<samp>

<div>
SparseServer.UI lets you serve a Streamlit app running on top of the DeepSparse Server to compare the latency of sparse transformer models. The app is meant to help you familiarize yourself with, and compare, the inference performance of transformers trained with various sparsification approaches.
</div>

<br />

[Getting Started with the DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server)

<br />

![alt text](./img/demo_screenshot.png)

<br />

## <div>`INSTALLATION`</div>

```bash
git clone https://github.com/neuralmagic/deepsparse.git
cd deepsparse/examples/sparseserver-ui
pip install -r requirements.txt
```
<br />

The `config.yaml` file in the `server` directory lists four BERT QA models to get you started with the DeepSparse Server. If you add more models to the `config.yaml` file, make sure to also add a matching `MultiPipelineClient` object to the `variants` attribute in the `settings.py` module.
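
As a rough illustration of that second step, here is a minimal sketch of a `variants` mapping. It assumes `variants` is a dict keyed by the display name shown in the sidebar (which is how `client/app.py` indexes it). The display names and model aliases are hypothetical, and the small class is only a stand-in that mirrors how `client/pipelineclient.py` builds its URL, so the snippet runs on its own.

```python
class MultiPipelineClient:
    """Stand-in for client/pipelineclient.py: it only builds the route URL."""

    def __init__(self, model: str, address: str = "0.0.0.0", port: str = "5543"):
        self.model = model
        self._url = f"http://{address}:{port}/predict/{self.model}"


# Hypothetical display names and aliases. Each alias is assumed to match
# the alias of a model listed in server/config.yaml.
variants = {
    "12-Layer BERT Base, Not Sparsified": MultiPipelineClient(
        model="question_answering/base"
    ),
    "My Extra Sparse BERT": MultiPipelineClient(
        model="question_answering/my_extra_model"
    ),
}

for name, client in variants.items():
    print(f"{name} -> {client._url}")
```

Keeping one entry per served model means the sidebar choice and the server route stay in sync.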

Currently, the SparseZoo contains 20 BERT models, and the `big-config.yaml` file contains the full list in case you want to load them all 🤯. To load all 20 models at once, make sure you have at least 16GB of RAM available; otherwise you will get out-of-memory errors. You also need to uncomment the corresponding pipelines in the `settings.py` module.

For more details on question answering models, please refer to our [updated list](https://sparsezoo.neuralmagic.com/?domain=nlp&sub_domain=question_answering&page=1).

## <div>`START SERVER`</div>

To download and initialize the four models in the `config.yaml` file, run:
```bash
deepsparse.server --config_file server/config.yaml
```

Once the models have been downloaded, the DeepSparse Server should be running on host `0.0.0.0` and port `5543`.
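
To check the server without the Streamlit client, a request can be built by hand. The sketch below uses only the standard library and assumes the `/predict/<model alias>` route that `client/pipelineclient.py` targets. The alias `question_answering/base` and the helper name `build_predict_request` are hypothetical. The request is built but not sent, so the snippet runs even when no server is up.

```python
import json
import urllib.request


def build_predict_request(model, question, context, host="0.0.0.0", port=5543):
    """Build (but do not send) a POST request for one question-answering input."""
    url = f"http://{host}:{port}/predict/{model}"
    body = json.dumps({"question": question, "context": context}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    model="question_answering/base",  # hypothetical alias
    question="What does the engine exploit?",
    context="The engine exploits sparsity within neural networks.",
)
print(request.get_method(), request.full_url)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request, timeout=30) as reply:
#     print(json.load(reply))
```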

## <div>`START CLIENT`</div>

Open a new terminal (make sure the same Python environment is activated) and run the following command to start the Streamlit app:

```bash
streamlit run client/app.py --browser.serverAddress="localhost"
```

This will start the Streamlit app on `localhost` and port `8501`.
Visit `http://localhost:8501` in your browser to view the demo.

### Testing

- All 20 models should fit in the 16GB of RAM of a c2-standard-4 VM instance on GCP
- Ubuntu 20.04.4 LTS
- Python 3.8.10
</samp>
54 changes: 54 additions & 0 deletions examples/sparseserver-ui/client/app.py
@@ -0,0 +1,54 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from time import perf_counter

import streamlit as st
from samples import sample
from settings import FeatureHandler as feat


# Titles
st.markdown(feat.title, unsafe_allow_html=True)
st.markdown(feat.subtitle, unsafe_allow_html=True)

# Sidebar
st.sidebar.selectbox(feat.tasks_desc, feat.tasks)
model_choice = st.sidebar.radio(feat.variants_desc, feat.variants.keys())
st.sidebar.markdown(feat.code_banner)
st.sidebar.code(body=feat.code_text, language=feat.language)
st.sidebar.markdown(feat.repo_test)

# Footer
st.markdown(feat.footer, unsafe_allow_html=True)

# Inference
model = feat.variants[model_choice]
selection = st.selectbox(feat.example_index_label, feat.example_index)
context = st.text_area(
    label=feat.example_context_label, value=sample[selection]["context"], height=300
)
question = st.text_area(
    label=feat.example_question_label, value=sample[selection]["question"]
)
start = perf_counter()
answer = model(question=question, context=context)
end = perf_counter()
infer_time = end - start
infer_time = round(infer_time, 4)
st.markdown(feat.markdown_style, unsafe_allow_html=True)
st.markdown(
    f'<p class="big-font">ANSWER: {answer["answer"]}</p>', unsafe_allow_html=True
)
st.markdown(f'<p class="big-font">{infer_time} secs.</p>', unsafe_allow_html=True)
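
The timing logic in `app.py` could be factored into a small helper. The following is only a sketch: `timed_call` and `fake_model` are hypothetical names, and `fake_model` stands in for a `MultiPipelineClient` so the snippet runs without a server. Note that in the app the measured interval covers the whole HTTP round trip to the server, not only the model's inference.

```python
from time import perf_counter
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(**kwargs) and return its result with the elapsed seconds."""
    start = perf_counter()
    result = fn(**kwargs)
    elapsed = round(perf_counter() - start, 4)
    return result, elapsed


def fake_model(question: str, context: str) -> dict:
    """Stand-in for a client: 'answers' with the first word of the context."""
    return {"answer": context.split()[0]}


answer, seconds = timed_call(
    fake_model, question="First word?", context="sparsity helps"
)
print(f"ANSWER: {answer['answer']} ({seconds} secs.)")
```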
47 changes: 47 additions & 0 deletions examples/sparseserver-ui/client/pipelineclient.py
@@ -0,0 +1,47 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import json
from typing import Any, Dict

import requests


class MultiPipelineClient:
    """
    Client object for making requests to the example DeepSparse BERT inference server

    :param model: model alias of FastAPI route
    :param address: IP address of the server, default is 0.0.0.0
    :param port: Port the server is hosted on, default is 5543
    """

    def __init__(self, model: str, address: str = "0.0.0.0", port: str = "5543"):
        self.model = model
        self._url = f"http://{address}:{port}/predict/{self.model}"

    def __call__(self, **kwargs) -> Dict[str, Any]:
        """
        :param kwargs: named inputs to the model server pipeline. e.g. for
            question-answering - `question="...", context="..."`

        :return: json outputs from running the model server pipeline with the given
            input(s)
        """
        # POST the named inputs as a JSON body and decode the JSON reply
        response = requests.post(self._url, json=kwargs)
        return json.loads(response.content)
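
The client above decodes whatever the server returns, even for a failed request. One possible hardening is to check the HTTP status first. The sketch below is not part of this PR: `decode_prediction` is a hypothetical helper, and the `score` field in the sample reply is an assumption (only `answer` is read by `app.py`).

```python
import json
from typing import Any, Dict


def decode_prediction(status_code: int, content: bytes) -> Dict[str, Any]:
    """Decode a server reply, failing loudly on a non-200 status."""
    if status_code != 200:
        raise RuntimeError(f"server returned HTTP {status_code}: {content[:200]!r}")
    return json.loads(content)


# Assumed reply shape for a question-answering pipeline.
reply = decode_prediction(200, b'{"answer": "sparsity", "score": 0.9}')
print(reply["answer"])
```

Inside `__call__`, this would be called as `decode_prediction(response.status_code, response.content)`.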
55 changes: 55 additions & 0 deletions examples/sparseserver-ui/client/samples.py
@@ -0,0 +1,55 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

sample = {
    "example 1": {
        "context": (
            "The DeepSparse Engine is a CPU runtime that delivers "
            "GPU-class performance by taking advantage of sparsity within neural "
            "networks to reduce compute required as well as accelerate memory bound "
            "workloads. It is focused on model deployment and scaling machine "
            "learning pipelines, fitting seamlessly into your existing deployments "
            "as an inference backend. "
        ),
        "question": (
            "What does the DeepSparse Engine take advantage of within neural networks?"
        ),
    },
    "example 2": {
        "context": (
            "Concerns were raised over whether Levi's Stadium's field was of a high "
            "enough quality to host a Super Bowl; during the inaugural season, the "
            "field had to be re-sodded multiple times due to various issues, and "
            "during a week 6 game earlier in the 2015 season, a portion of the turf "
            "collapsed under Baltimore Ravens kicker Justin Tucker, causing him "
            "to slip and miss a field goal. "
        ),
        "question": ("What collapsed on Justin Tucker?"),
    },
    "example 3": {
        "context": (
            "The league announced on October 16, 2012, that the two finalists were Sun "
            "Life Stadium and Levi's Stadium. The South Florida/Miami area has "
            "previously hosted the event 10 times (tied for most with New Orleans), "
            "with the most recent one being Super Bowl XLIV in 2010. The San Francisco "
            "Bay Area last hosted in 1985 (Super Bowl XIX), held at Stanford Stadium "
            "in Stanford, California, won by the home team 49ers. The Miami bid "
            "depended on whether the stadium underwent renovations. "
        ),
        "question": (
            "What was the most recent Super Bowl that took place at Sun "
            "Life Stadium in Miami?"
        ),
    },
}
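
Since `app.py` reads `sample[selection]["context"]` and `sample[selection]["question"]`, every entry needs both keys. A minimal sketch of adding an entry and checking the structure follows. The shortened sample, the `"example 4"` entry, and the `find_invalid_samples` helper are hypothetical. The selectbox options come from `feat.example_index` in `settings.py`, which is not shown here and would presumably need the new key as well.

```python
from typing import Dict, List

# Shortened stand-in for the dict in samples.py.
sample: Dict[str, Dict[str, str]] = {
    "example 1": {
        "context": "The DeepSparse Engine is a CPU runtime.",
        "question": "What is the DeepSparse Engine?",
    },
}

# Hypothetical new entry, keyed the same way as the existing ones.
sample["example 4"] = {
    "context": "Streamlit turns Python scripts into web apps.",
    "question": "What does Streamlit turn Python scripts into?",
}


def find_invalid_samples(samples: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of entries missing a non-empty context or question."""
    bad = []
    for name, entry in samples.items():
        for field in ("context", "question"):
            value = entry.get(field)
            if not isinstance(value, str) or not value.strip():
                bad.append(name)
                break
    return bad


print(find_invalid_samples(sample))
```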