This repository was archived by the owner on Jun 3, 2025. It is now read-only.
Merged
4 changes: 2 additions & 2 deletions examples/sparseserver-ui/README.md
@@ -56,7 +56,7 @@ pip install -r requirements.txt

The `config.yaml` file in the `server` directory includes a list of four BERT QA models for the DeepSparse Server to get started. If you prefer to add additional models to the `config.yaml` file, make sure to also add a `MultiPipelineClient` object to the `variants` attribute in the `settings.py` module.
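As a hedged illustration of that pairing, a `variants` entry in `settings.py` might look like the sketch below. The display label is hypothetical, the `MultiPipelineClient` class here is a minimal stand-in for the one defined in this repository, and the `model` string reuses an alias that appears in `big-config.yaml`:

```python
class MultiPipelineClient:
    """Minimal stand-in for the client class used in settings.py."""

    def __init__(self, model: str):
        # The model string must match an `alias` defined in the
        # server's config.yaml / big-config.yaml.
        self.model = model


# Maps a human-readable label (shown in the UI) to a pipeline client.
variants = {
    "12-Layer BERT, 80% Pruned + Quantized": MultiPipelineClient(
        model="question_answering/12l_pruned80_quant"
    ),
}
```

Each label added here should have a matching model entry on the server side, otherwise requests for that variant will fail.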

- Currently, the SparseZoo contains 20 BERT models, and the `big-config.yaml` file contains the full list in case you want to load them all 🤯. To load all of the 20 models at once, make sure you have at least 16GB of RAM available, otherwise you will get out of memory errors. In addition, uncomment the pipelines in the `settings.py` module.
+ Currently, the SparseZoo holds a vast list of BERT models, and the `big-config.yaml` file contains 19 models in case you want to load them 🤯. To load all of the 19 models at once, make sure you have at least 16GB of RAM available, otherwise you will get out of memory errors. In addition, uncomment the pipelines in the `settings.py` module.

For more details on question answering models, please refer to our [updated list](https://sparsezoo.neuralmagic.com/?domain=nlp&sub_domain=question_answering&page=1).

@@ -82,7 +82,7 @@ Visit `http://localhost:8501` in your browser to view the demo.

### Testing

- - 20 models should fit on 16GB RAM of a c2-standard-4 VM instance on GCP
+ - 19 models should fit on 16GB RAM of a c2-standard-4 VM instance on GCP
- Ubuntu 20.04.4 LTS
- Python 3.8.10
3 changes: 0 additions & 3 deletions examples/sparseserver-ui/client/settings.py
@@ -85,9 +85,6 @@ class FeatureHandler:
# "3-Layer BERT, 83% of Base Accuracy": MultiPipelineClient(
# model="question_answering/3lagg83"
# ),
-# "12-Layer BERT, 90% of Base Accuracy": MultiPipelineClient(
-#     model="question_answering/12layer_pruned90"
-# ),
}

title = "<h1 style='text-align: Center; color: white;'>✨ Neural Magic ✨</h1>"
4 changes: 0 additions & 4 deletions examples/sparseserver-ui/server/big-config.yaml
Expand Up @@ -85,10 +85,6 @@ models:
model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
batch_size: 1
alias: question_answering/12l_pruned80_quant
-- task: question_answering
-  model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned90-none
-  batch_size: 1
-  alias: question_answering/12layer_pruned90
- task: question_answering
model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
batch_size: 1
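For reference, one complete model entry from the `models:` list in `big-config.yaml` follows the shape below. This is a sketch with values copied from the hunk above, not the full file:

```yaml
models:
  - task: question_answering
    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
    batch_size: 1
    # The alias is the name clients use to select this pipeline,
    # e.g. in the MultiPipelineClient entries of settings.py.
    alias: question_answering/12l_pruned80_quant
```

Each entry pairs a SparseZoo model stub (`model_path`) with a short `alias`; the PR above deletes one such entry so the server loads 19 models instead of 20.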