diff --git a/examples/sparseserver-ui/README.md b/examples/sparseserver-ui/README.md
index 65a5bc659f..8810b84b74 100644
--- a/examples/sparseserver-ui/README.md
+++ b/examples/sparseserver-ui/README.md
@@ -56,7 +56,7 @@ pip install -r requirements.txt
 The `config.yaml` file in the `server` directory includes a list of four BERT QA models for the DeepSparse Server to get started. If you prefer to add additional models to the `config.yaml` file, make sure to also add a `MultiPipelineClient` object to the `variants` attribute in the `settings.py` module.
 
-Currently, the SparseZoo contains 20 BERT models, and the `big-config.yaml` file contains the full list in case you want to load them all 🤯. To load all of the 20 models at once, make sure you have at least 16GB of RAM available, otherwise you will get out of memory errors. In addition, uncomment the pipelines in the `settings.py` module.
+Currently, the SparseZoo hosts a large collection of BERT models, and the `big-config.yaml` file contains 19 of them in case you want to load them all 🤯. To load all 19 models at once, make sure you have at least 16GB of RAM available; otherwise, you will run into out-of-memory errors. In addition, uncomment the corresponding pipelines in the `settings.py` module.
 
 For more details on question answering models, please refer to our [updated list](https://sparsezoo.neuralmagic.com/?domain=nlp&sub_domain=question_answering&page=1).
 
@@ -82,7 +82,7 @@ Visit `http://localhost:8501` in your browser to view the demo.
 
 ### Testing
 
-- 20 models should fit on 16GB RAM of a c2-standard-4 VM instance on GCP
+- 19 models should fit on 16GB RAM of a c2-standard-4 VM instance on GCP
 - Ubuntu 20.04.4 LTS
 - Python 3.8.10
diff --git a/examples/sparseserver-ui/client/settings.py b/examples/sparseserver-ui/client/settings.py
index 3f5b03e33f..5cb71a5d53 100644
--- a/examples/sparseserver-ui/client/settings.py
+++ b/examples/sparseserver-ui/client/settings.py
@@ -85,9 +85,6 @@ class FeatureHandler:
     #     "3-Layer BERT, 83% of Base Accuracy": MultiPipelineClient(
     #         model="question_answering/3lagg83"
     #     ),
-    #     "12-Layer BERT, 90% of Base Accuracy": MultiPipelineClient(
-    #         model="question_answering/12layer_pruned90"
-    #     ),
     }
 
     title = "✨ Neural Magic ✨"
diff --git a/examples/sparseserver-ui/server/big-config.yaml b/examples/sparseserver-ui/server/big-config.yaml
index e7179842cf..ade88d1252 100644
--- a/examples/sparseserver-ui/server/big-config.yaml
+++ b/examples/sparseserver-ui/server/big-config.yaml
@@ -85,10 +85,6 @@ models:
     model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
     batch_size: 1
     alias: question_answering/12l_pruned80_quant
-  - task: question_answering
-    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned90-none
-    batch_size: 1
-    alias: question_answering/12layer_pruned90
   - task: question_answering
     model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
     batch_size: 1