Releases: pytorch/serve
TorchServe v0.6.1 Release Notes
This is the release of TorchServe v0.6.1.
New Features
- Metrics Caching in Python backend - #1954 @maaquib @joshuaan7
- ONNX models served via ORT runtime and docs for TensorRT #1857 @msaroufim
- IPEX launcher core pinning #1401 @min-jean-cho - to learn more, see https://pytorch.org/tutorials/intermediate/torchserve_with_ipex.html
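The IPEX launcher item above is driven by `config.properties`. A minimal sketch, assuming the property names from the linked TorchServe-with-IPEX tutorial (values are illustrative, not defaults):

```properties
# Enable Intel Extension for PyTorch and its CPU launcher (illustrative values)
ipex_enable=true
cpu_launcher_enable=true
cpu_launcher_args=--use_logical_core
```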
New Examples
- DLRM example via torchrec #1648 @mreso
- Scriptable tokenizer example for text classification #1691 @mreso
- Loading large Huggingface models by using accelerate #1933 @jagadeeshi2i
- Stable diffusion Deepspeed MII example #1920 @jagadeeshi2i
- HuggingFace diffuser example #1904 @jagadeeshi2i
- On-premise near real-time video inference #1867 @agunapal
- fsspec for large scale batch inference from cloud buckets #1927 @kirkpa
- Torchdata example for unified training and inference preprocessing pipelines #1940 @PratsBhatt
- Wav2Vec2 SpeechToText from Huggingface #1939 @altre
Dependency Upgrades
- Support PyTorch 1.12 and Cuda 11.6 #1767 @lxning
- Upgraded to JDK17 - #1619 @rohithkrn
- Bumped gson version for security #1650 @lxning
Improvements
- Optimized gRPC workflow performance #1854. @lxning
- Fixed workers being reported as ready in the DescribeModel endpoint before the model is loaded #1679. @lxning
- Gracefully handle decoding exceptions in python backend #1789 @msaroufim
- Added handling of OPTIONS requests in management API #1774 @xyang16
- Fixed model status API in KServe #1773 @jagadeeshi2i
- Fixed process verification in pid file - #1866 @rohithkrn
- Updated Nvidia Waveglow/Tacotron2 #1905 @kbumsik
- Added dev mode in `install_from_src.py` #1856 @msaroufim
- Added PV creation for K8s setup #1751 @jagadeeshi2i
- Fixed volume permission in kubernetes setup #1747 @jagadeeshi2i
- Upgraded HPA to the v2beta2 API version #1760 @jagadeeshi2i
- Fixed gradle deprecation method #1936 @lxning
- Updated plugins/gradle.properties #1791 @liyaodev
- Fixed pynvml import failure #1882 @lxning
- Added pynvml exception management #1809 @lromor
- Fixed an erroneous logging format string and pylint pragma #1630 @bradlarsen
- Fixed broken path joins and unclosed files #1709 @DPeled
Build and CI
- Added Ubuntu 20.04 GPU in Docker build - #1773 @msaroufim
- Added spellchecking and link checking automation #1855 @sadra-barikbin
- Added full release automation #1739 @msaroufim
- Added workflow for pushing Conda nightly binaries #1685 @agunapal
- Added code coverage #1665 in CI build @msaroufim
- Unified documentation build dependencies #1759 @msaroufim
- Added skipping spellcheck if no files changed #1919 @maaquib
- Added skipping flaky Java Windows test cases #1746 @msaroufim
- Added alarm on failed github action #1781 @msaroufim
Documentation
- Updated FAQ #1393 with how to decode international-language text @lxning
- Improved KServe documentation #1807 @jagadeeshi2i
- Updated `examples/intel_extension_for_pytorch/README.md` #1816 @min-jean-cho
- Fixed typos and dead links in docs.
Deprecations
- Deprecated old `ci/benchmark/buildspec.yml` #1658 @lxning
- Deprecated old `docker/Dockerfile.neuron.dev` #1775 in favor of AWS SageMaker DLC. @rohithkrn
- Deprecated redundant `LICENSE.txt` #1801 @msaroufim
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above, and JDK17.
GPU Support
Torch 1.11+ Cuda 10.2, 11.3, 11.6
Torch 1.9.0 + Cuda 11.1
Torch 1.8.1 + Cuda 9.2
TorchServe v0.6.0 Release Notes
This is the release of TorchServe v0.6.0.
New Features
- PyTorch 1.11 and Cuda 11.3 support - Added support for PyTorch 1.11 and Cuda 11.3.
- Universal Auto Benchmark and Dashboard Tool - Added a single command-line model-analyzer tool that produces a benchmark report and dashboard on any device.
- HuggingFace model parallelism integration - Added an example of HuggingFace model parallelism integration.
Build and CI
- Added nightly benchmark dashboard.
- Migrated CI, nightly binary, and Docker builds to GitHub workflows.
- Fixed GPU regression test `buildspec.yaml`.
Documentation
- Updated documentation - Updated TorchServe, benchmark, snapshot and configuration documentation; fixed broken documentation build
Deprecations
- Deprecated old
benchmark/automated
directory in favor of new Github Action based workflow
Improvements
- Fixed workflow threads cleanup - Fixed cleanup of the workflow inference thread pool.
- Fixed empty model url - Fixed handling of an empty model URL in the model archiver.
- Fixed load model failure - Added support for loading a model from a directory.
- HuggingFace text generation example - Added text generation example.
- Updated metrics json and qlog format log - Added support for metrics json and qlog format log in log4j2.
- Added cpu, gpu and memory usage - Added CPU, GPU, and memory usage to the `benchmark-ab.py` report.
- Added exception for `torch < 1.8.1` - Added an exception to notify when `torch < 1.8.1` is installed.
- Replaced hard code in `install_dependencies.py` - Replaced the hard-coded interpreter with `sys.executable` in `install_dependencies.py`.
- Added default envelope for workflow - Added a default envelope in the model manager for workflows.
- Fixed multiple docker build errors - Fixed /home/venv write permission and a typo in the Dockerfile, and added common requirements to the Docker build.
- Fixed snapshot test - Fixed the snapshot test.
- Updated `model_zoo.md` - Added dog breed, MMF, and BERT models to the model zoo.
- Added `nvgpu` in common requirements - Added nvgpu to common dependencies.
- Fixed Inference API ping response - Fixed a typo in the Inference API ping response.
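The `sys.executable` change in `install_dependencies.py` follows a common pattern: invoke pip through the interpreter that is currently running, rather than whatever `python` happens to be first on PATH. A minimal illustration of the pattern (not the actual `install_dependencies.py` code; the package name is arbitrary):

```python
import sys

def pip_install_cmd(package):
    # sys.executable is the absolute path of the interpreter running this
    # script, so the package installs into the same environment, even when
    # several Pythons (system, conda, venv) are on PATH.
    return [sys.executable, "-m", "pip", "install", package]

cmd = pip_install_cmd("requests")
print(cmd[1:])  # ['-m', 'pip', 'install', 'requests']
```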
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above.
GPU Support
Torch 1.11+ Cuda 10.2, 11.3
Torch 1.9.0 + Cuda 11.1
Torch 1.8.1 + Cuda 9.2
TorchServe v0.5.3 Release Notes
This is the release of TorchServe v0.5.3.
New Features
- KServe V2 support - Added support for KServe V2 protocol.
- Model customized metadata support - Extended managementAPI to support customized metadata from handler.
Improvements
- Upgraded log4j2 version to 2.17.1 - Added log4j upgrade to address CVE-2021-44832.
- Upgraded pillow to 9.0.0, python support upgraded to py3.8/py3.9 - Added docker, install dependency upgrade.
- GPU utilization and GPU memory usage metrics support - Added support for GPU utilization and GPU memory usage metrics in benchmarks.
- Workflow benchmark support - Added support for workflow benchmark.
- benchmark-ab.py warmup support - Added support for warmup in benchmark-ab.py.
- Multiple inputs for a model inference example - Added example to support multiple inputs for a model inference.
- Documentation refactor - Improved documentation.
- Added API auto-discovery - Added support for API auto-discovery.
- Nightly build support - Added support for GitHub Actions nightly builds: `pip install torchserve-nightly`
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above.
GPU Support
Torch 1.10+ Cuda 10.2, 11.3
Torch 1.9.0 + Cuda 11.1
Torch 1.8.1 + Cuda 9.2
Planned Improvements
TorchServe v0.5.2 Release Notes
This is a hotfix release for the Log4j issue.
Log4j Fix
- Upgraded log4j2 to 2.17.0 to address CVE-2021-45105.
TorchServe v0.5.1 Release Notes
This is a hotfix release for the Log4j issue.
Log4j Fix
- Upgraded log4j2 to 2.16.0 to address CVE-2021-44228 and CVE-2021-45046.
New Features
- IPEX launcher support - Added support for Intel extension for PyTorch.
TorchServe v0.5.0 Release Notes
This is the release of TorchServe v0.5.0.
New Features
- PyTorch 1.10.0 support - TorchServe is now certified working with torch 1.10.0, torchvision 0.11.1, torchtext 0.11.0, and torchaudio 0.10.0
- Kubernetes HPA support - Added support for Kubernetes HPA.
- Faster transformer example - Added example for Faster transformer for optimized transformer model inference.
- (experimental) torchprep support - Added an experimental CLI tool to prepare PyTorch models for efficient inference.
- Custom metrics example - Added example for custom metrics with mtail metrics exporter and Prometheus.
- Reactjs example for Image Classifier - Added example for Reactjs Image Classifier.
Improvements
- Batching inference exception support - Optimized batching to fix a concurrent modification exception that was occurring with batch inference.
- k8s cluster creation support upgrade - Updated Kubernetes cluster creation scripts for v1.17 support.
- Nvidia devices visibility support - Added support for NVIDIA devices visibility.
- Large image support - Added support for PIL.Image.MAX_IMAGE_PIXELS.
- Custom HTTP status support - Added support to return custom http status from a model handler.
- TS_CONFIG_FILE env var support - Added support for setting `TS_CONFIG_FILE` as an env var.
- Frontend build optimization - Optimized frontend to reduce build times by 3.7x.
- Warmup in benchmark - Added support for warmup in benchmark scripts.
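The `TS_CONFIG_FILE` item above is the usual env-var-with-fallback configuration pattern. A sketch of that pattern in general (this is not TorchServe's internal resolution logic, and the fallback filename here is only illustrative):

```python
import os

def resolve_config_path(default="config.properties"):
    # An explicitly set TS_CONFIG_FILE takes precedence over the default
    # location; unset, the default relative path is used.
    return os.environ.get("TS_CONFIG_FILE", default)

os.environ["TS_CONFIG_FILE"] = "/tmp/ts_config.properties"
print(resolve_config_path())  # /tmp/ts_config.properties
```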
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04)
GPU Support
Torch 1.10+ Cuda 10.2, 11.3
Torch 1.9.0 + Cuda 11.1
Torch 1.8.1 + Cuda 9.2
TorchServe v0.4.2 Release Notes
This is a hotfix release of TorchServe v0.4.2.
Improvements
TorchServe v0.4.1 Release Notes
This is the release of TorchServe v0.4.1.
New Features
- PyTorch 1.9.0 support - TorchServe is now certified working with torch 1.9.0, torchvision 0.10.0, torchtext 0.10.0, and torchaudio 0.9.0
- Model configuration support - Added support for model performance tuning on SageMaker via model configuration in config.properties.
- Serialize config snapshots to DynamoDB - Added support for serializing config snapshots to DDB.
- Prometheus metrics plugin support - Added support for Prometheus metrics plugin.
- Kubeflow Pipelines support - Added support for Kubeflow Pipelines and Google Vertex AI Managed Pipelines, with examples.
- KFServing docker support - Added production docker for KFServing.
- Python 3.9 support - TorchServe is now certified working with Python 3.9.
Improvements
- HF BERT models multiple GPU support - Added multi-gpu support for HuggingFace BERT models.
- Error log for custom Python package installation - Added logging of errors during custom Python package installation.
- Workflow documentation optimization - Optimized workflow documentation.
Tooling improvements
- Mar file automation integration - Integrated mar file generation automation into pytest and postman tests.
- Benchmark automation for AWS neuron support - Added support for AWS neuron benchmark automation.
- Staging binary build support - Added support for staging binary build.
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04)
GPU Support
Torch 1.9.0 + Cuda 10.2, 11.1
Torch 1.8.1 + Cuda 9.2, 10.1
TorchServe v0.4.0 Release Notes
This is the release of TorchServe v0.4.0.
New Features
- Workflow support - Added support for sequential and parallel ensemble models with Language Translation and Computer Vision classification examples.
- S3 Model Store SSE support - Added support for S3 server side model encryption via KMS.
- MMF-activity-recognition model example - Added example MMF-activity-recognition model
- PyTorch 1.8.1 support - TorchServe is now certified working with torch 1.8.1, torchvision 0.9.1, torchtext 0.9.1, and torchaudio 0.8.1
Improvements
- Fixed GPU memory high usage issue and updated model zoo - Fixed duplicate process on GPU device.
- gRPC max_request_size support - Added support for gRPC max_request_size configuration in config.properties.
- Non-SSL request support - Added support for non-SSL requests.
- Benchmark automation support - Added support for benchmark automation.
- Mar file generation automation - Added support for automating mar file generation.
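The `max_request_size` setting mentioned above goes in `config.properties`; a minimal fragment, with an illustrative (not default) value in bytes:

```properties
# Maximum allowed request payload size, in bytes (illustrative value)
max_request_size=65535000
```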
Community Contributions
- Fairseq NMT example - Added Fairseq Neural Machine Translation example (contributed by @AshwinChafale)
- DeepLabV3 Image Segmentation example - Added DeepLabV3 Image Segmentation example (contributed by @alvarobartt)
Bug Fixes
- Huggingface_Transformers model example - Fixed Captum explanations failing with HF models.
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04)
GPU Support
Cuda 10.1, 10.2, 11.1