- NVIDIA
- Toronto, Canada
Pinned
- triton-inference-server/server Public
  The Triton Inference Server provides an optimized cloud and edge inferencing solution.
- learning-to-quantize Public
  Code for "Adaptive Gradient Quantization for Data-Parallel SGD", published in NeurIPS 2020.
- triton-inference-server/model_analyzer Public
  Triton Model Analyzer is a CLI tool to help understand the compute and memory requirements of Triton Inference Server models.
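As a rough illustration of how Model Analyzer is typically driven, the sketch below invokes its profile subcommand from Python; the repository path and model name are placeholders rather than values taken from this page, while "profile", "--model-repository", and "--profile-models" are documented model-analyzer options.

```python
import subprocess

# Hedged sketch: a typical Model Analyzer profiling run.
# The path and model name are placeholders for illustration only.
cmd = [
    "model-analyzer", "profile",
    "--model-repository", "/workspace/model_repository",  # placeholder path
    "--profile-models", "my_model",                        # placeholder model name
]
subprocess.run(cmd, check=True)
```

In broad strokes, Model Analyzer then sweeps model configurations (for example instance counts and batching settings) and reports per-configuration throughput, latency, and GPU memory use.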
707 contributions in the last year
Activity overview
Contributed to triton-inference-server/server, triton-inference-server/python_backend, triton-inference-server/client and 16 other repositories.
Contribution activity
March 2023
Created a pull request in triton-inference-server/python_backend that received 4 comments
Initialize CUDA driver API before using it
Looks like there were some changes in the CUDA driver API that are affecting our GPU tensor support.
+23 −2 • 4 comments
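In broad strokes, the fix is about ordering: the CUDA driver API must be initialized with cuInit before any other driver call. The sketch below is not the pull request's code (which lives in the backend's C++); it only demonstrates that ordering from Python via ctypes against libcuda.

```python
import ctypes

CUDA_SUCCESS = 0

# Load the CUDA *driver* library (libcuda), not the runtime (libcudart).
libcuda = ctypes.CDLL("libcuda.so.1")

# cuInit() must run before any other driver API call; skipping it makes
# later calls fail with CUDA_ERROR_NOT_INITIALIZED (CUresult 3).
status = libcuda.cuInit(0)
if status != CUDA_SUCCESS:
    raise RuntimeError(f"cuInit failed with CUresult {status}")

device_count = ctypes.c_int(0)
status = libcuda.cuDeviceGetCount(ctypes.byref(device_count))
if status != CUDA_SUCCESS:
    raise RuntimeError(f"cuDeviceGetCount failed with CUresult {status}")

print(f"Devices visible to the driver API: {device_count.value}")
```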
Opened 17 other pull requests in 6 repositories
- triton-inference-server/server: 2 open, 6 merged
- triton-inference-server/python_backend: 3 merged
- triton-inference-server/core: 2 merged
- triton-inference-server/client: 2 merged
- triton-inference-server/common: 1 open
- triton-inference-server/pytorch_backend: 1 merged
Reviewed 50 pull requests in 8 repositories
triton-inference-server/server: 20 pull requests
- Update post-23.03 release
- Add testing for model getting killed during initialization
- Update README and add RELEASE notes for 23.03
- Updated dlpack_test
- Adding repo tag to torch.hub.load to fix compatibility issues
- Linking examples
- Small refactoring (if/else paths)
- Made requested change by GKE to meet expectation
- Fix to catch all parser exception and report error
- Add libtorch ragged test in L0_batcher
- Add GRPC option for restricted protocol access
- Add environment re-extraction test
- Generate identity model to test linalg
- Add parsing parameters to the HTTP and GRPC frontends
- Update README and versions for r23.03 branch
- Add testing for handling bls error in initialize and finalize functions
- Modify L0_backend_python bls test for BLS decoupled support
- Adding test for Model Instance Kind Example added to Python Backend
- Remove redis mentions until finished
- Update NGC version post-23.02 release
triton-inference-server/python_backend: 12 pull requests
- Update dlpack implementation for PbTensor
- Add healthiness check to avoid hanging during model initialization
- Adding repo tag to torch.hub.load to fix compatibility issues
- Fix L0_backend_python timeout issue (#218)
- Fix L0_backend_python timeout issue
- Link properly with dlclose and dlopen libraries
- Pass InferPayload pointer to callback function to fix reinterpret_cast issue on shared_ptr
- Re-extract environment if the archive has been updated
- Add request parameters to Python models
- Improve error message when BLS is used in 'initialize' or 'finalize' function
- Enhancement for BLS decoupled support
- Model Instance Kind Example for python backend
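Several of these python_backend reviews (request parameters, BLS in initialize and finalize, the instance-kind example) revolve around the TritonPythonModel interface. Below is a minimal echo-style sketch of that interface; the INPUT0/OUTPUT0 tensor names are assumptions for illustration, and none of it is code from the listed pull requests.

```python
import json

import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # args carries the model config as a JSON string plus instance
        # details such as "model_instance_kind" ("CPU" or "GPU").
        self.model_config = json.loads(args["model_config"])
        self.instance_kind = args["model_instance_kind"]

    def execute(self, requests):
        responses = []
        for request in requests:
            # Echo INPUT0 back as OUTPUT0 (assumed tensor names).
            input_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            output_tensor = pb_utils.Tensor("OUTPUT0", input_tensor.as_numpy())
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[output_tensor]))
        return responses

    def finalize(self):
        # Called once at model unload; cleanup would go here.
        pass
```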
triton-inference-server/client: 9 pull requests
- support bls
- Reduce overhead for custom input data case
- Add CLI documentation page
- Add active check on GRPC stream
- Add quick start documentation page
- Add option and testing for overhead percentage
- Improve custom data testing
- Update input data usage to loop through to match user-specified sequence length
- Add request send rate detection