335 contributions in the last year
Reviewed 9 pull requests in 1 repository
tensorflow/tensorflow: 9 pull requests
- [XLA] Enable pointwise row vectorization for small row.
- [ROCm] Proper fix for XLA unit test seg fault issue.
- Adding no_rocm tags to current tests that are broken on ROCm.
- [ROCm] Enable xlogy and xlog1py for HIP GPU.
- Fix the shape inference function for FusedBatchNormGradEx
- Caching of Pointers to Redzone Checker Kernel
- [ROCm] creating a wrapper for the rocblas lib, and updating all calls to use the wrapper
- Improve BiasAdd GPU performance with more threads
- BFloat16 ROCM implementation for GEMM.