Tests to see if half precision (fp16) is working

Measure fp16 performance in the notebook
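For timing outside the notebook, a small helper along these lines may be useful. It is only a sketch: the name `time_call` and its parameters are hypothetical, and the README does not say which framework the notebook uses. GPU work is usually queued asynchronously, so the helper accepts an optional `sync` callable (with PyTorch, for example, one might pass `torch.cuda.synchronize`) that is called before the clock is stopped.

```python
import time
from statistics import median


def time_call(fn, warmup=3, iters=10, sync=None):
    """Return the median wall-clock seconds of fn() over `iters` timed runs.

    `sync` is an optional callable that blocks until queued device work
    has finished (hypothetical hook; a no-op when omitted).
    """
    sync = sync or (lambda: None)

    # Warm-up runs absorb one-time costs such as kernel selection and caching.
    for _ in range(warmup):
        fn()
    sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        sync()  # wait for pending work before stopping the clock
        samples.append(time.perf_counter() - start)
    return median(samples)


if __name__ == "__main__":
    # CPU stand-in workload; replace with an fp32 and an fp16 forward pass.
    seconds = time_call(lambda: sum(i * i for i in range(10_000)))
    print(f"median: {seconds * 1e3:.3f} ms")
```

Timing the same workload once in fp32 and once in fp16 with this helper gives a rough speedup figure; the median is used so that a single slow run does not skew the result.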

Profile the CUDA kernels to confirm that half precision is actually being used:

  1. pip install nvprof
  2. /usr/local/cuda/bin/nvprof --log-file nvprof_output.txt python profile_fp16.py
  3. grep fp16_s884 nvprof_output.txt

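You should see some kernels with `fp16_s884` in their names. If grep prints nothing, no such kernels were launched and the run is probably not using the half-precision Tensor Core path.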
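The grep check can also be scripted, for example to fail a test run when no `s884` kernels appear. The sketch below assumes only that kernel names show up as plain text in `nvprof_output.txt`; the function name `count_s884_kernels` is hypothetical, and the counts are the number of log lines mentioning each kernel, not the number of kernel launches.

```python
import os
import re
import sys
from collections import Counter

# Any identifier-like token containing "s884", e.g. a GEMM kernel name.
KERNEL_RE = re.compile(r"\b\w*s884\w*")


def count_s884_kernels(log_text):
    """Count how often each s884 kernel name is mentioned in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        for name in KERNEL_RE.findall(line):
            counts[name] += 1
    return counts


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "nvprof_output.txt"
    if not os.path.exists(path):
        print(f"{path} not found; run nvprof first")
    else:
        with open(path) as handle:
            found = count_s884_kernels(handle.read())
        if found:
            for name, mentions in found.most_common():
                print(f"{mentions:4d}  {name}")
        else:
            print("no s884 kernels found")
```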

Awesome Nvidia resource on model conversion and why we need to copy model parameters
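One likely reason for copying the parameters (an assumption here, since the linked resource is not reproduced in this README) is that small weight updates are lost when they are accumulated directly in fp16, so mixed-precision training typically keeps an fp32 master copy. The sketch below illustrates the effect with the standard library only, using `struct` to round values to half and single precision; the function names are hypothetical.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def apply_updates(weight, update, steps, keep_fp32_master):
    """Apply `update` to `weight` `steps` times and return the fp16 weight.

    With keep_fp32_master=True the sum is accumulated in an fp32 copy and
    only then cast down to fp16; otherwise everything stays in fp16.
    """
    master = to_fp32(weight)
    w16 = to_fp16(weight)
    for _ in range(steps):
        if keep_fp32_master:
            master = to_fp32(master + update)
            w16 = to_fp16(master)
        else:
            # Near 1.0 the fp16 spacing is about 0.001, so adding 0.0001
            # rounds straight back to the old value.
            w16 = to_fp16(w16 + to_fp16(update))
    return w16


if __name__ == "__main__":
    print("fp16 only:  ", apply_updates(1.0, 1e-4, 100, keep_fp32_master=False))
    print("fp32 master:", apply_updates(1.0, 1e-4, 100, keep_fp32_master=True))
```

In the fp16-only case the weight never moves, while the fp32 master copy accumulates the updates and the fp16 weight follows it.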