Description
There may be an opportunity to improve prediction latency on CPU and/or GPU instances by building TensorFlow Serving from source, so the binary can use the instruction sets the host CPU actually supports.
For example, on p2.xlarge, TF Serving shows this log message:
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
And on m5.large, TF Serving shows this log message:
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
Here's an example of someone building from source.
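For reference, a build along those lines might look like the following. This is a sketch, not a verified recipe: the bazel target is the standard TF Serving model-server target, and the `--copt` flags are ordinary GCC/Clang instruction-set flags chosen to match the log messages above; pick only the flags your deployment hosts actually support.

```shell
# Sketch: build tensorflow_model_server from source with host-specific SIMD flags.
# --copt flags must match the *deployment* CPU, not just the build machine.
bazel build -c opt \
  --copt=-mavx2 \
  --copt=-mfma \
  --copt=-mavx512f \
  tensorflow_serving/model_servers:tensorflow_model_server
```

Alternatively, `--copt=-march=native` targets whatever CPU the build machine has, which is convenient but ties the binary to that instruction set.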
Question: Different instance types support different instruction sets. What happens if we compile against instruction sets (e.g. AVX-512 on m5.large) that are not available on your instance (e.g. t3.large or even t3a.large)?
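For context on the question: on x86, executing an instruction the CPU does not implement raises SIGILL, so a binary compiled with e.g. `-mavx512f` will typically die with "Illegal instruction" on a host lacking AVX-512 the first time such an instruction runs. One way to see what a given host advertises is to read the kernel's CPU flags; a Linux-only sketch (assumes `/proc/cpuinfo` exists, flag names are the kernel's):

```shell
# Report which of the SIMD extensions from the log messages above
# this host's CPU advertises in /proc/cpuinfo.
for flag in avx avx2 fma avx512f; do
  if grep -qw "$flag" /proc/cpuinfo; then
    echo "$flag: supported"
  else
    echo "$flag: NOT supported"
  fi
done
```

Checking this on the smallest instance type you plan to deploy to, before choosing compile flags, avoids the mismatch the question describes.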