Benchmarking code for evaluating inference latency of NLP models across hardware platforms and deep learning frameworks. For more details, see paper at: https://arxiv.org/abs/2302.06117.
Cite as:
@article{fernandez2023framework,
title={The Framework Tax: Disparities Between Inference Efficiency in Research and Deployment},
author={Fernandez, Jared and Kahn, Jacob and Na, Clara and Bisk, Yonatan and Strubell, Emma},
journal={arXiv preprint arXiv:2302.06117},
year={2023}
}