v0.1.0-rc1

Pre-release
rsomani95 released this 06 Mar 08:04

This release includes the tiny, small and base encoders, each in stock and optimised variants, at fp16 and fp32 precision.
All benchmarks were run on a 16-inch M1 Max MacBook Pro (2021) with 64 GB RAM, running macOS 13.0.

[Screenshot: CleanShot 2023-03-06 at 13 32 56]

Key observations from the Xcode benchmarks:

  1. All FP16 exports (stock and optimised arch) run 100% on the ANE
  2. All FP32 exports (stock and optimised arch) run 100% on the GPU
  3. In FP16, the optimised arch is significantly faster than the stock arch
  4. In FP32, the stock arch is significantly faster than the optimised arch

Caveats:

  • It's unclear exactly how Xcode runs its benchmark, and we do not know how performance scales at larger batch sizes.
  • Though the FP32 stock model looks the fastest per Xcode, we may see a different result when running the optimised ANE model (FP16) at larger batch sizes, as it is 100% ANE-accelerated.
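
One way to probe the batch-size caveat outside Xcode is to time predictions directly. The sketch below is only an illustration, not how Xcode measures performance: `benchmark` and `make_coreml_predict` are hypothetical helper names, the model path, input name and sample shape are placeholders, and it assumes the exported model accepts a variable batch dimension, which this release does not confirm.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, Tuple


def benchmark(predict: Callable[[int], object],
              batch_sizes: Iterable[int],
              warmup: int = 3,
              runs: int = 10) -> Dict[int, float]:
    """Return the median per-sample latency (ms) for each batch size."""
    results = {}
    for batch_size in batch_sizes:
        # Warm-up calls are excluded from timing (first calls often pay setup costs).
        for _ in range(warmup):
            predict(batch_size)
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            predict(batch_size)
            timings.append(time.perf_counter() - start)
        results[batch_size] = statistics.median(timings) * 1000 / batch_size
    return results


def make_coreml_predict(model_path: str,
                        input_name: str,
                        sample_shape: Tuple[int, ...]) -> Callable[[int], object]:
    """Hypothetical wiring for a Core ML export (requires macOS, coremltools, numpy).

    Assumes the model takes one float32 input whose first dimension is a
    flexible batch dimension; adjust to the actual export.
    """
    import coremltools as ct
    import numpy as np

    # CPU_AND_NE restricts execution to CPU + Neural Engine; use
    # ct.ComputeUnit.CPU_AND_GPU to compare against the GPU path.
    model = ct.models.MLModel(model_path, compute_units=ct.ComputeUnit.CPU_AND_NE)

    def predict(batch_size: int):
        batch = np.zeros((batch_size, *sample_shape), dtype=np.float32)
        return model.predict({input_name: batch})

    return predict
```

Calling `benchmark(make_coreml_predict(path, name, shape), [1, 2, 4, 8])` for each export would give per-sample latencies that can be compared across batch sizes. Note that the requested compute units are a preference: Core ML may still schedule some layers elsewhere.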