Skip to content

Quantize TinyLlama-1.1B-Chat from PyTorch to CoreML (float16, int8, int4) for efficient on-device inference on iOS 18+.

Notifications You must be signed in to change notification settings

ambv231/tinyllama-coreml-ios18-quantization

Error
Looks like something went wrong!

About

Quantize TinyLlama-1.1B-Chat from PyTorch to CoreML (float16, int8, int4) for efficient on-device inference on iOS 18+.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages