NSTiwari/Llama3-on-Mobile

MobileLlama3: Llama3 on Mobile

This repository shows how to quantize and convert the Llama3-8B-Instruct model weights and deploy the model on Android for on-device inference.
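As a rough illustration of the quantize-and-convert step, the sketch below assembles the three command lines one might run with the MLC-LLM CLI (the project cited in the Citation section). It only builds the argument lists and prints them; nothing is executed. The model path `models/Meta-Llama-3-8B-Instruct`, the `q4f16_1` quantization mode, the `llama-3` conversation template, the `dist` output layout, and the helper name `build_mlc_commands` are all assumptions made for illustration and are not taken from this repository. Subcommand and flag names follow the MLC-LLM documentation as I understand it, so check them against the installed version.

```python
from pathlib import PurePosixPath
import shlex


def build_mlc_commands(model_dir, quantization="q4f16_1",
                       conv_template="llama-3", out_root="dist"):
    """Return three MLC-LLM CLI invocations as argv lists (nothing is executed).

    All defaults are illustrative assumptions, not values from this repository.
    """
    model_dir = PurePosixPath(model_dir)
    # Hypothetical output layout: "<out_root>/<model>-<quantization>-MLC".
    out_dir = PurePosixPath(out_root) / f"{model_dir.name}-{quantization}-MLC"
    lib_path = (PurePosixPath(out_root) / "libs"
                / f"{model_dir.name}-{quantization}-android.tar")
    return [
        # 1. Quantize the original weights and convert them to MLC format.
        ["mlc_llm", "convert_weight", str(model_dir),
         "--quantization", quantization, "-o", str(out_dir)],
        # 2. Generate the chat config (mlc-chat-config.json) next to the weights.
        ["mlc_llm", "gen_config", str(model_dir),
         "--quantization", quantization,
         "--conv-template", conv_template, "-o", str(out_dir)],
        # 3. Compile a model library for the Android target.
        ["mlc_llm", "compile", str(out_dir / "mlc-chat-config.json"),
         "--device", "android", "-o", str(lib_path)],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in build_mlc_commands("models/Meta-Llama-3-8B-Instruct"):
        print(shlex.join(argv))
```

Returning argument lists rather than running the commands keeps the sketch safe to try anywhere; each list can be passed to `subprocess.run` once the paths and flags have been confirmed for the local setup.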

Pipeline:

Demo Output:

Resources:

  1. Colab notebook to quantize and convert the Llama3-8B-Instruct model.
  2. Hugging Face repository with the converted Llama3-8B-Instruct weights.
  3. Medium blog with a step-by-step guide to deploying Llama3-8B-Instruct on Android.
  4. Medium blog on setting up the environment on a Google Cloud Platform VM instance.
  5. APK for direct installation.

Citation

@software{mlc-llm,
    author = {MLC team},
    title = {{MLC-LLM}},
    url = {https://github.com/mlc-ai/mlc-llm},
    year = {2023}
}
