Your first task is to prepare a development environment with the required software.

### Create workspace directory

Create a separate directory for all the dependencies and repositories that this Learning Path uses.

Export the `WORKSPACE` variable to point to this directory, which you will use in the following steps:

```bash
export PATH=$PATH:~/Library/Android/sdk/cmdline-tools/latest/bin
```
{{< /tab >}}
{{< /tabpane >}}

Now that your development environment is ready and all the prerequisites are installed, you can move on to test the Stable Audio Open Small model.
You can learn more about this model on its Hugging Face model page.

### Good prompting practices

A good prompt for the Stable Audio Open Small model can include the following elements:

* Music genre and subgenre.
* Musical elements (texture, rhythm and articulation).
The order of prompt parameters matters.
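As a quick illustration of these practices, a prompt can be assembled from the recommended elements in order of importance. The element values below are invented examples, not taken from the model documentation:

```python
# Build a prompt from the recommended elements, most important first.
# All element values here are invented examples.
elements = [
    "house",                              # genre and subgenre
    "warm arpeggios",                     # musical elements: texture
    "driving four-on-the-floor rhythm",   # musical elements: rhythm
    "120BPM",                             # tempo
]
prompt = ", ".join(elements)
print(prompt)
```

Keeping the most important descriptors first follows the ordering guidance above.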

You can explore training and inference code for audio generation models in the [Stable Audio Tools repository](https://github.com/Stability-AI/stable-audio-tools).

Now that you've downloaded the model, you're ready to convert it to LiteRT format in the next step.

---
title: Convert Stable Audio Open Small model to LiteRT
weight: 4

### FIXED, DO NOT MODIFY
layout: learningpathall
---
In this section, you will learn about the audio generation model. You will then clone a repository that contains the scripts required to convert the model submodules into LiteRT format and generate the inference application.

## Stable Audio Open Small

The open-source model consists of three main submodules. They are described in the table below, and come together through the pipeline shown in the image.

|Submodule|Description|
|------|------|
|Conditioners|Encodes the text prompt into embeddings; based on the T5Encoder model.|
|DiT|A diffusion transformer that iteratively denoises latents, guided by the prompt embeddings.|
|AutoEncoder|Decodes the denoised latents into the final audio waveform.|
```bash
git clone https://github.com/ARM-software/ML-examples.git
cd ML-examples/kleidiai-examples/audiogen/
```

Install the required Python packages, including *onnx2tf* and *ai_edge_litert*:

```bash
bash install_requirements.sh
```

The Conditioners submodule is based on the T5Encoder model. First, convert it to ONNX, then to LiteRT.

For this conversion, the following steps are required:
1. Load the Conditioners submodule from the Stable Audio Open Small model configuration and checkpoint.
2. Export the Conditioners submodule to ONNX via *torch.onnx.export()*.
3. Convert the resulting ONNX file to LiteRT using *onnx2tf*.

After successful conversion, you now have `dit_model.tflite` and `autoencoder_model.tflite` models in your current directory.
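As an optional sanity check (a sketch, assuming you run it from the same directory as the conversion output), you can confirm the converted files are present:

```python
from pathlib import Path

# Names taken from the conversion steps above; the files exist only
# after the export scripts have completed successfully.
expected = ["dit_model.tflite", "autoencoder_model.tflite"]
missing = [name for name in expected if not Path(name).exists()]
print("missing:", missing)
```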

A more detailed explanation of the above scripts is available [here](https://github.com/ARM-software/ML-examples/blob/main/kleidiai-examples/audiogen/scripts/README.md).

For easy access, add all the required models to one directory:

```bash
export LITERT_MODELS_PATH=$WORKSPACE/litert-models
cp dit_model.tflite $LITERT_MODELS_PATH
cp autoencoder_model.tflite $LITERT_MODELS_PATH
```
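For illustration only, the same gather step can be sketched in Python. The snippet creates dummy stand-in files in a temporary directory so it is self-contained; in the Learning Path the real converted models are copied instead:

```python
import os
import shutil
import tempfile

# Stand-in setup: dummy .tflite files in a temporary directory.
src = tempfile.mkdtemp()
dst = os.path.join(src, "litert-models")
os.makedirs(dst, exist_ok=True)

# Model filenames as produced by the conversion steps above.
for name in ["dit_model.tflite", "autoencoder_model.tflite"]:
    open(os.path.join(src, name), "wb").close()   # create empty dummy file
    shutil.copy(os.path.join(src, name), dst)     # gather into one directory

print(sorted(os.listdir(dst)))
```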

With all three submodules now converted to LiteRT format, you're ready to build the runtime and run Stable Audio Open Small directly on an Android device in the next step.




## LiteRT

LiteRT (short for Lite Runtime), formerly known as TensorFlow Lite, is Google's high-performance runtime for on-device AI. Designed for low-latency, resource-efficient execution, LiteRT is optimized for mobile and embedded environments — making it a natural fit for Arm CPUs running models like Stable Audio Open Small. You’ll build the runtime using the Bazel build tool.

## Build LiteRT libraries

```bash
cmake ../tensorflow/lite/tools/cmake/native_tools/flatbuffers
cmake --build .
```

Now that LiteRT and FlatBuffers are built, you're ready to compile and deploy the Stable Audio Open Small inference application on your Android device.




## Create and build a simple program

As a final step, you’ll build a simple program that runs inference on all three submodules directly on an Android device.

The program takes a text prompt as input and generates an audio file as output.
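Before looking at the real application, the overall shape of the program can be sketched with a placeholder. The functions below are invented for illustration: instead of running the Conditioners, DiT, and AutoEncoder submodules, the "generate" step simply writes a short sine tone, so the prompt-in, WAV-out structure is runnable anywhere:

```python
import math
import struct
import wave

def generate_audio(prompt: str, seconds: float = 0.5, rate: int = 44100) -> bytes:
    """Placeholder for the real Conditioners -> DiT -> AutoEncoder pipeline.
    Ignores the prompt and synthesizes a 440 Hz sine tone as 16-bit PCM."""
    n = int(seconds * rate)
    samples = (int(32767 * 0.3 * math.sin(2 * math.pi * 440 * i / rate)) for i in range(n))
    return struct.pack(f"<{n}h", *samples)

def write_wav(path: str, pcm: bytes, rate: int = 44100) -> None:
    # Wrap raw mono 16-bit PCM samples in a WAV container.
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(pcm)

pcm = generate_audio("warm arpeggios on house beats")
write_wav("output.wav", pcm)
```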

Verify this model was downloaded to your `WORKSPACE`:

```bash
ls $WORKSPACE/spiece.model
```

Connect your Android device to your development machine using a cable. adb (Android Debug Bridge) is available as part of the Android SDK.

You should see your device listed when you run the following command:

```bash
adb devices
```

```bash
LD_LIBRARY_PATH=. ./audiogen . "warm arpeggios on house beats 120BPM with drums"
exit
```

You can now pull the generated `output.wav` back to your host machine and listen to the result.

```bash
adb pull /data/local/tmp/app/output.wav
```
minutes_to_complete: 30
who_is_this_for: This is an introductory topic for developers looking to deploy the Stable Audio Open Small text-to-audio model using LiteRT on an Android device.

learning_objectives:
- Download and test the Stable Audio Open Small model.
- Convert the Stable Audio Open Small model to the LiteRT (.tflite) format.
- Compile the application for an Arm CPU.
- Create a simple application that generates audio.
- Run the application on an Android smartphone and generate an audio snippet.

prerequisites: