Merged
32 commits
4413025
Update learning path weights and add overview for AFM-4.5B deployment…
madeline-underwood Jul 15, 2025
e6a37c1
Update _index.md
madeline-underwood Jul 16, 2025
07fe4ef
Update 00_overview.md
madeline-underwood Jul 16, 2025
f3d7d09
Update 01_launching_a_graviton4_instance.md
madeline-underwood Jul 16, 2025
e8a313a
Merge branch 'ArmDeveloperEcosystem:main' into Arcee
madeline-underwood Jul 17, 2025
002706f
Update _index.md
madeline-underwood Jul 17, 2025
325e40f
Update 00_overview.md
madeline-underwood Jul 17, 2025
763e3cc
Update 00_overview.md
madeline-underwood Jul 17, 2025
1aa9be2
Update 01_launching_a_graviton4_instance.md
madeline-underwood Jul 17, 2025
affe227
Update 01_launching_a_graviton4_instance.md
madeline-underwood Jul 17, 2025
db3af87
Update _index.md
madeline-underwood Jul 18, 2025
08d7da9
Update _index.md
madeline-underwood Jul 18, 2025
b3f475a
Update 01_launching_a_graviton4_instance.md
madeline-underwood Jul 18, 2025
f1d8a2a
Update 01_launching_a_graviton4_instance.md
madeline-underwood Jul 18, 2025
b47d193
Update 02_setting_up_the_instance.md
madeline-underwood Jul 18, 2025
6ad6e74
Update 03_building_llama_cpp.md
madeline-underwood Jul 18, 2025
9815562
Update 03_building_llama_cpp.md
madeline-underwood Jul 18, 2025
4325e30
Update 04_install_python_dependencies_for_llama_cpp.md
madeline-underwood Jul 18, 2025
fe61016
Update 04_install_python_dependencies_for_llama_cpp.md
madeline-underwood Jul 18, 2025
21b51d4
Update 04_install_python_dependencies_for_llama_cpp.md
madeline-underwood Jul 18, 2025
21aa51a
Update 05_downloading_and_optimizing_afm45b.md
madeline-underwood Jul 18, 2025
dc5df12
Update 05_downloading_and_optimizing_afm45b.md
madeline-underwood Jul 18, 2025
ea07cdf
Update 05_downloading_and_optimizing_afm45b.md
madeline-underwood Jul 18, 2025
c110403
Update 05_downloading_and_optimizing_afm45b.md
madeline-underwood Jul 18, 2025
b03dced
Update 06_running_inference.md
madeline-underwood Jul 18, 2025
ef1be6c
Update 07_evaluating_the_quantized_models.md
madeline-underwood Jul 18, 2025
cd4adbd
Update 07_evaluating_the_quantized_models.md
madeline-underwood Jul 18, 2025
c957be0
Update 08_conclusion.md
madeline-underwood Jul 19, 2025
cc793c5
Update 08_conclusion.md
madeline-underwood Jul 19, 2025
acb2c2d
Update 08_conclusion.md
madeline-underwood Jul 20, 2025
77d24c7
Multiple enhancements
madeline-underwood Jul 20, 2025
a3857d8
Final tweaks
madeline-underwood Jul 20, 2025
@@ -0,0 +1,37 @@
---
title: Overview
weight: 2

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## The AFM-4.5B model

AFM-4.5B is a 4.5-billion-parameter foundation model designed to balance accuracy, efficiency, and broad language coverage. Trained on nearly 7 trillion tokens of carefully filtered data, it performs well across a wide range of languages, including Arabic, English, French, German, Hindi, Italian, Korean, Mandarin, Portuguese, Russian, and Spanish.

In this Learning Path, you'll deploy AFM-4.5B using [Llama.cpp](https://github.com/ggerganov/llama.cpp) on an Arm-based AWS Graviton4 instance. You’ll walk through the full workflow, from setting up your environment and compiling the runtime, to downloading, quantizing, and running inference on the model. You'll also evaluate model quality using perplexity, a common metric for measuring how well a language model predicts text.

This hands-on guide helps developers build cost-efficient, high-performance LLM applications on modern Arm server infrastructure using open-source tools and real-world deployment practices.

### LLM deployment workflow on Arm Graviton4

- **Provision compute**: launch an EC2 instance using a Graviton4-based instance type (for example, `c8g.4xlarge`)

- **Set up your environment**: install the required build tools and dependencies (such as CMake, Python, and Git)

- **Build the inference engine**: clone the [Llama.cpp](https://github.com/ggerganov/llama.cpp) repository and compile the project for your Arm-based environment

- **Prepare the model**: download the **AFM-4.5B** model files from Hugging Face and use Llama.cpp's quantization tools to reduce model size and optimize performance

- **Run inference**: load the quantized model and run sample prompts using Llama.cpp

- **Evaluate model quality**: calculate **perplexity** or use other metrics to assess model performance

{{< notice Note>}}
You can reuse this deployment flow with other models supported by Llama.cpp by swapping out the model file and adjusting quantization settings.
{{< /notice >}}




@@ -1,171 +1,118 @@
---
title: Launching a Graviton4 instance
weight: 2
title: Provision your Graviton4 environment
weight: 3

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Requirements

- An AWS account
Before you begin, make sure you have the following:

- Access to launch an EC2 instance of type `c8g.4xlarge` (or larger) with at least 128 GB of storage
- An AWS account
- Permission to launch a Graviton4 EC2 instance of type `c8g.4xlarge` (or larger)
- At least 128 GB of available storage

For more information about creating an EC2 instance using AWS refer to [Getting Started with AWS](/learning-paths/servers-and-cloud-computing/csp/aws/).
If you're new to EC2, check out the Learning Path [Getting Started with AWS](/learning-paths/servers-and-cloud-computing/csp/aws/).

## AWS Console Steps
## Create an SSH key pair

Follow these steps to launch your EC2 instance using the AWS Management Console:
To deploy the Arcee AFM-4.5B model, you need an EC2 instance running on Arm-based Graviton4 hardware.

### Step 1: Create an SSH Key Pair
To do this, start by signing in to the [AWS Management Console](https://console.aws.amazon.com), then navigate to the **EC2** service.

1. **Navigate to EC2 Console**
From there, you can create an SSH key pair that allows you to connect to your instance securely.

- Go to the [AWS Management Console](https://console.aws.amazon.com)
## Set up secure access

- Search for "EC2" and click on "EC2" service
Open the **Key Pairs** section under **Network & Security** in the sidebar, and create a new key pair named `arcee-graviton4-key`.

2. **Create Key Pair**
Next, select **RSA** as the key type, and **.pem** as the file format. Once you create the key, your browser will download the `.pem` file automatically.

- In the left navigation pane, click "Key Pairs" under "Network & Security"
To ensure the key remains secure and accessible, move the `.pem` file to your SSH configuration directory, and update its permissions to restrict access.

- Click "Create key pair"
To do this, on macOS or Linux, run:

- Enter name: `arcee-graviton4-key`
```bash
mkdir -p ~/.ssh
mv arcee-graviton4-key.pem ~/.ssh/
chmod 400 ~/.ssh/arcee-graviton4-key.pem
```
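
If you want to confirm that the key file ended up with restrictive permissions, the following is a minimal sketch. The `key_is_private` function is a hypothetical helper, not part of any AWS or SSH tooling. It assumes either GNU `stat` (Linux) or BSD `stat` (macOS) is available, and it accepts only modes `400` and `600`.

```bash
# Hypothetical helper: succeed only if the key file exists and is
# readable (or readable/writable) by its owner alone.
key_is_private() {
  local key="$1" mode
  [ -f "$key" ] || return 1
  # GNU stat (Linux) uses -c '%a'; BSD stat (macOS) uses -f '%Lp'.
  mode=$(stat -c '%a' "$key" 2>/dev/null || stat -f '%Lp' "$key" 2>/dev/null) || return 1
  case "$mode" in
    [46]00) return 0 ;;
    *) return 1 ;;
  esac
}

if key_is_private "$HOME/.ssh/arcee-graviton4-key.pem"; then
  echo "Key permissions look fine"
else
  echo "Key is missing or too permissive; run chmod 400 on it"
fi
```

SSH refuses private keys that other users can read, so running a check like this before connecting can save a confusing error later.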
## Launch and configure the EC2 instance

- Select "RSA" as the key pair type
In the left sidebar of the EC2 dashboard, select **Instances**, and then **Launch instances**.

- Select ".pem" as the private key file format
Use the following settings to configure your instance:

- Click "Create key pair"
- **Name**: `Arcee-Graviton4-Instance`
- **Application and OS image**:
- Select the **Quick Start** tab
- Select **Ubuntu Server 24.04 LTS (HVM), SSD Volume Type**
- Ensure the architecture is set to **64-bit (ARM)**
- **Instance type**: select `c8g.4xlarge` or larger
- **Key pair name**: select `arcee-graviton4-key` from the list

- The private key file will automatically download to your computer
## Configure network

3. **Secure the Key File**
To enable internet access, choose a VPC with at least one public subnet.

- Move the downloaded `.pem` file to the SSH configuration directory
Then select a public subnet from the list.

```bash
mkdir -p ~/.ssh
mv arcee-graviton4-key.pem ~/.ssh
```
Under **Auto-assign public IP**, select **Enable**.

- Set proper permissions on macOS or Linux:
## Configure firewall

```bash
chmod 400 ~/.ssh/arcee-graviton4-key.pem
```
Select **Create security group**, then select **Allow SSH traffic from** and choose **My IP** from the dropdown list.

### Step 2: Launch EC2 Instance
{{% notice Note %}}
You'll only be able to connect to the instance from your current host, which is the most secure setting. Avoid selecting **Anywhere** unless absolutely necessary, as this setting allows anyone on the internet to attempt a connection.

1. **Start Instance Launch**

- In the left navigation pane, click "Instances" under "Instances"

- Click "Launch instances" button

2. **Configure Instance Details**

- **Name and tags**: Enter `Arcee-Graviton4-Instance` as the instance name

- **Application and OS Images**:
- Click "Quick Start" tab

- Select "Ubuntu"

- Choose "Ubuntu Server 24.04 LTS (HVM), SSD Volume Type"

- **Important**: Ensure the architecture shows "64-bit (ARM)" for Graviton compatibility

- **Instance type**:
- Click on "Select instance type"

- Select `c8g.4xlarge` or larger

3. **Configure Key Pair**

In "Key pair name", select the SSH key pair you created earlier (`arcee-graviton4-key`)

4. **Configure Network Settings**

- **Network**: Select a VPC with at least one public subnet.

- **Subnet**: Select a public subnet in the VPC

- **Auto-assign Public IP**: Enable

- **Firewall (security groups)**

- Click on "Create security group"

- Click on "Allow SSH traffic from"

- In the dropdown list, select "My IP".


{{% notice Notes %}}
You will only be able to connect to the instance from your current host, which is the safest setting. Selecting "Anywhere" allows anyone on the Internet to attempt to connect; use at your own risk.

Although this demonstration only requires SSH access, it is possible to use one of your existing security groups as long as it allows SSH traffic.
You only need SSH access for this Learning Path. If you already have a security group that allows inbound SSH traffic, you can reuse it.
{{% /notice %}}

5. **Configure Storage**

- **Root volume**:
- Size: `128` GB

- Volume type: `gp3`

7. **Review and Launch**

- Review all settings in the "Summary" section
## Configure storage

- Click "Launch instance"
Set the **root volume size** to `128` GB, then select **gp3** as the volume type.

### Step 3: Monitor Instance Launch
## Review and launch the instance

1. **View Launch Status**
Review all your configuration settings, and when you're ready, select **Launch instance** to create your EC2 instance.

After a few seconds, you should see a message similar to this one:
## Monitor the instance launch

`Successfully initiated launch of instance (i-<unique instance ID>)`
After a few seconds, you should see a confirmation message like this:

If instance launch fails, please review your settings and try again.
```
Successfully initiated launch of instance (i-xxxxxxxxxxxxxxxxx)
```

2. **Get Connection Information**
If the launch fails, double-check the instance type, permissions, and network settings.

- Click on the instance id, or look for the instance in the Instances list in the EC2 console.
To retrieve the connection details, go to the **Instances** list in the EC2 dashboard.

- In the "Details" tab of the instance, note the "Public DNS" host name
Then open your instance by selecting its **Instance ID**.

- This is the host name you'll use to connect via SSH, aka `PUBLIC_DNS_HOSTNAME`
In the **Details** tab, copy the **Public DNS** value. You'll use this to connect through SSH.

### Step 4: Connect to Your Instance
## Connect to your instance

1. **Open Terminal/Command Prompt**
Open a terminal and connect to the instance using the SSH key you downloaded earlier:

2. **Connect via SSH**
```bash
ssh -i ~/.ssh/arcee-graviton4-key.pem ubuntu@<PUBLIC_DNS_HOSTNAME>
```
```bash
ssh -i ~/.ssh/arcee-graviton4-key.pem ubuntu@<PUBLIC_DNS_HOSTNAME>
```

3. **Accept Security Warning**
When prompted, type `yes` to confirm the connection.
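
If you connect often, you could store the connection details in your local SSH configuration so you don't have to retype them. The following is a minimal sketch meant to run on your local machine. The `add_ssh_host` function and the `arcee-graviton4` host alias are hypothetical names chosen for illustration, and the function appends to the config file without checking for an existing entry.

```bash
# Hypothetical helper: append a Host entry for the instance to an SSH config file.
# Usage: add_ssh_host <public-dns-hostname> [config-file]
add_ssh_host() {
  local host="$1" config="${2:-$HOME/.ssh/config}"
  mkdir -p "$(dirname "$config")"
  # The ~ below is written literally; ssh expands it when reading the file.
  cat >> "$config" <<EOF

Host arcee-graviton4
    HostName ${host}
    User ubuntu
    IdentityFile ~/.ssh/arcee-graviton4-key.pem
    IdentitiesOnly yes
EOF
}

# Example (replace the placeholder with your instance's Public DNS value):
#   add_ssh_host <PUBLIC_DNS_HOSTNAME>
#   ssh arcee-graviton4
```

The Public DNS value usually changes when an instance is stopped and started again, so the entry would need updating in that case.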

- When prompted about authenticity of host, type `yes`

- You should now be connected to your Ubuntu instance

### Important Notes

- **Region Selection**: Ensure you're in your preferred AWS region before launching

- **AMI Selection**: The Ubuntu 24.04 LTS AMI must be ARM64 compatible for Graviton processors

- **Security**: Think twice about allowing SSH from anywhere (0.0.0.0/0). It is strongly recommended to restrict access to your IP address.

- **Storage**: The 128GB EBS volume is sufficient for the Arcee model and dependencies

- **Backup**: Consider creating AMIs or snapshots for backup purposes
You should now be connected to your Ubuntu instance running on Graviton4.
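
Once connected, a quick sanity check can confirm that the instance is running on 64-bit Arm and has the expected free disk space. This is a minimal sketch: the `is_arm64` function is a hypothetical helper, and it assumes `uname -m` reports `aarch64` (as on Ubuntu for Arm) or `arm64`.

```bash
# Hypothetical helper: succeed if the given machine string denotes 64-bit Arm.
is_arm64() {
  case "$1" in
    aarch64|arm64) return 0 ;;
    *) return 1 ;;
  esac
}

if is_arm64 "$(uname -m)"; then
  echo "Running on 64-bit Arm: $(uname -m)"
else
  echo "Unexpected architecture: $(uname -m)"
fi

# Free space on the root filesystem in GiB (POSIX df output, 1 KiB blocks).
df -Pk / | awk 'NR==2 { printf "%d GiB free on /\n", $4 / 1024 / 1024 }'
```

If the architecture check fails, the most likely cause is an AMI that was selected for x86_64 instead of **64-bit (ARM)**.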

{{% notice Note %}}
- **Region**: make sure you're launching in your preferred AWS region.
- **AMI**: confirm that the selected AMI supports the Arm64 architecture.
- **Security**: for best practice, restrict SSH access to your own IP.
- **Storage**: 128 GB is sufficient for the AFM-4.5B model and dependencies.
- **Backup**: consider creating an AMI or snapshot after setup is complete.
{{% /notice %}}

@@ -1,51 +1,58 @@
---
title: Setting up the instance
weight: 3
title: Configure your Graviton4 environment
weight: 4

### FIXED, DO NOT MODIFY
layout: learningpathall
---

In this step, you'll set up the Graviton4 instance with all the necessary tools and dependencies required to build and run the Arcee Foundation Model. This includes installing the build tools and Python environment.
In this step, you'll set up the Graviton4 instance with the tools and dependencies required to build and run the Arcee Foundation Model. This includes installing system packages and a Python environment.

## Step 1: Update Package List
## Update the package list

Run the following command to update your local APT package index:

```bash
sudo apt-get update
```

This command updates the local package index from the repositories:
This step ensures you have the most recent metadata about available packages, including versions and dependencies. It helps prevent conflicts when installing new packages.

- Downloads the latest package lists from all configured APT repositories
- Ensures you have the most recent information about available packages and their versions
- This is a best practice before installing new packages to avoid potential conflicts
- The package index contains metadata about available packages, their dependencies, and version information
## Install system dependencies

## Step 2: Install System Dependencies
Install the build tools and Python environment:

```bash
sudo apt-get install cmake gcc g++ git python3 python3-pip python3-virtualenv libcurl4-openssl-dev unzip -y
```

This command installs all the essential development tools and dependencies:
This command installs the following tools and dependencies:

- **CMake**: cross-platform build system generator used to compile and build Llama.cpp

- **GCC and G++**: GNU C and C++ compilers for compiling native code

- **Git**: version control system for cloning repositories

- **Python 3**: Python interpreter for running Python-based tools and scripts

- **Pip**: Python package manager

- **Virtualenv**: tool for creating isolated Python environments

- **libcurl4-openssl-dev**: development files for the curl HTTP library

- **cmake**: Cross-platform build system generator used to compile Llama.cpp
- **gcc & g++**: GNU C and C++ compilers for building native code
- **git**: Version control system for cloning repositories
- **python3**: Python interpreter for running Python-based tools and scripts
- **python3-pip**: Python package installer for managing Python dependencies
- **python3-virtualenv**: Tool for creating isolated Python environments
- **libcurl4-openssl-dev**: client-side URL transfer library
- **Unzip**: tool to extract `.zip` files (used in some model downloads)

The `-y` flag automatically answers "yes" to prompts, making the installation non-interactive.
The `-y` flag automatically approves the installation of all packages without prompting.
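
To confirm that the installation succeeded, you could check that each tool is on your `PATH`. The following is a minimal sketch: `missing_tools` is a hypothetical helper, and the command names (`pip3`, `virtualenv`) are assumed from the package names above. It does not cover `libcurl4-openssl-dev`, which is a development library and not a command.

```bash
# Hypothetical helper: print the name of every command that is not on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools cmake gcc g++ git python3 pip3 virtualenv unzip)
if [ -z "$missing" ]; then
  echo "All build tools are available"
else
  # Unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
fi
```

If any tool is reported missing, re-running the `apt-get install` command above is usually enough to resolve it.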

## What's Ready Now?
## Ready for build and deployment

After completing these steps, your Graviton4 instance has:
After completing the setup, your instance includes the following tools and environments:

- A complete C/C++ development environment for building Llama.cpp
- Python 3 with pip for managing Python packages
- Python 3, pip, and virtualenv for managing Python tools and environments
- Git for cloning repositories
- All necessary build tools for compiling optimized ARM64 binaries
- All required dependencies for compiling optimized Arm64 binaries

The system is now prepared for the next steps: building Llama.cpp and downloading the Arcee Foundation Model.
You're now ready to build Llama.cpp and download the Arcee Foundation Model.