# AIOS Model Onboarding Tutorial: Prerequisites & Setup
    
Welcome to the AIOS Model Onboarding Tutorial Series! This series of videos and notebooks will guide you through the entire process of integrating and deploying a model on the AIOS platform.

## Overview

This first video and notebook will cover the foundational concepts and prerequisites you'll need to get started. We will discuss:

- The overall plan for the tutorial series.
- The AIOS SDKs and how to choose the right one for your model.
- Key concepts like the AIOS Packet and the methods for model onboarding.

## Prerequisites

Before we begin, it's important to have a basic understanding of the AIOS SDKs. These SDKs provide the tools and utilities to integrate your model with the AIOS platform.

### AIOS Instance SDK

The AIOS Instance SDK is a Python library designed as a foundational framework for implementing custom computational logic that operates as servable instances within a modular block architecture. It is general-purpose and supports a wide range of computational workloads, including AI model inference, web application backends, and more.

**Core Features:**
- Modular execution logic (e.g., preprocessing, inference, postprocessing).
- Custom management operations.
- Native batching and multiplexing support.
- GPU compatibility.
- State management utilities.
- Custom telemetry and performance metrics.

### AIOS SDK Utilities

AIOS provides several SDK utilities to simplify the integration process for different types of models. The two main utilities are:

- **`aios_llama_cpp`**: A utility wrapper around the `llama_cpp` library to simplify model loading, tokenization, streaming, and batch inference for GGUF models. It supports both single-GPU and multi-GPU execution.
- **`aios_transformers`**: A utility wrapper around Hugging Face's `transformers` library, designed for inference with large language models. It supports both single-GPU and multi-GPU execution using tensor parallelism.

### Choosing `aios_llama_cpp`

For this tutorial, we will be using the `aios_llama_cpp` SDK. It is well-suited for our purposes because it provides a simple and efficient way to serve GGUF models. While it also supports multi-GPU setups, its straightforward approach is ideal for demonstrating the core concepts of model onboarding. The SDK includes the `LLAMAUtils` class for model interaction and the `LLMMetrics` class for performance monitoring.

### What is a Packet in AIOS V1?

The AIOS Packet is the standard data structure used for communication between different components in the AIOS ecosystem. It is defined as follows:

```protobuf
message AIOSPacket {
    string session_id = 1;  // A unique session_id for each block - useful for stateful inference
    uint64 seq_no = 2;      // The unique sequence number if you are doing sequence inference
    string data = 4;        // The data payload (optional)
    double ts = 5;          // The timestamp in unix epoch format (optional)
    string output_ptr = 6;  // The output pointer structure (optional)
    repeated FileInfo files = 7; // A list of file structures (optional)
}

message FileInfo {
    string metadata = 1;        // JSON string containing metadata for the file
    bytes file_data = 2;        // Raw file content as byte array
}
```
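To make the packet layout concrete, here is a minimal Python sketch that mirrors the fields above with plain dataclasses. `AIOSPacketSketch`, `FileInfoSketch`, and `make_packet` are hypothetical names used only for illustration; they are not the classes generated from the `.proto` file. The sketch also assumes the `data` field carries a JSON string, which matches how the `on_preprocess` example in this notebook reads it.

```python
import json
import time
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileInfoSketch:
    """Hypothetical stand-in for the FileInfo message."""
    metadata: str = ""      # JSON string describing the file
    file_data: bytes = b""  # raw file content


@dataclass
class AIOSPacketSketch:
    """Hypothetical stand-in for the AIOSPacket message (not the generated class)."""
    session_id: str = ""
    seq_no: int = 0
    data: str = ""
    ts: float = 0.0
    output_ptr: str = ""
    files: List[FileInfoSketch] = field(default_factory=list)


def make_packet(session_id: str, seq_no: int, payload: dict) -> AIOSPacketSketch:
    # `data` is a string field, so the payload is serialized to JSON here (an assumption).
    return AIOSPacketSketch(
        session_id=session_id,
        seq_no=seq_no,
        data=json.dumps(payload),
        ts=time.time(),  # unix epoch seconds
    )


packet = make_packet("session-1", 1, {"prompt": "Hello, AIOS"})
decoded = json.loads(packet.data)
print(decoded["prompt"])  # prints: Hello, AIOS
```

In a real client you would populate the generated protobuf message instead; the field names and types stay the same.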

### Methods in AIOS V1 for Model Onboarding

The `aios_instance` SDK provides a structured way to integrate your model by implementing a custom class with the following methods:

-   `__init__(self, context)`: Initializes the block, loads models, and sets up configurations.
-   `on_preprocess(self, packet)`: Pre-processes incoming data packets.
-   `on_data(self, preprocessed_entry)`: Performs the main inference logic.
-   `management(self, action, data)`: Handles custom management commands.
-   `get_muxer(self)`: (Optional) Provides a muxer for packet merging.

This structure allows for a clear separation of concerns and makes it easy to integrate custom logic into the AIOS ecosystem.


## Hello World for AIOS

### Code Examples for Core Methods

Here are some basic code examples to illustrate the implementation of the core methods in the `aios_instance` SDK. These examples are conceptual and would need to be adapted for a specific use case.


```python
# These methods belong to your block class; they are shown together for reference.

# __init__: Initialize the block, load models, and set up configurations.
def __init__(self, context):
    super().__init__(context)
    # Example: self.model = load_model('path_to_your_model')
    self.context.logger.info("Block initialized.")

# on_preprocess: Pre-process incoming data packets.
def on_preprocess(self, packet):
    # Example: data = json.loads(packet.data)
    # return True, [PreProcessResult(packet=packet, extra_data={"input": data})]
    return True, [PreProcessResult(packet=packet, extra_data={})]

# on_data: Perform the main inference logic.
def on_data(self, preprocessed_entry):
    # Example: result = self.model.predict(preprocessed_entry.extra_data["input"])
    # return True, OnDataResult(output={"prediction": result})
    return True, OnDataResult(output={"status": "ok"})

# management: Handle custom management commands.
def management(self, action, data):
    if action == 'reload_model':
        # self.model.reload()
        return {"status": "ok", "message": "Model reloaded."}
    return {"status": "error", "message": f"Unknown action: {action}"}
```
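To see how these methods fit together without a running AIOS cluster, the following sketch exercises them locally. Everything here is an assumption made for illustration: `PreProcessResult` and `OnDataResult` are small stand-in dataclasses (not the SDK's own classes), `EchoBlock` and `run_once` are hypothetical names, and the call order (preprocess first, then `on_data` for each entry) is a guess at how the runtime drives a block.

```python
import json
from dataclasses import dataclass, field
from types import SimpleNamespace


# Hypothetical stand-ins so the sketch runs without the SDK installed.
@dataclass
class PreProcessResult:
    packet: object
    extra_data: dict = field(default_factory=dict)


@dataclass
class OnDataResult:
    output: dict


class EchoBlock:
    """Toy block that upper-cases the 'text' field of the packet payload."""

    def __init__(self, context):
        self.context = context

    def on_preprocess(self, packet):
        data = json.loads(packet.data)
        return True, [PreProcessResult(packet=packet, extra_data={"input": data})]

    def on_data(self, preprocessed_entry):
        text = preprocessed_entry.extra_data["input"]["text"]
        return True, OnDataResult(output={"echo": text.upper()})

    def management(self, action, data):
        if action == "ping":
            return {"status": "ok", "message": "pong"}
        return {"status": "error", "message": f"Unknown action: {action}"}


def run_once(block, packet):
    # Assumed call order: preprocess the packet, then run on_data per entry.
    ok, entries = block.on_preprocess(packet)
    if not ok:
        return []
    results = []
    for entry in entries:
        ok, result = block.on_data(entry)
        if ok:
            results.append(result.output)
    return results


fake_packet = SimpleNamespace(session_id="s1", seq_no=1, data=json.dumps({"text": "hello"}))
block = EchoBlock(context=None)
outputs = run_once(block, fake_packet)
print(outputs)  # [{'echo': 'HELLO'}]
```

A harness like this is handy for checking preprocessing and inference logic before packaging the block into a container.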


**Where to start:** The prebuilt images are the recommended "Hello World" starting point. If you use them, you can skip the custom-model image steps and proceed with the rest of the series.

## Local Setup and Project Structure

This section guides you through setting up your local environment and organizing your project files. This structure is designed to be generic and will be used for volume mounting when deploying your model.

### 1. Create a Virtual Environment

It is highly recommended to use a virtual environment to manage your project's dependencies.

```bash
python3 -m venv aios_env
source aios_env/bin/activate
pip install grpcio grpcio-tools protobuf huggingface-hub
```
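After installing, a quick check from Python can confirm that the packages are importable in the active environment. This is a minimal sketch; `REQUIRED_MODULES` and `check_modules` are hypothetical helper names, and the mapping reflects the fact that the import names differ from the pip package names.

```python
import importlib.util

# pip package name -> import name
REQUIRED_MODULES = {
    "grpcio": "grpc",
    "grpcio-tools": "grpc_tools",
    "protobuf": "google.protobuf",
    "huggingface-hub": "huggingface_hub",
}


def check_modules(modules):
    """Return {package: True/False} depending on whether the module can be found."""
    status = {}
    for package, module in modules.items():
        try:
            status[package] = importlib.util.find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. "google") is itself missing.
            status[package] = False
    return status


for package, installed in check_modules(REQUIRED_MODULES).items():
    print(f"{package}: {'ok' if installed else 'MISSING'}")
```

If anything is reported as missing, make sure the virtual environment is activated and rerun the `pip install` command.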

### 2. Recommended Project Structure

We recommend the following folder structure for your AIOS model projects. This structure will be assumed for the rest of the tutorial series.

```
my-aios-model/
├── models/
│   └── your-model-file.gguf
├── component.json
├── allocation.json
└── inference_client/
    ├── service_pb2.py
    └── service_pb2_grpc.py
```

- **models/**: This directory will store your model files (e.g., GGUF files).
- **component.json**: The asset definition file for your model.
- **allocation.json**: The block allocation configuration.
- **inference_client/**: This directory will contain the gRPC client files for sending inference requests.
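If you prefer to create this layout from Python, the sketch below scaffolds the directories and empty placeholder files. `scaffold_project` is a hypothetical helper; the `{}` written into the two JSON files is only a placeholder, not a valid AIOS definition, and the gRPC client files are not created here because they are generated separately. The demo runs in a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def scaffold_project(root):
    """Create the recommended folder layout under `root` and return its path."""
    root = Path(root)
    for sub in ("models", "inference_client"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    for name in ("component.json", "allocation.json"):
        path = root / name
        if not path.exists():
            path.write_text("{}\n")  # empty JSON placeholder, to be filled in later
    return root


# Demo in a throwaway directory; pass "my-aios-model" to create it for real.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold_project(Path(tmp) / "my-aios-model")
    created = sorted(p.relative_to(project).as_posix() for p in project.rglob("*"))

print(created)  # ['allocation.json', 'component.json', 'inference_client', 'models']
```

The function is safe to rerun: existing directories and files are left untouched.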

## Installing Hugging Face CLI

The Hugging Face CLI is the recommended way to download models for AIOS integration.

### Install Hugging Face Hub

The `huggingface-cli` command ships with the `huggingface-hub` package. If you did not already install it in the virtual environment step, run:

```bash
pip install huggingface-hub
```

### Create Models Directory

```bash
mkdir -p models
```

### Optional: Log In to Hugging Face

Log in to access private or gated models and to get higher download limits:

```bash
huggingface-cli login
```

## Downloading Models with Hugging Face CLI

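Now let's download a sample model to test our setup. We'll use the Gemma 2B instruction-tuned model (`google/gemma-2b-it`) as an example. Gemma repositories on Hugging Face are gated, so accept the license on the model page and log in before downloading.

### Download the Gemma 2B Model from the Shell
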
```bash
huggingface-cli download google/gemma-2b-it --local-dir ./models/gemma-2b-it
```
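The same download can be scripted from Python with `snapshot_download` from the `huggingface_hub` package. This is a minimal sketch: `local_model_dir` and `download_model` are hypothetical helper names, and the import is done lazily so the path helper can be used even where the package is not installed.

```python
from pathlib import Path


def local_model_dir(repo_id, models_root="models"):
    """Map 'org/name' to '<models_root>/name', matching the CLI example above."""
    return Path(models_root) / repo_id.split("/")[-1]


def download_model(repo_id, models_root="models"):
    """Download a full model repository into the models directory."""
    # Lazy import: only needed when a download is actually requested.
    from huggingface_hub import snapshot_download

    target = local_model_dir(repo_id, models_root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


print(local_model_dir("google/gemma-2b-it").as_posix())  # models/gemma-2b-it
# To actually download (requires network access and, for gated models, a login):
# download_model("google/gemma-2b-it")
```

Scripting the download this way makes it easier to reuse the same target path later, for example when configuring volume mounts.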

---

## 🐳 **Docker Images Disclaimer**

**Important**: The following tutorials assume you have already built the base Docker images. We will be using these pre-built Docker containers as the foundation for our AIOS model integration.

For the next tutorial, on onboarding prebuilt Docker images into the AIOS ecosystem, refer to [02_Part1_onboard_gemma3_llama_cpp](https://github.com/OpenCyberspace/AIOS_AI_Blueprints/tree/tutorial_notebooks/video_tutorial_series/02_Part1_onboard_gemma3_llama_cpp).

If you want to build your own custom images with your selected models, refer to the links below before proceeding to [02_Part2_onboard_custom_llama_cpp](https://github.com/OpenCyberspace/AIOS_AI_Blueprints/tree/tutorial_notebooks/video_tutorial_series/02_Part2_onboard_custom_llama_cpp):

- [Building the `aios_instance:v1-gpu` base image](https://docs.aigr.id/aios-instance/aios-instance/#building-the-container-image)
- [Building the `aios_llama_cpp:v1-gpu` base image](https://docs.aigr.id/llm-docs/llm-method-1/#using-the-docker-images-for-building)

---