
# Getting Started with Avalan CLI

This tutorial walks through the Avalan command-line interface (CLI) for working with models and building agents. We'll explore how to find suitable models, inspect tokenizers, use tools, retain memories, craft agents, and serve them through an OpenAI-compatible endpoint. Each step uses CLI commands so you can follow along in your own terminal.



## Exploring the CLI

Avalan exposes all functionality through the `avalan` entry point. Run `poetry run avalan --help` to inspect global options such as `--cache-dir`, `--device`, `--locale`, and `--help-full`, along with the available command groups (`agent`, `cache`, `deploy`, `flow`, `memory`, `model`, `tokenizer`, and `train`). Use `--help-full` when you want a single dump containing help for every subcommand.


In [None]:
!poetry run avalan --help



## Finding and Inspecting Models

Use `poetry run avalan model search --help` to review filters like `--search`, `--task`, `--filter`, `--language`, and `--limit`. Combine them to locate models tailored to your project. For example:

```bash
poetry run avalan model search --task text-generation --search llama --limit 5
```

After narrowing your choices, inspect metadata with `model display`:

```bash
poetry run avalan model display meta-llama/Meta-Llama-3-8B-Instruct --summary
```

Add `--sentence-transformer` when you need to view the model as a sentence encoder instead of a text generator.


In [None]:
!poetry run avalan model search --help


### Installing and Running Models

Download weights locally with `model install` so repeated runs do not hit the Hugging Face hub:

```bash
poetry run avalan model install meta-llama/Meta-Llama-3-8B-Instruct --workers 4
```

Stream responses directly from the command line. Pipe prompts through `model run` and add decoding flags like `--skip-special-tokens` or `--display-tokens` when you want richer output diagnostics:

```bash
echo "Summarize the following article" |
  poetry run avalan model run meta-llama/Meta-Llama-3-8B-Instruct \
    --skip-special-tokens --display-tokens
```
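
If you want to drive the same command from a script instead of a shell pipe, the prompt can be passed on standard input. The sketch below only uses the flags shown above; the helper names `build_model_run_command` and `run_prompt` are hypothetical and not part of Avalan.

```python
import shlex
import subprocess


def build_model_run_command(model_id: str, *flags: str) -> list[str]:
    """Return the argv list for `avalan model run` with the given flags."""
    return ["poetry", "run", "avalan", "model", "run", model_id, *flags]


def run_prompt(prompt: str, model_id: str, *flags: str) -> str:
    """Send `prompt` on stdin (the equivalent of `echo ... |`) and return stdout."""
    completed = subprocess.run(
        build_model_run_command(model_id, *flags),
        input=prompt,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


command = build_model_run_command(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    "--skip-special-tokens",
    "--display-tokens",
)
# Show the shell-equivalent command line without running it.
print(shlex.join(command))
```

Calling `run_prompt("Summarize the following article", ...)` would start the model, so the example only prints the command it would execute.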

When you no longer need a model, clean up disk space with `model uninstall` (use `--delete` for an actual removal instead of a dry run):

```bash
poetry run avalan model uninstall meta-llama/Meta-Llama-3-8B-Instruct --delete
```


In [None]:
!poetry run avalan model run --help



## Learning About Tokenizers

Tokenizers break text into tokens understood by models. The `tokenizer` command requires a tokenizer identifier via `-t/--tokenizer` and supports mutation flags such as `--special-token`, `--token`, and `--save` for persisting changes.

The help output shows every option:


In [None]:
!poetry run avalan tokenizer --help


To see how a tokenizer splits text, pipe input through the command:

```bash
echo 'Hello world' | poetry run avalan tokenizer -t gpt2 --skip-hub-access-check --no-repl
```

Add `--special-token` or `--token` repeatedly to extend vocabularies, then provide `--save /path/to/tokenizer` to write the adjusted tokenizer to disk.
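
To build intuition for why adding a token changes the output, here is a toy greedy longest-match tokenizer. It is only an illustration: real tokenizers such as GPT-2's use byte-pair encoding, and the vocabulary below is made up for this example.

```python
def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match split; unknown characters become one-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary matched: emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy vocabulary, not taken from any real tokenizer.
base_vocab = {"Hel", "lo", " ", "wor", "ld"}
before = tokenize("Hello world", base_vocab)

# Extending the vocabulary lets the same text split into fewer tokens.
extended_vocab = base_vocab | {"Hello"}
after = tokenize("Hello world", extended_vocab)

print(before)
print(after)
```

With the extra entry, `Hello` becomes a single token instead of two pieces, which is the effect you would look for after extending a real tokenizer's vocabulary.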



## Managing the Model Cache

`avalan cache` lets you prime or inspect local weights. Download artifacts without running a model by calling `cache download` with the same arguments accepted by `model install`:

```bash
poetry run avalan cache download meta-llama/Meta-Llama-3-8B-Instruct --workers 4
```

Review cached revisions with `cache list --summary`, or remove entries using `cache delete -m <model-id>` and add `--delete` when you are sure you want to reclaim space.


In [None]:
!poetry run avalan cache list --help



## Running Agents with Tools

Avalan agents can call external tools while generating responses. Review available switches with `agent run --help`. When enabling tools, combine `--tool` (for specific tool IDs) or `--tools` (to enable by namespace) with `--tools-confirm` if you want to approve each call interactively.


In [None]:
!poetry run avalan agent run --help


For example, the following command invokes a calculator tool, prints event traces, and displays the live tool panel:

```bash
echo 'What is (4 + 6) * 5?' |
  poetry run avalan agent run --engine-uri meta-llama/Meta-Llama-3-8B-Instruct \
    --tool math.calculator --display-events --display-tools --quiet
```

The `--display-events` and `--display-tools` flags stream tool invocation events so you can follow how the agent reasons.
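
Conceptually, a calculator tool receives an expression from the model and returns its value. The sketch below is a stand-in for illustration, not Avalan's `math.calculator` implementation: the `calculate` function is hypothetical and evaluates basic arithmetic by walking Python's syntax tree instead of calling `eval`.

```python
import ast
import operator

# Only these binary operators are accepted.
_BINARY_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str) -> float:
    """Evaluate +, -, *, / and parentheses; reject everything else."""

    def evaluate(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPERATORS:
            apply = _BINARY_OPERATORS[type(node.op)]
            return apply(evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError(f"Unsupported expression: {expression!r}")

    return evaluate(ast.parse(expression, mode="eval").body)


print(calculate("(4 + 6) * 5"))  # 50
```

Function calls, names, and attribute access all raise `ValueError`, which is the kind of restriction you want when a model decides what a tool executes.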



## Inspecting Agent Memories

Use `avalan agent message search` to retrieve items from an agent's permanent message memory. Provide the agent specification file (or inline settings such as `--engine-uri`), then identify the agent/session pair:

```bash
poetry run avalan agent message search my_agent.toml \
  --id <agent-id> --participant <participant-id> --session <session-id> \
  --function l2_distance --limit 5
```

Set `--function` to one of `cosine_distance`, `inner_product`, `l1_distance`, or `l2_distance` so the search uses the same similarity metric as your database.
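
For reference, the four metric names correspond to standard vector operations. The functions below use the textbook definitions in plain Python; a given database may apply a different sign or scaling convention (for instance when ranking by inner product), so treat this as a sketch of the math and not of the storage backend.

```python
import math


def _check(a: list[float], b: list[float]) -> None:
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")


def inner_product(a: list[float], b: list[float]) -> float:
    _check(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; 0 means the vectors point the same way."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def l1_distance(a: list[float], b: list[float]) -> float:
    """Sum of absolute coordinate differences (Manhattan distance)."""
    _check(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a: list[float], b: list[float]) -> float:
    """Straight-line (Euclidean) distance."""
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up embeddings for illustration.
query = [1.0, 2.0, 3.0]
stored = [2.0, 4.0, 6.0]
for metric in (cosine_distance, inner_product, l1_distance, l2_distance):
    print(metric.__name__, metric(query, stored))
```

Note that `stored` is a scaled copy of `query`: cosine distance treats them as nearly identical, while the L1 and L2 distances do not, which is why the metric has to match how the embeddings were indexed.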


In [None]:
!poetry run avalan agent message search --help



## Keeping Memories Across Sessions

Agents can maintain context with recent and persistent memories. Recent memory keeps a rolling window of messages, while persistent memory stores information in external backends.
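
The "rolling window" idea can be pictured with a bounded queue: once the window is full, each new message pushes the oldest one out. This is a conceptual sketch using Python's `collections.deque`, not Avalan's internal data structure, and the window size of 3 is arbitrary.

```python
from collections import deque

# A window that keeps only the three most recent messages.
recent_messages: deque[str] = deque(maxlen=3)

for number in range(1, 6):
    recent_messages.append(f"message {number}")

# Messages 1 and 2 were dropped automatically as newer ones arrived.
print(list(recent_messages))  # ['message 3', 'message 4', 'message 5']
```

Anything that falls out of such a window is gone unless a persistent store also recorded it.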

Use these flags when launching an agent:

```bash
poetry run avalan agent run --engine-uri <model-id> \
  --memory-recent --memory-permanent-message 'postgresql://user:pass@localhost:5432/dbname' \
  --load-recent-messages-limit 20
```

Provide multiple permanent stores with `--memory-permanent namespace@dsn` and fine-tune chunking via `--memory-engine-model-id`, `--memory-engine-window`, `--memory-engine-overlap`, and `--memory-engine-max-tokens` so embeddings match your storage strategy.
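
Window and overlap settings usually describe a sliding-window split: each chunk holds up to `window` tokens and shares `overlap` tokens with the previous chunk. The sketch below shows that generic technique; the exact semantics of Avalan's `--memory-engine-*` flags are an assumption here, and the `chunk` helper is hypothetical.

```python
def chunk(tokens: list[str], window: int, overlap: int) -> list[list[str]]:
    """Split `tokens` into chunks of up to `window` items sharing `overlap` items."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must be in the range [0, window)")

    step = window - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last chunk already reached the end
    return chunks


tokens = list("abcdefgh")
for piece in chunk(tokens, window=4, overlap=2):
    print("".join(piece))
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of storing more embeddings.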



## Creating Agents via CLI Arguments and Configuration Files

Agents can be built inline with CLI flags or from TOML configuration files.

### Inline Agent
```bash
poetry run avalan agent run \
  --engine-uri meta-llama/Meta-Llama-3-8B-Instruct \
  --tool math.calculator --memory-recent \
  --name 'Helper' --role 'You are a helpful assistant named Helper.'
```

### Configuration File
Create `my_agent.toml`:

```toml
[engine]
uri = "meta-llama/Meta-Llama-3-8B-Instruct"

[run]
name = "Helper"
role = "You are a helpful assistant named Helper."
memory_recent = true

[tool]
enable = ["math.calculator"]
```

Launch it with:

```bash
poetry run avalan agent run my_agent.toml
```

Generate a starter TOML interactively with `poetry run avalan agent init --name Helper --role "Helpful assistant" --tool math.calculator`, then tweak the saved blueprint before running it.



## Serving an Agent via the OpenAI API Endpoint

Use `agent serve` to expose your agent on an OpenAI-compatible HTTP server. Customize the listener with `--host`, `--port`, or change the API prefixes via `--openai-prefix`, `--mcp-prefix`, and `--a2a-prefix`. Add `--cors-origin` or `--cors-origin-regex` when integrating with browsers, and `--reload` to auto-restart during development.


In [None]:
!poetry run avalan agent serve --help


Start the server:

```bash
poetry run avalan agent serve my_agent.toml -vvv
```

The server listens on `http://localhost:9001/v1` by default.

### Python Client
```python
from openai import OpenAI

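# The OpenAI client refuses to start without an API key (argument or OPENAI_API_KEY).
# The placeholder below assumes the local server does not validate it.
client = OpenAI(base_url="http://localhost:9001/v1", api_key="not-needed")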
chat = client.chat.completions.create(
    model="openai",
    messages=[{"role": "user", "content": "Hello agent"}],
)
print(chat.choices[0].message.content)
```

### TypeScript Client
```typescript
import OpenAI from "openai";

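// The OpenAI client throws without an API key (option or OPENAI_API_KEY).
// The placeholder below assumes the local server does not validate it.
const client = new OpenAI({ baseURL: "http://localhost:9001/v1", apiKey: "not-needed" });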
const chat = await client.chat.completions.create({
  model: "openai",
  messages: [{ role: "user", content: "Hello agent" }],
});
console.log(chat.choices[0].message?.content);
```

With the server running, both clients communicate with your agent using familiar OpenAI API calls.
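
Under the hood, both clients send a JSON `POST` to the `chat/completions` path beneath the base URL. If you prefer not to install an SDK, the same request can be built with the standard library. This is a minimal sketch: `build_chat_request` is a hypothetical helper, and the response shape in the comments assumes the server returns the usual OpenAI chat completion format.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, content: str, model: str = "openai"
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("http://localhost:9001/v1", "Hello agent")
print(request.get_method(), request.full_url)

# With the server running, send the request and read the first choice:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
#     print(body["choices"][0]["message"]["content"])
```

The network call is left commented out so the snippet runs even when no server is listening.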



## Running Flows and Trainings

Avalan can orchestrate multi-step automations and fine-tuning jobs directly from the CLI. Point `flow run` at a flow definition (TOML or YAML) to execute your sequence of tasks:

```bash
poetry run avalan flow run flows/my_flow.toml
```

To launch a training recipe, supply the training file to `train run`:

```bash
poetry run avalan train run trainings/my_training.toml
```

Both commands understand the same global flags as `avalan` (for example `--cache-dir` or `--locale`), so you can reuse environment overrides consistently.


In [None]:
!poetry run avalan flow run --help


In [None]:
!poetry run avalan train run --help
