# Getting Started with Avalan CLI

This tutorial walks through the Avalan command line interface (CLI) for working with models and building agents. You'll prepare your environment, explore the main command groups, and learn how to search for models, adjust tokenizers, manage memory, and serve agents. Every step uses CLI commands so you can follow along in your own terminal.

**What you'll learn**

- Installing dependencies and confirming the CLI works
- Inspecting and installing models from supported hubs
- Customizing tokenizers and caching downloaded weights
- Running agents with tools, memories, and configuration files
- Serving or deploying agents once you're ready to integrate

## Prerequisites

Before running the commands below, make sure you have:

1. **Python 3.11+ and [Poetry](https://python-poetry.org/)** installed on your machine.
2. Access to any private model hubs you intend to use (set environment variables such as `HF_TOKEN` when necessary).
3. GPU drivers and CUDA/cuDNN installed if you plan to run larger transformer models locally.

> ℹ️ **Tip:** Avalan reads most CLI options from environment variables as well. Run `poetry run avalan --help` to see the environment variable that corresponds to each flag.

### Install dependencies

Run the following command in the project root to create the virtual environment and install dependencies.

```
poetry install --sync
```

### Verify the CLI is available

The first invocation can take a little while because Avalan lazily loads optional dependencies such as Hugging Face Transformers and Diffusers. Subsequent runs are much faster.

```
poetry run avalan --help
```
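
The prerequisite checks can also be scripted before you run any Avalan command. The following is a minimal sketch using only the Python standard library; `check_prerequisites` and its result keys are hypothetical names chosen for illustration and are not part of Avalan.

```python
import os
import shutil
import sys


def check_prerequisites(min_version=(3, 11)):
    """Return a dict of simple environment checks (illustrative names only)."""
    return {
        # The tutorial asks for Python 3.11 or newer.
        "python_ok": sys.version_info >= min_version,
        # Poetry must be on PATH for the `poetry run avalan ...` commands.
        "poetry_found": shutil.which("poetry") is not None,
        # HF_TOKEN is only needed for private hubs, so a missing value is
        # informational rather than an error.
        "hf_token_set": bool(os.environ.get("HF_TOKEN")),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

The script only inspects the local environment; it does not confirm GPU drivers or hub access, which still need to be checked by hand.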

## Finding a Model for Your Task

Use `avalan model search` to locate models that match your needs. You can filter by task, name, modality, and other metadata. Start by checking the help text to discover all available options:

```
poetry run avalan model search --help
```

Search for a sentiment analysis model and limit the results to five entries:

```
poetry run avalan model search --search sentiment --limit 5
```

Once you identify a model, you can display its details and verify you have access before downloading weights:

```
poetry run avalan model display <model-id>
```

When you're ready to try a model locally, download it to the cache or a custom directory:

```
poetry run avalan model install <model-id> --revision main --local-dir ./models
```

For a quick quality check, stream tokens directly from the model without building an agent:

```
poetry run avalan model run <model-id> --prompt "Hello there" --max-new-tokens 64
```

## Learning About Tokenizers

Tokenizers break text into tokens understood by models. The `tokenizer` command lets you inspect, extend, and persist tokenization rules. Check the help output to review all flags:

```
poetry run avalan tokenizer --help
```

To see how a tokenizer splits text, pipe input through the command:

```
echo 'Hello world' | poetry run avalan tokenizer -t gpt2 --skip-hub-access-check --no-repl
```

Use `--special-token` or `--token` to extend vocabularies, `--save` to write the modified tokenizer to disk, and `--tokenizer-subfolder` when working with repositories that ship multiple tokenizers.
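
To make the idea of extending a vocabulary concrete, here is a toy sketch in plain Python. It is not how Avalan or GPT-2 tokenizes text: `ToyTokenizer`, its whitespace splitting, and the `<unk>` fallback are assumptions made purely for illustration. It shows how adding a token changes the ids a piece of text maps to.

```python
class ToyTokenizer:
    """Whitespace tokenizer with a growable vocabulary (illustration only)."""

    def __init__(self, vocab):
        # Id 0 is reserved for unknown tokens.
        self.token_to_id = {"<unk>": 0}
        for token in vocab:
            self.add_token(token)

    def add_token(self, token):
        """Add a token if it is new and return its id."""
        if token not in self.token_to_id:
            self.token_to_id[token] = len(self.token_to_id)
        return self.token_to_id[token]

    def encode(self, text):
        """Map each whitespace-separated piece to its id, or to <unk>."""
        return [self.token_to_id.get(piece, 0) for piece in text.split()]


tokenizer = ToyTokenizer(["Hello", "world"])
before = tokenizer.encode("Hello world <tool>")  # <tool> is unknown here
tokenizer.add_token("<tool>")                    # extend the vocabulary
after = tokenizer.encode("Hello world <tool>")   # <tool> now has its own id
print(before, after)  # [1, 2, 0] [1, 2, 3]
```

Real tokenizers split text into subword units rather than whole words, but the effect of adding a token is similar: text that previously fell back to other pieces gets a dedicated id.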

## Managing the Model Cache

Caching avoids repeated downloads and lets you prune unused revisions. The `cache` subcommands expose the most common operations:

```
poetry run avalan cache --help
```

Typical workflows include:

- **Download weights for offline work:**
  ```
  poetry run avalan cache download meta-llama/Meta-Llama-3-8B-Instruct --workers 4
  ```
- **Inspect your cache:**
  ```
  poetry run avalan cache list --summary
  ```
- **Remove specific revisions or everything for a model:**
  ```
  poetry run avalan cache delete meta-llama/Meta-Llama-3-8B-Instruct --delete-revision main
  ```

All cache commands honor `--cache-dir`, so you can keep large weights on a secondary volume.
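
Before pointing `--cache-dir` at a secondary volume, it can help to confirm the volume has room for the weights. The following is a small standard-library sketch; `has_room` is a hypothetical helper, and the size you pass in is your own estimate rather than something Avalan reports.

```python
import shutil
from pathlib import Path


def has_room(cache_dir, required_bytes):
    """Return True if the volume holding cache_dir has enough free space."""
    path = Path(cache_dir).expanduser()
    # Walk up to the nearest existing directory so the check also works
    # before the cache directory has been created.
    while not path.exists() and path != path.parent:
        path = path.parent
    return shutil.disk_usage(path).free >= required_bytes
```

For example, `has_room("/mnt/models", 20 * 1024**3)` checks for roughly 20 GiB of free space on a hypothetical `/mnt/models` volume.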

## Using Tools with Models

Avalan agents can call external tools while generating responses. When running an agent, add `--tool` arguments and enable event display to visualize tool calls. Review the available runtime options:

```
poetry run avalan agent run --help
```

For example, invoke a calculator tool and print event traces:

```
echo 'What is (4 + 6) * 5?' | \
    poetry run avalan agent run \
        --engine-uri meta-llama/Meta-Llama-3-8B-Instruct \
        --tool math.calculator \
        --display-events --quiet
```

The `--display-events` flag streams tool invocation events so you can follow how the agent reasons. To inspect recent conversations without re-running prompts, use `poetry run avalan agent message search --id <agent-id> --participant <uuid> --session <uuid>`.

## Keeping Memories Across Sessions

Agents can maintain context with recent and permanent memories. Recent memory keeps a rolling window of messages, while permanent (persistent) memory stores messages in an external backend such as a PostgreSQL database.

Use these flags when launching an agent:

```
poetry run avalan agent run --engine-uri <model-id> \
    --memory-recent \
    --memory-permanent-message 'postgresql://user:pass@localhost:5432/dbname'
```

Replace the DSN above with your own database connection string. With both options enabled, the agent recalls prior turns across runs.
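
The "rolling window" behind recent memory can be pictured with a bounded queue. The sketch below is an illustration only and not Avalan's implementation; `RecentMemory` and `max_messages` are hypothetical names.

```python
from collections import deque


class RecentMemory:
    """Keep only the most recent `max_messages` chat messages."""

    def __init__(self, max_messages):
        # A deque with maxlen drops the oldest item automatically on append.
        self._messages = deque(maxlen=max_messages)

    def append(self, role, content):
        self._messages.append({"role": role, "content": content})

    def window(self):
        """Return the current window, oldest message first."""
        return list(self._messages)


memory = RecentMemory(max_messages=2)
memory.append("user", "My name is Ada.")
memory.append("assistant", "Nice to meet you, Ada.")
memory.append("user", "What is my name?")
print([message["content"] for message in memory.window()])
```

With a window of two, the first message has already been dropped by the time the third arrives. A store that survives beyond the window, such as the database behind `--memory-permanent-message`, covers that gap.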

### Indexing documents into memory

Populate the memory store with documents before chatting so the agent can ground its answers. The `memory document index` command accepts local files or URLs and chunks them with either a text or code partitioner.

```
poetry run avalan memory document index --help
```

Index a Markdown knowledge base stored on disk:

```
poetry run avalan memory document index docs/handbook.md \
    --model sentence-transformers/all-MiniLM-L6-v2 \
    --namespace support --participant 00000000-0000-0000-0000-000000000000 \
    --dsn postgresql://user:pass@localhost:5432/avalan --partition-max-tokens 256
```
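
To see what a token limit such as `--partition-max-tokens` implies, here is a rough sketch of fixed-size chunking. It assumes whitespace-separated words as a stand-in for model tokens, which is a simplification; Avalan's text and code partitioners may split differently, and `partition_text` is a hypothetical name.

```python
def partition_text(text, max_tokens):
    """Split text into chunks of at most max_tokens whitespace-separated words."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


chunks = partition_text("one two three four five", max_tokens=2)
print(chunks)  # ['one two', 'three four', 'five']
```

Smaller limits produce more, shorter chunks, which tends to make retrieval more precise at the cost of more rows in the memory store.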

## Creating Agents via CLI Arguments and Configuration Files

Agents can be built inline with CLI flags or from TOML configuration files.

### Scaffold a configuration file

Interactively generate a blueprint using `agent init` and redirect it to disk:

```
poetry run avalan agent init > helper.toml
```

Re-run the command with `--name`, `--role`, or `--tool` flags to skip prompts in automated environments.

### Inline agent

```
poetry run avalan agent run \
    --engine-uri meta-llama/Meta-Llama-3-8B-Instruct \
    --tool math.calculator \
    --memory-recent \
    --name 'Helper' \
    --role 'You are a helpful assistant named Helper.'
```

### Configuration file

Create `my_agent.toml`:

```
[engine]
uri = "meta-llama/Meta-Llama-3-8B-Instruct"

[run]
name = "Helper"
role = "You are a helpful assistant named Helper."
memory_recent = true

[tool]
enable = ["math.calculator"]
enable = ["math.calculator"]
```

Launch it with:

```
poetry run avalan agent run my_agent.toml
```

## Serving an Agent via an OpenAI-Compatible API Endpoint

Use `agent serve` to expose your agent on an OpenAI-compatible HTTP server.

```
poetry run avalan agent serve --help
```

Start the server:

```
poetry run avalan agent serve my_agent.toml -vvv
```

The server listens on `http://localhost:9001/v1` by default.

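### Python Client

```
from openai import OpenAI

# The OpenAI client raises an error unless an API key is supplied, either as
# an argument or through OPENAI_API_KEY. The value here is a placeholder;
# replace it if your server expects a real key.
client = OpenAI(base_url="http://localhost:9001/v1", api_key="placeholder")
chat = client.chat.completions.create(
    model="openai",
    messages=[{"role": "user", "content": "Hello agent"}],
)
print(chat.choices[0].message.content)
```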

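### TypeScript Client

```
import OpenAI from "openai";

// The OpenAI client throws unless an API key is supplied, either as an
// option or through OPENAI_API_KEY. The value here is a placeholder;
// replace it if your server expects a real key.
const client = new OpenAI({
  baseURL: "http://localhost:9001/v1",
  apiKey: "placeholder",
});
const chat = await client.chat.completions.create({
  model: "openai",
  messages: [{ role: "user", content: "Hello agent" }],
});
console.log(chat.choices[0].message?.content);
```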

With the server running, both clients communicate with your agent using familiar OpenAI API calls.
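
If you'd rather not install a client library, the same call can be made with the Python standard library. This sketch assumes the server implements the standard `/chat/completions` route of the OpenAI API under the base URL shown above; `build_chat_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(base_url, content, model="openai"):
    """Build a POST request for the chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, content):
    """Send the request and return the first choice's message content."""
    request = build_chat_request(base_url, content)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With `avalan agent serve` running locally, `ask("http://localhost:9001/v1", "Hello agent")` should return the agent's reply as a string. Building the request separately from sending it makes the payload easy to inspect without a live server.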

## Deploying Agents to AWS (Optional)

Automate provisioning with `deploy run`. Provide a deployment TOML that specifies your VPC, instance type, and the agents to publish. A minimal example looks like:

```
[aws]
vpc = "my-vpc"
instance = "t3.large"
database = "avalan-db"
pgsql = "postgresql://user:pass@host:5432/avalan"

[agents]
port = 9001
publish = "my_agent.toml"
```

Run the deployment (requires AWS credentials in your environment):

```
poetry run avalan deploy run deployment.toml
```

## Next Steps

- Explore additional tutorials in `docs/tutorials/` for in-depth walkthroughs.
- Run `poetry run avalan train run --help` to explore fine-tuning workflows.
- Combine agents, flows, and tools in automation pipelines to orchestrate complex tasks.