GitHub - amanasmuei/aeval: The portable evaluation layer for AI companions

The portable evaluation layer for AI companions.

Track your AI relationship over time — trust trajectory, session quality, milestones, and patterns. Data-driven relationship improvement in a single file.

Quick Start · How It Works · Commands · Reports · Ecosystem

The Problem

You use AI every day but have no way to track whether the relationship is improving. Is your AI getting better at understanding you? Are frustrating sessions getting less frequent? You have no data.

The Solution

aeval gives you a lightweight relationship tracker. Answer 4 questions after each session (about 30 seconds) and you get a data-driven view of how your AI partnership is evolving.

npx @aman_asmuei/aeval init

No databases. No cloud. Just one markdown file tracking what matters.

Quick Start

# Install
npm install -g @aman_asmuei/aeval

# Initialize
aeval init              # Create ~/.aeval/eval.md

# After each session
aeval log               # Log a session (4 quick questions)

# Review
aeval                   # Show current metrics
aeval report            # Full relationship report
aeval milestone "text"  # Record a milestone
aeval doctor            # Health check

How It Works

aeval maintains a single markdown file (~/.aeval/eval.md) that tracks your AI relationship over time.

Session Logging

aeval log walks you through 4 quick questions:

#	Question	Options
1	How was this session?	great / good / okay / frustrating
2	What went well?	Free text (optional)
3	What could improve?	Free text (optional)
4	Trust change?	increased / same / decreased

Each log updates your session count, adds a timeline entry, and recalculates trust and trajectory.
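As a rough illustration of that update step, the sketch below applies one logged session to an in-memory state. The names `SessionLog`, `EvalState`, and `applyLog` are hypothetical and not taken from the aeval source. The rule that trust moves by one point and is clamped to 1..5 is an assumption, since the README does not say how trust is recalculated. Trajectory recalculation is left out here.

```typescript
type Rating = "great" | "good" | "okay" | "frustrating";
type TrustChange = "increased" | "same" | "decreased";

// Hypothetical shape of the answers to the 4 questions.
interface SessionLog {
  date: string; // ISO date, e.g. "2026-03-25"
  rating: Rating;
  wentWell?: string;
  couldImprove?: string;
  trustChange: TrustChange;
}

// Hypothetical in-memory view of the overview and timeline.
interface EvalState {
  sessions: number;
  trust: number; // assumed range 1..5
  timeline: string[]; // newest first
}

function applyLog(state: EvalState, log: SessionLog): EvalState {
  // Assumption: each answer shifts trust by at most one point.
  const delta =
    log.trustChange === "increased" ? 1 : log.trustChange === "decreased" ? -1 : 0;
  const trust = Math.min(5, Math.max(1, state.trust + delta));

  // Use whichever free-text answer was given as the timeline note.
  const note = log.wentWell ?? log.couldImprove ?? "";
  const entry = `- ${log.date} | ${log.rating}` + (note ? ` | ${note}` : "");

  // Return a new state: one more session, new entry at the top.
  return { sessions: state.sessions + 1, trust, timeline: [entry, ...state.timeline] };
}
```

Returning a new state object instead of mutating the old one keeps the update easy to test before anything is written back to the file.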

Rating Scale

Rating	Stars	Description
great	★★★★★	Exceeded expectations
good	★★★★☆	Solid session
okay	★★★☆☆	Acceptable, room to improve
frustrating	★★☆☆☆	Needs work
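The table above maps directly onto a small lookup. This is a minimal sketch, not the package's actual code: `RATING_SCORE` and `stars` are illustrative names, and the numeric scores are simply the star counts shown in the table.

```typescript
type Rating = "great" | "good" | "okay" | "frustrating";

// Star counts as listed in the Rating Scale table.
const RATING_SCORE: Record<Rating, number> = {
  great: 5,
  good: 4,
  okay: 3,
  frustrating: 2,
};

// Render a rating as five characters: filled stars first, then empty ones.
function stars(rating: Rating): string {
  const filled = RATING_SCORE[rating];
  return "★".repeat(filled) + "☆".repeat(5 - filled);
}

console.log(stars("good")); // ★★★★☆
```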

Trajectory

Calculated from your recent session ratings:

Trajectory	Condition
building	Average recent rating >= 3.5
stable	Average recent rating >= 2.5 and < 3.5
declining	Average recent rating < 2.5
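The thresholds can be sketched as a small function. Two things here are assumptions: ratings are scored by their star count (great = 5 down to frustrating = 2), and "recent" means the last `windowSize` sessions, defaulting to 5, because the README does not state the window. The function name `trajectory` is illustrative.

```typescript
type Rating = "great" | "good" | "okay" | "frustrating";
type Trajectory = "building" | "stable" | "declining";

// Assumed numeric scores, taken from the star counts in the Rating Scale.
const SCORE: Record<Rating, number> = { great: 5, good: 4, okay: 3, frustrating: 2 };

// `ratings` is ordered newest first, like the timeline in eval.md.
// `windowSize` is an assumption: the source does not define "recent".
function trajectory(ratings: Rating[], windowSize = 5): Trajectory | undefined {
  const recent = ratings.slice(0, windowSize);
  if (recent.length === 0) return undefined; // nothing logged yet

  const average = recent.reduce((sum, r) => sum + SCORE[r], 0) / recent.length;
  if (average >= 3.5) return "building";
  if (average >= 2.5) return "stable";
  return "declining";
}
```

For example, `trajectory(["great", "good", "okay"])` averages 4.0 and yields `"building"`, while two `"frustrating"` sessions in a row average 2.0 and yield `"declining"`.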

Relationship Report

$ aeval report

◆ aeval — relationship report

  Sessions:    12
  Since:       2026-03-15 (10 days)
  Trust:       4/5
  Trajectory:  building

  Recent sessions:
    2026-03-25  ★★★★★  great — productive debugging, AI caught edge case
    2026-03-24  ★★★★☆  good — solid feature work
    2026-03-23  ★★★☆☆  okay — some misunderstandings on requirements

  Milestones:
    2026-03-22  First time AI proactively suggested a better approach
    2026-03-18  Completed first full feature together

  Patterns:
    - AI works best when given clear requirements upfront
    - Debugging sessions build trust fastest

Commands

Command	What it does
`aeval`	Show current metrics
`aeval init`	Create `~/.aeval/eval.md`
`aeval log`	Log a session (interactive)
`aeval report`	Full relationship report
`aeval milestone "text"`	Record a milestone
`aeval doctor`	Health check

eval.md Format

# AI Relationship Metrics

## Overview
- Sessions: 12
- First session: 2026-03-15
- Trust level: 4/5
- Trajectory: building

## Timeline
<!-- Entries added automatically, newest first -->
- 2026-03-25 | great | productive debugging session
- 2026-03-24 | good  | solid feature work

## Milestones
- 2026-03-22: First proactive suggestion from AI
- 2026-03-18: Completed first full feature together

## Patterns
- AI works best with clear requirements upfront
- Debugging sessions build trust fastest
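Because the file is plain markdown, other tools can read it with a few lines of code. The sketch below pulls the Overview fields out of a string in the format shown above. `Overview` and `parseOverview` are hypothetical names, and this is not how aeval itself necessarily parses the file.

```typescript
interface Overview {
  sessions: number;
  firstSession: string;
  trust: number; // the "4" in "4/5"
  trajectory: string;
}

// Collect "- Key: value" bullets that appear under the "## Overview" heading.
function parseOverview(markdown: string): Overview {
  const fields = new Map<string, string>();
  let inOverview = false;

  for (const line of markdown.split(/\r?\n/)) {
    if (line.startsWith("## ")) {
      // Track which section we are in; only Overview bullets are read.
      inOverview = line.trim() === "## Overview";
      continue;
    }
    const match = inOverview ? /^- ([^:]+):\s*(.+)$/.exec(line) : null;
    if (match) fields.set(match[1].trim(), match[2].trim());
  }

  return {
    sessions: Number(fields.get("Sessions") ?? 0),
    firstSession: fields.get("First session") ?? "",
    // parseInt stops at the slash, so "4/5" becomes 4.
    trust: parseInt(fields.get("Trust level") ?? "0", 10),
    trajectory: fields.get("Trajectory") ?? "",
  };
}
```

Missing fields fall back to `0` or an empty string instead of throwing, which suits a file that may have been edited by hand.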

Philosophy

Principle	Why
Single file	One markdown file, no database, no cloud
Portable	Works anywhere, version-controllable
Honest	Track what actually happens, not what you wish happened
Lightweight	4 questions per session, done in 30 seconds

The Ecosystem

aman
├── acore      → identity    → who your AI IS
├── amem       → memory      → what your AI KNOWS
├── akit       → tools       → what your AI CAN DO
├── aflow      → workflows   → HOW your AI works
├── arules     → guardrails  → what your AI WON'T do
└── aeval      → evaluation  → how GOOD your AI is  ← YOU ARE HERE

Layer	Package	What it does
Identity	acore	Personality, values, relationship memory
Memory	amem	Automated knowledge storage (MCP)
Tools	akit	15 portable AI tools (MCP + manual fallback)
Workflows	aflow	Reusable AI workflows
Guardrails	arules	Safety boundaries and permissions
Evaluation	aeval	Relationship tracking and session logging
Unified	aman	One command to set up everything

Each package works independently. aman is the front door: one command that sets up everything.

Contributing

Contributions welcome! Open an issue or submit a PR.

License

MIT

Track it. Improve it. Data-driven AI partnership.
