# Chasqui

> A lightweight workflow automation system for computational materials science

`chasqui` helps you manage VASP calculations on remote HPC clusters with PBS queue systems. It submits jobs, monitors queue status, and tracks results while respecting strict authentication requirements (such as 2FA) and queue limits.

**Key features:**
- Local SQLite database tracks all jobs
- Batched SSH operations (2FA-friendly)
- Self-perpetuating remote queue (no cron needed)
- Automatic overflow management (respects PBS limits)
- Simple CLI for everyday tasks
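
To make the overflow idea concrete, here is a minimal sketch of how a submission step might hold jobs back once a queue cap is reached. The helper `split_for_submission` and the limit of 20 are hypothetical and used only for illustration; they are not part of `chasqui`'s API, and your site's PBS limits will differ.

```python
# Hypothetical helper, not part of chasqui's API: decide how many waiting
# jobs can be handed to PBS without exceeding a per-user queue cap.
def split_for_submission(waiting, in_queue, max_queued):
    """Return (submit_now, held_back) for a list of waiting job IDs."""
    free_slots = max(0, max_queued - in_queue)
    return waiting[:free_slots], waiting[free_slots:]


submit_now, held_back = split_for_submission(
    waiting=["job-1", "job-2", "job-3", "job-4", "job-5"],
    in_queue=18,     # jobs already queued or running on the cluster
    max_queued=20,   # assumed site limit; check your own PBS policy
)
print(submit_now)  # ['job-1', 'job-2']
print(held_back)   # ['job-3', 'job-4', 'job-5']
```

Jobs that are held back simply wait for a later pass, when running jobs have finished and slots are free again.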

### The Need for Simple Workflow Systems

Running computational materials calculations on HPC clusters often involves:
- Submitting hundreds of jobs with varying parameters
- Monitoring queue status across multiple runs
- Managing data transfer between local and remote systems
- Respecting queue policies (maximum jobs, walltime limits)
- Dealing with authentication barriers (2FA, SSH keys)

Existing workflow systems like AiiDA, FireWorks, or Snakemake are powerful but can be heavy: depending on the tool, they may need a database server, a web server, or complex configuration. For individual researchers or small groups, there's a gap: **we need something lightweight, transparent, and easy to understand.**

`chasqui` fills this gap by embracing simplicity:

- **Single SQLite file** - no database server
- **Plain bash scripts** - no daemons or background processes
- **SSH and PBS only** - standard HPC tools 
- **Literate programming** - readable notebooks, not opaque frameworks
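
As a rough illustration of why a single SQLite file is enough for job tracking, the sketch below uses Python's built-in `sqlite3` module. The table and column names are assumptions made for this example; `chasqui`'s actual schema may look different.

```python
import sqlite3

# Hypothetical schema for illustration; chasqui's real tables may differ.
conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.execute(
    """
    CREATE TABLE jobs (
        job_id     INTEGER PRIMARY KEY,
        local_path TEXT NOT NULL,
        state      TEXT NOT NULL DEFAULT 'PREPARED'
    )
    """
)

# Register two jobs, then mark the first one as ready for upload.
conn.executemany(
    "INSERT INTO jobs (local_path) VALUES (?)",
    [("~/my_vasp_job",), ("~/another_job",)],
)
conn.execute("UPDATE jobs SET state = 'QUEUED_LOCAL' WHERE job_id = 1")
conn.commit()

# Count jobs per state: the kind of query a status command might run.
counts = dict(conn.execute("SELECT state, COUNT(*) FROM jobs GROUP BY state"))
print(counts)
```

Everything lives in one file that can be copied, backed up, or inspected with the `sqlite3` command-line shell, with no server process to install or keep running.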

### Why "Chasqui"?

The name comes from the **chasqui** (also *chaski*), the relay messengers of the Inca Empire. These runners formed a sophisticated communication network spanning thousands of kilometers across the Andes, carrying messages and small goods between administrative centers. Chasquis often carried **khipus** (*quipus*)—intricate systems of knotted strings that encoded numerical data, records, and possibly narratives. 

### Design Philosophy

1. **Transparency** — You should understand what's happening  
2. **Simplicity** — Fewer moving parts means less to break  
3. **Flexibility** — Work with your HPC environment, not against it  
4. **Literate** — Code as documentation, documentation as code  

`chasqui` isn't trying to be the most feature-rich workflow system. It's trying to be the one you can **debug at 2 AM when your calculations are due**, understand six months later when you return to a project, and modify when your HPC center changes queue policies.

## Install

From the root of a local clone of the repository, install in editable mode:
```sh
pip install -e .
```

## Quick Start

### Initialize Database
```python
from chasqui.database import ChasquiDB

db = ChasquiDB("~/.chasqui/jobs.db")
db.init_db()
```

### Create and Submit a Job
```python
# Create job (assuming VASP inputs in ~/my_vasp_job/)
job_id = db.create_job(
    local_path="~/my_vasp_job",
    vasp_config={
        "job_name": "my_calculation",
        "cores": 2,
        "walltime": "24:00:00",
        "project": "YOUR_PROJECT_CODE",
        "vasp_version": "vasp_std",
        "remote_work_dir": "$HOME/scratch/my_calculation"
    }
)

# Queue for submission
db.update_state(job_id, 'QUEUED_LOCAL')

# Sync and submit
from chasqui.sync import sync, SyncConfig

config = SyncConfig(remote_host='bebop')  # replace 'bebop' with your cluster's host
result = sync(config)

print(f"Uploaded: {result['uploaded']}, Submitted: {result['submitted']}")
```

## Workflow States

Jobs progress through these states:
```
PREPARED → QUEUED_LOCAL → UPLOADED → SUBMITTED → RUNNING → COMPLETED/FAILED
```
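
The state diagram above can be read as a small state machine. The sketch below encodes it as a transition table with a validity check. The names `TRANSITIONS` and `can_transition` are hypothetical, and it assumes a strictly linear flow in which only `RUNNING` can end in `COMPLETED` or `FAILED`; the real implementation may allow other transitions.

```python
# Allowed transitions, read off the state diagram above. Treating RUNNING as
# the only state that can end in COMPLETED or FAILED is an assumption.
TRANSITIONS = {
    "PREPARED": {"QUEUED_LOCAL"},
    "QUEUED_LOCAL": {"UPLOADED"},
    "UPLOADED": {"SUBMITTED"},
    "SUBMITTED": {"RUNNING"},
    "RUNNING": {"COMPLETED", "FAILED"},
    "COMPLETED": set(),  # terminal state
    "FAILED": set(),     # terminal state
}


def can_transition(current, new):
    """Return True if moving from `current` to `new` follows the diagram."""
    return new in TRANSITIONS.get(current, set())


print(can_transition("PREPARED", "QUEUED_LOCAL"))  # True
print(can_transition("PREPARED", "RUNNING"))       # False
```

A check like this is one way to catch bookkeeping mistakes early, for example marking a job as running before it was ever uploaded.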

## How to Use

For detailed usage, see the [documentation](https://garciajc.github.io/chasqui/).