# ASACA Quick-Start

Welcome to **ASACA — Automated Speech Analysis for Cognitive Assessment**.  
This notebook shows how to clone the repo, build and install the package locally (CPU-only), run a first inference, and launch the GUI. It closes with a short explanation of what happens under the hood.

## 1  Prerequisites

* Python ≥ 3.11 (tested on 3.11.9)
* Git + Git LFS (model weights live in `Models/`)
* `build` & `pip` for packaging
* **CPU-only** runtime (no GPU/CUDA needed)
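
Before cloning, it can help to confirm these prerequisites from Python. The sketch below is not part of ASACA; the helper name `check_prerequisites` is hypothetical, and it only checks the interpreter version and whether `git` and `git-lfs` are on `PATH`.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version taken from the prerequisites list


def check_prerequisites(min_python=MIN_PYTHON, tools=("git", "git-lfs")):
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

An empty result means the version and tool checks passed; it does not verify that the LFS weights have actually been pulled.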

In [None]:
# Clone the repository and pull LFS weights
!git clone https://github.com/RhysonYang-2030/ASACA-Automatic-Speech-Analysis-for-Cognitive-Assessment.git
%cd ASACA-Automatic-Speech-Analysis-for-Cognitive-Assessment
!git lfs install --skip-smudge  # in case LFS not yet initialised
!git lfs pull

## 2  Build & install (editable OR wheel)
Choose **one** of the two options:

In [None]:
# Option A — editable install (recommended while developing)
!pip install -e ".[gui]"

In [None]:
# Option B — build a wheel (exactly what CI does)
!python -m build
!pip install dist/asaca-0.1.0-py3-none-any.whl
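
The wheel filename above hard-codes version `0.1.0`, so it stops matching once the version is bumped. A minimal sketch of one way around that, assuming the wheel lands in `dist/` and its name starts with `asaca-` (the helper `newest_wheel` is hypothetical, not part of the package):

```python
from pathlib import Path


def newest_wheel(dist_dir="dist", pattern="asaca-*.whl"):
    """Return the most recently modified wheel in dist_dir, or None if there is none."""
    wheels = sorted(Path(dist_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    return wheels[-1] if wheels else None
```

The returned path can then be handed to pip, for example with `subprocess.check_call([sys.executable, "-m", "pip", "install", str(wheel)])`.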

## 3  Smoke test
Verify that the CLI is on `$PATH`:

In [None]:
!asaca --help | head -n 20
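
`head` is not available in a default Windows shell, so a portable variant of the same check may be useful. This sketch assumes only that the install step put an `asaca` executable on `PATH`; the helper name `cli_help_head` is hypothetical.

```python
import shutil
import subprocess


def cli_help_head(executable="asaca", n=20):
    """Return the first n lines of `<executable> --help`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--help"], capture_output=True, text=True, check=False)
    return result.stdout.splitlines()[:n]
```

A `None` result suggests the package was installed into a different environment than the one running the notebook.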

## 4  Run your first inference
We’ll use the bundled **5 MB** demo file, `samples/demo.wav`, and write the results to `outputs/`.

In [None]:
!python -m asaca infer samples/demo.wav -o outputs

Expected console output (truncated):
```text
Prediction  : MCI  (p = 0.71)
CTC-segmented transcript saved to outputs/
PDF report  saved to outputs/report.pdf
```
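
If the prediction is needed programmatically, one option is to parse the console line. The sketch below assumes the output format matches the truncated sample above exactly; the real CLI output may differ, and the helper `parse_prediction` is hypothetical.

```python
import re

# Pattern modelled on the sample line "Prediction  : MCI  (p = 0.71)" (an assumption)
PREDICTION_RE = re.compile(
    r"Prediction\s*:\s*(?P<label>\w+)\s*\(p\s*=\s*(?P<prob>[0-9]*\.?[0-9]+)\)"
)


def parse_prediction(console_text):
    """Return (label, probability) from the console output, or None if no line matches."""
    match = PREDICTION_RE.search(console_text)
    if match is None:
        return None
    return match.group("label"), float(match.group("prob"))
```

Where the JSON metrics export is available, reading that file is likely more robust than scraping console text.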

## 5  Launch the GUI
If PyQt5 isn’t installed, re-run the install cell with `[gui]` extras.

In [None]:
!python -m asaca gui

The window lets you:
1. Load a WAV/FLAC file
2. View waveform, transcript, and segmentation
3. Export a PDF report or JSON metrics
4. Record directly from the microphone (*record* button)

## 6  Underlying principles
![ASACA pipeline](../docs/img/pipeline.png)

Workflow summary:
1. **wav2vec 2.0** (fine-tuned) produces frame-level logits
2. **CTC layer**: a greedy CTC decoder turns the logits into a transcript
3. **CTC segmentation** aligns the transcript with timestamps
4. **Forced alignment** yields pause and syllable metrics
5. **Inference head** (logistic regression) returns *HC / MCI / AD* with SHAP explainability
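
Steps 2 and 4 can be illustrated with a small sketch in plain Python. It is not ASACA's implementation: the toy vocabulary, the blank index, and the 0.25 s pause threshold are assumptions chosen for illustration. Greedy CTC decoding takes the best token per frame, collapses consecutive repeats, and drops blanks; pauses are measured here as gaps between consecutive aligned word spans.

```python
def greedy_ctc_decode(logits, vocab, blank_id=0):
    """Decode frame-level logits (list of per-frame score lists) into a string."""
    decoded = []
    previous = None
    for frame in logits:
        # argmax over the vocabulary for this frame
        best = max(range(len(frame)), key=frame.__getitem__)
        # emit only when the token changes and is not the CTC blank
        if best != previous and best != blank_id:
            decoded.append(vocab[best])
        previous = best
    return "".join(decoded)


def pause_durations(word_spans, min_pause=0.25):
    """Return gaps (seconds) between consecutive (start, end) spans that reach min_pause."""
    pauses = []
    for (_, prev_end), (next_start, _) in zip(word_spans, word_spans[1:]):
        gap = next_start - prev_end
        if gap >= min_pause:
            pauses.append(gap)
    return pauses


if __name__ == "__main__":
    vocab = ["_", "a", "b"]  # index 0 plays the role of the CTC blank
    frames = [[0, 1, 0], [0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
    print(greedy_ctc_decode(frames, vocab))
    print(pause_durations([(0.0, 0.5), (1.0, 1.5), (1.6, 2.0)]))
```

Note that the blank between the second and third `a` frames is what allows a repeated character to survive the collapse step.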

## 7  Custom model paths & GPU flag
```bash
python -m asaca infer my.wav \
    --processor Models \
    --model     Models \
    --device cpu        # or cuda:0 if you have a GPU build of PyTorch
```
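
When driving the CLI from Python, the device can be chosen automatically. This sketch reuses only the flags shown in this notebook (`--processor`, `--model`, `--device`, `-o`); that they can all be combined in one `infer` call is an assumption, and the helper names `pick_device` and `build_infer_command` are hypothetical.

```python
import sys


def pick_device(prefer_gpu=True):
    """Return 'cuda:0' when a CUDA-enabled PyTorch is importable, otherwise 'cpu'."""
    if prefer_gpu:
        try:
            import torch  # optional here; fall back to CPU if it is missing
            if torch.cuda.is_available():
                return "cuda:0"
        except ImportError:
            pass
    return "cpu"


def build_infer_command(wav_path, model_dir="Models", out_dir="outputs", device=None):
    """Build the argument list for `python -m asaca infer ...` without running it."""
    device = device or pick_device()
    return [
        sys.executable, "-m", "asaca", "infer", str(wav_path),
        "--processor", str(model_dir),
        "--model", str(model_dir),
        "--device", device,
        "-o", str(out_dir),
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)`; building it as a list avoids shell-quoting problems with paths that contain spaces.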

## 8  Troubleshooting
| Error | Likely cause | Fix |
|-------|-------------|------|
| `ModuleNotFoundError: No module named 'PyQt5'` | GUI extras not installed | `pip install "PyQt5>=5.15"` (quote the specifier so the shell does not treat `>` as a redirect) |
| `nltk.corpus.cmudict` missing | first-time run | `python -m nltk.downloader cmudict` |
| Torch/Torchvision DLL popup (Windows) | mismatched versions | `pip install torch==2.0.1+cpu torchvision==0.15.2+cpu -f https://download.pytorch.org/whl/torch_stable.html` |
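
The first two rows can be checked up front with a short diagnostic. This is a sketch under assumptions: the module list is taken from the table above rather than from the package metadata, and the helper names are hypothetical. `importlib.util.find_spec` and `nltk.data.find` are standard calls.

```python
import importlib.util


def missing_modules(names=("PyQt5", "nltk", "torch", "torchvision")):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cmudict_available():
    """Return True if NLTK is installed and the cmudict corpus has been downloaded."""
    try:
        import nltk
    except ImportError:
        return False
    try:
        nltk.data.find("corpora/cmudict")
    except LookupError:
        return False
    return True


if __name__ == "__main__":
    print("Missing modules:", missing_modules() or "none")
    print("cmudict available:", cmudict_available())
```

This does not detect the Torch/Torchvision version mismatch in the third row; comparing `torch.__version__` and `torchvision.__version__` against the pinned pair is a manual step.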

## 9  Citation
If you use ASACA in published work:
```bibtex
@mastersthesis{yang2025asaca,
  author  = {Xinbo Yang},
  title   = {ASACA --- Automatic Speech \& Cognition Assessment},
  school  = {Trinity College Dublin},
  year    = {2025},
  url     = {https://github.com/ProfYang-2030/ASACA-Automatic-Speech-Analysis-for-Cognitive-Assessment}
}
```