Skip to content

chloepilonv/post-train-app

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AngelCare Post-Training App

Streamlit app for fine-tuning NVIDIA Cosmos Reason 2 on elder care safety classification. Supports Full SFT, LoRA, and QLoRA training methods with built-in dataset management.

Nebius Instance Setup

Create a GPU VM on Nebius AI Cloud with the following specs:

Setting QLoRA / LoRA (recommended) Full SFT
GPU 1x H100 80GB or 1x H200 141GB 8x H100 or 8x H200
Preset 1gpu-16vcpu-200gb 8gpu-128vcpu-1600gb
vCPUs 16 128
RAM 200 GiB 1600 GiB
Boot disk 200 GiB 200 GiB
OS Ubuntu 22.04 LTS (with CUDA) Ubuntu 22.04 LTS (with CUDA)

QLoRA on Cosmos Reason2 2B fits in ~5GB VRAM. A single H100 is more than enough. Full SFT via cosmos-rl needs multi-GPU parallelism (tp_size + dp_shard_size).

Quick Start

SSH into your Nebius instance, then:

git clone <repo-url> && cd post-train-app
chmod +x run.sh download_datasets.sh
./run.sh                # installs deps, downloads datasets, clones cosmos libs
source .venv/bin/activate
streamlit run app.py --server.port 8501 --server.address 0.0.0.0

Or step by step:

pip install -r requirements.txt   # Python deps
./download_datasets.sh            # GMDC + Harvard FallVision datasets
streamlit run app.py

Datasets

Downloaded automatically by download_datasets.sh into datasets/:

Dataset Videos Source Auto-label
GMDC-SA24 160 GitHub / Zenodo CSV descriptions -> 8 classes
Harvard FallVision 200+ Harvard Dataverse All fall (class 0)
Personal clips - Manual bad/ = fall, good/ = daily
Custom - Upload in app Manual annotation

App Tabs

  1. Upload -- Upload video files
  2. Convert -- Transcode to MP4 via ffmpeg
  3. Annotate -- Label videos (freeform QA, MCQ, or safety classification)
  4. Dataset Builder -- Scan external datasets, merge with custom annotations, stratified train/test split, class distribution chart
  5. Post-train -- Train with Full SFT (cosmos-rl), LoRA, or QLoRA (TRL/PEFT). Merge adapter into base model for standalone inference.
  6. Evaluate -- Compare base vs fine-tuned model accuracy
  7. Export to HF -- Upload merged model to Hugging Face Hub

Training Methods

Method Backend VRAM Output
Full SFT cosmos-rl High (multi-GPU) Full checkpoint
LoRA TRL + PEFT Medium adapter/ + merged/
QLoRA TRL + PEFT + BitsAndBytes Low (~5GB) adapter/ + merged/

Project Structure

post-train-app/
├── app.py                      # Streamlit app
├── run.sh                      # One-command setup (deps + datasets + cosmos libs)
├── download_datasets.sh        # Dataset downloader
├── requirements.txt
├── src/
│   ├── dataset_manager.py      # Dataset scanning, LLaVA conversion, merge, split
│   ├── train_trl.py            # LoRA/QLoRA training script (subprocess)
│   ├── post_train_cosmosrl.py  # cosmos-rl SFT + TRL subprocess launchers
│   ├── llava_builder.py        # LLaVA format builders (MCQ, freeform, AngelCare)
│   ├── paths.py                # App directory structure
│   ├── annotations.py          # Annotation persistence
│   ├── video_convert.py        # ffmpeg conversion
│   ├── evaluate_cosmos.py      # Evaluation runner
│   └── hf_export.py            # HuggingFace upload
├── templates/
│   ├── sft_template.config.toml
│   └── eval_config.yaml
├── datasets/                   # Downloaded by download_datasets.sh (gitignored)
└── tmp/                        # Runtime artifacts (gitignored)

Safety Classes

ID Label Risk
0 Fall Detected CRITICAL
1 Immobility Alert HIGH
2 Unsteady Movement MEDIUM
3 Distress Posture HIGH
4 Normal Walking SAFE
5 Normal Sitting SAFE
6 Normal Daily Activity SAFE
7 Resting or Sleeping SAFE

Citations

@article{alam2024,
  title={GMDCSA24: A Dataset for Human Fall Detection in Videos},
  author={Alam, Ekram and Sufian, Abu and Dutta, Paramartha and Leo, Marco},
  journal={Data in Brief},
  year={2024}
}

@data{DVN/75QPKK,
  title={FallVision: A benchmark video dataset for fall detection},
  publisher={Harvard Dataverse},
  doi={10.7910/DVN/75QPKK}
}

About

App to easily perform data formatting and VLM post-training

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors