
CIPHER

On-Device Tactical Awareness AI for First Responders


"In a disaster, connectivity is the first thing you lose. Decisions can't wait for the cloud."


What CIPHER Is

CIPHER turns a laptop and webcam into a local tactical awareness system — no internet required, no API keys, no data leaving the device.

Built for search and rescue, emergency response, and field operations where real-time environmental intelligence matters and connectivity cannot be guaranteed.

Four operational modes:

| Mode | What It Does |
|------|--------------|
| Manual | Live YOLO object detection on the webcam feed, with a semantic 2D map of detected objects |
| Agent | Natural-language Q&A ("At which node was the person last seen?"), answered using local RAG + LLM |
| 3D World | Interactive point cloud and overhead map for spatial navigation |
| Replay | Trajectory playback with speed control for after-action review |

Why On-Device

Cloud-based AI in tactical settings has three failure modes:

  1. Connectivity loss — disasters, remote locations, RF-denied environments
  2. Latency — a 2-second round trip is acceptable for a chatbot, not for active search operations
  3. Data sovereignty — field footage of emergency scenes should not traverse external APIs

CIPHER's entire inference stack runs locally. The same codebase supports CPU, GPU, and Qualcomm NPU via AI Hub — meaning it can scale from a developer laptop to a Snapdragon-powered handheld device.
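One way to picture that portability is a runtime device probe that falls back from NPU to GPU to CPU. The sketch below is illustrative, not CIPHER's actual code: the `select_device` helper, the `qai_hub` import probe, and the returned device strings are all assumptions.

```python
def select_device() -> str:
    """Pick the best available inference target, falling back to CPU.

    Illustrative sketch: the qai_hub probe and the device strings are
    assumptions about how a CPU/GPU/NPU-portable stack might dispatch.
    """
    try:
        import qai_hub  # noqa: F401  # Qualcomm AI Hub client present -> NPU path
        return "npu"
    except ImportError:
        pass
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"  # discrete GPU available
    except ImportError:
        pass
    return "cpu"  # universal fallback — runs anywhere


print(select_device())
```

The same model weights can then be loaded onto whichever target the probe reports, so one codebase covers the developer laptop and the Snapdragon handheld.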


Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                         CAMERA INPUT                         │
│             (webcam, drone feed, recorded video)             │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                       PERCEPTION LAYER                       │
│  ┌──────────────────────────┐  ┌──────────────────────────┐  │
│  │ YOLO Object Det.         │  │ Voice Input (Whisper)    │  │
│  │ (CPU / GPU / NPU)        │  │ → natural language query │  │
│  └──────────┬───────────────┘  └─────────────┬────────────┘  │
│             │                                │               │
│             ▼                                ▼               │
│     ┌──────────────────────────────────────────────────┐     │
│     │               WORLD GRAPH BUILDER                │     │
│     │  • Detection events → node IDs                   │     │
│     │  • Spatial relationships encoded                 │     │
│     │  • Timestamped trajectory tracking               │     │
│     └────────────────────────┬─────────────────────────┘     │
└──────────────────────────────┼───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                         AGENT LAYER                          │
│  ┌──────────────────────────┐  ┌──────────────────────────┐  │
│  │ Local Vector DB          │  │ LLM (Ollama /            │  │
│  │ (semantic search         │  │  Qualcomm Genie)         │  │
│  │  over world graph)       │  │ → reasoning & answers    │  │
│  └──────────┬───────────────┘  └─────────────┬────────────┘  │
│             └────────────────┬───────────────┘               │
│                              ▼                               │
│             "Where was the survivor last seen?"              │
│            → node_003, node_007 — frames 142–156             │
└──────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                   FASTAPI + REACT FRONTEND                   │
│         Manual Mode | Agent Mode | 3D World | Replay         │
└──────────────────────────────────────────────────────────────┘
```
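The world graph at the center of the pipeline can be sketched as an append-only log of timestamped detection events, queryable by label. Names like `Detection`, `WorldGraph`, and `last_seen` are illustrative assumptions, not the project's actual API.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # YOLO class name, e.g. "person"
    node_id: str  # spatial node the detection was mapped to
    frame: int    # video frame index (stands in for a timestamp)


class WorldGraph:
    """Append-only log of detection events, queryable by object label."""

    def __init__(self) -> None:
        self.events: list[Detection] = []

    def add(self, det: Detection) -> None:
        self.events.append(det)

    def last_seen(self, label: str) -> Detection | None:
        """Most recent detection of `label`, or None if never observed."""
        for det in reversed(self.events):
            if det.label == label:
                return det
        return None


graph = WorldGraph()
graph.add(Detection("person", "node_003", 142))
graph.add(Detection("person", "node_007", 156))
print(graph.last_seen("person"))  # latest sighting: node_007, frame 156
```

Because queries like "where was the person last seen?" reduce to a scan over this log, the agent can answer without replaying any video.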

Tech Stack

| Component | Technology |
|-----------|------------|
| Object detection | YOLOv8 / YOLO-NAS (Ultralytics) |
| Speech input | OpenAI Whisper (local, no API) |
| Language model | Ollama (local) or Qualcomm Genie |
| Vector database | Local vector DB (Chroma / FAISS) |
| Edge acceleration | Qualcomm AI Hub (NPU-optimized models) |
| Backend | FastAPI |
| Frontend | React |
| Drone mode | YOLO on NPU + spatial agent |

Quick Start

```bash
# Install dependencies
pip install -r requirements.txt

# Download the YOLO model
python download_model.py

# Start backend + frontend
./run.sh          # Linux/macOS
./run.ps1         # Windows PowerShell
```

Qualcomm NPU mode:

```bash
pip install -r requirements-llama-qualcomm.txt
./run_drone.ps1   # Drone mode with NPU acceleration
```

Use Cases

  • Search & rescue — Track where survivors or hazards were last detected without replaying full video
  • Damage assessment — Map detected structural hazards to spatial nodes for systematic review
  • After-action review — Replay trajectories with detection overlay for training and debrief
  • Privacy-sensitive operations — All data stays on device; no cloud calls during active operations
  • Edge hardware research — Reference implementation for Qualcomm NPU perception pipelines

Emergency Protocol Integration

CIPHER includes a data/emergency_manuals/ directory with structured protocols for:

  • Survivor triage
  • Fire response
  • Building collapse and structural hazard assessment
  • Search extraction procedures

These are ingested into the local vector DB, enabling the agent to answer questions that combine live detections with protocol knowledge — e.g., "I see smoke and an unconscious person — what's the extraction priority?"
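The retrieval-plus-detection step can be sketched with naive keyword scoring standing in for the vector DB (Chroma/FAISS). Everything here is an illustrative assumption: the snippets are invented placeholders, and `retrieve` and `build_prompt` are not CIPHER's real functions.

```python
# Invented placeholder snippets — NOT the contents of data/emergency_manuals/.
MANUAL_SNIPPETS = [
    "Triage: unconscious, non-breathing casualties are priority IMMEDIATE.",
    "Fire response: ventilate before entry when smoke is visible.",
    "Structural hazard: mark unstable walls before search extraction.",
]


def retrieve(query: str, k: int = 2) -> list:
    """Rank snippets by crude word overlap; a vector DB would use embeddings."""
    terms = set(query.lower().split())
    scored = sorted(
        MANUAL_SNIPPETS,
        key=lambda s: len(terms & set(s.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(query: str, detections: list) -> str:
    """Combine live detections with retrieved protocol text for the local LLM."""
    context = "\n".join(retrieve(query))
    return (
        f"Live detections: {', '.join(detections)}\n"
        f"Relevant protocol:\n{context}\n"
        f"Question: {query}"
    )


print(build_prompt("what is the extraction priority?", ["smoke", "person"]))
```

The assembled prompt is what would be handed to Ollama or Qualcomm Genie, so the model reasons over both what the camera currently sees and what the manuals prescribe.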


Drone Mode

Drone/ contains a specialized configuration for aerial feeds:

  • YOLO on NPU for low-power inference on embedded hardware
  • Spatial agent for overhead scene understanding
  • Emergency manual integration for autonomous triage assistance
  • Point cloud generation from depth-enabled cameras
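Point cloud generation from a depth camera typically back-projects each pixel through the pinhole camera model. A minimal NumPy sketch, assuming known intrinsics `fx`, `fy`, `cx`, `cy` (the helper name and toy values are illustrative, not the project's code):

```python
import numpy as np


def depth_to_points(depth: np.ndarray, fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
    """Back-project an HxW depth map into an (H*W, 3) point cloud.

    Standard pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


# 2x2 toy depth map, 1 m everywhere, principal point at the image centre
pts = depth_to_points(np.ones((2, 2)), fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(pts.shape)  # (4, 3)
```

A depth model such as Depth-Anything (see the roadmap) would supply the `depth` array when no hardware depth camera is available.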

Roadmap

| Priority | Feature |
|----------|---------|
| 🔄 In progress | Depth model integration (Qualcomm AI Hub Depth-Anything) |
| 🔄 In progress | Tighter 3D map integration for spatial Q&A |
| ⏸️ Planned | Multi-camera fusion and unified tactical view |
| ⏸️ Planned | One-click export for after-action reporting |
| ⏸️ Planned | Mobile optimization (Snapdragon tablet/handheld) |

Citation

```bibtex
@software{cipher_2026,
  author = {A-Kuo},
  title = {CIPHER: On-Device Tactical Awareness AI for First Responders},
  url = {https://github.com/A-Kuo/CIPHER},
  year = {2026},
  note = {YOLO + Whisper + local LLM + Qualcomm NPU — zero cloud dependency}
}
```

Built for the field. Runs offline. Keeps data local. April 2026.
