On-Device Tactical Awareness AI for First Responders
"In a disaster, connectivity is the first thing you lose. Decisions can't wait for the cloud."
CIPHER turns a laptop and webcam into a local tactical awareness system — no internet required, no API keys, no data leaving the device.
Built for search and rescue, emergency response, and field operations where real-time environmental intelligence matters and connectivity cannot be guaranteed.
Four operational modes:
| Mode | What It Does |
|---|---|
| Manual | Live YOLO object detection on webcam feed with semantic 2D map of detected objects |
| Agent | Natural-language Q&A: "At which node was the person last seen?" — answered using local RAG + LLM |
| 3D World | Interactive point cloud and overhead map for spatial navigation |
| Replay | Trajectory playback with speed control for after-action review |
Cloud-based AI in tactical settings has three failure modes:
- Connectivity loss — disasters, remote locations, RF-denied environments
- Latency — a 2-second round trip is acceptable in a chatbot; not in active search operations
- Data sovereignty — field footage of emergency scenes should not traverse external APIs
CIPHER's entire inference stack runs locally. The same codebase supports CPU, GPU, and Qualcomm NPU via AI Hub — meaning it can scale from a developer laptop to a Snapdragon-powered handheld device.
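The CPU/GPU/NPU portability can be sketched as a simple backend-selection helper. This is a hypothetical illustration, not CIPHER's actual API: the probe flags and the `"qnn"` device identifier are assumptions.

```python
# Hypothetical sketch: pick the best available inference backend.
# The availability flags would come from runtime probes (e.g. torch.cuda,
# a QNN runtime check); they are parameters here to keep the sketch pure.

def pick_device(cuda_available: bool, npu_available: bool) -> str:
    """Prefer the Qualcomm NPU, then CUDA, then fall back to CPU."""
    if npu_available:
        return "qnn"   # assumed identifier for the Qualcomm AI Engine backend
    if cuda_available:
        return "cuda"
    return "cpu"
```

With Ultralytics YOLO, a device string like this can be passed via the `device=` argument at predict time; NPU deployment typically goes through a model compiled with Qualcomm AI Hub rather than a plain device switch.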
┌──────────────────────────────────────────────────────────────────┐
│ CAMERA INPUT │
│ (webcam, drone feed, recorded video) │
└───────────────────────────────┬──────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ PERCEPTION LAYER │
│ ┌──────────────────────┐ ┌───────────────────────────────┐ │
│ │ YOLO Object Det. │ │ Voice Input (Whisper) │ │
│ │ (CPU / GPU / NPU) │ │ → Natural language query │ │
│ └──────────┬───────────┘ └──────────────┬────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ WORLD GRAPH BUILDER │ │
│ │ • Detection events → node IDs │ │
│ │ • Spatial relationships encoded │ │
│ │ • Timestamped trajectory tracking │ │
│ └──────────────────────────┬───────────────────────────────┘ │
└──────────────────────────────┼───────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ AGENT LAYER │
│ ┌──────────────────────┐ ┌───────────────────────────────┐ │
│ │ Local Vector DB │ │ LLM (Ollama / Qualcomm Genie)│ │
│ │ (semantic search │ │ → reasoning, answer │ │
│ │ over world graph) │ │ generation │ │
│ └──────────┬───────────┘ └──────────────┬────────────────┘ │
│ └──────────────────┬─────────────┘ │
│ ▼ │
│ "Where was the survivor last seen?" │
│ → node_003, node_007 — frames 142–156 │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ FASTAPI + REACT FRONTEND │
│ Manual Mode | Agent Mode | 3D World | Replay │
└──────────────────────────────────────────────────────────────────┘
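The world-graph layer in the diagram can be sketched as follows. The node schema and the last-seen query are assumptions inferred from the example answer ("node_003, node_007 — frames 142–156"), not the project's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    detections: list = field(default_factory=list)  # (frame, label) events

class WorldGraph:
    """Hypothetical sketch of the world-graph builder: detection events
    are binned into spatial nodes and indexed by label for agent queries."""

    def __init__(self):
        self.nodes: dict[str, Node] = {}
        # label -> list of (frame, node_id) sightings
        self.by_label: dict[str, list] = defaultdict(list)

    def add_detection(self, node_id: str, frame: int, label: str) -> None:
        node = self.nodes.setdefault(node_id, Node(node_id))
        node.detections.append((frame, label))
        self.by_label[label].append((frame, node_id))

    def last_seen(self, label: str):
        """Return (node_id, frame) of the most recent sighting, or None."""
        sightings = self.by_label.get(label)
        if not sightings:
            return None
        frame, node_id = max(sightings)  # tuples sort by frame first
        return node_id, frame
```

A query like "where was the person last seen?" then reduces to `graph.last_seen("person")`, which the agent layer can phrase as a natural-language answer.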
| Component | Technology |
|---|---|
| Object detection | YOLOv8 / YOLO-NAS (Ultralytics) |
| Speech input | OpenAI Whisper (local, no API) |
| Language model | Ollama (local) or Qualcomm Genie |
| Vector database | Local vector DB (Chroma / FAISS) |
| Edge acceleration | Qualcomm AI Hub (NPU-optimized models) |
| Backend | FastAPI |
| Frontend | React |
| Drone mode | YOLO on NPU + spatial agent |
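The agent's retrieval step pairs the local vector DB with the local LLM. A minimal stand-in for that vector search, using cosine similarity over toy vectors in place of Chroma/FAISS and a real embedding model, might look like:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, docs, k=2):
    """docs: list of (text, vector) pairs. Return the k most similar texts."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

In the real pipeline the vectors would come from an embedding model, and the retrieved passages would be injected as context into the local LLM (e.g. an Ollama chat request) to produce the final answer.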
```bash
# Install dependencies
pip install -r requirements.txt

# Download YOLO model
python download_model.py

# Start backend + frontend
./run.sh    # Linux/Mac
./run.ps1   # Windows PowerShell
```

Qualcomm NPU mode:

```bash
pip install -r requirements-llama-qualcomm.txt
./run_drone.ps1    # Drone mode with NPU acceleration
```

- Search & rescue — Track where survivors or hazards were last detected without replaying full video
- Damage assessment — Map detected structural hazards to spatial nodes for systematic review
- After-action review — Replay trajectories with detection overlay for training and debrief
- Privacy-sensitive operations — All data stays on device; no cloud calls during active operations
- Edge hardware research — Reference implementation for Qualcomm NPU perception pipelines
CIPHER includes a `data/emergency_manuals/` directory with structured protocols for:
- Survivor triage
- Fire response
- Building collapse and structural hazard assessment
- Search extraction procedures
These are ingested into the local vector DB, enabling the agent to answer questions that combine live detections with protocol knowledge — e.g., "I see smoke and an unconscious person — what's the extraction priority?"
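Ingestion of the manuals can be sketched as chunking each protocol file into word-bounded passages before embedding. The chunk size and record schema here are illustrative assumptions, not the project's actual pipeline:

```python
def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split a manual into word-bounded chunks for embedding.
    Real pipelines often overlap chunks to avoid cutting protocols mid-step."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def ingest(manuals: dict[str, str], max_words: int = 50) -> list[dict]:
    """manuals: file name -> text. Return records ready for a vector DB,
    keeping the source file so answers can cite the protocol they came from."""
    records = []
    for name, text in manuals.items():
        for i, chunk in enumerate(chunk_text(text, max_words)):
            records.append({"source": name, "chunk_id": i, "text": chunk})
    return records
```

Keeping the `source` field alongside each chunk is what lets the agent ground an answer like the extraction-priority example in a specific manual.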
`Drone/` contains a specialized configuration for aerial feeds:
- YOLO on NPU for low-power inference on embedded hardware
- Spatial agent for overhead scene understanding
- Emergency manual integration for autonomous triage assistance
- Point cloud generation from depth-enabled cameras
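Point cloud generation from a depth camera reduces to pinhole back-projection: each pixel (u, v) with depth Z maps to X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy. A minimal sketch, with intrinsics (fx, fy, cx, cy) assumed known from calibration:

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (rows of metres) into 3D camera-frame
    points using the pinhole model. Invalid (non-positive) depths are skipped."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points
```

A production pipeline would vectorize this with NumPy and transform the points into a world frame using the drone's pose, but the geometry is the same.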
| Priority | Feature |
|---|---|
| 🔄 In progress | Depth model integration (Qualcomm AI Hub Depth-Anything) |
| 🔄 In progress | Tighter 3D map integration for spatial Q&A |
| ⏸️ Planned | Multi-camera fusion and unified tactical view |
| ⏸️ Planned | One-click export for after-action reporting |
| ⏸️ Planned | Mobile optimization (Snapdragon tablet/handheld) |
- QComm-Warehouse-XR-Robotics-System — Qualcomm edge AI for warehouse robotics; shares the on-device inference architecture
- Sign-Language-Edge-Gloss — Qualcomm AI Hub edge model deployment patterns
- AI-Safety-Benchmarking-RedTeam-Framework — Systematic evaluation framework for AI systems in safety-critical contexts
```bibtex
@software{cipher_2026,
  author = {A-Kuo},
  title  = {CIPHER: On-Device Tactical Awareness AI for First Responders},
  url    = {https://github.com/A-Kuo/CIPHER},
  year   = {2026},
  note   = {YOLO + Whisper + local LLM + Qualcomm NPU — zero cloud dependency}
}
```

Built for the field. Runs offline. Keeps data local. April 2026.