Depth sounder pings → self-supervised generative underwater video

Sonar hears. Cameras see. The physics of the water column does the rest.
SonarVision converts marine depth sounder (sonar) returns into predicted underwater video frames. When the sonar detects a fish at 15 meters, the camera positioned at 15 meters provides the ground truth. No manual labeling required.
Sonar: "Fish at 15.2m"
Camera@5m: dark water → w = 0.01
Camera@15m: ★ FISH ★ → w = 0.95 (ground truth!)
Camera@20m: dark water → w = 0.01
Loss = Σ exp(-|d_cam - d_sonar|²/σ²) × L2(predicted, camera)
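A minimal sketch of that depth-weighted loss in PyTorch, assuming N cameras fixed at known depths. The function name and tensor layout are illustrative, not the repo's actual DepthWeightedLoss API:

```python
import torch

def depth_weighted_loss(predicted, camera_frames, camera_depths, sonar_depth, sigma=2.0):
    """Weight each camera's L2 loss by its depth proximity to the sonar return.

    predicted:      (B, 3, H, W) generated frame
    camera_frames:  (N, B, 3, H, W) frames from N cameras at fixed depths
    camera_depths:  (N,) depth of each camera in meters
    sonar_depth:    depth of the sonar detection in meters
    """
    # Gaussian weight: the camera nearest the sonar return dominates the loss
    weights = torch.exp(-((camera_depths - sonar_depth) ** 2) / sigma**2)  # (N,)
    per_camera = torch.stack([
        torch.mean((predicted - frame) ** 2) for frame in camera_frames
    ])  # (N,) mean-squared error against each camera
    return torch.sum(weights * per_camera)
```

With the numbers above, the camera at 15 m gets weight ≈ 0.95 while the 5 m and 20 m cameras contribute almost nothing, so the network is trained against whichever camera actually saw the target.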
```bash
git clone https://github.com/SuperInstance/sonar-vision.git
cd sonar-vision
pip install torch numpy pillow pyyaml einops
```

```python
import torch
from sonar_vision.pipeline import SonarVision

model = SonarVision(max_depth=50, bearing_bins=32, embed_dim=256)
model.eval()

sonar = torch.randn(1, 32, 50)            # (batch, bearing_bins, depth_bins)
output = model.generate(sonar)
print(f"Frame: {output['frame'].shape}")  # (1, 3, 384, 512)
```

| Doc | What it covers |
|---|---|
| Getting Started | Install, first inference, troubleshooting |
| Tutorials | 6 step-by-step tutorials (prediction, data pipeline, cameras, LoRA, Jetson, federated) |
| Onboarding | Contributing, code structure, PR process |
| Use Cases | Commercial fishing, aquaculture, AUVs, sustainability, market analysis |
| Hardware Guide | Sonar, cameras, Jetson, wiring diagrams, cost breakdown |
| Architecture | System design, GCT adapter, training pipeline, deployment plan |
```
Sonar → SonarEncoder (4-ch ViT) → GCT Stream (causal attn, KV cache) → VideoDecoder → RGB Frame
                                                                                        ↓
                                                                   DepthWeightedLoss ← cameras
```
| Module | What it does |
|---|---|
| SonarEncoder | 4-channel ViT (intensity, gradient, depth-norm, accumulated) |
| StreamingGCTAggregator | Causal attention, KV cache, 3D RoPE, grouped-query attention |
| VideoDecoder | DPT head + UnderwaterColorHead (blue-green cast by depth) |
| DepthWeightedLoss | Automatic ground truth from camera array |
| WaterColumnModel | Unified model: Mackenzie sound speed, Francois-Garrison absorption, Jerlov light |
| NightlyTrainer | Automated daily LoRA training with quality scoring |
| FederatedAggregator | Opt-in data sharing with differential privacy |
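As one concrete piece of the WaterColumnModel, here is a sketch of the Mackenzie (1981) nine-term sound speed equation it names. The function name and signature are illustrative; the repo's own API may differ:

```python
def mackenzie_sound_speed(T: float, S: float, D: float) -> float:
    """Mackenzie (1981) nine-term sound speed in seawater, m/s.

    T: temperature in degrees C, S: salinity in parts per thousand, D: depth in meters.
    Valid roughly for 2-30 C, 25-40 ppt, 0-8000 m.
    """
    return (1448.96 + 4.591 * T - 5.304e-2 * T**2 + 2.374e-4 * T**3
            + 1.340 * (S - 35) + 1.630e-2 * D + 1.675e-7 * D**2
            - 1.025e-2 * T * (S - 35) - 7.139e-13 * T * D**3)

# e.g. 10 C, 35 ppt, 15 m depth → ~1490 m/s
```

Sound speed matters because it converts sonar two-way travel time into the depth estimate that the depth-weighted loss keys on.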
- 🐟 Commercial fishing — Species ID from sonar, bycatch reduction, catch forecasting
- 🏭 Aquaculture — Fish counting, health monitoring, feeding optimization ($910M market by 2026)
- 🤖 AUVs — Obstacle avoidance, SLAM, 3D reconstruction in zero-visibility water
- 🔬 Marine research — Habitat mapping, species surveys, environmental monitoring
- 🌊 Sustainability — IUU detection, catch monitoring, traceability
📖 Full Use Cases + Market Analysis
Complete system from $500 (entry) to $5,000 (professional):
| Config | Hardware | FPS |
|---|---|---|
| Entry | Used sonar + 2 cameras + Jetson Nano | 5-8 |
| Standard | Garmin sonar + 4 GoPros + Jetson Orin NX | 10-15 |
| Pro | Premium sonar + 4 GoPros + Jetson Orin NX | 10-15 |
| Research | Full system + Jetson AGX Orin | 20-30 |
📖 Full Hardware Guide + Wiring Diagrams
The system gets better every day. Mount cameras on your downrigger, let it run:
```bash
# Set up cron — trains automatically at 3am (crontab entries must be a single line)
0 3 * * * cd /opt/sonar-vision && python -m sonar_vision.nightly.cron --data_dir /data/boat_data --output_dir /data/nightly
```

- Progressive epochs (3 → 10 as the dataset grows)
- Automatic promotion/rollback based on quality score (sketched below)
- LoRA weights are tiny (~2MB) — fast training, easy sharing
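A hedged sketch of the promotion/rollback decision, assuming the quality score is a single scalar per nightly run. The names and the margin parameter are illustrative; the NightlyTrainer's real criteria live in sonar_vision/nightly/:

```python
from pathlib import Path
import shutil

def promote_or_rollback(candidate: Path, active: Path, new_score: float,
                        best_score: float, margin: float = 0.01) -> float:
    """Promote the nightly LoRA only if it beats the active one by a margin.

    candidate: freshly trained LoRA weights (~2MB file)
    active:    path the inference pipeline loads from
    Returns the best score after this decision.
    """
    if new_score > best_score + margin:
        shutil.copy2(candidate, active)   # promote: candidate becomes active
        return new_score
    return best_score                     # rollback: keep the previous weights
```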
📖 Tutorial: Training Your First LoRA
Opt-in to share anonymized data for a global underwater vision model:
- Privacy first: Only noise-injected LoRA weights leave your boat (no raw images, no GPS)
- Differential privacy: Calibrated Gaussian noise (ε=1.0 budget)
- Federated averaging: Weighted by data quality × quantity
- Regional tracking: GOA, Bering, Gulf, Pacific NW
📖 Tutorial: Joining the Federated Network
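A minimal sketch of the noise-injection step, assuming the classic Gaussian mechanism over a clipped, flattened LoRA delta. The calibration shown is the textbook (ε, δ) formula and the function name is an assumption, not the repo's actual API:

```python
import math
import torch

def privatize_lora(delta: torch.Tensor, epsilon: float = 1.0,
                   delta_dp: float = 1e-5, clip_norm: float = 1.0) -> torch.Tensor:
    """Clip a flattened LoRA weight delta and add calibrated Gaussian noise.

    Textbook Gaussian-mechanism calibration: sigma = C * sqrt(2 ln(1.25/delta)) / epsilon.
    Only this noised tensor would leave the boat; raw frames and GPS never do.
    """
    norm = delta.norm()
    clipped = delta * min(1.0, clip_norm / (norm.item() + 1e-12))  # bound sensitivity
    sigma = clip_norm * math.sqrt(2 * math.log(1.25 / delta_dp)) / epsilon
    return clipped + torch.randn_like(clipped) * sigma
```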
```
sonar_vision/
├── pipeline.py      # Main model (encoder → GCT → decoder)
├── config.py        # YAML experiment configs
├── deploy.py        # TorchScript, quantization, Jetson
├── train.py         # Training loop (AMP, EMA)
├── encoder/         # SonarSweepEmbedding + SonarEncoder
├── aggregator/      # GCT streaming (KV cache, 3D RoPE)
├── decoder/         # DPT head + UnderwaterColorHead
├── supervision/     # Self-supervision losses
├── water/           # Underwater physics (Francois-Garrison, Jerlov, Thermocline, Seabed)
├── data/            # Dataset, augmentation, preprocessing
├── nightly/         # LoRA training, cron job
├── federated/       # Data anonymization, DP, FedAvg
├── utils/           # Visualization
└── tests/           # Unit tests
```
| Metric | Value |
|---|---|
| Source files | 28 |
| Lines of Python | 5,000+ |
| Test files | 5 |
| Preset configs | 3 |
| PLATO knowledge rooms | 4 |
| Tutorials | 6 |
| MIT License | ✅ |
How SonarVision's components correspond to LingBot-Map's:

| LingBot-Map | SonarVision |
|---|---|
| RGB video → ViT patches | Sonar pings → 4-channel acoustic embed |
| 3D RoPE (x, y, time) | 3D RoPE (depth, bearing, time) |
| Camera intrinsics | Fixed sonar geometry |
| DPT → 3D point cloud | DPT → underwater video frames |
| Pretrained depth labels | Self-supervised via camera array |
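To make the remapped rotary embedding concrete, here is a sketch of 3D RoPE that splits the feature dimension across depth, bearing, and time. The even three-way split and function names are assumptions, not the repo's implementation:

```python
import torch

def rope_1d(x: torch.Tensor, pos: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Standard rotary embedding along one axis.

    x: (..., d) with d even; pos: (...,) positions broadcastable against x.
    """
    d = x.shape[-1]
    freqs = base ** (-torch.arange(0, d, 2, dtype=x.dtype) / d)  # (d/2,)
    angles = pos[..., None] * freqs                               # (..., d/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def rope_3d(x, depth_pos, bearing_pos, time_pos):
    """Split the last dim into three chunks and rotate each by its own axis."""
    d = x.shape[-1] // 3
    d -= d % 2  # each chunk needs an even size
    parts = [
        rope_1d(x[..., :d], depth_pos),
        rope_1d(x[..., d:2 * d], bearing_pos),
        rope_1d(x[..., 2 * d:3 * d], time_pos),
    ]
    return torch.cat(parts + [x[..., 3 * d:]], dim=-1)  # leftover dims untouched
```

With embed_dim=256 this gives 84 rotated dims per axis and leaves 4 dims unrotated; relative depth, bearing, and time offsets then fall out of attention scores the same way (x, y, time) offsets do in LingBot-Map.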
SonarVision is part of the Cocapn ecosystem:
- 🏛️ PLATO — 4 knowledge rooms (architecture, physics, GCT, self-supervision)
- 🐚 I2I — Fleet communication bottles
- ⚡ Flux — NMEA preprocessing bytecode
- 🐙 Git-Native — Push protocol and recovery
Contributions welcome! See docs/ONBOARDING.md for setup and PR process.
Good first issues:
- Add more NMEA sentence parsers ($SDDBT, $SDVHW, $SDMTW; a starter sketch for $SDDBT follows this list)
- Real data benchmarks on actual sonar recordings
- Jetson TensorRT INT8 quantization
- Documentation improvements and translations
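For anyone picking up the parser issue, a minimal sketch of a $SDDBT (Depth Below Transducer) parser. The field layout follows NMEA 0183, though the return convention here is an assumption to be aligned with the existing parsers:

```python
from functools import reduce

def parse_sddbt(sentence: str) -> float | None:
    """Parse $SDDBT and return depth below transducer in meters.

    Example: '$SDDBT,17.0,f,5.1,M,2.8,F*3E'
    Fields:  depth in feet, 'f', meters, 'M', fathoms, 'F', checksum.
    """
    body, _, checksum = sentence.strip().lstrip('$').partition('*')
    if checksum:  # NMEA checksum: XOR of all chars between '$' and '*'
        calc = reduce(lambda acc, ch: acc ^ ord(ch), body, 0)
        if calc != int(checksum, 16):
            return None
    fields = body.split(',')
    if fields[0] != 'SDDBT' or len(fields) < 5 or not fields[3]:
        return None
    return float(fields[3])  # the meters field

assert parse_sddbt('$SDDBT,17.0,f,5.1,M,2.8,F*3E') == 5.1
```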
MIT — use it however you want.
Sonar hears. Cameras see. The physics of the water column does the rest.
Part of the SuperInstance fleet. Built by Forgemaster ⚒️.