They've been talking this whole time. We just couldn't hear it.
In-browser hummingbird vocalization translator. Local ML, Web Audio, WebGL. All processing on-device. No audio leaves your machine.
deploy: teslasolar.github.io/hummingbird/
stack: Web Audio API | Canvas/WebGL | IndexedDB | Web Bluetooth
model: rule-based classifier β self-compacting specializer
hw: ESP32-S3 + MEMS mic array + piezo tweeter (Phase 3)
# local dev β any static server
python3 -m http.server 8000
# or
npx serve .
# then open http://localhost:8000No build step. No dependencies. Open index.html.
# deploy to github pages
git push origin main # gh-pages serves from root βββββββββββββββ
β index.html β β orchestrator
ββββββββ¬βββββββ
ββββββββββββββββββΌβββββββββββββββββ
βΌ βΌ βΌ
ββββββββββββ ββββββββββββββ ββββββββββββ
β CAPTURE β β MODEL β β VIZ β
β pipeline β β classify β β render β
ββββββ¬ββββββ βββββββ¬βββββββ ββββββ¬ββββββ
β β β
capture.js classifier.js spectrogram.js
features.js compactor.js landscape.js
loader.js timeline.js
vocab.json
β
βββββββ΄βββββββ
β DATA β
β species β
β syllables β
ββββββββββββββ
β
βββββββ΄βββββββ
β HARDWARE β
β ble.js β
β playback β
β calibrate β
ββββββββββββββ
| what | how |
|---|---|
| mic input | Web Audio API, 44.1kHz, echo/noise/agc OFF |
| FFT | AnalyserNode, 4096-point, ~10.7Hz resolution |
| bandpass | biquad highpass 2kHz β lowpass 16kHz |
| features | fundamental freq, harmonics, contour, syllable count, pattern, spectral centroid/bandwidth |
| patterns | triplet, trill, chirp, buzz, unknown |
| syllable detect | energy threshold + duration gate (30-200ms) + triplet detector |
β data/README.md
| what | how |
|---|---|
| classifier | rule-based scoring against vocab.json |
| categories | territory, courtship, alarm, feeding, chick_begging, chase, contact |
| species ID | frequency signature matching (10 species) |
| learning | user corrections adjust rule weights via IndexedDB |
| compaction | prunes unobserved species every 100 recordings |
| size curve | 50KB general β 20KB regional β 5KB individual-specialized |
β model/README.md
| what | how |
|---|---|
| spectrogram | Canvas 2D, scrolling, freq-band colored |
| landscape | WebGL 3D terrain (Canvas 2D fallback), mouse/touch rotate |
| timeline | detection markers, color-coded by category |
| freq bands | blue 0-2kHz, green 2-8kHz, gold 8-16kHz, red 16-22kHz |
β viz/README.md
| what | how |
|---|---|
| BLE | Web Bluetooth to ESP32-S3 feeder |
| playback | piezo tweeter 2-30kHz, safety-gated |
| calibrate | frequency sweep mic/speaker response |
| safety | ONLY contact + feeding calls, volume capped, auto-silence on alarm |
β hw/README.md
| doc | scope |
|---|---|
| URS-001 | user requirements β what it must do |
| TECH-SPEC-001 | system design β MCU, audio, BLE protocol, power |
| ARCH-001 | architecture β module boundaries, data flow, state machine |
| WIRING-001 | schematics β every pin, every wire, every cap |
| FW-001 | firmware β ESP-IDF, FreeRTOS tasks, C source structure |
| MECH-001 | mechanical β enclosure, IP54, thermal, mounting |
| SAFETY-001 | safety β 7 rules, 9 layers, MBTA compliance |
| BOM-001 | materials β $55 BOM, 45 line items, volume pricing |
| TEST-001 | test plan β 73 tests, 9 categories, 5 critical safety |
AudioCapture
.start() β Promise<void> request mic, init FFT
.stop() β void release mic
.getFrequencyData() β Uint8Array full spectrum 0-Nyquist
.getTimeDomainData() β Uint8Array waveform
.getFilteredFrequencyData() β Uint8Array 2-16kHz band only
.binToFreq(bin) β number Hz
.freqToBin(freq) β number bin index
.sampleRate : number
.nyquist : number
.frequencyResolution : number
FeatureExtractor
.extract(freqData, timeData, sampleRate) β FeatureVector | null
.energyThreshold : number (default 15)
FeatureVector {
fundamental_freq : Hz
peak_freq : Hz
duration_ms : number
freq_contour : Hz[]
harmonics : Hz[]
syllable_count : number
pattern : "triplet" | "trill" | "chirp" | "buzz" | "unknown"
energy_profile : number[]
energy : number
peak_amplitude : number
spectral_centroid : Hz
spectral_bandwidth : Hz
timestamp : ISO8601
}
VocalizationClassifier
.load() β Promise<void> load vocab + species data
.classify(features) β ClassResult score against rules
.userCorrection(actual, classified) β void reinforce
.getModelSize() β string
ClassResult {
category : string territory|courtship|alarm|...
confidence : 0.0-1.0
species : string | null "Anna's Hummingbird"
pattern : string
frequency : Hz
scores : {category: score}
}
ModelCompactor
.addRecording(features, result) β void
.compact() β CompactionReport
.saveToIndexedDB() β Promise<number> returns count saved
.getRecordingCount() β Promise<number>
.exportModel() β object shareable model state
.importModel(data) β void merge shared model
FrequencyLandscape
.update(freqData, sampleRate) β void add frame + render
.toggleAutoRotate() β void
.resetView() β void
SpectrogramViz
.draw(freqData, sampleRate) β void draw one column
RecordingTimeline
.addDetection(result) β void
.draw() β void
.clear() β void
BLEFeeder
.connect() β Promise<void> scan + pair
.disconnect() β void
.sendPlayback(freqHz, durationMs, amplitude) β Promise<void>
.getStatus() β Promise<FeederStatus>
.isConnected() β boolean
HummingbirdPlayback
.play(type, bleFeeder?) β Promise<boolean> safety-gated
.testTone(bleFeeder?) β Promise<void>
.sweep(startHz, endHz, ms) β Promise<void>
.allowedTypes : ["contact", "feeding"]
.blockedTypes : ["alarm", "courtship", "territory", "chase", "chick_begging"]
SPECIES FREQ RANGE NOTES
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Ruby-throated 2-8 kHz Eastern NA most common eastern
Anna's 2-10 kHz Western NA complex song, year-round
Rufous 3-10 kHz Western NA extremely aggressive
Black-chinned 2-7 kHz Western NA soft chips
Broad-tailed 2-8 kHz Mountain West metallic wing trill
Allen's 3-9 kHz California similar to Rufous
Calliope 2-6 kHz Pacific NW smallest NA bird
Costa's 2-8 kHz SW desert whistle-like
Ecuadorian Hillstar 10-15 kHz Andes F0=13.4kHz, highest known
Black Jacobin 10-14 kHz SE Brazil harmonics to 80kHz
0-2 kHz blue sub-bird range
2-8 kHz green normal bird vocalization
8-16 kHz gold hummingbird high-frequency
16-80 kHz red ultrasonic (beyond mic Nyquist)
territory repeated chirps, rapid, aggressive 2-14 kHz
courtship complex multi-syllable song variable
alarm sharp single/burst chip wide band
feeding soft chips near food 2-5 kHz low amp
chick_beg high-pitched continuous >adult freq
chase aggressive buzz + wing sounds broadband
contact quiet single chip "I'm here" soft, brief
Duque et al. 2018 "HF vocalizations in Andean hummingbirds" Current Biology
Duque et al. 2020 "High-frequency hearing in a hummingbird" Science Advances
Olson et al. 2018 "Black Jacobin vocalizes above hearing range" Current Biology
Pytte et al. 2004 "Ultrasonic singing by blue-throated hbird" J Comp Physiol
KEY FINDINGS:
- HF vocalizations for territory defense + courtship
- Stereotyped syllable triplets ~60ms each
- Males inflate throat β iridescent wave while singing
- HF songs attenuate rapidly = short-range private channel
- Evolved to avoid masking (waterfalls, insects, other birds)
- Hummingbirds hear 0-10+ kHz, most birds 2-8 kHz
- Dogs hear up to 40 kHz (can hear all of it)
Phase 3 hardware feeder β $55 BOM:
COMPONENT SPEC COST
ββββββββββββββββββββββββββββββββββββββββββββββββββββ
MEMS mic Γ4 INMP441 I2S 44.1kHz $16
ESP32-S3 240MHz BLE5 I2S DAC SD $8
Piezo tweeter 2-30kHz response $5
Class D amp MAX98357A I2S $4
Solar panel 5V 1W $6
LiPo battery 3.7V 2000mAh $5
Weatherproof case 3D printed $3
Feeder standard hummingbird $8
RULE 1 Never emit predator alarm calls
RULE 2 Never emit courtship calls
RULE 3 Only emit contact + feeding calls
RULE 4 Volume capped at natural hummingbird dB
RULE 5 User must confirm before any playback
RULE 6 Auto-silence if alarm behavior detected
RULE 7 No playback during nesting season without expert review
The feeder should ATTRACT, not CONFUSE.
β R0 feeder physical ground, sugar water, attraction
γ R1 microphone signal intake, 44.1kHz capture
β R2 classifier gate β what is this sound?
β‘ R3 translation meaning β territory? courtship? alarm?
β³ R4 model learn, compress, specialize, improve
β R5 species profile identity β which bird? which individual?
β― R6 3D landscape observer β see the whole soundscape
p=2 capture.js + index.html (mic + FFT) β
p=3 spectrogram.js (real-time freq display) β
p=5 features.js (syllable + pattern extraction) β
p=7 vocab.json + species.json (base knowledge) β
p=11 classifier.js + loader.js (rule-based model) β
p=13 landscape.js (3D WebGL terrain) β
p=17 compactor.js (self-pruning local learning) β
hummingbird/
βββ index.html orchestrator + UI
βββ style.css dark theme, hummingbird green (#39e891)
βββ capture.js Web Audio mic + FFT + bandpass
βββ features.js syllable extraction + pattern detection
βββ model/
β βββ loader.js data loader (vocab + species + syllables)
β βββ classifier.js rule-based vocalization classifier
β βββ compactor.js self-compacting model pruner
β βββ vocab.json classification rules + species signatures
βββ viz/
β βββ spectrogram.js Canvas 2D scrolling spectrogram
β βββ landscape.js WebGL 3D frequency terrain
β βββ timeline.js detection timeline with annotations
βββ hw/
β βββ ble.js Web Bluetooth feeder connection
β βββ playback.js safety-gated tone generation
β βββ calibrate.js mic/speaker calibration
βββ data/
β βββ species.json 10 hummingbird species profiles
β βββ syllables.json 8 syllable types + 7 vocalization contexts
β βββ recordings/ user recordings (IndexedDB backed)
βββ README.md β you are here