Skip to content

teslasolar/hummingbird

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

4 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

🐦 KONOMI TORI ε₯½γΏγƒˆγƒͺ

They've been talking this whole time. We just couldn't hear it.

In-browser hummingbird vocalization translator. Local ML, Web Audio, WebGL. All processing on-device. No audio leaves your machine.

deploy: teslasolar.github.io/hummingbird/
stack:  Web Audio API | Canvas/WebGL | IndexedDB | Web Bluetooth
model:  rule-based classifier β†’ self-compacting specializer
hw:     ESP32-S3 + MEMS mic array + piezo tweeter (Phase 3)

# local dev β€” any static server
python3 -m http.server 8000
# or
npx serve .
# then open http://localhost:8000

No build step. No dependencies. Open index.html.

# deploy to github pages
git push origin main  # gh-pages serves from root

                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚  index.html β”‚ ← orchestrator
                    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β–Ό                β–Ό                β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ CAPTURE  β”‚    β”‚   MODEL    β”‚    β”‚   VIZ    β”‚
    β”‚ pipeline β”‚    β”‚  classify  β”‚    β”‚  render  β”‚
    β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜
         β”‚                β”‚                β”‚
    capture.js       classifier.js    spectrogram.js
    features.js      compactor.js     landscape.js
                     loader.js        timeline.js
                     vocab.json
                          β”‚
                    β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”
                    β”‚    DATA    β”‚
                    β”‚  species   β”‚
                    β”‚  syllables β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                          β”‚
                    β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”
                    β”‚  HARDWARE  β”‚
                    β”‚  ble.js    β”‚
                    β”‚  playback  β”‚
                    β”‚  calibrate β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

capture β€” capture.js features.js

what how
mic input Web Audio API, 44.1kHz, echo/noise/agc OFF
FFT AnalyserNode, 4096-point, ~10.7Hz resolution
bandpass biquad highpass 2kHz β†’ lowpass 16kHz
features fundamental freq, harmonics, contour, syllable count, pattern, spectral centroid/bandwidth
patterns triplet, trill, chirp, buzz, unknown
syllable detect energy threshold + duration gate (30-200ms) + triplet detector

β†’ data/README.md

model β€” model/

what how
classifier rule-based scoring against vocab.json
categories territory, courtship, alarm, feeding, chick_begging, chase, contact
species ID frequency signature matching (10 species)
learning user corrections adjust rule weights via IndexedDB
compaction prunes unobserved species every 100 recordings
size curve 50KB general β†’ 20KB regional β†’ 5KB individual-specialized

β†’ model/README.md

viz β€” viz/

what how
spectrogram Canvas 2D, scrolling, freq-band colored
landscape WebGL 3D terrain (Canvas 2D fallback), mouse/touch rotate
timeline detection markers, color-coded by category
freq bands blue 0-2kHz, green 2-8kHz, gold 8-16kHz, red 16-22kHz

β†’ viz/README.md

hw β€” hw/

what how
BLE Web Bluetooth to ESP32-S3 feeder
playback piezo tweeter 2-30kHz, safety-gated
calibrate frequency sweep mic/speaker response
safety ONLY contact + feeding calls, volume capped, auto-silence on alarm

β†’ hw/README.md

engineering β€” engineering/

doc scope
URS-001 user requirements β€” what it must do
TECH-SPEC-001 system design β€” MCU, audio, BLE protocol, power
ARCH-001 architecture β€” module boundaries, data flow, state machine
WIRING-001 schematics β€” every pin, every wire, every cap
FW-001 firmware β€” ESP-IDF, FreeRTOS tasks, C source structure
MECH-001 mechanical β€” enclosure, IP54, thermal, mounting
SAFETY-001 safety β€” 7 rules, 9 layers, MBTA compliance
BOM-001 materials β€” $55 BOM, 45 line items, volume pricing
TEST-001 test plan β€” 73 tests, 9 categories, 5 critical safety

β†’ engineering/README.md


AudioCapture
  .start()              β†’ Promise<void>     request mic, init FFT
  .stop()               β†’ void              release mic
  .getFrequencyData()   β†’ Uint8Array        full spectrum 0-Nyquist
  .getTimeDomainData()  β†’ Uint8Array        waveform
  .getFilteredFrequencyData() β†’ Uint8Array  2-16kHz band only
  .binToFreq(bin)       β†’ number Hz
  .freqToBin(freq)      β†’ number bin index
  .sampleRate            : number
  .nyquist               : number
  .frequencyResolution   : number

FeatureExtractor
  .extract(freqData, timeData, sampleRate) β†’ FeatureVector | null
  .energyThreshold       : number (default 15)

FeatureVector {
  fundamental_freq     : Hz
  peak_freq            : Hz
  duration_ms          : number
  freq_contour         : Hz[]
  harmonics            : Hz[]
  syllable_count       : number
  pattern              : "triplet" | "trill" | "chirp" | "buzz" | "unknown"
  energy_profile       : number[]
  energy               : number
  peak_amplitude       : number
  spectral_centroid    : Hz
  spectral_bandwidth   : Hz
  timestamp            : ISO8601
}

VocalizationClassifier
  .load()               β†’ Promise<void>     load vocab + species data
  .classify(features)   β†’ ClassResult       score against rules
  .userCorrection(actual, classified) β†’ void   reinforce
  .getModelSize()       β†’ string

ClassResult {
  category             : string             territory|courtship|alarm|...
  confidence           : 0.0-1.0
  species              : string | null       "Anna's Hummingbird"
  pattern              : string
  frequency            : Hz
  scores               : {category: score}
}

ModelCompactor
  .addRecording(features, result) β†’ void
  .compact()            β†’ CompactionReport
  .saveToIndexedDB()    β†’ Promise<number>    returns count saved
  .getRecordingCount()  β†’ Promise<number>
  .exportModel()        β†’ object             shareable model state
  .importModel(data)    β†’ void               merge shared model

FrequencyLandscape
  .update(freqData, sampleRate) β†’ void       add frame + render
  .toggleAutoRotate()   β†’ void
  .resetView()          β†’ void

SpectrogramViz
  .draw(freqData, sampleRate) β†’ void         draw one column

RecordingTimeline
  .addDetection(result) β†’ void
  .draw()               β†’ void
  .clear()              β†’ void

BLEFeeder
  .connect()            β†’ Promise<void>      scan + pair
  .disconnect()         β†’ void
  .sendPlayback(freqHz, durationMs, amplitude) β†’ Promise<void>
  .getStatus()          β†’ Promise<FeederStatus>
  .isConnected()        β†’ boolean

HummingbirdPlayback
  .play(type, bleFeeder?) β†’ Promise<boolean>   safety-gated
  .testTone(bleFeeder?)   β†’ Promise<void>
  .sweep(startHz, endHz, ms) β†’ Promise<void>
  .allowedTypes          : ["contact", "feeding"]
  .blockedTypes          : ["alarm", "courtship", "territory", "chase", "chick_begging"]

species coverage

SPECIES                  FREQ         RANGE               NOTES
─────────────────────────────────────────────────────────────────
Ruby-throated            2-8 kHz      Eastern NA          most common eastern
Anna's                   2-10 kHz     Western NA          complex song, year-round
Rufous                   3-10 kHz     Western NA          extremely aggressive
Black-chinned            2-7 kHz      Western NA          soft chips
Broad-tailed             2-8 kHz      Mountain West       metallic wing trill
Allen's                  3-9 kHz      California          similar to Rufous
Calliope                 2-6 kHz      Pacific NW          smallest NA bird
Costa's                  2-8 kHz      SW desert           whistle-like
Ecuadorian Hillstar      10-15 kHz    Andes               F0=13.4kHz, highest known
Black Jacobin            10-14 kHz    SE Brazil           harmonics to 80kHz

frequency bands

0-2 kHz     blue     sub-bird range
2-8 kHz     green    normal bird vocalization
8-16 kHz    gold     hummingbird high-frequency
16-80 kHz   red      ultrasonic (beyond mic Nyquist)

vocalization types

territory    repeated chirps, rapid, aggressive       2-14 kHz
courtship    complex multi-syllable song              variable
alarm        sharp single/burst chip                  wide band
feeding      soft chips near food                     2-5 kHz low amp
chick_beg    high-pitched continuous                  >adult freq
chase        aggressive buzz + wing sounds            broadband
contact      quiet single chip "I'm here"             soft, brief

Duque et al. 2018   "HF vocalizations in Andean hummingbirds"      Current Biology
Duque et al. 2020   "High-frequency hearing in a hummingbird"      Science Advances
Olson et al. 2018   "Black Jacobin vocalizes above hearing range"  Current Biology
Pytte et al. 2004   "Ultrasonic singing by blue-throated hbird"    J Comp Physiol

KEY FINDINGS:
- HF vocalizations for territory defense + courtship
- Stereotyped syllable triplets ~60ms each
- Males inflate throat β†’ iridescent wave while singing
- HF songs attenuate rapidly = short-range private channel
- Evolved to avoid masking (waterfalls, insects, other birds)
- Hummingbirds hear 0-10+ kHz, most birds 2-8 kHz
- Dogs hear up to 40 kHz (can hear all of it)

Phase 3 hardware feeder β€” $55 BOM:

COMPONENT              SPEC                     COST
────────────────────────────────────────────────────
MEMS mic Γ—4            INMP441 I2S 44.1kHz      $16
ESP32-S3               240MHz BLE5 I2S DAC SD   $8
Piezo tweeter          2-30kHz response          $5
Class D amp            MAX98357A I2S             $4
Solar panel            5V 1W                     $6
LiPo battery           3.7V 2000mAh             $5
Weatherproof case      3D printed                $3
Feeder                 standard hummingbird      $8

RULE 1  Never emit predator alarm calls
RULE 2  Never emit courtship calls
RULE 3  Only emit contact + feeding calls
RULE 4  Volume capped at natural hummingbird dB
RULE 5  User must confirm before any playback
RULE 6  Auto-silence if alarm behavior detected
RULE 7  No playback during nesting season without expert review

The feeder should ATTRACT, not CONFUSE.


●  R0  feeder          physical ground, sugar water, attraction
γ€œ R1  microphone       signal intake, 44.1kHz capture
┃  R2  classifier       gate β€” what is this sound?
β™‘  R3  translation      meaning β€” territory? courtship? alarm?
β–³  R4  model            learn, compress, specialize, improve
◐  R5  species profile  identity β€” which bird? which individual?
β—―  R6  3D landscape     observer β€” see the whole soundscape

p=2   capture.js + index.html (mic + FFT)           βœ“
p=3   spectrogram.js (real-time freq display)        βœ“
p=5   features.js (syllable + pattern extraction)    βœ“
p=7   vocab.json + species.json (base knowledge)     βœ“
p=11  classifier.js + loader.js (rule-based model)   βœ“
p=13  landscape.js (3D WebGL terrain)                βœ“
p=17  compactor.js (self-pruning local learning)     βœ“

hummingbird/
β”œβ”€β”€ index.html              orchestrator + UI
β”œβ”€β”€ style.css               dark theme, hummingbird green (#39e891)
β”œβ”€β”€ capture.js              Web Audio mic + FFT + bandpass
β”œβ”€β”€ features.js             syllable extraction + pattern detection
β”œβ”€β”€ model/
β”‚   β”œβ”€β”€ loader.js           data loader (vocab + species + syllables)
β”‚   β”œβ”€β”€ classifier.js       rule-based vocalization classifier
β”‚   β”œβ”€β”€ compactor.js        self-compacting model pruner
β”‚   └── vocab.json          classification rules + species signatures
β”œβ”€β”€ viz/
β”‚   β”œβ”€β”€ spectrogram.js      Canvas 2D scrolling spectrogram
β”‚   β”œβ”€β”€ landscape.js        WebGL 3D frequency terrain
β”‚   └── timeline.js         detection timeline with annotations
β”œβ”€β”€ hw/
β”‚   β”œβ”€β”€ ble.js              Web Bluetooth feeder connection
β”‚   β”œβ”€β”€ playback.js         safety-gated tone generation
β”‚   └── calibrate.js        mic/speaker calibration
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ species.json        10 hummingbird species profiles
β”‚   β”œβ”€β”€ syllables.json      8 syllable types + 7 vocalization contexts
β”‚   └── recordings/         user recordings (IndexedDB backed)
└── README.md               ← you are here

About

🐦 KONOMI TORI ε₯½γΏγƒˆγƒͺ Hummingbird Communication Translator

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors