🎧 SSYNC – Audio Augmented Reality

Real-time AI-powered sound discovery and mixing on your phone


SSYNC listens to the world through your phone's microphone, identifies environmental sounds using spectral AI analysis, and lets you control the volume of each discovered sound category in real time – like a DJ mixer for reality.

🇹🇷 Türkçe README


🌟 Features

πŸ” Discovery Mode

  • No hardcoded categories – the AI discovers sounds dynamically
  • Real-time spectral analysis identifies: Speech, Singing, Music, Traffic, Bird Song, Wind, Engine Rumble, and more
  • New sliders automatically appear when sounds are detected
  • Sliders fade out after 8 seconds of silence

🎛️ Real-Time Mixer

  • Individual volume control for each discovered sound
  • Touch-to-set horizontal sliders with per-category gain
  • Glassmorphism dark UI with neon color coding:
    • πŸŽ™οΈ Cyan β€” Voice (Speech, Talking, Whispering)
    • 🎡 Purple β€” Music (Keyboard, Singing)
    • πŸ™οΈ Orange β€” Urban (Traffic, Engine Rumble, Impact)
    • 🌿 Green β€” Nature (Bird Song, Wind)
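The mixer idea behind those sliders can be sketched in a few lines of Python (a simplified model, not the engine's C++ code; `mix_stems` and the category names are illustrative):

```python
import numpy as np

def mix_stems(stems: dict[str, np.ndarray], gains: dict[str, float]) -> np.ndarray:
    """Scale each separated stem by its slider gain and sum into one output buffer."""
    out = np.zeros_like(next(iter(stems.values())))
    for category, samples in stems.items():
        # Categories without a slider yet pass through at unity gain
        out += samples * gains.get(category, 1.0)
    return out

# Example: mute traffic, keep voice at full volume
stems = {"voice": np.ones(4), "traffic": np.ones(4)}
mixed = mix_stems(stems, {"voice": 1.0, "traffic": 0.0})
```

Each discovered category gets an independent gain, so silencing one sound never touches the others.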

⚡ Ultra-Low Latency

  • Oboe audio engine with SharingMode::Exclusive + PerformanceMode::LowLatency
  • 48kHz phone mic for maximum AI accuracy
  • C++ DSP pipeline – zero-allocation, lock-free audio processing
  • ~64ms classification interval with 300ms consensus voting
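A quick back-of-envelope check of those numbers (the 384-frame callback size is an assumption for illustration; the real Oboe burst size is negotiated per device):

```python
# Timing arithmetic behind the ~64ms / ~300ms figures above.
SAMPLE_RATE = 48_000          # kSampleRate
FRAMES_PER_CALLBACK = 384     # hypothetical audio callback size
CLASSIFY_EVERY_N = 8          # kClassifyEveryN (see Configuration)
VOTING_WINDOW = 4             # kVotingWindowSize

callback_ms = FRAMES_PER_CALLBACK * 1000 / SAMPLE_RATE   # 8.0 ms per callback
classify_interval_ms = callback_ms * CLASSIFY_EVERY_N    # 64.0 ms between classifications
voting_span_ms = classify_interval_ms * VOTING_WINDOW    # 256 ms, i.e. the ~300 ms window
```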

🧠 AI Pipeline

  • Source Separation (4-stem TFLite model) – separates Voice, Music, Traffic, Nature
  • Spectral Classifier – ZCR, spectral centroid, high-frequency ratio analysis
  • Consensus Voting – 4-frame window, 50% threshold to prevent false detections
  • 5x Digital Pre-Amp for enhanced microphone sensitivity
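The three spectral features can be computed as below (a NumPy sketch of the general technique, not the engine's C++ implementation; the 4 kHz high-frequency split is an assumed value):

```python
import numpy as np

def spectral_features(frame: np.ndarray, sample_rate: int = 48_000) -> dict:
    """ZCR, spectral centroid, and high-frequency energy ratio for one mono frame."""
    # Zero-crossing rate: fraction of adjacent sample pairs that change sign
    zcr = float(np.mean(np.abs(np.diff(np.signbit(frame).astype(np.int8)))))
    # Hann window reduces spectral leakage before the FFT
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    # Spectral centroid: magnitude-weighted mean frequency
    centroid = float(np.sum(freqs * mag) / (np.sum(mag) + 1e-12))
    # High-frequency ratio: share of spectral energy above an assumed 4 kHz split
    hf_ratio = float(np.sum(mag[freqs >= 4000] ** 2) / (np.sum(mag ** 2) + 1e-12))
    return {"zcr": zcr, "centroid": centroid, "hf_ratio": hf_ratio}

# A pure 1 kHz tone: ZCR near 2 * 1000 / 48000, centroid near 1 kHz, little HF energy
t = np.arange(1024) / 48_000
feats = spectral_features(np.sin(2 * np.pi * 1000 * t))
```

Noisy, hissy sounds push ZCR and the high-frequency ratio up; tonal sounds keep the centroid near their fundamental, which is what makes simple heuristic classification workable.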

πŸ—οΈ Architecture

┌─────────────┐     ┌──────────────────────────────────────┐
│  Phone Mic  │────▶│  Oboe (48kHz, Exclusive, LowLatency) │
│  (Built-in) │     └──────────────────┬───────────────────┘
└─────────────┘                        │
                                       ▼
                     ┌──────────────────────────┐
                     │    AudioEngine (C++)     │
                     │                          │
                     │  ┌─ 5x Pre-Amp ───────┐  │
                     │  │  processBlock()    │  │
                     │  └─────────┬──────────┘  │
                     │            │             │
                     │  ┌─────────▼──────────┐  │
                     │  │ Source Separation  │  │
                     │  │ (TFLite 4-stem)    │  │
                     │  └─────────┬──────────┘  │
                     │            │             │
                     │  ┌─────────▼──────────┐  │
                     │  │ Spectral Features  │  │
                     │  │ ZCR + Centroid     │  │
                     │  └─────────┬──────────┘  │
                     │            │             │
                     │  ┌─────────▼──────────┐  │
                     │  │ Heuristic Classify │  │
                     │  │ + Voting Window    │  │
                     │  └─────────┬──────────┘  │
                     │            │             │
                     │  ┌─────────▼──────────┐  │
                     │  │ Discovery Map      │  │
                     │  │ (mutex-guarded)    │  │
                     │  └────────────────────┘  │
                     └─────────────┬────────────┘
                                   │ JNI Bridge
                     ┌─────────────▼──────────────┐
                     │   AudioModule (Java)       │
                     │   React Native Bridge      │
                     └─────────────┬──────────────┘
                                   │ JSON
                     ┌─────────────▼──────────────┐
                     │   App.tsx (React Native)   │
                     │   Dynamic FlatList UI      │
                     │   Glassmorphism + Neon     │
                     └────────────────────────────┘

πŸ“ Project Structure

SSYNC/
├── App.tsx                          # React Native UI (Discovery Mode)
├── index.js                         # Entry point
├── package.json
├── android/
│   ├── app/
│   │   ├── build.gradle
│   │   └── src/main/
│   │       ├── AndroidManifest.xml
│   │       ├── assets/
│   │       │   └── auramix_model.tflite    # ML model
│   │       ├── java/com/auramixapp/
│   │       │   ├── AudioModule.java        # JNI ↔ React bridge
│   │       │   ├── AudioPackage.java       # RN package registration
│   │       │   ├── MainActivity.kt
│   │       │   └── MainApplication.kt
│   │       ├── cpp/                        # C++ Audio Engine
│   │       │   ├── AudioEngine.h           # Engine header
│   │       │   ├── AudioEngine.cpp         # DSP + AI classifier
│   │       │   ├── AudioBridge.cpp         # JNI bridge
│   │       │   ├── OboeAudioStream.h       # Oboe header
│   │       │   └── OboeAudioStream.cpp     # Low-latency audio I/O
│   │       └── jni/
│   │           ├── CMakeLists.txt          # Build config
│   │           └── OnLoad.cpp
│   └── gradle.properties
└── ml/
    └── train_and_export.py                 # TFLite model training

🚀 Getting Started

Prerequisites

  • Node.js 18+
  • Java JDK 17
  • Android SDK (API 33+)
  • Android NDK (for C++ compilation)
  • Python 3.10+ (for model training only)
  • An Android phone with USB debugging enabled

Installation

# 1. Clone the repository
git clone https://github.com/EmirhanEtem/SSYNC.git
cd SSYNC

# 2. Install Node dependencies
npm install

# 3. Build and run on device
npx react-native run-android

Build Standalone APK

# Bundle JavaScript into the APK
npx react-native bundle \
  --platform android --dev false \
  --entry-file index.js \
  --bundle-output android/app/src/main/assets/index.android.bundle \
  --assets-dest android/app/src/main/res/

# Build release APK
cd android && ./gradlew assembleRelease

# Output: android/app/build/outputs/apk/release/app-release.apk

Train the ML Model (Optional)

cd ml
pip install tensorflow numpy
python train_and_export.py
# Output: auramix_model.tflite → copy to android/app/src/main/assets/

🔧 Configuration

Audio Engine Constants (AudioEngine.h)

Constant              Default  Description
kSampleRate           48000    Microphone sample rate (Hz)
kInputPreAmp          5.0      Digital input gain multiplier
kClassifyEveryN       8        Classify every N audio callbacks
kVotingWindowSize     4        Frames in the consensus voting window
kConsensusThreshold   0.50     Minimum vote ratio to confirm a detection
kConfidenceThreshold  0.35     Minimum confidence per detection
kCooldownMs           8000     Time (ms) before an inactive slider is removed
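Putting the voting constants together, the confirmation rule works roughly like this (a Python re-creation for illustration; the real logic lives in AudioEngine.cpp):

```python
from collections import deque

K_VOTING_WINDOW_SIZE = 4      # kVotingWindowSize
K_CONSENSUS_THRESHOLD = 0.50  # kConsensusThreshold
K_CONFIDENCE_THRESHOLD = 0.35 # kConfidenceThreshold

class ConsensusVoter:
    """Confirm a label only once enough recent frames agree on it."""
    def __init__(self):
        self.votes = deque(maxlen=K_VOTING_WINDOW_SIZE)

    def push(self, label: str, confidence: float) -> bool:
        # Low-confidence frames vote for "nothing" rather than a label
        self.votes.append(label if confidence >= K_CONFIDENCE_THRESHOLD else None)
        agreeing = sum(1 for v in self.votes if v == label)
        # Ratio is taken over the full window, so 2 of 4 votes are always required
        return agreeing / K_VOTING_WINDOW_SIZE >= K_CONSENSUS_THRESHOLD

voter = ConsensusVoter()
voter.push("Speech", 0.9)              # 1/4 votes: not yet confirmed
confirmed = voter.push("Speech", 0.8)  # 2/4 votes: meets the 50% threshold
```

A single noisy frame can never spawn a slider; at least two agreeing classifications inside the window are needed, which is what keeps false detections off the screen.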

Debug Logging

# View real-time AI detections in Logcat
adb logcat -s AuraMixAI

# Example output:
# AuraMixAI: Detected [Speech: 0.90, Voice: 0.70] | inputDb=-15.2 zcr=0.230
# AuraMixAI: NEW DISCOVERY: [Speech] category=voice confidence=100% votes=4/4

πŸ› οΈ Tech Stack

Layer           Technology
Frontend        React Native 0.84 (New Architecture)
Bridge          JNI (Java ↔ C++) + NativeModules (JS ↔ Java)
Audio I/O       Google Oboe (AAudio/OpenSL ES)
DSP Engine      Custom C++17, lock-free processing
ML Inference    TensorFlow Lite C API
Classification  Spectral heuristics (ZCR, Centroid, HFR)
Build System    Gradle + CMake + Metro

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License – see the LICENSE file for details.


👤 Author

Emirhan Etem


Built with ❤️ and C++ for real-time audio
