SonicAI


Browser-Based AI Text-to-Song Generator

Live Demo | Paper

HTML5 | CSS3 | JavaScript | Web Audio API | Speech Synthesis | Vercel | MIT License


Transform text and lyrics into music, entirely in your browser, no server required.


🎯 Overview

SonicAI is a real-time text-to-song generation system that runs entirely client-side in the browser. It converts lyrics into musical compositions using the Web Audio API for instrument synthesis and formant vocal modeling for singing voices, paired with the browser's native Speech Synthesis API for lyric vocalization.

Zero dependencies. Single HTML file. Instant music.


✨ Key Features

| Feature | Description |
| --- | --- |
| 🎹 6 Genres | Pop, Rock, Jazz, Electronic, Classical, Lo-fi, each with unique scales, chords, and drum patterns |
| 🎤 Dual Vocal Engine | Formant synthesizer for melodic "aah/ooh" vocals + Speech Synthesis for lyric articulation |
| 🎼 Text-to-Melody | Character-to-note mapping algorithm that generates scale-appropriate melodies from any text |
| 📊 Live Visualizer | Real-time 80-bar frequency spectrum analyzer with genre-colored gradients |
| 🇮🇩 Bilingual Presets | 5 original Indonesian songs with English translations |
| 🎚️ Full Controls | Genre, key, tempo (60–180 BPM), volume, play/pause/stop |
| 📱 Responsive | Works on desktop and mobile browsers |
| ⚡ Zero Dependencies | Single index.html file: no npm, no build step, no server |

🚀 Quick Start

Option 1: Live Demo

Visit sonic-ai-dun.vercel.app. No installation is needed.

Option 2: Run Locally

git clone https://github.com/romizone/sonic-ai.git
cd sonic-ai
open index.html   # macOS; use xdg-open on Linux or start on Windows

Option 3: Local Server

python3 -m http.server 3000
# Open http://localhost:3000

🎵 How It Works

┌──────────────┐     ┌──────────────────┐     ┌─────────────┐
│  Input Text  │────▶│  Text-to-Melody  │────▶│   Melody    │──┐
│  / Lyrics    │     │  (char → note)   │     │  Oscillator │  │
└──────────────┘     └──────────────────┘     └─────────────┘  │
                                                               │
┌──────────────┐     ┌──────────────────┐     ┌─────────────┐  │  ┌──────────┐
│ Genre Config │────▶│  Chord Engine    │────▶│  Pad Synth  │──┼─▶│  Master  │
│ (scale,bpm)  │     │  (I-IV-V-I etc)  │     │  + Bass     │  │  │  Gain    │
└──────────────┘     └──────────────────┘     └─────────────┘  │  │          │
                                                               │  │   ┌──────┤
┌──────────────┐     ┌──────────────────┐     ┌─────────────┐  │  │   │Reverb│
│  Formant     │────▶│ Bandpass Filter  │────▶│  Vocal      │──┤  │   └──┬───┘
│  Vocal Syn   │     │ Chain (F1,F2,F3) │     │  "Aah/Ooh"  │  │  │      │
└──────────────┘     └──────────────────┘     └─────────────┘  │  ├──────┘
                                                               │  │
┌──────────────┐     ┌──────────────────┐     ┌─────────────┐  │  │  ┌──────────┐
│  Drum        │────▶│  Kick/Snare/HH   │────▶│  Drum Bus   │──┘  └─▶│ Analyser │
│  Pattern     │     │  Synthesis       │     │             │        │ + Output │
└──────────────┘     └──────────────────┘     └─────────────┘        └──────────┘
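
The Text-to-Melody stage can be illustrated with a small sketch. The README does not spell out the exact mapping, so the scheme below is an assumption: each character's code point selects a scale degree, whitespace becomes a rest, and the names `MAJOR_SCALE`, `midiToFrequency`, and `textToMelody` are hypothetical rather than taken from index.html.

```javascript
// Hypothetical sketch: the real mapping in index.html may differ.
// Semitone offsets of a major (Ionian) scale relative to the root.
const MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11];

// Convert a MIDI note number to a frequency in Hz (A4 = MIDI 69 = 440 Hz).
function midiToFrequency(midi) {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Map each character to a note of the scale; whitespace becomes a rest (null).
function textToMelody(text, rootMidi = 60, scale = MAJOR_SCALE) {
  return Array.from(text).map((ch) => {
    if (/\s/.test(ch)) return null;
    const code = ch.codePointAt(0);
    const degree = code % scale.length;                 // which scale step
    const octave = Math.floor(code / scale.length) % 2; // spread over two octaves
    const midi = rootMidi + scale[degree] + 12 * octave;
    return { char: ch, midi, frequency: midiToFrequency(midi) };
  });
}

// Usage: one entry per character, ready to schedule on an oscillator.
const melody = textToMelody('Senja di Sudirman');
```

Because every pitch is drawn from the scale array, any input text stays in key; swapping the scale array changes the mood without touching the mapping.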

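🎤 Preset Songs

| # | Song | Artist | Genre | BPM | Key |
| --- | --- | --- | --- | --- | --- |
| 1 | Senja di Sudirman | Dian Sastro | Pop | 110 | C |
| 2 | Hujan di Senopati | Wulan | Jazz | 95 | Am |
| 3 | Kereta Terakhir | Davina | Rock | 125 | Em |
| 4 | Kopi dan Janji | Titi Kamal | Lo-fi | 85 | F |
| 5 | Lampu Kota | Davina | Electronic | 128 | G |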

All songs feature Jakarta city themes with both Indonesian and English lyrics.
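
As a rough illustration of how a preset's BPM and key could be turned into numbers an audio engine needs, the sketch below parses a key label and computes the beat length. The field names, the helper functions, and the choice of octave 4 for the root are assumptions, not details from the project.

```javascript
// Preset data from the table above; the object shape is a hypothetical choice.
const presets = [
  { title: 'Senja di Sudirman', artist: 'Dian Sastro', genre: 'Pop', bpm: 110, key: 'C' },
  { title: 'Hujan di Senopati', artist: 'Wulan', genre: 'Jazz', bpm: 95, key: 'Am' },
  { title: 'Kereta Terakhir', artist: 'Davina', genre: 'Rock', bpm: 125, key: 'Em' },
  { title: 'Kopi dan Janji', artist: 'Titi Kamal', genre: 'Lo-fi', bpm: 85, key: 'F' },
  { title: 'Lampu Kota', artist: 'Davina', genre: 'Electronic', bpm: 128, key: 'G' },
];

// Semitone offsets of the natural note names from C.
const NOTE_OFFSETS = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };

// Parse a key label such as "C", "Am", or "F#m" into a pitch class and a mode.
function parseKey(label) {
  const match = /^([A-G])([#b]?)(m?)$/.exec(label);
  if (!match) throw new Error(`Unrecognized key: ${label}`);
  const [, letter, accidental, minor] = match;
  const shift = accidental === '#' ? 1 : accidental === 'b' ? -1 : 0;
  const pitchClass = (NOTE_OFFSETS[letter] + shift + 12) % 12;
  return { pitchClass, mode: minor ? 'minor' : 'major' };
}

// Root frequency in octave 4 (C4 = MIDI 60, A4 = MIDI 69 = 440 Hz).
function rootFrequency(label) {
  const midi = 60 + parseKey(label).pitchClass;
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Length of one quarter-note beat in seconds.
function secondsPerBeat(bpm) {
  return 60 / bpm;
}

// Usage: derive timing and tuning for every preset.
const timings = presets.map((p) => ({
  title: p.title,
  beat: secondsPerBeat(p.bpm),
  root: rootFrequency(p.key),
}));
```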


๐Ÿ—๏ธ Architecture

Audio Synthesis Engine

Layer Technology Purpose
Melody OscillatorNode + LowpassFilter Genre-specific waveforms with vibrato
Vocals 3x Detuned Oscillators + BandpassFilter chain Formant synthesis (vowel-like "aah/ooh")
Speech SpeechSynthesisUtterance Lyric articulation with pitch/rate tuning
Chords Detuned OscillatorNode pairs + LowpassFilter Pad sounds with smooth crossfade envelopes
Bass OscillatorNode + LowpassFilter (400Hz) Genre-specific waveform bass lines
Drums OscillatorNode + AudioBuffer (noise) Synthesized kick, snare, hi-hat
Reverb ConvolverNode (procedural impulse) Exponential decay with early reflections
Visualizer AnalyserNode + Canvas 2D 80-bar frequency spectrum at 2x resolution
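
The Reverb row describes a procedurally generated impulse response. A minimal sketch of that idea follows, assuming white noise under an exponential envelope plus a handful of early-reflection taps. The tap times, tap gains, and the function names `makeImpulseResponse` and `createReverb` are illustrative assumptions; the project's actual constants are not given in this README.

```javascript
// Build one channel of a synthetic impulse response: white noise under an
// exponential decay envelope, plus a few discrete early reflections.
// All constants here are illustrative assumptions, not values from SonicAI.
function makeImpulseResponse(sampleRate, decaySeconds, random = Math.random) {
  const length = Math.floor(sampleRate * decaySeconds);
  const data = new Float32Array(length);
  // The envelope falls to about -60 dB (amplitude 0.001) at the end of the buffer.
  const k = Math.log(1000) / length;
  for (let i = 0; i < length; i++) {
    data[i] = (random() * 2 - 1) * Math.exp(-k * i);
  }
  // Early reflections: sparse, louder taps in the first 50 ms, as [seconds, gain].
  const taps = [[0.007, 0.6], [0.019, 0.45], [0.031, 0.3], [0.047, 0.2]];
  for (const [time, gain] of taps) {
    const index = Math.floor(time * sampleRate);
    if (index < length) data[index] += gain;
  }
  return data;
}

// Browser-only: wrap the impulse in a stereo AudioBuffer and attach it to a
// ConvolverNode. `ctx` is an AudioContext supplied by the caller.
function createReverb(ctx, decaySeconds) {
  const length = Math.floor(ctx.sampleRate * decaySeconds);
  const buffer = ctx.createBuffer(2, length, ctx.sampleRate);
  for (let channel = 0; channel < 2; channel++) {
    buffer.copyToChannel(makeImpulseResponse(ctx.sampleRate, decaySeconds), channel);
  }
  const convolver = ctx.createConvolver();
  convolver.buffer = buffer;
  return convolver;
}
```

Generating each channel separately gives the two channels uncorrelated noise, which tends to sound wider than copying one channel to both.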

Genre Configurations

Each genre defines a unique combination of:

  • Scale: Ionian, Minor, Blues, Pentatonic, Dorian
  • Chord Progression: I-IV-V-I, i-iv-III-i, Imaj7-IVmaj7, etc.
  • Waveforms: sine, triangle, sawtooth, square
  • Formant Frequencies: F1/F2/F3 tuning for vocal character
  • Drum Pattern: Beat placement and swing feel
  • Effects: Reverb decay (1.2s–4.0s), filter cutoff, wet/dry mix
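
A genre definition of this kind could be expressed as a plain object, with chords derived from scale degrees. The sketch below is a guess at the shape: the `SCALES` and `GENRES` tables, every numeric value in them, and the helper names are placeholders, and only the progressions I-IV-V-I and i-iv-III-i come from the list above.

```javascript
// Scales as semitone offsets from the root. Only two are shown here.
const SCALES = {
  ionian: [0, 2, 4, 5, 7, 9, 11],
  minor: [0, 2, 3, 5, 7, 8, 10],
};

// Hypothetical genre entries: every number below is a placeholder.
const GENRES = {
  pop: {
    scale: 'ionian',
    progression: [0, 3, 4, 0],   // I-IV-V-I as zero-based scale degrees
    waveform: 'triangle',
    formants: [800, 1150, 2900], // F1, F2, F3 in Hz, roughly an "ah" vowel
    reverbDecay: 1.2,            // seconds
  },
  rock: {
    scale: 'minor',
    progression: [0, 3, 2, 0],   // i-iv-III-i
    waveform: 'sawtooth',
    formants: [800, 1150, 2900],
    reverbDecay: 2.0,
  },
};

// Stack two thirds on a scale degree, wrapping into the next octave as needed.
function buildTriad(scale, degree, rootMidi = 60) {
  return [0, 2, 4].map((step) => {
    const index = degree + step;
    const octave = Math.floor(index / scale.length);
    return rootMidi + scale[index % scale.length] + 12 * octave;
  });
}

// Expand a genre's progression into arrays of MIDI note numbers.
function chordsForGenre(name, rootMidi = 60) {
  const genre = GENRES[name];
  const scale = SCALES[genre.scale];
  return genre.progression.map((degree) => buildTriad(scale, degree, rootMidi));
}

// Usage: chords for Pop in C, starting from middle C (MIDI 60).
const popChords = chordsForGenre('pop');
```

Keeping the progression as scale degrees rather than absolute notes means the same genre entry works in any key: only `rootMidi` changes.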

📊 Comparison with Deep Learning

| Aspect | SonicAI | Suno AI |
| --- | --- | --- |
| Voice Quality | Formant synth + TTS | Neural vocal synthesis |
| Latency | Instant (client-side) | Seconds–minutes |
| Dependencies | None (browser only) | GPU servers |
| Offline | Full support | Requires internet |
| Cost | Free & open source | Subscription |
| Privacy | 100% local | Data sent to servers |
| File Size | ~40KB | Multi-GB models |

📄 Academic Paper

The full technical paper is available at SonicAI_Paper.pdf, covering:

  • System architecture and audio signal flow
  • Text-to-melody conversion algorithm
  • Formant vocal synthesis with bandpass filter chains
  • Genre-adaptive chord progression and drum pattern engines
  • Chrome autoplay policy compliance
  • Comparison with deep learning approaches
  • Limitations and future work
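
On the autoplay topic in particular: browsers such as Chrome may start an AudioContext in the "suspended" state until the user interacts with the page. A minimal sketch of the usual workaround follows. The helper names and the `playButton` and `startPlayback` parameters are hypothetical, and this is not necessarily how the paper or index.html handles it.

```javascript
// Resume a suspended AudioContext. Call this from inside a click or tap
// handler so the browser treats it as user-initiated.
async function ensureAudioRunning(ctx) {
  if (ctx.state === 'suspended') {
    await ctx.resume();
  }
  return ctx.state;
}

// Browser-only wiring. `playButton` and `startPlayback` are hypothetical:
// a DOM element and a callback that begins scheduling notes.
function attachUnlock(playButton, ctx, startPlayback) {
  playButton.addEventListener('click', async () => {
    await ensureAudioRunning(ctx);
    startPlayback();
  });
}
```

Creating the context lazily inside the first click handler is another common option; resuming an existing context keeps the audio graph construction separate from the unlock step.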

๐Ÿ› ๏ธ Tech Stack

HTML5 CSS3 JavaScript Web Audio API Vercel


📝 License

This project is open source and available under the MIT License.


👤 Author

Romi Nur Ismanto


Built with Web Audio API | Deployed on Vercel | Made with ❤️
