## 🎵 **A complete Python code snippet for creating an audio spectrum visualizer in Google Colab by Worachat W., Ph.D. 2025** 🎵  

This visualizer allows you to upload a WAV audio file and displays a bar-based frequency spectrum that updates in real-time as the audio plays. It uses Matplotlib for visualization and IPython for audio playback.

---

## Audio Spectrum Visualizer in Google Colab

### Overview
This code:
1. Uploads an audio file (WAV format) using Google Colab's file upload feature.
2. Processes the audio to compute its frequency spectrum using the Fast Fourier Transform (FFT).
3. Creates an animated bar plot to visualize the spectrum.
4. Provides an audio player to listen to the file while watching the visualization.

### Prerequisites
- Run this code in a Google Colab notebook.
- Ensure your audio file is in WAV format (e.g., `sample.wav`).

---

### How to Use
1. **Copy the Code**: Paste this code into a cell in a Google Colab notebook.
2. **Run the Cell**: Execute the cell by pressing `Shift + Enter`.
3. **Upload a File**: A file upload prompt will appear. Upload a WAV audio file from your computer.
4. **View the Output**:
   - An animated bar plot will appear, showing the frequency spectrum.
   - An audio player will display below the animation. Click "Play" to listen to the audio while watching the visualization.

### Explanation
- **File Upload**: Uses `files.upload()` to let you upload a WAV file.
- **Audio Processing**: The `wave` library reads the file, and the data is converted to a NumPy array. Stereo audio is simplified to mono by taking the first channel.
- **Short-Time Fourier Transform (STFT)**: The audio is split into overlapping windows (1024 samples each, shifted by 512 samples), and each window is tapered with a Hamming window. The FFT of each window gives the frequency spectrum at that moment.
- **Visualization**: A bar plot with 64 bars (representing the first 64 frequency bins, roughly 0 to 2.7 kHz at a 44.1 kHz sample rate) updates dynamically using `FuncAnimation`. Each bar’s height reflects the magnitude of a frequency bin.
- **Audio Playback**: `IPython.display.Audio` creates a playable audio widget with the processed audio data.
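
To make the windowing arithmetic concrete, here is a minimal sketch that counts the STFT windows and maps the 64 displayed bins to frequencies. It assumes a 44.1 kHz sample rate and uses a synthetic 1 kHz sine in place of an uploaded file; the names `signal`, `n_windows`, and `bin_freqs` are illustrative and do not appear in the notebook code.

```python
import numpy as np

sample_rate = 44100   # assumed sample rate in Hz
window_size = 1024    # samples per FFT window, as in the notebook
hop_size = 512        # shift between successive windows, as in the notebook
n_bins = 64           # number of bars shown

# One second of a synthetic 1 kHz sine stands in for real audio
t = np.arange(sample_rate) / sample_rate
signal = 10000 * np.sin(2 * np.pi * 1000 * t)

# Number of complete windows that fit into the signal
n_windows = (len(signal) - window_size) // hop_size + 1

# Centre frequency of displayed bin k is k * sample_rate / window_size
bin_freqs = np.arange(n_bins) * sample_rate / window_size

# Spectrum of the first window, tapered with a Hamming window
frame = signal[:window_size] * np.hamming(window_size)
magnitudes = np.abs(np.fft.fft(frame)[:n_bins])
peak_bin = int(np.argmax(magnitudes))

print(f"{n_windows} windows, bars span 0 to {bin_freqs[-1]:.0f} Hz")
print(f"Peak at bin {peak_bin} (about {bin_freqs[peak_bin]:.0f} Hz)")
```

Each bin is about 43 Hz wide at these settings, so raising `n_bins` or `window_size` changes which part of the spectrum the bars cover.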

### Notes
- **Supported Format**: This code works with WAV files. For other formats (e.g., MP3), you’d need a library like `librosa` (install with `!pip install librosa` and modify the code).
- **Y-Axis Scaling**: The call `ax.set_ylim(0, 10000)` sets a fixed limit. If the bars are too small or clipped (loud 16-bit audio can produce magnitudes well above 10000), adjust this value to suit your audio’s volume.
- **Performance**: For long audio files, animation generation may take time. Use a short clip (e.g., 10-20 seconds) for best results.
- **Synchronization**: The animation and audio aren’t perfectly synced but are close enough for a visual effect. Start the audio manually after the animation appears.
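
One possible way to pick the y-axis limit from the data instead of guessing is sketched below. The helper `suggest_ylim` is hypothetical (it is not part of the notebook), and the synthetic tone stands in for `audio_np`; the idea is simply to take a high percentile of the tallest bar in each window.

```python
import numpy as np

def suggest_ylim(audio, window_size=1024, hop_size=512, n_bins=64, percentile=99):
    """Return a y-axis limit that covers `percentile` percent of per-window peak bars."""
    window = np.hamming(window_size)
    peaks = []
    for start in range(0, len(audio) - window_size + 1, hop_size):
        frame = audio[start:start + window_size] * window
        # Tallest of the displayed bars in this window
        peaks.append(np.abs(np.fft.fft(frame)[:n_bins]).max())
    if not peaks:
        raise ValueError("audio is shorter than one window")
    return float(np.percentile(peaks, percentile))

# Synthetic stand-in for audio_np: one second of a 440 Hz tone as int16
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
demo_audio = (8000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

ylim = suggest_ylim(demo_audio)
print(f"Suggested call: ax.set_ylim(0, {ylim:.0f})")
```

Using a percentile rather than the absolute maximum keeps a single loud transient from flattening every other frame.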

---

In [None]:
# Import necessary libraries (all of these are preinstalled in Colab)

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from google.colab import files
import io
import wave
from IPython.display import Audio, display, HTML

# Step 1: Upload the audio file
print("Please upload a WAV audio file:")
uploaded = files.upload()

# Access the uploaded file
for filename in uploaded.keys():
    audio_data = uploaded[filename]
    audio_file = io.BytesIO(audio_data)

# Step 2: Read the audio file
with wave.open(audio_file, 'rb') as wf:
    sample_rate = wf.getframerate()  # e.g., 44100 Hz
    n_channels = wf.getnchannels()   # 1 for mono, 2 for stereo
    sample_width = wf.getsampwidth() # 1 for 8-bit, 2 for 16-bit
    n_frames = wf.getnframes()       # Total number of frames
    audio_data = wf.readframes(n_frames)  # Raw audio bytes

# Step 3: Convert audio data to NumPy array
if sample_width == 2:
    audio_np = np.frombuffer(audio_data, dtype=np.int16)
elif sample_width == 1:
    # 8-bit WAV is unsigned; widen to int16 first so subtracting 128 cannot wrap around
    audio_np = np.frombuffer(audio_data, dtype=np.uint8).astype(np.int16) - 128
else:
    raise ValueError("Unsupported sample width")

# If stereo, use the first channel only
if n_channels > 1:
    audio_np = audio_np.reshape(-1, n_channels)[:, 0]

# Step 4: Define parameters for Short-Time Fourier Transform (STFT)
window_size = 1024  # Number of samples per FFT window
hop_size = 512     # Number of samples between successive windows
n_bins = 64        # Number of frequency bins to display
n_frames = int(np.floor((len(audio_np) - window_size) / hop_size)) + 1  # Total frames

# Step 5: Set up the visualization plot
fig, ax = plt.subplots(figsize=(10, 5))
ax.set_facecolor('black')          # Black background
ax.set_ylim(0, 10000)              # Y-axis limit (adjust if needed)
ax.set_xlim(-0.5, n_bins - 0.5)    # X-axis limit for 64 bars
ax.axis('off')                     # Hide axes
bars = ax.bar(range(n_bins), np.zeros(n_bins), color='green', width=0.8)  # Create bars

# Step 6: Define the animation update function
def update(frame):
    start = frame * hop_size
    if start + window_size > len(audio_np):
        return bars  # Stop if window exceeds audio length
    # Extract window and apply Hamming window
    windowed_data = audio_np[start:start + window_size] * np.hamming(window_size)
    # Compute FFT and get magnitudes
    fft_data = np.fft.fft(windowed_data, n=window_size)
    magnitudes = np.abs(fft_data[:n_bins])
    # Update bar heights
    for bar, height in zip(bars, magnitudes):
        bar.set_height(height)
    return bars

# Step 7: Create the animation
interval = (hop_size / sample_rate) * 1000  # Frame interval in milliseconds
ani = FuncAnimation(fig, update, frames=n_frames, interval=interval, blit=False)

# Step 8: Display the animation
display(HTML(ani.to_jshtml()))
plt.close(fig)  # Prevent Colab from also rendering a static copy of the figure

# Step 9: Display the audio player
display(Audio(audio_np, rate=sample_rate))

✅ The version below extends the visualizer to show **both a waveform and a spectrogram** that update frame by frame, using `matplotlib.animation`.

**Key updates:**

* A subplot layout was added to show **two graphs**: a waveform (top) and a spectrogram (bottom).
* FFT data is visualized as a growing spectrogram (`imshow`) that updates in real-time.
* The waveform scrolls with each audio window.
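
As an alternative to growing the spectrogram one column per frame, the whole matrix can be computed up front. The sketch below is not what the notebook does; it assumes NumPy 1.20 or newer (for `sliding_window_view`), and the function name `spectrogram` is illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def spectrogram(audio, window_size=1024, hop_size=512):
    """Return FFT magnitudes with shape (window_size // 2 + 1, n_windows)."""
    # Every window of length window_size, then keep each hop_size-th one
    frames = sliding_window_view(audio, window_size)[::hop_size]
    windowed = frames * np.hamming(window_size)
    # rfft along each frame; transpose so frequency runs along the rows
    return np.abs(np.fft.rfft(windowed, axis=1)).T

# Synthetic check: a 1 kHz sine sampled at 8 kHz falls exactly on bin 128
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
demo = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(demo)
print(spec.shape)
```

With the matrix precomputed, an animation callback only needs to slice `spec[:, :frame + 1]`, which avoids re-stacking the array on every frame.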

In [None]:
# Import necessary libraries (all of these are preinstalled in Colab)
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from google.colab import files
import io
import wave
from IPython.display import Audio, display, HTML

# Step 1: Upload the audio file
print("Please upload a WAV audio file:")
uploaded = files.upload()

# Access the uploaded file
for filename in uploaded.keys():
    audio_data = uploaded[filename]
    audio_file = io.BytesIO(audio_data)

# Step 2: Read the audio file
with wave.open(audio_file, 'rb') as wf:
    sample_rate = wf.getframerate()
    n_channels = wf.getnchannels()
    sample_width = wf.getsampwidth()
    n_frames = wf.getnframes()
    audio_data = wf.readframes(n_frames)

# Step 3: Convert audio data to NumPy array
if sample_width == 2:
    audio_np = np.frombuffer(audio_data, dtype=np.int16)
elif sample_width == 1:
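    # 8-bit WAV is unsigned; widen to int16 first so subtracting 128 cannot wrap around
    audio_np = np.frombuffer(audio_data, dtype=np.uint8).astype(np.int16) - 128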
else:
    raise ValueError("Unsupported sample width")

if n_channels > 1:
    audio_np = audio_np.reshape(-1, n_channels)[:, 0]

# Step 4: Define parameters
window_size = 1024
hop_size = 512
n_bins = 512
n_frames = int(np.floor((len(audio_np) - window_size) / hop_size)) + 1

# Step 5: Set up plots for both waveform and spectrogram
fig, (ax_wave, ax_spec) = plt.subplots(2, 1, figsize=(12, 8))

# Waveform setup
ax_wave.set_title("Real-Time Waveform")
ax_wave.set_xlim(0, window_size)
ax_wave.set_ylim(-2**15, 2**15)
line_wave, = ax_wave.plot(np.zeros(window_size), color='cyan')

# Spectrogram setup
ax_spec.set_title("Real-Time Spectrogram")
ax_spec.set_facecolor('black')
spec_img = ax_spec.imshow(np.zeros((n_bins, 1)), aspect='auto', origin='lower',
                          extent=[0, 1, 0, sample_rate/2], cmap='magma')

# Step 6: Define animation update function
def update(frame):
    start = frame * hop_size
    if start + window_size > len(audio_np):
        return line_wave, spec_img

    # Extract current audio frame
    frame_data = audio_np[start:start + window_size]

    # Update waveform
    line_wave.set_ydata(frame_data)

    # Apply window and compute FFT
    windowed = frame_data * np.hamming(window_size)
    fft_data = np.abs(np.fft.rfft(windowed))
    fft_data = fft_data[:n_bins].reshape(-1, 1)

    # Update spectrogram
    current_spec = spec_img.get_array()
    updated_spec = np.hstack((current_spec, fft_data))
    spec_img.set_array(updated_spec)
    spec_img.autoscale()  # Rescale the color limits to the data seen so far
    spec_img.set_extent([0, updated_spec.shape[1], 0, sample_rate/2])

    return line_wave, spec_img

# Step 7: Create the animation
interval = (hop_size / sample_rate) * 1000
ani = FuncAnimation(fig, update, frames=n_frames, interval=interval, blit=False)

# Step 8: Display animation and audio
plt.tight_layout()
display(HTML(ani.to_jshtml()))
display(Audio(audio_np, rate=sample_rate))

## 🎵 **An improved version of the audio visualization code, optimized for Google Colab with better UI elements, error handling, and user experience** 🎵

This version enhances the audio visualization code for Google Colab with the following improvements:

## 🚀 **Key Enhancements:**

### **1. Better File Support**
- Uses `librosa` for loading multiple audio formats (MP3, FLAC, M4A, OGG, etc.)
- Fallback to `wave` module for WAV files
- Improved error handling and user feedback

### **2. Interactive UI Controls**
- **Window Size & Hop Size sliders** for setting the FFT parameters before each run
- **Colormap selector** with multiple visualization styles
- **Speed control** to adjust the animation speed (0.1x to 5x); the audio player itself is unaffected
- **Play/Stop buttons** for better control
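
The speed control boils down to scaling the delay between animation frames. A minimal sketch of that calculation follows; the helper name `frame_interval_ms` is hypothetical, and the formula mirrors the one used later in the code.

```python
def frame_interval_ms(hop_size, sample_rate, speed=1.0):
    """Delay between animation frames in milliseconds, never below 1 ms."""
    base = hop_size / sample_rate * 1000   # real-time duration of one hop
    return max(1, int(base / speed))       # truncated to whole milliseconds

for speed in (0.5, 1.0, 2.0):
    print(speed, frame_interval_ms(512, 44100, speed))
```

Because the interval is truncated to whole milliseconds, the animation runs slightly faster than the audio even at 1x, which is one reason the two drift apart over a long clip.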

### **3. Enhanced Visualizations**
- **Dark theme** optimized for Colab
- **Styled waveform** drawn as a semi-transparent cyan line
- **dB-scale spectrogram** with colorbar
- **Progress indicator** in the title
- **Grid lines and labels** for better readability

### **4. User Experience**
- **Clear status messages** with emojis for better feedback
- **Organized workflow** with step-by-step guidance
- **Professional styling** with proper layouts
- **Audio normalization** to prevent clipping
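
Peak normalization can be sketched as follows. This is an illustration under the assumption that the audio arrives as floating-point samples (as `librosa.load` returns); the helper name `to_int16` is hypothetical.

```python
import numpy as np

def to_int16(audio):
    """Scale float audio so its peak fits in [-1, 1], then convert to int16."""
    audio = np.asarray(audio, dtype=np.float32)
    peak = np.max(np.abs(audio)) if audio.size else 0.0
    if peak > 1.0:
        # Use the absolute peak so large negative samples are handled too
        audio = audio / peak
    return (audio * 32767).astype(np.int16)

demo = np.array([0.5, -2.0, 1.0])
print(to_int16(demo))
```

Checking the absolute peak matters: a signal whose largest excursion is negative would slip past a test on `max()` alone and overflow when converted to `int16`.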

### **5. Technical Improvements**
- **Object-oriented design** for better code organization
- **Memory management** with limited spectrogram history
- **Error handling** at multiple levels
- **Widget integration** for interactive controls
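
The dB conversion and the limited spectrogram history can be illustrated together. The sketch below uses `collections.deque` with `maxlen` as one way to cap the history; the notebook code itself uses a plain list with `pop(0)`. The helper `magnitude_to_db` and the `reference` value are assumptions made for this example.

```python
from collections import deque
import numpy as np

def magnitude_to_db(magnitude, reference, floor_db=-60.0):
    """Convert magnitudes to dB relative to `reference`, clipped at `floor_db`."""
    ratio = np.maximum(np.asarray(magnitude, dtype=float) / reference, 1e-10)
    return np.maximum(20 * np.log10(ratio), floor_db)

history = deque(maxlen=200)   # oldest columns drop off automatically
for i in range(250):
    # Fake 4-bin spectrum whose level rises with each frame
    column = magnitude_to_db(np.full(4, i + 1.0), reference=250.0)
    history.append(column)

spec_array = np.column_stack(list(history))
print(spec_array.shape)
```

Expressing magnitudes relative to a reference keeps the values at or below 0 dB, which is what a colorbar range such as -60 to 0 dB expects.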

## 🎯 **Usage:**
1. Run the code in Google Colab
2. Upload your audio file when prompted
3. Adjust the visualization parameters using the sliders
4. Click "Start Visualization" to begin
5. Use the audio player to listen along

The enhanced version provides a much more professional and user-friendly experience while maintaining all the original functionality!

In [None]:
# Enhanced Audio Visualizer for Google Colab
# Install required packages
!pip install librosa -q

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from google.colab import files
import io
import wave
import librosa
from IPython.display import Audio, display, HTML, clear_output
import ipywidgets as widgets
from ipywidgets import FloatSlider, IntSlider, Dropdown
import warnings
warnings.filterwarnings('ignore')

class AudioVisualizer:
    def __init__(self):
        self.audio_np = None
        self.sample_rate = None
        self.filename = None
        self.animation = None
        self.fig = None
        self.is_playing = False

    def upload_and_process_audio(self):
        """Enhanced audio upload with better file support and error handling"""
        print("🎵 Audio Visualizer - Enhanced for Google Colab")
        print("="*50)
        print("📁 Please upload an audio file (WAV, MP3, FLAC, etc.)")
        print("   Supported formats: WAV, MP3, FLAC, M4A, OGG")

        try:
            uploaded = files.upload()

            if not uploaded:
                print("❌ No file uploaded. Please try again.")
                return False

            # Process the uploaded file
            for filename in uploaded.keys():
                self.filename = filename
                print(f"📄 Processing: {filename}")

                # Use librosa for better format support
                audio_data = uploaded[filename]

                # Load audio with librosa (supports many formats)
                try:
                    self.audio_np, self.sample_rate = librosa.load(
                        io.BytesIO(audio_data),
                        sr=None,  # Keep original sample rate
                        mono=True  # Convert to mono
                    )

                    # Normalize audio to prevent clipping (check the absolute peak,
                    # so large negative samples are caught as well)
                    peak = np.max(np.abs(self.audio_np))
                    if peak > 1.0:
                        self.audio_np = self.audio_np / peak

                    # Convert to int16 for visualization
                    self.audio_np = (self.audio_np * 32767).astype(np.int16)

                    duration = len(self.audio_np) / self.sample_rate

                    print(f"✅ Successfully loaded!")
                    print(f"   📊 Sample Rate: {self.sample_rate} Hz")
                    print(f"   ⏱️  Duration: {duration:.2f} seconds")
                    print(f"   📏 Samples: {len(self.audio_np):,}")

                    return True

                except Exception as e:
                    print(f"❌ Error loading audio with librosa: {e}")
                    # Fallback to wave module for WAV files
                    return self._load_with_wave(io.BytesIO(audio_data))

        except Exception as e:
            print(f"❌ Upload failed: {e}")
            return False

    def _load_with_wave(self, audio_file):
        """Fallback method using wave module"""
        try:
            with wave.open(audio_file, 'rb') as wf:
                self.sample_rate = wf.getframerate()
                n_channels = wf.getnchannels()
                sample_width = wf.getsampwidth()
                n_frames = wf.getnframes()
                audio_data = wf.readframes(n_frames)

            if sample_width == 2:
                self.audio_np = np.frombuffer(audio_data, dtype=np.int16)
            elif sample_width == 1:
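                # 8-bit WAV is unsigned; widen to int16 first so subtracting 128 cannot wrap around
                self.audio_np = np.frombuffer(audio_data, dtype=np.uint8).astype(np.int16) - 128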
            else:
                raise ValueError(f"Unsupported sample width: {sample_width}")

            if n_channels > 1:
                self.audio_np = self.audio_np.reshape(-1, n_channels)[:, 0]

            print("✅ Loaded with wave module (WAV format)")
            return True

        except Exception as e:
            print(f"❌ Failed to load audio: {e}")
            return False

    def create_interactive_controls(self):
        """Create interactive widgets for customization"""
        print("\n🎛️ Visualization Controls")
        print("-" * 30)

        # Create interactive widgets
        self.window_size_widget = IntSlider(
            value=1024, min=256, max=4096, step=256,
            description='Window Size:', style={'description_width': 'initial'}
        )

        self.hop_size_widget = IntSlider(
            value=512, min=128, max=2048, step=128,
            description='Hop Size:', style={'description_width': 'initial'}
        )

        self.colormap_widget = Dropdown(
            options=['magma', 'viridis', 'plasma', 'inferno', 'hot', 'cool', 'spring'],
            value='magma', description='Colormap:', style={'description_width': 'initial'}
        )

        self.speed_widget = FloatSlider(
            value=1.0, min=0.1, max=5.0, step=0.1,
            description='Speed:', style={'description_width': 'initial'}
        )

        # Display widgets
        display(widgets.HBox([self.window_size_widget, self.hop_size_widget]))
        display(widgets.HBox([self.colormap_widget, self.speed_widget]))

        # Add buttons
        self.play_button = widgets.Button(
            description="🎬 Start Visualization",
            button_style='success',
            layout=widgets.Layout(width='200px', height='40px')
        )

        self.stop_button = widgets.Button(
            description="⏹️ Stop",
            button_style='danger',
            layout=widgets.Layout(width='100px', height='40px')
        )

        self.play_button.on_click(self._on_play_click)
        self.stop_button.on_click(self._on_stop_click)

        display(widgets.HBox([self.play_button, self.stop_button]))

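    def _on_play_click(self, button):
        """Handle play button click: start, or restart if already running"""
        if self.is_playing and self.fig:
            plt.close(self.fig)  # Discard the previous figure before restarting
        self.create_visualization()
        self.is_playing = True
        self.play_button.description = "🔄 Restart"
        self.play_button.button_style = 'warning'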

    def _on_stop_click(self, button):
        """Handle stop button click"""
        if self.animation:
            self.animation.event_source.stop()
        if self.fig:
            plt.close(self.fig)
        self.is_playing = False
        self.play_button.description = "🎬 Start Visualization"
        self.play_button.button_style = 'success'
        clear_output(wait=True)
        print("⏹️ Visualization stopped")

    def create_visualization(self):
        """Create the enhanced visualization"""
        if self.audio_np is None:
            print("❌ No audio loaded. Please upload a file first.")
            return

        # Get parameters from widgets
        window_size = self.window_size_widget.value
        hop_size = self.hop_size_widget.value
        colormap = self.colormap_widget.value
        speed_multiplier = self.speed_widget.value

        n_bins = window_size // 2
        n_frames = int(np.floor((len(self.audio_np) - window_size) / hop_size)) + 1

        print(f"\n🎨 Creating visualization...")
        print(f"   🖼️  Window Size: {window_size}")
        print(f"   👣 Hop Size: {hop_size}")
        print(f"   🎨 Colormap: {colormap}")
        print(f"   ⚡ Speed: {speed_multiplier}x")

        # Set up the figure with better styling
        plt.style.use('dark_background')
        self.fig, (ax_wave, ax_spec) = plt.subplots(2, 1, figsize=(14, 10))
        self.fig.patch.set_facecolor('black')

        # Enhanced waveform setup
        ax_wave.set_title(f"🌊 Real-Time Waveform - {self.filename}",
                         fontsize=14, color='white', pad=20)
        ax_wave.set_xlim(0, window_size)
        ax_wave.set_ylim(-32768, 32767)
        ax_wave.set_xlabel('Samples', color='white')
        ax_wave.set_ylabel('Amplitude', color='white')
        ax_wave.grid(True, alpha=0.3)
        ax_wave.set_facecolor('#0a0a0a')

        # Create the waveform line (semi-transparent cyan)
        line_wave, = ax_wave.plot(np.zeros(window_size), color='cyan',
                                 linewidth=1.5, alpha=0.8)

        # Enhanced spectrogram setup
        ax_spec.set_title("🌈 Real-Time Spectrogram", fontsize=14, color='white', pad=20)
        ax_spec.set_facecolor('black')
        ax_spec.set_xlabel('Time Frames', color='white')
        ax_spec.set_ylabel('Frequency (Hz)', color='white')

        # Initialize spectrogram at the colorbar floor (-60 dB) so it starts dark
        spec_data = np.full((n_bins, 100), -60.0)
        spec_img = ax_spec.imshow(spec_data, aspect='auto', origin='lower',
                                 extent=[0, 100, 0, self.sample_rate/2],
                                 cmap=colormap, vmin=-60, vmax=0)

        # Add colorbar
        cbar = plt.colorbar(spec_img, ax=ax_spec, shrink=0.8)
        cbar.set_label('Magnitude (dB)', color='white')
        cbar.ax.yaxis.set_tick_params(color='white')

        # Animation variables
        spec_history = []
        frame_counter = 0
        max_history = 200  # Keep last 200 frames

        def update(frame):
            nonlocal frame_counter, spec_history

            start = frame * hop_size
            if start + window_size > len(self.audio_np):
                return line_wave, spec_img

            # Extract current audio frame
            frame_data = self.audio_np[start:start + window_size]

            # Update waveform with the current window of samples
            line_wave.set_ydata(frame_data)

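            # Compute spectrogram
            window = np.hanning(window_size)
            windowed = frame_data * window
            fft_data = np.fft.rfft(windowed)
            magnitude = np.abs(fft_data)[:n_bins]

            # Convert to dB relative to a full-scale int16 sine (about 0 dB), so the
            # values fall inside the colorbar range set by vmin=-60, vmax=0
            full_scale = 32768 * window.sum() / 2
            magnitude_db = 20 * np.log10(np.maximum(magnitude / full_scale, 1e-10))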

            # Add to history
            spec_history.append(magnitude_db)
            if len(spec_history) > max_history:
                spec_history.pop(0)

            # Update spectrogram display
            if len(spec_history) > 1:
                spec_array = np.column_stack(spec_history)
                spec_img.set_array(spec_array)
                spec_img.set_extent([0, len(spec_history), 0, self.sample_rate/2])

            # Update title with progress
            progress = (frame / n_frames) * 100
            ax_wave.set_title(f"🌊 Real-Time Waveform - {self.filename} ({progress:.1f}%)",
                            fontsize=14, color='white', pad=20)

            frame_counter += 1
            return line_wave, spec_img

        # Calculate interval based on speed
        base_interval = (hop_size / self.sample_rate) * 1000
        interval = max(1, int(base_interval / speed_multiplier))

        # Create and start animation
        self.animation = FuncAnimation(
            self.fig, update, frames=n_frames,
            interval=interval, blit=False, repeat=False
        )

        plt.tight_layout()

        # Display the animation and audio player
        print("🎬 Starting visualization...")
        display(HTML(self.animation.to_jshtml()))

        print("\n🔊 Audio Player:")
        # Normalize audio for playback
        audio_normalized = self.audio_np.astype(np.float32) / 32767.0
        display(Audio(audio_normalized, rate=self.sample_rate, autoplay=False))

    def run(self):
        """Main method to run the complete workflow"""
        print("🚀 Initializing Enhanced Audio Visualizer...")

        if self.upload_and_process_audio():
            self.create_interactive_controls()
            print("\n✨ Setup complete! Use the controls above to customize and start the visualization.")
        else:
            print("❌ Failed to load audio. Please try again with a supported audio file.")

# Create and run the visualizer
visualizer = AudioVisualizer()
visualizer.run()