# SonoPleth - Complete Audio Spatial Rendering Pipeline

This notebook runs the complete sonoPleth pipeline, from an ADM BWF WAV file to a spatial audio render. Running `runPipeline.py` executes the entire process demonstrated here. It is easiest to run the notebook directly in VS Code with the Python environment rather than through a Jupyter server. Steps:
0. **Initialize Environment** - Run `init.sh` in a terminal
1. **Setup & Verification** - Check C++ tools and dependencies
2. **Check Audio Channels** - Scan which channels contain audio
3. **Extract ADM Metadata** - Use bwfmetaedit to get ADM XML from WAV
4. **Parse ADM Metadata** - Extract spatial objects and DirectSpeakers
5. **Package for Render** - Split stems and create spatial instructions
6. **VBAP Spatial Render** - Generate multichannel spatial audio using VBAP
7. **Analyze Render** - Create PDF analysis of output channels

**Outputs:**
- `processedData/containsAudio.json` - Channel audio presence
- `processedData/currentMetaData.xml` - Extracted ADM metadata
- `processedData/objectData.json` - Parsed audio objects
- `processedData/directSpeakerData.json` - Parsed DirectSpeaker channels
- `processedData/globalData.json` - Global technical metadata
- `processedData/stageForRender/` - Audio stems and render instructions
- `processedData/spatial_render.wav` - Final multichannel spatial audio
- `processedData/spatial_render_analysis.pdf` - Channel analysis report

## Environment:
In terminal run:
```bash
source init.sh
```
This should only need to be run once, but if the terminal has been closed you may need to run
```bash
source activate.sh
```
to reactivate the environment. See README.md for more information.

## Setup: Import modules and configure pipeline

When prompted to choose an environment in VS Code, choose sonoPleth.


In [1]:
# When prompted to choose an environment in VS Code, choose sonoPleth.
import sys
import os
import json
from pathlib import Path

# Add src directory to path for imports
sys.path.insert(0, '../src')
sys.path.insert(0, 'src')
# Also add the project root to handle src.* imports
sys.path.insert(0, '../')
sys.path.insert(0, '.')

from configCPP import setupCppTools
from analyzeADM.extractMetadata import extractMetaData
from analyzeADM.parser import parseMetadata, getGlobalData
from analyzeADM.checkAudioChannels import exportAudioActivity
from packageADM.packageForRender import packageForRender
from createRender import runVBAPRender
from analyzeRender import analyzeRenderOutput

print(" Modules imported successfully")

 Modules imported successfully


In [2]:
! python ../utils/getExamples.py
# Fetches example ADM and audio files for testing the pipeline


Downloading example file to: /Users/lucian/projects/sonoPleth/notebooks/../sourceData/driveExample1.wav
This may take a while for large files...

Downloading...
From (original): https://drive.google.com/uc?id=16Z73gODkZzCWjYy313FZc6ScG-CCXL4h
From (redirected): https://drive.google.com/uc?id=16Z73gODkZzCWjYy313FZc6ScG-CCXL4h&confirm=t&uuid=7e827555-c611-4a1d-90f4-b495fa9cd6ab
To: /Users/lucian/projects/sonoPleth/sourceData/driveExample1.wav
100%|████████████████████████████████████████| 274M/274M [00:02<00:00, 97.8MB/s]

Download complete!
Saved to: /Users/lucian/projects/sonoPleth/notebooks/../sourceData/driveExample1.wav
File verified: 261.6 MB

Downloading example file to: /Users/lucian/projects/sonoPleth/notebooks/../sourceData/driveExample2.wav
This may take a while for large files...

Downloading...
From (original): https://drive.google.com/uc?id=1-oh0tixJV3C-odKdcM7Ak-ziCv5bNKJB
From (redirected): https://drive.google.com/uc?id=1-oh0tixJV3C-odKdcM7Ak-ziCv5bNKJB&confirm=t&uuid=7

In [3]:
# Configure pipeline parameters - adjust paths since notebook is in notebooks/ subdirectory
sourceADMFile = "../sourceData/driveExampleSpruce.wav"  # Change this to your ADM BWF WAV file
sourceSpeakerLayout = "../vbapRender/allosphere_layout.json"  # Speaker layout for VBAP rendering
createRenderAnalysis = True  # Whether to generate PDF analysis
processedDataDir = "../processedData"
finalOutputRenderFile = "../processedData/spatial_render.wav"
finalOutputRenderAnalysisPDF = "../processedData/spatial_render_analysis.pdf"

print(f"Source ADM file: {sourceADMFile}")
print(f"Speaker layout: {sourceSpeakerLayout}")
print(f"Render analysis: {'Enabled' if createRenderAnalysis else 'Disabled'}")

# Check if source file exists
if not Path(sourceADMFile).exists():
    print(f"\n WARNING: Source file not found: {sourceADMFile}")
    print(" May need to run: python utils/getExamples.py in terminal")
else:
    print(f"✓ Source file found: {sourceADMFile}")

Source ADM file: ../sourceData/driveExampleSpruce.wav
Speaker layout: ../vbapRender/allosphere_layout.json
Render analysis: Enabled
✓ Source file found: ../sourceData/driveExampleSpruce.wav


## Step 1: Setup & Verification

Verifies that all C++ tools and dependencies are properly installed:
- `bwfmetaedit` for ADM metadata extraction
- AlloLib submodules for spatial audio processing
- VBAP renderer compilation

This step only installs/builds what's missing.
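
The "only what's missing" check can be pictured with a short sketch. This is not the code inside `setupCppTools`; `find_missing_tools` is a hypothetical helper that assumes each dependency can be detected either as a command on `PATH` or as a file that already exists on disk.

```python
import shutil
from pathlib import Path


def find_missing_tools(executables, required_paths):
    """Return the names of anything that cannot be found (hypothetical helper).

    executables: command names looked up on PATH.
    required_paths: files or directories that must already exist.
    """
    missing = [name for name in executables if shutil.which(name) is None]
    missing += [str(p) for p in required_paths if not Path(p).exists()]
    return missing


# Illustrative call; an empty list means nothing needs installing or building.
print(find_missing_tools(["bwfmetaedit"], []))
```

A setup routine built this way would install or build only the entries returned in the list.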

In [4]:
print("STEP 1: Verifying C++ tools and dependencies\n")
print("="*60)

if not setupCppTools():
    print("\n✗ Error: C++ tools setup failed")
    print("\nTry re-initializing the project:")
    print("  rm .init_complete && source init.sh")
    raise RuntimeError("C++ tools setup failed - please re-initialize project")

print("\n✓ All C++ tools and dependencies verified")

STEP 1: Verifying C++ tools and dependencies


Setting up C++ tools and dependencies...

Checking for bwfmetaedit...
✓ bwfmetaedit already installed at: /opt/homebrew/bin/bwfmetaedit
✓ Git submodules already initialized
✓ VBAP renderer already built at: /Users/lucian/projects/sonoPleth/vbapRender/build/sonoPleth_vbap_render

✓ C++ tools setup complete!


✓ All C++ tools and dependencies verified


## Step 2: Check Audio Channels

Scans all channels in the ADM BWF WAV file to detect which contain audio above a threshold (-100 dBFS by default).
This helps filter empty/silent channels before processing and provides insight into the channel layout.
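
As a rough illustration of the threshold test, the sketch below computes an RMS level in dBFS for one channel and compares it with the threshold. It is not the implementation of `exportAudioActivity`: the function names are hypothetical, the -200 dB floor is an assumption inferred from the `rms_db=-200.0` values in the log, and whether the real comparison is strict is also an assumption.

```python
import math


def rms_dbfs(samples, floor_db=-200.0):
    """RMS level of one channel in dBFS; samples are floats in [-1.0, 1.0]."""
    if not samples:
        return floor_db
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square <= 0.0:
        # Digital silence has no finite level, so report the floor instead.
        return floor_db
    # 10*log10(mean square) equals 20*log10(RMS).
    return max(floor_db, 10.0 * math.log10(mean_square))


def contains_audio(samples, threshold_db=-100.0):
    """True when the channel's RMS level is above the threshold (assumed strict)."""
    return rms_dbfs(samples) > threshold_db


silent = [0.0] * 1000
quiet_tone = [0.001 * math.sin(2 * math.pi * n / 100) for n in range(1000)]
print(contains_audio(silent), contains_audio(quiet_tone))
```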

In [9]:
print("STEP 2: Checking audio channels for content\n")
print("="*60)

exportAudioActivity(sourceADMFile, output_path="../processedData/containsAudio.json", threshold_db=-100)

# Show summary of audio channel activity
with open("../processedData/containsAudio.json", "r") as f:
    audio_data = json.load(f)
    
active_channels = sum(1 for ch in audio_data["channels"] if ch["contains_audio"])
total_channels = len(audio_data["channels"])

print(f"\n✓ Audio channel analysis complete:")
print(f"  Total channels: {total_channels}")
print(f"  Active channels: {active_channels}")
print(f"  Silent channels: {total_channels - active_channels}")
print(f"  Threshold: -100 dBFS")

# Show first few active channels
active_channel_list = [i+1 for i, ch in enumerate(audio_data["channels"]) if ch["contains_audio"]]
if len(active_channel_list) > 10:
    print(f"  Active channel numbers: {active_channel_list[:10]}... (showing first 10)")
else:
    print(f"  Active channel numbers: {active_channel_list}")

STEP 2: Checking audio channels for content

No file to delete at: /Users/lucian/projects/sonoPleth/processedData/containsAudio.json
Scanning 41 channels in '../sourceData/driveExampleSpruce.wav'...
  Channel 1/41 scanned (rms_db=-50.34, contains_audio=True)
  Channel 2/41 scanned (rms_db=-46.71, contains_audio=True)
  Channel 3/41 scanned (rms_db=-200.0, contains_audio=False)
  Channel 4/41 scanned (rms_db=-200.0, contains_audio=False)
  Channel 5/41 scanned (rms_db=-200.0, contains_audio=False)
  Channel 6/41 scanned (rms_db=-200.0, contains_audio=False)
  Channel 7/41 scanned (rms_db=-200.0, contains_audio=False)
  Channel 8/41 scanned (rms_db=-200.0, contains_audio=False)
  Channel 

## Step 3: Extract ADM Metadata

Uses `bwfmetaedit` to extract ADM (Audio Definition Model) metadata from the WAV file's aXML chunk.
This contains all the spatial positioning data for audio objects and DirectSpeakers (static speaker beds).
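
To show where that metadata lives in the file, here is a minimal sketch that walks the RIFF chunks of a WAV stream and returns the body of the `axml` chunk. It is a different technique from the pipeline, which calls `bwfmetaedit`. The sketch assumes a plain RIFF/WAVE layout with 32-bit chunk sizes (RF64/BW64 files are not handled), and `read_axml_chunk` and `_chunk` are hypothetical names.

```python
import io
import struct


def read_axml_chunk(stream):
    """Return the raw bytes of the first 'axml' chunk, or None if absent."""
    header = stream.read(12)
    if len(header) < 12 or header[0:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a plain RIFF/WAVE stream")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            return None  # reached the end without finding the chunk
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"axml":
            return stream.read(size)
        # Skip the chunk body; bodies are padded to an even byte count.
        stream.seek(size + (size & 1), 1)


def _chunk(chunk_id, body):
    """Build one RIFF chunk for the synthetic demo below."""
    pad = b"\x00" if len(body) % 2 else b""
    return chunk_id + struct.pack("<I", len(body)) + body + pad


# Synthetic in-memory file: fmt, an odd-sized data chunk, then axml.
payload = (
    _chunk(b"fmt ", b"\x00" * 16)
    + _chunk(b"data", b"\x01\x02\x03")
    + _chunk(b"axml", b"<ebuCoreMain/>")
)
demo = b"RIFF" + struct.pack("<I", 4 + len(payload)) + b"WAVE" + payload
print(read_axml_chunk(io.BytesIO(demo)))
```

Because the function only reads headers and seeks past other chunks, it could also be pointed at a file opened with `open(path, "rb")` without loading the audio data into memory.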

In [10]:
print("STEP 3: Extracting ADM metadata from WAV file\n")
print("="*60)

extractedMetadata = extractMetaData(sourceADMFile, "../processedData/currentMetaData.xml")

if extractedMetadata:
    xmlPath = extractedMetadata
    print(f"\n✓ ADM metadata successfully extracted to: {xmlPath}")
    
    # Show XML file size as a quick sanity check
    xml_size = Path(xmlPath).stat().st_size
    print(f"  XML file size: {xml_size:,} bytes")
else:
    print("\n⚠ ADM metadata extraction failed, using fallback XML")
    xmlPath = "../data/POE-ATMOS-FINAL-metadata.xml"
    if Path(xmlPath).exists():
        print(f"  Using fallback XML: {xmlPath}")
    else:
        print(f"  ✗ Fallback XML not found: {xmlPath}")
        raise FileNotFoundError(f"No ADM metadata available at {xmlPath}")

STEP 3: Extracting ADM metadata from WAV file

Extracting ADM metadata from WAV file...
Exported ADM metadata to ../processedData/currentMetaData.xml

✓ ADM metadata successfully extracted to: ../processedData/currentMetaData.xml
  XML file size: 42,399,066 bytes


## Step 4: Parse ADM Metadata

Parses the ADM XML to extract structured data:
- **Audio objects** - Dynamic sources with position/time trajectories
- **DirectSpeakers** - Static speaker positions (e.g., 7.1.4 bed channels)
- **Global metadata** - File info (sample rate, duration, channel format)
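
The kind of structure being extracted can be sketched with the standard library. This is not `parseMetadata`: the embedded XML is a hand-written fragment using ADM element and attribute names (`audioChannelFormat`, `audioBlockFormat`, `position`), the output dictionary layout is an assumption, and `parse_channel_formats` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration; real files hold many more elements.
SAMPLE_ADM = """
<audioFormatExtended>
  <audioChannelFormat audioChannelFormatID="AC_00031001"
                      audioChannelFormatName="Example Object"
                      typeDefinition="Objects">
    <audioBlockFormat audioBlockFormatID="AB_00031001_00000001"
                      rtime="00:00:00.00000" duration="00:00:01.00000">
      <position coordinate="X">-0.5</position>
      <position coordinate="Y">1.0</position>
      <position coordinate="Z">0.0</position>
    </audioBlockFormat>
  </audioChannelFormat>
</audioFormatExtended>
"""


def local_name(tag):
    """Drop any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def parse_channel_formats(xml_text):
    """Collect each channel format with its timed position blocks."""
    root = ET.fromstring(xml_text.strip())
    channels = []
    for elem in root.iter():
        if local_name(elem.tag) != "audioChannelFormat":
            continue
        blocks = []
        for block in elem:
            if local_name(block.tag) != "audioBlockFormat":
                continue
            position = {
                child.get("coordinate"): float(child.text)
                for child in block
                if local_name(child.tag) == "position"
            }
            blocks.append({"rtime": block.get("rtime"), "position": position})
        channels.append({
            "name": elem.get("audioChannelFormatName"),
            "type": elem.get("typeDefinition"),
            "blocks": blocks,
        })
    return channels


print(parse_channel_formats(SAMPLE_ADM))
```

For an XML file of the size reported in Step 3, a streaming parser such as `ET.iterparse` may be a better fit than loading the whole tree at once.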

DESIGN NOTE:
I plan to move all this data into internal data structures rather than exporting it to JSON. I will replace the exported JSON files with a single debugging file containing all of the JSON information.

In [None]:
print("STEP 4: Parsing ADM metadata\n")
print("="*60)

# Temporarily change to project root so parseMetadata writes to correct location
import os
original_dir = os.getcwd()

# Resolve xmlPath to an absolute path before changing directory; this can be removed once the hardcoded paths in parser.py are fixed
xmlPath_absolute = str(Path(xmlPath).resolve())

os.chdir('..')
try:
    reformattedMetadata = parseMetadata(xmlPath_absolute, ToggleExportJSON=True, TogglePrintSummary=True)
finally:
    os.chdir(original_dir)

# Load and display detailed summary of parsed data
def load_json_safely(filepath):
    try:
        with open(filepath, 'r') as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

global_data = load_json_safely("../processedData/globalData.json")
object_data = load_json_safely("../processedData/objectData.json")
directspeaker_data = load_json_safely("../processedData/directSpeakerData.json")

print(f"\n✓ ADM metadata parsing complete:")
print(f"  Sample rate: {global_data.get('SampleRate', 'N/A')} Hz")
print(f"  Duration: {global_data.get('Duration', 'N/A')} seconds")
print(f"  Channel format: {global_data.get('audioChannelFormatName', 'N/A')}")
print(f"  DirectSpeaker channels: {len(directspeaker_data)}")

# Count non-empty audio objects
active_objects = sum(1 for k, v in object_data.items() if v)
print(f"  Audio objects with data: {active_objects}/{len(object_data)}")

# Show object IDs with data
object_ids_with_data = [k for k, v in object_data.items() if v]
if object_ids_with_data:
    if len(object_ids_with_data) > 8:
        print(f"  Object IDs: {', '.join(object_ids_with_data[:8])}... (showing first 8)")
    else:
        print(f"  Object IDs: {', '.join(object_ids_with_data)}")

STEP 4: Parsing ADM metadata

Saved technical metadata to processedData/globalData.json
Extracted global technical metadata
Saved DirectSpeaker data to processedData/directSpeakerData.json
Extracted DirectSpeaker channel metadata
Saved object data to processedData/objectData.json

Found 10 fixed channels and 27 audio objects:

Object: Sum 1 L
  Total Blocks: 2250
  Time Range: ('00:00:00.00000', '00:01:48.39285')
  Z-Coordinate Changes: No
  Width Changes: No

Object: Sum 1 R
  Total Blocks: 2250
  Time Range: ('00:00:00.00000', '00:01:48.39285')
  Z-Coordinate Changes: No
  Width Changes: No

Object: se 1 
  Total Blocks: 159
  Time Range: ('00:00:00.00000', '00:00:35.28933')
  Z-Coordinate Changes: No
  Width Changes: No

Object: sm 1 
  Total Blocks: 233
  Time Range: ('00:00:00.00000',

## Step 5: Package for Render

Prepares the audio and spatial data for VBAP rendering:
- Splits the multichannel WAV into individual audio stems
- Creates render instructions JSON with spatial trajectories
- Organizes everything in `processedData/stageForRender/`
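
The stem-splitting idea can be sketched on plain Python lists. This is not the code in `splitStems.py`; it assumes interleaved samples (as stored in a WAV data chunk) and a per-channel activity flag like the one produced in Step 2, and `split_stems` is a hypothetical name.

```python
def split_stems(interleaved, n_channels, keep=None):
    """Split interleaved samples into one list per channel.

    interleaved: flat sequence ordered frame by frame (ch1, ch2, ..., ch1, ...).
    keep: optional per-channel booleans; channels marked False are skipped.
    Returns {1-based channel number: samples}.
    """
    if n_channels <= 0 or len(interleaved) % n_channels:
        raise ValueError("sample count must be a multiple of n_channels")
    stems = {}
    for ch in range(n_channels):
        if keep is not None and not keep[ch]:
            continue  # silent channel: no stem is produced
        stems[ch + 1] = list(interleaved[ch::n_channels])
    return stems


# Two frames of three channels; the second channel is flagged as silent.
print(split_stems([0.1, 0.0, 0.3, 0.2, 0.0, 0.4], 3, keep=[True, False, True]))
```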

DESIGN NOTES:
1. I will explore skipping the stem-splitting step and instead rendering the spatialization directly from each channel of the source file, then compare which method is faster. For prototyping, the current method is fine.
2. I will add options for rendering with DBAP and LBAP. The spatialization technique will be a parameter in the CLI tool.

In [17]:
print("STEP 5: Packaging audio for render\n")
print("="*60)

# Temporarily change to project root to avoid path issues with packageForRender
# TODO: Remove this workaround once hardcoded paths in splitStems.py and createRenderInfo.py are fixed
original_dir = os.getcwd()

# Resolve paths to absolute before changing directory
sourceADMFile_absolute = str(Path(sourceADMFile).resolve())
processedDataDir_absolute = str(Path(processedDataDir).resolve())

os.chdir('..')
try:
    packageForRender(sourceADMFile_absolute, processedDataDir_absolute)
finally:
    os.chdir(original_dir)

# Check what was created in stageForRender
stage_dir = Path("../processedData/stageForRender")
if stage_dir.exists():
    wav_files = list(stage_dir.glob("*.wav"))
    json_files = list(stage_dir.glob("*.json"))
    
    print(f"\n✓ Render staging complete:")
    print(f"  Audio stems created: {len(wav_files)}")
    print(f"  Instruction files: {len(json_files)}")
    print(f"  Stage directory: {stage_dir}")
    
    # Show total size of audio stems
    total_size = sum(f.stat().st_size for f in wav_files)
    print(f"  Total audio data: {total_size / (1024*1024):.1f} MB")
    
    # Check for render instructions
    render_instructions = stage_dir / "renderInstructions.json"
    if render_instructions.exists():
        with open(render_instructions, 'r') as f:
            instructions = json.load(f)
        print(f"  Render instructions: {len(instructions.get('sources', []))} sources")
    else:
        print("  ⚠ Warning: renderInstructions.json not found")
else:
    print("\n✗ Error: Stage directory not created")
    raise RuntimeError("Packaging for render failed")

STEP 5: Packaging audio for render

Attempting to run package for render -- splitting stems and creating render info...
No file to delete at: /Users/lucian/projects/sonoPleth/processedData/stageForRender/renderInstructions.json
Loaded directSpeakerData from /Users/lucian/projects/sonoPleth/processedData/directSpeakerData.json
Loaded objectData from /Users/lucian/projects/sonoPleth/processedData/objectData.json
Loaded containsAudio from /Users/lucian/projects/sonoPleth/processedData/containsAudio.json
Loaded globalData from /Users/lucian/projects/sonoPleth/processedData/globalData.json

Channel assignments:
  DirectSpeakers: channels 1-10
  Objects: channels 11-37
  Total channels assigned: 37

Spatial instructions JSON saved to processedData/stageForRender/renderInstructions.json
  Sources with audio: 19
  Sources without audio (skipped): 18
  Total sources in JSON: 19
Loaded containsAudio from /Users/lucian/projects/sonoPleth/processedData/containsAudio.json
Clearing existing files in

## Step 6: VBAP Spatial Render

Runs the VBAP (Vector Base Amplitude Panning) renderer to create the fixed multichannel WAV file.
Uses the staged audio stems and spatial instructions to generate output for the AlloSphere speaker layout.
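
For readers new to VBAP, the core gain calculation for a single speaker triplet can be sketched as follows. This is the textbook formulation, not the AlloLib-based renderer the pipeline runs: the source direction is written as a weighted sum of the three speaker directions, the weights are solved with Cramer's rule, and the result is scaled to unit power. The function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def vbap_gains(source, speakers):
    """Gains for one speaker triplet.

    source: unit direction vector (x, y, z) of the virtual source.
    speakers: three unit direction vectors, one per loudspeaker.
    Solves source = g1*l1 + g2*l2 + g3*l3, then scales to unit power.
    Returns None when the source lies outside the triplet (a negative
    gain) or the triplet is degenerate.
    """
    base = det3(speakers)
    if abs(base) < 1e-12:
        return None
    gains = []
    for i in range(3):
        # Cramer's rule: swap speaker i for the source direction.
        rows = [source if j == i else speakers[j] for j in range(3)]
        gains.append(det3(rows) / base)
    if min(gains) < -1e-9:
        return None
    norm = math.sqrt(sum(g * g for g in gains))
    if norm == 0.0:
        return None
    return [max(g, 0.0) / norm for g in gains]


triplet = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
s = 1.0 / math.sqrt(3.0)
print(vbap_gains((s, s, s), triplet))
```

A full renderer first has to choose which triplet encloses each source direction. The initial triplet count of 24804 in the log below is consistent with every combination of 3 speakers out of 54 being considered before pruning.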

DESIGN NOTE:
I opted for an offline render pipeline, but much of the same code could be used to run real-time spatialization on the "stage for render" data.

In [19]:
print("STEP 6: Running VBAP spatial renderer\n")
print("="*60)

# Temporarily change to project root to avoid path issues with VBAP renderer
# TODO: Remove this workaround once hardcoded paths in createRender.py are fixed
original_dir = os.getcwd()

# Resolve paths to absolute before changing directory
sourceSpeakerLayout_absolute = str(Path(sourceSpeakerLayout).resolve())
finalOutputRenderFile_absolute = str(Path(finalOutputRenderFile).resolve())

os.chdir('..')
try:
    runVBAPRender(
        source_folder="processedData/stageForRender",
        render_instructions="processedData/stageForRender/renderInstructions.json",
        speaker_layout=sourceSpeakerLayout_absolute,
        output_file=finalOutputRenderFile_absolute
    )
finally:
    os.chdir(original_dir)

# Check the rendered output
output_path = Path(finalOutputRenderFile)
if output_path.exists():
    output_size = output_path.stat().st_size
    print(f"\n✓ Spatial render complete:")
    print(f"  Output file: {finalOutputRenderFile}")
    print(f"  File size: {output_size / (1024*1024):.1f} MB")
    
    # Try to get basic audio info if possible
    try:
        import soundfile as sf
        with sf.SoundFile(finalOutputRenderFile) as f:
            print(f"  Channels: {f.channels}")
            print(f"  Sample rate: {f.samplerate} Hz")
            print(f"  Duration: {f.frames / f.samplerate:.2f} seconds")
            print(f"  Format: {f.subtype}")
    except ImportError:
        print("  (Install soundfile for detailed audio info: pip install soundfile)")
    except Exception as e:
        print(f"  (Could not read audio info: {e})")
else:
    print(f"\n✗ Error: Rendered output not found at {finalOutputRenderFile}")
    raise RuntimeError("VBAP rendering failed")

STEP 6: Running VBAP spatial renderer

No existing render to delete at: /Users/lucian/projects/sonoPleth/processedData/spatial_render.wav

Running VBAP Renderer...
  Source folder: /Users/lucian/projects/sonoPleth/processedData/stageForRender
  Instructions: /Users/lucian/projects/sonoPleth/processedData/stageForRender/renderInstructions.json
  Speaker layout: /Users/lucian/projects/sonoPleth/vbapRender/allosphere_layout.json
  Output: /Users/lucian/projects/sonoPleth/processedData/spatial_render.wav

Loading layout...
Loading spatial instructions...
Loading source WAVs...
Rendering...
Finding triplets
Speaker-count=54, Initial triplet-count=24804
Triangles removed because too narrow 4492
Triangles removed because of crossing 20253
Triangles removed because of crossing 13
Triangles removed because of crossing 0
Tris removed because spk inside triangle 0
Rendering 5285333 samples (110.111 sec) to 54 speakers from 19 sources
  Block 0 (0%)
Loading layout...
Loading spatial instructions..

## Step 7: Analyze Render (Optional)

Creates a PDF analysis of the rendered spatial audio with:
- dB level analysis for each output channel
- Peak and RMS measurements
- Channel activity visualization
- Technical metadata summary

## View Render Analysis PDF

Display the generated PDF analysis directly in the notebook.

In [20]:
# Display the render analysis PDF in the notebook
import subprocess
import platform
from IPython.display import IFrame, display, HTML

pdf_path = Path(finalOutputRenderAnalysisPDF)

if pdf_path.exists():
    print(f"Render Analysis PDF: {finalOutputRenderAnalysisPDF}")
    print(f"File size: {pdf_path.stat().st_size / 1024:.1f} KB")
    
    # Try to display PDF inline (works in Jupyter)
    try:
        display(IFrame(finalOutputRenderAnalysisPDF, width=900, height=600))
        print("PDF displayed above")
    except Exception as e:
        print(f"Could not display PDF inline: {e}")
        
        # Fallback: provide link and try to open externally
        print(f"\nPDF file location: {pdf_path.absolute()}")
        
        # Try to open with system default viewer
        try:
            if platform.system() == 'Darwin':  # macOS
                subprocess.run(['open', pdf_path], check=True)
                print("Opened PDF with default macOS viewer")
            elif platform.system() == 'Windows':  # Windows
                os.startfile(pdf_path)  # avoids 'start' treating a quoted path as a window title
                print("Opened PDF with default Windows viewer")
            else:  # Linux
                subprocess.run(['xdg-open', pdf_path], check=True)
                print("Opened PDF with default Linux viewer")
        except Exception as open_error:
            print(f"Could not open PDF automatically: {open_error}")
            print(f"Manually open: {pdf_path.absolute()}")
            
    # Show clickable HTML link
    display(HTML(f'<a href="{pdf_path.absolute()}" target="_blank">Click here to open PDF in new tab</a>'))
        
else:
    print(f"PDF not found at: {finalOutputRenderAnalysisPDF}")
    print("Make sure Step 7 (render analysis) completed successfully")

PDF not found at: ../processedData/spatial_render_analysis.pdf
Make sure Step 7 (render analysis) completed successfully
