
Split CLI and fix datasets #24

Merged: BrandonWeng merged 17 commits into main from split-cli-2 on Jul 17, 2025

@BrandonWeng BrandonWeng (Member) commented Jul 16, 2025

Split up the massive CLI file and fix the benchmarking failure from the previous PR. This also makes the benchmark fail when it is below a threshold.
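As a rough illustration of failing a benchmark run on a threshold, here is a minimal sketch. The `ScoreGate` type, its field names, and the 70% minimum are hypothetical and are not taken from this PR; the only idea borrowed from the description is that a score below a minimum should make the run fail, here by exiting with a non-zero status.

```swift
import Foundation

// Hypothetical gate: the type, names, and the 70% minimum are illustrative
// assumptions, not values from this PR.
struct ScoreGate {
    let metricName: String
    let minimumPercent: Double

    /// Returns nil when the score passes, or a message when it is below the minimum.
    func failureMessage(for scorePercent: Double) -> String? {
        guard scorePercent < minimumPercent else { return nil }
        return "\(metricName) \(scorePercent)% is below the minimum of \(minimumPercent)%"
    }
}

let gate = ScoreGate(metricName: "F1", minimumPercent: 70.0)

if let message = gate.failureMessage(for: 98.0) {
    print("FAILED: \(message)")
    exit(1)  // a non-zero exit status makes the calling CI step fail
} else {
    print("PASSED")
}
```

A metric where lower is better (an error rate, for example) would need the comparison reversed.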

@FluidInference FluidInference deleted a comment from github-actions Bot Jul 16, 2025
@BrandonWeng BrandonWeng added the invalid, documentation, and enhancement labels and removed the invalid label Jul 16, 2025
@github-actions

VAD Benchmark Results

Performance Comparison

| Metric | FluidAudio VAD | Industry Standard | Status |
|---|---|---|---|
| Accuracy | 98.0% | 85-90% | |
| Precision | 96.2% | 85-95% | |
| Recall | 100.0% | 80-90% | |
| F1-Score | 98.0% | 85.9% (Sohn's VAD) | |
| Processing Time | 634.3s (100 files) | ~1ms per 30ms chunk | |

Industry Leaders:

  • Silero VAD: ~90-95% F1 (DNN-based, 1.8MB model)
  • WebRTC VAD: ~75-80% F1 (GMM-based, fast but lower accuracy)
  • Sohn's VAD: 77.5% F1 (traditional approach)
  • Modern DNNs: 85-97% F1 (varies by SNR conditions)
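For readers checking how the four accuracy figures in the table relate to each other, the sketch below computes accuracy, precision, recall, and F1 from confusion counts. The counts used (50 true positives, 2 false positives, 48 true negatives, 0 false negatives over 100 files) are an assumption: they are one set of numbers that reproduces the reported percentages, not data published in this PR. The `ConfusionCounts` type is likewise hypothetical.

```swift
import Foundation

// Hypothetical confusion counts, chosen only because they reproduce the
// percentages in the table above; the PR does not publish the raw counts.
struct ConfusionCounts {
    let truePositives: Int
    let falsePositives: Int
    let trueNegatives: Int
    let falseNegatives: Int

    // Guards against division by zero by reporting 0 for an empty denominator.
    private func ratio(_ numerator: Int, _ denominator: Int) -> Double {
        return denominator == 0 ? 0 : Double(numerator) / Double(denominator)
    }

    var accuracy: Double {
        return ratio(truePositives + trueNegatives,
                     truePositives + falsePositives + trueNegatives + falseNegatives)
    }
    var precision: Double { return ratio(truePositives, truePositives + falsePositives) }
    var recall: Double { return ratio(truePositives, truePositives + falseNegatives) }
    // F1 = 2PR / (P + R), written in its equivalent count form.
    var f1: Double {
        return ratio(2 * truePositives, 2 * truePositives + falsePositives + falseNegatives)
    }
}

func percent(_ value: Double) -> String {
    return String(format: "%.1f%%", value * 100)
}

let counts = ConfusionCounts(truePositives: 50, falsePositives: 2,
                             trueNegatives: 48, falseNegatives: 0)
print("Accuracy:  \(percent(counts.accuracy))")   // 98.0%
print("Precision: \(percent(counts.precision))")  // 96.2%
print("Recall:    \(percent(counts.recall))")     // 100.0%
print("F1-Score:  \(percent(counts.f1))")         // 98.0%
```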

@github-actions

🎯 Single File Benchmark Results

Test File: ES2004a (1049.4s audio)
Overall Result: PASSED

📊 Accuracy Metrics

| Metric | Value |
|---|---|
| DER (Diarization Error Rate) | 18.7% |
| JER (Jaccard Error Rate) | 22.6% |
| RTF (Real-Time Factor) | 0.07x |
| Speakers Detected | 4/4 |

⏱️ Performance Timing

| Stage | Time (s) | % of Total |
|---|---|---|
| Model Download | 2.334 | 3.1% |
| Model Compilation | 0.935 | 1.2% |
| Audio Loading | 0.136 | 0.2% |
| Segmentation | 15.755 | 21.0% |
| Embedding Extraction | 56.007 | 74.5% |
| Speaker Clustering | 0.036 | 0.0% |
| Total Processing | 75.204 | 100% |

Inference Time: 71.799s (95.5% of total)
Setup Overhead: 3.269s (4.3% of total)

Research Comparison:

  • Powerset BCE (2023): 18.5% DER
  • EEND (2019): 25.3% DER
  • x-vector clustering: 28.7% DER

print(" Debug mode: \(debugMode ? "enabled" : "disabled")")
print(" Auto-download: \(autoDownload ? "enabled" : "disabled")")
print(" VAD: \(disableVad ? "disabled" : "enabled")")
if iterations > 1 {
Member commented:

Is this just an arbitrary number? It seems like the 1 represents the minimum number of iterations, for consistency.

Comment thread Sources/DiarizationCLI/BenchmarkRunner.swift Outdated
Comment thread Sources/DiarizationCLI/BenchmarkRunner.swift Outdated
@github-actions

🎯 Single File Benchmark Results

Test File: ES2004a (1049.4s audio)
Overall Result: PASSED

📊 Accuracy Metrics

| Metric | Value |
|---|---|
| DER (Diarization Error Rate) | 18.7% |
| JER (Jaccard Error Rate) | 22.6% |
| RTF (Real-Time Factor) | 0.04x |
| Speakers Detected | 4/4 |

⏱️ Performance Timing

| Stage | Time (s) | % of Total |
|---|---|---|
| Model Download | 2.698 | 5.3% |
| Model Compilation | 0.766 | 1.5% |
| Audio Loading | 0.099 | 0.2% |
| Segmentation | 11.471 | 22.7% |
| Embedding Extraction | 35.578 | 70.3% |
| Speaker Clustering | 0.025 | 0.0% |
| Total Processing | 50.637 | 100% |

Inference Time: 47.074s (93.0% of total)
Setup Overhead: 3.464s (6.8% of total)

Research Comparison:

  • Powerset BCE (2023): 18.5% DER
  • EEND (2019): 25.3% DER
  • x-vector clustering: 28.7% DER

@BrandonWeng BrandonWeng enabled auto-merge (squash) July 16, 2025 23:29
@BrandonWeng BrandonWeng requested a review from Alex-Wengg July 16, 2025 23:31
@github-actions

Single File Benchmark Results

Test File: ES2004a (1049.4s audio)
Overall Result:

Accuracy Metrics

| Metric | Value |
|---|---|
| DER (Diarization Error Rate) | 18.7% |
| JER (Jaccard Error Rate) | 22.6% |
| RTF (Real-Time Factor) | 0.06x |
| Speakers Detected | 4/4 |

⏱️ Performance Timing

| Stage | Time (s) | % of Total |
|---|---|---|
| Model Download | 3.179 | 5.0% |
| Model Compilation | 0.934 | 1.5% |
| Audio Loading | 0.091 | 0.1% |
| Segmentation | 13.240 | 21.0% |
| Embedding Extraction | 45.536 | 72.3% |
| Speaker Clustering | 0.032 | 0.1% |
| Total Processing | 63.011 | 100% |

Inference Time: 58.807s (93.3% of total)
Setup Overhead: 4.113s (6.5% of total)

Research Comparison:

  • Powerset BCE (2023): 18.5% DER
  • EEND (2019): 25.3% DER
  • x-vector clustering: 28.7% DER

@github-actions

VAD Benchmark Results

Performance Comparison

| Metric | FluidAudio VAD | Industry Standard | Status |
|---|---|---|---|
| Accuracy | 98.0% | 85-90% | |
| Precision | 96.2% | 85-95% | |
| Recall | 100.0% | 80-90% | |
| F1-Score | 98.0% | 85.9% (Sohn's VAD) | |
| Processing Time | 636.2s (100 files) | ~1ms per 30ms chunk | |

Industry Leaders:

  • Silero VAD: ~90-95% F1 (DNN-based, 1.8MB model)
  • WebRTC VAD: ~75-80% F1 (GMM-based, fast but lower accuracy)
  • Sohn's VAD: 77.5% F1 (traditional approach)
  • Modern DNNs: 85-97% F1 (varies by SNR conditions)

@BrandonWeng BrandonWeng merged commit 64bd19a into main Jul 17, 2025
4 checks passed
@BrandonWeng BrandonWeng deleted the split-cli-2 branch July 17, 2025 00:03
Alex-Wengg pushed a commit that referenced this pull request Jan 1, 2026
Split up the massive CLI file and fix the benchmarking failure from the
previous PR. This also makes the benchmark fail when it is below a
threshold.
SGD2718 pushed a commit that referenced this pull request Jan 4, 2026
Split up the massive CLI file and fix the benchmarking failure from the
previous PR. This also makes the benchmark fail when it is below a
threshold.
Alex-Wengg pushed a commit that referenced this pull request Jan 5, 2026
Split up the massive CLI file and fix the benchmarking failure from the
previous PR. This also makes the benchmark fail when it is below a
threshold.

Labels: documentation, enhancement


2 participants