Video Presentation Outline

Eric Chin edited this page Nov 15, 2013 · 4 revisions
  1. Introduction
  • Names
    • Eric Chin
    • David Corbett
    • Pranav Gandhi
    • Daniel Moran
  • Project details (general requirements, etc)
    • Requirements
      • compile and run on CCIS Linux System
      • given two sets of audio files, identify audio files from one set that are derived from audio files of the other set
  • Brief outline of presentation
    • Introduction
    • Architecture
    • Critical data structures
    • Algorithms
  2. Software Architecture
  • Modules
    • Data ingestion
    • Data normalization (to canonical form)
      • WAVE audio format
        • 44100 Hz
        • 16 bits per sample
        • 1 channel
    • Comparison (using FFTs and MFCCs)
    • Data analysis (determine if compared audio constitutes a match)
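The normalization module above could be sketched roughly as follows. This is only an illustration, not the project's code: the helper name `to_canonical` is hypothetical, the input is assumed to be already decoded to floating-point samples in [-1.0, 1.0], and linear interpolation stands in for a proper band-limited resampler.

```python
import numpy as np

TARGET_RATE = 44100  # Hz, the canonical sample rate from the outline


def to_canonical(samples, rate):
    """Return 1-channel, 16-bit, 44100 Hz samples (hypothetical helper).

    `samples` is assumed to be a float array in [-1.0, 1.0], shaped
    (frames,) for mono or (frames, channels) for multi-channel audio.
    """
    samples = np.asarray(samples, dtype=np.float64)

    # 1 channel: average all channels together.
    if samples.ndim == 2:
        samples = samples.mean(axis=1)

    # 44100 Hz: crude resampling by linear interpolation (an assumption;
    # a real resampler would low-pass filter to avoid aliasing).
    if rate != TARGET_RATE:
        n_out = int(round(len(samples) * TARGET_RATE / rate))
        old_times = np.arange(len(samples)) / rate
        new_times = np.arange(n_out) / TARGET_RATE
        samples = np.interp(new_times, old_times, samples)

    # 16 bits per sample: clip, scale, and round to signed 16-bit integers.
    clipped = np.clip(samples, -1.0, 1.0)
    return np.round(clipped * 32767).astype(np.int16)
```

For example, `to_canonical(stereo, 22050)` on half a second of stereo audio returns a one-dimensional `int16` array at the canonical rate.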
  3. Critical Data Structures
  • Representing audio
    • Use WAVE file
    • Each WAVE file is represented by an array of FFTs, one per chunk of samples (overlapping chunks?)
    • Each WAVE file is also represented by its MFCCs
  • Purpose
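As a rough sketch of the FFT-based representation, the function below splits normalized samples into chunks and stores one magnitude spectrum per chunk in a 2-D array. The name `fft_chunks`, the chunk size of 4096, the 50% overlap, and the Hann window are all assumptions made for illustration; the outline leaves open whether the chunks overlap.

```python
import numpy as np


def fft_chunks(samples, chunk_size=4096, hop=2048):
    """Return an array of magnitude spectra, one row per chunk.

    Hypothetical helper: chunk_size and hop are illustrative defaults,
    and hop < chunk_size makes consecutive chunks overlap.
    """
    samples = np.asarray(samples, dtype=np.float64)
    window = np.hanning(chunk_size)  # taper each chunk to reduce leakage

    # Start offsets of every full chunk; a trailing partial chunk is dropped.
    starts = range(0, len(samples) - chunk_size + 1, hop)

    # numpy.fft.rfft gives the non-negative-frequency half of the FFT,
    # which is all that real-valued audio needs.
    return np.array([
        np.abs(np.fft.rfft(samples[s:s + chunk_size] * window))
        for s in starts
    ])
```

Each row has `chunk_size // 2 + 1` frequency bins, and bin `k` corresponds to `k * 44100 / chunk_size` Hz for audio in the canonical form. Input shorter than one chunk yields an empty array.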
  4. Algorithms
  • FFT
    • Purpose
    • Implementation (numpy.fft)
  • MFCC
    • Purpose
    • Implementation
      1. Take the Fourier transform of a windowed excerpt of the signal
      2. Use triangular overlapping windows to map the power spectrum onto the Mel scale
      3. Take the log of the power at each Mel frequency
      4. Take the discrete cosine transform of the list of Mel log powers, as if it were a signal
      5. The MFCCs are the amplitudes of the resulting spectrum
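The five MFCC steps can be sketched with NumPy alone, as below. This is a minimal illustration under stated assumptions rather than the project's implementation: the function names, the Hamming window, the 26 filters, the 13 coefficients, and the common `2595 * log10(1 + f / 700)` Mel formula are all choices made for the example, and the DCT is an unnormalized DCT-II written out by hand.

```python
import numpy as np


def hz_to_mel(hz):
    # One common Mel-scale formula (an assumption; variants exist).
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, rate):
    """Triangular overlapping filters spaced evenly on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / rate)
    bank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        low, mid, high = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - low) / (mid - low)
        falling = (high - freqs) / (high - mid)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mfcc(frame, rate=44100, n_filters=26, n_coeffs=13):
    """Compute MFCCs for one excerpt of samples (hypothetical helper)."""
    frame = np.asarray(frame, dtype=np.float64)
    n = len(frame)
    # Step 1: Fourier transform of a windowed excerpt, as a power spectrum.
    power = np.abs(np.fft.rfft(frame * np.hamming(n))) ** 2
    # Step 2: map the power spectrum onto the Mel scale.
    mel_power = mel_filterbank(n_filters, n, rate) @ power
    # Step 3: log of the power at each Mel frequency (offset avoids log(0)).
    log_mel = np.log(mel_power + 1e-10)
    # Step 4: DCT-II of the Mel log powers, treated as a signal.
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n_filters)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n_filters)
    # Step 5: the MFCCs are the amplitudes of the resulting spectrum.
    return basis @ log_mel
```

Calling `mfcc` on each chunk of a normalized file gives one coefficient vector per chunk, which can then be compared between files.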