Video Presentation Outline
Eric Chin edited this page Nov 15, 2013 · 4 revisions
- Introduction
- Names
- Eric Chin
- David Corbett
- Pranav Gandhi
- Daniel Moran
- Project details (general requirements, etc.)
- Requirements
- Compile and run on the CCIS Linux system
- Given two sets of audio files, identify the files in one set that are derived from files in the other set
- Brief outline of presentation
- Introduction
- Architecture
- Algorithms
- Critical data structures
- Software Architecture
- Modules
- Data ingestion
- Data normalization (to canonical form)
- WAVE audio format
- 44100 Hz
- 16 bits per sample
- 1 channel
- Comparison (using FFTs and MFCCs)
- Data analysis (determine if compared audio constitutes a match)
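The canonical form listed under data normalization can be checked with Python's standard `wave` module. The following is a minimal sketch, not the project's actual code: the function name `is_canonical` and the constant names are hypothetical, and it only detects whether a file already matches the canonical form (it does not convert one that does not).

```python
import wave

# Canonical form from the outline: 44100 Hz, 16 bits per sample, 1 channel.
CANONICAL_RATE = 44100
CANONICAL_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16 bits
CANONICAL_CHANNELS = 1


def is_canonical(path):
    """Return True if the WAVE file at `path` is already in canonical form."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == CANONICAL_RATE
            and wav.getsampwidth() == CANONICAL_SAMPLE_WIDTH
            and wav.getnchannels() == CANONICAL_CHANNELS
        )
```

A file that fails this check would still need resampling, downmixing, or bit-depth conversion before the comparison module sees it.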
- Critical Data Structures
- Representing audio
- Use WAVE files
- Each WAVE file is represented by an array of FFTs, one per chunk (overlapping?)
- Each WAVE file is also represented by its MFCCs
- Purpose
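One way the "sequence of FFT chunks" representation could look in NumPy is sketched below. The outline leaves open whether chunks overlap, so the 50% overlap, the chunk size of 1024, the Hann window, and the function name `fft_chunks` are all assumptions made for illustration.

```python
import numpy as np


def fft_chunks(samples, chunk_size=1024, hop=512):
    """Return a 2-D array with one magnitude spectrum per chunk.

    chunk_size and hop are illustrative assumptions; hop < chunk_size
    makes consecutive chunks overlap.
    """
    samples = np.asarray(samples, dtype=float)
    if len(samples) < chunk_size:
        n_chunks = 0
    else:
        n_chunks = 1 + (len(samples) - chunk_size) // hop
    window = np.hanning(chunk_size)  # taper each chunk to reduce leakage
    chunks = np.empty((n_chunks, chunk_size // 2 + 1))
    for i in range(n_chunks):
        start = i * hop
        frame = samples[start:start + chunk_size] * window
        chunks[i] = np.abs(np.fft.rfft(frame))
    return chunks
```

Each row is the spectrum of one chunk, so two files can be compared row by row or over sliding ranges of rows.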
- Algorithms
- FFT
- Purpose
- Implementation (numpy.fft)
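As a small illustration of what `numpy.fft` provides, the sketch below moves a signal from the time domain to the frequency domain and reads off its strongest frequency. The helper name `dominant_frequency` is hypothetical, and the 440 Hz test tone is only an example input.

```python
import numpy as np


def dominant_frequency(samples, rate):
    """Return the frequency (in Hz) of the largest-magnitude FFT bin."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    return freqs[np.argmax(spectrum)]


rate = 44100
t = np.arange(rate) / rate              # one second of sample times
tone = np.sin(2 * np.pi * 440 * t)      # 440 Hz sine wave
peak_hz = dominant_frequency(tone, rate)
```

With one second of audio at 44100 Hz the bins are 1 Hz apart, so the peak falls on the 440 Hz bin.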
- MFCC
- Purpose
- Implementation
- Take the Fourier transform of (a windowed excerpt of) a signal
- Use triangular overlapping windows to map the spectrum power onto the Mel scale
- Take the logs of the powers at each of the Mel frequencies
- Take the discrete cosine transform of the list of Mel log powers, as if it were a signal
- The MFCCs are the amplitudes of the resulting spectrum
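The MFCC steps above can be sketched for a single frame using NumPy alone. This is an illustrative sketch under stated assumptions, not the project's implementation: the Hamming window, the common 2595 * log10(1 + f / 700) Mel formula, the 26 filters, the 13 kept coefficients, the unnormalized DCT-II, and all function names are choices made here for the example.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, rate):
    """Triangular overlapping filters, evenly spaced on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / rate)
    bank = np.zeros((n_filters, len(bin_freqs)))
    for i in range(n_filters):
        left, center, right = hz_points[i:i + 3]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def dct_ii(x):
    """Unnormalized DCT-II of a 1-D array."""
    n = len(x)
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ x


def mfcc(frame, rate=44100, n_filters=26, n_coeffs=13):
    # Step 1: Fourier transform of a windowed excerpt of the signal.
    power = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2
    # Step 2: map the power spectrum onto the Mel scale.
    mel_power = mel_filterbank(n_filters, len(frame), rate) @ power
    # Step 3: log of the power in each Mel band (small offset avoids log(0)).
    log_mel = np.log(mel_power + 1e-10)
    # Steps 4-5: DCT of the log powers; keep the first coefficients.
    return dct_ii(log_mel)[:n_coeffs]
```

Calling `mfcc` on each chunk of a canonical WAVE file gives one coefficient vector per chunk, which is the per-file MFCC representation the outline describes.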