### Copyright-protected material, all rights reserved. (c) University of Vienna.
_Copyright Notice of the corresponding course at Moodle applies. <br> Only to be used in the MRE course._

# MRE Assignment 2 - Digital Audio Processing 

In this assignment you will load, decode, and process digital audio files (e.g., MP3, WAV) using Python. For both audio formats you will extract and process content and some basic metadata. For the following tasks, you will use our suggested libraries (see the setup section).

In this notebook, you will find the detailed specification. To have your solution assessed, you are expected to demonstrate your implementation and answer questions, mostly in textual form, here.

❗ **Note:** Please make sure that all potential errors, including file-handling, path, and run-time errors, are handled properly (e.g., with useful error messages for users).

## Import your implementation

Import the corresponding Jupyter Notebook named "*_impl.ipynb" for this assignment here.

In [13]:
%%capture
%run MRE_A2_impl.ipynb

## Task 2.1 Organize Audio files by specific criteria (35P):



Write a Python function `MyAudioFilesOrganizer` using the mutagen and wave libraries (Mutagen: https://mutagen.readthedocs.io/en/latest/) so that:
- It can be called with two parameters: an input directory path and an enum value specifying the grouping criterion (artist, album, or genre).
- The function lists the audio files in the directory grouped by the provided criterion. The list should also display the following columns (if available from the source, in the type given in parentheses):
  - artist (string)
  - album (string)
  - genre (string)
  - filename (string)
  - format (string)
  - duration (float)
  - title (string)
  - date (string)
  - sample rate (integer)
  - bitrate (integer)
  - track (string)
  - composer (string)
  - encoder (string)
  
- The function returns a pandas DataFrame that can be displayed. The DataFrame represents a table with the columns mentioned above.
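The technical columns for WAV files (duration, sample rate, bitrate, channels) can be derived from the file header alone. The following is a minimal sketch using only the standard-library `wave` module; the helper name `wav_properties` and the returned dictionary layout are assumptions made for illustration, and tag fields such as artist or album (read via mutagen) are not covered here.

```python
import wave
from pathlib import Path


def wav_properties(path):
    """Return basic technical properties of a PCM WAV file as a dict.

    Hypothetical helper: the name and the dict keys are illustrative only.
    """
    path = Path(path)
    try:
        with wave.open(str(path), "rb") as wav:
            sample_rate = wav.getframerate()   # samples per second
            channels = wav.getnchannels()
            sample_width = wav.getsampwidth()  # bytes per sample
            n_frames = wav.getnframes()
    except (FileNotFoundError, wave.Error, EOFError) as exc:
        # Turn low-level failures into one readable error for the caller.
        raise ValueError(f"Cannot read WAV file '{path}': {exc}") from exc

    return {
        "filename": path.name,
        "format": "wav",
        "duration": n_frames / sample_rate if sample_rate else 0.0,  # seconds
        "sample rate": sample_rate,
        # Uncompressed PCM bitrate in bit/s.
        "bitrate": sample_rate * channels * sample_width * 8,
        "channels": channels,
    }
```

A list of such dictionaries (one per file) can be passed directly to `pandas.DataFrame` to obtain the table.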

**Example:**<br>

input = `./media/audio/`, `Criteria.ARTIST`<br>
Function call: `MyAudioFilesOrganizer("./media/audio/", Criteria.ARTIST)`
<br>
<br>
The result might look like this:<br>
![SampleTable](./A2T3_sampleTable.png)


### Demonstrate your implementation:

In [16]:
# Demonstrate your implementation here.
# Only enter the calls to your functions here so you can demonstrate validity of your solution.
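from enum import Enum

# Grouping criteria; no trailing commas, so each value is a plain int rather than a tuple.
class Criteria(Enum):
    artist = 1
    album = 2
    genre = 3

MyAudioFilesOrganizer("./media/audio/Task2.1/*", Criteria.artist)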

| filename | artist | album | genre | format | duration | title | date | sample rate | bitrate | track | composer | encoder | channels |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ./media/audio/Task2.1/FireFire.mp3 | (M.I.A.,) | [Arular] | [Hip Hop/Rap] | mp3 | 3.480381 | [Fire, Fire] | [2005] | 44100 | 160000 | [5/13] | [Maya Arulpragasam, Anthony Whiting] | [iTunes v7.1] | 2 |
| ./media/audio/Task2.1/Amazon.mp3 | (M.I.A.,) | [Arular] | [Hip Hop/Rap] | mp3 | 4.278857 | [Amazon] | [2005] | 44100 | 160000 | [7/13] | [Maya Arulpragasam, Richard X.] | [iTunes v7.1] | 2 |
| ./media/audio/Task2.1/DashTheCurry[Skit].mp3 | (M.I.A.,) | [Arular] | [Hip Hop/Rap] | mp3 | 0.669605 | [Dash The Curry [Skit]] | [2005] | 44100 | 160000 | [6/13] | [Maya Arulpragasam] | [iTunes v7.1] | 2 |
| ./media/audio/Task2.1/error.wav | (-,) | - | - | wav | 0.054422 | - | - | 44100 | 88200 | - | - | - | 1 |
| ./media/audio/Task2.1/Hombre.mp3 | (M.I.A.,) | [Arular] | [Hip Hop/Rap] | mp3 | 4.035482 | [Hombre] | [2005] | 44100 | 160000 | [9/13] | [Maya Arulpragasam, Anthony Whiting] | [iTunes v7.1] | 2 |


## Task 2.2: Audio mixer (25P):

Write a Python function `TwoAudioMixer` using `ffmpeg` so that:
- One can call it with the parameters as below: 
  - audio file 1
  - start in seconds
  - end in seconds
  - audio file 2
  - start in seconds
  - end in seconds
  - overlapDur
  - outputDir
  - outputFilename
  <br>
Here, start and end (in seconds) select the segment of each audio file to be mixed. The transition from audio 1 to audio 2 should overlap for the duration given by `overlapDur`.
	
**Example:**
Function call: `TwoAudioMixer('../a1.mp3', 0, 6, '../a2.mp3', 0, 6, 2, "output-a2", "t2-mixed.mp3")`
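
One possible way to realize the trimming and overlapping transition is a single ffmpeg filter graph that cuts both inputs with `atrim`, resets their timestamps with `asetpts`, and joins them with `acrossfade`. The sketch below only builds the argument list; it assumes the `ffmpeg` command-line tool is used (rather than a Python wrapper), and the helper name `build_mix_command` is hypothetical.

```python
from pathlib import Path


def build_mix_command(file1, start1, end1, file2, start2, end2,
                      overlap_dur, output_dir, output_filename):
    """Build an ffmpeg argument list that trims two inputs and crossfades them.

    Hypothetical helper for illustration; it does not run ffmpeg itself.
    """
    if end1 <= start1 or end2 <= start2:
        raise ValueError("end must be greater than start for both files")
    if overlap_dur <= 0 or overlap_dur > min(end1 - start1, end2 - start2):
        raise ValueError("overlap must be positive and fit inside both segments")

    filter_graph = (
        # Cut each input to its segment and restart its timestamps at zero.
        f"[0:a]atrim=start={start1}:end={end1},asetpts=PTS-STARTPTS[a1];"
        f"[1:a]atrim=start={start2}:end={end2},asetpts=PTS-STARTPTS[a2];"
        # Fade out the first segment while fading in the second.
        f"[a1][a2]acrossfade=d={overlap_dur}[mix]"
    )
    output_path = Path(output_dir) / output_filename
    return [
        "ffmpeg", "-y",            # -y overwrites an existing output file
        "-i", str(file1),
        "-i", str(file2),
        "-filter_complex", filter_graph,
        "-map", "[mix]",
        str(output_path),
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. Passing arguments as a list rather than a shell string avoids quoting problems with file names such as `DashTheCurry[Skit].mp3`.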

### Demonstrate your implementation:

In [17]:
# Demonstrate your implementation here. 
# Only enter the calls to your functions here so you can demonstrate validity of your solution.
TwoAudioMixer("media/audio/FireFire.mp3", 0, 10, "media/audio/Task2.2/DashTheCurry[Skit].mp3", 0, 10, 2, "media/audio/Task2.2/", "outputMix.mp3")

## Task 2.3: Concealing speakers ID by lowering/increasing the audio pitch (20P):

Write a Python function `VoicePitchChanger` so that:
- One can call it with four parameters: 
  - audio file 1
  - shift degree: e.g., -5 to 5
  - outputDir
  - outputFilename
- Try to reverse the result by providing the output file to your function.
- Note that the length of the audio file should not be affected by the pitch change.

**Example:**
Function call: `VoicePitchChanger('../a1.mp3', 1.5, "output-a2", "t3-pitched.mp3")`
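
The requirement that the length stays unchanged can be met by changing the playback rate (which shifts pitch and tempo together) and then undoing only the tempo change. The sketch below builds such a filter string for ffmpeg's `asetrate`, `aresample`, and `atempo` filters. It assumes that the shift degree is interpreted in semitones and that the input is sampled at 44100 Hz; both are assumptions, and the helper names are hypothetical.

```python
def pitch_factor(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2 ** (semitones / 12)


def pitch_filter(semitones, sample_rate=44100):
    """Return an ffmpeg audio filter string that shifts pitch but keeps duration.

    Assumption: `sample_rate` matches the input file's actual sample rate.
    """
    factor = pitch_factor(semitones)
    return (
        f"asetrate={round(sample_rate * factor)},"  # changes pitch AND tempo
        f"aresample={sample_rate},"                 # back to the original sample rate
        f"atempo={1 / factor:.6f}"                  # undo the tempo change only
    )
```

The string can be used as `ffmpeg -i in.mp3 -af "<filter>" out.mp3`. Applying the function again with the negated shift multiplies the pitch by the reciprocal factor, so the original pitch is restored up to resampling and re-encoding losses.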

### Demonstrate your implementation:

In [None]:
# Demonstrate your implementation here.
# Only enter the calls to your functions here so you can demonstrate validity of your solution.
VoicePitchChanger("media/audio/Task2.3/Hombre.mp3", 1.5, "media/audio/Task2.3/", "pitch.mp3")

## Task 2.4: Theoretical part (20P):

Answer the following questions in written form:

- How is the volume (i.e., how loud a sound is) reflected in analog and digital audio signals?

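In an analog signal, volume corresponds to the amplitude of the continuous waveform, i.e., how strongly the sound pressure (or the voltage representing it) deviates from its resting value over time. In a digital signal, volume is reflected in the magnitude of the quantized sample values: louder sounds produce samples with larger absolute values.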
  
- Why does it make sense to perform non-uniform quantization?

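Uniform (linear) quantization spaces the quantization levels evenly across the whole amplitude range. Non-uniform quantization instead uses small steps for low amplitudes and larger steps for high amplitudes. This makes sense because human loudness perception is roughly logarithmic and quantization errors are most audible in quiet passages, so the same number of bits yields better perceived quality, or fewer bits suffice for the same quality (e.g., μ-law and A-law companding in telephony).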
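
As an illustration, the following sketch compares uniform 8-bit quantization with μ-law companding (compress, quantize uniformly, expand) for a single sample normalized to [-1, 1]. The value μ = 255 is the one commonly used for 8-bit μ-law; the function names are chosen for this example only.

```python
import math

MU = 255  # companding parameter commonly used for 8-bit mu-law


def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1] along a logarithmic curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize_uniform(x, bits):
    """Round x in [-1, 1] to the nearest of 2**bits - 1 evenly spaced levels."""
    steps = 2 ** (bits - 1) - 1
    return round(x * steps) / steps


def quantize_mu_law(x, bits):
    """Compress, quantize uniformly, then expand: a non-uniform quantizer."""
    return mu_law_expand(quantize_uniform(mu_law_compress(x), bits))


quiet = 0.003
error_uniform = abs(quantize_uniform(quiet, 8) - quiet)
error_mu_law = abs(quantize_mu_law(quiet, 8) - quiet)
print(f"uniform error: {error_uniform:.6f}, mu-law error: {error_mu_law:.6f}")
```

For a quiet sample like this one, the uniform quantizer rounds to zero while the μ-law quantizer keeps the value almost intact; the price is a coarser resolution for loud samples.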

- What is Pulse Code Modulation (PCM)?

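PCM is the standard method for representing an analog signal digitally: the signal is sampled at uniform time intervals, each sample is quantized to the nearest of a finite set of levels, and each level is encoded as a binary code word.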


- Why do WAV files require more storage space than MP3 files?

WAV files typically store uncompressed PCM data, i.e., every sample produced by the analog-to-digital conversion, whereas MP3 uses lossy compression. MP3 exploits psychoacoustics to discard data that listeners are unlikely to notice in favor of a smaller file size: frequencies outside the range of human hearing are removed, as are sounds that are masked by other, louder sounds.
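
The size difference can be estimated with simple arithmetic. The sketch below assumes a 16-bit stereo WAV file at 44100 Hz (an assumption; other bit depths are possible) and compares it with the 160000 bit/s listed for the MP3 files in the Task 2.1 output.

```python
def pcm_bitrate(sample_rate, bit_depth, channels):
    """Bitrate of uncompressed PCM audio in bit/s."""
    return sample_rate * bit_depth * channels


def size_in_megabytes(bitrate, seconds):
    """Approximate payload size in MB (10**6 bytes), ignoring headers."""
    return bitrate * seconds / 8 / 1_000_000


wav_rate = pcm_bitrate(44100, 16, 2)  # assumed 16-bit stereo PCM
mp3_rate = 160_000                    # bitrate shown in the Task 2.1 output

print(f"WAV: {wav_rate} bit/s, MP3: {mp3_rate} bit/s")
print(f"ratio: {wav_rate / mp3_rate:.2f}")
print(f"3 minutes as WAV: {size_in_megabytes(wav_rate, 180):.1f} MB")
print(f"3 minutes as MP3: {size_in_megabytes(mp3_rate, 180):.1f} MB")
```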

- Describe the physical appearance of sound and how it is converted to digitally sampled audio. Explain how sampling works and the meaning of the terms amplitude, sampling frequency, and quantization.

Physically, sound is a pressure wave that propagates through a medium such as air. It can be treated as a one-dimensional signal: the sound pressure (amplitude) changes over time, with time as the only independent variable. Both time and pressure are continuous, so the signal can take infinitely many values and must be converted to discrete values before it can be stored as a series of bits. Digitization therefore discretizes both dimensions: time ("sampling") and amplitude ("quantization").

Sampling means measuring the signal at equally spaced points in time.

Amplitude is the magnitude of the pressure deviation and determines the intensity of the sound; sound levels are commonly expressed in decibels.

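Sampling frequency is the rate at which samples are taken, i.e., the number of samples measured per second (in Hz). To capture a frequency component correctly, the sampling frequency must be more than twice that frequency (Nyquist-Shannon sampling theorem).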

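Quantization means discretizing the signal in the amplitude dimension: each sample is mapped to the nearest of a finite number of levels, determined by the bit depth (e.g., 16 bits give 65,536 levels).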
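
The two steps can be sketched in a few lines of Python. The example samples a 1 kHz sine tone at 8 kHz for one millisecond and quantizes the samples to 8-bit signed integer codes; the tone, rates, and function names are arbitrary choices for illustration.

```python
import math


def sample_sine(frequency, sample_rate, duration, amplitude=1.0):
    """Measure a sine wave at equally spaced points in time (sampling)."""
    n_samples = round(sample_rate * duration)
    return [amplitude * math.sin(2 * math.pi * frequency * n / sample_rate)
            for n in range(n_samples)]


def quantize(samples, bits):
    """Map each sample in [-1, 1] to the nearest signed integer code (quantization)."""
    max_code = 2 ** (bits - 1) - 1
    return [round(s * max_code) for s in samples]


samples = sample_sine(frequency=1000, sample_rate=8000, duration=0.001)
codes = quantize(samples, bits=8)
print(codes)
```

Each code is what a PCM stream would store for the corresponding sample; a higher bit depth gives more levels and therefore a smaller rounding error per sample.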

In [11]:
!pip freeze > requirements.txt