# DT2470 Music Informatics project
Alice Anselmi | Simone Clemente | Stefano Scolari
### Task and purpose of the project
The goal of this project is to create an automatic mixer based on key and beat detection. <br>
The core idea is to start from a series of segments and mix them in the best possible way considering both BPM and key similarity.
### Dataset and methods
We worked with annotated datasets in order to have ground truth for beat and key (for instance: https://zenodo.org/record/3967852). The dataset consists of a series of tracks in *wav* format, each annotated with its key and BPM as well as other information about the track. <br>
The data was filtered to select only the tracks that were classified as *techno* and *house* music.
### Musical lineup creation
Once we have estimated the key and BPM of each track, we order the tracks to create an actual lineup. <br>The basic idea behind this process is to find tracks which are similar to each other and place them next to one another in order to have a seamless passage.
To define similarity we can consider a 2D plane with key and BPM as axes: our target is to find the closest neighbors of the current track. Since combining key distance and BPM distance into a single measure is not straightforward, we decided to use an approach derived from classical mixing theory.
##### Key similarity
To measure the similarity between keys we decided to apply the Camelot Wheel rule. The wheel is composed of two circles, the outer representing major keys and the inner representing minor keys. It is often used for mixing since it indicates which keys work well together: each key sounds good with its two neighbors in the same circle and with the corresponding element of the other circle. In this way, once we select a track, we can divide the 2D plane into horizontal segments and keep only those whose key is admissible under the Camelot rule.
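
The Camelot rule described above can be sketched in a few lines. This is a minimal illustration and not the code used in the project: the function names are hypothetical, key labels are assumed to look like `A major` or `G minor` with sharps for accidentals, and treating a key as compatible with itself is our own assumption.

```python
# Pitch classes spelled with sharps (assumed label format: "<tonic> <mode>")
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def camelot_code(key: str) -> tuple[int, str]:
    """Map a key label such as 'A major' to its Camelot code, e.g. (11, 'B')."""
    tonic, mode = key.split()
    pitch = NOTES.index(tonic)
    if mode == "minor":
        # A minor key shares its number with its relative major (3 semitones up)
        pitch = (pitch + 3) % 12
        letter = "A"  # inner circle: minor keys
    else:
        letter = "B"  # outer circle: major keys
    # Moving up a fifth (7 semitones) advances one step on the wheel; C major is 8B
    number = (pitch * 7 + 7) % 12 + 1
    return number, letter


def compatible_codes(code: tuple[int, str]) -> set[tuple[int, str]]:
    """Return the code itself, its two circle neighbors and its twin on the other circle."""
    number, letter = code
    other = "A" if letter == "B" else "B"
    return {
        (number, letter),
        (number % 12 + 1, letter),        # next position on the same circle
        ((number - 2) % 12 + 1, letter),  # previous position on the same circle
        (number, other),                  # same position on the other circle
    }


def are_compatible(key_a: str, key_b: str) -> bool:
    return camelot_code(key_b) in compatible_codes(camelot_code(key_a))
```

With these definitions, `are_compatible("A minor", "C major")` holds because both keys sit at position 8 of the wheel, while `A major` (11B) and `G minor` (6A) are not admissible neighbors.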
##### BPM similarity
Since we used the key to segment the plane, the nearest-neighbor search can now be done in 1D, using simply the BPM difference between two tracks as the similarity measure. DJs usually tend to increase the BPM of the mix over time: to mimic this behaviour we added a constraint on the tempo and only consider tracks with a BPM higher than or equal to the current one. The next piece is therefore the track whose BPM is closest to the current one, considering only tracks from the selected group of keys with an equal or faster tempo.
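
The selection rule above (admissible key, equal or faster tempo, smallest BPM difference) could look roughly like the following. It is a sketch under assumptions: the `Track` class and function names are hypothetical, keys are assumed to be already converted to Camelot codes, and stopping when no admissible track is left is our own choice since the text does not specify that case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Track:
    name: str
    bpm: float
    camelot: tuple  # (number 1-12, "A" for minor / "B" for major)


def keys_match(a: tuple, b: tuple) -> bool:
    """Camelot rule: same position (either circle) or adjacent on the same circle."""
    (num_a, let_a), (num_b, let_b) = a, b
    if num_a == num_b:
        return True
    return let_a == let_b and (num_a - num_b) % 12 in (1, 11)


def next_track(current: Track, candidates: List[Track]) -> Optional[Track]:
    """Pick the admissible candidate with the smallest non-negative BPM increase."""
    admissible = [
        t for t in candidates
        if keys_match(current.camelot, t.camelot) and t.bpm >= current.bpm
    ]
    if not admissible:
        return None
    return min(admissible, key=lambda t: t.bpm - current.bpm)


def build_lineup(start: Track, pool: List[Track], length: int) -> List[Track]:
    """Greedily chain tracks; stops early if nothing admissible remains (assumption)."""
    lineup = [start]
    remaining = [t for t in pool if t is not start]
    while len(lineup) < length:
        chosen = next_track(lineup[-1], remaining)
        if chosen is None:
            break
        lineup.append(chosen)
        remaining.remove(chosen)
    return lineup
```

Because the search is greedy, the resulting lineup depends on the starting track, and a lineup can end up shorter than requested when the key and tempo constraints leave no candidate.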

We are working with two features so the first idea was to apply a simple 2-dimensional nearest neighbor search to find the closest [TODO]


##### Input directory selection

```python
import os

# Insert the tracks you want to mix into the input folder
input_path = "input_folder"
```

##### Track features extraction

```python
import os

from tracks import TrackFeatures

track_files_list = os.listdir(input_path)

# Key extraction modes: "determ" for deterministic, "nn" for neural network
key_mode = "determ"
# BPM extraction modes: "dynamic" for dynamic, "nn" for neural network
bpm_mode = "dynamic"

# Extract the key and BPM of every track in the input folder
track_list = []
for track_file in track_files_list:
    file_path = os.path.join(input_path, track_file)
    track = TrackFeatures(file_path, key_mode, bpm_mode)
    track.extractFeatures()
    track_list.append(track)
```

```
TRACK: input_folder/Bitter Sweet (mp3cut.net).wav -> BPM:129.19921875 KEY:A major
TRACK: input_folder/Inside Out (mp3cut.net).wav -> BPM:129.19921875 KEY:G minor
TRACK: input_folder/MOI004 A (mp3cut.net).wav -> BPM:83.35433467741936 KEY:A minor
```


##### Musical lineup creation

```python
from nn_procedure import NNMixing

track_selection = NNMixing(track_list=track_list, number_of_tracks=5)
track_lineup = track_selection.createMix()
```

```
ValueError: 'A major' is not in list
```

##### Mix creation

```python
from mixer import Mixer

mixer = Mixer(track_lineup)
mixer.createMix()
```

### Accuracy of key and bpm estimation
To evaluate the key and BPM estimation we decided to use the *accuracy* and *mean absolute error* metrics. <br>
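
For reference, the two metrics can be computed as in the sketch below. We assume here that accuracy is the natural fit for the categorical key labels and mean absolute error for the numeric BPM values; the function names are illustrative and are not taken from the project code.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of tracks whose predicted label equals the annotated one."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average absolute difference between predicted and annotated values."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)
```

Note that plain accuracy treats every key error equally, so a prediction of the relative minor counts as wrong in the same way as a completely unrelated key.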

[TODO add plots/values]

### Mix creation 
The most important aspect of mixing two tracks is beatmatching. 'Beatmatching', or 'pitch cue', is a disc jockey technique of pitch shifting or time stretching an upcoming track to match its tempo to that of the currently playing track, and adjusting the two so that their beats are synchronized. As previously mentioned, we thought of the output playlist in terms of a 'crescendo', meaning that we favoured a track lineup in which the tempo follows an increasing BPM (Beats Per Minute) order.

Another basic technique, commonly used for Techno and House tracks, is to 'filter out' the low frequencies of the entering track. Low frequencies can be sub-bass, kick drums and bass elements, which are not considered to introduce new musical elements and would result in a 'clashing effect' if introduced at full force on top of the low-frequency elements of the track already playing. Introducing the high- and mid-frequency elements together with the ones already being played, on the other hand, is a classic approach. Once the track has been introduced with its high and mid frequencies, it is common to then 'swap the bass'. This can be done gradually over a phrase or by abruptly swapping the bass lines at an appropriate moment: for example when a new phrase is introduced, or whenever the DJ 'feels' it is most appropriate. It is not a science.

With that said, the following are the **main steps** of our approach to achieve a result similar to what a simple mix between two tracks would look (or sound) like:
- Define a transition period, indicating how long the active overlaying of the two tracks will last.
- Gradually increase the tempo of the track which needs to be mixed out to match the tempo of the next track in the playlist. No library offered built-in functions to handle a gradual tempo increase, so we had to adopt a **mix** of pydub and ffmpeg to achieve the desired result. The trick was to subdivide the track into a large number of subsegments and apply a tempo increase by a calculated factor to each subsegment, so as to reach the desired tempo by the desired frame of the track. This approach may sometimes introduce a certain degree of noise.
- The track that needs to be mixed out reaches the desired tempo before the start of the transition section between the two tracks.
- Identify the first beat within the transition section of the exiting track, that is, the frame of the song where this first beat takes place.
- Identify the first beat within the transition section of the entering track.
- Overlay the entering track, starting from its first beat, onto the exiting track at the previously identified first beat of the transition section.
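
The per-subsegment tempo factors mentioned in the tempo step above can be computed as in this sketch. It assumes a linear BPM ramp across the subsegments, which is only one possible schedule and not necessarily the one used in the project; the function name is hypothetical. Each factor would then be handed to a time-stretching tool (for example ffmpeg's `atempo` filter) for the corresponding subsegment.

```python
from typing import List


def tempo_ramp_factors(start_bpm: float, target_bpm: float, n_segments: int) -> List[float]:
    """Speed factor for each subsegment so the tempo moves linearly
    from start_bpm (first subsegment) to target_bpm (last subsegment)."""
    if n_segments < 1 or start_bpm <= 0 or target_bpm <= 0:
        raise ValueError("need at least one segment and positive tempos")
    if n_segments == 1:
        return [target_bpm / start_bpm]
    factors = []
    for i in range(n_segments):
        progress = i / (n_segments - 1)  # 0.0 on the first segment, 1.0 on the last
        bpm = start_bpm + progress * (target_bpm - start_bpm)
        factors.append(bpm / start_bpm)
    return factors
```

Using many short subsegments keeps each individual tempo step small, at the price of more segment boundaries where artifacts can appear.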

<p align="center">
    <img src="imgs/Beatsync.jpg" alt="Image" width="900"/>
</p>
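
The beat alignment pictured above boils down to finding one beat position in each track. The sketch below assumes that beat times (in seconds, sorted) have already been produced by a beat tracker and that the transition section of the entering track starts at its beginning; the function names and the `entering_start` parameter are hypothetical.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def first_beat_at_or_after(beat_times: Sequence[float], t: float) -> float:
    """Return the first beat time that is not earlier than t."""
    i = bisect_left(beat_times, t)
    if i == len(beat_times):
        raise ValueError("no beat found at or after the requested time")
    return beat_times[i]


def overlay_positions(
    exiting_beats: Sequence[float],
    entering_beats: Sequence[float],
    transition_start: float,
    entering_start: float = 0.0,
) -> Tuple[float, float]:
    """Return (position in the exiting track where the overlay begins,
    offset in the entering track from which it is played)."""
    exit_beat = first_beat_at_or_after(exiting_beats, transition_start)
    enter_beat = first_beat_at_or_after(entering_beats, entering_start)
    return exit_beat, enter_beat
```

With pydub, which works in milliseconds, the two values could then be used along the lines of `exiting.overlay(entering[enter_ms:], position=exit_ms)`. Since both tracks share the same tempo at this point, aligning one pair of beats keeps the following beats aligned as well.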

- Apply a time-varying high-pass filter to the entering track. In this case the cutoff frequency, below which frequencies are attenuated, is lowered over time to gradually introduce the low-frequency elements. This posed an interesting challenge, as no library implemented filtering functions whose intensity changes over time, so we had to take the high-pass filter from the Pydub library and modify it for our needs.
- Apply a time-varying high-pass filter to the exiting track. In this case the cutoff frequency is raised over time to gradually remove the low-frequency elements.
- Apply a gradual volume fade-out effect to the exiting track.
- Apply a gradual volume fade-in effect to the entering track.
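
To illustrate the time-varying filtering steps above, here is a minimal sketch of a first-order RC high-pass filter whose cutoff is swept sample by sample. It is the textbook one-pole form written in plain Python for clarity; whether it matches our modified Pydub filter in detail is not implied, and the geometric sweep and the function name are assumptions made for this example.

```python
import math
from typing import List, Sequence


def sweeping_high_pass(
    samples: Sequence[float],
    sample_rate: int,
    start_cutoff_hz: float,
    end_cutoff_hz: float,
) -> List[float]:
    """One-pole high-pass filter whose cutoff moves geometrically
    from start_cutoff_hz (first sample) to end_cutoff_hz (last sample)."""
    n = len(samples)
    if n == 0:
        return []
    dt = 1.0 / sample_rate
    ratio = end_cutoff_hz / start_cutoff_hz
    out = [float(samples[0])]
    for i in range(1, n):
        # Cutoff for this sample (both cutoffs must be positive)
        cutoff = start_cutoff_hz * ratio ** (i / (n - 1))
        rc = 1.0 / (2.0 * math.pi * cutoff)
        alpha = rc / (rc + dt)
        # Standard RC high-pass recurrence with a per-sample coefficient
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out
```

For the entering track the sweep goes from a high cutoff to a low one, so the bass appears gradually; swapping the two cutoff arguments gives the behaviour needed for the exiting track. A per-sample Python loop is slow on full-length tracks, so a real implementation would likely process blocks or use vectorised code.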

Repeat the process for the subsequent tracks. In our case, we created a non-mixed section, subject only to a gradual tempo change (if required), and a mixed section for each pair of tracks.
This way the full mix is obtained by **"simply"** appending non-mixed sections and mixed sections.

Imagine having 3 tracks in the playlist which need to be mixed: the full mix of these tracks can be broken down in the following way:
<p align="center">
    <img src="imgs/Mixed.jpg" alt="Image" width="600"/>
</p>


### Limitations and possible improvements
What we created is a very basic simplification of what a DJ actually does when mixing two tracks together. We omitted HOW and WHICH parts of the two tracks should be selected for the mix. DJs usually use what is called phrase matching. To put it simply: a phrase is a collection of bars (a bar is simply a group of beats, in dance music usually 4), usually 8 or 16, which makes up a distinct section of a track.
As each track has its own 'story', phrases have a 'meaning' within the story which is the whole track. By phrase matching, DJs make sure that the mix aligns not only the single beats but whole phrases of the two tracks being mixed, or at least attempts to align their starts.
In our work this factor is not considered, and tracks are mixed by simply considering their ending and starting parts, which may lead to worse mixes. Future work could include automatic ways of identifying such phrases in the tracks and improving the mix by aligning them, on top of the beatmatching.
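
As a starting point for such future work, candidate phrase boundaries can be derived directly from the detected beats. The sketch below is deliberately naive: it assumes that the first detected beat is the first beat of a phrase and that the phrase length is constant, neither of which is guaranteed for real tracks. The function name and defaults are illustrative.

```python
from typing import List, Sequence


def phrase_starts(
    beat_times: Sequence[float],
    beats_per_bar: int = 4,
    bars_per_phrase: int = 8,
) -> List[float]:
    """Return the time of every beat that opens a phrase,
    assuming the first beat in beat_times starts a phrase."""
    beats_per_phrase = beats_per_bar * bars_per_phrase
    return list(beat_times[::beats_per_phrase])
```

A phrase-aware mixer could then choose the transition start among these candidate times in both tracks instead of using an arbitrary beat.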

Moreover, the gradual tempo increase does not always work well. Tempo increases usually need to be 'hidden' by the DJ, either by increasing the tempo at random or unexpected moments (if the increase should not be noticeable) or, as we did, via a gradual increase.

Furthermore, for simplicity, the fade-out and fade-in effects are applied on top of the time-varying high-pass filter, which can lead to a decrease in energy in these parts. Ideally, the fade-in effect should be applied to the entering track before applying any kind of fade-out effect to the exiting track. By doing this, the mix really creates a new part of the song. After introducing the high- and mid-frequency elements, a following step could be to gradually introduce the low frequencies of the entering track up to 3/4, then bring the low frequencies of the exiting track down to 3/4, and only then gradually remove the low-frequency elements of the exiting track and bring the low-frequency elements of the entering track to full force. At this point it is only a matter of finding the right moment to fade out the mid and high frequencies of the exiting track, and the mix is finally complete.

This is only one approach to mixing, which is obviously not universal but can often be used as a 'standard' for Techno and House tracks.

Due to time constraints, we were not able to implement a mixer capable of accounting for all these details, but it would not be too hard to add them given enough time.