# 475 Project - Creating a Trailer for a Musical Album


## Introduction

Using key detection, tempo matching, and musical structure analysis, we will build software that takes an album as input and returns a single cohesive track assembled from a segment of each song on that album. The software analyzes each song to locate its verse and chorus sections and picks the portion to include in the output. The chosen segments are then stitched together using one of several possible methods: fading in and out, downbeat repetition, or hard cuts. The ordering of the segments is yet to be determined; it may follow the album tracklist, or it may be whichever order gives the smoothest transitions between segments. To make those transitions smooth, the tempo and key of each segment must be detected so that they can be adjusted during the transition.

When exploring a new artist, many listeners play the singles released ahead of an album to decide whether the full album is worth their time. With the rising popularity of Spotify's recommendation algorithms and curated playlists, it is easy to see the demand for a track that gives the listener a short preview of an album without requiring them to listen to it front to back.

### Importing dependencies

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import IPython.display as ipd
import scipy.io.wavfile as wav
import os
import glob
import librosa
import pychorus as pc
import vamp

### Initialization

#### Manual Variable Entry & Directory Location

In [2]:
path = "albums/R&B/Malibu"
new_file_name = "AlbumTrailerMalibu.wav" #wav file

# how many beats the segment lasts after segment start time
num_beats = 16

# would you like to timestretch files to master tempo?
timestretch_segs = False

# would you like to reorder tracks according to tempo? Default: Slow to Fast
reorder_tracks = True

# order them from Fast To Slow instead
fast_to_slow = False

#Fade and delay times between files (in seconds)
cross_fade = 2
init_fade = 1 
final_fade = 2
delay = 1 # should be less than cross_fade


#### Variable Initialization

In [3]:
#General
srates = []
audio_files = []
srate = 0

#Chorus/Beat Detection
hopSize = 512
song_data = {}
beats = []

#Splitting and Crossfades
aud = []
trailer = []
prevl = 0

#### File Initialization

In [4]:
# Load files
for i, filename in enumerate(glob.glob(os.path.join(path, '*.wav'))):
    # librosa.load returns mono float32 samples already scaled to [-1.0, 1.0],
    # resampled to 22050 Hz by default
    data, s = librosa.load(filename)
    srates.append(s)
    audio_files.append(data)
    # Normalize so the peak absolute amplitude is 0.9 (skip silent files)
    peak = np.max(np.abs(audio_files[i]))
    if peak > 0:
        audio_files[i] = (0.9 / peak) * audio_files[i]
    
    # Handle when files have different sampling rates
    if (i != 0):
        if (srates[i] != srates[i-1]):
            raise ValueError("Mismatched sampling rates between file {} and {}".format(i-1, i))
            
    # If all srates are the same, simplify to single value variable
    srate = srates[i]


# Use the same glob call as the loading loop so titles line up with audio_files
titles = [os.path.basename(f) for f in glob.glob(os.path.join(path, '*.wav'))]
num_files = len(audio_files)

### Beat Tracking & Chorus Detection

The cell below builds a dictionary, `song_data`, that holds the data for each song, keyed by track index. For now each entry contains:

- `chorus_starttime`: timestamp of the start of a chorus, in seconds
- `tempo`: tempo in BPM
- `length`: length of the song, in seconds
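
As a rough illustration of this structure, the sketch below builds one `song_data` entry by hand and folds a raw tempo estimate into the 60 to 140 BPM range with the same two checks the following cell applies. The helper name `fold_tempo` and all of the numbers are made up for illustration; they are not part of the notebook.

```python
def fold_tempo(tempo):
    """Double tempos under 60 BPM and halve tempos over 140 BPM (one pass each)."""
    if tempo < 60:
        tempo = tempo * 2
    if tempo > 140:
        tempo = tempo / 2
    return tempo

# Hypothetical values for a single track at index 0
song_data = {
    0: {
        'length': 215.4,              # seconds
        'tempo': fold_tempo(172.0),   # BPM after folding
        'chorus_starttime': 48.2,     # seconds
    }
}

print(song_data[0]['tempo'])
```

Folding keeps likely half-time and double-time estimates comparable across tracks, which matters later when segments are ordered or stretched by tempo.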

In [5]:

for i, filename in enumerate(glob.glob(os.path.join(path, '*.wav'))):
    #finds the chorus_start
    chroma, y, sr, sec = pc.create_chroma(filename)
    chorus_start = pc.find_chorus(chroma, sr, sec, 10)
    if not isinstance(chorus_start, float):
        # pychorus found no chorus, so fall back to Segmentino's structural segmentation
        segments = vamp.process_audio(y, sr, "segmentino:segmentino")
        # Get labels and timestamps for each section of this song only
        # (rebuilt per song so earlier songs do not skew the label counts)
        pairs = [(s['label'], s['timestamp']) for s in segments]
        
        # Get list of just labels (for counting purposes)
        labels = [p[0] for p in pairs]
            
        # Get most common label
        m = max(set(labels), key=labels.count)
        
        # Get chorus start (first occurrence of the most common label)
        chorus_start = pairs[labels.index(m)][1]
        
    # finds the tempo and song length
    tempo, beat_times = librosa.beat.beat_track(y=audio_files[i], sr=srate, hop_length=hopSize, start_bpm=70, units='time')
    if (tempo < 60):
        tempo = tempo*2
    if (tempo > 140):
        tempo = tempo/2
    song_length = len(audio_files[i])/srate
    # beat times are an array of times that beat tracker has located
    beats.append(beat_times)
    
    song_data[i] = {
        'length': song_length,
        'tempo' : tempo,
        'chorus_starttime': chorus_start
    }        

No choruses were detected. Try a smaller search duration
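
When pychorus reports that no chorus was found, as in the output above, the cell falls back to treating the most frequent Segmentino label as the chorus. The sketch below shows only that selection step, using made-up `(label, timestamp)` pairs in place of real plugin output. It uses `collections.Counter` rather than the notebook's `max(set(labels), key=labels.count)`; both pick the most frequent label.

```python
from collections import Counter

# Hypothetical (label, start time in seconds) pairs standing in for segmenter output
pairs = [('A', 0.0), ('B', 21.5), ('A', 45.0), ('B', 66.5), ('C', 90.0), ('B', 112.0)]

labels = [label for label, _ in pairs]

# The label that occurs most often is assumed to be the chorus
most_common_label, count = Counter(labels).most_common(1)[0]

# Use the first occurrence of that label as the chorus start
chorus_start = next(t for label, t in pairs if label == most_common_label)

print(most_common_label, count, chorus_start)
```

This is a heuristic: it assumes the chorus repeats more often than any other section, which will not hold for every song.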


In [6]:
# check if tempos have been registered
tempos_exist = all(song_data[i].get('tempo') != 0 for i in range(num_files))

# without tempos, neither timestretching nor tempo ordering is possible
if not tempos_exist:
    timestretch_segs = False
    reorder_tracks = False

### Splitting at Chorus, Crossfading, and Merging

#### Defining Functions

In [8]:
#converts a time in seconds to a sample index
def sec_convert(seconds):
    samples = seconds*srate
    return int(round(samples))

In [9]:
# finds the first beat within 3 seconds after chorus_start, and the beat num_beats after it
# if first_beat + num_beats is past the last beat, choose the very last one
# returns both as times in seconds (conversion to samples happens in chorus())
def choose_first_last_beat(timestamp, num_beats, songindex):

    for i, x in enumerate(beats[songindex]):
        if (timestamp+3 > x > timestamp):
            try:
                # the beat num_beats later marks the end of the last of num_beats
                last_beat = beats[songindex][i+num_beats]
            except IndexError:
                last_beat = beats[songindex][-1]

            return x, last_beat
    
    #found no suitable beats so choose a time duration instead
    #(1.5 seconds per beat, i.e. 24 seconds for 16 beats)
    last_beat = timestamp+(num_beats*1.5)
        
    return timestamp, last_beat
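
To make the beat selection concrete, here is a self-contained sketch of the same idea run on a synthetic beat grid. The function name `pick_beats`, its `window` parameter, and the 120 BPM grid are assumptions made for this example; the notebook's version reads from the global `beats` list instead of taking the beat times as an argument.

```python
def pick_beats(beat_times, timestamp, num_beats, window=3.0):
    """Return (start, end) in seconds for a segment of num_beats beats."""
    for i, t in enumerate(beat_times):
        if timestamp < t < timestamp + window:
            # Clamp to the final beat if the song ends first
            end_index = min(i + num_beats, len(beat_times) - 1)
            return t, beat_times[end_index]
    # No beat found near the timestamp: fall back to a fixed 1.5 s per beat
    return timestamp, timestamp + num_beats * 1.5

# Synthetic beat grid at 120 BPM (one beat every 0.5 s) for a 60 s song
beat_times = [k * 0.5 for k in range(120)]

start, end = pick_beats(beat_times, timestamp=10.2, num_beats=16)
print(start, end)
```

Starting on a detected beat rather than on the raw chorus timestamp keeps each segment aligned to the song's pulse, which helps the later crossfades land on the beat.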


In [10]:
#Find Chorus, takes song_data dict s, int i, signal array x, and int num_beats b
def chorus(s, i, x, b):
    start_time = s[i].get('chorus_starttime')
    first_beat, last_beat = choose_first_last_beat(start_time, b, i)
    first_sample = int(sec_convert(first_beat))
    # slicing never raises IndexError, so clamp the end index to the signal length explicitly
    last_sample = min(int(sec_convert(last_beat)), len(x))
    c = x[first_sample:last_sample]
    
    return c
    

In [11]:
#CrossFading Function, takes signal arrays s1, s2, and fadetime f and delay time d in samples
def crossfade(s1,s2,f,d):
    
    #Add padded 0's to delay overlap of signals
    a = np.pad(s1[(len(s1)-f):], (0, d), 'constant', constant_values=(0, 0))
    b = np.pad(s2[:f], (d, 0), 'constant', constant_values=(0, 0))
    
    #Fade out s1 and fade in s2 with linear ramps over the fade duration
    ramp = np.arange(f)/f
    a[:f] = a[:f]*(1-ramp) #Decreases from 1 toward 0 over fade duration
    b[d:] = b[d:]*((np.arange(f)+1)/f) #Increases from 1/f to 1 over fade duration
        
    a = a + b #Overlap both faded signals
    c = np.concatenate((s1[:len(s1)-f],a,s2[f:]), axis=0)
   
    #For testing
    #ipd.display(ipd.Audio(c,rate=srate))
    return c
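
The effect of the fade length and the delay is easier to see on a toy signal. The sketch below is a standalone linear crossfade written for this example (the name `linear_crossfade` is not from the notebook) and applied to two constant signals, so the printed values show the gain through the overlap directly. It assumes `f` is positive and no longer than either signal.

```python
import numpy as np

def linear_crossfade(s1, s2, f, d):
    """Fade s1 out over f samples while s2 fades in, with s2 delayed by d samples."""
    tail = np.pad(s1[len(s1) - f:], (0, d)).astype(float)
    head = np.pad(s2[:f], (d, 0)).astype(float)
    tail[:f] *= 1 - np.arange(f) / f        # gain falls from 1 toward 0
    head[d:] *= (np.arange(f) + 1) / f      # gain rises to 1
    return np.concatenate((s1[:len(s1) - f], tail + head, s2[f:]))

s1 = np.ones(10)
s2 = np.ones(10)
out = linear_crossfade(s1, s2, f=4, d=2)

print(len(out))     # len(s1) + len(s2) - f + d
print(out[6:12])    # summed gain across the overlap region
```

Because the incoming signal is delayed by `d` samples, the two ramps do not sum to a constant gain; the level dips in the middle of the overlap. A longer delay deepens that dip, which is one reason to keep `delay` shorter than `cross_fade`.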

In [12]:
#Determines if Fade length is longer than half the duration of the segment, 
#and corrects to .75*l/2 if it is too long
def det_fade (f, l):
    if (f >= (l/2)):
        newf = int(0.75 * (l/2))
    else:
        newf = f   
    return newf
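
A quick worked example of the fade-length rule, using the notebook's 2 second crossfade. The function is restated compactly so the snippet runs on its own, and the segment lengths are made up. The sample rate of 22050 Hz is librosa's default.

```python
def det_fade(f, l):
    # Same rule as above: cap the fade at 75% of half the segment length
    return int(0.75 * (l / 2)) if f >= l / 2 else f

srate = 22050                    # librosa's default sample rate
two_second_fade = 2 * srate      # 44100 samples

long_segment = det_fade(two_second_fade, 10 * srate)   # 10 s segment: fade fits
short_segment = det_fade(two_second_fade, 3 * srate)   # 3 s segment: fade is shortened

print(long_segment, short_segment)
```

Capping the fade below half the segment length leaves room for both a fade in and a fade out on the same segment without the two overlapping.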

#### Main Segmenting/Fading Code

In [13]:
#creates a list of tempos for the tracks and a mean tempo
if(timestretch_segs == True or reorder_tracks == True):
    tempo_list = []
    for i in range(num_files):
        tempo_list.append(song_data[i].get('tempo'))

    mean_tempo = np.mean(tempo_list)
    #print(mean_tempo)

In [14]:
#finds the scaling value for the timestretch function
def timestretch_value(mean_tempo, tempo):
    value = mean_tempo/tempo
    return value
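
The scaling value is the ratio of the target tempo to the track's own tempo, so a value above 1 speeds a segment up and a value below 1 slows it down. The sketch below works through the arithmetic with made-up tempos and an 8 second segment; it does not call librosa.

```python
def timestretch_value(mean_tempo, tempo):
    return mean_tempo / tempo

tempos = [80.0, 100.0, 120.0]             # hypothetical BPM values
mean_tempo = sum(tempos) / len(tempos)    # target tempo for every segment

for tempo in tempos:
    rate = timestretch_value(mean_tempo, tempo)
    # A rate above 1 speeds the segment up, so its duration shrinks by 1/rate
    duration = 8.0 / rate                 # new length of an 8 s segment
    print(tempo, rate, duration)
```

Stretching toward the mean keeps the largest adjustment smaller than stretching every track to the tempo of one particular song would.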

In [None]:
#stretches the segments to the mean tempo and places them in an array
#(only needed when timestretching is enabled)
if(timestretch_segs == True):
    timestretched_segments = []
    for i, x in enumerate(audio_files):
        audio = chorus(song_data, i, x, num_beats)
        stretch_value = timestretch_value(mean_tempo, song_data[i].get('tempo'))
        timestretched_segments.append(librosa.effects.time_stretch(y=audio, rate=stretch_value))

In [None]:
# reorders segments according to their tempo by chosen ordering
if(timestretch_segs == True or reorder_tracks == True):
    # track indices sorted by tempo (slow to fast unless fast_to_slow is set);
    # sorting indices leaves tempo_list itself unchanged
    order = sorted(range(num_files), key=lambda k: tempo_list[k], reverse=fast_to_slow)
    reordered_segments = []
    for index in order:
        # take the chorus from the same track whose tempo was selected
        segment = chorus(song_data, index, audio_files[index], num_beats)
        reordered_segments.append(segment)
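
To see what ordering by tempo produces, the sketch below sorts track indices by tempo on a made-up list. Sorting the indices, rather than the tempos themselves, keeps each position tied to its original track, so the matching audio and chorus time can be looked up afterwards. All values here are hypothetical.

```python
tempo_list = [96.0, 72.0, 128.0, 84.0]   # hypothetical BPM values, in tracklist order
fast_to_slow = False

# Track indices ordered by tempo; reverse=True gives fast to slow
order = sorted(range(len(tempo_list)), key=lambda k: tempo_list[k], reverse=fast_to_slow)

print(order)
print([tempo_list[k] for k in order])
```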

In [None]:
#Segmenting Trailer and adding Fade in/out and Crossfades
crsf = sec_convert(cross_fade)
inif = sec_convert(init_fade)
finf = sec_convert(final_fade)
d = sec_convert(delay)

for i, x in enumerate(audio_files):
    #Pull Track Segment
    if timestretch_segs == True:
        aud = timestretched_segments[i]
    elif reorder_tracks == True:
        aud = reordered_segments[i]
    else:
        aud = chorus(song_data, i, x, num_beats)
    l = len(aud)

    ### First Track ###
    if (i == 0):
        #print("First Track 1")
        #Determine if Initial Fade is too long
        f = det_fade(inif, l)
        
        #add fade in to first track
        for k in range(0, f):
            m = k /f
            aud[k] = aud[k]*m
        
        trailer = aud
        prevl = l 
    
    ### last Track ###
    elif (i == num_files-1):
        #print("Last Track %i" % (i+1))
        #Determine if Final Fade is too long
        f = det_fade(finf, l)
            
        #Determine if Crossfade is too long
        f2 = det_fade(crsf, l) #new segment
        f2 = det_fade(f2, prevl) #prev segment
        
        #add fade out to last track
        for k in range(l-f, l):
            m = (l - k)/f
            aud[k] = aud[k]*m
       
        trailer = crossfade(trailer,aud,f2,d)
        
    ### Middle Tracks ###
    else:
        #print("Track %i" % (i+1))
        #Determine if Crossfade is too long
        f = det_fade(crsf, l) #new segment
        f = det_fade(f, prevl) #prev segment
        
        trailer = crossfade(trailer,aud,f,d)
        prevl = l

#Convert data type back so that conversion to wav file works
trailer = np.float32(trailer)

#For Testing
#ipd.display(ipd.Audio(trailer,rate=srate))

### File Output

In [None]:
wav.write(new_file_name, int(srate), np.array(trailer))