<a id='top'></a> 

# Building the Predictive Model
In this notebook, we will build a Recurrent Neural Network (RNN) using a song's timbre from Spotify's audio analysis to cluster songs that sound alike, and use the results to train an unsupervised learning model that will return 'like' songs.<br>

*Unfortunately, while I was working on this portion of the project, Spotify shut down access to the audio analysis. I was able to acquire data for over 500,000 songs, but many of them were not useful for building the final product I desired, and it was not enough. By the time I figured out how to adjust my methodology to pull more relevant songs, the access had been cut off. As a result, this notebook serves to explain the build of the RNN and the result.*

---

## Prerequisites
[1) Setting Up the API Connection](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/01SettingUptheAPIConnection.ipynb)<br>
[2) Using the Spotify API](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/02UsingtheSpotifyAPI.ipynb)<br>
[3) Spotify Audio Data Insights](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/03SpotifyAudioDataInsights.ipynb)<br>

---

<a id='AcquiringTimbres'></a>

## Acquiring Timbres
Since the plan was to build a machine learning model, I needed to pull a large number of track timbres for analysis. This came with several challenges:
<h3>Rate Limiting</h3>
Spotify enforces strict rate limits on API calls. When I ran a large volume of searches or tried to retrieve bulk audio analyses, I was cut off from the API within minutes.<br>
I initially added a pause, limiting requests to only a couple per minute, but even at that rate I was still blocked after an hour or two.<br>
<br>
<b>Solution:</b> I set up multiple API keys. When the rate limit was exceeded on one key, another could take over.<br>
Even with this workaround, I still had to keep the request rate low, so retrieving track URIs and timbres became a days-long process.
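
A minimal sketch of the key-rotation idea, with no network calls involved. The `RateLimited` exception, the `call_with_rotation` helper, and the `make_request(key)` callable are hypothetical names used only for illustration; they are not taken from the original script, and a real client would typically detect a rate limit from an HTTP 429 response.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when a key has hit its rate limit."""


def call_with_rotation(make_request, keys, pause_seconds=0.0):
    """Try each key in turn and return the first successful result.

    make_request is a hypothetical callable that takes one key and either
    returns a result or raises RateLimited.
    """
    for key in keys:
        try:
            return make_request(key)
        except RateLimited:
            # This key is exhausted for now; pause, then try the next one.
            time.sleep(pause_seconds)
    raise RateLimited("every key is currently rate limited")


# Stand-in request function: "key_a" is treated as already rate limited.
exhausted = {"key_a"}


def fake_request(key):
    if key in exhausted:
        raise RateLimited(key)
    return f"ok via {key}"


result = call_with_rotation(fake_request, ["key_a", "key_b"])
print(result)
```

Keeping the request function separate from the rotation logic means the same loop can wrap both the search calls and the audio analysis calls.
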
<h3>Finding Track URIs</h3>
From <a href = "https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/03SpotifyAudioDataInsights.ipynb">Spotify Audio Data Insights</a> we know that a track URI is required to retrieve timbre data using the <code>audio_analysis</code> method. From <a href="https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/02UsingtheSpotifyAPI.ipynb">Using the Spotify API</a> we know that the <code>search</code> method can be used to retrieve track URIs. But what exactly should we feed into the search to generate URIs?<br>
<br>
<b>Solution:</b> A key line in  <a href="https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/02UsingtheSpotifyAPI.ipynb">Using the Spotify API</a> came in handy here:<br>
        <blockquote><i>The <code>search</code> function will always return the specified number of results... regardless of how good a
            match the result is.</i></blockquote>
I leveraged this “fuzzy matching” behavior by writing an algorithm that iterated through the alphabet, stringing together various letter combinations one at a time. This worked, but introduced another problem.
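
One way such a sequence of search terms could be produced is sketched below. The ordering (shortest terms first, alphabetical within each length) and the `search_terms` name are assumptions for illustration; the original `create_search_term()` function is not shown in this notebook.

```python
import itertools
import string


def search_terms(max_length=3):
    """Yield 'a', 'b', ..., 'z', 'aa', 'ab', ... up to max_length letters."""
    for length in range(1, max_length + 1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


# Peek at the first 30 terms without generating the whole sequence.
first_terms = list(itertools.islice(search_terms(), 30))
print(first_terms[:3], first_terms[25:30])
```

Because the generator is lazy, only the terms that are actually searched ever get built.
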
<h3>Picking Up Where I Left Off</h3>
If I was generating random letter combinations, how could I resume from the right place once the API cut me off due to rate limits?<br>
<br>
<b>Solution:</b> I built another algorithm that converted the last search sequence into an index (a number), then picked up from the next sequence when restarted.<br>
<b>Example:</b> If the last search term was "ab", that represented the 27th search. The next term ("ac") would then be search number 28.
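
The conversion between a search term and its position can be sketched as a base-26 style calculation. This sketch assumes zero-based numbering (so "a" is search 0), which is what makes "ab" come out as 27 and "ac" as 28, matching the example above. The helper names are hypothetical and the original `get_next_term()` may work differently.

```python
def term_to_index(term):
    """Map 'a' -> 0, 'z' -> 25, 'aa' -> 26, 'ab' -> 27, ..."""
    n = 0
    for ch in term:
        n = n * 26 + (ord(ch) - ord("a") + 1)
    return n - 1


def index_to_term(index):
    """Inverse of term_to_index."""
    n = index + 1
    letters = []
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("a") + remainder))
    return "".join(reversed(letters))


def next_term(last_term):
    """Return the search term that follows last_term in the sequence."""
    return index_to_term(term_to_index(last_term) + 1)


print(term_to_index("ab"), next_term("ab"), next_term("z"))
```

Storing only the last search term is enough to resume, since every later term can be derived from it.
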
<h3>Data Management</h3>
Timbre is recorded per segment of a song (a short, variable-length slice of audio), so the entire timbre profile for just one song is hundreds or thousands of arrays of length 12. That's a lot of data. Since my goal was to collect over a million track timbres, I needed an efficient way to store and manage all this data.<br>
<br>
<b>Solution:</b> I built a MySQL database in DBeaver, which made it much easier to organize and query the growing dataset.<br>

<a id='TimbrePipeline'></a>

## The Pipeline
Once all of these challenges were addressed, I constructed a pipeline that would:
<ol>
    <li>Rotate through API keys whenever a rate limit was reached.</li> 
    <li>Search for track URIs using the <code>search</code> method and my <code>create_search_term()</code> function.</li>
    <li>Store each retrieved <code>track_uri</code>, <code>track_name</code>, <code>artist_uri</code>, and the <code>search_term</code> used in the <b>tracks</b> table in the DB.</li>
    <li>Store each retrieved <code>artist_uri</code> and <code>artist_name</code> in the <b>artists</b> table in the DB.</li>
    <li>When the rate limit was met or more track URIs were required, use the last <code>search_term</code> from my <b>tracks</b>
        table and my <code>get_next_term()</code> function to generate the next search term and continue.</li>
    <li>Once a set number of track URIs were collected, call the <code>audio_analysis</code> method on those URIs. Store each <code>track_uri</code> and its <code>track_timbre</code> in the <b>timbres</b> table in the DB.</li>
</ol>
Sound simple? <b>It wasn’t!</b><br>
Along the way, I learned a lot about error handling, batching requests, and working around rate limitations. While I’m not uploading a notebook that explains this process in detail (since the approach no longer works), the Python script GetAppendTimbres.py can still be found in the repo.
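
To make the three tables concrete, here is a rough sketch of how they might be laid out. It uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for the MySQL database. The table and column names come from the pipeline steps and the query used later in this notebook; the column types, keys, and sample values are assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.executescript("""
    CREATE TABLE artists (
        artist_uri  TEXT PRIMARY KEY,
        artist_name TEXT
    );
    CREATE TABLE tracks (
        track_uri   TEXT PRIMARY KEY,
        track_name  TEXT,
        artist_uri  TEXT REFERENCES artists (artist_uri),
        search_term TEXT,
        popularity  INTEGER
    );
    CREATE TABLE timbres (
        track_uri    TEXT PRIMARY KEY REFERENCES tracks (track_uri),
        track_timbre TEXT  -- JSON-encoded list of 12-value vectors
    );
""")

# Made-up sample row: two segments of 12 timbre values each.
timbre = [[0.0] * 12, [1.5] * 12]
conn.execute("INSERT INTO artists VALUES (?, ?)", ("artist:demo", "Demo Artist"))
conn.execute(
    "INSERT INTO tracks VALUES (?, ?, ?, ?, ?)",
    ("track:demo", "Demo Track", "artist:demo", "ab", 50),
)
conn.execute(
    "INSERT INTO timbres VALUES (?, ?)", ("track:demo", json.dumps(timbre))
)

# The most recently stored search term is what a restart would resume from.
last_term = conn.execute(
    "SELECT search_term FROM tracks ORDER BY rowid DESC LIMIT 1"
).fetchone()[0]

# Timbres come back as JSON text and need decoding before use.
stored = json.loads(
    conn.execute("SELECT track_timbre FROM timbres").fetchone()[0]
)
print(last_term, len(stored), len(stored[0]))
```

Storing each timbre as one JSON value keeps a track's whole profile in a single row, at the cost of decoding it again when loading.
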

---

## Functions

### Get Timbre Dataframe

Load the acquired track timbres from my database. Only load tracks whose <code>popularity</code> is above <code>min_popularity</code> and whose timbre contains more than <code>min_segments</code> segments.

In [None]:
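import json
import time

import pandas as pd

# `engine` must already exist: a pandas-compatible connection
# (e.g. a SQLAlchemy engine) to the MySQL database.
def get_timbre_df(min_popularity, min_segments):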
    
    print("Building timbre DF from DB.")
    
    start_time = time.time()
    
    timbre_query = """SELECT
                            t.track_name,
                            a.artist_name,
                            t.popularity,
                            tim.track_timbre
                        FROM timbres tim
                        LEFT JOIN tracks t ON t.track_uri = tim.track_uri
                        LEFT JOIN artists a ON a.artist_uri = t.artist_uri
                        WHERE t.popularity > %s
                            AND JSON_LENGTH(tim.track_timbre) > %s"""
                      
    timbre_df = pd.read_sql(timbre_query, 
                            con = engine, 
                            params = (min_popularity, min_segments))
    
    print(f"Timbre DF loaded in {round(time.time() - start_time)} seconds.")

    print("Applying JSON loads.")
    
    start_time = time.time()
    
    # Parse each JSON-encoded timbre string into a list of 12-value vectors
    timbre_df['track_timbre'] = timbre_df.track_timbre.apply(json.loads)
    
    print(f"JSON load took {round(time.time() - start_time)} seconds to execute.")
    
    print(f"Final timbre data frame: {len(timbre_df)} rows.")

    return timbre_df

### Get Middle Portion

Because tracks have different lengths (numbers of segments), we have to make them all the same length. For this, I opted to just pull the middle portion of each track; tracks shorter than the target length are padded with zero vectors instead.

In [None]:
def get_middle_portion(vectors, target_length):
    
    length = len(vectors)
    
    if length >= target_length:
        start_idx = (length - target_length) // 2
        end_idx = start_idx + target_length
        return vectors[start_idx:end_idx]
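    else:
        # Too short: append zero vectors, each its own list rather than
        # repeated references to one shared list
        padding_needed = target_length - length
        return vectors + [[0] * 12 for _ in range(padding_needed)]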
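
A quick check of both branches on made-up data. The function is restated in compact form so the snippet runs on its own, and the tiny "tracks" here are invented for illustration.

```python
def get_middle_portion(vectors, target_length):
    # Restated from above so this snippet runs on its own.
    length = len(vectors)
    if length >= target_length:
        start_idx = (length - target_length) // 2
        return vectors[start_idx:start_idx + target_length]
    return vectors + [[0] * 12 for _ in range(target_length - length)]


# Made-up track of five segments; each vector is filled with its own index.
long_track = [[i] * 12 for i in range(5)]
cropped = get_middle_portion(long_track, 3)
kept = [segment[0] for segment in cropped]

# Made-up track of two segments, shorter than the target of four.
short_track = [[7] * 12 for _ in range(2)]
padded = get_middle_portion(short_track, 4)

print(kept)
print([segment[0] for segment in padded])
```

Note that padding is appended at the end only, so a short track keeps its original segments at the start and gains trailing zeros.
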

### Clean Timbre Dataframe

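Crop or pad every track with the <code>get_middle_portion()</code> function, flatten the timbres into a single column, scale all values together into the range [-1, 1], and reshape the result back to one <code>final_length</code> × 12 array per track.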

In [None]:
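import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler


def clean_timbres(timbres, final_length):
    
    # Crop or pad every track to final_length segments -> (n_tracks, final_length, 12)
    segments = np.array([get_middle_portion(seq, final_length) for seq in timbres])

    # Flatten to a single column so every value shares one global scale
    segments_scaled = StandardScaler().fit_transform(segments.reshape(-1, 1))
    
    # Squash the standardized values into [-1, 1]
    segments_scaled = MinMaxScaler(feature_range=(-1, 1)).fit_transform(segments_scaled)
    
    # Restore the (n_tracks, final_length, 12) shape
    segments_scaled = segments_scaled.reshape(len(timbres), final_length, 12)
    
    # Return one (final_length, 12) array per track
    return list(segments_scaled)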
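
One property of this two-step scaling is worth knowing: min-max scaling gives the same result whether or not the values were standardized first, because standardizing only shifts and rescales them. The sketch below checks this in plain Python on made-up values. `standardize` and `min_max` are hypothetical helpers that mirror the single-column formulas of `StandardScaler` and `MinMaxScaler`; they are not the scikit-learn calls themselves.

```python
from statistics import mean, pstdev


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values, low=-1.0, high=1.0):
    """Linearly map the smallest value to low and the largest to high."""
    vmin, vmax = min(values), max(values)
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]


values = [-120.0, -3.5, 0.0, 42.0, 250.0]  # made-up timbre-like values

two_step = min_max(standardize(values))
one_step = min_max(values)

largest_gap = max(abs(a - b) for a, b in zip(two_step, one_step))
print(largest_gap < 1e-9)
```

In other words, the final values depend only on the global minimum and maximum, so a few extreme timbre values can compress everything else toward the middle of the range.
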

### Build Autoencoder

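Here's what we've all been waiting for: <b>the brain</b>.<br>
This is an RNN autoencoder with millions of parameters that compresses each track down to just two numbers (the <code>latent_layer</code>) and then tries to reconstruct the original timbres from them.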

In [None]:
# Assumes the Keras API bundled with TensorFlow
from tensorflow.keras import layers, models


def build_autoencoder(num_segments):
    
    encoder_input = layers.Input(shape=(num_segments, 12),name='encoder_input')
    
    encode_conv1d_16 = layers.Conv1D(16,kernel_size=10,activation='relu',name='encode_conv1d_16',padding='same')(encoder_input)
    
    encode_conv1d_32 = layers.Conv1D(32,kernel_size=10, activation='relu',name='encode_conv1d_32',padding='same')(encode_conv1d_16)
    
    encode_conv1d_64 = layers.Conv1D(64,kernel_size=10,activation='relu',name='encode_conv1d_64',padding='same')(encode_conv1d_32)
    
    encode_lstm = layers.LSTM(128,activation='tanh',return_sequences=False, name='encode_lstm')(encode_conv1d_64)
    
    encode_dense_64 = layers.Dense(64,activation='relu',name='encode_dense_64')(encode_lstm)
    
    encode_dense_32 = layers.Dense(32,activation='relu',name='encode_dense_32')(encode_dense_64)
    
    latent_layer = layers.Dense(2,activation='linear',name='latent_layer')(encode_dense_32)
    
    decode_dense_32 = layers.Dense(32,activation='relu',name='decode_dense_32')(latent_layer)
    
    decode_dense_64 = layers.Dense(64,activation='relu',name='decode_dense_64')(decode_dense_32)
    
    #decode_dense_128 = layers.Dense(128,activation='relu',name='decode_dense_128')(decode_dense_64)
    
    decode_dense_expand = layers.Dense(num_segments*64,activation='relu',name='decode_dense_expand')(decode_dense_64)
    
    decoder_reshape = layers.Reshape((num_segments,64),name='decoder_reshape')(decode_dense_expand)
    
    decode_lstm = layers.LSTM(128,activation='tanh',return_sequences=True,name='decode_lstm')(decoder_reshape)
    
    decode_conv1d_64 = layers.Conv1D(64,activation='relu',kernel_size=10,name='decode_conv1d_64',padding='same')(decode_lstm)
    
    decode_conv1d_32 = layers.Conv1D(32,activation='relu',kernel_size=10,name='decode_conv1d_32',padding='same')(decode_conv1d_64)
    
    decode_conv1d_16 = layers.Conv1D(16,activation='relu',kernel_size=10,name='decode_conv1d_16',padding='same')(decode_conv1d_32)
    
    output_layer = layers.Conv1D(12,activation='linear',kernel_size=10,name='output_layer',padding='same')(decode_conv1d_16)
    
    autoencoder = models.Model(inputs=encoder_input, outputs=output_layer,name='my_model')

    return autoencoder
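
To get a feel for the size of this network, the parameter count can be estimated by hand, without building the model. The sketch below uses the standard formulas for Keras `Conv1D`, `LSTM`, and `Dense` layers with their default bias terms. The helper names are hypothetical, and `num_segments=500` is an arbitrary example value, not the one used in this project.

```python
def conv1d_params(in_channels, filters, kernel_size=10):
    # One (kernel_size x in_channels) kernel per filter, plus one bias each.
    return kernel_size * in_channels * filters + filters


def lstm_params(in_features, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((in_features + units) * units + units)


def dense_params(in_features, units):
    return in_features * units + units


def autoencoder_param_counts(num_segments):
    """Per-layer trainable parameter counts for the architecture above."""
    return {
        "encode_conv1d_16": conv1d_params(12, 16),
        "encode_conv1d_32": conv1d_params(16, 32),
        "encode_conv1d_64": conv1d_params(32, 64),
        "encode_lstm": lstm_params(64, 128),
        "encode_dense_64": dense_params(128, 64),
        "encode_dense_32": dense_params(64, 32),
        "latent_layer": dense_params(32, 2),
        "decode_dense_32": dense_params(2, 32),
        "decode_dense_64": dense_params(32, 64),
        "decode_dense_expand": dense_params(64, num_segments * 64),
        "decode_lstm": lstm_params(64, 128),
        "decode_conv1d_64": conv1d_params(128, 64),
        "decode_conv1d_32": conv1d_params(64, 32),
        "decode_conv1d_16": conv1d_params(32, 16),
        "output_layer": conv1d_params(16, 12),
    }


counts = autoencoder_param_counts(num_segments=500)  # arbitrary example length
total = sum(counts.values())
largest = max(counts, key=counts.get)
print(f"{total:,} parameters; largest layer: {largest}")
```

The `decode_dense_expand` layer grows by 4,160 parameters for every additional segment, so for longer inputs it accounts for most of the model.
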