# Audioalchemy

Musical transitions in a DJ set work best when the outgoing song and the incoming song are technically similar. Our goal in this project is to use a heuristic search algorithm called Beam Search, together with our own rankings of transitions, to order a playlist so that its sequence of transitions is as smooth as possible. We connect to the Spotify API to fetch pre-built playlists and reorder them using our own Beam Search implementation with customized transition score calculations.

## Collecting song information

In [None]:
!pip install spotipy

In [None]:
import spotipy
# from spotipy.oauth2 import SpotifyClientCredentials
from spotipy.oauth2 import SpotifyOAuth

In [None]:
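import os

# Spotify app credentials. Prefer setting SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET
# in the environment; the literals below are only used as a fallback.
cid = os.environ.get("SPOTIPY_CLIENT_ID", "3519692942004325a2c7160c90717ca5")
secret = os.environ.get("SPOTIPY_CLIENT_SECRET", "9243be1df96e48bb829c4b07254bd82c")
redirect_uri = 'https://github.com/manav-s/audioalchemy'
scope = 'playlist-modify-public playlist-modify-private'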

In [None]:
import os
cache_path = ".cache"
if os.path.exists(cache_path):
    os.remove(cache_path)

In [50]:
# client_credentials_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
# sp = spotipy.Spotify(client_credentials_manager = client_credentials_manager)

auth_manager = SpotifyOAuth(client_id=cid, client_secret=secret, redirect_uri=redirect_uri, scope=scope)
sp = spotipy.Spotify(auth_manager=auth_manager)

In [None]:
playlist_link = "https://open.spotify.com/playlist/0whpp60V3f8y5G1e6HV59N?si=28d21cbd6b7c4137"
uri = playlist_link.split("/")[-1].split("?")[0]
track_uris = [x["track"]["uri"] for x in sp.playlist_tracks(uri)["items"]]
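
`sp.playlist_tracks` returns a single page of results (at most 100 items per request), so the list comprehension above would silently truncate longer playlists. Below is a minimal sketch of collecting every page. It assumes the client exposes spotipy's `playlist_tracks` and `next` methods; the helper name `fetch_all_track_uris` is ours and is not part of spotipy.

```python
def fetch_all_track_uris(client, playlist_id):
    """Collect track URIs from every page of a playlist (hypothetical helper)."""
    page = client.playlist_tracks(playlist_id)
    uris = []
    while page:
        for item in page["items"]:
            track = item.get("track")
            # Skip entries whose track is missing or has no URI
            if track and track.get("uri"):
                uris.append(track["uri"])
        # Ask the client for the following page only when one is advertised
        page = client.next(page) if page.get("next") else None
    return uris
```

With the authenticated client from the previous cell, this could be used as `track_uris = fetch_all_track_uris(sp, uri)`.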

In [None]:
print(sp.audio_features(track_uris)[0])

In [None]:
import pandas as pd

# Initialize empty lists for columns
song_names = []
danceability = []
energy = []
key = []
loudness = []
mode = []
speechiness = []
acousticness = []
instrumentalness = []
liveness = []
valence = []
tempo = []

# Loop through the tracks and append the data to the respective lists
for track in sp.playlist_tracks(playlist_link)["items"]:
    track_name = track["track"]["name"]
    song_names.append(track_name)
    
    audio_features = sp.audio_features(track["track"]["uri"])[0]
    danceability.append(audio_features["danceability"])
    energy.append(audio_features["energy"])
    key.append(audio_features["key"])
    loudness.append(audio_features["loudness"])
    mode.append(audio_features["mode"])
    speechiness.append(audio_features["speechiness"])
    acousticness.append(audio_features["acousticness"])
    instrumentalness.append(audio_features["instrumentalness"])
    liveness.append(audio_features["liveness"])
    valence.append(audio_features["valence"])
    tempo.append(audio_features["tempo"])

# Create a DataFrame
data = {
    "Song Name": song_names,
    "Danceability": danceability,
    "Energy": energy,
    "Key": key,
    "Loudness": loudness,
    "Mode": mode,
    "Speechiness": speechiness,
    "Acousticness": acousticness,
    "Instrumentalness": instrumentalness,
    "Liveness": liveness,
    "Valence": valence,
    "Tempo": tempo
}

df = pd.DataFrame(data)
df
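
The loop above issues one `audio_features` request per track. The same endpoint accepts a list of tracks (up to 100 per request), so the features can be fetched in batches instead. This is a rough sketch under that assumption; `fetch_audio_features` is a hypothetical helper name, and entries in the returned list may be `None` for tracks that have no audio analysis.

```python
def fetch_audio_features(client, track_uris, batch_size=100):
    """Fetch audio features in batches instead of one request per track."""
    features = []
    for start in range(0, len(track_uris), batch_size):
        batch = track_uris[start:start + batch_size]
        # One request covers the whole batch; the result keeps the input order
        features.extend(client.audio_features(batch))
    return features
```

The resulting list of dictionaries could then be passed to `pd.DataFrame(...)` directly, which yields one column per audio feature rather than the hand-built lists used above.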

# Score

We need a weighting system that identifies the importance of each feature.

In [None]:
def get_relative_key(key, mode):
    if mode == 1:  # Major key
        return (key + 9) % 12  # Relative minor key
    else:  # Minor key
        return (key + 3) % 12  # Relative major key

def key_compatibility(key1, mode1, key2, mode2):
    if key1 == key2:
        return True
    if mode1 != mode2 and (key1 == get_relative_key(key2, mode2) or key2 == get_relative_key(key1, mode1)):
        return True
    return False

def genre_similarity(genres1, genres2):
    if not genres1 or not genres2:
        return 0
    shared_genres = len(set(genres1).intersection(genres2))
    # print("shared_genres: ", shared_genres)
    # print("shared_genres / max(len(genres1), len(genres2)): ", shared_genres / max(len(genres1), len(genres2)))
    
    # max of 5 shared genres
    return min(shared_genres, 5)

# The closer the score to 0, the better the transition 
def evaluate_transition(song1, song2):
    score = 0
    
    # Weights for different attributes
    weights = {
        'danceability': 7,
        'energy': 5,
        'loudness': 1,
        'tempo': 100,
        'valence': 5,
        'genre': 4 
    }

    # Check key compatibility
    if not key_compatibility(song1['key'], song1['mode'], song2['key'], song2['mode']):
        score += 6  # Add a fixed penalty if keys are not compatible

    # Calculate the differences in attributes
    diff = {}
    for attribute in weights:
        if attribute == 'genre':
            # if they have 5 in common, it is going to add 0
            diff[attribute] = 5 - genre_similarity(song1[attribute], song2[attribute])
        else:
            diff[attribute] = abs(song1[attribute] - song2[attribute])

    # Normalize loudness differences
    diff['loudness'] /= 60  # Assume max difference is 60 dB

    # Calculate tempo difference considering double/half time mixing
    tempo_diff = min(
        abs(song1['tempo'] - song2['tempo']),
        abs(song1['tempo'] * 2 - song2['tempo']),
        abs(song1['tempo'] / 2 - song2['tempo'])
    )
    diff['tempo'] = tempo_diff / 200  # Normalize tempo difference

    # Calculate the transition score
    for attribute in weights:
        # print("attribute: ", attribute, "   score: ", weights[attribute] * diff[attribute])
        score += weights[attribute] * diff[attribute]

    return score
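
To make the tempo term of `evaluate_transition` concrete, the half-time and double-time comparison can be isolated and tried on a few values. The function name `tempo_distance` and the BPM values below are illustrative only.

```python
def tempo_distance(tempo_a, tempo_b):
    """Smallest BPM gap, allowing song A to be mixed at double or half time."""
    return min(
        abs(tempo_a - tempo_b),
        abs(tempo_a * 2 - tempo_b),
        abs(tempo_a / 2 - tempo_b),
    )

print(tempo_distance(70, 140))   # 0: 70 BPM doubled matches 140 BPM
print(tempo_distance(128, 126))  # 2: the plain difference is already the smallest
print(tempo_distance(150, 80))   # 5.0: 150 BPM halved is 75, which is 5 away from 80
```

With the tempo weight of 100 and the division by 200 used above, each BPM of remaining gap adds 0.5 to the transition score.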


### Transition Score Testing

In [None]:
def print_song_data(song):
    print("Song Name:", song["track_name"])
    print("Danceability:", song["danceability"])
    print("Energy:", song["energy"])
    print("Key:", song["key"])
    print("Loudness:", song["loudness"])
    print("Mode:", song["mode"])
    print("Valence:", song["valence"])
    print("Tempo:", song["tempo"])
    print("Genre:", ', '.join(song["genre"]))
    print()

def get_related_artist_genres(artist_id):
    related_artists = sp.artist_related_artists(artist_id)
    genres = []
    for artist in related_artists['artists']:
        genres.extend(artist['genres'])
    return list(set(genres))

def get_song_data(track_id):
    track = sp.track(track_id)
    audio_features = sp.audio_features(track_id)[0]
    artist_id = track['artists'][0]['id']
    genres = get_related_artist_genres(artist_id)

    song_data = audio_features.copy()
    song_data['genre'] = genres
    song_data['track_name'] = track['name']
    return song_data

#### Same Song score

In [None]:
song1_id = "0hURIUSiPFIv7dzlejdf3N"
song2_id = "0hURIUSiPFIv7dzlejdf3N"

song1 = get_song_data(song1_id)
song2 = get_song_data(song2_id)

print("Song 1 Info:")
print_song_data(song1)

print("Song 2 Info:")
print_song_data(song2)

transition_score = evaluate_transition(song1, song2)
print(f"Transition score between the two songs: {transition_score:.2f}")

#### Good Song score

In [None]:
song1_id = "0yLdNVWF3Srea0uzk55zFn"
song2_id = "3RfNQMIeuL2QC9l4VxOMoj"

song1 = get_song_data(song1_id)
song2 = get_song_data(song2_id)

print("Song 1 Info:")
print_song_data(song1)

print("Song 2 Info:")
print_song_data(song2)

transition_score = evaluate_transition(song1, song2)
print(f"Transition score between the two songs: {transition_score:.2f}")

#### Medium Song score

In [None]:
song1_id = "1j6kDJttn6wbVyMaM42Nxm"
song2_id = "2I9foKseoFQh07p6sD2voE"

song1 = get_song_data(song1_id)
song2 = get_song_data(song2_id)

print("Song 1 Info:")
print_song_data(song1)

print("Song 2 Info:")
print_song_data(song2)

transition_score = evaluate_transition(song1, song2)
print(f"Transition score between the two songs: {transition_score:.2f}")

#### Bad Song score

In [None]:
song1_id = "1j6kDJttn6wbVyMaM42Nxm"
song2_id = "0hURIUSiPFIv7dzlejdf3N"

song1 = get_song_data(song1_id)
song2 = get_song_data(song2_id)

print("Song 1 Info:")
print_song_data(song1)

print("Song 2 Info:")
print_song_data(song2)

transition_score = evaluate_transition(song1, song2)
print(f"Transition score between the two songs: {transition_score:.2f}")

### Beam Search Description

Beam Search is a heuristic algorithm that explores a graph by expanding only the most promising nodes. A heuristic algorithm settles for a near-optimal solution found through problem-specific shortcuts rather than an exhaustive search. As a result, heuristic algorithms like Beam Search are greedy: they favor locally optimal choices and trade accuracy for speed.

Beam Search is a breadth-first search that generates all successors of the current nodes at each level but keeps only a fixed number of them. Here is the general outline of our algorithm in terms of musical transitions:

First, we begin with a single node containing a total transition score of 0, a list of current songs (empty to begin with), and a list of available songs.

Second, for each node and each of its available songs, we evaluate a transition score from the last song of the current playlist to the selected song. The transition score measures musical likeness between the two songs, and a lower score means a better transition. We then remove the selected song from the available songs list and add it to the current songs list. We add this transition score to the node's total score and store the resulting current songs, total score, and available songs as a new node in our collection of nodes to be explored.

Third, once all available songs have been visited, we discard the newly created nodes with the highest total scores and keep only a constant number of nodes (the beam width) with the lowest scores.

We repeat these steps until all available songs have been placed and a full playlist has been created. We then select the remaining node with the lowest total transition score, as this node contains the heuristically determined best playlist.
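
The pruning step can be illustrated on its own. In this small sketch the nodes and their scores are made up for illustration; each node is a tuple of playlist so far, remaining songs, and total score, and lower scores are better.

```python
import heapq

# Each node is (playlist_so_far, remaining_songs, total_score); values are made up
candidates = [
    (["A", "B"], ["C"], 4.2),
    (["A", "C"], ["B"], 1.7),
    (["B", "A"], ["C"], 3.1),
    (["B", "C"], ["A"], 0.9),
]

beam_width = 2
# Keep only the `beam_width` nodes with the lowest total score
survivors = heapq.nsmallest(beam_width, candidates, key=lambda node: node[2])

for playlist, remaining, score in survivors:
    print(playlist, remaining, score)
```

Only the nodes starting with `["B", "C"]` and `["A", "C"]` survive; the other two are never expanded again, which is where Beam Search saves time and may miss the true optimum.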

In [None]:
import heapq

def beam_search(songs, beam_width=3):
    # Initialize the search space with the initial state
    search_space = [([], songs, 0)]

    while search_space:
        # Keep track of the next states with their corresponding costs
        next_states = []

        for state in search_space:
            playlist, remaining_songs, cost = state

            # If there are no remaining songs, we have a complete playlist
            if not remaining_songs:
                return playlist, cost

            # Generate possible next states by adding one of the remaining songs
            for song in remaining_songs:
                if song not in playlist:
                    new_remaining_songs = remaining_songs.copy()
                    new_remaining_songs.remove(song)
                    new_playlist = playlist + [song]

                    if len(new_playlist) > 1:
                        last_song = get_song_data(playlist[-1])
                        current_song = get_song_data(song)
                        transition_cost = evaluate_transition(last_song, current_song)
                    else:
                        transition_cost = 0

                    new_cost = cost + transition_cost
                    next_states.append((new_playlist, new_remaining_songs, new_cost))

        # Keep only the best 'beam_width' states for the next iteration
        # You can experiment with beam width lengths.
        search_space = heapq.nsmallest(beam_width, next_states, key=lambda x: x[2])

    return None, float('inf')

# Example usage with a list of song URIs:
# song_uris = [
#     "2M2urNXOgop2isPZ9Vv4f7",
#     "3AntMecs7ThSZLnr2o8r78",
#     # Add more URIs
# ]



In [None]:
sample_playlist_link = "https://open.spotify.com/playlist/0whpp60V3f8y5G1e6HV59N?si=28d21cbd6b7c4137"
sample_uri = sample_playlist_link.split("/")[-1].split("?")[0]
sample_playlist = [x["track"]["uri"] for x in sp.playlist_tracks(sample_uri)["items"]]

optimal_playlist, cost = beam_search(sample_playlist, beam_width=3)
print("Optimal playlist order:")
for i, song in enumerate(optimal_playlist):
    print(f"{i + 1}. {get_song_data(song)['track_name']}")

print(f"\nTransition cost: {cost:.2f}")

In [None]:
song1_id = "2I9foKseoFQh07p6sD2voE"
song2_id = "4E5IFAXCob6QqZaJMTw5YN"

song1 = get_song_data(song1_id)
song2 = get_song_data(song2_id)

print("Song 1 Info:")
print_song_data(song1)

print("Song 2 Info:")
print_song_data(song2)

# Reuse the data fetched above instead of requesting it again
print(evaluate_transition(song1, song2))


In [None]:

song1_id = "2I9foKseoFQh07p6sD2voE"
song2_id = "5Tbpp3OLLClPJF8t1DmrFD"

song1 = get_song_data(song1_id)
song2 = get_song_data(song2_id)

print("Song 1 Info:")
print_song_data(song1)

print("Song 2 Info:")
print_song_data(song2)

# Score the same pair whose details were printed above
print(evaluate_transition(song1, song2))

## Implementing Beam Search

In [None]:
# # Loop through the tracks and print the track name and its artist's genres
# for item in sp.playlist_tracks(uri)["items"]:
#     track = item["track"]
#     track_name = track["name"]
#     artist_id = track["artists"][0]["id"]

#     # Get the artist's genres
#     artist = sp.artist(artist_id)
#     genres = artist["genres"]

#     print(f"Track: {track_name}\nArtist's Genres: {', '.join(genres)}\n")




## Reordering the Spotify playlist

We will add the new index of each song to the DataFrame.

In [None]:
df['new_index'] = pd.Series(None, index=df.index)
df

In [None]:
# testing by reversing the index
df['new_index'] = df.index[::-1]
df

In [None]:
def reorder_playlist(playlist_id, original_index, new_index):
    # Get the current tracks in the playlist (first page of results only)
    current_tracks = sp.playlist_tracks(playlist_id)['items']
    current_uris = [track['track']['uri'] for track in current_tracks]

    # Rearrange the tracks according to the new index
    new_uris = [current_uris[idx] for idx in new_index]

    # Replace the tracks in the playlist with the new order
    sp.playlist_replace_items(playlist_id, new_uris)

    # Return a DataFrame recording the positions after the reorder
    order = original_index.to_frame(name='original_index')
    order['new_index'] = list(range(len(original_index)))
    return order

In [None]:
reorder_playlist(uri, df.index, df['new_index'])

In [None]:
# Initialize empty lists for columns
song_names = []
danceability = []
energy = []
key = []
loudness = []
mode = []
speechiness = []
acousticness = []
instrumentalness = []
liveness = []
valence = []
tempo = []

# Loop through the tracks and append the data to the respective lists
for track in sp.playlist_tracks(uri)["items"]:
    track_name = track["track"]["name"]
    song_names.append(track_name)
    
    audio_features = sp.audio_features(track["track"]["uri"])[0]
    danceability.append(audio_features["danceability"])
    energy.append(audio_features["energy"])
    key.append(audio_features["key"])
    loudness.append(audio_features["loudness"])
    mode.append(audio_features["mode"])
    speechiness.append(audio_features["speechiness"])
    acousticness.append(audio_features["acousticness"])
    instrumentalness.append(audio_features["instrumentalness"])
    liveness.append(audio_features["liveness"])
    valence.append(audio_features["valence"])
    tempo.append(audio_features["tempo"])

# Create a DataFrame
data = {
    "Song Name": song_names,
    "Danceability": danceability,
    "Energy": energy,
    "Key": key,
    "Loudness": loudness,
    "Mode": mode,
    "Speechiness": speechiness,
    "Acousticness": acousticness,
    "Instrumentalness": instrumentalness,
    "Liveness": liveness,
    "Valence": valence,
    "Tempo": tempo
}

df = pd.DataFrame(data)
df

In [None]:
import os
import spotipy
from spotipy.oauth2 import SpotifyOAuth

def authenticate_user(cid, secret, redirect_uri, scope):
    sp_oauth = SpotifyOAuth(client_id=cid, client_secret=secret, redirect_uri=redirect_uri, scope=scope)
    token_info = sp_oauth.get_cached_token()

    if not token_info:
        auth_url = sp_oauth.get_authorize_url()
        print(f'Please navigate here and authorize access: {auth_url}')
        response = input('Enter the URL you were redirected to: ')
        code = sp_oauth.parse_response_code(response)
        token_info = sp_oauth.get_access_token(code)

    return spotipy.Spotify(auth=token_info['access_token'])

def reorder_playlist(sp, playlist_link):
    user_id = sp.current_user()['id']
    # Accept both URLs and URIs, and drop any "?si=..." query string
    playlist_id = playlist_link.split('/')[-1].split(':')[-1].split('?')[0]

    # Fetch playlist details and tracks
    playlist = sp.playlist(playlist_id)
    tracks = playlist['tracks']['items']

    # Reorder the tracks (example: reverse the order)
    reordered_tracks = list(reversed(tracks))

    # Create a new playlist
    new_playlist = sp.user_playlist_create(user_id, f'Reordered: {playlist["name"]}')

    # Add the reordered tracks to the new playlist
    track_uris = [track['track']['uri'] for track in reordered_tracks]
    sp.playlist_add_items(new_playlist['id'], track_uris)

    return new_playlist['external_urls']['spotify']

if __name__ == '__main__':
    sp = authenticate_user(cid, secret, redirect_uri, scope)
    playlist_link = input('Enter the playlist link: ')
    new_playlist_link = reorder_playlist(sp, playlist_link)
    print(f'New reordered playlist created: {new_playlist_link}')


# Conclusion

### Analysis

During the course of this project, we aimed to find an optimal sorting mechanism for reordering the Spotify playlist tracks. To achieve this goal, we developed an evaluation score function, which we tweaked iteratively using a combination of tests, intuition, and analysis. The following is an analysis of the process we went through to arrive at our final evaluation score function:

Initial function: We started with a basic evaluation score function that considered a few track features. However, we quickly realized that this initial function did not produce satisfactory results in terms of reordering the playlist tracks.

Feature analysis: To improve our evaluation score function, we first analyzed the available track features in the dataset. We focused on those features that could have a significant impact on the listening experience, such as tempo, danceability, energy, and valence.

Feature selection: Based on our analysis, we selected a set of track features that we believed would contribute to a better listening experience. We then experimented with different combinations of these features to understand their relative importance in the evaluation score function.

Tweaking weights: Once we had chosen the relevant features, we assigned weights to them in our evaluation score function. We tweaked these weights iteratively, running tests on different playlists to see how they affected the resulting track order. We adjusted the weights based on the test results and our intuition about the importance of each feature in the listening experience.

Testing: To ensure the effectiveness of our evaluation score function, we ran multiple tests using different playlists with diverse characteristics. We compared the original track order to the reordered tracklist generated by our function, looking for improvements in the listening experience.

Intuition and feedback: Throughout the process, we also relied on our intuition and gathered feedback from users to understand how the reordered playlists aligned with their preferences. This information helped us further refine the weights and the overall evaluation score function.

Final function: After numerous iterations, tests, and refinements, we arrived at an optimal evaluation score function that produced satisfactory results in terms of reordering the playlist tracks. This function was able to effectively balance the selected features, creating a more enjoyable listening experience for the users.

In conclusion, our iterative approach, involving feature analysis, feature selection, tweaking of weights, testing, and the incorporation of intuition and user feedback, helped us develop an optimal evaluation score function for reordering Spotify playlists. This function is capable of enhancing the listening experience and can be further refined or expanded upon as needed.

### Next Steps

Potential next steps could be:

Implement additional sorting methods to allow users to experiment with different listening experiences. Some examples include sorting by tempo, danceability, valence (mood), or track length.

Analyze the playlists of multiple users to understand the common characteristics and trends in playlist creation, which could lead to insights into user behavior and preferences using deep learning and/or reinforcement learning.

Create a user-friendly interface (e.g., a web or desktop application) that allows users to interact with the project easily, reorder their playlists, and visualize the results.

Consider using machine learning techniques to create personalized playlist recommendations based on the user's listening history and preferences.

Experiment with creating playlists that cater to specific moods, activities, or situations (e.g., workout playlists, relaxation playlists, or party playlists) by analyzing the audio features of the tracks and generating playlists that meet the desired criteria.