### 🎧 Song Recommender Algorithm


Introduction



This project develops a song recommendation system inspired by music platforms such as Spotify. The main goal is to offer personalized suggestions based on the user's preferences, using data analysis and machine learning techniques.

The system analyzes musical features such as genre, tempo (BPM), and popularity (expressed as a rating) to find similarities between songs and make relevant recommendations.

The project aims to simulate a simple but functional version of a music recommendation engine. It can be useful both for learning about recommender systems and for exploring techniques for processing and analyzing music data.

Import libraries

In [80]:
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import OneHotEncoder

Generate the dataset

In [81]:
data = {
'títulos' : [
    "Blinding Lights", "Bad Guy", "Uptown Funk", "Stay", "Levitating", "Shape of You", "Believer",
    "Lose Yourself", "Sicko Mode", "Senorita", "One Dance", "Blame It on Me", "Can't Feel My Face",
    "Rockstar", "Peaches", "Dance Monkey", "Happier Than Ever", "Take On Me", "Don't Start Now",
    "Drivers License", "Bohemian Rhapsody", "Heat Waves", "La Fama", "Tusa", "Dakiti",
    "INDUSTRY BABY", "abcdefu", "Easy on Me", "Butter", "Permission to Dance", "Save Your Tears",
    "As It Was", "Montero", "Telepatía", "Staying Alive", "Shivers", "Señales", "Todo de Ti",
    "Me Porto Bonito", "Ojitos Lindos"
],

'artistas' : [
    "The Weeknd", "Billie Eilish", "Bruno Mars", "The Kid LAROI & Justin Bieber", "Dua Lipa",
    "Ed Sheeran", "Imagine Dragons", "Eminem", "Travis Scott", "Shawn Mendes & Camila Cabello",
    "Drake", "George Ezra", "The Weeknd", "Post Malone", "Justin Bieber", "Tones and I",
    "Billie Eilish", "a-ha", "Dua Lipa", "Olivia Rodrigo", "Queen", "Glass Animals", "Rosalía",
    "Karol G", "Bad Bunny", "Lil Nas X", "GAYLE", "Adele", "BTS", "BTS", "The Weeknd",
    "Harry Styles", "Lil Nas X", "Kali Uchis", "DJ Khaled", "Ed Sheeran", "Duki", "Rauw Alejandro",
    "Bad Bunny", "Bad Bunny"
],

'géneros' : [
    "Pop", "Pop", "Funk", "Pop", "Pop", "Pop", "Rock", "Hip-Hop", "Hip-Hop", "Pop", "Afrobeats",
    "Folk", "Pop", "Hip-Hop", "R&B", "Pop", "Alternative", "Pop", "Dance", "Pop", "Rock", "Indie",
    "Latin", "Reggaeton", "Reggaeton", "Hip-Hop", "Pop-Rock", "Soul", "K-Pop", "Pop", "Synth-pop",
    "Pop", "Pop", "R&B", "Hip-Hop", "Pop", "Trap", "Pop Latino", "Reggaeton", "Latin"
],

'ratings' : [
    4.8, 4.5, 4.7, 4.6, 4.4, 4.5, 4.6, 4.9, 4.4, 4.2, 4.3, 4.1, 4.5, 4.2, 4.0, 4.3,
    4.4, 4.7, 4.5, 4.6, 5.0, 4.5, 4.3, 4.1, 4.4, 4.5, 4.0, 4.7, 4.5, 4.4, 4.6, 4.6,
    4.5, 4.3, 4.2, 4.4, 4.2, 4.5, 4.4, 4.6
],

'BPM' : [
    171, 135, 115, 170, 103, 96, 125, 171, 78, 117, 104, 92, 108, 160, 90, 98, 122,
    169, 124, 144, 72, 160, 98, 105, 108, 150, 122, 70, 110, 120, 118, 174, 180, 100,
    125, 140, 110, 128, 92, 90
],
     
}



Create the DataFrame and matrices from the generated data

In [82]:
df = pd.DataFrame(data)
df.head(10)

,títulos,artistas,géneros,ratings,BPM
0,Blinding Lights,The Weeknd,Pop,4.8,171
1,Bad Guy,Billie Eilish,Pop,4.5,135
2,Uptown Funk,Bruno Mars,Funk,4.7,115
3,Stay,The Kid LAROI & Justin Bieber,Pop,4.6,170
4,Levitating,Dua Lipa,Pop,4.4,103
5,Shape of You,Ed Sheeran,Pop,4.5,96
6,Believer,Imagine Dragons,Rock,4.6,125
7,Lose Yourself,Eminem,Hip-Hop,4.9,171
8,Sicko Mode,Travis Scott,Hip-Hop,4.4,78
9,Senorita,Shawn Mendes & Camila Cabello,Pop,4.2,117


In [83]:
# Create the genre matrix
encoder = OneHotEncoder(sparse_output=False)  # return a dense array, not a sparse matrix
genero_encoder = encoder.fit_transform(df[['géneros']])

# df[['géneros']] selects the géneros column as a DataFrame, not as a Series.
# This matters because fit_transform expects 2D input.
#
# fit_transform(...) does two things:
#   1. Learns which unique values the column contains (fit).
#   2. Transforms them into one-hot vectors (transform).

In [84]:
# Create the ratings and BPM matrix
rating_bpm = df[['ratings', 'BPM']].values



In [85]:
# Build the final feature matrix by concatenating horizontally (np.hstack)
matriz_caracteristicas = np.hstack((rating_bpm, genero_encoder))
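The horizontal concatenation can be checked on a small example. This is a sketch with made-up arrays (`numericas_demo` and `generos_demo` are hypothetical names), showing that `np.hstack` keeps one row per song and places the columns side by side.

```python
import numpy as np

# Hypothetical numeric features for three songs: [rating, BPM]
numericas_demo = np.array([[4.8, 171], [4.6, 125], [4.7, 115]])

# Hypothetical one-hot genre columns for the same songs: [Funk, Pop, Rock]
generos_demo = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])

# Both arrays must have the same number of rows (songs)
caracteristicas_demo = np.hstack((numericas_demo, generos_demo))

print(caracteristicas_demo.shape)  # 3 songs, 2 numeric + 3 genre columns
print(caracteristicas_demo[0])
```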


In [86]:
# Create a similarity matrix
matriz_similitud = cosine_similarity(matriz_caracteristicas)

# Computes the cosine similarity between every pair of songs from their numeric features
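Cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. The sketch below computes it by hand for two toy vectors (`a` and `b` are illustrative, not rows from the dataset) and compares the result with scikit-learn's `cosine_similarity`.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

a = np.array([1.0, 0.0, 1.0])
b = np.array([1.0, 1.0, 0.0])

# Manual computation: dot(a, b) / (||a|| * ||b||)
manual = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# scikit-learn expects 2D arrays of shape (n_samples, n_features)
sk = cosine_similarity(a.reshape(1, -1), b.reshape(1, -1))[0, 0]

print(manual, sk)
```

Because the measure depends only on the angle between vectors, a feature with much larger values than the others can dominate the result.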


Create and display the final dataset with labeled rows and columns

In [87]:
similarity_df = pd.DataFrame(matriz_similitud, index=df['títulos'], columns=df['títulos'])
similarity_df.head()

títulos,Blinding Lights,Bad Guy,Uptown Funk,Stay,Levitating,Shape of You,Believer,Lose Yourself,Sicko Mode,Senorita,...,Save Your Tears,As It Was,Montero,Telepatía,Staying Alive,Shivers,Señales,Todo de Ti,Me Porto Bonito,Ojitos Lindos
Blinding Lights,1.0,0.999985,0.999863,0.999999,0.999886,0.999813,0.999913,0.999966,0.999501,0.999966,...,0.999888,0.999999,0.999995,0.999822,0.999936,0.999994,0.999891,0.999927,0.999729,0.999657
Bad Guy,0.999985,1.0,0.999907,0.999979,0.999953,0.999904,0.999935,0.999945,0.999626,0.999996,...,0.999921,0.999975,0.999964,0.999876,0.999941,0.999998,0.99992,0.99994,0.999809,0.999754
Uptown Funk,0.999863,0.999907,1.0,0.99985,0.999914,0.99989,0.999922,0.999871,0.99976,0.999913,...,0.999925,0.999842,0.999821,0.99991,0.999904,0.999892,0.999917,0.999916,0.999879,0.999848
Stay,0.999999,0.999979,0.99985,1.0,0.99987,0.999794,0.999903,0.999964,0.999472,0.999957,...,0.999876,1.0,0.999998,0.999806,0.999929,0.99999,0.99988,0.99992,0.999709,0.999633
Levitating,0.999886,0.999953,0.999914,0.99987,1.0,0.999991,0.999904,0.999837,0.999778,0.999976,...,0.99991,0.99986,0.999835,0.999903,0.99988,0.999933,0.999901,0.999894,0.999881,0.999856
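Every value in the table above is within about 0.001 of 1. A likely reason is that BPM values (roughly 70 to 180) are far larger than the ratings (4 to 5) and the 0/1 genre columns, so BPM dominates the direction of each feature vector. One possible remedy, which is an assumption here and not part of this notebook, is to rescale the numeric columns before computing similarity. The sketch below uses `MinMaxScaler` on three toy rows to show the effect.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler

# Toy rows with columns: rating, BPM, is_pop, is_hiphop
X = np.array([
    [4.8, 171.0, 1.0, 0.0],  # Pop, fast
    [4.4, 78.0, 0.0, 1.0],   # Hip-Hop, slow
    [4.6, 170.0, 1.0, 0.0],  # Pop, fast
])

# Raw features: BPM dominates, so even very different songs look alike
sim_raw = cosine_similarity(X)

# Rescale only the numeric columns (rating, BPM) to the [0, 1] range
X_scaled = X.copy()
X_scaled[:, :2] = MinMaxScaler().fit_transform(X[:, :2])
sim_scaled = cosine_similarity(X_scaled)

print("raw:   ", sim_raw[0, 1], sim_raw[0, 2])
print("scaled:", sim_scaled[0, 1], sim_scaled[0, 2])
```

After scaling, the two fast Pop songs remain close while the slow Hip-Hop song is clearly separated, which makes the ranking more informative. Other choices, such as `StandardScaler` or weighting the feature groups, would behave differently and may suit a larger dataset better.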


In [88]:
def recomendar_canciones(cancion):
    """
    Return the 4 songs most similar to the given song,
    based on the cosine similarity matrix.

    Parameters:
    - cancion (str): Title of the reference song.

    Returns:
    - pandas.Series: The most similar songs with their similarity scores.
    """
    similar_song = similarity_df[cancion].drop(cancion)  # drop the reference song itself
    recomendaciones = similar_song.sort_values(ascending=False).head(4)
    return recomendaciones
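The function above fixes the number of results and assumes the title exists in `similarity_df`. As a possible extension, the sketch below takes the similarity table and the number of results as arguments and raises a clear error for unknown titles. `recomendar_top_n`, `sim_demo`, and the similarity values are hypothetical and only serve to make the example self-contained.

```python
import numpy as np
import pandas as pd

# Toy symmetric similarity matrix with made-up values
titulos_demo = ["A", "B", "C", "D"]
sim_demo = pd.DataFrame(
    np.array([
        [1.0, 0.9, 0.2, 0.5],
        [0.9, 1.0, 0.4, 0.3],
        [0.2, 0.4, 1.0, 0.8],
        [0.5, 0.3, 0.8, 1.0],
    ]),
    index=titulos_demo,
    columns=titulos_demo,
)


def recomendar_top_n(cancion, similitudes, top_n=3):
    """Return the top_n titles most similar to `cancion`."""
    if cancion not in similitudes.columns:
        raise KeyError(f"Song not found: {cancion!r}")
    # Drop the reference song so it is never recommended to itself
    candidatas = similitudes[cancion].drop(cancion)
    return candidatas.sort_values(ascending=False).head(top_n)


print(recomendar_top_n("A", sim_demo, top_n=2))
```

Passing the table explicitly avoids relying on a global variable, which makes the function easier to reuse with a differently scaled similarity matrix.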

In [89]:
cancion_usuario = 'Uptown Funk'
print(f"Recommendations based on: {cancion_usuario}")
print(recomendar_canciones(cancion_usuario))

Recommendations based on: Uptown Funk
títulos
Save Your Tears    0.999925
Believer           0.999922
Butter             0.999921
Dakiti             0.999919
Name: Uptown Funk, dtype: float64
