## **Fetching Features for 1,000 Tracks Released in 2024**

Initial setup for analyzing playlists with the Spotify API.

In [None]:
CLIENT_ID = ''
CLIENT_SECRET = ''
REDIRECT_URI = 'http://localhost:8888/callback'

from spotify_analyzer import SpotifyPlaylistAnalyzer
import pandas as pd
from spotipy.exceptions import SpotifyException
import time




Set up Spotify API authentication and create `sp`, an authenticated client object for accessing Spotify data.

In [None]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
client_credentials_manager = SpotifyClientCredentials(client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
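The setup cell hard-codes `CLIENT_ID` and `CLIENT_SECRET` as empty strings. One way to keep real credentials out of the notebook, sketched here as an assumption rather than as the setup this notebook actually uses, is to read them from environment variables. `load_spotify_credentials` is a hypothetical helper; `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the variable names spotipy itself looks for when no credentials are passed explicitly.

```python
import os


def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    `env` is any mapping; it defaults to the process environment and can be
    replaced with a plain dict for testing. This helper is hypothetical and
    not part of spotipy.
    """
    client_id = env.get("SPOTIPY_CLIENT_ID", "")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET", "")
    if not client_id or not client_secret:
        # Fail early with a clear message instead of sending empty credentials
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before running"
        )
    return client_id, client_secret
```

The returned pair could then be assigned to `CLIENT_ID` and `CLIENT_SECRET` before `SpotifyClientCredentials` is constructed.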

Search for tracks released in 2024, 50 per request, and store each track's primary artist name and ID, title, track ID, and popularity in separate lists.

In [None]:
artist_name = []
track_name = []
popularity = []
artist_id = []
track_id = []
# Page through the search results 50 tracks at a time (offsets 0, 50, ..., 950)
for offset in range(0, 1000, 50):
    track_results = sp.search(q='year:2024', type='track', limit=50, offset=offset)
    for t in track_results['tracks']['items']:
        # Use the first listed artist as the track's artist
        artist_name.append(t['artists'][0]['name'])
        artist_id.append(t['artists'][0]['id'])
        track_name.append(t['name'])
        track_id.append(t['id'])
        popularity.append(t['popularity'])
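The loop above relies on `limit` and `offset` to page through the search results. A minimal sketch of the same paging pattern follows, with the API call hidden behind a hypothetical `search_page(offset, limit)` callable so that it runs without credentials. It also stops early on an empty page and skips track IDs it has already seen; both behaviours are additions for illustration, not something the original loop does.

```python
def collect_tracks(search_page, total=1000, page_size=50):
    """Collect up to `total` track dicts by paging with offset and limit.

    `search_page(offset, limit)` is a hypothetical callable that returns a
    list of track dicts, each with at least an 'id' key. With spotipy it
    could be written as:
        lambda offset, limit: sp.search(
            q='year:2024', type='track', limit=limit, offset=offset
        )['tracks']['items']
    """
    seen_ids = set()
    tracks = []
    for offset in range(0, total, page_size):
        items = search_page(offset, page_size)
        if not items:
            break  # no more results: stop instead of requesting empty pages
        for track in items:
            if track["id"] not in seen_ids:  # keep only the first occurrence
                seen_ids.add(track["id"])
                tracks.append(track)
    return tracks
```

Deduplicating at this stage keeps `track_id` unique, which matters later when the song table is merged with the audio-feature table on that key.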

Convert the lists into a DataFrame.

In [None]:
import pandas as pd
# Name the column 'track_popularity' so it is not confused with 'artist_popularity' later
df_2024_song = pd.DataFrame({
    'artist_name': artist_name,
    'track_name': track_name,
    'track_id': track_id,
    'track_popularity': popularity,
    'artist_id': artist_id
})
print(f"DataFrame shape: {df_2024_song.shape}")
df_2024_song.head()

(1000, 5)


|  | artist_name | track_name | track_id | track_popularity | artist_id |
|---|---|---|---|---|---|
| 0 | Jimin | Who | 7tI8dRuH2Yc6RuoTjxo4dU | 92 | 1oSPZhvZMIrWW5I41kPkkY |
| 1 | ROSÉ | APT. | 2vDkR3ctidSd17d2CygVzS | 65 | 3eVa5w3URK5duf6eyVDbu9 |
| 2 | aespa | Whiplash | 6uPnrBgweGOcwjFL4ItAvV | 80 | 6YVMFz59CuY7ngCxTxjpxE |
| 3 | Lim Young Woong | Warmth | 3vnaEaDxMKdBhqA1t0uAwl | 62 | 75MOYjGEyyH5U4ZFHOPvxR |
| 4 | Lim Young Woong | Home | 2PlsVMcOn6ujc2UEYs2Yat | 62 | 75MOYjGEyyH5U4ZFHOPvxR |


Look up the artist of each track and store the artist's popularity, genres, and follower count in separate lists.

In [None]:
artist_popularity, artist_genres, artist_followers = [], [], []

for a_id in df_2024_song['artist_id']:
    artist = sp.artist(a_id)
    artist_popularity.append(artist['popularity'])
    artist_genres.append(artist['genres'])
    artist_followers.append(artist['followers']['total'])
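The loop above calls `sp.artist` once per row, so an artist with several tracks is fetched several times. A possible alternative, sketched under the assumption that a batch endpoint such as spotipy's `sp.artists` (up to 50 IDs per request) is acceptable here, is to deduplicate the IDs and fetch them in chunks. `chunked` and `fetch_artist_info` are hypothetical helper names, and `fetch_batch` stands in for the real API call so the sketch runs offline.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_artist_info(artist_ids, fetch_batch, batch_size=50):
    """Return {artist_id: artist_dict}, requesting each unique ID only once.

    `fetch_batch(ids)` is a hypothetical callable returning a list of artist
    dicts. With spotipy it could be: lambda ids: sp.artists(ids)['artists']
    """
    unique_ids = list(dict.fromkeys(artist_ids))  # dedupe, keep first-seen order
    info = {}
    for batch in chunked(unique_ids, batch_size):
        for artist in fetch_batch(batch):
            if artist:  # skip entries the API could not resolve
                info[artist["id"]] = artist
    return info
```

The per-row lists can then be rebuilt from the lookup table, for example `[info[a]['popularity'] for a in df_2024_song['artist_id']]`.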

Add the artist popularity (`artist_popularity`), genres (`artist_genres`), and follower count (`artist_followers`) to the `df_2024_song` DataFrame.

In [None]:
df_2024_song['artist_popularity'] = artist_popularity
df_2024_song['artist_genres'] = artist_genres
df_2024_song['artist_followers'] = artist_followers
df_2024_song.head()

|  | artist_name | track_name | track_id | track_popularity | artist_id | artist_popularity | artist_genres | artist_followers |
|---|---|---|---|---|---|---|---|---|
| 0 | Jimin | Who | 7tI8dRuH2Yc6RuoTjxo4dU | 92 | 1oSPZhvZMIrWW5I41kPkkY | 88 | [k-pop] | 9730251 |
| 1 | ROSÉ | APT. | 2vDkR3ctidSd17d2CygVzS | 65 | 3eVa5w3URK5duf6eyVDbu9 | 85 | [k-pop] | 8127798 |
| 2 | aespa | Whiplash | 6uPnrBgweGOcwjFL4ItAvV | 80 | 6YVMFz59CuY7ngCxTxjpxE | 83 | [k-pop girl group] | 7122044 |
| 3 | Lim Young Woong | Warmth | 3vnaEaDxMKdBhqA1t0uAwl | 62 | 75MOYjGEyyH5U4ZFHOPvxR | 63 | [trot] | 100559 |
| 4 | Lim Young Woong | Home | 2PlsVMcOn6ujc2UEYs2Yat | 62 | 75MOYjGEyyH5U4ZFHOPvxR | 63 | [trot] | 100559 |


Fetch the audio features for each track ID and convert them into a DataFrame.

In [None]:
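# Fetch audio features one track at a time from the song DataFrame built above.
# sp.audio_features returns a list whose entries can be None, so skip those.
track_features = [
    feature for t_id in df_2024_song['track_id']
    for feature in (sp.audio_features(t_id) or [])  # treat a None response as an empty list
    if feature is not None
]

# Columns to keep, in the order they should appear in the DataFrame
columns = [
    'danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness',
    'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo',
    'type', 'id', 'uri', 'track_href', 'analysis_url', 'duration_ms', 'time_signature'
]

# Convert track_features into a DataFrame in one step
df_2024_feature = pd.DataFrame(track_features, columns=columns)

# Inspect the data
df_2024_feature.head()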

   danceability  energy  key  loudness  mode  speechiness  acousticness  \
0         0.660   0.756    0    -3.743     0       0.0320       0.00289   
1         0.778   0.786    0    -4.473     0       0.2590       0.02860   
2         0.856   0.901    8    -2.954     0       0.0455       0.09430   
3         0.354   0.232    5    -7.656     1       0.0329       0.79000   
4         0.586   0.816    6    -4.513     1       0.0456       0.00921   

   instrumentalness  liveness  valence    tempo            type  \
0          0.000000    0.1930    0.838  116.034  audio_features   
1          0.000000    0.3470    0.942  149.030  audio_features   
2          0.020600    0.0802    0.743  126.007  audio_features   
3          0.000000    0.1270    0.342  171.198  audio_features   
4          0.000001    0.0810    0.407  128.021  audio_features   

                       id                                   uri  \
0  7tI8dRuH2Yc6RuoTjxo4dU  spotify:track:7tI8dRuH2Yc6RuoTjxo4dU   
1  2vDkR3cti
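This section imports `time` and `SpotifyException` without using them, and the audio-feature cell makes one request per track. One possible use for those imports, sketched here as an assumption about intent rather than as the author's method, is a small retry wrapper with exponential backoff. `call_with_retry` is a hypothetical helper; the exception types to retry are passed in so that the sketch runs without spotipy installed.

```python
import time


def call_with_retry(call, retriable=(Exception,), attempts=3,
                    base_delay=1.0, sleep=time.sleep):
    """Run `call()` and retry on the given exception types.

    Waits base_delay * 2**n seconds after the n-th failed attempt
    (1s, 2s, 4s, ... by default). The last failure is re-raised.
    `sleep` is injectable so the waiting can be skipped in tests.
    """
    for attempt in range(attempts):
        try:
            return call()
        except retriable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

In the notebook this might be used as `call_with_retry(lambda: sp.audio_features(t_id), retriable=(SpotifyException,))` in place of the bare `sp.audio_features(t_id)` call.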

Drop the unneeded columns (`key`, `mode`, `type`, `uri`, `track_href`, `analysis_url`) from the `df_2024_feature` DataFrame.

In [None]:
cols_to_drop = ['key', 'mode', 'type', 'uri', 'track_href', 'analysis_url']
df_2024_feature.drop(columns=cols_to_drop, inplace=True)


print(df_2024_song.info())
print(df_2024_feature.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   artist_name        1000 non-null   object
 1   track_name         1000 non-null   object
 2   track_id           1000 non-null   object
 3   track_popularity   1000 non-null   int64 
 4   artist_id          1000 non-null   object
 5   artist_popularity  1000 non-null   int64 
 6   artist_genres      1000 non-null   object
 7   artist_followers   1000 non-null   int64 
dtypes: int64(3), object(5)
memory usage: 62.6+ KB
None
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 12 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   danceability      1000 non-null   float64
 1   energy            1000 non-null   float64
 2   loudness          1000 non-null   float64
 3   speechiness       1000 

Merge `df_2024_song` and `df_2024_feature` on the `track_id` and `id` columns, then drop the redundant `id` column from the merged DataFrame.

In [None]:
# Merge df_2024_song and df_2024_feature on the 'track_id' and 'id' columns
feature_df = pd.merge(df_2024_song, df_2024_feature, left_on='track_id', right_on='id', how='inner')

# 'id' duplicates 'track_id' after the merge, so drop it
feature_df = feature_df.drop(columns=['id'])

# Inspect the merged DataFrame
print(feature_df.head())

# Save to a CSV file
output_file = "final_1000_features.csv"
feature_df.to_csv(output_file, index=False, encoding='utf-8')
print(f"Merged data saved to '{output_file}'!")


       artist_name track_name                track_id  track_popularity  \
0            Jimin        Who  7tI8dRuH2Yc6RuoTjxo4dU                92   
1             ROSÉ       APT.  2vDkR3ctidSd17d2CygVzS                65   
2            aespa   Whiplash  6uPnrBgweGOcwjFL4ItAvV                80   
3  Lim Young Woong     Warmth  3vnaEaDxMKdBhqA1t0uAwl                62   
4  Lim Young Woong       Home  2PlsVMcOn6ujc2UEYs2Yat                62   

                artist_id  artist_popularity       artist_genres  \
0  1oSPZhvZMIrWW5I41kPkkY                 88             [k-pop]   
1  3eVa5w3URK5duf6eyVDbu9                 85             [k-pop]   
2  6YVMFz59CuY7ngCxTxjpxE                 83  [k-pop girl group]   
3  75MOYjGEyyH5U4ZFHOPvxR                 63              [trot]   
4  75MOYjGEyyH5U4ZFHOPvxR                 63              [trot]   

   artist_followers  danceability  energy  loudness  speechiness  \
0           9730251         0.660   0.756    -3.743       0.0320   
1   
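An inner merge emits one row for every matching pair of keys, so a `track_id` that appears more than once on either side multiplies rows in `feature_df`. The sketch below uses only the standard library and a hypothetical helper name, `inner_join_row_count`, to predict how many rows an inner join on a single key should produce. Comparing that number with `len(feature_df)` is one way to sanity-check the merge.

```python
from collections import Counter


def inner_join_row_count(left_keys, right_keys):
    """Return the number of rows an inner join on one key column produces.

    Each key contributes (occurrences on the left) * (occurrences on the
    right) rows; keys missing from either side contribute none.
    """
    left = Counter(left_keys)
    right = Counter(right_keys)
    return sum(count * right[key] for key, count in left.items())
```

With the notebook's frames the call would be `inner_join_row_count(df_2024_song['track_id'], df_2024_feature['id'])`. If both key columns are unique, the result is simply the number of IDs they share.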

Top 20 tracks released in 2024 by popularity

In [None]:
feature_df.sort_values(by=['track_popularity'], ascending=False)[['track_name', 'artist_name']].head(20)


|  | track_name | artist_name |
|---|---|---|
| 18 | Die With A Smile | Lady Gaga |
| 35 | BIRDS OF A FEATHER | Billie Eilish |
| 260 | That’s So True | Gracie Abrams |
| 118 | Good Luck, Babe! | Chappell Roan |
| 331 | Sailor Song | Gigi Perez |
| 128 | Taste | Sabrina Carpenter |
| 181 | WILDFLOWER | Billie Eilish |
| 648 | Si Antes Te Hubiera Conocido | KAROL G |
| 0 | Who | Jimin |
| 132 | Timeless (with Playboi Carti) | The Weeknd |


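Genre counts across the tracks: each track contributes all of its artist's genres, so an artist with several tracks is counted once per track.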

In [None]:
def to_1D(series):
    return pd.Series([x for _list in series for x in _list])
to_1D(feature_df['artist_genres']).value_counts().head(20)

k-pop                       259
pop                         118
5th gen k-pop               113
k-pop boy group             102
k-pop girl group             86
rap                          61
hip hop                      47
korean pop                   33
k-rap                        33
j-pop                        32
anime                        28
viral pop                    27
canadian pop                 27
west coast rap               25
conscious hip hop            25
art pop                      22
korean r&b                   19
korean singer-songwriter     18
k-pop ballad                 17
singer-songwriter pop        15
Name: count, dtype: int64
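The counts above are per track row. If the goal is the number of distinct artists per genre instead, each artist has to be counted only once. A minimal sketch of that variant follows, using only the standard library; `artists_per_genre` is a hypothetical helper name, and it assumes every row for a given artist carries the same genre list.

```python
from collections import Counter


def artists_per_genre(artist_ids, genre_lists):
    """Count how many distinct artists are tagged with each genre.

    `artist_ids` and `genre_lists` are parallel sequences, one entry per
    track row. Only the first row seen for each artist is counted.
    """
    seen_artists = set()
    counts = Counter()
    for artist_id, genres in zip(artist_ids, genre_lists):
        if artist_id in seen_artists:
            continue  # this artist's genres were already counted
        seen_artists.add(artist_id)
        counts.update(genres)
    return counts
```

With the merged frame this could be called as `artists_per_genre(feature_df['artist_id'], feature_df['artist_genres']).most_common(20)`.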