# Data Processing with NumPy and Pandas
# Top 50 Spotify Tracks - 2020

https://intra.turingcollege.com/hardskills/data-processing-with-numpy-and-pandas-v3
https://www.kaggle.com/datasets/atillacolak/top-50-spotify-tracks-2020

Learning project for Data Processing with NumPy and Pandas.

### Requirements and tasks
- Download the data from the Spotify Top 50 Tracks of 2020 dataset.
- Load the data using Pandas.
- Perform data cleaning by:
  - Handling missing values.
  - Removing duplicate samples and features.
  - Treating the outliers.
- Perform exploratory data analysis. Your analysis should provide answers to these questions:
  - How many observations are there in this dataset?
  - How many features does this dataset have?
  - Which of the features are categorical?
  - Which of the features are numeric?
  - Are there any artists that have more than 1 popular track? If yes, which and how many?
  - Who was the most popular artist?
  - How many artists in total have their songs in the top 50?
  - Are there any albums that have more than 1 popular track? If yes, which and how many?
  - How many albums in total have their songs in the top 50?
  - Which tracks have a danceability score above 0.7?
  - Which tracks have a danceability score below 0.4?
  - Which tracks have their loudness above -5?
  - Which tracks have their loudness below -8?
  - Which track is the longest?
  - Which track is the shortest?
  - Which genre is the most popular?
  - Which genres have just one song on the top 50?
  - How many genres in total are represented in the top 50?
  - Which features are strongly positively correlated?
  - Which features are strongly negatively correlated?
  - Which features are not correlated?
  - How does the danceability score compare between Pop, Hip-Hop/Rap, Dance/Electronic, and Alternative/Indie genres?
  - How does the loudness score compare between Pop, Hip-Hop/Rap, Dance/Electronic, and Alternative/Indie genres?
  - How does the acousticness score compare between Pop, Hip-Hop/Rap, Dance/Electronic, and Alternative/Indie genres?
- Provide clear explanations in your notebook. Your explanations should inform the reader what you are trying to achieve, the results you got, and what these results mean.
- Provide suggestions for how your analysis could be improved.

### Load the data using Pandas

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('spotifytoptracks.csv', index_col=0)

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 50 entries, 0 to 49
Data columns (total 16 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   artist            50 non-null     object 
 1   album             50 non-null     object 
 2   track_name        50 non-null     object 
 3   track_id          50 non-null     object 
 4   energy            50 non-null     float64
 5   danceability      50 non-null     float64
 6   key               50 non-null     int64  
 7   loudness          50 non-null     float64
 8   acousticness      50 non-null     float64
 9   speechiness       50 non-null     float64
 10  instrumentalness  50 non-null     float64
 11  liveness          50 non-null     float64
 12  valence           50 non-null     float64
 13  tempo             50 non-null     float64
 14  duration_ms       50 non-null     int64  
 15  genre             50 non-null     object 
dtypes: float64(9), int64(2), object(5)
memory usage: 6.

In [4]:
df.head()

Unnamed: 0,artist,album,track_name,track_id,energy,danceability,key,loudness,acousticness,speechiness,instrumentalness,liveness,valence,tempo,duration_ms,genre
0,The Weeknd,After Hours,Blinding Lights,0VjIjW4GlUZAMYd2vXMi3b,0.73,0.514,1,-5.934,0.00146,0.0598,9.5e-05,0.0897,0.334,171.005,200040,R&B/Soul
1,Tones And I,Dance Monkey,Dance Monkey,1rgnBhdG2JDFTbYkYRZAku,0.593,0.825,6,-6.401,0.688,0.0988,0.000161,0.17,0.54,98.078,209755,Alternative/Indie
2,Roddy Ricch,Please Excuse Me For Being Antisocial,The Box,0nbXyq5TXYPCO7pr3N8S4I,0.586,0.896,10,-6.687,0.104,0.0559,0.0,0.79,0.642,116.971,196653,Hip-Hop/Rap
3,SAINt JHN,Roses (Imanbek Remix),Roses - Imanbek Remix,2Wo6QQD1KMDWeFkkjLqwx5,0.721,0.785,8,-5.457,0.0149,0.0506,0.00432,0.285,0.894,121.962,176219,Dance/Electronic
4,Dua Lipa,Future Nostalgia,Don't Start Now,3PfIrDoz19wz7qK7tYeu62,0.793,0.793,11,-4.521,0.0123,0.083,0.0,0.0951,0.679,123.95,183290,Nu-disco


The dataset has 50 rows and 16 columns, with dtypes float64(9), int64(2) and object(5).

### Handling missing values.

In [5]:
df[df.isna().any(axis=1)]

Unnamed: 0,artist,album,track_name,track_id,energy,danceability,key,loudness,acousticness,speechiness,instrumentalness,liveness,valence,tempo,duration_ms,genre


No missing data
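The row filter above returns an empty frame when nothing is missing. A per-column count is a common complementary check. The sketch below uses a small made-up frame (`toy` and its values are assumptions for illustration, not part of the dataset):

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values; in the notebook this would be `df`.
toy = pd.DataFrame({
    "artist": ["A", "B", None],
    "energy": [0.5, np.nan, 0.7],
    "genre": ["Pop", "Pop", "Rock"],
})

# Number of missing values in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Total number of missing cells in the whole frame.
total_missing = int(missing_per_column.sum())
print(total_missing)
```

For a frame without gaps, every entry of `missing_per_column` is 0.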

### Removing duplicate samples and features.

In [6]:
df[df.duplicated()]

Unnamed: 0,artist,album,track_name,track_id,energy,danceability,key,loudness,acousticness,speechiness,instrumentalness,liveness,valence,tempo,duration_ms,genre


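No duplicate rows were found. Note that `df.duplicated()` compares rows only, so it does not check for duplicate features (columns).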
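The task also mentions duplicate features. One possible way to look for them is to transpose the frame so that columns become rows and then reuse `duplicated()`. This is a sketch on a made-up frame (`toy` and the column `energy_copy` are hypothetical):

```python
import pandas as pd

# Toy frame with made-up values; `energy_copy` repeats `energy` on purpose.
toy = pd.DataFrame({
    "energy": [0.5, 0.6, 0.7],
    "energy_copy": [0.5, 0.6, 0.7],
    "tempo": [100.0, 120.0, 90.0],
})

# Columns whose values repeat an earlier column.
same_values = toy.T.duplicated().to_numpy()
duplicate_columns = toy.columns[same_values].tolist()
print(duplicate_columns)

# Column names that occur more than once.
same_names = toy.columns.duplicated()
duplicate_names = toy.columns[same_names].tolist()
print(duplicate_names)
```

Transposing a frame with mixed dtypes converts everything to `object`, which is acceptable for 50 rows but may be slow on large data.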

### Treating the outliers.

In [7]:
df.describe()

Unnamed: 0,energy,danceability,key,loudness,acousticness,speechiness,instrumentalness,liveness,valence,tempo,duration_ms
count,50.0,50.0,50.0,50.0,50.0,50.0,50.0,50.0,50.0,50.0,50.0
mean,0.6093,0.71672,5.72,-6.2259,0.256206,0.124158,0.015962,0.196552,0.55571,119.69046,199955.36
std,0.154348,0.124975,3.709007,2.349744,0.26525,0.116836,0.094312,0.17661,0.216386,25.414778,33996.122488
min,0.225,0.351,0.0,-14.454,0.00146,0.029,0.0,0.0574,0.0605,75.801,140526.0
25%,0.494,0.6725,2.0,-7.5525,0.0528,0.048325,0.0,0.09395,0.434,99.55725,175845.5
50%,0.597,0.746,6.5,-5.9915,0.1885,0.07005,0.0,0.111,0.56,116.969,197853.5
75%,0.72975,0.7945,8.75,-4.2855,0.29875,0.1555,2e-05,0.27125,0.72625,132.317,215064.0
max,0.855,0.935,11.0,-3.28,0.934,0.487,0.657,0.792,0.925,180.067,312820.0


In [8]:
num_cols = df.describe().columns

In [9]:
outliers_mask = abs(df[num_cols] - df[num_cols].mean()) > (3 * df[num_cols].std())

In [10]:
df[outliers_mask.any(axis=1)]

Unnamed: 0,artist,album,track_name,track_id,energy,danceability,key,loudness,acousticness,speechiness,instrumentalness,liveness,valence,tempo,duration_ms,genre
2,Roddy Ricch,Please Excuse Me For Being Antisocial,The Box,0nbXyq5TXYPCO7pr3N8S4I,0.586,0.896,10,-6.687,0.104,0.0559,0.0,0.79,0.642,116.971,196653,Hip-Hop/Rap
19,Future,High Off Life,Life Is Good (feat. Drake),1K5KBOgreBi5fkEHvg5ap3,0.574,0.795,2,-6.903,0.067,0.487,0.0,0.15,0.537,142.053,237918,Hip-Hop/Rap
24,Billie Eilish,everything i wanted,everything i wanted,3ZCTVFBt2Brf31RLEnCkWJ,0.225,0.704,6,-14.454,0.902,0.0994,0.657,0.106,0.243,120.006,245426,Pop
41,Black Eyed Peas,Translation,RITMO (Bad Boys For Life),4NCsrTzgVfsDo8nWyP8PPc,0.704,0.723,10,-7.088,0.0259,0.0571,0.00109,0.792,0.684,105.095,214935,Pop
49,Travis Scott,ASTROWORLD,SICKO MODE,2xLMifQCjDGFmkHkpNLD9h,0.73,0.834,8,-3.714,0.00513,0.222,0.0,0.124,0.446,155.008,312820,Hip-Hop/Rap


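Five tracks have at least one value more than 3 standard deviations away from the column mean. All of these values look plausible for real tracks, so they are kept, and the dataset is treated as having no outliers that need correcting.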
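The table of flagged rows does not show which column triggered the 3-sigma rule. A minimal sketch of how that could be reported, using the same rule on a made-up frame (`toy` and its numbers are assumptions):

```python
import pandas as pd

# 19 ordinary values and one extreme value per the made-up `liveness` column.
toy = pd.DataFrame({
    "liveness": [1.0] * 19 + [100.0],
    "tempo": [float(v) for v in range(100, 120)],
})

# Same rule as in the notebook: more than 3 standard deviations from the mean.
z_scores = (toy - toy.mean()) / toy.std()
outliers_mask = z_scores.abs() > 3

# How many flagged values each column contributes.
print(outliers_mask.sum())

# For every flagged row, the columns responsible for the flag.
flagged_rows = outliers_mask[outliers_mask.any(axis=1)]
reasons = {
    idx: [col for col in flagged_rows.columns if flagged_rows.at[idx, col]]
    for idx in flagged_rows.index
}
print(reasons)
```

Knowing the responsible column makes it easier to judge whether a flagged value is a data error or simply an unusual track.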

## Perform exploratory data analysis. Your analysis should provide answers to these questions:

### How many observations are there in this dataset?

In [11]:
num_obs = df.shape[0]

In [12]:
f'There are {df.shape[0]} observations'

'There are 50 observations'

### How many features does this dataset have?

In [13]:
print(f'there are {df.shape[1]} features in the dataset')

there are 16 features in the dataset


### Which of the features are categorical?

In [14]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 50 entries, 0 to 49
Data columns (total 16 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   artist            50 non-null     object 
 1   album             50 non-null     object 
 2   track_name        50 non-null     object 
 3   track_id          50 non-null     object 
 4   energy            50 non-null     float64
 5   danceability      50 non-null     float64
 6   key               50 non-null     int64  
 7   loudness          50 non-null     float64
 8   acousticness      50 non-null     float64
 9   speechiness       50 non-null     float64
 10  instrumentalness  50 non-null     float64
 11  liveness          50 non-null     float64
 12  valence           50 non-null     float64
 13  tempo             50 non-null     float64
 14  duration_ms       50 non-null     int64  
 15  genre             50 non-null     object 
dtypes: float64(9), int64(2), object(5)
memory usage: 6.

In [15]:
cols_categorical = [col for col in df.columns if df[col].dtype == "object"]

In [16]:
print(f'Non-numeric features in the dataset: {cols_categorical}. However, only one of them is truly categorical.')

Non-numeric features in the dataset: ['artist', 'album', 'track_name', 'track_id', 'genre']. However, only one of them is truly categorical.


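Categorical feature: 'genre'. The other object columns (artist, album, track_name, track_id) hold names or identifiers rather than categories.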
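Whether a text column is "truly categorical" can be judged from how many distinct values it has relative to the number of rows. The sketch below shows one way to do that with `select_dtypes` and `nunique`. The frame `toy` and the 0.5 threshold are assumptions chosen for illustration:

```python
import pandas as pd

# Toy frame with made-up values.
toy = pd.DataFrame({
    "track_id": ["t1", "t2", "t3", "t4", "t5"],
    "track_name": ["One", "Two", "Three", "Four", "Five"],
    "genre": ["Pop", "Pop", "Rock", "Pop", "Rock"],
    "energy": [0.5, 0.6, 0.7, 0.8, 0.9],
})

# Split columns by dtype; column order is preserved.
non_numeric = toy.select_dtypes(exclude="number")
numeric_cols = toy.select_dtypes(include="number").columns.tolist()

# Share of distinct values per non-numeric column (1.0 means all unique).
unique_share = non_numeric.nunique() / len(toy)
print(unique_share)

# Assumed rule of thumb: fewer than half of the values are distinct.
categorical_like = unique_share[unique_share < 0.5].index.tolist()
print(categorical_like)
print(numeric_cols)
```

With the counts reported in this notebook (40 distinct artists, 45 albums and 16 genres among 50 rows), only `genre` would fall under such a threshold.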

### Which of the features are numeric?

In [17]:
f'Numeric features: {set(df.columns) - set(cols_categorical)}'

"Numeric features: {'acousticness', 'valence', 'key', 'danceability', 'instrumentalness', 'energy', 'liveness', 'duration_ms', 'tempo', 'speechiness', 'loudness'}"

### Are there any artists that have more than 1 popular track? If yes, which and how many?

In [64]:
# good practice:
(df
  ['artist']
  .value_counts()
  .loc[lambda x: x > 1]
)

artist
Dua Lipa         3
Billie Eilish    3
Travis Scott     3
Harry Styles     2
Lewis Capaldi    2
Justin Bieber    2
Post Malone      2
Name: count, dtype: int64

In [18]:
tracks_by_artists = df['artist'].value_counts()

In [19]:
artists_with_multiple_songs = tracks_by_artists[tracks_by_artists>1]

In [20]:
f'{len(artists_with_multiple_songs)} artists with more than one track in the list: {artists_with_multiple_songs.to_dict()}'

"7 artists with more than one track in the list: {'Dua Lipa': 3, 'Billie Eilish': 3, 'Travis Scott': 3, 'Harry Styles': 2, 'Lewis Capaldi': 2, 'Justin Bieber': 2, 'Post Malone': 2}"

### Who was the most popular artist?

In [61]:
tracks_by_artists.max()

np.int64(3)

In [62]:
tracks_by_artists[tracks_by_artists == tracks_by_artists.max()]

artist
Dua Lipa         3
Billie Eilish    3
Travis Scott     3
Name: count, dtype: int64

The most popular artists (with 3 songs in the list): 'Dua Lipa', 'Billie Eilish', 'Travis Scott'

### How many artists in total have their songs in the top 50?

In [23]:
len(df['artist'].unique())

40

40 artists in total have their songs in the top 50

### Are there any albums that have more than 1 popular track? If yes, which and how many?

In [24]:
artist_album = df.groupby(['artist', 'album'])['track_name'].count()

In [25]:
print(f"albums that have more than 1 popular track and tracks number:\n{artist_album[artist_album > 1]}")

albums that have more than 1 popular track and tracks number:
artist         album               
Dua Lipa       Future Nostalgia        3
Harry Styles   Fine Line               2
Justin Bieber  Changes                 2
Post Malone    Hollywood's Bleeding    2
Name: track_name, dtype: int64


### How many albums in total have their songs in the top 50?

In [26]:
df['album'].nunique()

45

45 albums in total have songs in the top 50.

### Which tracks have a danceability score above 0.7?

In [27]:
danceability_above_0_7 = df[df['danceability']>0.7][['track_name', 'danceability']].reset_index(drop=True)

In [28]:
print(f"{df['track_name'][df['danceability']>0.7].count()} tracks have a danceability score above 0.7:\n{danceability_above_0_7}")

32 tracks have a danceability score above 0.7:
                                       track_name  danceability
0                                    Dance Monkey         0.825
1                                         The Box         0.896
2                           Roses - Imanbek Remix         0.785
3                                 Don't Start Now         0.793
4                    ROCKSTAR (feat. Roddy Ricch)         0.746
5                death bed (coffee for your head)         0.726
6                                         Falling         0.784
7                                            Tusa         0.803
8                                 Blueberry Faygo         0.774
9                        Intentions (feat. Quavo)         0.806
10                                   Toosie Slide         0.830
11                                         Say So         0.787
12                                       Memories         0.764
13                     Life Is Good (feat. Drake)        

### Which tracks have a danceability score below 0.4?

In [29]:
danceability_below_0_4 = df[df['danceability']<0.4][['track_name', 'danceability']].reset_index(drop=True)

In [65]:
print(f"{len(danceability_below_0_4)} track/s have a danceability score below 0.4:\n{danceability_below_0_4}")

1 track/s have a danceability score below 0.4:
             track_name  danceability
0  lovely (with Khalid)         0.351


### Which tracks have their loudness above -5?

In [31]:
loudness_above_neg_5 = df['track_name'][df['loudness']>-5]

In [67]:
print(f"{len(loudness_above_neg_5)} tracks have their loudness above -5:\n{loudness_above_neg_5.reset_index(drop=True)}")

19 tracks have their loudness above -5:
0                                   Don't Start Now
1                                  Watermelon Sugar
2                                              Tusa
3                                           Circles
4                                     Before You Go
5                                            Say So
6                                         Adore You
7                            Mood (feat. iann dior)
8                                    Break My Heart
9                                          Dynamite
10                 Supalonely (feat. Gus Dapperton)
11                  Rain On Me (with Ariana Grande)
12    Sunflower - Spider-Man: Into the Spider-Verse
13                                            Hawái
14                                          Ride It
15                                       goosebumps
16                                          Safaera
17                                         Physical
18                      

### Which tracks have their loudness below -8?

In [33]:
loudness_below_neg_8 = df['track_name'][df['loudness']<-8]

In [34]:
print(f"{len(loudness_below_neg_8)} tracks have their loudness below -8:\n{loudness_below_neg_8.reset_index(drop=True)}")

9 tracks have their loudness below -8:
0                  death bed (coffee for your head)
1                                           Falling
2                                      Toosie Slide
3                  Savage Love (Laxed - Siren Beat)
4                               everything i wanted
5                                           bad guy
6                               HIGHEST IN THE ROOM
7                              lovely (with Khalid)
8    If the World Was Ending - feat. Julia Michaels
Name: track_name, dtype: object
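The threshold questions above print only track names. Showing the filtered value next to the name makes the result easier to verify. A sketch using `DataFrame.query` on a made-up frame (`toy` and its values are assumptions):

```python
import pandas as pd

# Toy frame with made-up values; in the notebook this would be `df`.
toy = pd.DataFrame({
    "track_name": ["quiet song", "medium song", "loud song"],
    "loudness": [-10.2, -6.5, -4.1],
})

# Tracks quieter than -8, with the loudness value shown and sorted.
quiet_tracks = (
    toy.query("loudness < -8")[["track_name", "loudness"]]
    .sort_values("loudness")
    .reset_index(drop=True)
)
print(quiet_tracks)

# Tracks louder than -5, loudest first.
loud_tracks = (
    toy.query("loudness > -5")[["track_name", "loudness"]]
    .sort_values("loudness", ascending=False)
    .reset_index(drop=True)
)
print(loud_tracks)
```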


### Which track is the longest?

In [35]:
df[['track_name','duration_ms']].loc[df['duration_ms'].idxmax()].to_frame()

Unnamed: 0,49
track_name,SICKO MODE
duration_ms,312820


'SICKO MODE' is the longest track in the list: 312820 ms, about 313 s (5 min 13 s).

### Which track is the shortest?

In [36]:
df[['track_name','duration_ms']].loc[df['duration_ms'].idxmin()].to_frame()

Unnamed: 0,23
track_name,Mood (feat. iann dior)
duration_ms,140526


'Mood (feat. iann dior)' is the shortest track in the list: 140526 ms, about 141 s (2 min 21 s).
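The conversion from milliseconds to minutes and seconds can be sketched with a small helper. The function name `format_duration` is hypothetical. It rounds to the nearest second, so 312820 ms is shown as 5:13, whereas truncating would give 5:12.

```python
def format_duration(duration_ms: int) -> str:
    """Format a duration in milliseconds as minutes:seconds."""
    # Round to the nearest whole second before splitting.
    total_seconds = round(duration_ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


print(format_duration(312820))  # 5:13
print(format_duration(140526))  # 2:21
```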

### Which genre is the most popular?

In [37]:
genre_count = df['genre'].value_counts()

In [38]:
genre_count

genre
Pop                                   14
Hip-Hop/Rap                           13
Dance/Electronic                       5
Alternative/Indie                      4
R&B/Soul                               2
 Electro-pop                           2
R&B/Hip-Hop alternative                1
Nu-disco                               1
Pop/Soft Rock                          1
Pop rap                                1
Hip-Hop/Trap                           1
Dance-pop/Disco                        1
Disco-pop                              1
Dreampop/Hip-Hop/R&B                   1
Alternative/reggaeton/experimental     1
Chamber pop                            1
Name: count, dtype: int64

In [39]:
genre_count[genre_count == genre_count.iloc[0]]

genre
Pop    14
Name: count, dtype: int64

Pop has the greatest number of songs in the list: 14.
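In the counts above, ' Electro-pop' is printed with a leading space, which suggests that the label carries stray whitespace in the source file. Labels that differ only by whitespace would be counted as separate genres, so stripping them before counting is a cheap safeguard. A sketch on a made-up series (the values in `genres` are assumptions):

```python
import pandas as pd

# Made-up labels: two spellings differ only by surrounding whitespace.
genres = pd.Series([" Electro-pop", "Pop", "Electro-pop", "Pop "], name="genre")

# Before cleaning, whitespace variants are counted separately.
print(genres.value_counts())

# Strip leading and trailing whitespace, then count again.
cleaned = genres.str.strip()
print(cleaned.value_counts())
```

In the notebook this would amount to `df['genre'] = df['genre'].str.strip()` before any grouping by genre.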

### Which genres have just one song on the top 50?

In [40]:
genre_count[genre_count == 1].index

Index(['R&B/Hip-Hop alternative', 'Nu-disco', 'Pop/Soft Rock', 'Pop rap',
       'Hip-Hop/Trap', 'Dance-pop/Disco', 'Disco-pop', 'Dreampop/Hip-Hop/R&B',
       'Alternative/reggaeton/experimental', 'Chamber pop'],
      dtype='object', name='genre')

In [41]:
genre_count[genre_count == 1].count()

np.int64(10)

10 genres have just one song on the top 50: 'R&B/Hip-Hop alternative', 'Nu-disco', 'Pop/Soft Rock', 'Pop rap', 'Hip-Hop/Trap', 'Dance-pop/Disco', 'Disco-pop', 'Dreampop/Hip-Hop/R&B', 'Alternative/reggaeton/experimental', 'Chamber pop'

### How many genres in total are represented in the top 50?

In [42]:
df['genre'].nunique()

16

16 genres in total are represented in the top 50

### Which features are strongly positively correlated?

In [43]:
cols_numeric = [col for col in df.columns if df[col].dtype != "object"]

In [44]:
cor_matrix = df[cols_numeric].corr()

In [45]:
cor_matrix

Unnamed: 0,energy,danceability,key,loudness,acousticness,speechiness,instrumentalness,liveness,valence,tempo,duration_ms
energy,1.0,0.152552,0.062428,0.79164,-0.682479,0.074267,-0.385515,0.069487,0.393453,0.075191,0.081971
danceability,0.152552,1.0,0.285036,0.167147,-0.359135,0.226148,-0.017706,-0.006648,0.479953,0.168956,-0.033763
key,0.062428,0.285036,1.0,-0.009178,-0.113394,-0.094965,0.020802,0.278672,0.120007,0.080475,-0.003345
loudness,0.79164,0.167147,-0.009178,1.0,-0.498695,-0.021693,-0.553735,-0.069939,0.406772,0.102097,0.06413
acousticness,-0.682479,-0.359135,-0.113394,-0.498695,1.0,-0.135392,0.352184,-0.128384,-0.243192,-0.241119,-0.010988
speechiness,0.074267,0.226148,-0.094965,-0.021693,-0.135392,1.0,0.028948,-0.142957,0.053867,0.215504,0.366976
instrumentalness,-0.385515,-0.017706,0.020802,-0.553735,0.352184,0.028948,1.0,-0.087034,-0.203283,0.018853,0.184709
liveness,0.069487,-0.006648,0.278672,-0.069939,-0.128384,-0.142957,-0.087034,1.0,-0.033366,0.025457,-0.090188
valence,0.393453,0.479953,0.120007,0.406772,-0.243192,0.053867,-0.203283,-0.033366,1.0,0.045089,-0.039794
tempo,0.075191,0.168956,0.080475,0.102097,-0.241119,0.215504,0.018853,0.025457,0.045089,1.0,0.130328


In [46]:
cor_matrix_stacked = cor_matrix.unstack()

In [47]:
cor_matrix_stacked = cor_matrix_stacked[cor_matrix_stacked != 1]

In [48]:
cor_matrix_stacked

energy       danceability        0.152552
             key                 0.062428
             loudness            0.791640
             acousticness       -0.682479
             speechiness         0.074267
                                   ...   
duration_ms  speechiness         0.366976
             instrumentalness    0.184709
             liveness           -0.090188
             valence            -0.039794
             tempo               0.130328
Length: 110, dtype: float64

In [49]:
correlation_dict_mirrored = cor_matrix_stacked.sort_values().to_dict()

In [50]:
correlation_dict = {}

In [51]:
for item in correlation_dict_mirrored:
    if (item[1], item[0]) not in correlation_dict:
        correlation_dict.update({item: correlation_dict_mirrored[item]})

In [52]:
strong_positive_correl = {key:value for key, value in correlation_dict.items() if value>=0.7}
strong_positive_correl

{('energy', 'loudness'): 0.7916395653045617}

'energy' and 'loudness' are strongly positively correlated (r ≈ 0.79). A correlation of 0.7 or higher is treated as strong here.
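The loop that removes mirrored pairs could also be replaced by selecting only the upper triangle of the correlation matrix, which holds each pair exactly once and excludes the diagonal. This is an alternative sketch, not the method used in the notebook. The frame `toy` and its values are made up:

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "c": [5.0, 3.0, 4.0, 1.0, 2.0],
})
corr = toy.corr()

# Positions strictly above the diagonal: each pair once, no self-pairs.
rows, cols = np.triu_indices(len(corr), k=1)
pairs = pd.Series(
    corr.to_numpy()[rows, cols],
    index=pd.MultiIndex.from_arrays([corr.index[rows], corr.columns[cols]]),
).sort_values()
print(pairs)

# Same thresholds as in the notebook.
strong_positive = pairs[pairs >= 0.7]
strong_negative = pairs[pairs <= -0.7]
weak = pairs[pairs.abs() <= 0.1]
```

Unlike filtering on `!= 1`, this approach keeps an off-diagonal pair even if its correlation happens to be exactly 1.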

### Which features are strongly negatively correlated?

In [53]:
strong_negative_correl = {key:value for key, value in correlation_dict.items() if value<=-0.7}
strong_negative_correl

{}

No pair reaches the -0.7 threshold. The closest is ('energy', 'acousticness') at about -0.68, which is near enough to be treated as a strong negative correlation.

### Which features are not correlated?

In [54]:
[{item: correlation_dict[item]} for item in correlation_dict if abs(correlation_dict[item])<=0.1]

[{('key', 'speechiness'): -0.09496505735843172},
 {('duration_ms', 'liveness'): -0.09018826695099239},
 {('liveness', 'instrumentalness'): -0.0870339124461283},
 {('loudness', 'liveness'): -0.06993949725852708},
 {('valence', 'duration_ms'): -0.03979436283824896},
 {('duration_ms', 'danceability'): -0.033763480296644874},
 {('valence', 'liveness'): -0.03336630340518706},
 {('loudness', 'speechiness'): -0.021692935459647147},
 {('danceability', 'instrumentalness'): -0.01770638521729678},
 {('acousticness', 'duration_ms'): -0.010988051809892976},
 {('key', 'loudness'): -0.009178410631968104},
 {('liveness', 'danceability'): -0.006648475599485623},
 {('key', 'duration_ms'): -0.003345303142861897},
 {('tempo', 'instrumentalness'): 0.01885267572734445},
 {('instrumentalness', 'key'): 0.020802356350748005},
 {('tempo', 'liveness'): 0.025456740041450245},
 {('instrumentalness', 'speechiness'): 0.028948017426321342},
 {('valence', 'tempo'): 0.04508867269936379},
 {('speechiness', 'valence'): 0

The pairs listed above are treated as uncorrelated: a pair counts as uncorrelated here when its correlation coefficient lies between -0.1 and 0.1.

### How does the danceability score compare between Pop, Hip-Hop/Rap, Dance/Electronic, and Alternative/Indie genres?

In [55]:
danceability_genre = df.groupby('genre')['danceability'].agg(['min', 'mean', 'median', 'max'])

In [56]:
danceability_genres_interested = ['Pop', 'Hip-Hop/Rap', 'Dance/Electronic', 'Alternative/Indie']

In [57]:
danceability_genre.loc[danceability_genres_interested]

Unnamed: 0_level_0,min,mean,median,max
genre,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Pop,0.464,0.677571,0.69,0.806
Hip-Hop/Rap,0.598,0.765538,0.774,0.896
Dance/Electronic,0.647,0.755,0.785,0.88
Alternative/Indie,0.459,0.66175,0.663,0.862


Danceability is broadly similar across the four genres. Hip-Hop/Rap (mean 0.766) and Dance/Electronic (mean 0.755) score somewhat higher on average than Pop (0.678) and Alternative/Indie (0.662).

### How does the acousticness score compare between Pop, Hip-Hop/Rap, Dance/Electronic, and Alternative/Indie genres?

In [58]:
acousticness_genre = df.groupby('genre')['acousticness'].agg(['min', 'mean', 'median', 'max'])

In [59]:
acousticness_genre.loc[danceability_genres_interested]

Unnamed: 0_level_0,min,mean,median,max
genre,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Pop,0.021,0.323843,0.259,0.902
Hip-Hop/Rap,0.00513,0.188741,0.145,0.731
Dance/Electronic,0.0137,0.09944,0.0686,0.223
Alternative/Indie,0.291,0.5835,0.646,0.751


Among the four genres, Alternative/Indie has the highest average acousticness (0.58) and Dance/Electronic the lowest (0.10). The single most acoustic track, however, is a Pop song with a score of 0.902.
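The task list also asks for a loudness comparison between the same four genres. It could follow the same aggregation pattern as danceability and acousticness. The sketch below runs on a made-up frame, so the numbers it prints are not results for the dataset:

```python
import pandas as pd

# Toy frame with made-up values; in the notebook this would be `df`.
toy = pd.DataFrame({
    "genre": ["Pop", "Pop", "Hip-Hop/Rap", "Hip-Hop/Rap",
              "Dance/Electronic", "Alternative/Indie", "R&B/Soul"],
    "loudness": [-6.0, -4.0, -7.0, -5.0, -5.5, -8.0, -6.2],
})

genres_of_interest = ["Pop", "Hip-Hop/Rap", "Dance/Electronic", "Alternative/Indie"]

# Keep the four genres, aggregate, and restore the requested order.
loudness_by_genre = (
    toy[toy["genre"].isin(genres_of_interest)]
    .groupby("genre")["loudness"]
    .agg(["min", "mean", "median", "max"])
    .reindex(genres_of_interest)
)
print(loudness_by_genre)
```

Filtering with `isin` before grouping avoids computing statistics for genres that are not part of the comparison.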

## Results

In [60]:
print(f"Average values per track in the dataset:\n{df.describe().loc['mean'].to_frame()}")

Average values per track in the dataset:
                           mean
energy                 0.609300
danceability           0.716720
key                    5.720000
loudness              -6.225900
acousticness           0.256206
speechiness            0.124158
instrumentalness       0.015962
liveness               0.196552
valence                0.555710
tempo                119.690460
duration_ms       199955.360000


The most popular genre in the list is Pop.

The dataset has 50 rows and 16 features, with no missing data and no duplicate rows. Some values lie more than 3 standard deviations from the column mean, but they look plausible, so the dataset is considered clean.

32 tracks have a danceability score above 0.7

19 tracks have their loudness above -5

Two genres lead the dataset: Pop with 14 songs and Hip-Hop/Rap with 13.
The next most frequent genres are:
- Dance/Electronic: 5
- Alternative/Indie: 4
- R&B/Soul: 2
- Electro-pop: 2

16 genres in total are represented in the top 50