# Top 50 Spotify Songs - 2019
**The 50 most-listened songs on Spotify in 2019**

# Content:
<hr>
* [Introduction to my Notebook](#Introduction-to-my-Notebook)
* [Peek at the dataset](#Peek-at-the-dataset)
* [Prominent Singers](#Prominent-Singers)
* [Prominent Genres](#Prominent-Genres)
* [Barplot of features](#Barplot-of-features)
* [Histogram of features](#Histogram-of-features)
* [Danceability](#Danceability)
* **[Best Dance Songs of 2017](#Best-Dance-Songs-of-2017)** ***<span style="color:red">Required Table</span>***


# Introduction to my Notebook

When “Funkytown” comes on at a wedding, you can’t help but dance, right? What about Nelly’s “Hot in Herre” or “Bad Girls” by Donna Summer?

If your answer to any of those is no, you have defied computer science. (Also, I don’t want you at my party.) Those songs are among the most “danceable” number-one hits in the history of pop.

Similarly, this notebook introduces you to the best party songs among the most popular songs of 2019.

In [None]:
import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from plotly import __version__
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import cufflinks as cf
cf.go_offline()
init_notebook_mode(connected=True)
%matplotlib inline
from PIL import Image


import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))



# Peek at the dataset

In [None]:
dataset=pd.read_csv("../input/top50spotify2019/top50.csv",encoding='ISO-8859-1')
dataset.head()

In [None]:
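# Clean column names: keep the first word of names containing spaces, otherwise replace "." with "_" (e.g. "Track.Name" -> "Track_Name")
dataset.columns=[each.split()[0] if(len(each.split())>=2) else each.replace(".","_") for each in dataset.columns]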
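
To make the renaming rule easier to follow, here is a small sketch that applies the same logic to a handful of column names. The sample names are assumptions modeled on the raw CSV headers; they are not read from the file, and `clean` is a helper introduced only for this illustration.

```python
# Hypothetical raw headers, assumed to resemble those in top50.csv
raw_columns = ["Unnamed: 0", "Track.Name", "Beats.Per.Minute", "Acousticness..", "Length."]

def clean(name):
    # Names containing whitespace keep only their first word;
    # every other name has each "." replaced with "_".
    parts = name.split()
    return parts[0] if len(parts) >= 2 else name.replace(".", "_")

cleaned = [clean(name) for name in raw_columns]
print(cleaned)
# ['Unnamed:', 'Track_Name', 'Beats_Per_Minute', 'Acousticness__', 'Length_']
```

This is why later cells refer to columns such as `Acousticness__` and `Length_`: trailing dots in the raw names become trailing underscores.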

In [None]:
print("Is there any missing data? :",dataset.isnull().any().any())
print(dataset.isnull().sum())

# Prominent Singers

In [None]:
from wordcloud import WordCloud
from matplotlib.colors import LinearSegmentedColormap
plt.style.use('seaborn')  # on matplotlib >= 3.6 this style is named 'seaborn-v0_8'
# Count tracks per artist (text after "(" is dropped) so that word size reflects how often an artist appears
artist_counts = dataset["Artist_Name"].str.split("(").str[0].str.strip().value_counts()
colors = ["#000000", "#111111", "#101010", "#121212", "#212121", "#222222"]
cmap = LinearSegmentedColormap.from_list("mycmap", colors)
wc1 = WordCloud(scale=5,max_words=1000,colormap=cmap,background_color="white").generate_from_frequencies(artist_counts.to_dict())
plt.figure(figsize=(12,18))
plt.imshow(wc1,interpolation="bilinear")
plt.axis("off")
plt.title("Most Featured Artists in the Top 50 list",color="r",fontsize=30)
plt.show()

It seems that Ed Sheeran, Shawn Mendes and The Chainsmokers were among the most featured artists in the 2019 top 50.
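
A word cloud is hard to read precisely, so a plain count per artist is a useful cross-check. The sketch below uses a small made-up list of artists rather than the real column; on the actual data the same call would be applied to `dataset["Artist_Name"]`.

```python
import pandas as pd

# Hypothetical artist column (toy data, not the real dataset)
artists = pd.Series([
    "Ed Sheeran", "Shawn Mendes", "Ed Sheeran",
    "Lil Tecca", "Shawn Mendes", "Ed Sheeran",
])

# value_counts() returns the number of tracks per artist, most frequent first
artist_counts = artists.value_counts()
print(artist_counts.head(3))
```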

# Prominent Genres 

In [None]:
print("\n\nDifferent Genres in Dataset:\n")
print("There are {} different values\n".format(len(dataset.Genre.unique())))
print(dataset.Genre.unique())

In [None]:
# Count songs per genre, most common first
popular_genre = dataset.groupby('Genre').size().sort_values(ascending=False)
# Keep the 10 most common genres, ordered from fewest to most songs for plotting
genre_top10 = popular_genre.head(10).sort_values(ascending=True)
genre_top10 = genre_top10.to_frame('Number of Songs')
genre_top10

In [None]:
plt.figure(figsize=(16,8))

ax = sns.barplot(x = genre_top10.index ,y = 'Number of Songs' , data = genre_top10, orient = 'v', palette = sns.color_palette("muted", 20), saturation = 0.8)

plt.title("Top 10 Genres among the top 50 songs of 2019",fontsize=30)
plt.ylabel('Number of Songs', fontsize=25)
plt.xlabel('Genre', fontsize=10)

plt.show()

# Barplot of features

In [None]:
# Horizontal bar plot: mean feature values per genre
Genre_lists=list(dataset['Genre'].unique())
BeatPerMinute=[]
Energy_=[]
share_Dance=[]
Acousticness=[]
for each in Genre_lists:
    genre_rows=dataset[dataset['Genre']==each]
    BeatPerMinute.append(genre_rows.Beats_Per_Minute.mean())
    Energy_.append(genre_rows.Energy.mean())
    share_Dance.append(genre_rows.Danceability.mean())
    Acousticness.append(genre_rows.Acousticness__.mean())
# Visualization: the four bars are drawn on top of each other (overlaid, not stacked)
f,ax = plt.subplots(figsize = (9,5))
sns.set_color_codes("pastel")
sns.barplot(x=BeatPerMinute,y=Genre_lists,color='g',label="Beats Per Minute")
sns.barplot(x=Energy_,y=Genre_lists,color='b',label="Energy")
sns.barplot(x=share_Dance,y=Genre_lists,color='c',label="Danceability")
sns.barplot(x=Acousticness,y=Genre_lists,color='y',label="Acousticness")
ax.legend(loc="lower right",frameon = True)
ax.set(xlabel='Mean feature value', ylabel='Genre',title = "Top Genres Similarity")
plt.show()

In [None]:
sns.catplot(x = "Length_", y = "Genre", kind = "bar" ,palette = "pastel",
            edgecolor = ".6",data = dataset)

Dancing to a single song for a long time often gets boring. The plot shows the average length of the songs in each genre, so that we can add short songs to our album. Genre bars with an error bar (the dark line at the end of the bar) are the ones that have at least 2 songs of their kind in the top 50 list.
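
The link between error bars and song counts can be checked directly by counting songs per genre. The following is a minimal sketch on toy data: the genre names and lengths are invented for illustration, and only the column names `Genre` and `Length_` are taken from the notebook.

```python
import pandas as pd

# Toy data standing in for the real dataset
toy = pd.DataFrame({
    "Genre": ["pop", "pop", "latin", "dance pop", "dance pop", "dance pop", "trap music"],
    "Length_": [190, 210, 180, 200, 220, 175, 160],
})

# Number of songs and mean length per genre
summary = toy.groupby("Genre")["Length_"].agg(["count", "mean"])

# Only genres with at least two songs have any spread to draw an error bar from
multi = summary[summary["count"] >= 2]
print(multi)
```

A genre with a single song has nothing to estimate a spread from, which is why its bar is drawn without an error bar.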

# Histogram of features


In [None]:
plt.figure(figsize=(10,6))
# Heatmap of the pairwise Pearson correlations between the numeric features
sns.heatmap(dataset[['Beats_Per_Minute','Energy','Danceability','Liveness','Valence_','Length_','Acousticness__','Speechiness_','Popularity']].corr(),annot=True)

From the correlation heatmap, we can see that no two features are strongly correlated.
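
Reading the strongest pairs off a heatmap by eye is error-prone, so it can help to rank them in a table. Below is a minimal sketch on toy numbers (the values are made up, only the column names come from the notebook) that lists each feature pair once, ordered by absolute correlation.

```python
import numpy as np
import pandas as pd

# Toy feature values, invented for illustration
toy = pd.DataFrame({
    "Energy": [55, 81, 80, 65, 65, 68],
    "Danceability": [76, 79, 40, 64, 58, 80],
    "Valence_": [75, 61, 70, 55, 18, 84],
})

corr = toy.corr()

# Keep only the upper triangle so each pair appears once and the diagonal is dropped
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

# Rank pairs by the strength of the correlation, regardless of sign
pairs = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(pairs)
```

On the real data, `toy` would be replaced by the same column selection that is passed to `sns.heatmap`.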


# Danceability

The next plot is a kernel density estimate (KDE) plot showing the danceability of the top 50 songs. We attempt to classify the songs as `Danceable` or `Non-Danceable`.

In [None]:
sns.set_style(style='dark')
# fill=True replaces the deprecated shade=True (seaborn >= 0.11)
sns.kdeplot(data=dataset['Danceability'], fill=True)
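
For readers curious about what the KDE curve represents, here is a rough sketch of a Gaussian kernel density estimate written with NumPy only. The scores and the fixed bandwidth of 5 are assumptions chosen for illustration; seaborn selects its bandwidth automatically, so its curve will not match this one exactly. `kde_estimate` is a hypothetical helper, not a seaborn function.

```python
import numpy as np

# Made-up danceability scores (toy data)
scores = np.array([76, 79, 40, 64, 58, 80, 75, 48, 88, 70], dtype=float)
bandwidth = 5.0  # assumed value, controls how smooth the curve is

def kde_estimate(x, samples, h):
    # Place one Gaussian bump of width h on every sample and average them
    z = (x[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))

grid = np.linspace(0, 120, 601)
density = kde_estimate(grid, scores, bandwidth)

# A density should integrate to roughly 1 over the grid
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A smaller bandwidth gives a bumpier curve that follows individual songs; a larger one smooths them into a single hill.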

We categorize songs with `Danceability >= 75` as `Danceable` and all other songs as `Non-Danceable`.

In [None]:
# Set conditions
D=dataset['Danceability']>=75
Nd=(dataset['Danceability']<75)


In [None]:
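# Create DataFrame with the number of songs in each class and their share of all songs
counts=[D.sum(),Nd.sum()]
Dance=pd.DataFrame({'count':counts,'percent':[100*c/len(dataset) for c in counts]},
                   index=['Danceable','Non-Danceable'])
Dance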
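
The same threshold can also be stored as a label column, which makes later filtering and grouping easier. This is a minimal sketch on toy scores; the column name `Dance_Class` and the constant `THRESHOLD` are introduced here for illustration and do not exist in the dataset.

```python
import numpy as np
import pandas as pd

# Toy danceability scores (not the real data)
toy = pd.DataFrame({"Danceability": [76, 40, 75, 74, 90]})

THRESHOLD = 75  # same cut-off as used in the notebook

# Label each song; a score equal to the threshold counts as Danceable
toy["Dance_Class"] = np.where(toy["Danceability"] >= THRESHOLD, "Danceable", "Non-Danceable")

class_counts = toy["Dance_Class"].value_counts()
print(class_counts)
```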

# Best Dance Songs of 2017

Here is the list of the best party songs among Spotify's top 50 songs of 2019, ranked by danceability.

In [None]:
dataset[['Track_Name','Artist_Name','Genre','Danceability']].sort_values(by='Danceability',ascending=False).head(24)

## <font style="color:#bd0a6d">This is my very first notebook and I'm still working on it. <br> So, please feel free to give your feedback!</font>