# Top 50 ASEAN Song Charts August 2022 Analysis

#### By : <font color='blue'>Dimas Brahmantyo (https://github.com/dimasbramski472)</font>

The chart data is taken from 10 ASEAN countries including Brunei Darussalam, Cambodia, Indonesia, Laos, Malaysia, Myanmar, Philippines, Singapore, Thailand, and Vietnam as of August 26, 2022.

Objectives:
* See the correlation between each data variable.
* Distribution of Artists, Song Genres, and Song Years.
* Comparison between genres of songs on charts with the average of song position, duration, energy ratio, valence, speechiness, loudness, acousticness, popularity, and modes.


NB : The Spotify API Code can be found in the Spotify API.ipynb file

### <font color='teal'>A. Libraries Used</font>

* `pandas` is a Python library used for working with data sets. It has functions for analyzing, cleaning, exploring, and manipulating data.
* `pandasql` allows you to query pandas DataFrames using SQL syntax. It works similarly to sqldf in R.
* `NumPy` which stands for Numerical Python, is a library consisting of multidimensional array objects and a collection of routines for processing those arrays. Using NumPy, mathematical and logical operations on arrays can be performed.
* `Matplotlib` is a comprehensive library for creating static, animated, and interactive visualizations in Python.
* `Seaborn` builds on Matplotlib and provides a higher-level interface for drawing attractive and informative statistical graphics.
* `glob` is a Python standard-library module that finds all pathnames matching a specified pattern according to the rules used by the Unix shell; results are returned in arbitrary order.
* `warnings` is a Python standard-library module for issuing and filtering warning messages. Warnings are typically issued to alert the user to a condition that (normally) does not warrant raising an exception and terminating the program.

In [None]:
#install all libraries
!pip install pandas
!pip install pandasql
!pip install numpy
!pip install matplotlib
!pip install seaborn

In [None]:
#Import All of the libraries
import pandas as pd
import pandasql as ps
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import glob
import warnings
warnings.simplefilter("ignore")

### <font color='teal'>B. Understanding & Importing Data</font>

This analysis uses two files. File 1 contains the chart and audio-feature data for each song:
* `songid` : The Spotify ID for the track

* `song_title` : Track title

* `song_singer` : Singer of the track

* `song_album` : The title of the album in which the track is located

* `song_position` : Position of the track on the chart

* `song_country` : Country where the song chart takes place

* `song_bpm`: Beats per minute showing the tempo of the track

* `song_duration` : Duration of the track (in milliseconds)

* `genre`: Track genre

* `danceability`: Describes how suitable a track is for dancing based on a combination of musical elements including tempo,  rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

* `energy`: Measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy.

* `key`: The key the track is in. Integers map to pitches using standard "Pitch Class" notation. For more details regarding pitch class notation, see the following link (https://en.wikipedia.org/wiki/Pitch_class).

* `loudness` : The overall loudness of a track in decibels (dB). Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude).

* `mode`: Indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.

* `speechiness`: Detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value.

* `acousticness`: A confidence measure from 0.0 to 1.0 of whether the track is acoustic.

* `instrumentalness`: Predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal".

* `liveness`: Detects the presence of an audience in the recording.

* `valence`: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track.

* `popularity`: A value from 0 to 100 indicating the track's popularity worldwide, with 100 being the most popular.
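
The `key` and `mode` columns are stored as integers, so a small lookup can make them easier to read. The sketch below is only an illustration: `PITCH_CLASSES` and `describe_key` are hypothetical names that do not appear in this notebook, and the handling of `-1` assumes Spotify's convention of reporting `-1` when no key was detected.

```python
# Pitch-class names indexed by the integer `key` value (0 = C ... 11 = B).
PITCH_CLASSES = ["C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
                 "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B"]


def describe_key(key: int, mode: int) -> str:
    """Return a label such as 'A minor' for a (key, mode) pair."""
    if not 0 <= key <= 11:
        return "unknown"  # assumed convention: -1 means no key was detected
    scale = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {scale}"


print(describe_key(9, 0))  # A minor
print(describe_key(0, 1))  # C major
```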

In [None]:
#Import and Read the Dataset of Song Chart
data1 = pd.read_csv("C:/Users/HP/Downloads/dataset_music_chart_fix.csv", encoding ='unicode_escape')
#The 'unicode_escape' codec decodes the file as Latin-1 and also interprets Python-style
#backslash escapes such as \uXXXX (a backslash, the letter 'u', and four hexadecimal digits).
data1

File 2 contains data related to singers on the chart, including:
* `singerid` : The Spotify ID of the track's singer
* `song_singer` : The singer's name
* `singer_origin` : Whether the singer comes from an ASEAN country or not
* `singer_based_from` : The country where the singer is based or works

In [None]:
data2 = pd.read_csv("C:/Users/HP/Downloads/dataset_singer_music_chart_ASEAN_August2022.csv", encoding = 'unicode_escape')
data2

### <font color='teal'>C. Selecting Data</font>

Now that we know what data1 and data2 contain, we can combine them into a single dataframe. Here we use the pandasql library to join the two datasets.

NB. In a conventional SQL database, a "full outer join" retrieves every row from both tables. pandasql runs its queries on SQLite, and SQLite versions before 3.39 do not support "full outer join". As a workaround, we left join data1 to data2, left join data2 to data1 keeping only the rows with no match in data1, and then combine the two results with "union all".
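
The workaround can be tried on a small scale with Python's built-in `sqlite3` module. This is a minimal sketch with invented rows; the table names `songs` and `singers` are hypothetical stand-ins for data1 and data2. Unlike the query in the next cell, the second branch here reads the singer name from the singer table, so unmatched singers keep their name.

```python
import sqlite3

# Toy tables (invented rows) standing in for data1 and data2.
con = sqlite3.connect(":memory:")
con.executescript("""
    create table songs   (song_title text, song_singer text);
    create table singers (song_singer text, singer_origin text);
    insert into songs   values ('Song A', 'Singer 1'), ('Song B', 'Singer 2');
    insert into singers values ('Singer 1', 'ASEAN'), ('Singer 3', 'Foreign');
""")

# Branch 1 keeps every song; branch 2 adds the singers that have no song.
rows = con.execute("""
    select s.song_title, s.song_singer as singer, g.singer_origin
    from songs s left join singers g on s.song_singer = g.song_singer
    union all
    select s.song_title, g.song_singer as singer, g.singer_origin
    from singers g left join songs s on s.song_singer = g.song_singer
    where s.song_singer is null
""").fetchall()

for row in rows:
    print(row)
```

Every song and every singer appears at least once in the result, with missing values on the side that has no match, which is the behaviour a full outer join would give.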

In [None]:
#Select the data
data = ps.sqldf("""
select data1.songid as songid, data2.singerid as singerid, data1.song_title as song_title, data1.song_singer as singer, data2.singer_origin as singer_origin, data2.singer_based_from as singer_based_from,
data1.song_album as album, data1.song_year as year, data1.genre as genre, data1.song_position as position, data1.song_country as country, data1.song_bpm as BPM, data1.song_duration as duration, data1.danceability as danceability, data1.energy as energy,
data1.key as key, data1.loudness as loudness, data1.mode as mode, data1.speechiness as speechiness, data1.acousticness as acousticness, data1.instrumentalness as instrumentalness, data1.liveness as liveness, data1.valence as valence, data1.popularity as popularity
from data1 left join data2 using(song_singer)
union all
select data1.songid as songid, data2.singerid as singerid, data1.song_title as song_title, data1.song_singer as singer, data2.singer_origin as singer_origin, data2.singer_based_from as singer_based_from,
data1.song_album as album,  data1.song_year as year, data1.genre as genre, data1.song_position as position, data1.song_country as country, data1.song_bpm as BPM, data1.song_duration as duration, data1.danceability as danceability, data1.energy as energy,
data1.key as key, data1.loudness as loudness, data1.mode as mode, data1.speechiness as speechiness, data1.acousticness as acousticness, data1.instrumentalness as instrumentalness, data1.liveness as liveness, data1.valence as valence, data1.popularity as popularity
from data2 left join data1 using(song_singer)
where data1.song_singer is null
""")
#Show selected data
data

### <font color='teal'>D. Reading the Data</font>

In [None]:
#Check if there are missing data values
data.isna().sum()

In [None]:
#The data types present in the 'data' dataframe
data.dtypes

In [None]:
#Brief description of the variables in the 'data' dataframe
data.describe()

In [None]:
ps.sqldf("""
select count (distinct songid) as song_unique_count from data1
""")

There are 349 different songs in the Top 50 Song Chart in 10 ASEAN countries,

In [None]:
ps.sqldf("""
select count (distinct singerid) as singer_unique_count from data2
""")

And there are 245 different artists in the Top 50 Song Chart in 10 ASEAN countries.

## <font color='teal'>E. Data Analysis</font>

# <font color='black'>Correlation Analysis</font>

* First, we look at the correlation between each pair of numeric variables. We use a correlation heatmap whose color gradient shows how strongly positive or negative each relationship is.
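
By default, `DataFrame.corr()` computes the Pearson correlation coefficient. The sketch below shows that calculation in plain Python on invented values; `pearson_r` is a hypothetical helper name and the numbers are not taken from the dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


energy = [0.2, 0.4, 0.6, 0.8]
loudness = [-12.0, -9.0, -6.0, -3.0]   # rises in step with energy
acousticness = [0.9, 0.7, 0.5, 0.3]    # falls in step with energy

print(pearson_r(energy, loudness))      # close to +1
print(pearson_r(energy, acousticness))  # close to -1
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.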

In [None]:
#Compute the correlation matrix (numeric columns only)
data_corr = data.select_dtypes(include='number').corr()
plt.figure(figsize=(15,15))

#Make the Correlation Map using Seaborn
data_corr_map = sns.heatmap(data_corr, vmin = -0.8, vmax = 0.8, cmap = 'RdYlGn', linewidths = 0.4, annot = True)

#### <font color='teal'>Correlation Map Analysis :</font>

* Valence has a positive correlation with danceability and track energy.
* The strongest positive correlation is between the energy and loudness of the track (0.76).
* The strongest negative correlation is between the energy and acousticness of the track (-0.69).

To see how the correlated variables are spread, we can use scatter plots with a fitted trend line that indicates whether each correlation is positive or negative.
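
The trend line is a least-squares straight line, which is what `np.polyfit(x, y, 1)` returns as a slope and an intercept. A minimal plain-Python sketch of the same closed-form calculation follows; `fit_line` is a hypothetical name and the values are invented so that the expected line is known in advance.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


valence = [0.1, 0.3, 0.5, 0.7, 0.9]
danceability = [0.35, 0.45, 0.55, 0.65, 0.75]  # built as 0.5 * valence + 0.3

slope, intercept = fit_line(valence, danceability)
print(slope, intercept)  # approximately 0.5 and 0.3
```

A positive slope corresponds to a positive correlation and a negative slope to a negative one.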

In [None]:
#Create multiple plot designs
fig, ax = plt.subplots(2,1, figsize=(10,10))

#Select the variables to compare
x = data['valence']
y1 = data['danceability']
y2 = data['energy']

#Fit a first-degree polynomial (a straight line) to get the slope and intercept of each trend line
m, a = np.polyfit(x,y1,1)
n, b = np.polyfit(x,y2,1)

#Create scatter plot graphic design
sns.scatterplot(x = x, y = y1, ax = ax[0], color = 'crimson')
sns.lineplot(x = x, y = m*x+a, ax = ax[0], color =  '#233d0d',linewidth=4)

sns.scatterplot(x = x, y = y2, ax = ax[1], color = "orange")
sns.lineplot(x = x, y = n*x+b, ax = ax[1], color =  '#1f68e0',linewidth=4)

#Show the graph
plt.show()

In [None]:
fig, ax = plt.subplots(2,1, figsize=(10,10))

x = data['energy']
y3 = data['loudness']
y4 = data['acousticness']

a, b = np.polyfit(x,y3,1)

c, d = np.polyfit(x,y4,1)

sns.scatterplot(x = x, y = y3, ax = ax[0], color = 'olive')
sns.lineplot(x = x, y = a*x+b, ax = ax[0], color =  '#9400d3',linewidth=4)

sns.scatterplot(x = x, y = y4, ax = ax[1], color = "teal")
sns.lineplot(x = x, y = c*x+d, ax = ax[1], color =  '#800000',linewidth=4)

plt.show()

# <font color='black'>Artists and Genres Analysis</font>

After looking at the correlation between data variables, we will analyze the artists and genres of the tracks listed on the ASEAN Song Charts for August 2022.

* First, we build a query to find the 10 singers with the most tracks on the charts.

In [None]:
#Query Data
data['singer_obj'] = data['singerid'].astype('str')
top10_artist = data.groupby(['singer','genre','singer_origin','singer_based_from'],as_index=False)['singer_obj'].count()
data_top10_artist = top10_artist.nlargest(10,'singer_obj')
data_top10_artist

Now that we know the top 10 artists on the ASEAN charts, we can plot them as a horizontal bar chart.

In [None]:
#Determine the size of the plot chart image
plt.subplots(figsize=(40,24))

#Then, we define axis chart variables
x = data_top10_artist['singer']
y = data_top10_artist['singer_obj']

#Plotting the chart
plt.barh(x, y, height = 0.9, color = ['#5017c1','#5017c1','#7b2d7f','#92395c','#a7433d','#a7433d','#a26421','#8d7228','#6e8733','#4b9e3f','#17c150'])

# Add annotation to bars
for index, value in enumerate(y):
    plt.text(value, index, str(value), fontsize = 20, fontweight="bold")

#Insert Title, X-axis and Y-axis label
plt.title('Singer with Most Songs on ASEAN Music Charts\nin August 2022', fontname = 'Times New Roman', fontsize = 40, fontweight="bold")
plt.xlabel('\nTotal Songs', fontsize = 24, fontweight="bold")
plt.ylabel('\nArtist Name', fontsize = 24, fontweight="bold")

#Manage the X-ticks and Y-ticks
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=18, fontweight="bold")
plt.tick_params(left = False)

#Limit the X-axis
plt.xlim(xmin = 0, xmax = 29)

#Show the Chart
plt.show()

#### <font color='teal'> Analysis :</font>

* Mark Tuan is the singer with the most songs on the charts, with a total of 24 songs.
* All of the top 10 artists are from outside ASEAN, and 80% of them belong to the Pop and K-Pop genres.

* Next, we compare the number of artists from ASEAN countries with the number from foreign countries.

In [None]:
data['singer_origin_obj'] = data['singer_origin'].astype('str')

singer_origin_data = data.groupby(['singer_origin'], as_index = False)['singer_origin_obj'].count()
singer_origin_data

In [None]:
#Set the figure plot size
plt.figure(figsize=(20,10))

# create plot
myexplode=[0.1,0]
plt.pie(singer_origin_data['singer_origin_obj'],labels = singer_origin_data['singer_origin'] ,autopct='%1.1f%%', explode=myexplode ,startangle=90, textprops={'fontsize': 20,'color':'w'}, labeldistance=None)

# set label & title
plt.title('Comparison of Artists from ASEAN Countries and Foreign Countries\non ASEAN Song Charts August 2022\n', fontname = 'Times New Roman', fontsize = 20, fontweight="bold")

# add legend
plt.legend(bbox_to_anchor=(1.3,0.2), prop={'size': 14})

#Show the plot chart data
plt.show()

#### <font color='teal'>Analysis :</font>

* Foreign artists dominate, accounting for 77.8% of the artists in the data, while artists from ASEAN countries make up 22.2%.

* Next, we build a query to see the number of songs in each genre.

In [None]:
#The distribution of the types of genres on the ASEAN song charts in August 2022
data['genre_obj'] = data['genre']
data_top = data.groupby(['genre'],as_index=False)['genre_obj'].count()
data_top

In [None]:
#resize figure chart
plt.figure(figsize=(30,25))

#input axis variable
x = data_top['genre']
y = data_top['genre_obj']

#Plotting the chart
plot3 = plt.bar(x,y, width = 0.65, color = ['#199536','#198943','#196f5d','#195f6e','#194d80','#19418d','#193995','#253692','#32338f','#462e8a','#632783','#702380','#7b207e','#951978','#95226f','#952e62','#953c53','#954f40','#957419'])
plt.bar_label(plot3, label_type = 'edge', fontsize = 25, fontweight = 'bold')

#Insert Title, X-axis and Y-axis label
plt.title('Comparison Graph of the Number of Songs on the ASEAN Charts in August 2022\nby Music Genre\n', fontname = 'Times New Roman', fontsize = 40, fontweight="bold")
plt.xlabel('Song Genres', fontsize = 30, fontweight="bold")
plt.ylabel('Number of Songs', fontsize = 30, fontweight="bold")

#Manage the X-ticks and Y-ticks
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Chart
plt.show()

#### <font color='teal'>Analysis :</font>

* There are 19 track genres on the ASEAN song charts in August 2022.
* Pop and K-Pop have by far the most tracks; together these two genres make up 55% of the total number of songs.
* J-Pop, Nu Metal, and Reggae are the genres with the fewest songs, with 1 song each.

In [None]:
top10_songs = data.groupby(['song_title','singer','genre'])['songid'].count()
top10_songs.sort_values(ascending=False).head(10)

#### <font color='teal'>Analysis :</font>

Pink Venom is the most widely charted song on the ASEAN Music Charts in August 2022, appearing on the charts of 9 ASEAN countries.
   

* The distribution of tracks by release year is shown with a line chart.

In [None]:
data['year_obj']=data['year'].astype('int')
df_year = data.groupby(['year'],as_index=False)['year_obj'].count()
df_year

In [None]:
#Resize figure chart
plt.figure(figsize=(16,16))

#Input axis variable
x = df_year['year']
y = df_year['year_obj']

#Plotting the chart
plot2 = plt.plot(x,y)

#Axis Chart Setting
plt.yticks([1, 30, 60, 90, 120, 150, 180, 210, 240],['1','30','60','90','120','150','180','210','240'],fontweight="bold", fontsize = 14)
plt.xticks([1969, 1981, 1992, 2001, 2002, 2003, 2008, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022],['1969', '1981', '1992', '2001', '2002', '2003', '2008', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021', '2022'], rotation=90, fontweight="bold", fontsize = 14)
plt.ylim(ymin=-5, ymax=250)
plt.tick_params(bottom = False)

#Set the color, shape, and size of lines and markers
plt.setp(plot2, color='steelblue', linestyle='-',linewidth=3, marker='o', markersize=8, markerfacecolor='firebrick')

#Add Grid
plt.grid(color='grey', linestyle=':', linewidth=0.75)

#Add Title, X-axis and Y-axis label
plt.title('Spread of Number of Songs on ASEAN Charts for August 2022\n(By Year)\n', fontsize = 30, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nYear', fontsize = 20, fontweight = 'bold')
plt.ylabel('Total Song\n', fontsize = 20, fontweight = 'bold')

#Show chart
plt.show()

#### <font color='teal'>Analysis :</font>

The tracks on the ASEAN Song Charts for August 2022 were released between 1969 and 2022. The largest number, 219 tracks, date from 2022.

# <font color='black'>Analysis of Average Attributes in Track Genres</font>

Only genres with more than 5 tracks are included here, so that the averages are more reliable.
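
The cells in this section select those genres by passing hard-coded row indices to `drop`, which depends on the row order of the grouped table. As a hedged alternative, the sketch below filters by track count directly. The rows are invented, `MIN_TRACKS` and `kept_genres` are hypothetical names, and only the column names follow the notebook.

```python
import pandas as pd

# Invented rows; only the column names follow the notebook.
toy = pd.DataFrame({
    "genre": ["Pop"] * 6 + ["K-Pop"] * 7 + ["Reggae"],
    "position": [10, 20, 30, 40, 50, 30] + [5, 15, 25, 35, 45, 25, 25] + [12],
})

MIN_TRACKS = 5  # keep only genres with more than this many rows

# Count rows per genre, then keep the genres above the threshold.
counts = toy["genre"].value_counts()
kept_genres = counts[counts > MIN_TRACKS].index

genre_position = (
    toy[toy["genre"].isin(kept_genres)]
    .groupby("genre", as_index=False)["position"]
    .mean()
)
print(genre_position)
```

The same `kept_genres` index could be reused for every attribute, so the threshold is stated once instead of being repeated as an index list.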

### <font color='dark brown'>Track Genre with Average Track Position on Chart</font>

In [None]:
#Query Data
data_genre_position = data.groupby(['genre'], as_index = False)['position'].mean()
data_genre_position = data_genre_position.drop([2,3,7,9,11,14,15,16,18])
data_genre_position

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_position['genre']
y = data_genre_position['position']

#Visualize the data with Bar Chart
data_genre_position_plot = plt.bar(x,y, color = ['#db2a27','#db6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_position_plot, label_type = 'edge', fontsize = 26, fmt='%.2f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Average Song Positions on Charts\n', fontsize = 40, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Average Song Position on Charts\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.ylim(ymax = 40)
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the chart
plt.show()

#### <font color='teal'>Analysis :</font>

Pop Rock tracks have the best average chart ranking, at position 22.03 out of 50, while Pop Rap tracks have the worst, at 30.86 out of 50 (a lower position number means a higher rank).

### <font color='dark brown'>Track Genre with Average Track Energy Ratio</font>

In [None]:
#Query Data
data_genre_energy = data.groupby(['genre'], as_index = False)['energy'].mean()
data_genre_energy = data_genre_energy.drop([2,3,7,9,11,14,15,16,18])
data_genre_energy

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_energy['genre']
y = data_genre_energy['energy']

#Visualize the data with Bar Chart
data_genre_energy_plot = plt.bar(x,y, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_energy_plot, label_type = 'edge', fontsize = 24, fmt='%.3f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song Energy\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Energy Ratio\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.ylim(ymax = 1)
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Chart
plt.show()

#### <font color='teal'>Analysis :</font>

EDM is the track genre with the highest average energy ratio, followed by K-pop. Meanwhile, RnB/Soul is the track genre with the lowest average energy ratio.

### <font color='dark brown'>Track Genre with Average Track Valence Ratio</font>

In [None]:
#Data Query
data_genre_valence = data.groupby(['genre'], as_index = False)['valence'].mean()
data_genre_valence = data_genre_valence.drop([2,3,7,9,11,14,15,16,18])
data_genre_valence

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_valence['genre']
y = data_genre_valence['valence']

#Visualize the data with Bar Chart
data_genre_valence_plot = plt.bar(x,y, width = 0.9, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_valence_plot, label_type = 'edge', fontsize = 24, fmt='%.4f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song Valence\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Valence Ratio\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.ylim(ymax = 1)
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Chart
plt.show()

#### <font color='teal'>Analysis :</font>

Hip-hop/Rap is the genre with the highest average valence ratio, indicating the most positive mood among the genres. C-Pop has the lowest average valence ratio.

### <font color='dark brown'>Track Genre with Average Track Duration</font>

In [None]:
#Data Query
data_genre_duration = data.groupby(['genre'], as_index = False)['duration'].mean()
data_genre_duration = data_genre_duration.drop([2,3,7,9,11,14,15,16,18])
data_genre_duration

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_duration['genre']
y = data_genre_duration['duration']

#Visualize the data with Bar Chart
data_genre_duration_plot = plt.bar(x,y, width = 0.88, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_duration_plot, label_type = 'edge', fontsize = 22, fmt='%.0f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song Duration\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Duration (ms)\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Chart
plt.show()

#### <font color='teal'>Analysis :</font>

The RnB/Soul genre has the longest average track duration, at 248567 ms, while K-Pop has the shortest, at 197075 ms.
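
Durations in milliseconds are hard to read at a glance, so converting them to minutes and seconds can help. This is a small illustrative sketch; `format_duration` is a hypothetical helper that is not part of the notebook, and it rounds to the nearest second.

```python
def format_duration(ms: float) -> str:
    """Format a duration in milliseconds as 'm:ss', rounded to the nearest second."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


print(format_duration(248567))  # 4:09
print(format_duration(197075))  # 3:17
```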

### <font color='dark brown'>Track Genre with Average Track Speechiness Ratio</font>

In [None]:
#Query Data
data_genre_speechiness = data.groupby(['genre'], as_index = False)['speechiness'].mean()
data_genre_speechiness = data_genre_speechiness.drop([2,3,7,9,11,14,15,16,18])
data_genre_speechiness

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_speechiness['genre']
y = data_genre_speechiness['speechiness']

#Visualize the data with Bar Chart
data_genre_speechiness_plot = plt.bar(x,y, width = 0.88, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_speechiness_plot, label_type = 'edge', fontsize = 22, fmt='%.4f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song Speechiness\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Speechiness Ratio\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Query Data Chart
plt.show()

#### <font color='teal'>Analysis :</font>

Hip-hop/Rap has the highest average speechiness ratio of all the genres.

### <font color='dark brown'>Track Genre with Average BPM</font>

In [None]:
#Query Data
data_genre_bpm = data.groupby(['genre'], as_index = False)['BPM'].mean()
data_genre_bpm = data_genre_bpm.drop([2,3,7,9,11,14,15,16,18])
data_genre_bpm

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_bpm['genre']
y = data_genre_bpm['BPM']

#Visualize the data with Bar Chart
data_genre_bpm_plot = plt.bar(x,y, width = 0.88, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_bpm_plot, label_type = 'edge', fontsize = 22, fmt='%.4f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song BPM\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Beats per Minute\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Query Data Chart
plt.show()

#### <font color='teal'>Analysis :</font>

Pop Rock has the highest average BPM of all the genres, followed by Hip-hop/Rap. Meanwhile, Electropop is the genre with the lowest average BPM.

### <font color='dark brown'>Track Genre with Average Loudness (dB)</font>

In [None]:
data_genre_loudness = data.groupby(['genre'], as_index = False)['loudness'].mean()
data_genre_loudness = data_genre_loudness.drop([2,3,7,9,11,14,15,16,18])
data_genre_loudness

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_loudness['genre']
y = data_genre_loudness['loudness']

#Visualize the data with Bar Chart
data_genre_loudness_plot = plt.bar(x,y, width = 0.88, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_loudness_plot, label_type = 'edge', fontsize = 26, fmt='%.3f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song Loudness\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Loudness (dB)\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.ylim(ymax = 0, ymin = -10)
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Query Data Chart
plt.show()

#### <font color='teal'>Analysis :</font>

Pop tracks have the lowest average loudness of all the genres, followed by Alternative/Indie tracks; in contrast, K-pop tracks have the highest average loudness.

### <font color='dark brown'>Track Genre with Average Acousticness Ratio</font>

In [None]:
#Query Data
data_genre_acousticness = data.groupby(['genre'], as_index = False)['acousticness'].mean()
data_genre_acousticness = data_genre_acousticness.drop([2,3,7,9,11,14,15,16,18])
data_genre_acousticness

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_acousticness['genre']
y = data_genre_acousticness['acousticness']

#Visualize the data with Bar Chart
data_genre_acousticness_plot = plt.bar(x,y, width = 0.88, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_acousticness_plot, label_type = 'edge', fontsize = 26, fmt='%.4f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song Acousticness\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Acousticness\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.ylim(ymax = 1)
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Query Data Chart
plt.show()

#### <font color='teal'>Analysis :</font>

Pop tracks have the highest average acousticness ratio of all the genres, followed by Alternative/Indie tracks; in contrast, EDM tracks have the lowest average ratio.

### <font color='dark brown'>Track Genre with Average Worldwide Popularity Score</font>

In [None]:
#Query Data
data_genre_popularity = data.groupby(['genre'], as_index = False)['popularity'].mean()
data_genre_popularity = data_genre_popularity.drop([2,3,7,9,11,14,15,16,18])
data_genre_popularity

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_genre_popularity['genre']
y = data_genre_popularity['popularity']

#Visualize the data with Bar Chart
data_genre_popularity_plot = plt.bar(x,y, width = 0.88, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#800a70'])
plt.bar_label(data_genre_popularity_plot, label_type = 'edge', fontsize = 26, fmt='%.4f', fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Genres on ASEAN Charts in August 2022\nBased on Song Popularity\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Genre', fontsize = 26, fontweight = 'bold')
plt.ylabel('Popularity\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.ylim(ymax = 100)
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks(fontsize=17, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

#Show the Query Data Chart
plt.show()

#### <font color='teal'>Analysis :</font>

Alternative/Indie tracks have the highest average popularity score, followed by Electropop. Meanwhile, C-Pop tracks have the lowest average popularity score.

### <font color='dark brown'>Track Genre with Song Modes</font>

In [None]:
#To compare the number of tracks in each mode per genre, a color-gradient table is used.
#No Matplotlib figure is needed here because the result is a styled table, not a plot.
data_genre_mode = data[['genre','mode']]
data_genre_mode = (data_genre_mode
        .groupby(['genre', 'mode'])['mode']
        .count()
        .unstack(1))

#The color gradient used is "Red, Yellow, and Green".
data_genre_mode = data_genre_mode.style.background_gradient(cmap='RdYlGn')

#Show Data Table
data_genre_mode

#### <font color='teal'>Analysis :</font>

Pop tracks dominate the song counts in both modes (major and minor).
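
A genre-by-mode count table can also be built with `pd.crosstab`, which fills missing combinations with 0 rather than leaving them empty. The sketch below uses invented rows, and the `minor`/`major` column labels are an assumption added here for readability (the notebook keeps the raw 0 and 1 values).

```python
import pandas as pd

# Invented rows; mode follows the notebook's encoding (1 = major, 0 = minor).
toy = pd.DataFrame({
    "genre": ["Pop", "Pop", "Pop", "K-Pop", "K-Pop", "EDM"],
    "mode": [1, 1, 0, 1, 0, 1],
})

# Count tracks for every (genre, mode) combination.
mode_table = pd.crosstab(toy["genre"], toy["mode"])
mode_table = mode_table.rename(columns={0: "minor", 1: "major"})
print(mode_table)
```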

### <font color='dark brown'>Track Key Analysis</font>

In [None]:
#Query Data
data['key_obj'] = data['key'].astype('str')
data_key = data.groupby(['key'], as_index = False)['key_obj'].count()
data_key

In [None]:
#Resize figure chart
plt.figure(figsize=(30,25))

#Input axis variable
x = data_key['key']
y = data_key['key_obj']

#Visualize the data with Bar Chart
data_key_plot = plt.bar(x,y, width = 0.88, color = ['#db2a27','#dc6c27','#b37d1b','#9e9719','#9ec20e','#329906','#0b9c62','#174673','#3d0f94','#2392a1','#82714d','#524f4a'])
plt.bar_label(data_key_plot, label_type = 'edge', fontsize = 26, fontweight = 'bold')

#Add Title, X-axis and Y-axis label
plt.title('\nComparison Graph of Song Keys on ASEAN Charts in August 2022\n', fontsize = 36, fontname = 'Times New Roman', fontweight = 'bold')
plt.xlabel('\nSong Key', fontsize = 26, fontweight = 'bold')
plt.ylabel('Total Songs\n', fontsize = 26, fontweight = 'bold')

#Axis Chart Settings
plt.ylim(ymax = 100)
plt.yticks(fontsize=18, fontweight="bold")
plt.xticks([0,1,2,3,4,5,6,7,8,9,10,11], ['C','C♯,D♭','D','D♯,E♭','E','F','F♯,G♭','G','G♯,A♭','A','A♯,B♭','B'], fontsize=22, fontweight="bold", rotation = 75)
plt.tick_params(bottom = False)

plt.savefig('Comparison Graph of Song Keys on ASEAN Charts in August 2022.jpg', dpi=300)

#Show the Query Data Chart
plt.show()

#### <font color='teal'>Analysis :</font>

C is the most common key among the tracks, followed by A.

# <font color='k'>Conclusions</font>

Based on the data of the Top 50 ASEAN Song Charts that we have analyzed previously, the following information is obtained:

* There are 349 songs from 245 different artists on the song charts of 10 ASEAN countries.

* Mark Tuan is the singer with the most tracks, with 24. "Pink Venom" by BLACKPINK is the song with the widest chart spread, appearing in 9 out of 10 ASEAN countries.

* The tracks listed on the charts were released between 1969 and 2022, with 2022 releases being the most numerous at 219 tracks.

* The tracks on the ASEAN song charts are divided into 19 genres, with Pop having the most tracks, followed by K-pop.

* Based on the correlation map, valence is positively correlated with danceability and energy. The strongest positive correlation is between energy and loudness, while the strongest negative correlation is between energy and acousticness.

* Genre is associated with different average values for each track attribute: the genre that ranks highest varies from one attribute to another.