
# The dataset

The dataset contains information about 41,099 unique songs found on the popular music streaming service Spotify. The data describing these songs was collected from the Spotify API and merged with data from the Billboard API. All songs in the dataset were released between the 1960s and the 2010s. Spotify algorithmically generates ratings for track features like tempo, acousticness, valence, etc. 

**The dataset contains:**

* 10 numerical variables, including probability-like measures such as the danceability, energy, or speechiness of the songs
* 6 integer variables, such as the key or the mode of the songs
* 4 string variables, such as the name of the song or of the artist

**The variables are the following:**

track: The Name of the track.

artist: The Name of the Artist.

uri: The resource identifier for the track.

danceability: Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

energy: Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.

key: The estimated overall key of the track. Integers map to pitches using standard Pitch Class notation. E.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.

loudness: The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.

mode: Mode indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.

speechiness: Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.

acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic. 

instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0. 

liveness: Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.

valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).

tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.

duration_ms: The duration of the track in milliseconds.

time_signature: An estimated overall time signature of a track. The time signature (meter) is a notational convention to specify how many beats are in each bar (or measure).

chorus_hit: This is the author's best estimate of when the chorus starts in the track. It is the timestamp of the start of the third section of the track. This feature was extracted from the data received by the API call for Audio Analysis of that particular track.

sections: The number of sections the particular track has. This feature was extracted from the data received by the API call for Audio Analysis of that particular track.

target: The target variable for the track, either '0' or '1'. '1' means that the song featured at least once in the weekly Hot-100 list (issued by Billboard) in that decade and is therefore a 'hit'. '0' means that the track is a 'flop'.
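
The encoded variables described above (`key`, `mode`, `speechiness`) can be turned into readable labels. The following is a minimal sketch using only the Python standard library; the function names and label strings are hypothetical helpers introduced for illustration, not part of the dataset or the Spotify API.

```python
# Pitch Class notation: index 0 = C, 1 = C♯/D♭, 2 = D, and so on up to 11 = B.
PITCH_CLASSES = ['C', 'C♯/D♭', 'D', 'D♯/E♭', 'E', 'F',
                 'F♯/G♭', 'G', 'G♯/A♭', 'A', 'A♯/B♭', 'B']


def key_name(key: int) -> str:
    """Map the integer `key` to a pitch name (-1 means no key was detected)."""
    if key == -1:
        return 'no key detected'
    if not 0 <= key <= 11:
        raise ValueError(f'key must be -1 or in 0..11, got {key}')
    return PITCH_CLASSES[key]


def mode_name(mode: int) -> str:
    """Map `mode` to its modality: 1 is major, 0 is minor."""
    return 'major' if mode == 1 else 'minor'


def speechiness_label(value: float) -> str:
    """Apply the 0.33 / 0.66 thresholds from the speechiness description."""
    if value > 0.66:
        return 'probably spoken word'
    if value >= 0.33:
        return 'music and speech'
    return 'probably music'


# Example: key=3, mode=1, speechiness=0.0403
print(key_name(3), mode_name(1), speechiness_label(0.0403))
```

Keeping the thresholds in one function makes it easy to adjust them if a different cut-off is preferred.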

In [None]:
# Base
import numpy as np
import pandas as pd
import math
from scipy.stats import norm
import scipy.stats as stats

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('ggplot')

# Estimation and Models
import statsmodels.stats.proportion as smp       # Estimation of proportions
import statsmodels.stats.weightstats as smw      # Estimation of the Mean
from statsmodels.stats.power import TTestPower   # Power of the Test

In [None]:
from google.colab import drive
drive.mount('Kateryna')

Mounted at Kateryna


In [None]:
music = pd.read_csv('/content/Kateryna/MyDrive/genre_music.csv')

In [None]:
music.head(5)

Unnamed: 0,track,artist,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,duration_s,time_signature,chorus_hit,sections,popularity,decade,genre
0,Jealous Kind Of Fella,Garland Green,0.417,0.62,3,-7.727,1,0.0403,0.49,0.0,0.0779,0.845,185.655,173.533,3,32.94975,9,1,60s,edm
1,Initials B.B.,Serge Gainsbourg,0.498,0.505,3,-12.475,1,0.0337,0.018,0.107,0.176,0.797,101.801,213.613,4,48.8251,10,0,60s,pop
2,Melody Twist,Lord Melody,0.657,0.649,5,-13.392,1,0.038,0.846,4e-06,0.119,0.908,115.94,223.96,4,37.22663,12,0,60s,pop
3,Mi Bomba Sonó,Celia Cruz,0.59,0.545,7,-12.058,0,0.104,0.706,0.0246,0.061,0.967,105.592,157.907,4,24.75484,8,0,60s,pop
4,Uravu Solla,P. Susheela,0.515,0.765,11,-3.515,0,0.124,0.857,0.000872,0.213,0.906,114.617,245.6,4,21.79874,14,0,60s,r&b


In [None]:
music.shape

(41099, 20)

# Liveness

Each decade contains far more than 30 songs (see the counts below), so the sample size is large enough to use the normal approximation.

In [None]:
# Group the data by decade and count the number of songs in each decade
sample_sizes = music.groupby('decade')['track'].count()

# Print the sample sizes for each decade
print(sample_sizes)

decade
00s    5871
10s    6396
60s    8642
70s    7764
80s    6907
90s    5519
Name: track, dtype: int64


**Finding 98% Confidence Intervals of the `liveness` of all the songs per decade.**


To calculate the 98% confidence intervals of the liveness of all songs per decade, we need the sample mean and standard deviation for each decade.

Formula:

X̄ ± Z × σ/√n

SE = σ/√n

where:
* Z is the critical value of the standard normal distribution for the chosen confidence level (about 2.326 for 98%),
* X̄ is the sample mean,
* σ is the standard deviation (estimated from the sample),
* n is the sample size,
* SE is the standard error of the mean.
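
To make the formula concrete, here is a minimal sketch that evaluates it by hand with the Python standard library. The sample is synthetic (hypothetical liveness-like values drawn from a Beta distribution), so the resulting interval is only an illustration and not a result from the Spotify data.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical data: 5000 values in [0, 1], standing in for one decade's liveness
random.seed(0)
sample = [random.betavariate(2, 8) for _ in range(5000)]

alpha = 0.02                             # 98% confidence level
n = len(sample)                          # sample size
x_bar = mean(sample)                     # sample mean
s = stdev(sample)                        # sample standard deviation (estimate of sigma)

z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
se = s / sqrt(n)                         # standard error of the mean
low, up = x_bar - z * se, x_bar + z * se

print(f'Z = {z:.4f}, SE = {se:.5f}')
print(f'The {1 - alpha:.0%} CI is [{low:.4f}, {up:.4f}]')
```

The interval narrows as n grows because the standard error shrinks with √n.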

In [None]:
# Data -----------------------
alpha = 0.02

# Decades --------------------
decades = ['00s', '10s', '60s', '70s', '80s', '90s']

# Confidence Intervals -------
for decade in decades:
  # Descriptive ----------------
  liveness = smw.DescrStatsW(music[music['decade'] == decade]['liveness'])

  if liveness.nobs > 30:
    print(f'Using the Normal Approximation for {decade}')
    low, up = liveness.zconfint_mean(alpha)
  else:
    print(f'Using the t-Distribution for {decade}')
    low, up = liveness.tconfint_mean(alpha)

  print(f'The {1-alpha:2.0%} CI for the liveness for all songs in {decade} is\
  [{low:4.4f}, {up:4.4f}]')


Using the Normal Approximation for 00s
The 98% CI for the liveness for all songs in 00s is  [0.1912, 0.2010]
Using the Normal Approximation for 10s
The 98% CI for the liveness for all songs in 10s is  [0.1919, 0.2016]
Using the Normal Approximation for 60s
The 98% CI for the liveness for all songs in 60s is  [0.2093, 0.2178]
Using the Normal Approximation for 70s
The 98% CI for the liveness for all songs in 70s is  [0.1950, 0.2046]
Using the Normal Approximation for 80s
The 98% CI for the liveness for all songs in 80s is  [0.1959, 0.2061]
Using the Normal Approximation for 90s
The 98% CI for the liveness for all songs in 90s is  [0.1918, 0.2027]


# Danceability

It is usually assumed that songs written in the minor modality are sadder songs than those written in the major modality. Let's analyze this from the perspective of the danceability of the songs.

**Finding the 99% confidence intervals of the danceability of all the songs with respect to their mode**

X̄ ± Z×σ/√n

SE =σ/√n

Here Z is the critical value for the 99% confidence level, X̄ is the sample mean danceability of the songs in a given mode, σ is the standard deviation of their danceability, and n is the number of songs in that mode.

In [None]:
# Data -----------------------
alpha = 0.01
modes = {1: 'major', 0: 'minor'}   # mode code -> label

# Confidence Intervals -------
for mode, label in modes.items():
  # Descriptive ----------------
  danceability = smw.DescrStatsW(music[music['mode'] == mode]['danceability'])

  if danceability.nobs > 30:
    print('Using the Normal Approximation')
    low, up = danceability.zconfint_mean(alpha)
  else:
    print('Using the t-Distribution')
    low, up = danceability.tconfint_mean(alpha)

  print(f'The {1-alpha:1.0%} CI for the danceability with regards to {label} mode is '
        f'[{low}, {up}]')


Using the Normal Approximation
The 99% CI for the danceability with regards to major mode is [0.5331865426912457, 0.5384368702995959]
Using the Normal Approximation
The 99% CI for the danceability with regards to minor mode is [0.5440791010214676, 0.5527945698244303]


The two confidence intervals do not overlap: minor-mode songs have a slightly higher mean danceability (about 0.548) than major-mode songs (about 0.536). The difference is statistically detectable at this confidence level but small in absolute terms, roughly 0.01 on a scale from 0 to 1. From the perspective of danceability, the data therefore do not support the assumption that minor-mode songs are sadder; if anything, they are marginally more danceable.
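
The comparison can be made more direct with a confidence interval for the difference in mean danceability between the two modes. The sketch below recovers each group's mean and standard error from the 99% intervals printed above (rounded to five decimals) and assumes the two groups are independent; it is an illustration of the idea, not a step of the original analysis.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.01
z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for 99%

# 99% intervals reported by the notebook, rounded
major_ci = (0.53319, 0.53844)
minor_ci = (0.54408, 0.55279)


def mean_and_se(ci, z):
    """Recover the centre and standard error of a symmetric z-interval."""
    low, up = ci
    return (low + up) / 2, (up - low) / (2 * z)


mean_major, se_major = mean_and_se(major_ci, z)
mean_minor, se_minor = mean_and_se(minor_ci, z)

# Difference of means; variances add under the independence assumption
diff = mean_minor - mean_major
se_diff = sqrt(se_major ** 2 + se_minor ** 2)
diff_low, diff_up = diff - z * se_diff, diff + z * se_diff

print(f'Difference (minor - major): {diff:.4f}')
print(f'99% CI for the difference: [{diff_low:.4f}, {diff_up:.4f}]')
```

If the interval for the difference excludes zero, the two means differ at the chosen confidence level; its width shows how small or large that difference plausibly is.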

# Tempo

Finding the p-value of the test for each decade, where H0: mean tempo ≤ 120 bpm and H1: mean tempo > 120 bpm.

In [None]:
# Data -------------------------------------
decades = ['00s', '10s', '60s', '70s', '80s', '90s']
alpha = 0.05
mu0 = 120

# Test -------------------------------------
for decade in decades:
    tempo = smw.DescrStatsW(music[music['decade'] == decade]['tempo'])
    if tempo.nobs > 30:
        print(f'{decade}: Using the Normal approximation')
        zstat, pval = tempo.ztest_mean(mu0, alternative='larger')
        print(f'Statistic: {zstat:4.4f}')
    else:
        print(f'{decade}: Using the t-distribution')
        tstat, pval, dof = tempo.ttest_mean(mu0, alternative='larger')
        print(f'Statistic: {tstat:4.4f}')

    print(f'Significance Level: {alpha:.2f}')
    print(f'p-value: {pval:4.4%}')
    print('Reject H0' if pval < alpha else 'Fail to Reject H0')
    print('------------------------------------------------')

00s: Using the Normal approximation
Statistic: 4.0709
Significance Level: 0.05
p-value: 0.0023%
Reject H0
------------------------------------------------
10s: Using the Normal approximation
Statistic: 6.3015
Significance Level: 0.05
p-value: 0.0000%
Reject H0
------------------------------------------------
60s: Using the Normal approximation
Statistic: -15.5136
Significance Level: 0.05
p-value: 100.0000%
Fail to Reject H0
------------------------------------------------
70s: Using the Normal approximation
Statistic: -3.0769
Significance Level: 0.05
p-value: 99.8954%
Fail to Reject H0
------------------------------------------------
80s: Using the Normal approximation
Statistic: 2.0046
Significance Level: 0.05
p-value: 2.2505%
Reject H0
------------------------------------------------
90s: Using the Normal approximation
Statistic: -2.9323
Significance Level: 0.05
p-value: 99.8318%
Fail to Reject H0
------------------------------------------------
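
For a right-tailed z-test, the p-value is the upper-tail probability of the standard normal distribution at the observed statistic, p = 1 − Φ(z). As a minimal sketch, the following recomputes the p-values from the statistics printed above using only the standard library; small discrepancies in the last digit are possible because the statistics are rounded to four decimals.

```python
from statistics import NormalDist


def upper_tail_p(z_stat: float) -> float:
    """p-value of a right-tailed z-test: P(Z > z_stat) for standard normal Z."""
    return 1 - NormalDist().cdf(z_stat)


# z statistics reported by the notebook for each decade
reported = {'00s': 4.0709, '10s': 6.3015, '60s': -15.5136,
            '70s': -3.0769, '80s': 2.0046, '90s': -2.9323}

alpha = 0.05
p_values = {decade: upper_tail_p(z) for decade, z in reported.items()}

for decade, p in p_values.items():
    decision = 'Reject H0' if p < alpha else 'Fail to Reject H0'
    print(f'{decade}: p-value = {p:.4%} -> {decision}')
```

A negative statistic means the sample mean lies below 120 bpm, which is why those decades have p-values close to 100% under this alternative.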


Consider just the 10s decade, and suppose that the relevant case is an average tempo of 121 bpm.

We are looking to find:
* The probability of detecting a tempo of 121 bpm if that were truly the case (the power of the test)
* The probability of a type II error
* The sample size needed to detect this tempo with a probability of 99.99% at a significance level of 0.01%

The decision scheme now becomes

H0: μ = 120, H1: μ ≠ 120

In [None]:
# Power Object ----------------------------------
pwr = TTestPower()
dec_sub = smw.DescrStatsW(music[music['decade'] == '10s']['tempo'])


# Data ------------------------------------------
mu0 = 120
mu1 = 121
alpha = 0.0001
nobs = dec_sub.nobs
std = dec_sub.std
effectsize = (mu1 - mu0)/std   # standardized effect; the sign does not matter for a two-sided test
alt = 'two-sided'

# Power -----------------------------------------
power = pwr.solve_power(effect_size = effectsize,
                nobs = nobs,
                alpha = alpha,
                alternative = alt)

print(f"The probability of detecting a tempo of {mu1} bpm if it is truly the case is {power:.2%}")

The probability of detecting a tempo of 121 bpm if it is truly the case is 11.26%


The probability of making a type II error (β) is the probability of failing to reject the null hypothesis when the true mean actually differs from the null value (here, μ1 = 121 instead of μ0 = 120). It is the complement of the power: β = 1 − power.

In [None]:
# Parameters
mu0 = 120
mu1 = 121
alpha = 0.0001
nobs = dec_sub.nobs
std = dec_sub.std
effectsize = (mu1 - mu0) / std 
alt = 'two-sided'

# Power calculation
power = pwr.solve_power(effect_size=effectsize, nobs=nobs, alpha=alpha, alternative=alt)

# Type II error probability
beta = 1 - power
print(f"The probability of making a type II error is {beta:.2%}")


The probability of making a type II error is 88.74%
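
The power and the type II error can also be approximated by hand. The sketch below uses the normal approximation rather than the noncentral t-distribution that `TTestPower` relies on, and it assumes a standard deviation of 30 bpm, which is a hypothetical round value (the notebook does not print the actual one). The result should therefore be close to, but not identical with, the figures above.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

mu0, mu1 = 120, 121
alpha = 0.0001
n = 6396          # number of songs in the 10s (from the counts per decade)
sigma = 30.0      # ASSUMPTION: hypothetical standard deviation of tempo, in bpm

delta = (mu1 - mu0) / sigma                  # standardized effect size
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # two-sided critical value
shift = delta * sqrt(n)                      # mean of the test statistic under H1

# Probability that the statistic falls in either rejection region under H1
power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
beta = 1 - power                             # type II error probability

print(f'Approximate power: {power:.2%}')
print(f'Approximate type II error: {beta:.2%}')
```

The power is low because the very small significance level pushes the critical value far out, while a 1 bpm shift is small relative to the spread of tempos.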


In [None]:
# Parameters
mu0 = 120
mu1 = 121
alpha = 0.0001
power = 0.9999
std = dec_sub.std
effectsize = (mu1 - mu0) / std
alt = 'two-sided'

# Sample size calculation
nobs = pwr.solve_power(effect_size=effectsize, alpha=alpha, power=power, alternative=alt)
print(f"The sample size required to detect a tempo of {mu1} bpm with a power of {power:.2%} and a significance level of {alpha} is {nobs:4.0f}")


The sample size required to detect a tempo of 121 bpm with a power of 99.99% and a significance level of 0.0001 is 51602
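
As a rough cross-check, the required sample size has a closed form under the normal approximation: n ≈ ((z₁₋α/₂ + z_power) / δ)², where δ is the standardized effect size. The sketch below again assumes a hypothetical standard deviation of 30 bpm, so it should land near the value above without matching it exactly.

```python
from math import ceil
from statistics import NormalDist

std_normal = NormalDist()

mu0, mu1 = 120, 121
alpha = 0.0001
target_power = 0.9999
sigma = 30.0      # ASSUMPTION: hypothetical standard deviation of tempo, in bpm

delta = (mu1 - mu0) / sigma                   # standardized effect size
z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # two-sided critical value
z_power = std_normal.inv_cdf(target_power)    # quantile matching the target power

# Smallest integer n that reaches the target power (far tail neglected)
n_required = ceil(((z_alpha + z_power) / delta) ** 2)

print(f'Approximate sample size required: {n_required}')
```

Because the required size grows with 1/δ², halving the detectable difference would roughly quadruple the number of songs needed.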


