## Statistical Validation of Retention Drivers

In this section, we test whether observed EDA patterns reflect real statistical
associations between audio features and chart retention, rather than noise.

Given the heavy-tailed nature of `weeks_on_chart`, we use:
- Log-transformed retention
- Rank-based (Spearman) correlations
- Non-parametric permutation tests
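
One detail about how these choices interact: Spearman's ρ depends only on ranks, and `log1p` is strictly increasing, so the log transform leaves the rank correlations unchanged. It matters for plots and for later mean-based models, not for the Spearman results. A minimal sketch on synthetic data (the `feature` and `weeks` arrays are hypothetical stand-ins, not the Spotify data):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical stand-ins: one audio feature and a heavy-tailed, integer week count.
feature = rng.normal(size=500)
weeks = np.floor(np.exp(1.0 + 0.3 * feature + rng.normal(size=500)))

# Spearman on the raw target and on the log1p-transformed target.
rho_raw, _ = spearmanr(feature, weeks)
rho_log, _ = spearmanr(feature, np.log1p(weeks))

print(f"rho on raw weeks:   {rho_raw:.6f}")
print(f"rho on log1p weeks: {rho_log:.6f}")
```

Both printed values should agree, because the two targets have identical ranks (ties included).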


In [5]:
import numpy as np
import pandas as pd
from scipy.stats import spearmanr
import matplotlib.pyplot as plt


In [6]:
DATA_PATH = r"D:\Data_analysis_BI\Project\Spotify\data\spotify_top_songs_audio_features.csv"

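df = pd.read_csv(DATA_PATH)

# log1p compresses the heavy right tail of weeks_on_chart and is defined at 0
df["log_weeks"] = np.log1p(df["weeks_on_chart"])

# Preview last: only the final expression in a cell is displayed
df.head(5)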


In [7]:
features = [
    "danceability",
    "energy",
    "acousticness",
    "valence",
    "loudness",
    "tempo",
    "speechiness",
    "instrumentalness",
    "liveness"
]


In [8]:
results = []

for feat in features:
    rho, p = spearmanr(df[feat], df["log_weeks"])
    results.append({
        "feature": feat,
        "spearman_rho": rho,
        "p_value": p
    })

stats_df = pd.DataFrame(results).sort_values(
    by="spearman_rho", key=np.abs, ascending=False
)

stats_df


|   | feature | spearman_rho | p_value |
|---|---|---|---|
| 3 | valence | 0.115056 | 1.224787e-20 |
| 4 | loudness | 0.114984 | 1.294581e-20 |
| 6 | speechiness | -0.081618 | 4.208452e-11 |
| 0 | danceability | 0.077593 | 3.604031e-10 |
| 8 | liveness | -0.064847 | 1.624734e-07 |
| 1 | energy | 0.056337 | 5.382062e-06 |
| 5 | tempo | -0.022512 | 0.06927073 |
| 7 | instrumentalness | -0.01929 | 0.1195546 |
| 2 | acousticness | -0.009642 | 0.436543 |


## Interpretation: Audio Features vs Retention (Spearman Correlation)

The table above reports Spearman rank correlations between each audio feature and
log-transformed chart retention (`log(weeks_on_chart + 1)`).

### Key observations

- All statistically significant correlations are **weak** in magnitude
  (|ρ| ≈ 0.06–0.12), indicating that **no single audio feature strongly determines longevity**.
- Despite small effect sizes, six of the nine features have **very small p-values**
  (p < 10⁻⁵), so these associations are unlikely to be sampling noise. The p-values
  speak to reliability, not to practical importance.
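
Because nine features are tested at once, it is worth checking that the significant ones survive a multiple-comparison correction. The sketch below applies a Holm step-down adjustment to the p-values reported in the table above. `holm_adjust` is a hypothetical helper written for this illustration, not part of the original analysis.

```python
import numpy as np

# Spearman p-values copied from the results table, in table order.
feature_names = ["valence", "loudness", "speechiness", "danceability", "liveness",
                 "energy", "tempo", "instrumentalness", "acousticness"]
p_values = np.array([1.224787e-20, 1.294581e-20, 4.208452e-11, 3.604031e-10,
                     1.624734e-07, 5.382062e-06, 0.06927073, 0.1195546, 0.436543])

def holm_adjust(p):
    """Return Holm step-down adjusted p-values in the original order."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The k-th smallest p-value (k = 0..m-1) is multiplied by (m - k).
    scaled = p[order] * (m - np.arange(m))
    # Enforce monotonicity across the sorted values and cap at 1.
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted

p_holm = holm_adjust(p_values)
significant = [f for f, p in zip(feature_names, p_holm) if p < 0.05]
print(significant)
```

With these inputs the same six features remain below 0.05 after adjustment, so the conclusions here do not depend on ignoring the number of tests.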

### Features with consistent positive association
- **Valence (ρ ≈ 0.115)** and **Loudness (ρ ≈ 0.115)** show the strongest positive relationships.
  Tracks that sound more positive and louder tend to remain on the charts slightly longer.
- **Danceability** and **Energy** also show small but statistically reliable positive associations (ρ ≈ 0.08 and 0.06).

### Features with negative association
- **Speechiness** and **Liveness** are negatively correlated with retention.
  Tracks with more spoken-word content or live-recording characteristics tend to churn faster.

### Features with no detectable effect
- **Acousticness**, **Instrumentalness**, and **Tempo** do not reach significance at the
  0.05 level (p ≈ 0.44, 0.12, and 0.07). Tempo is borderline, but its correlation
  (ρ ≈ -0.02) is negligible in size either way.

### Interpretation

These results suggest that:
- Audio features provide **weak but real signals** about longevity.
- Retention is driven by **many interacting factors** (artist popularity, marketing,
  platform dynamics), not audio characteristics alone.
- This motivates **multivariate modeling** and **ranking-based objectives** rather than
  single-feature or mean-based prediction.

This pattern closely mirrors retention dynamics observed in games, streaming platforms,
and other media systems.


## Permutation Test: Are Correlations Beyond Random Chance?

Spearman correlations quantify the size and direction of each association.
The permutation test checks them against an empirical null: shuffling `log_weeks`
breaks any real pairing between feature and target, so the shuffled correlations
show what random alignment alone would produce.


In [9]:
target = df["log_weeks"].values
n_perm = 1000
rng = np.random.default_rng(42)

results = []

for feat in features:
    x = df[feat].values

    # Observed Spearman correlation
    obs_rho, _ = spearmanr(x, target)

    # Permutation distribution
    perm_rhos = []
    for _ in range(n_perm):
        permuted_target = rng.permutation(target)
        rho, _ = spearmanr(x, permuted_target)
        perm_rhos.append(rho)

    perm_rhos = np.array(perm_rhos)

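    # Two-sided permutation p-value: share of permuted |rho| values at least as
    # large as the observed |rho|. With n_perm = 1000 this can come out as
    # exactly 0.0, which should be read as p < 1/n_perm, not as a true zero.
    p_perm = np.mean(np.abs(perm_rhos) >= np.abs(obs_rho))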

    results.append({
        "feature": feat,
        "observed_spearman": obs_rho,
        "perm_p_value": p_perm
    })

perm_df = pd.DataFrame(results).sort_values(
    by="observed_spearman", key=lambda x: np.abs(x), ascending=False
)

perm_df

|   | feature | observed_spearman | perm_p_value |
|---|---|---|---|
| 3 | valence | 0.115056 | 0.0 |
| 4 | loudness | 0.114984 | 0.0 |
| 6 | speechiness | -0.081618 | 0.0 |
| 0 | danceability | 0.077593 | 0.0 |
| 8 | liveness | -0.064847 | 0.0 |
| 1 | energy | 0.056337 | 0.0 |
| 5 | tempo | -0.022512 | 0.09 |
| 7 | instrumentalness | -0.01929 | 0.114 |
| 2 | acousticness | -0.009642 | 0.425 |
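
A `perm_p_value` of exactly 0.0 is a resolution limit of the estimator, not evidence of a zero probability. One common alternative is the add-one estimator `(b + 1) / (n_perm + 1)`, where `b` counts the permutations at least as extreme as the observed value. The sketch below also ranks the data once and permutes the ranks, which gives the same null distribution with less work. `spearman_perm_test` and the synthetic `x`, `y` are hypothetical names for illustration, not the notebook's code.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

def spearman_perm_test(x, y, n_perm=1000, seed=42):
    """Two-sided permutation test for Spearman's rho.

    Returns (observed_rho, p_value) using the add-one estimator
    (b + 1) / (n_perm + 1), which can never be exactly zero.
    """
    rng = np.random.default_rng(seed)
    # Rank once (average ranks for ties), then standardize so that the mean
    # of the elementwise product equals Pearson's r computed on the ranks.
    rx, ry = rankdata(x), rankdata(y)
    zx = (rx - rx.mean()) / rx.std()
    zy = (ry - ry.mean()) / ry.std()

    obs = float(np.mean(zx * zy))
    # Shuffling the standardized ranks is equivalent to shuffling y itself.
    perm = np.array([np.mean(zx * rng.permutation(zy)) for _ in range(n_perm)])
    exceed = int(np.sum(np.abs(perm) >= abs(obs)))
    return obs, (exceed + 1) / (n_perm + 1)

# Synthetic example with a weak built-in association.
demo_rng = np.random.default_rng(0)
x = demo_rng.normal(size=2000)
y = 0.1 * x + demo_rng.normal(size=2000)

rho, p = spearman_perm_test(x, y)
rho_scipy, _ = spearmanr(x, y)
print(f"rho={rho:.4f} (scipy: {rho_scipy:.4f}), permutation p={p:.4f}")
```

Under this estimator, the smallest reportable p-value with 1,000 permutations is 1/1001, which makes the resolution limit explicit in the output.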


### Permutation Test Interpretation

The permutation test agrees with the analytic p-values: six of the nine features
show correlations with chart retention that no shuffled sample matched.

- A `perm_p_value` of 0.0 means that none of the 1,000 permutations produced a |ρ|
  as large as the observed one, so it should be read as p < 0.001, not as a true zero.
- **Valence** and **Loudness** show the strongest associations.
- **Danceability, Energy, Speechiness, and Liveness** also survive permutation,
  indicating real but weak signal.
- **Tempo, Instrumentalness, and Acousticness** (permutation p ≈ 0.09, 0.11, and 0.43)
  cannot be distinguished from chance at the 0.05 level.

Importantly, even the strongest effects remain small (|ρ| ≈ 0.1), reinforcing
that audio features alone cannot explain long-term chart longevity.
Their value lies in **constraining outcomes and supporting ranking or
risk-filtering decisions**, not in precise prediction.
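
To put "small" in concrete terms: since Spearman's ρ is Pearson's r computed on ranks, ρ² can be read as the share of variance in retention ranks that is linearly associated with the ranks of a feature. A quick calculation using the coefficients reported above (the `observed` dictionary is just those values copied by hand):

```python
# Observed Spearman coefficients for the six features that passed both tests.
observed = {
    "valence": 0.115056,
    "loudness": 0.114984,
    "speechiness": -0.081618,
    "danceability": 0.077593,
    "liveness": -0.064847,
    "energy": 0.056337,
}

# rho^2: share of rank variance associated with each feature on its own.
shared = {feat: rho ** 2 for feat, rho in observed.items()}

for feat, r2 in shared.items():
    print(f"{feat:<13} rho={observed[feat]:+.3f}  rho^2={r2:.4f}")
```

Even for valence and loudness this comes to roughly 1.3% of rank variance each, which is why these features are better suited to coarse ranking than to point prediction.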


## Transition to Modeling

The statistical analysis shows that audio features contain **real but weak**
signal with respect to chart longevity. While insufficient for precise
prediction, these features may still be useful for **risk-based filtering and
ranking**.

In the next stage, we evaluate whether combining these weak signals in a
multivariate model can:
- distinguish short-lived tracks from those with sustained retention
- identify feature interactions not visible in univariate analysis

This shifts the focus from explaining outcomes to supporting practical
decision-making.
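
As a rough illustration of why combining weak signals can help, the sketch below builds a naive equal-weight composite score on synthetic data and compares its rank correlation with that of each individual feature. Everything here is an assumption for demonstration: the data are simulated with independent features and a weak linear effect, and the composite is not the model used in the next stage.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n, k = 5000, 6

# Hypothetical data: k independent features, each weakly related to the target.
X = rng.normal(size=(n, k))
y = X @ np.full(k, 0.1) + rng.normal(size=n)

# Univariate rank correlations, one per feature.
single = [spearmanr(X[:, j], y)[0] for j in range(k)]

# Naive composite: z-score each feature, flip by the sign of its univariate
# correlation, and sum. Signs are estimated in-sample here; a real evaluation
# would estimate them on training data and score held-out tracks.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
composite = Z @ np.sign(single)
rho_composite = spearmanr(composite, y)[0]

print("largest single-feature |rho|:", round(max(abs(r) for r in single), 3))
print("composite rho:               ", round(rho_composite, 3))
```

In this idealized setting the composite ranks outcomes noticeably better than any single feature. Real audio features are correlated with one another, so the gain on the actual data may be smaller.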