# Spotify Danceability vs Valence Mini Analysis

This project is a beginner-friendly exploratory data analysis of Spotify audio features.  
We focus on the relationship between **danceability**, **valence (happiness)**, and **energy**.

The dataset comes from the TidyTuesday project (the 2020-01-21 `spotify_songs.csv` file) and contains audio features of popular songs.

Tools used:
- Python
- Pandas
- Matplotlib
- Google Colab


In [None]:
import pandas as pd
url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2020/2020-01-21/spotify_songs.csv"
df = pd.read_csv(url)
df.head()

In [None]:
display(df.shape)
display(df.dtypes)
display(df.describe())

## Dataset Overview

Before diving into analysis, we check the size, column types, and summary statistics of the dataset.
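
The histograms later in this notebook call `.dropna()`, so it can also help to count missing values per column at this stage. The following is a minimal sketch, not part of the original analysis: `missing_summary` is a hypothetical helper, and `sample` is a tiny made-up frame standing in for the real `df`.

```python
import pandas as pd

def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and share of missing values for each column."""
    counts = frame.isna().sum()
    return pd.DataFrame({
        "missing": counts,
        "share": counts / len(frame),
    })

# Made-up rows for illustration only; in the notebook, pass df instead.
sample = pd.DataFrame({
    "danceability": [0.75, 0.62, None, 0.48],
    "valence": [0.52, None, None, 0.91],
    "energy": [0.92, 0.81, 0.33, 0.45],
})
summary = missing_summary(sample)
print(summary)
```

In the notebook itself, `missing_summary(df)` would show whether `.dropna()` actually removes any rows before plotting.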


In [None]:
df["energy"].describe()

In [None]:
df["valence"].describe()

In [None]:
# Import matplotlib for plotting
import matplotlib.pyplot as plt

# Plotting the distribution of energy levels in songs
plt.hist(df["energy"].dropna(), bins=30, color="orchid", edgecolor="black")
plt.title("Energy Distribution of Spotify Songs")
plt.xlabel("Energy Level (0-1)")
plt.ylabel("Number of Songs")
plt.show()

In [None]:
import matplotlib.pyplot as plt

# Plotting the distribution of valence (happiness) levels
plt.figure(figsize=(10, 6))
plt.hist(df["valence"].dropna(), bins=30,  color="teal", edgecolor="black")
plt.title("Valence Distribution of Spotify Songs")
plt.xlabel("Valence (0 = Sad, 1 = Happy)")
plt.ylabel("Number of Songs")
plt.grid(True)
plt.show()

In [None]:
# Displaying summary statistics for danceability and valence
print("Danceability:")
print(df["danceability"].describe())

print("\nValence:")
print(df["valence"].describe())

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(10,6))
plt.scatter(df["danceability"], df["valence"], color="pink", alpha=0.5)
plt.title("Danceability vs Valence")
plt.xlabel("Danceability (0 = Not Danceable, 1 = Very Danceable)")
plt.ylabel("Valence (0 = Sad, 1 = Happy)")
plt.grid(True)
plt.show()

### Spotify Danceability vs Valence Analysis
In the energy-colored scatter plot below:
- X-axis shows the danceability of songs (how suitable they are for dancing)
- Y-axis shows the valence of songs (how happy or positive they sound)
- Color indicates the energy level of each song (using the `magma` colormap):
  - Lighter yellow/orange tones represent high-energy songs
  - Darker purple/black tones indicate low-energy songs



In [None]:
plt.figure(figsize=(7,4))
plt.scatter(df["danceability"], df["valence"], c=df["energy"], cmap="magma", alpha=0.5)
plt.title("Danceability vs Valence (Colored by Energy)")
plt.xlabel("Danceability (0 = Not Danceable, 1 = Very Danceable)")
plt.ylabel("Valence (0 = Sad, 1 = Happy)")
plt.colorbar(label="Energy")
plt.show()


## Project Summary
In this mini project, we explored the relationship between danceability and valence (happiness) in songs. Using a Spotify dataset, we visualized how these attributes vary and whether high danceability corresponds to high valence.
We also added energy as a third variable, shown through color coding, to add another layer of insight.

### Key Insights:
- Most songs have danceability between 0.4 and 0.8.
- High danceability doesn't always mean high happiness.
- Some songs are danceable but low energy (e.g., chill music).
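
The insights above come from reading the plots, so a few numbers can help back them up. The sketch below is one possible way to do that, not the method used in the original analysis: `check_insights` is a hypothetical helper, the 0.7 and 0.4 cut-offs for "danceable" and "low energy" are assumptions, and `sample` is a made-up frame standing in for the real `df`.

```python
import pandas as pd

def check_insights(frame: pd.DataFrame) -> dict:
    """Compute simple numbers behind the three key insights."""
    dance = frame["danceability"]
    # Share of songs with danceability in [0.4, 0.8], both ends included
    mid_share = dance.between(0.4, 0.8).mean()
    # Pearson correlation; values near 0 suggest a weak linear relationship
    corr = dance.corr(frame["valence"])
    # Danceable but low-energy songs (both thresholds are assumptions)
    chill = frame[(dance >= 0.7) & (frame["energy"] <= 0.4)]
    return {"mid_share": mid_share, "corr": corr, "chill_count": len(chill)}

# Made-up rows for illustration only; in the notebook, pass df instead.
sample = pd.DataFrame({
    "danceability": [0.30, 0.50, 0.70, 0.90, 0.75],
    "valence": [0.20, 0.80, 0.40, 0.60, 0.30],
    "energy": [0.90, 0.60, 0.35, 0.80, 0.30],
})
result = check_insights(sample)
print(result)
```

Running `check_insights(df)` on the full dataset would give the real share, correlation, and count. The thresholds are easy to adjust if a different definition of "danceable" or "low energy" fits better.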

This project was my first attempt at exploring data visually using pandas and matplotlib.
