# Music Genre Classification with Decision Tree Classifier

This notebook demonstrates the steps to build a simple music genre classification model using a Decision Tree Classifier. The model is trained to predict music genres from demographic features, such as age and gender, and is then evaluated on a held-out test set.

## Install Pandas
This command installs pandas, the data analysis library used here to load and inspect the dataset.

In [None]:
pip install pandas

## Install or Update Scikit-Learn
This command installs or updates Scikit-Learn, a machine learning library in Python that provides tools for model building, evaluation, and data processing.

In [None]:
pip install -U scikit-learn

## Import Libraries
These imports include:

- **Pandas** for data handling.
- **DecisionTreeClassifier** for creating and training a decision tree model.
- **train_test_split** to split data into training and testing sets.
- **accuracy_score** to evaluate model performance.

In [None]:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

## Load the Dataset
Loads the data/music.csv file into a DataFrame df and displays its contents to inspect the data.

In [None]:
df = pd.read_csv("data/music.csv")
df
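
Displaying the whole DataFrame works for a small file, but a few targeted checks are often quicker to read. The sketch below is illustrative only: it assumes the CSV has the columns age, gender, and genre, and it reads hypothetical rows from an in-memory string rather than the real data/music.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/music.csv; the real file's rows may differ.
csv_text = """age,gender,genre
20,1,HipHop
23,1,HipHop
26,1,Jazz
30,1,Jazz
21,0,Dance
24,0,Dance
27,0,Acoustic
31,0,Acoustic
"""

sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)         # (rows, columns)
print(sample_df.dtypes)        # column types as pandas inferred them
print(sample_df.head(3))       # first three rows only
print(sample_df.isna().sum())  # count of missing values per column
```

The same four calls can be run on df once the real file is loaded.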

## Data Summary
Displays summary statistics for numerical columns in the DataFrame, which helps in understanding data distribution and identifying potential issues.

In [None]:
df.describe()
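
By default, describe() summarises only numeric columns, so a text column such as genre is left out. A minimal sketch of how the class distribution could be checked alongside it, using hypothetical rows (not the real dataset) with the assumed columns age, gender, and genre:

```python
import pandas as pd

# Hypothetical rows shaped like the notebook's data; not the real music.csv.
sample_df = pd.DataFrame({
    "age": [20, 23, 26, 30, 21, 24, 27, 31],
    "gender": [1, 1, 1, 1, 0, 0, 0, 0],
    "genre": ["HipHop", "HipHop", "Jazz", "Jazz",
              "Dance", "Dance", "Acoustic", "Acoustic"],
})

numeric_summary = sample_df.describe()            # numeric columns only
genre_counts = sample_df["genre"].value_counts()  # number of rows per genre

print(numeric_summary)
print(genre_counts)
```

A strongly unbalanced genre_counts would be worth knowing about before interpreting an accuracy score.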

## View Data Values
Returns the underlying values of the DataFrame as a NumPy array, which can be useful for certain types of numerical analysis.

In [None]:
df.values
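
The pandas documentation recommends DataFrame.to_numpy() over the .values attribute. The sketch below, on two hypothetical rows, shows one practical consequence of converting a DataFrame with mixed column types: the resulting array falls back to the generic object dtype, whereas selecting only the numeric columns keeps a numeric dtype.

```python
import pandas as pd

# Hypothetical rows; column names are assumed to match the notebook's data.
sample_df = pd.DataFrame({
    "age": [20, 26],
    "gender": [1, 0],
    "genre": ["HipHop", "Jazz"],
})

all_values = sample_df.to_numpy()                         # numbers and text together
numeric_values = sample_df[["age", "gender"]].to_numpy()  # integer columns only

print(all_values.dtype)
print(numeric_values.dtype)
```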

## Define Features
Creates a DataFrame X containing all columns except genre, which will be used as model input features.

In [None]:
X = df.drop(columns=['genre'])
X

## Define Target Variable
Creates a Series y containing the target variable genre, which the model will be trained to predict.

In [None]:
y = df['genre']
y
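
A quick sanity check on the feature and target split can catch mistakes early. This is a sketch on hypothetical rows (the names sample_df, X_demo, and y_demo are illustrative): it confirms that the target column is absent from the features, that both objects have the same number of rows, and that drop() returned a new DataFrame instead of modifying the original.

```python
import pandas as pd

# Hypothetical rows shaped like the notebook's data; not the real music.csv.
sample_df = pd.DataFrame({
    "age": [20, 26, 21, 27],
    "gender": [1, 1, 0, 0],
    "genre": ["HipHop", "Jazz", "Dance", "Acoustic"],
})

X_demo = sample_df.drop(columns=["genre"])  # features: everything except the target
y_demo = sample_df["genre"]                 # target: the column to predict

print(X_demo.columns.tolist())       # feature names
print(len(X_demo) == len(y_demo))    # one target value per feature row
print("genre" in sample_df.columns)  # the original DataFrame still has the target
```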

## Train Decision Tree Model
1. Initialises a DecisionTreeClassifier.
2. Splits the data into training and test sets, with 20% of the data reserved for testing.
3. Fits the model on the training data.
4. Generates predictions for the test set and displays the predicted genre labels.

In [None]:
model = DecisionTreeClassifier()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model.fit(X_train, y_train) 
predictions = model.predict(X_test)
predictions
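
Because train_test_split shuffles the rows randomly, the cell above can hold out different rows on each run. One common way to make the split repeatable is to pass random_state, and stratify keeps the genre proportions similar in both parts. The sketch below uses hypothetical data (20 generated rows, not the real music.csv), and the value 42 is arbitrary.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data shaped like the notebook's (age, gender, genre).
rows = (
    [(age, 1, "HipHop") for age in range(20, 25)]
    + [(age, 1, "Jazz") for age in range(26, 31)]
    + [(age, 0, "Dance") for age in range(20, 25)]
    + [(age, 0, "Acoustic") for age in range(26, 31)]
)
demo_df = pd.DataFrame(rows, columns=["age", "gender", "genre"])
X_demo = demo_df.drop(columns=["genre"])
y_demo = demo_df["genre"]


def split_once():
    # random_state fixes the shuffle; stratify preserves class proportions.
    return train_test_split(
        X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
    )


X_train_a, X_test_a, y_train_a, y_test_a = split_once()
X_train_b, X_test_b, y_train_b, y_test_b = split_once()

print(X_test_a.index.equals(X_test_b.index))  # whether both runs held out the same rows
print(y_test_a.value_counts())                # genres present in the test part
```

Note that stratify needs enough test rows to cover every class, so it may fail on very small datasets with a 20% test size.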

## View Original Data Again
Displays the DataFrame df again for reference. Building X and y did not modify it, so it still shows the original data.

In [None]:
df

## Predict Genre for a Sample Input
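Predicts the genre for a sample user with an age of 24 and a gender value of 0. The variable name assumes that 0 encodes male, but the encoding is not stated here, so confirm it against the dataset. This shows how to apply the trained model to a single new input.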

In [None]:
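# Wrap the sample in a DataFrame with the training column names so scikit-learn does not warn about missing feature names.
sample = pd.DataFrame([[24, 0]], columns=X.columns)
maleTwenty = model.predict(sample)
maleTwenty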
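
Alongside the predicted label, a decision tree can report per-class probabilities through predict_proba. The following is a minimal sketch on hypothetical data (not the real music.csv), so the genre it prints says nothing about what the notebook's model would return.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data shaped like the notebook's (age, gender, genre).
rows = (
    [(age, 1, "HipHop") for age in range(20, 25)]
    + [(age, 1, "Jazz") for age in range(26, 31)]
    + [(age, 0, "Dance") for age in range(20, 25)]
    + [(age, 0, "Acoustic") for age in range(26, 31)]
)
demo_df = pd.DataFrame(rows, columns=["age", "gender", "genre"])
X_demo = demo_df.drop(columns=["genre"])
y_demo = demo_df["genre"]

demo_model = DecisionTreeClassifier(random_state=0).fit(X_demo, y_demo)

# Build the query with the same column names and order used for training.
query = pd.DataFrame([[24, 0]], columns=X_demo.columns)
predicted = demo_model.predict(query)
probabilities = demo_model.predict_proba(query)

print(predicted[0])
# classes_ gives the label that each probability column refers to.
print(dict(zip(demo_model.classes_, probabilities[0])))
```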

## Model Accuracy Score
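Calculates and displays the model’s accuracy on the test set, that is, the fraction of test samples whose genre was predicted correctly. Because train_test_split shuffles the rows randomly and no random_state is set, the score can change from run to run.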

In [None]:
score = accuracy_score(y_test, predictions)
score
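
To make the metric concrete, the sketch below first computes accuracy by hand on four hypothetical labels and compares it with accuracy_score. It then uses cross_val_score, one possible way to get a steadier estimate than a single random split, on hypothetical generated data (not the real music.csv).

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical labels, chosen only to keep the arithmetic easy to follow.
y_true = np.array(["Jazz", "Dance", "HipHop", "Jazz"])
y_pred = np.array(["Jazz", "Dance", "Jazz", "Jazz"])

manual_accuracy = (y_true == y_pred).mean()  # 3 of 4 labels match
library_accuracy = accuracy_score(y_true, y_pred)
print(manual_accuracy, library_accuracy)

# Hypothetical data shaped like the notebook's (age, gender, genre).
rows = (
    [(age, 1, "HipHop") for age in range(20, 25)]
    + [(age, 1, "Jazz") for age in range(26, 31)]
    + [(age, 0, "Dance") for age in range(20, 25)]
    + [(age, 0, "Acoustic") for age in range(26, 31)]
)
demo_df = pd.DataFrame(rows, columns=["age", "gender", "genre"])
X_demo = demo_df.drop(columns=["genre"])
y_demo = demo_df["genre"]

# Five folds: each fold is used once as the test set, giving five scores.
cv_scores = cross_val_score(
    DecisionTreeClassifier(random_state=0), X_demo, y_demo, cv=5
)
print(cv_scores.mean())
```

Averaging over several folds reduces the dependence of the reported score on which rows happened to land in the test set.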