Given a dataset of people with their gender, age, and preferred genre of music, the task is to train an ML model that, for a new person of a specific age and gender, predicts the genre of music they are most likely to prefer.

The dataset is provided in CSV format.

In the gender column, 0 represents female and 1 represents male.

In [216]:
# Importing all the libraries

import pandas as pd
from sklearn import tree
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

In [217]:
# Reading the dataset

df = pd.read_csv('music.csv')

df.head()

   age  gender   genre
0   20       1  HipHop
1   23       1  HipHop
2   25       1  HipHop
3   26       1    Jazz
4   29       1    Jazz


In [218]:
# Checking for null values

df.isnull().sum()

age       0
gender    0
genre     0
dtype: int64

In [219]:
# Getting the shape of the dataset

df.shape

(18, 3)

In [220]:
# Printing the number of rows for each genre

print(df['genre'].value_counts())

Classical    6
Dance        3
Acoustic     3
HipHop       3
Jazz         3
Name: genre, dtype: int64


In [221]:
# Getting the datatypes for the dataset

df.dtypes

age        int64
gender     int64
genre     object
dtype: object

In [222]:
# Creating a LabelEncoder object

labEnc = LabelEncoder()

In [223]:
# Encoding the genres as integer labels to train the model

df['gnr'] = labEnc.fit_transform(df['genre'])

df.head()

   age  gender   genre  gnr
0   20       1  HipHop    3
1   23       1  HipHop    3
2   25       1  HipHop    3
3   26       1    Jazz    4
4   29       1    Jazz    4
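
As a side note on how the integer labels are assigned: LabelEncoder sorts the unique values and uses each value's position in that sorted list as its code. The sketch below illustrates this with the five genre names from the value_counts() output; the names `genres`, `encoder`, `codes`, `code_to_genre` and `decoded` are only illustrative and are not part of the notebook.

```python
from sklearn.preprocessing import LabelEncoder

# Genre names taken from the value_counts() output above
genres = ["HipHop", "Jazz", "Classical", "Dance", "Acoustic"]

encoder = LabelEncoder()
codes = encoder.fit_transform(genres)

# classes_ holds the sorted unique labels; a label's code is its index there
code_to_genre = dict(enumerate(encoder.classes_))

# inverse_transform maps integer codes back to the original strings
decoded = encoder.inverse_transform(codes)

print(code_to_genre)
print(list(decoded))
```

Because the encoder already stores this mapping, `inverse_transform` can be used to turn a predicted label back into a genre name without building a lookup table by hand.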


In [224]:
# Dropping the label columns so that x holds only the features

x = df.drop(['genre', 'gnr'], axis=1)

x.head()

   age  gender
0   20       1
1   23       1
2   25       1
3   26       1
4   29       1


In [225]:
# Getting the target y (the encoded genre)

y = df['gnr']

y.head()

0    3
1    3
2    3
3    4
4    4
Name: gnr, dtype: int32

In [226]:
# Creating a DecisionTreeClassifier object

model = tree.DecisionTreeClassifier()

In [227]:
# Splitting the dataset into train and test datasets

x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.2)
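
The split above is random, so each run of the notebook can produce a different train/test partition and a different score. A minimal sketch of a reproducible split follows, assuming a fixed `random_state` is acceptable; the value 42 and the `row_ids` list are illustrative choices, not part of the notebook. It splits row indices only, to show the resulting sizes for an 18-row dataset.

```python
from sklearn.model_selection import train_test_split

row_ids = list(range(18))  # one id per row; the dataset has 18 rows

# The same random_state gives the same partition on every call
train_a, test_a = train_test_split(row_ids, test_size=0.2, random_state=42)
train_b, test_b = train_test_split(row_ids, test_size=0.2, random_state=42)

print(len(train_a), len(test_a))  # 14 4
print(test_a == test_b)           # True
```

With test_size=0.2 the test set holds only 4 of the 18 rows, so every test prediction moves the accuracy by 0.25.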

In [228]:
# Training the model with x_train and y_train

model.fit(x_train, y_train)

DecisionTreeClassifier()

In [229]:
# Creating a dictionary that maps each genre label (key) to its genre name (value)

gn = dict()

for i in range(len(df)):
    # Add each label only the first time it is seen
    if df.gnr[i] not in gn:
        gn[df.gnr[i]] = df.genre[i]

gn

{3: 'HipHop', 4: 'Jazz', 1: 'Classical', 2: 'Dance', 0: 'Acoustic'}

In [230]:
# Taking the age and gender as input and predicting the genre with the model

# input() returns strings, so convert to int before predicting
age = int(input("Enter the Age: "))
gender = int(input("Enter the Gender: "))

print('Genre Favored by the Person is:', gn[model.predict([[age, gender]])[0]])

Enter the Age: 22
Enter the Gender: 1
Genre Favored by the Person is: HipHop
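
The prediction step can also be wrapped in a small function. The sketch below is one possible way to do it, not the notebook's method: it trains on only the five rows shown by df.head() so that it runs on its own, and `sample`, `clf`, `encoder` and `predict_genre` are hypothetical names introduced for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# The five rows shown by df.head() above (not the full dataset)
sample = pd.DataFrame({
    "age": [20, 23, 25, 26, 29],
    "gender": [1, 1, 1, 1, 1],
    "genre": ["HipHop", "HipHop", "HipHop", "Jazz", "Jazz"],
})

encoder = LabelEncoder()
target = encoder.fit_transform(sample["genre"])
clf = DecisionTreeClassifier(random_state=0).fit(sample[["age", "gender"]], target)


def predict_genre(age, gender):
    """Hypothetical helper: predict a genre name for one person."""
    # A DataFrame with the training column names avoids scikit-learn's
    # feature-name mismatch warning; int() accepts strings from input().
    person = pd.DataFrame({"age": [int(age)], "gender": [int(gender)]})
    code = clf.predict(person)
    return encoder.inverse_transform(code)[0]


print(predict_genre("22", "1"))  # HipHop
```

Passing a DataFrame rather than a nested list keeps the feature names consistent with training, and `inverse_transform` replaces the hand-built label dictionary.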


In [231]:
# Getting the score of the model

model.score(x_test, y_test)

1.0
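
A score of 1.0 here comes from a test set of only 4 rows, so it is a coarse estimate. One way to get a steadier figure is cross-validation, sketched below under loud assumptions: the `toy` frame is a stand-in, not the contents of music.csv. Only its first five rows come from the df.head() output; the rest are invented so that the class counts match the value_counts() output.

```python
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: only the first five rows come from the df.head() output;
# the remaining rows are invented so the class counts match value_counts().
toy = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 33, 37,
            20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1] * 9 + [0] * 9,
    "genre": (["HipHop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 3
              + ["Dance"] * 3 + ["Acoustic"] * 3 + ["Classical"] * 3),
})

features = toy[["age", "gender"]]
labels = toy["genre"]  # scikit-learn classifiers accept string labels directly

# cv=3 because the smallest class has only 3 members
scores = cross_val_score(
    DecisionTreeClassifier(random_state=0), features, labels, cv=3
)
mean_accuracy = scores.mean()

print(scores, mean_accuracy)
```

Each of the 3 folds is held out once while the model trains on the other two, so every row contributes to the accuracy estimate.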