## Machine Learning

A guided project by [Mosh](https://www.youtube.com/c/programmingwithmosh) that introduces machine learning with Python.

### Real Problem

Imagine we have an online music store. When our users sign up, we ask for their age and gender, and based on their profile we recommend music albums they're likely to buy.

In this project, we want to use machine learning to increase sales. We build a model and feed it sample data based on the existing users. The model learns the patterns in that data, so we can then ask it to make predictions.

When a new user signs up, we tell our model:

"Hey, we have a new user with this profile. What kind of music is this user interested in?"

Our model will say jazz, or hip hop, or whatever. Based on that, we can make suggestions to the user.

### Steps:
1. Import the Data
2. Clean the Data
3. Split the Data into Training/Test Sets
4. Create a Model
5. Train the Model
6. Make Predictions
7. Evaluate and Improve

### Loading Libraries

In [12]:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

### Import data

In [13]:
music_data = pd.read_csv('data/music.csv')
music_data

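|   | age | gender | genre |
|---|-----|--------|-----------|
| 0 | 20 | 1 | Hiphop |
| 1 | 23 | 1 | Hiphop |
| 2 | 25 | 1 | Hiphop |
| 3 | 26 | 1 | Jazz |
| 4 | 29 | 1 | Jazz |
| 5 | 30 | 1 | Jazz |
| 6 | 31 | 1 | Classical |
| 7 | 33 | 1 | Classical |
| 8 | 37 | 1 | Classical |
| 9 | 20 | 0 | Dance |

Only the first 10 of the 18 rows are shown here.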


**Note:**<br>
In the gender column:<br>
**1** = Male<br>
**0** = Female
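As a small illustration of this encoding, the sketch below decodes the numeric gender column into readable labels for inspection. It retypes the ten rows displayed above by hand, and `gender_labels` and `readable` are hypothetical names introduced here, not part of the original notebook. The model itself still needs the numeric column.

```python
import pandas as pd

# The ten rows displayed above, retyped by hand for illustration.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 33, 37, 20],
    "gender": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "genre": ["Hiphop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 3 + ["Dance"],
})

# Hypothetical helper mapping that follows the note above: 1 = Male, 0 = Female.
gender_labels = {1: "Male", 0: "Female"}

# Build a readable copy; the original numeric column is left untouched.
readable = music_data.assign(gender=music_data["gender"].map(gender_labels))
print(readable)
```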

### Preparing the data

Split the data into two sets:
- input set - first two columns (age, gender)
- output set - last column (genre)

The output set, in this case the genre column, contains the known answers (labels) that the model learns to predict.

In [14]:
# Create Input set

inputset = music_data.drop(columns = ['genre'])
inputset

|   | age | gender |
|---|-----|--------|
| 0 | 20 | 1 |
| 1 | 23 | 1 |
| 2 | 25 | 1 |
| 3 | 26 | 1 |
| 4 | 29 | 1 |
| 5 | 30 | 1 |
| 6 | 31 | 1 |
| 7 | 33 | 1 |
| 8 | 37 | 1 |
| 9 | 20 | 0 |


In [15]:
# Create Output set
outputset = music_data['genre']
outputset

0        Hiphop
1        Hiphop
2        Hiphop
3          Jazz
4          Jazz
5          Jazz
6     Classical
7     Classical
8     Classical
9         Dance
10        Dance
11        Dance
12     Acoustic
13     Acoustic
14     Acoustic
15    Classical
16    Classical
17    Classical
Name: genre, dtype: object

### Learning and Predicting

#### Build a model

In [16]:
model = DecisionTreeClassifier()
model.fit(inputset.values, outputset)
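To see what a fitted decision tree has actually learned, one option is to print its rules as text. The sketch below uses scikit-learn's `export_text` on the ten rows displayed earlier, retyped by hand. The names `sample`, `tree_model`, and `rules` are introduced here for illustration, and `random_state=0` is an assumption added only to make the run repeatable.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# The ten rows displayed earlier, retyped by hand for illustration.
sample = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 33, 37, 20],
    "gender": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "genre": ["Hiphop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 3 + ["Dance"],
})

features = sample[["age", "gender"]]
labels = sample["genre"]

# random_state is fixed here only so repeated runs build the same tree.
tree_model = DecisionTreeClassifier(random_state=0)
tree_model.fit(features.values, labels)

# Print the learned if/else rules, using the column names as feature names.
rules = export_text(tree_model, feature_names=["age", "gender"])
print(rules)
```

Each indented line in the printed text is a threshold test on `age` or `gender`, and each `class:` line is the genre predicted for users who reach that leaf.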

#### Predict music genre

Suggest music genre for:
- Male: 21 years old
- Female: 22 years old


Use a table (DataFrame) for the new users, which also works when there is a large number of them:

In [17]:
# Create new dataframe

d = {
    'age': [21,22],
    'gender': [1,0]
    }

new_users = pd.DataFrame(d)
new_users

|   | age | gender |
|---|-----|--------|
| 0 | 21 | 1 |
| 1 | 22 | 0 |


In [18]:
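# predict expects one row per user, with columns in the same order used for fit (age, gender).
# Passing [age, gender] would instead give one row per feature, which only has a valid shape by coincidence with two users.
new_users['genre'] = model.predict(new_users[['age', 'gender']].values)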
new_users

|   | age | gender | genre |
|---|-----|--------|--------|
| 0 | 21 | 1 | Hiphop |
| 1 | 22 | 0 | Dance |
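The layout of the array passed to `predict` matters: scikit-learn expects one row per user and one column per feature, in the same order used for `fit`. The sketch below compares two ways of building that array from a DataFrame. It uses three hypothetical users (the third is made up for illustration) so that rows and columns cannot be confused.

```python
import numpy as np
import pandas as pd

# Three hypothetical users; the third is invented so the two shapes differ.
users = pd.DataFrame({"age": [21, 22, 35], "gender": [1, 0, 1]})

# A list of columns gives one row per FEATURE: shape (2, 3).
by_feature = np.array([users["age"], users["gender"]])

# Selecting the columns from the DataFrame gives one row per USER: shape (3, 2).
by_user = users[["age", "gender"]].values

print(by_feature.shape)
print(by_user.shape)
print(by_user[0])  # first user as (age, gender)
```

With exactly two users and two features both arrays are 2 x 2, so the wrong layout does not raise an error, which makes the mistake easy to miss.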


### Calculating accuracy

In [19]:
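# Features (age, gender) and labels (genre)
i = inputset
o = outputset

# Hold out 20% of the rows for testing; the split is random, so the score can change between runs
i_train, i_test, o_train, o_test = train_test_split(i, o, test_size = 0.2)

# Train on the training rows only, then predict the held-out rows
model = DecisionTreeClassifier()
model.fit(i_train,o_train)
predictions = model.predict(i_test)

# Fraction of test rows predicted correctly
score = accuracy_score(o_test, predictions)
score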

1.0
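Because `train_test_split` shuffles the rows randomly, a single score on a small dataset can be misleading. One rough way to see the spread is to repeat the split with several seeds and look at all the scores. The sketch below does this on the ten rows displayed earlier, retyped by hand. The seeds, the number of repeats, and the names `sample` and `scores` are assumptions made for illustration.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# The ten rows displayed earlier, retyped by hand for illustration.
sample = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 33, 37, 20],
    "gender": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "genre": ["Hiphop"] * 3 + ["Jazz"] * 3 + ["Classical"] * 3 + ["Dance"],
})
X = sample[["age", "gender"]]
y = sample["genre"]

scores = []
for seed in range(5):  # five repeats is an arbitrary choice
    # A different random_state gives a different train/test split each time.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    clf = DecisionTreeClassifier(random_state=seed)
    clf.fit(X_train, y_train)
    scores.append(accuracy_score(y_test, clf.predict(X_test)))

print(scores)
print(sum(scores) / len(scores))  # mean accuracy over the repeats
```

With so few rows, each test set holds only two users, so individual scores can swing widely; the mean over several splits is a steadier estimate than any single run.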

#### Summary

Suggested music genres:
- Hiphop for the 21-year-old male
- Dance for the 22-year-old female