# Use Machine Learning to increase sales (Music Streaming)

In this project we run an online music store. When users sign up, we ask for their **age** and **gender**, and based on that profile we recommend music albums they are likely to buy.
The goal of our project is to:
 + Increase sales using Machine Learning

### The process of this project
 + Build a model
 + Feed the model sample data based on existing users
 + The model learns patterns in the data
 + Ask the model to make predictions

### What we ask our model and how the model responds
 + We have a new user with this profile; what kind of music is this user interested in?
 + The model will answer with a genre such as "Jazz", "Hip-Hop", or "Classic".
 + Based on this prediction we can suggest music to this user

### Steps of this project
 1. Import Data
 2. Clean the data
 3. Split the data into Training/Test sets
 4. Create a Model
 5. Train the Model
 6. Make predictions
 7. Evaluate and Improve 
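
The cells in this notebook go straight from importing the data to training, so step 2 is never shown. As a minimal sketch of what cleaning could look like, assuming the data has the columns `age`, `gender`, and `genre`: the rows below are hypothetical and only stand in for `music.csv`.

```python
import pandas as pd

# Hypothetical rows with the same columns as music.csv (not the real file)
raw = pd.DataFrame({
    "age": [20, 23, None, 31],
    "gender": [1, 1, 0, 0],
    "genre": ["HipHop", "HipHop", "Dance", None],
})

# Count missing values per column before deciding what to drop
missing_per_column = raw.isna().sum()
print(missing_per_column)

# Drop incomplete rows and store age/gender as integers again
cleaned = (
    raw.dropna()
       .astype({"age": int, "gender": int})
       .reset_index(drop=True)
)
print(cleaned)
```

Dropping incomplete rows is only one option; on a small dataset it may be better to fix the source data instead.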

In [53]:
# importing pandas library and reading the music.csv file as our Data

import pandas as pd



# from sklearn.tree (a module of the scikit-learn library) we import the DecisionTreeClassifier class, which implements the decision tree algorithm

from sklearn.tree import DecisionTreeClassifier



music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
y = music_data['genre']


# creating new instance

model = DecisionTreeClassifier()

# train the model

model.fit(X.values, y)  # X.values is a plain array without column headers, so predicting from plain lists below does not trigger sklearn's feature-name warning

# ask model to predict

predictions = model.predict([ [21, 1], [22, 0] ])

predictions


array(['HipHop', 'Dance'], dtype=object)
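
An alternative to `X.values` is to keep the column names on both sides: fit on the DataFrame and pass new users as a DataFrame with the same columns. The sketch below assumes hypothetical rows in place of `music.csv`, and `new_users` is an illustrative name that does not come from the notebook.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical sample with the same columns as music.csv (not the real file)
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 21, 22, 25, 26, 27, 30],
    "gender": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)  # fitted on a DataFrame, so the column names are kept

# New profiles use the same column names and order as the training data
new_users = pd.DataFrame({"age": [21, 22], "gender": [1, 0]})
predictions = model.predict(new_users)
print(predictions)
```

Named columns make it harder to swap `age` and `gender` by accident when building the input.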

### Measure the accuracy of the model

To do this we need to split our data into two sets: one for training (80% of our data) and one for testing (20% of our data).
We import `train_test_split` from `sklearn.model_selection` to make the split, and `accuracy_score` from `sklearn.metrics` to compare the predictions with the true genres.

In [66]:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
y = music_data['genre']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)


model = DecisionTreeClassifier()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

score = accuracy_score(y_test, predictions)
score

0.75
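
The score above is the result of a single run: `train_test_split` shuffles the rows randomly, so on a small dataset the accuracy can change every time the cell is executed. A minimal sketch of how to make a run repeatable and see the spread, assuming hypothetical rows in place of `music.csv`; `score_for_seed` is an illustrative helper, not part of the notebook.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical sample with the same columns as music.csv (not the real file)
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 21, 22, 25, 26, 27, 30],
    "gender": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]


def score_for_seed(seed):
    """Train on a seeded 80/20 split and return accuracy on the held-out 20%."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = DecisionTreeClassifier(random_state=seed)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


# The same seed always gives the same split, and therefore the same score
scores = [score_for_seed(seed) for seed in range(5)]
print(scores)
```

Looking at several seeds gives a more honest picture than a single number from one split.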

### Creating a trained model and saving it
To keep a trained model we `import joblib`, which saves the trained model to a file on our machine.

In [None]:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
import joblib


music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
y = music_data['genre']


model = DecisionTreeClassifier()
model.fit(X.values, y)

joblib.dump(model, 'music-recommander.joblib')

# predictions = model.predict([ [21, 1], [22, 0] ])




### Using a trained model
With `joblib.load('music-recommander.joblib')` we can load our trained model and let it make predictions without training it again.

In [54]:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
import joblib


# music_data = pd.read_csv('music.csv')
# X = music_data.drop(columns=['genre'])
# y = music_data['genre']


# model = DecisionTreeClassifier()
# model.fit(X.values, y)

model = joblib.load('music-recommander.joblib') # load the trained model, then make predictions

predictions = model.predict([ [21, 1], [22, 0] ])
predictions




array(['HipHop', 'Dance'], dtype=object)
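
To check that saving and loading really preserves the model, the two steps can be combined into one round trip. This is a sketch under assumptions: the training rows are hypothetical, and the file is written to a temporary directory rather than next to the notebook.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical sample with the same columns as music.csv (not the real file)
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 21, 22, 25, 26, 27, 30],
    "gender": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X.values, y)

sample = [[21, 1], [22, 0]]

# Save the model, load it back, and compare predictions on the same input
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "music-recommander.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

same_predictions = list(restored.predict(sample)) == list(model.predict(sample))
print(same_predictions)
```

A saved model should generally be loaded with the same scikit-learn version that created it.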

### Visualizing Decision Tree

With the following code we create a `.dot` file, which we can open in **VS Code** (a Graphviz/DOT preview extension is needed to render it as a graph).

In [67]:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree

music_data = pd.read_csv('music.csv')
X = music_data.drop(columns=['genre'])
y = music_data['genre']


model = DecisionTreeClassifier()
model.fit(X, y)  # train on the full dataset loaded in this cell

tree.export_graphviz(model, out_file='music-recommander.dot',
                    feature_names=['age', 'gender'],
                    class_names=sorted(y.unique()),
                    label='all',
                    rounded=True,
                    filled=True)
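
If no Graphviz viewer is at hand, scikit-learn can also print the learned rules as plain text with `export_text`. A minimal sketch, again assuming hypothetical rows in place of `music.csv`:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical sample with the same columns as music.csv (not the real file)
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 21, 22, 25, 26, 27, 30],
    "gender": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X.values, y)

# Each indented line is one decision; leaf lines show the predicted class
rules = export_text(model, feature_names=["age", "gender"])
print(rules)
```

The text form shows the same splits as the `.dot` graph, just without colors or sample counts.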