**Review**

Hi, my name is Dmitry and I will be reviewing your project.
  
You can find my comments in colored markdown cells:
  
<div class="alert alert-success">
  If everything is done successfully.
</div>
  
<div class="alert alert-warning">
  If I have some (optional) suggestions, or questions to think about, or general comments.
</div>
  
<div class="alert alert-danger">
  If a section requires some corrections. Work can't be accepted with red comments.
</div>
  
Please don't remove my comments, as it will make further review iterations much harder for me.
  
Feel free to reply to my comments or ask questions using the following template:
  
<div class="alert alert-info">
  For your comments and questions.
</div>
  
First of all, thank you for turning in the project! You did an excellent job! I only had a couple of suggestions to improve the sanity check. The project is accepted. Keep up the good work on the next sprint! 

# Introduction to Machine Learning: Megaline Model

## Overview
Mobile carrier Megaline has found out that many of their subscribers use legacy plans. They want to develop a model that would analyze subscribers' behavior and recommend one of Megaline's newer plans: Smart or Ultra.
You have access to behavior data about subscribers who have already switched to the new plans  For this classification task,I will develop a model that will pick the right plan.

### Plan
- Access and acess the data file
- Split the source data into a training set, a validation set, and a test set.
- Investigating the quality of different models by changing hyperparameters, 0.75 accuracy threshold
- Checking the quality of test set
- Sanity check the model
- Reach conclusion

## Initializing 

Loading all the libraries I need first.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.metrics import recall_score
from sklearn.metrics import precision_score

Loading the data

In [3]:
data=pd.read_csv('/datasets/users_behavior.csv')
data.head()

Unnamed: 0,calls,minutes,messages,mb_used,is_ultra
0,40.0,311.9,83.0,19915.42,0
1,85.0,516.75,56.0,22696.96,0
2,77.0,467.66,86.0,21060.45,0
3,106.0,745.53,81.0,8437.39,1
4,66.0,418.74,1.0,14502.75,0


In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3214 entries, 0 to 3213
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   calls     3214 non-null   float64
 1   minutes   3214 non-null   float64
 2   messages  3214 non-null   float64
 3   mb_used   3214 non-null   float64
 4   is_ultra  3214 non-null   int64  
dtypes: float64(4), int64(1)
memory usage: 125.7 KB


In [5]:
data.describe()

Unnamed: 0,calls,minutes,messages,mb_used,is_ultra
count,3214.0,3214.0,3214.0,3214.0,3214.0
mean,63.038892,438.208787,38.281269,17207.673836,0.306472
std,33.236368,234.569872,36.148326,7570.968246,0.4611
min,0.0,0.0,0.0,0.0,0.0
25%,40.0,274.575,9.0,12491.9025,0.0
50%,62.0,430.6,30.0,16943.235,0.0
75%,82.0,571.9275,57.0,21424.7,1.0
max,244.0,1632.06,224.0,49745.73,1.0


We preprocessed this data previously so I was expecting it to look good and it does! we can move ahead with making the model datasets

<div class="alert alert-success">
<b>Reviewer's comment</b>

Indeed, let's get started with modeling!

</div>

## Train,Valid,Test Split Using 80:10:10

To create the proportions of 80:10:10. I will first split the data into two parts the data_train (training dataset) and data_rem (remaining data). Then I will split the remaining data into the validation dataset and the test dataset.

In [7]:
data_train, data_rem = train_test_split(data, test_size = 0.8, random_state = 54321)
data_valid, data_test = train_test_split(data_rem, test_size = 0.5, random_state = 54321)

train_target = data_train['is_ultra']
valid_target = data_validation['is_ultra']
test_target = data_test['is_ultra']

train_features = data_train.drop('is_ultra',axis = 1)
valid_features = data_validation.drop('is_ultra',axis = 1)
test_features = data_test.drop('is_ultra',axis = 1)

<div class="alert alert-success">
<b>Reviewer's comment</b>

The data was split into train, validation and test sets. The proportion makes sense

</div>

## Testing different model qualities, changing the Hyperparameters
Now I will test DecisionTreeClassifier, RandomForestClassifier, and LogisticRegression, changing the hyperparameters for each to see which model can get the best accuracy score

In [18]:
for depth in range(1,6):
    model_DT = DecisionTreeClassifier(random_state=54321, max_depth=depth)
    model_DT.fit(train_features,train_target)
    predict = model_DT.predict(validation_features)
    print( depth,"- depth:",accuracy_score(validation_target, predict))

1 - depth: 0.7433903576982893
2 - depth: 0.7511664074650077
3 - depth: 0.7791601866251944
4 - depth: 0.7807153965785381
5 - depth: 0.776049766718507


In [17]:
for est in range(10,101,10):
    model_RF = RandomForestClassifier(random_state=54321, n_estimators = est)
    model_RF.fit(train_features,train_target)
    predict = model_RF.predict(validation_features)
    print("number of estimators:",est,", accuracy_score:",accuracy_score(validation_target, predict))

number of estimators: 10 , accuracy_score: 0.7916018662519441
number of estimators: 20 , accuracy_score: 0.7962674961119751
number of estimators: 30 , accuracy_score: 0.7993779160186625
number of estimators: 40 , accuracy_score: 0.7962674961119751
number of estimators: 50 , accuracy_score: 0.7884914463452566
number of estimators: 60 , accuracy_score: 0.7900466562986003
number of estimators: 70 , accuracy_score: 0.7884914463452566
number of estimators: 80 , accuracy_score: 0.7947122861586314
number of estimators: 90 , accuracy_score: 0.8009331259720062
number of estimators: 100 , accuracy_score: 0.7978227060653188


In [19]:
model_LR = LogisticRegression(random_state = 12345, solver='liblinear')
model_LR.fit(train_features,train_target)
predict = model_LR.predict(validation_features)
print(accuracy_score(validation_target, predict))

0.7433903576982893


RandomForest is the clear winner here with an accuracy score of 80% which is great because our threshold is 75%. I will move forward with the RandomForestClassifier using 90 estimatiors with the test dataset.

<div class="alert alert-success">
<b>Reviewer's comment</b>

Excellent, you tried a couple of different models and did some hyperparameter tuning using the validation set

</div>

In [27]:
final_model = RandomForestClassifier(random_state = 54321, n_estimators = 90)
final_model.fit(train_features, train_target)
predict = final_model.predict(test_features)
print("Accuracy for final model",accuracy_score(test_target, predict))

Accuracy for final model 0.7737169517884914


When the model was run with 80% of our dataset we were able to get 77% accuracy for our final model. Thats above our threshold and I would feel confident presenting this to my clients. 

<div class="alert alert-success">
<b>Reviewer's comment</b>

The final model was evaluated on the test set! Train, validation and test sets were used correctly.

</div>

## Sanity Check

In [28]:
print("Recall score:",recall_score(test_target, predict, average='macro'))
print("Precision score:",precision_score(test_target, predict, average='macro'))

Recall score: 0.7088716177317214
Precision score: 0.7313890935065515


This sanity check shows that our recall and precision are below our desired threshold.

## Conclusion

After investigating the data, creating datasets for training, testing, and validation. I optimized several models utilizing the hyperparameters and choosing the best accuracy from those results. I created a trained model and tested it with the test dataset comprised of 80% of our provided dataset, we achieved 77% accuracy with this model. n conclusion, our random forest model is trained and ready to use. I would proceed with some caution due to the sub par results from the sanity check returning below threshold precision and recall. However, it is a good start to a promising and useful model. 

<div class="alert alert-warning">
<b>Reviewer's comment</b>

A couple of points:
    
1. While it's a good idea to look at different metrics, it doesn't make sense to apply a threshold chosen for one metric to another metric, they can measure very different things.
    
2. The correct value to use for `average` parameter of `recall_score` and `precision_score` in case of binary classification is the default `'binary'`: it doesn't do any averaging and just returns the result for the positive label, which is what we want: for binary classification it contains all the information we need. Averaging is only needed when there are more than two classes.
    
3. One way to sanity check our model is to compare it to some simple baseline and make sure it does better. For example, in this case we can make a simple constant model always predicting the majority class, it will have accuracy equal to the share of the majority class in the data (about 70% in this case). As our model does better than that, we can conclude that it has probably indeed learned something non-trivial :)

</div>

Thank you for your time reviewing my project!

<div class="alert alert-success">
<b>Reviewer's comment</b>

You're welcome! :)

</div>