## Optional Jupyter Notebooks – Task 4

Be sure to read the [Introduction to Notebooks](IntroductiontoNotebooks.ipynb) first.

### Setup:
To begin, run the cell below (click inside the cell and press `Ctrl-Enter`) to load the framework necessary for Task 4. While you may import other libraries for experimentation, remember that only the standard Python libraries are supported by the Gradescope autograder.

In [None]:
import numpy as np
import pandas as pd
import pickle
import os
import sys
import warnings
warnings.filterwarnings("ignore")

task_path = os.path.abspath(os.path.join(os.getcwd(),".."))
sys.path.append(task_path)
from src.task4 import ModelMetrics
from tests.utils import compare_submission_to_answer_df



In [None]:
# Load all input and output datasets. These will be used in the validation step
train_features = pd.read_csv(os.path.join(os.getcwd(),"..","task4","NATICUSdroid_train_features.csv"), index_col=0)
#print('train_features : \n',train_features)
test_features  = pd.read_csv(os.path.join(os.getcwd(),"..","task4","NATICUSdroid_test_features.csv"), index_col=0)
#print('test_features : \n',test_features)
train_targets  = pd.read_csv(os.path.join(os.getcwd(),"..","task4","NATICUSdroid_train_targets.csv"), index_col=0)
#print('train_targets : \n',train_targets)
test_targets   = pd.read_csv(os.path.join(os.getcwd(),"..","task4","NATICUSdroid_test_targets.csv"), index_col=0)
#print('test_targets : \n',test_targets)
logreg_importance_df_ans = pd.read_pickle(os.path.join(os.getcwd(),"..","task4","pkl_files","logreg_importance.pkl"))
#print('logreg_importance_df_ans : \n',logreg_importance_df_ans)
rf_importance_df_ans = pd.read_pickle(os.path.join(os.getcwd(),"..","task4","pkl_files","rf_importance.pkl"))
#print('rf_importance_df_ans : \n',rf_importance_df_ans)
dt_importance_df_ans = pd.read_pickle(os.path.join(os.getcwd(),"..","task4","pkl_files","dt_importance.pkl"))
#print('dt_importance_df_ans : \n',dt_importance_df_ans)

## `calculate_naive_metrics`

A naive model is a very simple model (or prediction rule) that serves as a baseline for judging how well a more sophisticated model is doing. At best, such a model has random competence at predicting things. At worst, it's wrong all the time.  

Since a naive model is incredibly basic (often a constant or randomly selected result), we can expect any more sophisticated model that we train to outperform it. If the naive model beats our trained model, it can mean that additional data (rows or columns) is needed in the dataset to improve our model. It can also mean that the dataset doesn't have a strong enough signal for the target we want to predict.  

**In this function, you'll implement a simple model that always predicts a constant (function-provided) number, regardless of the input values.** Specifically, you'll use a given constant integer, provided as the parameter `naive_assumption`, as the model's prediction. This means the model will always output this constant value, without considering the actual data. Afterward, you will calculate four metrics—accuracy, recall, precision, and F1-score—for both the training and test datasets.
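The idea can be sketched on a few made-up labels. This is a minimal illustration, not the graded implementation: `y_true` is hypothetical toy data, and the dictionary keys simply mirror the ones used in this notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical toy labels, not the NATICUSdroid data.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])

# A naive model ignores the features and repeats one constant.
naive_assumption = 1
y_pred = np.full(len(y_true), naive_assumption)

toy_metrics = {
    "accuracy": round(accuracy_score(y_true, y_pred), 4),
    "recall": round(recall_score(y_true, y_pred), 4),
    "precision": round(precision_score(y_true, y_pred), 4),
    "fscore": round(f1_score(y_true, y_pred), 4),
}
print(toy_metrics)
```

Because every prediction is 1, recall is perfect while accuracy and precision both equal the share of positives in the labels (5 of 8 here).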

Refer to the resources below.

##### Useful Resources
* <https://machinelearningmastery.com/how-to-develop-and-evaluate-naive-classifier-strategies-using-probability/>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score>

##### INPUTS
* `train_features` - a dataset split by a function similar to the tts function you created in task2
* `test_features` - a dataset split by a function similar to the tts function you created in task2
* `train_targets` - a dataset split by a function similar to the tts function you created in task2
* `test_targets` - a dataset split by a function similar to the tts function you created in task2
* `naive_assumption` - an integer that should be used as the result from the naive model you will create

##### OUTPUTS 
A completed `ModelMetrics` object with a training and test metrics dictionary with each one of the metrics **rounded to 4 decimal places**

##### Function Skeleton
```python
def calculate_naive_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, naive_assumption:int) -> ModelMetrics:
    train_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0
        }
    test_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0
        }
    naive_metrics = ModelMetrics("Naive",train_metrics,test_metrics,None)
    return naive_metrics
```

In [None]:
def calculate_naive_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, naive_assumption:int) -> ModelMetrics:
    # TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task4.html and implement the function as described
    train_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0
        }
    test_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0
        }
    naive_metrics = ModelMetrics("Naive",train_metrics,test_metrics,None)
    return naive_metrics


In [None]:
# Run this cell to test your code
naive_assumption = 1
# Answers for NATICUSdroid Dataset
with open(os.path.join(os.getcwd(),"..","task4","pkl_files","naive_metrics.pkl"), 'rb') as file:
    naive_metrics_ans = pickle.load(file)
train_accuracy_ans = naive_metrics_ans.train_metrics["accuracy"]
train_recall_ans = naive_metrics_ans.train_metrics["recall"]
train_precision_ans = naive_metrics_ans.train_metrics["precision"]
train_fscore_ans = naive_metrics_ans.train_metrics["fscore"]
test_accuracy_ans = naive_metrics_ans.test_metrics["accuracy"]
test_recall_ans = naive_metrics_ans.test_metrics["recall"]
test_precision_ans = naive_metrics_ans.test_metrics["precision"]
test_fscore_ans = naive_metrics_ans.test_metrics["fscore"]
# Calculate Metrics with Student's Function
naive_metrics = calculate_naive_metrics(train_features, test_features, train_targets, test_targets, naive_assumption)

train_accuracy = naive_metrics.train_metrics["accuracy"]
print(f'Train Accuracy: {train_accuracy} - ', 'Correct' if train_accuracy == train_accuracy_ans else 'Incorrect')
train_recall = naive_metrics.train_metrics["recall"]
print(f'Train Recall: {train_recall} - ', 'Correct' if train_recall == train_recall_ans else 'Incorrect')
train_precision = naive_metrics.train_metrics["precision"]
print(f'Train Precision: {train_precision} - ', 'Correct' if train_precision == train_precision_ans else 'Incorrect')
train_fscore = naive_metrics.train_metrics["fscore"]
print(f'Train Fscore: {train_fscore} - ', 'Correct' if train_fscore == train_fscore_ans else 'Incorrect')
test_accuracy = naive_metrics.test_metrics["accuracy"]
print(f'Test Accuracy: {test_accuracy} - ', 'Correct' if test_accuracy == test_accuracy_ans else 'Incorrect')
test_recall = naive_metrics.test_metrics["recall"]
print(f'Test Recall: {test_recall} - ', 'Correct' if test_recall == test_recall_ans else 'Incorrect')
test_precision = naive_metrics.test_metrics["precision"]
print(f'Test Precision: {test_precision} - ', 'Correct' if test_precision == test_precision_ans else 'Incorrect')
test_fscore = naive_metrics.test_metrics["fscore"]
print(f'Test Fscore: {test_fscore} - ', 'Correct' if test_fscore == test_fscore_ans else 'Incorrect')

print('Completed...')

## `calculate_logistic_regression_metrics`

A logistic regression model is a simple and more explainable statistical model that can be used to estimate the probability of an event ([log-odds](https://www.statisticshowto.com/log-odds/)). At a high level, a logistic regression model uses data in the training set to estimate a column's weight in a linear approximation function. Conceptually this is similar to estimating `m` for each column in the line formula you probably know well from geometry: `y = m*x + b`. If you are interested in learning more, [you can read up on the math](https://en.wikipedia.org/wiki/Logistic_regression) behind how this works. For this project, we are more focused on showing you how to apply these models, so you can simply use a scikit-learn Logistic Regression model in your code. 
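To make the link between the linear formula and a probability concrete, here is a small sketch. The weights, intercept, and input row are hypothetical values chosen for illustration; a fitted scikit-learn model would learn its own.

```python
import math

def sigmoid(z: float) -> float:
    """Map a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights for two binary features, plus an intercept b.
weights = [2.0, -1.0]
b = 0.5
x = [1, 1]

# Linear part, analogous to y = m*x + b with one m per column.
log_odds = sum(m * xi for m, xi in zip(weights, x)) + b
probability = sigmoid(log_odds)
print(round(log_odds, 4), round(probability, 4))
```

A positive weight pushes the probability toward 1 and a negative weight pushes it toward 0, which is why the sign of each coefficient is worth keeping when you report importances.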

For this task use scikit-learn's LogisticRegression class and complete the following subtasks:

* Train a Logistic Regression model (initialized using the kwargs passed into the function)
* Predict scores for training and test datasets and calculate the 7 metrics listed below for the training and test datasets using predictions from the fit model. (All rounded to 4 decimal places)
  * `accuracy`
  * `recall`
  * `precision`
  * `fscore`
  * `false positive rate (fpr)`
  * `false negative rate (fnr)`
  * `Area Under the Curve of Receiver Operating Characteristics Curve (roc_auc)`
* Use RFE to select the top 10 features 
* Train a Logistic Regression model using these selected features (initialized using the kwargs passed into the function)
* Create a Feature Importance DataFrame from the model trained on the top 10 features: 
  * Use the top 10 features, sorted **by absolute value** of the coefficient from biggest to smallest.
  * Make sure you use the same feature and importance column names as set in ModelMetrics in feat_name_col [`Feature`] and imp_col [`Importance`].
  * Round the importances to 4 decimal places (**do this step after you have sorted by Importance**)
  * Reset the index to 0-9. You can do this the same way you did in task1.

**NOTE:** Make sure you use the predicted probabilities for `roc_auc`.
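The three metrics that scikit-learn does not expose as single-purpose functions for hard predictions can be derived as follows. This is a sketch on hypothetical arrays; with a fitted model, `y_prob` would typically come from `model.predict_proba(features)[:, 1]`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Hypothetical labels, hard predictions, and class-1 probabilities.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_prob = np.array([0.1, 0.2, 0.3, 0.6, 0.9, 0.8, 0.7, 0.4])

# For binary labels [0, 1], ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
fpr = round(fp / (fp + tn), 4)  # share of actual negatives flagged positive
fnr = round(fn / (fn + tp), 4)  # share of actual positives missed

# ROC AUC ranks examples, so it needs scores (probabilities), not 0/1 labels.
roc_auc = round(roc_auc_score(y_true, y_prob), 4)
print(fpr, fnr, roc_auc)
```

Passing `y_pred` instead of `y_prob` to `roc_auc_score` would still run, but it collapses the ranking to two values and generally gives a different number.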

##### Useful Resources
* <https://stats.libretexts.org/Bookshelves/Introductory_Statistics/OpenIntro_Statistics_(Diez_et_al)./08%3A_Multiple_and_Logistic_Regression/8.04%3A_Introduction_to_Logistic_Regression>
* <https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score>
* <https://en.wikipedia.org/wiki/Confusion_matrix>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html>

##### INPUTS
The first 4 are similar to the `tts` function you created in Task 2:
* `train_features` - a Pandas DataFrame with training features
* `test_features` - a Pandas DataFrame with test features
* `train_targets` - a Pandas DataFrame with training targets
* `test_targets` - a Pandas DataFrame with test targets
* `logreg_kwargs` - a dictionary with keyword arguments that can be passed directly to the scikit-learn Logistic Regression class

##### OUTPUTS 
* A completed `ModelMetrics` object with a training and test metrics dictionary with each one of the metrics **rounded to 4 decimal places**
* A scikit-learn Logistic Regression model object fit on the training set


##### Function Skeleton
```python
def calculate_logistic_regression_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, logreg_kwargs) -> tuple[ModelMetrics,LogisticRegression]:
    model = LogisticRegression()
    train_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }
    test_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }

    log_reg_importance = pd.DataFrame()
    log_reg_metrics = ModelMetrics("Logistic Regression",train_metrics,test_metrics,log_reg_importance)

    return log_reg_metrics,model
```

##### Example of Feature Importance DataFrame
<html>
<body>

<table>
  <tr><th width=100px></th><th width=100px>Feature</th><th width=100px>Importance</th></tr>
  <tr><td>0</td><td>android.permission.REQUEST_INSTALL_PACKAGES</td><td>-5.5969</td></tr>
  <tr><td>1</td><td>android.permission.READ_PHONE_STATE</td><td>5.1587</td></tr>
  <tr><td>2</td><td>android.permission.android.permission.READ_PHONE_STATE</td><td>-4.7923</td></tr>
  <tr><td>3</td><td>com.anddoes.launcher.permission.UPDATE_COUNT</td><td>-4.7506</td></tr>
  <tr><td>4</td><td>com.samsung.android.providers.context.permission.WRITE_USE_APP_FEATURE_SURVEY</td><td>-4.4933</td></tr>
  <tr><td>5</td><td>com.google.android.finsky.permission.BIND_GET_INSTALL_REFERRER_SERVICE</td><td>-4.4831</td></tr>
  <tr><td>6</td><td>com.google.android.c2dm.permission.RECEIVE</td><td>-4.2781</td></tr>
  <tr><td>7</td><td>android.permission.FOREGROUND_SERVICE</td><td>-4.1966</td></tr>
  <tr><td>8</td><td>android.permission.USE_FINGERPRINT</td><td>-3.9239</td></tr>
  <tr><td>9</td><td>android.permission.INTERNET</td><td>-2.7991</td></tr>
</table>
</body>
</html>
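A table of this shape could be produced roughly as below. This is a sketch under stated assumptions: the data comes from `make_classification` rather than the project files, the column names `feat_0` to `feat_7` are made up, and only 3 features are kept so the output stays short (the task asks for 10).

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data standing in for the real feature matrix.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=0)
X = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(X.shape[1])])

# RFE repeatedly drops the weakest feature until 3 remain.
selector = RFE(LogisticRegression(random_state=0), n_features_to_select=3)
selector.fit(X, y)
selected = X.columns[selector.support_]

# Refit on the selected columns only, then read one coefficient per column.
model = LogisticRegression(random_state=0).fit(X[selected], y)
importance = pd.DataFrame({"Feature": selected, "Importance": model.coef_[0]})

# Sort by absolute value, keep the sign, then round and reset the index.
importance = (
    importance.sort_values("Importance", key=abs, ascending=False)
    .round(4)
    .reset_index(drop=True)
)
print(importance)
```

Sorting with `key=abs` orders rows by magnitude while leaving the signed coefficients in the column, which matches the mix of positive and negative values in the example table.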


In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE
def calculate_logistic_regression_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, logreg_kwargs) -> tuple[ModelMetrics,LogisticRegression]:
    # TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task4.html and implement the function as described
    model = LogisticRegression()
    train_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }
    test_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }

    log_reg_importance = pd.DataFrame()
    log_reg_metrics = ModelMetrics("Logistic Regression",train_metrics,test_metrics,log_reg_importance)

    return log_reg_metrics,model


In [None]:
# Run this cell to test your code
logreg_kwargs = {'penalty':'l1','fit_intercept':False,'solver':'liblinear','random_state':0}
# Answers for NATICUSdroid Dataset
logreg_importance_df_ans = pd.read_pickle(os.path.join(os.getcwd(),"..","task4","pkl_files","logreg_importance.pkl"))
#print('logreg_importance_df_ans : \n',logreg_importance_df_ans)
with open(os.path.join(os.getcwd(),"..","task4","pkl_files","logreg_metrics.pkl"), 'rb') as file:
    logreg_metrics_ans = pickle.load(file)
train_accuracy_ans = logreg_metrics_ans.train_metrics["accuracy"]
train_recall_ans = logreg_metrics_ans.train_metrics["recall"]
train_precision_ans = logreg_metrics_ans.train_metrics["precision"]
train_fscore_ans = logreg_metrics_ans.train_metrics["fscore"]
train_fpr_ans = logreg_metrics_ans.train_metrics["fpr"]
train_fnr_ans = logreg_metrics_ans.train_metrics["fnr"]
train_roc_auc_ans = logreg_metrics_ans.train_metrics["roc_auc"]

test_accuracy_ans = logreg_metrics_ans.test_metrics["accuracy"]
test_recall_ans = logreg_metrics_ans.test_metrics["recall"]
test_precision_ans = logreg_metrics_ans.test_metrics["precision"]
test_fscore_ans = logreg_metrics_ans.test_metrics["fscore"]
test_fpr_ans = logreg_metrics_ans.test_metrics["fpr"]
test_fnr_ans = logreg_metrics_ans.test_metrics["fnr"]
test_roc_auc_ans = logreg_metrics_ans.test_metrics["roc_auc"]

# Calculate Metrics with Student's Function
logreg_metrics,_ = calculate_logistic_regression_metrics(train_features, test_features, train_targets, test_targets, logreg_kwargs)

train_accuracy = logreg_metrics.train_metrics["accuracy"]
print(f'Train Accuracy: {train_accuracy} - ', 'Correct' if train_accuracy == train_accuracy_ans else 'Incorrect')
train_recall = logreg_metrics.train_metrics["recall"]
print(f'Train Recall: {train_recall} - ', 'Correct' if train_recall == train_recall_ans else 'Incorrect')
train_precision = logreg_metrics.train_metrics["precision"]
print(f'Train Precision: {train_precision} - ', 'Correct' if train_precision == train_precision_ans else 'Incorrect')
train_fscore = logreg_metrics.train_metrics["fscore"]
print(f'Train Fscore: {train_fscore} - ', 'Correct' if train_fscore == train_fscore_ans else 'Incorrect')
train_fpr = logreg_metrics.train_metrics["fpr"]
print(f'Train FPR: {train_fpr} - ', 'Correct' if train_fpr == train_fpr_ans else 'Incorrect')
train_fnr = logreg_metrics.train_metrics["fnr"]
print(f'Train FNR: {train_fnr} - ', 'Correct' if train_fnr == train_fnr_ans else 'Incorrect')
train_roc_auc = logreg_metrics.train_metrics["roc_auc"]
print(f'Train ROC AUC: {train_roc_auc} - ', 'Correct' if train_roc_auc == train_roc_auc_ans else 'Incorrect')
test_accuracy = logreg_metrics.test_metrics["accuracy"]
print(f'Test Accuracy: {test_accuracy} - ', 'Correct' if test_accuracy == test_accuracy_ans else 'Incorrect')
test_recall = logreg_metrics.test_metrics["recall"]
print(f'Test Recall: {test_recall} - ', 'Correct' if test_recall == test_recall_ans else 'Incorrect')
test_precision = logreg_metrics.test_metrics["precision"]
print(f'Test Precision: {test_precision} - ', 'Correct' if test_precision == test_precision_ans else 'Incorrect')
test_fscore = logreg_metrics.test_metrics["fscore"]
print(f'Test Fscore: {test_fscore} - ', 'Correct' if test_fscore == test_fscore_ans else 'Incorrect')
test_fpr = logreg_metrics.test_metrics["fpr"]
print(f'Test FPR: {test_fpr} - ', 'Correct' if test_fpr == test_fpr_ans else 'Incorrect')
test_fnr = logreg_metrics.test_metrics["fnr"]
print(f'Test FNR: {test_fnr} - ', 'Correct' if test_fnr == test_fnr_ans else 'Incorrect')
test_roc_auc = logreg_metrics.test_metrics["roc_auc"]
print(f'Test ROC AUC: {test_roc_auc} - ', 'Correct' if test_roc_auc == test_roc_auc_ans else 'Incorrect')
importance_df = logreg_metrics.feat_imp_df
if compare_submission_to_answer_df(importance_df.round(4),logreg_importance_df_ans,"LR Feature Importance DF"):
    print('Feature Importance DF: Correct')
else:
    print('Feature Importance DF: Incorrect')

## `calculate_decision_tree_metrics`

A Decision Tree (DT) is a supervised learning algorithm used for both classification and regression tasks. It works by recursively splitting the data into subsets based on the feature that results in the best separation of classes, typically measured using **Gini impurity** or **entropy**. Decision trees are interpretable, as the learned model can be visualized as a flowchart-like structure.  
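Both impurity measures are easy to compute by hand for a single node. The helper names `gini` and `entropy` below are illustrative and are not part of scikit-learn's public API.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: sum(-p_k * log2(p_k))."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

mixed = [0, 0, 1, 1]   # evenly mixed node
pure = [1, 1, 1, 1]    # node containing one class only

print(gini(mixed), entropy(mixed))
print(gini(pure), entropy(pure))
```

A pure node scores 0 under both measures, and an evenly mixed binary node scores the maximum (0.5 for Gini, 1 bit for entropy). A split is considered good when it moves the child nodes toward 0.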

If you are interested in learning more, [you can read up on the math](https://en.wikipedia.org/wiki/Decision_tree_learning) behind how decision trees work.  

For this project, we are more focused on showing you how to apply these models, so you can simply use a scikit-learn DecisionTreeClassifier in your code.  

For this task, use scikit-learn's DecisionTreeClassifier class and complete the following subtasks:  

* Train a DT model (initialized using the kwargs passed into the function).  
* Predict scores for training and test datasets and calculate the 7 metrics listed below for the training and test datasets using predictions from the fit model. (All rounded to 4 decimal places).  
  * `accuracy`  
  * `recall`  
  * `precision`  
  * `fscore`  
  * `false positive rate (fpr)`  
  * `false negative rate (fnr)`  
  * `Area Under the Curve of Receiver Operating Characteristics Curve (roc_auc)`  
* Use RFE to select the top 10 features.  
* Train a DT model using these selected features (initialized using the kwargs passed into the function).  
* Create a **Feature Importance DataFrame** from the model trained on the top 10 features:  
  * Use the top 10 features, sorting **by absolute value** of feature importance from biggest to smallest.  
  * Make sure you use the same feature and importance column names as set in `ModelMetrics` in `feat_name_col` [`Feature`] and `imp_col` [`Importance`].  
  * Round the importances to 4 decimal places (**do this step after you have sorted by Importance**).  
  * Reset the index to 0-9. You can do this the same way you did in Task 1.  

**NOTE:** Make sure you use the predicted probabilities for `roc_auc`.  
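The sketch below shows how a kwargs dictionary reaches the classifier and where the probabilities for `roc_auc` come from. It assumes toy data from `make_classification`; the kwargs values are the same ones the test cell in this notebook passes in.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the real feature matrix.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=0)

# ** unpacks the dict into keyword arguments for the constructor.
tree_kwargs = {"criterion": "entropy", "max_depth": 3, "random_state": 0}
model = DecisionTreeClassifier(**tree_kwargs).fit(X, y)

# Tree importances are non-negative and sum to 1 once the tree has split.
importances = model.feature_importances_
print(importances.min() >= 0, round(importances.sum(), 4))

# predict_proba has one column per class; column 1 is P(class == 1).
proba = model.predict_proba(X)[:, 1]
print(round(roc_auc_score(y, proba), 4))
```

Since tree importances are never negative, sorting by absolute value gives the same order as sorting by the raw values; the instruction matters mainly for signed coefficients such as those from logistic regression.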

##### Useful Resources  
* <https://scikit-learn.org/stable/modules/tree.html>  
* <https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html>  
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score>  
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score>  
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score>  
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score>  
* <https://en.wikipedia.org/wiki/Confusion_matrix>  
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html>  

##### INPUTS  
The first 4 are similar to the `tts` function you created in Task 2:  
* `train_features` - a Pandas DataFrame with training features  
* `test_features` - a Pandas DataFrame with test features  
* `train_targets` - a Pandas DataFrame with training targets  
* `test_targets` - a Pandas DataFrame with test targets  
* `tree_kwargs` - a dictionary with keyword arguments that can be passed directly to the scikit-learn `DecisionTreeClassifier` class  

##### OUTPUTS  
* A completed `ModelMetrics` object with a training and test metrics dictionary with each one of the metrics **rounded to 4 decimal places**  
* A scikit-learn `DecisionTreeClassifier` model object fit on the training set  

##### Function Skeleton  
```python
def calculate_decision_tree_metrics(train_features: pd.DataFrame, test_features: pd.DataFrame, train_targets: pd.Series, test_targets: pd.Series, tree_kwargs) -> tuple[ModelMetrics, DecisionTreeClassifier]:
    model = DecisionTreeClassifier(**tree_kwargs)
    train_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0,
        "fpr": 0,
        "fnr": 0,
        "roc_auc": 0
    }
    test_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0,
        "fpr": 0,
        "fnr": 0,
        "roc_auc": 0
    }

    tree_importance = pd.DataFrame()
    tree_metrics = ModelMetrics("Decision Tree", train_metrics, test_metrics, tree_importance)

    return tree_metrics, model
```

In [None]:
from sklearn.tree import DecisionTreeClassifier

def calculate_decision_tree_metrics(train_features: pd.DataFrame, test_features: pd.DataFrame, train_targets: pd.Series, test_targets: pd.Series, dt_kwargs) -> tuple[ModelMetrics, DecisionTreeClassifier]:
    # TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task4.html and implement the function as described
    model = DecisionTreeClassifier()
    train_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }
    test_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }

    dt_importance = pd.DataFrame()
    dt_metrics = ModelMetrics("Decision Tree", train_metrics, test_metrics, dt_importance)

    return dt_metrics,model

In [None]:
# Run this cell to test your code
dt_kwargs = {'criterion': 'entropy', 'max_depth': 3, 'random_state': 0}

# Answers for NATICUSdroid Dataset
with open(os.path.join(os.getcwd(), "..", "task4", "pkl_files", "dt_metrics.pkl"), 'rb') as file:
    dt_metrics_ans = pickle.load(file)
train_accuracy_ans = dt_metrics_ans.train_metrics["accuracy"]
train_recall_ans = dt_metrics_ans.train_metrics["recall"]
train_precision_ans = dt_metrics_ans.train_metrics["precision"]
train_fscore_ans = dt_metrics_ans.train_metrics["fscore"]
train_fpr_ans = dt_metrics_ans.train_metrics["fpr"]
train_fnr_ans = dt_metrics_ans.train_metrics["fnr"]
train_roc_auc_ans = dt_metrics_ans.train_metrics["roc_auc"]

test_accuracy_ans = dt_metrics_ans.test_metrics["accuracy"]
test_recall_ans = dt_metrics_ans.test_metrics["recall"]
test_precision_ans = dt_metrics_ans.test_metrics["precision"]
test_fscore_ans = dt_metrics_ans.test_metrics["fscore"]
test_fpr_ans = dt_metrics_ans.test_metrics["fpr"]
test_fnr_ans = dt_metrics_ans.test_metrics["fnr"]
test_roc_auc_ans = dt_metrics_ans.test_metrics["roc_auc"]

# Calculate Metrics with Student's Function
dt_metrics, _ = calculate_decision_tree_metrics(
    train_features, test_features, train_targets, test_targets, dt_kwargs
)

train_accuracy = dt_metrics.train_metrics["accuracy"]
print(f'Train Accuracy: {train_accuracy} - ', 'Correct' if train_accuracy == train_accuracy_ans else 'Incorrect')
train_recall = dt_metrics.train_metrics["recall"]
print(f'Train Recall: {train_recall} - ', 'Correct' if train_recall == train_recall_ans else 'Incorrect')
train_precision = dt_metrics.train_metrics["precision"]
print(f'Train Precision: {train_precision} - ', 'Correct' if train_precision == train_precision_ans else 'Incorrect')
train_fscore = dt_metrics.train_metrics["fscore"]
print(f'Train Fscore: {train_fscore} - ', 'Correct' if train_fscore == train_fscore_ans else 'Incorrect')
train_fpr = dt_metrics.train_metrics["fpr"]
print(f'Train FPR: {train_fpr} - ', 'Correct' if train_fpr == train_fpr_ans else 'Incorrect')
train_fnr = dt_metrics.train_metrics["fnr"]
print(f'Train FNR: {train_fnr} - ', 'Correct' if train_fnr == train_fnr_ans else 'Incorrect')
train_roc_auc = dt_metrics.train_metrics["roc_auc"]
print(f'Train ROC AUC: {train_roc_auc} - ', 'Correct' if train_roc_auc == train_roc_auc_ans else 'Incorrect')

test_accuracy = dt_metrics.test_metrics["accuracy"]
print(f'Test Accuracy: {test_accuracy} - ', 'Correct' if test_accuracy == test_accuracy_ans else 'Incorrect')
test_recall = dt_metrics.test_metrics["recall"]
print(f'Test Recall: {test_recall} - ', 'Correct' if test_recall == test_recall_ans else 'Incorrect')
test_precision = dt_metrics.test_metrics["precision"]
print(f'Test Precision: {test_precision} - ', 'Correct' if test_precision == test_precision_ans else 'Incorrect')
test_fscore = dt_metrics.test_metrics["fscore"]
print(f'Test Fscore: {test_fscore} - ', 'Correct' if test_fscore == test_fscore_ans else 'Incorrect')
test_fpr = dt_metrics.test_metrics["fpr"]
print(f'Test FPR: {test_fpr} - ', 'Correct' if test_fpr == test_fpr_ans else 'Incorrect')
test_fnr = dt_metrics.test_metrics["fnr"]
print(f'Test FNR: {test_fnr} - ', 'Correct' if test_fnr == test_fnr_ans else 'Incorrect')
test_roc_auc = dt_metrics.test_metrics["roc_auc"]
print(f'Test ROC AUC: {test_roc_auc} - ', 'Correct' if test_roc_auc == test_roc_auc_ans else 'Incorrect')

importance_df = dt_metrics.feat_imp_df
if compare_submission_to_answer_df(importance_df.round(4), dt_importance_df_ans, "DT Feature Importance DF"):
    print('Feature Importance DF: Correct')
else:
    print('Feature Importance DF: Incorrect')


## `calculate_random_forest_metrics`

A Random Forest model is more complex than the naive, logistic regression, and decision tree models you have trained so far. It can still be used to estimate the probability of an event, but achieves this using a different underlying structure: [a tree-based model](https://en.wikipedia.org/wiki/Decision_tree_learning).  
Conceptually, this looks a lot like many if/else statements chained together into a "tree". A Random Forest expands on this and trains different trees with different subsets of the data and starting conditions. It does this to get a better estimate than a single tree would give. For this project, we are more focused on showing you how to apply these models, so you can simply use the scikit-learn Random Forest model in your code. 
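The "many trees, one estimate" idea can be checked directly: in scikit-learn, a forest's predicted probabilities are the mean of its trees' predicted probabilities. The snippet is a sketch on hypothetical toy data with arbitrary settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data; the settings below are arbitrary.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=0)
forest = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0).fit(X, y)

# Each fitted tree lives in estimators_; average their class probabilities.
tree_mean = np.mean([tree.predict_proba(X) for tree in forest.estimators_], axis=0)
forest_proba = forest.predict_proba(X)

print(len(forest.estimators_), np.allclose(tree_mean, forest_proba))
```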

For this task use scikit-learn's Random Forest Classifier class and complete the following subtasks:
* Train a Random Forest model (initialized using the kwargs passed into the function)
* Predict scores for training and test datasets and calculate the 7 metrics listed below for the training and test datasets using predictions from the fit model. (All rounded to 4 decimal places)
  * `accuracy`
  * `recall`
  * `precision`
  * `fscore`
  * `false positive rate (fpr)`
  * `false negative rate (fnr)`
  * `Area Under the Curve of Receiver Operating Characteristics Curve (roc_auc)`
* Create a Feature Importance DataFrame from the trained model: 
  * **Do Not Use RFE for feature selection**
  * Use the top 10 features selected by the built-in method (sorted from biggest to smallest)
  * Make sure you use the same feature and importance column names as ModelMetrics shows in feat_name_col [`Feature`] and imp_col [`Importance`]
  * Round the importances to 4 decimal places (**do this step after you have sorted by Importance**)
  * Reset the index to 0-9. You can do this the same way you did in task1.

**NOTE:** Make sure you use the predicted probabilities for roc auc
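One way to build the importance table without RFE is sketched below. It assumes that "the built-in method" refers to the fitted model's `feature_importances_` attribute; the toy data, the `feat_*` column names, and the `rf_kwargs` values are hypothetical, and only 3 rows are kept for brevity (the task asks for 10).

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for the real feature matrix.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=0)
X = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(X.shape[1])])

# Hypothetical kwargs for illustration only.
rf_kwargs = {"n_estimators": 20, "max_depth": 3, "random_state": 0}
model = RandomForestClassifier(**rf_kwargs).fit(X, y)

# The fitted forest exposes one importance per input column.
importance = pd.DataFrame({"Feature": X.columns, "Importance": model.feature_importances_})

# Sort, keep the top 3, then round and reset the index.
importance = (
    importance.sort_values("Importance", ascending=False)
    .head(3)
    .round(4)
    .reset_index(drop=True)
)
print(importance)
```

Rounding after sorting avoids ties introduced by rounding from changing the row order.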

##### Useful Resources
* <https://blog.dataiku.com/tree-based-models-how-they-work-in-plain-english>
* <https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score>
* <https://en.wikipedia.org/wiki/Confusion_matrix>
* <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html>

##### INPUTS
* `train_features` - a dataset split by a function similar to the tts function you created in task2
* `test_features` - a dataset split by a function similar to the tts function you created in task2
* `train_targets` - a dataset split by a function similar to the tts function you created in task2
* `test_targets` - a dataset split by a function similar to the tts function you created in task2
* `rf_kwargs` - a dictionary with keyword arguments that can be passed directly to the scikit-learn RandomForestClassifier class

##### OUTPUTS 
* A completed `ModelMetrics` object with a training and test metrics dictionary with each one of the metrics **rounded to 4 decimal places**
* A scikit-learn Random Forest model object fit on the training set

##### Function Skeleton
```python
def calculate_random_forest_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, rf_kwargs) -> tuple[ModelMetrics,RandomForestClassifier]:

    model = RandomForestClassifier()
    
    train_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }
        
    test_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }

    rf_importance = pd.DataFrame()
    rf_metrics = ModelMetrics("Random Forest",train_metrics,test_metrics,rf_importance)

    return rf_metrics,model
```

##### Example of Feature Importance DataFrame
<html>
<body>

<table>
  <tr><th width=100px></th><th width=100px>Feature</th><th width=100px>Importance</th></tr>
  <tr><td>0</td><td>android.permission.READ_PHONE_STATE</td><td>0.1871</td></tr>
  <tr><td>1</td><td>com.google.android.c2dm.permission.RECEIVE</td><td>0.1165</td></tr>
  <tr><td>2</td><td>android.permission.RECEIVE_BOOT_COMPLETED</td><td>0.1036</td></tr>
  <tr><td>3</td><td>com.android.launcher.permission.INSTALL_SHORTCUT</td><td>0.1004</td></tr>
  <tr><td>4</td><td>android.permission.ACCESS_COARSE_LOCATION</td><td>0.0921</td></tr>
  <tr><td>5</td><td>android.permission.ACCESS_FINE_LOCATION</td><td>0.0531</td></tr>
  <tr><td>6</td><td>android.permission.GET_TASKS</td><td>0.0462</td></tr>
  <tr><td>7</td><td>android.permission.SYSTEM_ALERT_WINDOW</td><td>0.0433</td></tr>
  <tr><td>8</td><td>com.android.vending.BILLING</td><td>0.026</td></tr>
  <tr><td>9</td><td>android.permission.WRITE_SETTINGS</td><td>0.0236</td></tr>
</table>
</body>
</html>
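
A DataFrame shaped like the example above can be sketched as follows. The feature names and importance values here are hypothetical placeholders; with a fitted `RandomForestClassifier`, the values would typically come from its `feature_importances_` attribute and the names from the training feature columns. The descending sort, reset index, and rounding are assumptions inferred from the example table, so check the task description for the exact requirements (for instance, how many rows to keep).

```python
import pandas as pd

# Hypothetical placeholders standing in for the feature column names
# and a fitted model's feature_importances_ values
feature_names = ["perm_a", "perm_b", "perm_c"]
importances = [0.12346, 0.61234, 0.26421]

importance_df = (
    pd.DataFrame({"Feature": feature_names, "Importance": importances})
    .sort_values("Importance", ascending=False)  # most important feature first
    .reset_index(drop=True)                      # index restarts at 0, as in the table
    .round(4)                                    # rounds the numeric column only
)
print(importance_df)
```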


In [None]:
from sklearn.ensemble import RandomForestClassifier
def calculate_random_forest_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, rf_kwargs) -> tuple[ModelMetrics,RandomForestClassifier]:
    # TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task4.html and implement the function as described
    model = RandomForestClassifier()
    train_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }
    test_metrics = {
        "accuracy" : 0,
        "recall" : 0,
        "precision" : 0,
        "fscore" : 0,
        "fpr" : 0,
        "fnr" : 0,
        "roc_auc" : 0
        }

    rf_importance = pd.DataFrame()
    rf_metrics = ModelMetrics("Random Forest",train_metrics,test_metrics,rf_importance)

    return rf_metrics,model


In [None]:
# Run this cell to test your code
rf_kwargs = {'n_estimators':150,'max_depth':3,'criterion':'log_loss','random_state':0}
# Load the expected answers for the Random Forest test
rf_importance_df_ans = pd.read_pickle(os.path.join(os.getcwd(),"..","task4","pkl_files","CLAMP_rf_importance.pkl"))
#print('rf_importance_df_ans : \n',rf_importance_df_ans)
with open(os.path.join(os.getcwd(),"..","task4","pkl_files","rf_metrics.pkl"), 'rb') as file:
    rf_metrics_ans = pickle.load(file)
train_accuracy_ans = rf_metrics_ans.train_metrics["accuracy"]
train_recall_ans = rf_metrics_ans.train_metrics["recall"]
train_precision_ans = rf_metrics_ans.train_metrics["precision"]
train_fscore_ans = rf_metrics_ans.train_metrics["fscore"]
train_fpr_ans = rf_metrics_ans.train_metrics["fpr"]
train_fnr_ans = rf_metrics_ans.train_metrics["fnr"]
train_roc_auc_ans = rf_metrics_ans.train_metrics["roc_auc"]

test_accuracy_ans = rf_metrics_ans.test_metrics["accuracy"]
test_recall_ans = rf_metrics_ans.test_metrics["recall"]
test_precision_ans = rf_metrics_ans.test_metrics["precision"]
test_fscore_ans = rf_metrics_ans.test_metrics["fscore"]
test_fpr_ans = rf_metrics_ans.test_metrics["fpr"]
test_fnr_ans = rf_metrics_ans.test_metrics["fnr"]
test_roc_auc_ans = rf_metrics_ans.test_metrics["roc_auc"]

# Calculate Metrics with Student's Function
rf_metrics,_ = calculate_random_forest_metrics(train_features, test_features, train_targets, test_targets, rf_kwargs)
train_accuracy = rf_metrics.train_metrics["accuracy"]
print(f'Train Accuracy: {train_accuracy} - ', 'Correct' if train_accuracy == train_accuracy_ans else 'Incorrect')
train_recall = rf_metrics.train_metrics["recall"]
print(f'Train Recall: {train_recall} - ', 'Correct' if train_recall == train_recall_ans else 'Incorrect')
train_precision = rf_metrics.train_metrics["precision"]
print(f'Train Precision: {train_precision} - ', 'Correct' if train_precision == train_precision_ans else 'Incorrect')
train_fscore = rf_metrics.train_metrics["fscore"]
print(f'Train Fscore: {train_fscore} - ', 'Correct' if train_fscore == train_fscore_ans else 'Incorrect')
train_fpr = rf_metrics.train_metrics["fpr"]
print(f'Train FPR: {train_fpr} - ', 'Correct' if train_fpr == train_fpr_ans else 'Incorrect')
train_fnr = rf_metrics.train_metrics["fnr"]
print(f'Train FNR: {train_fnr} - ', 'Correct' if train_fnr == train_fnr_ans else 'Incorrect')
train_roc_auc = rf_metrics.train_metrics["roc_auc"]
print(f'Train ROC AUC: {train_roc_auc} - ', 'Correct' if train_roc_auc == train_roc_auc_ans else 'Incorrect')
test_accuracy = rf_metrics.test_metrics["accuracy"]
print(f'Test Accuracy: {test_accuracy} - ', 'Correct' if test_accuracy == test_accuracy_ans else 'Incorrect')
test_recall = rf_metrics.test_metrics["recall"]
print(f'Test Recall: {test_recall} - ', 'Correct' if test_recall == test_recall_ans else 'Incorrect')
test_precision = rf_metrics.test_metrics["precision"]
print(f'Test Precision: {test_precision} - ', 'Correct' if test_precision == test_precision_ans else 'Incorrect')
test_fscore = rf_metrics.test_metrics["fscore"]
print(f'Test Fscore: {test_fscore} - ', 'Correct' if test_fscore == test_fscore_ans else 'Incorrect')
test_fpr = rf_metrics.test_metrics["fpr"]
print(f'Test FPR: {test_fpr} - ', 'Correct' if test_fpr == test_fpr_ans else 'Incorrect')
test_fnr = rf_metrics.test_metrics["fnr"]
print(f'Test FNR: {test_fnr} - ', 'Correct' if test_fnr == test_fnr_ans else 'Incorrect')
test_roc_auc = rf_metrics.test_metrics["roc_auc"]
print(f'Test ROC AUC: {test_roc_auc} - ', 'Correct' if test_roc_auc == test_roc_auc_ans else 'Incorrect')
importance_df = rf_metrics.feat_imp_df
if compare_submission_to_answer_df(importance_df.round(4),rf_importance_df_ans,"RF Feature Importance DF"):
    print('Feature Importance DF: Correct')
else:
    print('Feature Importance DF: Incorrect')

# You have successfully reached the end of this notebook.

**Reminder: to receive credit for this assignment, you must update the code in the skeleton files provided in the *src* folder and upload them to Gradescope.**