<a href="https://colab.research.google.com/github/Brittanykusi/AutoML-examples/blob/main/AutoML.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install Packages

In [None]:
!pip install tpot mljar-supervised

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tpot
  Downloading TPOT-0.11.7-py3-none-any.whl (87 kB)
Collecting mljar-supervised
  Downloading mljar-supervised-0.11.3.tar.gz (112 kB)
Collecting stopit>=1.1.1
  Downloading stopit-1.1.2.tar.gz (18 kB)
Collecting deap>=1.2
  Downloading deap-1.3.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (139 kB)
Collecting update-checker>=0.16
  Downloading update_checker-0.18.0-py3-none-any.whl (7.0 kB)
Collecting xgboost>=1.1.0
  Downloading xgboost-1.6.2-py3-none-manylinux2014_x86_64.whl (255.9 MB)
Collecting lightgbm>=3.0.0
  Downloading lightgbm-3.3.3-py3-none-manylinux1_x86_64.whl

In [None]:
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from supervised.automl import AutoML

# Options Available

- mode: the package ships with four built-in modes.
  - The Explain mode is ideal for explaining and understanding the data. It produces feature-importance plots as well as tree visualizations.
  - The Perform mode is used when building ML models for production.
  - The Compete mode is meant for building models for machine learning competitions.
  - The Optuna mode is used to search for highly tuned ML models.
- algorithms: specifies the algorithms you would like to use, usually passed in as a list.
- results_path: the path where the results will be stored.
- total_time_limit: the total time, in seconds, allowed for training.
- train_ensemble: whether an ensemble will be created at the end of the training process.
- stack_models: whether a stack of models will be created.
- eval_metric: the metric to optimize. If set to "auto", logloss is used for classification problems and rmse for regression problems.
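
As a rough illustration of how these options fit together, the sketch below collects them in a plain dictionary and hands them to `AutoML` inside a small helper. The specific values (the algorithm subset, the `AutoML_example` folder, the 30-minute limit) and the helper name `build_automl` are assumptions chosen for the example, not settings used elsewhere in this notebook.

```python
# Illustrative settings only; every value here is an assumption for the sketch.
automl_settings = {
    "mode": "Explain",
    "algorithms": ["Baseline", "Linear", "Random Forest"],
    "results_path": "AutoML_example",  # hypothetical output folder
    "total_time_limit": 30 * 60,       # time budget in seconds
    "train_ensemble": True,
    "stack_models": False,
    "eval_metric": "auto",
}


def build_automl(settings):
    """Create an AutoML object from a settings dict (needs mljar-supervised)."""
    # Imported lazily so the settings can be inspected without the package installed.
    from supervised.automl import AutoML

    return AutoML(**settings)


print(automl_settings["total_time_limit"])  # time budget in seconds
```

With mljar-supervised installed, `automl = build_automl(automl_settings)` would create the object, ready for `automl.fit(...)`.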

In [None]:
# automl = AutoML(
#     mode="Explain",
#     algorithms="",
#     results_path="AutoML_22",
#     total_time_limit=30 * 60,
#     train_ensemble=True,
#     stack_models="",
#     eval_metric="",
# )

# These are just the parameters you can set before running AutoML; the empty strings are placeholders.
# You can read more about what the modes mean at https://supervised.mljar.com/features/modes/

# Health Dataset - Student Mental Health

## Load Dataset

In [None]:
# import package
import pandas as pd
# import csv file
mentalHealth = pd.read_csv('/content/Student Mental health.csv')
#display dataset
mentalHealth

Unnamed: 0,Timestamp,Choose your gender,Age,What is your course?,Your current year of Study,What is your CGPA?,Marital status,Do you have Depression?,Do you have Anxiety?,Do you have Panic attack?,Did you seek any specialist for a treatment?
0,8/7/2020 12:02,Female,18.0,Engineering,year 1,3.00 - 3.49,No,Yes,No,Yes,No
1,8/7/2020 12:04,Male,21.0,Islamic education,year 2,3.00 - 3.49,No,No,Yes,No,No
2,8/7/2020 12:05,Male,19.0,BIT,Year 1,3.00 - 3.49,No,Yes,Yes,Yes,No
3,8/7/2020 12:06,Female,22.0,Laws,year 3,3.00 - 3.49,Yes,Yes,No,No,No
4,8/7/2020 12:13,Male,23.0,Mathemathics,year 4,3.00 - 3.49,No,No,No,No,No
...,...,...,...,...,...,...,...,...,...,...,...
96,13/07/2020 19:56:49,Female,21.0,BCS,year 1,3.50 - 4.00,No,No,Yes,No,No
97,13/07/2020 21:21:42,Male,18.0,Engineering,Year 2,3.00 - 3.49,No,Yes,Yes,No,No
98,13/07/2020 21:22:56,Female,19.0,Nursing,Year 3,3.50 - 4.00,Yes,Yes,No,Yes,No
99,13/07/2020 21:23:57,Female,23.0,Pendidikan Islam,year 4,3.50 - 4.00,No,No,No,No,No


In [None]:
#display column names
mentalHealth.columns

Index(['Timestamp', 'Choose your gender', 'Age', 'What is your course?',
       'Your current year of Study', 'What is your CGPA?', 'Marital status',
       'Do you have Depression?', 'Do you have Anxiety?',
       'Do you have Panic attack?',
       'Did you seek any specialist for a treatment?'],
      dtype='object')

## Potential variables of interest

- GPA
- Anxiety status
- Gender
- Area of study
- Age

In [None]:
# describe column
mentalHealth['What is your CGPA?'].describe()

count             101
unique              6
top       3.50 - 4.00
freq               47
Name: What is your CGPA?, dtype: object

In [None]:
# min
mentalHealth['What is your CGPA?'].min()

'0 - 1.99'

In [None]:
# max
mentalHealth['What is your CGPA?'].max()

'3.50 - 4.00 '

In [None]:
# value counts
mentalHealth['What is your CGPA?'].value_counts()

3.50 - 4.00     47
3.00 - 3.49     43
2.50 - 2.99      4
0 - 1.99         4
2.00 - 2.49      2
3.50 - 4.00      1
Name: What is your CGPA?, dtype: int64
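
The counts above list `3.50 - 4.00` twice because one response carries a trailing space (the `max()` output shows `'3.50 - 4.00 '`). One possible cleanup, sketched here on a small made-up Series rather than the real column, is to strip whitespace before counting.

```python
import pandas as pd

# Made-up sample that mimics the trailing-space problem in the CGPA column.
cgpa = pd.Series(
    ["3.50 - 4.00", "3.50 - 4.00 ", "3.00 - 3.49", "0 - 1.99"],
    name="What is your CGPA?",
)

# Remove leading and trailing whitespace so identical ranges share one label.
cleaned = cgpa.str.strip()

print(cgpa.nunique())     # distinct labels before cleaning
print(cleaned.nunique())  # distinct labels after cleaning
print(cleaned.value_counts())
```

On the real data the same idea would be `mentalHealth['What is your CGPA?'].str.strip()`.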

In [None]:
# describe column
mentalHealth['Do you have Anxiety?'].describe()

count     101
unique      2
top        No
freq       67
Name: Do you have Anxiety?, dtype: object

In [None]:
# describe column
mentalHealth['Choose your gender'].describe()

count        101
unique         2
top       Female
freq          75
Name: Choose your gender, dtype: object

In [None]:
# describe column
mentalHealth['What is your course?'].describe()

count     101
unique     49
top       BCS
freq       18
Name: What is your course?, dtype: object

In [None]:
# value counts
mentalHealth['What is your course?'].value_counts()

BCS                        18
Engineering                17
BIT                        10
Biomedical science          4
KOE                         4
BENL                        2
Laws                        2
psychology                  2
Engine                      2
Islamic Education           1
Biotechnology               1
engin                       1
Econs                       1
MHSC                        1
Malcom                      1
Kop                         1
Human Sciences              1
Communication               1
Nursing                     1
Diploma Nursing             1
IT                          1
Pendidikan Islam            1
Radiography                 1
Fiqh fatwa                  1
DIPLOMA TESL                1
Koe                         1
Fiqh                        1
CTS                         1
koe                         1
Benl                        1
Kirkhs                      1
Mathemathics                1
Pendidikan islam            1
Human Reso
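
Many of the 49 course labels differ only in capitalization or spelling (`KOE`, `Koe`, `koe`; `Engineering`, `engin`). A minimal sketch of one way to consolidate them, using a made-up sample; the `aliases` mapping is a hypothetical example and would need to be checked against the real responses.

```python
import pandas as pd

# Made-up sample of free-text course answers.
course = pd.Series(["KOE", "Koe", "koe ", "Engineering", "engin", "BCS"])

# Hypothetical mapping from abbreviations to a canonical name.
aliases = {"engin": "engineering"}

# Trim whitespace, lowercase, then apply the alias mapping to exact matches.
normalized = course.str.strip().str.lower().replace(aliases)

print(normalized.value_counts())
```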

In [None]:
# describe column
mentalHealth['Age'].describe()

count    100.00000
mean      20.53000
std        2.49628
min       18.00000
25%       18.00000
50%       19.00000
75%       23.00000
max       24.00000
Name: Age, dtype: float64

## Create simplified binary labels

In [None]:
# convert the Age column to a numeric type; unparseable values become NaN
mentalHealth['Age'] = pd.to_numeric(mentalHealth['Age'], errors='coerce')
# bin age into two labels in a new column: 'old' if over 21, otherwise 'young'
# (a missing Age fails the x > 21 test, so it is also labeled 'young')
mentalHealth['age'] = mentalHealth['Age'].apply(lambda x: 'old' if x > 21 else 'young')
# drop the original numeric column from the dataset
mentalHealth.drop('Age', axis=1, inplace=True)
# value counts for the new column (not shown: only the last expression in a cell is displayed)
mentalHealth['age'].value_counts()
# display table
mentalHealth

Unnamed: 0,Timestamp,Choose your gender,What is your course?,Your current year of Study,What is your CGPA?,Marital status,Do you have Depression?,Do you have Anxiety?,Do you have Panic attack?,Did you seek any specialist for a treatment?,age
0,8/7/2020 12:02,Female,Engineering,year 1,3.00 - 3.49,No,Yes,No,Yes,No,young
1,8/7/2020 12:04,Male,Islamic education,year 2,3.00 - 3.49,No,No,Yes,No,No,young
2,8/7/2020 12:05,Male,BIT,Year 1,3.00 - 3.49,No,Yes,Yes,Yes,No,young
3,8/7/2020 12:06,Female,Laws,year 3,3.00 - 3.49,Yes,Yes,No,No,No,old
4,8/7/2020 12:13,Male,Mathemathics,year 4,3.00 - 3.49,No,No,No,No,No,old
...,...,...,...,...,...,...,...,...,...,...,...
96,13/07/2020 19:56:49,Female,BCS,year 1,3.50 - 4.00,No,No,Yes,No,No,young
97,13/07/2020 21:21:42,Male,Engineering,Year 2,3.00 - 3.49,No,Yes,Yes,No,No,young
98,13/07/2020 21:22:56,Female,Nursing,Year 3,3.50 - 4.00,Yes,Yes,No,Yes,No,young
99,13/07/2020 21:23:57,Female,Pendidikan Islam,year 4,3.50 - 4.00,No,No,No,No,No,old
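
`Age` has 100 non-null values out of 101 rows, so one respondent has no age. With the lambda above, that row is labeled `young` because a comparison with NaN is false. The sketch below, on made-up ages, contrasts that behavior with a variant that keeps the missing value missing; it is one possible approach, not the notebook's method.

```python
import numpy as np
import pandas as pd

# Made-up ages, including one missing value.
ages = pd.Series([18.0, 24.0, np.nan, 21.0, 22.0])

# Same rule as the notebook: NaN > 21 is False, so NaN becomes 'young'.
naive = ages.apply(lambda x: "old" if x > 21 else "young")

# Variant: label as before, then blank out rows where the age is missing.
labels = pd.Series(np.where(ages > 21, "old", "young"), index=ages.index)
labels = labels.where(ages.notna())

print(naive.tolist())
print(labels.tolist())
```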


# MLJar examples

## Experiment 1: Binary Outcome - Student Mental Health

### Create a new model

In [None]:
# features: every column except the target, 'Do you have Depression?'
X = mentalHealth.drop(columns=['Do you have Depression?'])

In [None]:
# target: the depression column, stored in its own variable
y = mentalHealth["Do you have Depression?"]

In [None]:
# display X
X

Unnamed: 0,Timestamp,Choose your gender,What is your course?,Your current year of Study,What is your CGPA?,Marital status,Do you have Anxiety?,Do you have Panic attack?,Did you seek any specialist for a treatment?,age
0,8/7/2020 12:02,Female,Engineering,year 1,3.00 - 3.49,No,No,Yes,No,young
1,8/7/2020 12:04,Male,Islamic education,year 2,3.00 - 3.49,No,Yes,No,No,young
2,8/7/2020 12:05,Male,BIT,Year 1,3.00 - 3.49,No,Yes,Yes,No,young
3,8/7/2020 12:06,Female,Laws,year 3,3.00 - 3.49,Yes,No,No,No,old
4,8/7/2020 12:13,Male,Mathemathics,year 4,3.00 - 3.49,No,No,No,No,old
...,...,...,...,...,...,...,...,...,...,...
96,13/07/2020 19:56:49,Female,BCS,year 1,3.50 - 4.00,No,Yes,No,No,young
97,13/07/2020 21:21:42,Male,Engineering,Year 2,3.00 - 3.49,No,Yes,No,No,young
98,13/07/2020 21:22:56,Female,Nursing,Year 3,3.50 - 4.00,Yes,No,Yes,No,young
99,13/07/2020 21:23:57,Female,Pendidikan Islam,year 4,3.50 - 4.00,No,No,No,No,old


In [None]:
# display y
y.head(50)

0     Yes
1      No
2     Yes
3     Yes
4      No
5      No
6     Yes
7      No
8      No
9      No
10     No
11    Yes
12    Yes
13     No
14     No
15     No
16     No
17    Yes
18     No
19    Yes
20     No
21     No
22     No
23     No
24    Yes
25     No
26     No
27    Yes
28    Yes
29     No
30     No
31     No
32     No
33    Yes
34    Yes
35     No
36    Yes
37    Yes
38     No
39    Yes
40     No
41     No
42    Yes
43     No
44     No
45     No
46     No
47     No
48    Yes
49     No
Name: Do you have Depression?, dtype: object

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.25)
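
The split above uses `stratify=y` so both parts keep roughly the same share of `Yes` and `No`, but it sets no seed, so the rows in each part change on every run. A small sketch on made-up data showing the stratified split with a fixed `random_state` (the value 42 is arbitrary):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up data: 20 rows, 8 'Yes' and 12 'No'.
X_demo = pd.DataFrame({"feature": range(20)})
y_demo = pd.Series(["Yes"] * 8 + ["No"] * 12)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, stratify=y_demo, test_size=0.25, random_state=42
)

# Class proportions are preserved in both parts.
print(y_tr.value_counts().to_dict())
print(y_te.value_counts().to_dict())
```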

In [None]:
# create the AutoML object in Explain mode; results are written to the predDepression folder
automl = AutoML(results_path="predDepression", mode="Explain")


In [None]:
automl.fit(X_train, y_train)

AutoML directory: predDepression
The task is binary_classification with evaluation metric logloss
AutoML will use algorithms: ['Baseline', 'Linear', 'Decision Tree', 'Random Forest', 'Xgboost', 'Neural Network']
AutoML will ensemble available models
AutoML steps: ['simple_algorithms', 'default_algorithms', 'ensemble']
* Step simple_algorithms will try to check up to 3 models
1_Baseline logloss 0.659979 trained in 0.51 seconds
2_DecisionTree logloss 1.7814 trained in 11.23 seconds
3_Linear logloss 0.499367 trained in 3.38 seconds
* Step default_algorithms will try to check up to 3 models
4_Default_Xgboost logloss 0.57594 trained in 5.12 seconds
5_Default_NeuralNetwork logloss 0.500703 trained in 1.27 seconds
6_Default_RandomForest logloss 0.426778 trained in 6.87 seconds
* Step ensemble will try to check up to 1 model
Ensemble logloss 0.415842 trained in 0.44 seconds
AutoML fit time: 41.95 seconds
AutoML best model: Ensemble


AutoML(results_path='predDepression')

In [None]:
pred = automl.predict(X_test)
pred

array(['No', 'No', 'No', 'No', 'Yes', 'No', 'No', 'No', 'No', 'No', 'Yes',
       'No', 'No', 'No', 'Yes', 'No', 'No', 'No', 'No', 'Yes', 'Yes',
       'Yes', 'No', 'No', 'No', 'No'], dtype=object)
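
`accuracy_score` is imported at the top of the notebook but never used. A minimal sketch of how held-out predictions could be scored, with short made-up label lists standing in for `y_test` and `pred`:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels standing in for y_test and pred.
y_true_demo = ["No", "Yes", "No", "Yes", "No"]
y_pred_demo = ["No", "Yes", "No", "No", "No"]

# Fraction of predictions that match the true labels.
acc = accuracy_score(y_true_demo, y_pred_demo)

# Rows are true labels, columns are predicted labels, in the order given.
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=["No", "Yes"])

print(acc)
print(cm)
```

In the notebook the equivalent call would be `accuracy_score(y_test, pred)`.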

In [None]:
automl.report()

Best model,name,model_type,metric_type,metric_value,train_time
,1_Baseline,Baseline,logloss,0.659979,1.55
,2_DecisionTree,Decision Tree,logloss,1.7814,12.65
,3_Linear,Linear,logloss,0.499367,4.8
,4_Default_Xgboost,Xgboost,logloss,0.57594,6.77
,5_Default_NeuralNetwork,Neural Network,logloss,0.500703,2.55
,6_Default_RandomForest,Random Forest,logloss,0.426778,8.32
the best,Ensemble,Ensemble,logloss,0.415842,0.44

Unnamed: 0,score,threshold
logloss,0.426778,
auc,0.833333,
f1,0.8,0.333481
accuracy,0.842105,0.333481
precision,1.0,0.607837
recall,1.0,0.160236
mcc,0.676123,0.607837

Unnamed: 0,score,threshold
logloss,0.426778,
auc,0.833333,
f1,0.8,0.333481
accuracy,0.842105,0.333481
precision,0.75,0.333481
recall,0.857143,0.333481
mcc,0.674601,0.333481

Unnamed: 0,Predicted as No,Predicted as Yes
Labeled as No,10,2
Labeled as Yes,1,6

Unnamed: 0,score,threshold
logloss,1.7814,
auc,0.660714,
f1,0.727273,0.37931
accuracy,0.842105,0.37931
precision,1.0,0.37931
recall,0.714286,0.0
mcc,0.676123,0.37931

Unnamed: 0,score,threshold
logloss,1.7814,
auc,0.660714,
f1,0.727273,0.37931
accuracy,0.842105,0.37931
precision,1.0,0.37931
recall,0.571429,0.37931
mcc,0.676123,0.37931

Unnamed: 0,Predicted as No,Predicted as Yes
Labeled as No,12,0
Labeled as Yes,3,4

Unnamed: 0,score,threshold
logloss,0.500703,
auc,0.785714,
f1,0.769231,0.310576
accuracy,0.842105,0.382559
precision,1.0,0.382559
recall,1.0,0.0539663
mcc,0.676123,0.382559

Unnamed: 0,score,threshold
logloss,0.500703,
auc,0.785714,
f1,0.727273,0.382559
accuracy,0.842105,0.382559
precision,1.0,0.382559
recall,0.571429,0.382559
mcc,0.676123,0.382559

Unnamed: 0,Predicted as No,Predicted as Yes
Labeled as No,12,0
Labeled as Yes,3,4

Unnamed: 0,score,threshold
logloss,0.57594,
auc,0.761905,
f1,0.833333,0.59002
accuracy,0.894737,0.59002
precision,1.0,0.59002
recall,1.0,0.137095
mcc,0.782461,0.59002

Unnamed: 0,score,threshold
logloss,0.57594,
auc,0.761905,
f1,0.833333,0.59002
accuracy,0.894737,0.59002
precision,1.0,0.59002
recall,0.714286,0.59002
mcc,0.782461,0.59002

Unnamed: 0,Predicted as No,Predicted as Yes
Labeled as No,12,0
Labeled as Yes,2,5

Unnamed: 0,score,threshold
logloss,0.659979,
auc,0.5,
f1,0.538462,0.305357
accuracy,0.368421,0.305357
precision,0.368421,0.305357
recall,1.0,0.305357
mcc,0.0,0.305357

Unnamed: 0,score,threshold
logloss,0.659979,
auc,0.5,
f1,0.538462,0.305357
accuracy,0.368421,0.305357
precision,0.368421,0.305357
recall,1.0,0.305357
mcc,0.0,0.305357

Unnamed: 0,Predicted as No,Predicted as Yes
Labeled as No,0,12
Labeled as Yes,0,7

Model,Weight
3_Linear,1
6_Default_RandomForest,2

Unnamed: 0,score,threshold
logloss,0.415842,
auc,0.809524,
f1,0.833333,0.382845
accuracy,0.894737,0.382845
precision,1.0,0.382845
recall,1.0,0.112398
mcc,0.782461,0.382845

Unnamed: 0,score,threshold
logloss,0.415842,
auc,0.809524,
f1,0.833333,0.382845
accuracy,0.894737,0.382845
precision,1.0,0.382845
recall,0.714286,0.382845
mcc,0.782461,0.382845

Unnamed: 0,Predicted as No,Predicted as Yes
Labeled as No,12,0
Labeled as Yes,2,5

Unnamed: 0,score,threshold
logloss,0.499367,
auc,0.77381,
f1,0.833333,0.573201
accuracy,0.894737,0.573201
precision,1.0,0.573201
recall,1.0,0.0167229
mcc,0.782461,0.573201

Unnamed: 0,score,threshold
logloss,0.499367,
auc,0.77381,
f1,0.833333,0.573201
accuracy,0.894737,0.573201
precision,1.0,0.573201
recall,0.714286,0.573201
mcc,0.782461,0.573201

Unnamed: 0,Predicted as No,Predicted as Yes
Labeled as No,12,0
Labeled as Yes,2,5

feature,Learner_1
Marital status,1.49327
Did you seek any specialist for a treatment?,0.646115
Do you have Anxiety?,0.539524
age,0.354082
Do you have Panic attack?,0.267781
Your current year of Study,0.0854077
Timestamp,0.0640348
What is your course?,0.0580102
What is your CGPA?,-0.326016
intercept,-0.723588
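
The thresholded metrics in the report can be reproduced by hand from a confusion matrix. The sketch below uses the counts from the Random Forest table above (10, 2, 1, 6) and treats `Yes` as the positive class; the variable names are illustrative.

```python
import math

tn, fp = 10, 2  # labeled No: predicted No / predicted Yes
fn, tp = 1, 6   # labeled Yes: predicted No / predicted Yes

precision = tp / (tp + fp)                   # share of predicted Yes that are correct
recall = tp / (tp + fn)                      # share of actual Yes that were found
accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all rows classified correctly
f1 = 2 * precision * recall / (precision + recall)
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(round(precision, 6), round(recall, 6), round(accuracy, 6), round(f1, 6), round(mcc, 6))
```

These agree with the values reported for that model at its selected threshold: precision 0.75, recall 0.857143, accuracy 0.842105, f1 0.8, and mcc 0.674601.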


## Experiment 2: Binary Classification of Age Group - Student Mental Health

In [None]:
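import numpy as np
import pandas as pd
from supervised.automl import AutoML

# features: every column except the binned age label
x_cols = [c for c in mentalHealth.columns if c != "age"]
X = mentalHealth[x_cols]
# target: the 'old' / 'young' age label created earlier
y = mentalHealth["age"]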

In [None]:
X

Unnamed: 0,Timestamp,Choose your gender,What is your course?,Your current year of Study,What is your CGPA?,Marital status,Do you have Depression?,Do you have Anxiety?,Do you have Panic attack?,Did you seek any specialist for a treatment?
0,8/7/2020 12:02,Female,Engineering,year 1,3.00 - 3.49,No,Yes,No,Yes,No
1,8/7/2020 12:04,Male,Islamic education,year 2,3.00 - 3.49,No,No,Yes,No,No
2,8/7/2020 12:05,Male,BIT,Year 1,3.00 - 3.49,No,Yes,Yes,Yes,No
3,8/7/2020 12:06,Female,Laws,year 3,3.00 - 3.49,Yes,Yes,No,No,No
4,8/7/2020 12:13,Male,Mathemathics,year 4,3.00 - 3.49,No,No,No,No,No
...,...,...,...,...,...,...,...,...,...,...
96,13/07/2020 19:56:49,Female,BCS,year 1,3.50 - 4.00,No,No,Yes,No,No
97,13/07/2020 21:21:42,Male,Engineering,Year 2,3.00 - 3.49,No,Yes,Yes,No,No
98,13/07/2020 21:22:56,Female,Nursing,Year 3,3.50 - 4.00,Yes,Yes,No,Yes,No
99,13/07/2020 21:23:57,Female,Pendidikan Islam,year 4,3.50 - 4.00,No,No,No,No,No


In [None]:
automl = AutoML()
automl.fit(X, y)

AutoML directory: AutoML_1
The task is binary_classification with evaluation metric logloss
AutoML will use algorithms: ['Baseline', 'Linear', 'Decision Tree', 'Random Forest', 'Xgboost', 'Neural Network']
AutoML will ensemble available models
AutoML steps: ['simple_algorithms', 'default_algorithms', 'ensemble']
* Step simple_algorithms will try to check up to 3 models
1_Baseline logloss 0.666549 trained in 1.12 seconds
2_DecisionTree logloss 6.047488 trained in 12.4 seconds
3_Linear logloss 0.680867 trained in 3.9 seconds
* Step default_algorithms will try to check up to 3 models
4_Default_Xgboost logloss 0.691865 trained in 5.45 seconds
5_Default_NeuralNetwork logloss 0.795969 trained in 1.37 seconds
6_Default_RandomForest logloss 0.697672 trained in 7.61 seconds
* Step ensemble will try to check up to 1 model
Ensemble logloss 0.650457 trained in 0.47 seconds
AutoML fit time: 45.18 seconds
AutoML best model: Ensemble


AutoML()
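
The log reports `binary_classification` rather than regression because the `age` target holds two string labels. A quick way to inspect a target before fitting is sketched below; `summarize_target` is a hypothetical helper and a simplification, not how mljar-supervised itself decides the task.

```python
import pandas as pd


def summarize_target(y):
    """Return (is_numeric, number_of_distinct_values) for a target Series."""
    return pd.api.types.is_numeric_dtype(y), y.nunique(dropna=True)


labels = pd.Series(["young", "old", "young", "young"])  # made-up binned target
years = pd.Series([18.0, 24.0, 19.0, 21.0])             # made-up numeric target

print(summarize_target(labels))  # non-numeric with two classes
print(summarize_target(years))   # numeric with several distinct values
```

A numeric target, such as the original `Age` column before binning, would be the more natural choice for a regression experiment.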

In [None]:
mentalHealth["predictions"] = automl.predict(X)

In [None]:
print("Predictions")
print(mentalHealth[["age", "predictions"]].head())

In [None]:
automl.report()

Best model,name,model_type,metric_type,metric_value,train_time
,1_Baseline,Baseline,logloss,0.666549,2.09
,2_DecisionTree,Decision Tree,logloss,6.04749,13.84
,3_Linear,Linear,logloss,0.680867,5.39
,4_Default_Xgboost,Xgboost,logloss,0.691865,6.97
,5_Default_NeuralNetwork,Neural Network,logloss,0.795969,2.65
,6_Default_RandomForest,Random Forest,logloss,0.697672,9.12
the best,Ensemble,Ensemble,logloss,0.650457,0.47

Unnamed: 0,score,threshold
logloss,0.697672,
auc,0.5375,
f1,0.761905,0.322857
accuracy,0.653846,0.412874
precision,1.0,0.662259
recall,1.0,0.322857
mcc,0.233074,0.412874

Unnamed: 0,score,threshold
logloss,0.697672,
auc,0.5375,
f1,0.742857,0.412874
accuracy,0.653846,0.412874
precision,0.684211,0.412874
recall,0.8125,0.412874
mcc,0.233074,0.412874

Unnamed: 0,Predicted as old,Predicted as young
Labeled as old,4,6
Labeled as young,3,13

Unnamed: 0,score,threshold
logloss,6.04749,
auc,0.4375,
f1,0.580645,0.0
accuracy,0.5,0.0
precision,0.6,0.0
recall,0.5625,0.0
mcc,0.0,0.0

Unnamed: 0,score,threshold
logloss,6.04749,
auc,0.4375,
f1,0.580645,0.0
accuracy,0.5,0.0
precision,0.6,0.0
recall,0.5625,0.0
mcc,-0.0369274,0.0

Unnamed: 0,Predicted as old,Predicted as young
Labeled as old,4,6
Labeled as young,7,9

Unnamed: 0,score,threshold
logloss,0.795969,
auc,0.56875,
f1,0.789474,0.435504
accuracy,0.692308,0.435504
precision,0.6875,0.652186
recall,1.0,0.0423568
mcc,0.320245,0.435504

Unnamed: 0,score,threshold
logloss,0.795969,
auc,0.56875,
f1,0.789474,0.435504
accuracy,0.692308,0.435504
precision,0.681818,0.435504
recall,0.9375,0.435504
mcc,0.320245,0.435504

Unnamed: 0,Predicted as old,Predicted as young
Labeled as old,3,7
Labeled as young,1,15

Unnamed: 0,score,threshold
logloss,0.691865,
auc,0.50625,
f1,0.761905,0.400577
accuracy,0.615385,0.400577
precision,1.0,0.554711
recall,1.0,0.400577
mcc,0.228218,0.554711

Unnamed: 0,score,threshold
logloss,0.691865,
auc,0.50625,
f1,0.761905,0.400577
accuracy,0.615385,0.400577
precision,0.615385,0.400577
recall,1.0,0.400577
mcc,0.0,0.400577

Unnamed: 0,Predicted as old,Predicted as young
Labeled as old,0,10
Labeled as young,0,16

Unnamed: 0,score,threshold
logloss,0.666549,
auc,0.5,
f1,0.761905,0.564
accuracy,0.615385,0.564
precision,0.615385,0.564
recall,1.0,0.564
mcc,0.0,0.564

Unnamed: 0,score,threshold
logloss,0.666549,
auc,0.5,
f1,0.761905,0.564
accuracy,0.615385,0.564
precision,0.615385,0.564
recall,1.0,0.564
mcc,0.0,0.564

Unnamed: 0,Predicted as old,Predicted as young
Labeled as old,0,10
Labeled as young,0,16

Model,Weight
1_Baseline,2
3_Linear,1
5_Default_NeuralNetwork,1
6_Default_RandomForest,1

Unnamed: 0,score,threshold
logloss,0.650457,
auc,0.55625,
f1,0.8,0.478872
accuracy,0.692308,0.478872
precision,1.0,0.701934
recall,1.0,0.364827
mcc,0.365148,0.478872

Unnamed: 0,score,threshold
logloss,0.650457,
auc,0.55625,
f1,0.8,0.478872
accuracy,0.692308,0.478872
precision,0.666667,0.478872
recall,1.0,0.478872
mcc,0.365148,0.478872

Unnamed: 0,Predicted as old,Predicted as young
Labeled as old,2,8
Labeled as young,0,16

Unnamed: 0,score,threshold
logloss,0.680867,
auc,0.55625,
f1,0.820513,0.407009
accuracy,0.730769,0.407009
precision,1.0,0.857863
recall,1.0,0.262164
mcc,0.456832,0.407009

Unnamed: 0,score,threshold
logloss,0.680867,
auc,0.55625,
f1,0.820513,0.407009
accuracy,0.730769,0.407009
precision,0.695652,0.407009
recall,1.0,0.407009
mcc,0.456832,0.407009

Unnamed: 0,Predicted as old,Predicted as young
Labeled as old,3,7
Labeled as young,0,16

feature,Learner_1
intercept,0.590994
Do you have Depression?,0.289116
Do you have Anxiety?,0.23822
What is your course?,0.0468643
Timestamp,-0.0757303
Did you seek any specialist for a treatment?,-0.14231
Do you have Panic attack?,-0.144068
What is your CGPA?,-0.152512
Choose your gender,-0.249647
Marital status,-0.401601


# Download outputs

In [None]:
!zip -r /content/predDepression.zip /content/predDepression
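
The same archive can also be built from Python with the standard library, which avoids depending on the `zip` command. The sketch below runs against a temporary stand-in folder so it works anywhere; in Colab the folder would be `/content/predDepression`.

```python
import os
import shutil
import tempfile

# Stand-in for the results folder, created in a temporary directory.
work_dir = tempfile.mkdtemp()
results_dir = os.path.join(work_dir, "predDepression")
os.makedirs(results_dir)
with open(os.path.join(results_dir, "README.md"), "w") as fh:
    fh.write("placeholder report\n")

# Writes <work_dir>/predDepression.zip containing the predDepression folder.
archive_path = shutil.make_archive(
    base_name=os.path.join(work_dir, "predDepression"),
    format="zip",
    root_dir=work_dir,
    base_dir="predDepression",
)

print(archive_path)
```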

