<a href="https://colab.research.google.com/github/alivarastepour/diabetes_prediction/blob/master/diabetes_prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Purpose of this notebook
This notebook aims to build a model that determines whether a person is prone to diabetes or not. Additionally, it seeks to identify a subset of features (risk factors) that can accurately predict the risk of diabetes. The weights of the optimal solution will be utilized in another project, where they will be applied to users' inputs in real time.

## Dataset
This notebook makes use of a subset of a larger dataset which aimed to collect uniform, state-specific data on preventive health practices and risk behaviors that are associated with chronic diseases, injuries, and preventable infectious diseases in the adult population. The subset used in this notebook can be accessed [here](https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset?select=diabetes_binary_5050split_health_indicators_BRFSS2015.csv).

In [1]:
import pandas as pd
import numpy as np
from google.colab import drive

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense,LeakyReLU,Dropout
from keras.optimizers import Adagrad, RMSprop, Adam
from keras.regularizers import l1, l2
from keras.initializers import he_normal
from keras.activations import selu
from keras.metrics import Precision, Recall
from sklearn.decomposition import PCA

In [2]:
drive.mount('/drive')
DATASET_ADDRESS = '/drive/MyDrive/diabetes_info.csv'
raw_dataset = pd.read_csv(DATASET_ADDRESS)

Mounted at /drive


In [24]:
raw_dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 70692 entries, 0 to 70691
Data columns (total 22 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Diabetes_binary       70692 non-null  float64
 1   HighBP                70692 non-null  float64
 2   HighChol              70692 non-null  float64
 3   CholCheck             70692 non-null  float64
 4   BMI                   70692 non-null  float64
 5   Smoker                70692 non-null  float64
 6   Stroke                70692 non-null  float64
 7   HeartDiseaseorAttack  70692 non-null  float64
 8   PhysActivity          70692 non-null  float64
 9   Fruits                70692 non-null  float64
 10  Veggies               70692 non-null  float64
 11  HvyAlcoholConsump     70692 non-null  float64
 12  AnyHealthcare         70692 non-null  float64
 13  NoDocbcCost           70692 non-null  float64
 14  GenHlth               70692 non-null  float64
 15  MentHlth           

In [26]:
raw_dataset = raw_dataset.drop(columns=['Income']) # Income unit used in this research is USD which is not scalable to Rials. So we ignore it.

In [27]:
y = raw_dataset["Diabetes_binary"]
x = raw_dataset.drop(columns=["Diabetes_binary"])

In [28]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.20, random_state=9)

# Model selection
While our data may appear relatively clean, this does not guarantee optimal performance. Therefore, we must leverage a range of machine learning models to assess their effectiveness and identify potential modifications to the original data that can enhance the performance of our models.

## First model: Gradient boost classifier
Boosting algorithms have been widely recognized as effective choices for handling tabular data. Among them, gradient boosting stands out as a prominent technique that leverages decision trees to create a powerful ensemble model. Nonetheless, to ensure its optimal performance, careful consideration should be given to hyperparameter tuning.

In [29]:
def get_data(dataset):
  y = dataset["Diabetes_binary"]
  x = dataset.drop(columns=["Diabetes_binary"])
  return train_test_split(x, y, test_size=0.20, random_state=9)

In [30]:
def gradient_boost_classifier_model(dataset, learning_rate=0.05, n_estimators=150, subsample=0.8):
  x_train, x_test, y_train, y_test = get_data(dataset)
  reg = GradientBoostingClassifier(random_state=90,
                                loss='deviance',  # renamed to 'log_loss' in scikit-learn 1.1; 'deviance' was removed in 1.3
                                learning_rate=learning_rate,
                                n_estimators=n_estimators,
                                subsample=subsample,
                                criterion='friedman_mse',
                                verbose=2,
                                )
  reg.fit(x_train, y_train)
  y_pred = reg.predict(x_test)
  report = classification_report(y_test, y_pred)
  print(report)

In [31]:
gradient_boost_classifier_model(raw_dataset)



      Iter       Train Loss      OOB Improve   Remaining Time 
         1           1.3619           0.0245            8.04s
         2           1.3400           0.0224            7.90s
         3           1.3200           0.0203            7.82s
         4           1.3008           0.0178            7.93s
         5           1.2841           0.0163            7.84s
         6           1.2685           0.0151            7.79s
         7           1.2545           0.0136            7.71s
         8           1.2417           0.0127            7.77s
         9           1.2303           0.0117            7.68s
        10           1.2191           0.0112            7.65s
        11           1.2090           0.0101            7.71s
        12           1.1982           0.0100            7.63s
        13           1.1890           0.0094            7.55s
        14           1.1793           0.0088            7.48s
        15           1.1707           0.0078            7.42s
       

## The deviance loss
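Deviance (log loss) is a commonly used loss function in binary classification problems. A glance at its formula shows why:

$$
L(y, p) = -\left(y \log(p) + (1 - y) \log(1 - p)\right)
$$

where $y$ is the true class (0 or 1) and $p$ is the predicted probability of the positive class. The loss is small when the model assigns a high probability to the true class and grows without bound as that probability approaches zero.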
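To make the behavior of this loss concrete, here is a minimal sketch in plain Python. It uses the conventional leading minus sign so that lower values are better; the function name `log_loss` and the `eps` clipping are assumptions added for illustration and are not part of the notebook.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary log loss. The eps clipping is an added safeguard against log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Two samples: one positive, one negative (hypothetical values).
confident_right = log_loss([1, 0], [0.9, 0.1])
uninformed = log_loss([1, 0], [0.5, 0.5])
confident_wrong = log_loss([1, 0], [0.1, 0.9])
print(confident_right, uninformed, confident_wrong)
```

An uninformed prediction of 0.5 costs log 2 (about 0.693) per sample. The training loss in the verbose output above starts near twice that value, which is consistent with the deviance being reported as twice the negative log-likelihood; that is a reading of the log, not something the notebook states.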




## F-1 score
The F-1 score combines precision (the ratio of true positives to the sum of true positives and false positives) and recall (the ratio of true positives to the sum of true positives and false negatives) into a single value that balances the two:

$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}$$
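As a small worked example of the formula, the sketch below computes precision, recall and F-1 from raw counts. The counts are hypothetical and the helper name `precision_recall_f1` is introduced only for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F-1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts, not taken from the notebook's results.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F-1 is a harmonic mean, it sits closer to the lower of the two scores, so a model cannot reach a high F-1 by maximizing precision or recall alone.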


In [32]:
# scores = cross_val_score(reg, x_train, y_train, cv=5, scoring='f1_macro')
# print("cross validation scores(F-1) where k=5: ", scores)

In [33]:
# scores = cross_val_score(reg, x_train, y_train, cv=10, scoring='f1_macro')
# print("cross validation scores(F-1) where k=10: ", scores)
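The two commented-out cells above refer to `reg`, which only exists inside `gradient_boost_classifier_model`, so they would not run as written. A minimal sketch of the same check follows, assuming the estimator is created at top level. The synthetic data from `make_classification` and the reduced `n_estimators` are stand-ins chosen to keep the example fast; in the notebook one would pass `x_train` and `y_train` instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (an assumption); the notebook would use x_train, y_train.
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

clf = GradientBoostingClassifier(n_estimators=20, learning_rate=0.05,
                                 subsample=0.8, random_state=90)

# One macro-averaged F-1 score per fold.
scores = cross_val_score(clf, X_demo, y_demo, cv=5, scoring="f1_macro")
print("cross validation scores (F-1) where k=5:", scores)
```

Comparing the spread of the per-fold scores with the single held-out score indicates how much of the result depends on the particular split.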

## Initial evaluation result
As demonstrated above, whether employing Gradient Boosting with or without cross-validation, the F1 score hovers around 0.75. While this performance is acceptable, there is room for improvement.

# Second model: Logistic regression
Although Logistic Regression is a simpler, linear model compared to ensemble methods, it remains a very common choice for classification problems. It offers several distinct advantages: strong interpretability, coefficients that give insight into feature importance, and class probabilities in addition to binary predictions. This probabilistic output can prove particularly valuable in certain situations.







In [34]:
def logistic_regression_model(dataset):
  x_train, x_test, y_train, y_test = get_data(dataset)
  log_reg = LogisticRegression(random_state=32, solver='sag', multi_class='multinomial', verbose=2, max_iter=500)
  log_reg.fit(x_train, y_train)
  y_pred_log_reg = log_reg.predict(x_test)
  report_log_reg = classification_report(y_test, y_pred_log_reg)
  print(log_reg.coef_)
  print(report_log_reg)

In [35]:
logistic_regression_model(raw_dataset)

convergence after 183 epochs took 8 seconds
[[ 0.37257845  0.28995716  0.66909954  0.0368971  -0.00249678  0.09599408
   0.11823623 -0.01565236 -0.01979998 -0.05413245 -0.37604701  0.00626531
   0.02791119  0.30086121 -0.00214749 -0.00390029  0.08630184  0.11340987
   0.07794295 -0.03393694]]
              precision    recall  f1-score   support

         0.0       0.75      0.73      0.74      7010
         1.0       0.74      0.76      0.75      7129

    accuracy                           0.74     14139
   macro avg       0.74      0.74      0.74     14139
weighted avg       0.74      0.74      0.74     14139



## Evaluation result
Logistic regression exhibited slightly lower performance compared to Gradient Boosting, indicating that additional data preprocessing may be necessary to enhance model outcomes.

### Checking for class imbalance

In [36]:
np.bincount(y)

array([35346, 35346])

## Standardizing features
In this section we standardize the features whose value ranges might otherwise mislead our models.

In [37]:
columns_to_standardize = list(x.keys())

In [38]:
scaler = StandardScaler()
standarized_features = scaler.fit_transform(raw_dataset[columns_to_standardize])
standardized_dataset = pd.DataFrame()
standardized_dataset["Diabetes_binary"] = raw_dataset["Diabetes_binary"]
standardized_dataset[columns_to_standardize] = standarized_features
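The cell above fits `StandardScaler` on every row before `get_data` splits the data, so the test rows influence the scaling statistics. Below is a minimal sketch of the alternative, assuming one prefers to compute the statistics on the training rows only. The array is synthetic and the 80/20 cut is only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical feature matrix standing in for the notebook's features.
features = rng.normal(loc=30.0, scale=7.0, size=(100, 3))
train, test = features[:80], features[80:]

# Mean and standard deviation come from the training rows only.
mean = train.mean(axis=0)
std = train.std(axis=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
```

In scikit-learn terms this corresponds to calling `fit_transform` on `x_train` and `transform` on `x_test`. With a dataset of this size the difference in metrics is likely small, but the held-out score is then a cleaner estimate.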

In [39]:
gradient_boost_classifier_model(standardized_dataset)



      Iter       Train Loss      OOB Improve   Remaining Time 
         1           1.3619           0.0245            7.87s
         2           1.3400           0.0224            7.74s
         3           1.3200           0.0203            7.65s
         4           1.3008           0.0178            7.83s
         5           1.2841           0.0163            7.74s
         6           1.2685           0.0151            7.71s
         7           1.2545           0.0136            7.67s
         8           1.2417           0.0127            7.67s
         9           1.2303           0.0117            7.60s
        10           1.2191           0.0112            7.52s
        11           1.2090           0.0101            7.46s
        12           1.1982           0.0100            7.49s
        13           1.1890           0.0094            7.42s
        14           1.1793           0.0088            7.38s
        15           1.1707           0.0078            7.32s
       

In [40]:
logistic_regression_model(standardized_dataset)

convergence after 35 epochs took 1 seconds
[[ 0.18478788  0.14481089  0.10466094  0.26256455 -0.00122275  0.02319897
   0.04193551 -0.007137   -0.00966964 -0.02206856 -0.07615758  0.00134743
   0.00820435  0.33510984 -0.01749871 -0.03925738  0.03750007  0.05652744
   0.22233715 -0.03485117]]
              precision    recall  f1-score   support

         0.0       0.75      0.73      0.74      7010
         1.0       0.74      0.76      0.75      7129

    accuracy                           0.74     14139
   macro avg       0.75      0.74      0.74     14139
weighted avg       0.74      0.74      0.74     14139



## Normalizing features
Standardization helped our model converge faster (35 epochs instead of 183 for Logistic Regression), but it did not improve the evaluation metrics. Next, we try normalized (min-max scaled) data.

In [41]:
columns_to_normalize = list(x.keys())

In [42]:
min_max_scaler = MinMaxScaler()
normalized_features = min_max_scaler.fit_transform(raw_dataset[columns_to_normalize])
normalized_dataset = pd.DataFrame()
normalized_dataset["Diabetes_binary"] = raw_dataset["Diabetes_binary"]
normalized_dataset[columns_to_normalize] = normalized_features

In [43]:
gradient_boost_classifier_model(normalized_dataset)

      Iter       Train Loss      OOB Improve   Remaining Time 
         1           1.3619           0.0245            9.25s
         2           1.3400           0.0224            8.53s
         3           1.3200           0.0203            8.36s
         4           1.3008           0.0178            8.38s
         5           1.2841           0.0163            8.32s
         6           1.2685           0.0151            8.12s
         7           1.2545           0.0136            8.06s
         8           1.2417           0.0127            7.98s
         9           1.2303           0.0117            7.90s
        10           1.2191           0.0112            7.79s
        11           1.2090           0.0101            7.69s
        12           1.1982           0.0100            7.62s
        13           1.1890           0.0094            7.55s
        14           1.1793           0.0088            7.45s
        15           1.1707           0.0078            7.36s
        16           1.1641           0.0067            7.31s
        

In [44]:
logistic_regression_model(normalized_dataset)

convergence after 47 epochs took 2 seconds
[[ 3.73490615e-01  2.90048149e-01  6.70926918e-01  3.13847638e+00
  -2.39854135e-03  9.56291752e-02  1.18219842e-01 -1.60231524e-02
  -1.99615981e-02 -5.40853570e-02 -3.76418229e-01  6.72929731e-03
   2.79537211e-02  1.20272703e+00 -6.45089519e-02 -1.16871052e-01
   8.72398803e-02  1.13443496e-01  9.31422712e-01 -1.69375700e-01]]
              precision    recall  f1-score   support

         0.0       0.75      0.73      0.74      7010
         1.0       0.74      0.76      0.75      7129

    accuracy                           0.74     14139
   macro avg       0.75      0.74      0.74     14139
weighted avg       0.74      0.74      0.74     14139



# What happened?
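It turns out that Logistic Regression and Gradient Boosting are largely insensitive to feature scale. Gradient Boosting builds decision trees, whose splits depend only on the ordering of feature values, so any monotonic rescaling leaves the splits unchanged. Logistic Regression is a linear model, so rescaling a feature is largely absorbed by its coefficient (compare the coefficient magnitudes printed above); scaling mainly affects how quickly the solver converges. So we have to find another way to reach our goal.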
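The part of this explanation that concerns decision trees can be checked on a toy example. The sketch below is a simplified single-feature split search written for illustration; it is not scikit-learn's implementation, and the values are hypothetical. It picks the threshold with the lowest Gini impurity and shows that min-max scaling the feature selects the same partition of the rows.

```python
def best_split_partition(values, labels):
    """Return the set of row indices on the left side of the best Gini split."""
    def gini(idx):
        if not idx:
            return 0.0
        p = sum(labels[i] for i in idx) / len(idx)
        return 2 * p * (1 - p)

    order = sorted(range(len(values)), key=lambda i: values[i])
    best = None
    for cut in range(1, len(order)):
        # A threshold cannot separate two equal values.
        if values[order[cut - 1]] == values[order[cut]]:
            continue
        left, right = order[:cut], order[cut:]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if best is None or score < best[0]:
            best = (score, frozenset(left))
    return best[1]

bmi = [18.0, 22.0, 25.0, 31.0, 35.0, 40.0]  # hypothetical values
target = [0, 0, 0, 1, 1, 1]
lo, hi = min(bmi), max(bmi)
bmi_scaled = [(v - lo) / (hi - lo) for v in bmi]  # min-max scaling

same = best_split_partition(bmi, target) == best_split_partition(bmi_scaled, target)
print(same)
```

The threshold value itself changes with the scale, but the rows that fall on each side do not, which is why the training loss columns of the three Gradient Boosting runs above are identical.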

# Next model: DNN
Neural networks excel at finding complex relationships between features. In addition, they can be combined with various components, e.g. optimizers and regularizers, to improve the model's behavior even further.

In [45]:
def dnn_model(dataset):
  x_train, x_test, y_train, y_test = get_data(dataset)
  model = Sequential()
  model.add(Dense(64, input_dim=x_train.shape[1], activation=LeakyReLU(alpha=0.1), kernel_initializer=he_normal()))
  model.add(Dropout(0.5))
  model.add(Dense(128, activation='relu'))
  model.add(Dense(32, activation='relu', kernel_regularizer=l1(0.1)))
  model.add(Dense(1, activation='sigmoid'))
  optimizer = Adagrad(learning_rate=0.1)  # Adagrad is used here, not Adam
  model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[Precision(), Recall()])
  model.fit(x_train, y_train, epochs=100, verbose=2, validation_split=0.1, batch_size=100)
  res = model.evaluate(x_test, y_test)
  # res[0] is the total loss, which also includes the L1 regularization penalty
  print("binary cross-entropy loss : ", res[0], " precision: ", res[1], " recall: ", res[2])

In [46]:
dnn_model(standardized_dataset)

Epoch 1/100
509/509 - 2s - loss: 2.3990 - precision_1: 0.6433 - recall_1: 0.5997 - val_loss: 1.4802 - val_precision_1: 0.7731 - val_recall_1: 0.6284 - 2s/epoch - 4ms/step
Epoch 2/100
509/509 - 1s - loss: 1.3142 - precision_1: 0.7220 - recall_1: 0.7479 - val_loss: 1.1847 - val_precision_1: 0.7292 - val_recall_1: 0.7798 - 1s/epoch - 2ms/step
Epoch 3/100
509/509 - 1s - loss: 1.1248 - precision_1: 0.7175 - recall_1: 0.7837 - val_loss: 1.0481 - val_precision_1: 0.7037 - val_recall_1: 0.8435 - 1s/epoch - 2ms/step
Epoch 4/100
509/509 - 1s - loss: 1.0250 - precision_1: 0.7157 - recall_1: 0.7965 - val_loss: 0.9885 - val_precision_1: 0.6973 - val_recall_1: 0.8593 - 1s/epoch - 2ms/step
Epoch 5/100
509/509 - 1s - loss: 0.9607 - precision_1: 0.7148 - recall_1: 0.8081 - val_loss: 0.9263 - val_precision_1: 0.7064 - val_recall_1: 0.8392 - 1s/epoch - 3ms/step
Epoch 6/100
509/509 - 2s - loss: 0.9196 - precision_1: 0.7128 - recall_1: 0.8163 - val_loss: 0.8906 - val_precision_1: 0.7027 - val_recall_1: 0.8

## Result
As we saw, switching models does not significantly improve the results, so we may need to make some changes to our data.

## The correlation matrix and its usage
A correlation matrix summarizes the pairwise linear relationships between the columns of a dataset. The correlation coefficient ranges between -1 and 1. A correlation coefficient of 1 indicates a perfect positive correlation, meaning that the two variables increase or decrease together in a linear relationship. A correlation coefficient of -1 indicates a perfect negative correlation, meaning that the two variables move in opposite directions in a linear relationship. A correlation coefficient close to 0 suggests no linear relationship between the variables.

This matrix can be helpful when finding an optimal subset of features.
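To illustrate what the coefficient measures and how a threshold on it selects features, here is a minimal sketch using NumPy. The columns and labels are hypothetical, and the helper `pearson` is written out for illustration; pandas' `DataFrame.corr()` computes the same Pearson coefficient by default.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c, b_c = a - a.mean(), b - b.mean()
    return float((a_c * b_c).sum() / np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum()))

target = [0, 0, 1, 1, 1, 0]  # hypothetical labels
columns = {
    "same_direction": [1, 2, 8, 9, 7, 3],
    "opposite_direction": [9, 8, 2, 1, 3, 7],
    "unrelated": [5, 5, 4, 6, 5, 5],
}

correlations = {name: pearson(values, target) for name, values in columns.items()}

# Keep columns whose absolute correlation with the target exceeds a cut-off.
threshold = 0.25
kept = [name for name, r in correlations.items() if abs(r) > threshold]
print(correlations, kept)
```

Filtering on the absolute value matters: a strongly negative coefficient is as informative as a strongly positive one.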

In [47]:
def sort_correlations(correlation_matrix):
  return sorted(correlation_matrix.items(), key=lambda x:abs(x[1]))

In [48]:
def get_correlations(dataset):
  columns = dataset.keys()
  correlation = dataset[columns].corr()
  return correlation["Diabetes_binary"]

In [49]:
correlation_map = sort_correlations(get_correlations(raw_dataset))

In [50]:
correlation_map

[('AnyHealthcare', 0.02319074853112824),
 ('NoDocbcCost', 0.040976573266643494),
 ('Sex', 0.044412858371260695),
 ('Fruits', -0.05407655628666651),
 ('Veggies', -0.07929314561269872),
 ('Smoker', 0.08599896420800192),
 ('MentHlth', 0.08702877147509416),
 ('HvyAlcoholConsump', -0.09485313995926549),
 ('CholCheck', 0.11538161710270915),
 ('Stroke', 0.12542678468516733),
 ('PhysActivity', -0.15866560486405157),
 ('Education', -0.17048063498806143),
 ('HeartDiseaseorAttack', 0.21152340436022687),
 ('PhysHlth', 0.21308101903810317),
 ('DiffWalk', 0.272646006159808),
 ('Age', 0.27873806628188813),
 ('HighChol', 0.28921280708865016),
 ('BMI', 0.29337274476103575),
 ('HighBP', 0.3815155489073117),
 ('GenHlth', 0.4076115984949182),
 ('Diabetes_binary', 1.0)]

In [51]:
keep_features = map(lambda b: b[0], filter(lambda a: abs(a[1]) > 0.25, correlation_map))

In [52]:
modified_dataset = raw_dataset[list(keep_features)]

In [53]:
logistic_regression_model(modified_dataset)

convergence after 38 epochs took 1 seconds
[[0.0702186  0.08500237 0.29923042 0.03782045 0.39012718 0.30067847]]
              precision    recall  f1-score   support

         0.0       0.75      0.73      0.74      7010
         1.0       0.74      0.76      0.75      7129

    accuracy                           0.74     14139
   macro avg       0.74      0.74      0.74     14139
weighted avg       0.74      0.74      0.74     14139



# Feature selection result: Logistic regression
By reducing the feature count using the correlation matrix and keeping only the features that have a more meaningful relationship with the target feature (absolute correlation above 0.25), Logistic Regression not only converged faster but also kept its accuracy.

In [54]:
gradient_boost_classifier_model(modified_dataset)

      Iter       Train Loss      OOB Improve   Remaining Time 
         1           1.3619           0.0245            5.40s
         2           1.3400           0.0224            4.98s
         3           1.3200           0.0203            4.92s
         4           1.3008           0.0178            4.82s
         5           1.2841           0.0163            4.73s
         6           1.2685           0.0151            4.68s
         7           1.2545           0.0136            4.68s
         8           1.2417           0.0127            4.63s
         9           1.2303           0.0117            4.58s
        10           1.2191           0.0112            4.55s
        11           1.2090           0.0101            4.60s
        12           1.1982           0.0100            4.54s
        13           1.1890           0.0094            4.55s
        14           1.1793           0.0088            4.54s
        15           1.1707           0.0078            4.49s
        16           1.1641           0.0067            4.45s
        17           1.1571           0.0076            4.42s
        18           1.1502           0.0062            4.37s
        

# Feature selection result: GradientBoost
GradientBoost was also able to maintain its performance after feature selection. It is worth noting that tuning hyperparameters had only a mild, negligible effect on this model.

In [55]:
dnn_model(modified_dataset)

Epoch 1/100
509/509 - 3s - loss: 3.0580 - precision_2: 0.5315 - recall_2: 0.4983 - val_loss: 1.4246 - val_precision_2: 0.5755 - val_recall_2: 0.9180 - 3s/epoch - 6ms/step
Epoch 2/100
509/509 - 1s - loss: 1.3094 - precision_2: 0.5808 - recall_2: 0.6524 - val_loss: 1.2035 - val_precision_2: 0.7547 - val_recall_2: 0.5861 - 1s/epoch - 2ms/step
Epoch 3/100
509/509 - 1s - loss: 1.1172 - precision_2: 0.6209 - recall_2: 0.6806 - val_loss: 1.0288 - val_precision_2: 0.7063 - val_recall_2: 0.7809 - 1s/epoch - 2ms/step
Epoch 4/100
509/509 - 1s - loss: 1.0026 - precision_2: 0.6494 - recall_2: 0.6794 - val_loss: 0.9491 - val_precision_2: 0.7062 - val_recall_2: 0.8056 - 1s/epoch - 2ms/step
Epoch 5/100
509/509 - 1s - loss: 0.9300 - precision_2: 0.6736 - recall_2: 0.6876 - val_loss: 0.8605 - val_precision_2: 0.6928 - val_recall_2: 0.8407 - 1s/epoch - 2ms/step
Epoch 6/100
509/509 - 1s - loss: 0.8742 - precision_2: 0.6897 - recall_2: 0.6908 - val_loss: 0.8314 - val_precision_2: 0.7063 - val_recall_2: 0.8

# Feature selection result: DNN
DNN also proved to be consistent after feature selection. It even showed a mild improvement (which is, again, negligible).

# Feature selection overall result
To conclude, we were able to predict our target feature with acceptable accuracy even after selecting a subset of our features. The remaining features, which proved to be decisive, are: DiffWalk, Age, HighChol, BMI, HighBP, GenHlth.


In [57]:
def get_feature_name(count):
  return ["f{c}".format(c=i) for i in range(count)]

In [58]:
component_count = 15

all_features = list(raw_dataset.keys())

all_features.remove("Diabetes_binary")

pca = PCA(n_components = component_count)

pca_columns = pca.fit_transform(standardized_dataset[all_features])

In [59]:
pca_dataset = pd.DataFrame()
pca_dataset["Diabetes_binary"] = standardized_dataset["Diabetes_binary"]
pca_dataset[get_feature_name(component_count)] =  pca_columns

In [60]:
logistic_regression_model(pca_dataset)

convergence after 35 epochs took 1 seconds
[[ 0.34523619 -0.2000573   0.0320172  -0.12004041  0.15654393  0.13973331
  -0.01753243  0.02543494 -0.02066987  0.03206145  0.02802083  0.05011259
  -0.00294837  0.02060696 -0.00771329]]
              precision    recall  f1-score   support

         0.0       0.74      0.73      0.74      7010
         1.0       0.74      0.76      0.75      7129

    accuracy                           0.74     14139
   macro avg       0.74      0.74      0.74     14139
weighted avg       0.74      0.74      0.74     14139



In [61]:
gradient_boost_classifier_model(pca_dataset)

      Iter       Train Loss      OOB Improve   Remaining Time 
         1           1.3608           0.0256           49.52s
         2           1.3377           0.0230           49.12s
         3           1.3170           0.0212           48.21s
         4           1.2966           0.0187           47.60s
         5           1.2799           0.0176           47.13s
         6           1.2633           0.0157           46.84s
         7           1.2483           0.0145           46.40s
         8           1.2349           0.0135           46.00s
         9           1.2225           0.0127           45.82s
        10           1.2112           0.0114           45.52s
        11           1.1991           0.0105           45.09s
        12           1.1910           0.0099           44.77s
        13           1.1813           0.0091           44.37s
        14           1.1718           0.0081           44.00s
        15           1.1634           0.0076           43.68s
        16           1.1561           0.0073           43.32s
        

In [62]:
dnn_model(pca_dataset)

Epoch 1/100
509/509 - 2s - loss: 2.3796 - precision_3: 0.6652 - recall_3: 0.7258 - val_loss: 1.4643 - val_precision_3: 0.6958 - val_recall_3: 0.8428 - 2s/epoch - 4ms/step
Epoch 2/100
509/509 - 1s - loss: 1.3018 - precision_3: 0.7009 - recall_3: 0.8050 - val_loss: 1.1626 - val_precision_3: 0.7156 - val_recall_3: 0.8170 - 1s/epoch - 2ms/step
Epoch 3/100
509/509 - 1s - loss: 1.1179 - precision_3: 0.7057 - recall_3: 0.8109 - val_loss: 1.0455 - val_precision_3: 0.7190 - val_recall_3: 0.8106 - 1s/epoch - 2ms/step
Epoch 4/100
509/509 - 1s - loss: 1.0198 - precision_3: 0.7054 - recall_3: 0.8101 - val_loss: 0.9676 - val_precision_3: 0.7048 - val_recall_3: 0.8342 - 1s/epoch - 2ms/step
Epoch 5/100
509/509 - 1s - loss: 0.9596 - precision_3: 0.7064 - recall_3: 0.8102 - val_loss: 0.9235 - val_precision_3: 0.7047 - val_recall_3: 0.8339 - 1s/epoch - 2ms/step
Epoch 6/100
509/509 - 1s - loss: 0.9125 - precision_3: 0.7080 - recall_3: 0.8127 - val_loss: 0.8833 - val_precision_3: 0.6910 - val_recall_3: 0.8

# Feature reduction: PCA
Principal component analysis (PCA) is a technique that reduces the feature count while preserving as much of the information as possible in fewer dimensions. It mainly aims to retain the variance of the original data in fewer columns; here we keep 15 components. The results, however, did not show any significant improvement, suggesting that the important relationships between the features and the target are already captured.
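One way to see how much variance a given number of components retains is to inspect the explained variance ratio. The sketch below computes it with a singular value decomposition in NumPy on synthetic data; the data, the 95% target and the variable names are assumptions for illustration and are not taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 rows, 5 columns, the last nearly duplicating the first.
base = rng.normal(size=(200, 4))
data = np.column_stack([base, base[:, 0] + 0.05 * rng.normal(size=200)])
data = (data - data.mean(axis=0)) / data.std(axis=0)  # standardize, as in the notebook

# PCA via singular value decomposition of the centered data.
_, singular_values, components = np.linalg.svd(data, full_matrices=False)
explained_variance_ratio = singular_values ** 2 / np.sum(singular_values ** 2)
cumulative = np.cumsum(explained_variance_ratio)

# Smallest number of components that retains at least 95% of the variance.
n_keep = int(np.searchsorted(cumulative, 0.95) + 1)
projected = data @ components[:n_keep].T
print(explained_variance_ratio.round(3), n_keep)
```

With scikit-learn, the fitted `pca.explained_variance_ratio_` attribute carries the same information, which would help justify a choice such as `component_count = 15`.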



# Final result
For the purpose of our research, we will use the following weights obtained from the Logistic Regression model and apply them to users' inputs.

### feature selected model (weights in the order DiffWalk, Age, HighChol, BMI, HighBP, GenHlth):
$$[0.0702186,  0.08500237, 0.29923042, 0.03782045, 0.39012718, 0.30067847]$$


### original model:
$$[ 0.37257845,  0.28995716,  0.66909954,  0.0368971,  -0.00249678,  0.09599408,
   0.11823623, -0.01565236, -0.01979998, -0.05413245, -0.37604701,  0.00626531,
   0.02791119,  0.30086121, -0.00214749, -0.00390029,  0.08630184,  0.11340987,
   0.07794295, -0.03393694]$$
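A minimal sketch of how the feature-selected weights might be applied to a single user's input follows. Several things here are assumptions: the feature order is inferred from how `modified_dataset` was built, the intercept is a placeholder because the notebook prints only `log_reg.coef_` and not `log_reg.intercept_`, and the example input is hypothetical. Inputs are expected in the same raw units as the training data, since this model was fitted on unscaled features.

```python
import math

# Order inferred from the sorted correlation list (an assumption).
FEATURES = ["DiffWalk", "Age", "HighChol", "BMI", "HighBP", "GenHlth"]
WEIGHTS = [0.0702186, 0.08500237, 0.29923042, 0.03782045, 0.39012718, 0.30067847]
INTERCEPT = 0.0  # placeholder: the fitted intercept is not shown in the notebook

def predict_probability(user_input, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic regression prediction: sigmoid of the weighted sum plus intercept."""
    z = intercept + sum(w * user_input[name] for w, name in zip(weights, FEATURES))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical user, with values encoded as in the dataset.
example_user = {"DiffWalk": 0, "Age": 9, "HighChol": 1, "BMI": 31, "HighBP": 1, "GenHlth": 3}
probability = predict_probability(example_user)
print(probability)
```

Without the real intercept the returned probabilities are not calibrated, so the fitted `log_reg.intercept_` would need to be exported together with the weights before using this in the other project.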