# ML Pipeline Preparation
Follow the instructions below to help you create your ML pipeline.
### 1. Import libraries and load data from database.
- Import Python libraries
- Load dataset from database with [`read_sql_table`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql_table.html)
- Define feature and target variables X and Y

In [1]:
import pandas as pd
import numpy as np
from sqlalchemy import create_engine

import warnings
warnings.simplefilter('ignore')

In [2]:
# load data from database
engine = create_engine('sqlite:///DisasterMsgTable.db')
print (engine.table_names())

df = pd.read_sql_table('DisasterMsgTable', engine)
df.head()

X = df['message']
Y = df.drop(['message', 'original', 'genre'], axis = 1)
X.head()

['DisasterMsgTable']


0    Weather update - a cold front from Cuba that c...
1              Is the Hurricane over or is it not over
2                      Looking for someone but no name
3    UN reports Leogane 80-90 destroyed. Only Hospi...
4    says: west side of Haiti, rest of the country ...
Name: message, dtype: object
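
`engine.table_names()` was deprecated in SQLAlchemy 1.4 and, to our knowledge, is no longer available in 2.0. A minimal sketch of the inspector-based alternative follows. It uses a throwaway in-memory database and a hypothetical table name `demo_table`, neither of which is part of this project:

```python
from sqlalchemy import create_engine, inspect, text

# In-memory SQLite database used purely for illustration.
engine = create_engine("sqlite://")

# Create a hypothetical table so there is something to list.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE demo_table (id INTEGER PRIMARY KEY, message TEXT)"))

# The inspector lists table names without relying on Engine.table_names().
table_names = inspect(engine).get_table_names()
print(table_names)
```

For the notebook itself, the same `inspect(engine).get_table_names()` call should work on the engine created from `DisasterMsgTable.db`.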

In [3]:
Y.head()

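| | related | request | offer | aid_related | medical_help | medical_products | search_and_rescue | security | military | water | ... | aid_centers | other_infrastructure | weather_related | floods | storm | fire | earthquake | cold | other_weather | direct_report |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |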


### 2. Write a tokenization function to process your text data

In [4]:
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
from nltk.stem import PorterStemmer
import re
import nltk
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('stopwords')



# def tokenize(text):

#     tokens = word_tokenize(text)
#     lemmatizer = WordNetLemmatizer()

#     clean_tokens = []
#     for tok in tokens:
#         clean_tok = lemmatizer.lemmatize(tok).lower().strip()
#         clean_tokens.append(clean_tok)

#     return clean_tokens

def tokenize(text):
    """Normalize, tokenize and stem text string
    
    Args:
    text: string. String containing message for processing
       
    Returns:
    stemmed: list of strings. List containing normalized and stemmed word tokens
    """
    # Convert text to lowercase and remove punctuation
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower())
    
    # Tokenize words
    tokens = word_tokenize(text)
    
    # Stem word tokens and remove stop words
    stemmer = PorterStemmer()
    stop_words = set(stopwords.words("english"))  # set gives fast membership tests
    
    stemmed = [stemmer.stem(word) for word in tokens if word not in stop_words]
    
    return stemmed


[nltk_data] Downloading package punkt to
[nltk_data]     /home/brentweiliu/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     /home/brentweiliu/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     /home/brentweiliu/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
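
To see what the normalization step in `tokenize` does before stemming, here is a minimal sketch that uses only the standard library. It assumes a plain whitespace split as a stand-in for `word_tokenize`, and `DEMO_STOP_WORDS` is a deliberately tiny, hypothetical stop-word list rather than NLTK's English list:

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only;
# the notebook uses NLTK's full English list instead.
DEMO_STOP_WORDS = {"is", "the", "or", "it", "not"}

def normalize_and_split(text):
    """Lowercase, replace non-alphanumeric characters with spaces, split on whitespace."""
    cleaned = re.sub(r"[^a-zA-Z0-9]", " ", text.lower())
    return cleaned.split()

tokens = normalize_and_split("Is the Hurricane over, or is it not over?")
kept = [tok for tok in tokens if tok not in DEMO_STOP_WORDS]
print(tokens)
print(kept)
```

Punctuation is replaced before splitting, so `over,` and `over?` both become the token `over`.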


### 3. Build a machine learning pipeline
This machine learning pipeline should take in the `message` column as input and output classification results on the other 36 categories in the dataset. You may find the [MultiOutputClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.multioutput.MultiOutputClassifier.html) helpful for predicting multiple target variables.

In [5]:
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression


# pipeline = Pipeline([
#     ('vect', CountVectorizer(tokenizer=tokenize)),
#     ('tfidf', TfidfTransformer()),
#     ('clf', LogisticRegression(solver='liblinear', penalty = 'l2', C = 10)),
# ])
pipeline = Pipeline([
    ('vect', CountVectorizer(tokenizer = tokenize)),
    ('tfidf', TfidfTransformer()),
    ('clf', MultiOutputClassifier(RandomForestClassifier()))
])
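
As a rough illustration of what `MultiOutputClassifier` does, the sketch below fits it on toy data. The arrays are hypothetical, and `LogisticRegression` is used only as a lightweight stand-in for the random forest:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

# Toy data: 6 samples, 2 features, 3 binary target columns (all hypothetical).
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]], dtype=float)
Y_demo = np.array([
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
])

clf_demo = MultiOutputClassifier(LogisticRegression())
clf_demo.fit(X_demo, Y_demo)

# One fitted estimator per target column, and one prediction column per target.
n_fitted = len(clf_demo.estimators_)
pred_shape = clf_demo.predict(X_demo).shape
print(n_fitted, pred_shape)
```

Each target column gets its own independently fitted copy of the base estimator, which is why the per-category metrics later in this notebook can differ so widely.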


### 4. Train pipeline
- Split data into train and test sets
- Train pipeline

In [6]:
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X, Y,  random_state = 1)

np.random.seed(17)
pipeline.fit(X_train, Y_train)

Pipeline(memory=None,
         steps=[('vect',
                 CountVectorizer(analyzer='word', binary=False,
                                 decode_error='strict',
                                 dtype=<class 'numpy.int64'>, encoding='utf-8',
                                 input='content', lowercase=True, max_df=1.0,
                                 max_features=None, min_df=1,
                                 ngram_range=(1, 1), preprocessor=None,
                                 stop_words=None, strip_accents=None,
                                 token_pattern='(?u)\\b\\w\\w+\\b',
                                 tokenizer=<function tokenize at...
                 MultiOutputClassifier(estimator=RandomForestClassifier(bootstrap=True,
                                                                        class_weight=None,
                                                                        criterion='gini',
                                                                  

### 5. Test your model
Report the f1 score, precision and recall for each output category of the dataset. You can do this by iterating through the columns and calling sklearn's `classification_report` on each.

In [7]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, make_scorer

def get_eval_metrics(actual, predicted, col_names):
    """Calculate evaluation metrics for ML model
    
    Args:
    actual: array. Array containing actual labels.
    predicted: array. Array containing predicted labels.
    col_names: list of strings. List containing names for each of the predicted fields.
       
    Returns:
    metrics_df: dataframe. Dataframe containing the accuracy, precision, recall 
    and f1 score for a given set of actual and predicted labels.
    """
    metrics = []
    
    # Calculate evaluation metrics for each set of labels
    for i in range(len(col_names)):
        # print('i = %d, column name = %s: ' % (i, col_names[i]))
        # print(set(actual[:, i].tolist()))
        # print(set(predicted[:, i].tolist()))
              
        accuracy = accuracy_score(actual[:, i], predicted[:, i])
        precision = precision_score(actual[:, i], predicted[:, i])
        recall = recall_score(actual[:, i], predicted[:, i])
        f1 = f1_score(actual[:, i], predicted[:, i])
        
        metrics.append([accuracy, precision, recall, f1])
    
    # Create dataframe containing metrics
    metrics = np.array(metrics)
    metrics_df = pd.DataFrame(data = metrics, index = col_names, columns = ['Accuracy', 'Precision', 'Recall', 'F1'])
      
    return metrics_df
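
The instructions for this step mention `classification_report`, while the cell above computes each metric separately. A minimal sketch of the per-column report approach, with hypothetical labels and category names:

```python
import numpy as np
from sklearn.metrics import classification_report

# Hypothetical labels for two output categories.
demo_cols = ["water", "food"]
y_true_demo = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
y_pred_demo = np.array([[1, 0], [0, 0], [0, 1], [0, 0]])

reports = {}
for i, col in enumerate(demo_cols):
    # One text report per column, covering both the 0 and the 1 class.
    reports[col] = classification_report(y_true_demo[:, i], y_pred_demo[:, i])
    print(col)
    print(reports[col])
```

The text reports are convenient for reading, whereas a dataframe such as the one returned by `get_eval_metrics` is easier to summarize across categories.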

In [8]:

# Calculate evaluation metrics for training set
Y_train_pred = pipeline.predict(X_train)
col_names = list(Y.columns.values)

In [9]:
print(get_eval_metrics(np.array(Y_train), Y_train_pred, col_names))

                        Accuracy  Precision    Recall        F1
related                 0.989395   0.991848  0.994306  0.993075
request                 0.988678   0.997143  0.936773  0.966016
offer                   0.998822   1.000000  0.747253  0.855346
aid_related             0.984990   0.994103  0.969886  0.981845
medical_help            0.989908   0.998555  0.876347  0.933468
medical_products        0.992674   0.997650  0.857576  0.922325
search_and_rescue       0.994109   0.995624  0.801056  0.887805
security                0.995338   1.000000  0.737752  0.849088
military                0.995902   0.998205  0.875591  0.932886
water                   0.994518   1.000000  0.914809  0.955509
food                    0.995082   0.999051  0.957273  0.977716
shelter                 0.992828   0.998742  0.920046  0.957780
clothing                0.998412   0.992982  0.907051  0.948074
money                   0.995287   1.000000  0.800434  0.889157
missing_people          0.996670   1.000

In [10]:
# Calculate evaluation metrics for test set
Y_test_pred = pipeline.predict(X_test)

eval_metrics0 = get_eval_metrics(np.array(Y_test), Y_test_pred, col_names)
print(eval_metrics0)

                        Accuracy  Precision    Recall        F1
related                 0.806670   0.846369  0.913017  0.878431
request                 0.882434   0.794702  0.428189  0.556522
offer                   0.995851   0.000000  0.000000  0.000000
aid_related             0.743968   0.729101  0.606094  0.661932
medical_help            0.921623   0.484848  0.094675  0.158416
medical_products        0.953435   0.708333  0.105263  0.183288
search_and_rescue       0.975565   0.384615  0.032051  0.059172
security                0.980790   0.000000  0.000000  0.000000
military                0.965268   0.486486  0.080000  0.137405
water                   0.958506   0.872449  0.411058  0.558824
food                    0.929768   0.832500  0.460581  0.593054
shelter                 0.931919   0.811159  0.321429  0.460414
clothing                0.987552   0.772727  0.182796  0.295652
money                   0.978946   0.714286  0.069930  0.127389
missing_people          0.986937   0.000

**Although test accuracy is high for all categories, the F1 score is unacceptably low for the majority of them. This is likely due to the unbalanced nature of the dataset.**

In [11]:
# Calculate the proportion of rows in each column that have label == 1
Y.sum()/len(Y)

related                   0.764821
request                   0.171898
offer                     0.004534
aid_related               0.417259
medical_help              0.080071
medical_products          0.050448
search_and_rescue         0.027817
security                  0.018097
military                  0.033043
water                     0.064241
food                      0.112306
shelter                   0.088908
clothing                  0.015561
money                     0.023207
missing_people            0.011450
refugees                  0.033619
death                     0.045875
other_aid                 0.132401
infrastructure_related    0.065509
transport                 0.046144
buildings                 0.051216
electricity               0.020440
tools                     0.006109
hospitals                 0.010873
shops                     0.004611
aid_centers               0.011872
other_infrastructure      0.044223
weather_related           0.280363
floods              


**In many cases, fewer than 5% of the dataset have a label of 1, making it more difficult for any model to predict these cases than if the data were balanced.**
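
The gap between accuracy and F1 on rare labels can be reproduced with a few lines of plain Python. This sketch uses a hypothetical category with a 0.5% positive rate, roughly that of `offer`, and a "model" that never predicts the positive class:

```python
# Hypothetical category in which 5 of 1000 messages are positive (0.5%).
y_true = [1] * 5 + [0] * 995
y_pred = [0] * 1000  # a "model" that never predicts the positive class

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
# F1 = 2*TP / (2*TP + FP + FN); treated as 0 when the denominator is 0.
denom = 2 * tp + fp + fn
f1 = 2 * tp / denom if denom else 0.0

print(accuracy, f1)
```

Accuracy is 99.5% even though not a single positive message is found, which matches the pattern seen for `offer` and `security` in the test metrics.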

**Ideally, we would have used stratified sampling to create the train and test sets, as we would have done with a single target column. With multiple labels per data point, however, this is not practical: we would effectively need a separate train/test split for each label column, and therefore a separate model fitted to each. This is not something that we wish to do.**
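
For reference, this is what stratified sampling looks like in the single-label case. It is a sketch on hypothetical data using the `stratify` argument of `train_test_split`, and it is not applied to the multi-label `Y` in this notebook:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical single binary label: 20 positives out of 100 samples.
y_single = np.array([1] * 20 + [0] * 80)
X_single = np.arange(100).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_single, y_single, test_size=0.25, stratify=y_single, random_state=1
)

# Both splits keep the 20% positive rate of the full label vector.
print(y_tr.mean(), y_te.mean())
```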

### 6. Improve your model
Use grid search to find better parameters. 

In [12]:
# from sklearn.model_selection import GridSearchCV

# parameters = {
#     'clf__C':np.logspace(-3, 3, 7),
#     'clf__penalty':["l1", "l2"],
# }

# cv = GridSearchCV(pipeline, param_grid=parameters)

In [13]:
# Define performance metric for use in grid search scoring object
def performance_metric(y_true, y_pred):
    """Calculate median F1 score for all of the output classifiers
    
    Args:
    y_true: array. Array containing actual labels.
    y_pred: array. Array containing predicted labels.
        
    Returns:
    score: float. Median F1 score for all of the output classifiers
    """
    f1_list = []
    for i in range(np.shape(y_pred)[1]):
        f1 = f1_score(np.array(y_true)[:, i], y_pred[:, i])
        f1_list.append(f1)
        
    score = np.median(f1_list)
    return score

**We use the median F1 score across the output classifiers rather than the mean. The mean could favor a set of parameters for which a small number of output classifiers achieve very high test F1 scores while the majority have test F1 scores close to zero.**
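
A small numeric sketch of this argument, using hypothetical per-category F1 scores for two candidate parameter sets:

```python
from statistics import mean, median

# Hypothetical per-category F1 scores for two candidate parameter sets.
f1_lopsided = [0.90, 0.85, 0.0, 0.0, 0.0]     # two strong categories, three failures
f1_balanced = [0.35, 0.35, 0.35, 0.35, 0.35]  # uniformly moderate

print(mean(f1_lopsided), median(f1_lopsided))
print(mean(f1_balanced), median(f1_balanced))
```

The two sets have essentially the same mean, but the median of the lopsided set is 0, so a median-based scorer prefers the balanced one.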

In [14]:
# Create grid search object
from sklearn.model_selection import GridSearchCV

parameters = {'vect__min_df': [1, 5],
              'tfidf__use_idf':[True, False],
              'clf__estimator__n_estimators':[10, 25], 
              'clf__estimator__min_samples_split':[2, 5, 10]}

scorer = make_scorer(performance_metric)
cv = GridSearchCV(pipeline, param_grid = parameters, scoring = scorer, verbose = 10, n_jobs=-1)

# Find best parameters
np.random.seed(81)
tuned_model = cv.fit(X_train, Y_train)

Fitting 3 folds for each of 24 candidates, totalling 72 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 72 concurrent workers.
[Parallel(n_jobs=-1)]: Done   9 out of  72 | elapsed:   53.1s remaining:  6.2min
[Parallel(n_jobs=-1)]: Done  17 out of  72 | elapsed:   54.7s remaining:  2.9min
[Parallel(n_jobs=-1)]: Done  25 out of  72 | elapsed:   57.4s remaining:  1.8min
[Parallel(n_jobs=-1)]: Done  33 out of  72 | elapsed:  1.0min remaining:  1.2min
[Parallel(n_jobs=-1)]: Done  41 out of  72 | elapsed:  1.2min remaining:   53.8s
[Parallel(n_jobs=-1)]: Done  49 out of  72 | elapsed:  1.3min remaining:   35.6s
[Parallel(n_jobs=-1)]: Done  57 out of  72 | elapsed:  1.3min remaining:   20.9s
[Parallel(n_jobs=-1)]: Done  65 out of  72 | elapsed:  1.4min remaining:    8.9s
[Parallel(n_jobs=-1)]: Done  72 out of  72 | elapsed:  1.6min finished
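
The "24 candidates, totalling 72 fits" line in the log follows directly from the size of the parameter grid. A short sketch of the arithmetic, using the same grid and the three folds reported in the log:

```python
from itertools import product

# Same grid as in the cell above.
parameters = {
    'vect__min_df': [1, 5],
    'tfidf__use_idf': [True, False],
    'clf__estimator__n_estimators': [10, 25],
    'clf__estimator__min_samples_split': [2, 5, 10],
}
n_folds = 3  # fold count reported in the log line above

# Every combination of one value per parameter is one candidate.
candidates = list(product(*parameters.values()))
total_fits = len(candidates) * n_folds
print(len(candidates), total_fits)
```

Adding one more value to any parameter list multiplies the number of fits, which is worth checking before launching a long search.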


In [15]:
# Get results of grid search
tuned_model.cv_results_

{'mean_fit_time': array([49.75646981, 41.71367431, 52.10406407, 40.67489314, 79.73380041,
        69.28228609, 81.60107239, 66.34092601, 43.89232047, 38.45403671,
        44.54406667, 35.77454575, 68.82605537, 61.37929074, 69.57968942,
        58.81435362, 39.47179643, 36.14848375, 39.22461669, 34.27843849,
        63.55856673, 59.04025277, 63.3405714 , 54.9704864 ]),
 'std_fit_time': array([0.73410527, 1.02485632, 1.05668388, 2.10480295, 1.59412029,
        0.71234027, 0.72786868, 1.48536871, 2.30182607, 1.25929051,
        0.70387495, 1.3318639 , 0.76183933, 1.40847385, 0.5479497 ,
        0.9573096 , 0.04178935, 0.57179853, 0.97240575, 0.58372636,
        1.02458517, 0.57146631, 1.31269251, 0.3579317 ]),
 'mean_score_time': array([6.71614742, 9.12595487, 6.58848612, 9.34241589, 6.52641622,
        6.30644051, 6.43791699, 6.3919127 , 8.82245994, 9.75359186,
        9.19991104, 9.51265915, 6.96683002, 6.80851165, 6.82072139,
        6.96014365, 9.47890178, 9.64799746, 9.66719723, 9.60

In [16]:
# Best mean test score
np.max(tuned_model.cv_results_['mean_test_score'])

0.2145514063275451

In [17]:
# Parameters for best mean test score
tuned_model.best_params_

{'clf__estimator__min_samples_split': 10,
 'clf__estimator__n_estimators': 10,
 'tfidf__use_idf': False,
 'vect__min_df': 5}

**The best results (with regard to median F1 score) were achieved using the following parameters:**
* CountVectorizer minimum df = 5
* TfidfTransformer use_idf = False
* Random Forest Classifier number of estimators = 10
* Random Forest Classifier minimum samples split = 10
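
The double-underscore names in the grid (for example `clf__estimator__n_estimators`) address parameters of nested objects. A sketch that lists them for a pipeline with the same step names; the custom tokenizer is left out here because it does not affect parameter naming:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Same step names as the notebook's pipeline, default settings otherwise.
demo_pipeline = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultiOutputClassifier(RandomForestClassifier())),
])

# '<step>__<param>' addresses a step's parameter; a further '__' descends
# into a nested estimator such as the one wrapped by MultiOutputClassifier.
param_names = sorted(demo_pipeline.get_params().keys())
forest_params = [name for name in param_names if name.startswith('clf__estimator__')]
print(forest_params)
```

Any name printed by `get_params()` can be used as a key in `param_grid`.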

### 7. Test your model
Show the accuracy, precision, and recall of the tuned model.  

Since this project focuses on code quality, process, and pipelines, there is no minimum performance metric needed to pass. However, make sure to fine-tune your models for accuracy, precision and recall to make your project stand out, especially for your portfolio!

In [18]:
# vect = CountVectorizer(tokenizer=tokenize)
# tfidf = TfidfTransformer()
# clf = LogisticRegression(solver='liblinear', penalty = 'l2', C = 1.0)

# # train classifier
# X_train_counts = vect.fit_transform(X_train)
# X_train_tfidf = tfidf.fit_transform(X_train_counts)
# clf.fit(X_train_tfidf, y_train)

# # predict on test data
# X_test_counts = vect.transform(X_test)
# X_test_tfidf = tfidf.transform(X_test_counts)
# y_pred = clf.predict(X_test_tfidf)
# print(classification_report(y_test, y_pred))

In [19]:
# Calculate evaluation metrics for test set
tuned_pred_test = tuned_model.predict(X_test)
eval_metrics1 = get_eval_metrics(np.array(Y_test), tuned_pred_test, col_names)

print(eval_metrics1)

                        Accuracy  Precision    Recall        F1
related                 0.812510   0.842883  0.927883  0.883343
request                 0.893653   0.799163  0.511151  0.623504
offer                   0.995851   0.000000  0.000000  0.000000
aid_related             0.760873   0.717516  0.695652  0.706415
medical_help            0.924082   0.539877  0.173570  0.262687
medical_products        0.954357   0.696970  0.142415  0.236504
search_and_rescue       0.977255   0.750000  0.076923  0.139535
security                0.980329   0.166667  0.008065  0.015385
military                0.966805   0.573770  0.155556  0.244755
water                   0.962348   0.832685  0.514423  0.635958
food                    0.938374   0.786477  0.611342  0.687938
shelter                 0.937759   0.765217  0.448980  0.565916
clothing                0.986322   0.750000  0.064516  0.118812
money                   0.979560   0.750000  0.104895  0.184049
missing_people          0.987091   0.000

In [20]:
# Get summary stats for first model
eval_metrics0.describe()

| | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| count | 35.0 | 35.0 | 35.0 | 35.0 |
| mean | 0.942229 | 0.522256 | 0.191362 | 0.247618 |
| std | 0.057977 | 0.325009 | 0.242447 | 0.269097 |
| min | 0.743968 | 0.0 | 0.0 | 0.0 |
| 25% | 0.931382 | 0.339367 | 0.013772 | 0.026614 |
| 50% | 0.958353 | 0.608696 | 0.079832 | 0.142857 |
| 75% | 0.980867 | 0.79979 | 0.349428 | 0.49233 |
| max | 0.995851 | 0.899807 | 0.913017 | 0.878431 |


In [21]:
# Get summary stats for tuned model
eval_metrics1.describe()

| | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| count | 35.0 | 35.0 | 35.0 | 35.0 |
| mean | 0.945325 | 0.561451 | 0.234279 | 0.288786 |
| std | 0.054515 | 0.310008 | 0.272747 | 0.28683 |
| min | 0.760873 | 0.0 | 0.0 | 0.0 |
| 25% | 0.938067 | 0.363971 | 0.022978 | 0.044437 |
| 50% | 0.962195 | 0.69697 | 0.104895 | 0.184049 |
| 75% | 0.98079 | 0.775847 | 0.469808 | 0.593753 |
| max | 0.995851 | 1.0 | 0.927883 | 0.883343 |


**Tuning the model parameters increased both the median and the mean test F1 score of the output classifiers (median 0.143 to 0.184, mean 0.248 to 0.289). However, 50% of the output classifiers still have an F1 score below 0.19, and 25% have an F1 score below 0.045. This is due to low recall values (recall being the proportion of positive points that were correctly labelled). Ideally, we would like to improve on this.**
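
Low recall drags F1 down because F1 is the harmonic mean of precision and recall. A small sketch with a hypothetical helper `f1_from`, checked against the `medical_help` row of the tuned model's test metrics:

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall reported above for `medical_help` with the tuned model.
f1_medical_help = f1_from(0.539877, 0.173570)
print(round(f1_medical_help, 4))
```

The result agrees with the F1 value in the table to within rounding, and it sits much closer to the recall of about 0.17 than to the precision of about 0.54.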

### 8. Try improving your model further. Here are a few ideas:
* try other machine learning algorithms
* add other features besides the TF-IDF

**To try to improve the model further, we will change the Random Forest Classifier in the pipeline to a logistic regression classifier.**

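**To keep the number of grid search cases to a minimum, we fix the CountVectorizer and TfidfTransformer parameters (`min_df = 1`, `use_idf = False`) instead of searching over them. Note that `use_idf = False` matches the tuned value from the previous section, whereas `min_df = 1` differs from the tuned value of 5.**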

In [35]:

pipeline2 = Pipeline([
    ('vect', CountVectorizer(tokenizer = tokenize)),
    ('tfidf', TfidfTransformer()),
    ('clf', MultiOutputClassifier(LogisticRegression())),
])

parameters2 = {
    'vect__min_df': [1],
    'tfidf__use_idf': [False],
    # 'clf__estimator__multi_class': ['ovr'],
    'clf__estimator__random_state': [25],
    'clf__estimator__C': [0.5, 1.0, 10],
    'clf__estimator__penalty': ["l1", "l2"],
    # 'clf__estimator__solver': ['lbfgs', 'liblinear'],
    'clf__estimator__solver': ['liblinear'],  # liblinear supports both l1 and l2
}

cv2 = GridSearchCV(pipeline2, param_grid = parameters2, scoring = scorer, verbose = 10, n_jobs=-1)

# Find best parameters
np.random.seed(71)
tuned_model2 = cv2.fit(X_train, Y_train)

Fitting 3 folds for each of 6 candidates, totalling 18 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 72 concurrent workers.
[Parallel(n_jobs=-1)]: Done   3 out of  18 | elapsed:   17.2s remaining:  1.4min
[Parallel(n_jobs=-1)]: Done   5 out of  18 | elapsed:   17.2s remaining:   44.8s
[Parallel(n_jobs=-1)]: Done   7 out of  18 | elapsed:   17.5s remaining:   27.5s
[Parallel(n_jobs=-1)]: Done   9 out of  18 | elapsed:   17.6s remaining:   17.6s
[Parallel(n_jobs=-1)]: Done  11 out of  18 | elapsed:   17.7s remaining:   11.3s
[Parallel(n_jobs=-1)]: Done  13 out of  18 | elapsed:   18.5s remaining:    7.1s
[Parallel(n_jobs=-1)]: Done  15 out of  18 | elapsed:   18.9s remaining:    3.8s
[Parallel(n_jobs=-1)]: Done  18 out of  18 | elapsed:   38.0s finished


In [36]:
# Get results of grid search
tuned_model2.cv_results_

{'mean_fit_time': array([ 9.12806288,  9.01719022,  9.37700446,  9.3848896 , 18.44134355,
        10.33322056]),
 'std_fit_time': array([0.04935794, 0.11206399, 0.05422918, 0.14644104, 8.14050223,
        0.15219437]),
 'mean_score_time': array([3.66478848, 3.66345747, 3.65770229, 3.62783742, 3.52898645,
        3.62877727]),
 'std_score_time': array([0.06111087, 0.04947788, 0.04100299, 0.03896517, 0.17647244,
        0.0610117 ]),
 'param_clf__estimator__C': masked_array(data=[0.5, 0.5, 1.0, 1.0, 10, 10],
              mask=[False, False, False, False, False, False],
        fill_value='?',
             dtype=object),
 'param_clf__estimator__penalty': masked_array(data=['l1', 'l2', 'l1', 'l2', 'l1', 'l2'],
              mask=[False, False, False, False, False, False],
        fill_value='?',
             dtype=object),
 'param_clf__estimator__random_state': masked_array(data=[25, 25, 25, 25, 25, 25],
              mask=[False, False, False, False, False, False],
        fill_value='?'

In [37]:
# Parameters for best mean test score
tuned_model2.best_params_

{'clf__estimator__C': 10,
 'clf__estimator__penalty': 'l1',
 'clf__estimator__random_state': 25,
 'clf__estimator__solver': 'liblinear',
 'tfidf__use_idf': False,
 'vect__min_df': 1}

In [38]:
# Calculate evaluation metrics for test set
tuned_pred_test2 = tuned_model2.predict(X_test)

eval_metrics2 = get_eval_metrics(np.array(Y_test), tuned_pred_test2, col_names)

print(eval_metrics2)

                        Accuracy  Precision    Recall        F1
related                 0.810973   0.869917  0.885295  0.877539
request                 0.886430   0.694898  0.607493  0.648263
offer                   0.995390   0.200000  0.037037  0.062500
aid_related             0.751806   0.707402  0.681903  0.694418
medical_help            0.917935   0.460870  0.313609  0.373239
medical_products        0.949439   0.487069  0.349845  0.407207
search_and_rescue       0.973721   0.407407  0.211538  0.278481
security                0.976641   0.195652  0.072581  0.105882
military                0.967573   0.548611  0.351111  0.428184
water                   0.963117   0.740437  0.651442  0.693095
food                    0.939911   0.756966  0.676349  0.714390
shelter                 0.934225   0.652091  0.583333  0.615799
clothing                0.988167   0.625000  0.430108  0.509554
money                   0.976487   0.455357  0.356643  0.400000
missing_people          0.986322   0.407

In [39]:
# Get summary stats for tuned random forest model
eval_metrics1.describe()

| | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| count | 35.0 | 35.0 | 35.0 | 35.0 |
| mean | 0.945325 | 0.561451 | 0.234279 | 0.288786 |
| std | 0.054515 | 0.310008 | 0.272747 | 0.28683 |
| min | 0.760873 | 0.0 | 0.0 | 0.0 |
| 25% | 0.938067 | 0.363971 | 0.022978 | 0.044437 |
| 50% | 0.962195 | 0.69697 | 0.104895 | 0.184049 |
| 75% | 0.98079 | 0.775847 | 0.469808 | 0.593753 |
| max | 0.995851 | 1.0 | 0.927883 | 0.883343 |


In [40]:
# Get summary stats for tuned logistic regression model
eval_metrics2.describe()

| | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| count | 35.0 | 35.0 | 35.0 | 35.0 |
| mean | 0.941931 | 0.501246 | 0.36578 | 0.415513 |
| std | 0.056825 | 0.221372 | 0.237429 | 0.235843 |
| min | 0.751806 | 0.0 | 0.0 | 0.0 |
| 25% | 0.935685 | 0.407407 | 0.186819 | 0.24289 |
| 50% | 0.961887 | 0.5 | 0.321429 | 0.4 |
| 75% | 0.980406 | 0.679364 | 0.567884 | 0.619739 |
| max | 0.99539 | 0.869917 | 0.885295 | 0.877539 |


### <span style="color:red">On average, the logistic regression model achieves a higher F1 score than the random forest model (mean 0.416 vs. 0.289, median 0.400 vs. 0.184).</span>


### 9. Export your model as a pickle file

In [44]:
# Pickle best model
import pickle
# Use a context manager so the file is closed once the model is written
with open('disaster_classification_model.sav', 'wb') as f:
    pickle.dump(tuned_model2, f)
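
A minimal round-trip sketch of saving and reloading with `pickle`. The dictionary is a hypothetical stand-in for the fitted `tuned_model2`, and a temporary path is used instead of the real `.sav` file:

```python
import os
import pickle
import tempfile

# Stand-in object; in the notebook this would be the fitted `tuned_model2`.
demo_model = {"name": "stand-in model", "C": 10}

path = os.path.join(tempfile.mkdtemp(), "demo_model.sav")

# Context managers make sure the file handle is closed after each step.
with open(path, "wb") as f:
    pickle.dump(demo_model, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == demo_model)
```

Because pickle stores functions by reference rather than by value, code that loads the real model would presumably need the custom `tokenize` function to be importable from the same module path it had at training time.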

### 10. Use this notebook to complete `train.py`
Use the template file attached in the Resources folder to write a script that runs the steps above to create a database and export a model based on a new dataset specified by the user.