## Checklist

- [ ] app

    - [ ] template

        - [ ] master.html  # main page of web app

        - [ ] go.html  # classification result page of web app

    - [ ] run.py  # Flask file that runs app

- [ ] data

    - [x] process_data.py

    - [ ] InsertDatabaseName.db   # database to save clean data to

- [ ] models

    - [ ] train_classifier.py

    - [ ] classifier.pkl  # saved model 

- [ ] README.md

## File structure

The coding for this project can be completed using the Project Workspace IDE provided. Here's the file structure of the project:

- app

    - template

        - master.html  # main page of web app

        - go.html  # classification result page of web app

    - run.py  # Flask file that runs app

- data

    - disaster_categories.csv  # data to process 

    - disaster_messages.csv  # data to process

    - process_data.py

    - InsertDatabaseName.db   # database to save clean data to

- models

    - train_classifier.py

    - classifier.pkl  # saved model 

- README.md

# Project Components
There are three components you'll need to complete for this project.

## 1. ETL Pipeline

In a Python script, process_data.py, write a data cleaning pipeline that:

- Loads the messages and categories datasets
- Merges the two datasets
- Cleans the data
- Stores it in a SQLite database

#### Project Workspace - ETL
The first part of your data pipeline is the Extract, Transform, and Load (ETL) process. Here, you will read the dataset, clean the data, and then store it in a SQLite database. We expect you to do the data cleaning with pandas. To load the data into the SQLite database, you can use the pandas dataframe .to_sql() method together with an SQLAlchemy engine.

Feel free to do some exploratory data analysis in order to figure out how you want to clean the data set. Though you do not need to submit this exploratory data analysis as part of your project, you'll need to include your cleaning code in the final ETL script, process_data.py.
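The ETL steps described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the project template: it assumes both CSV files share an `id` column, treats exact duplicate rows as the only cleaning step, and uses a hypothetical table name `messages`. To keep dependencies low it writes through a standard-library `sqlite3` connection, which `.to_sql()` also accepts, rather than an SQLAlchemy engine. The two small dataframes are made-up stand-ins for `disaster_messages.csv` and `disaster_categories.csv`.

```python
import sqlite3

import pandas as pd


def clean_data(messages, categories):
    """Merge the two datasets on `id` (assumed shared key) and drop duplicate rows."""
    df = messages.merge(categories, on="id", how="inner")
    return df.drop_duplicates().reset_index(drop=True)


def save_data(df, connection, table_name="messages"):
    """Write the cleaned dataframe to a SQLite table, replacing any previous copy."""
    df.to_sql(table_name, connection, index=False, if_exists="replace")


# Made-up stand-ins for the two CSV files (pd.read_csv would load the real ones).
messages = pd.DataFrame(
    {"id": [1, 2, 2, 3], "message": ["need water", "send help", "send help", "all fine"]}
)
categories = pd.DataFrame(
    {"id": [1, 2, 3], "categories": ["water", "request", "related"]}
)

cleaned = clean_data(messages, categories)

# ":memory:" keeps the demo self-contained; a real run would pass a .db file path.
connection = sqlite3.connect(":memory:")
save_data(cleaned, connection)
stored = pd.read_sql("SELECT * FROM messages", connection)
connection.close()

print(stored)
```

Reading the table back, as in the last lines, is a quick way to check that the cleaned data actually reached the database before moving on to the model.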

#### Project Workspace - Machine Learning Pipeline

For the machine learning portion, you will split the data into a training set and a test set. Then, you will create a machine learning pipeline that uses NLTK, as well as scikit-learn's Pipeline and GridSearchCV to output a final model that uses the message column to predict classifications for 36 categories (multi-output classification). Finally, you will export your model to a pickle file. After completing the notebook, you'll need to include your final machine learning code in train_classifier.py.

### Data Pipelines: Python Scripts

After you complete the notebooks for the ETL and machine learning pipeline, you'll need to transfer your work into Python scripts, process_data.py and train_classifier.py. If someone later brings a revised or new dataset of messages, they should be able to create a new model just by running your code. These Python scripts should accept additional command-line arguments specifying the files used for the data and model.

Example:

    python process_data.py disaster_messages.csv disaster_categories.csv DisasterResponse.db

    python train_classifier.py ../data/DisasterResponse.db classifier.pkl

Templates for these scripts are provided in the Resources section, as well as the Project Workspace IDE. The code for handling these arguments on the command line is given to you in the templates.
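The templates' own argument handling is not reproduced here. As a rough illustration of the idea, the sketch below parses the three file paths from the `process_data.py` example with the standard library's `argparse`; the function name and argument names are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the three positional file paths used by the example command."""
    parser = argparse.ArgumentParser(
        description="Clean disaster messages and save them to a SQLite database."
    )
    parser.add_argument("messages_filepath", help="CSV file with the messages")
    parser.add_argument("categories_filepath", help="CSV file with the categories")
    parser.add_argument("database_filepath", help="SQLite database file to write")
    return parser.parse_args(argv)


# Passing a list mirrors the example command line; argv=None would read sys.argv.
args = parse_args(
    ["disaster_messages.csv", "disaster_categories.csv", "DisasterResponse.db"]
)
print(args.messages_filepath, args.categories_filepath, args.database_filepath)
```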

## 2. ML Pipeline

In a Python script, train_classifier.py, write a machine learning pipeline that:

- Loads data from the SQLite database
- Splits the dataset into training and test sets
- Builds a text processing and machine learning pipeline
- Trains and tunes a model using GridSearchCV
- Outputs results on the test set
- Exports the final model as a pickle file
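The steps above can be sketched on toy data as follows. This is an illustration under assumptions rather than a reference solution: the messages and the two label columns are made up (the project has 36), `TfidfVectorizer`'s default tokenizer stands in for NLTK-based text processing, the random forest and its parameter grid are arbitrary choices, and the data is defined inline instead of being loaded from the SQLite database.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_model():
    """Text-processing and classification pipeline wrapped in a small grid search."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(
            RandomForestClassifier(n_estimators=10, random_state=0))),
    ])
    param_grid = {"clf__estimator__max_depth": [2, None]}
    return GridSearchCV(pipeline, param_grid=param_grid, cv=2)


# Made-up toy data: 2 output categories instead of the project's 36.
X = np.array(["need water now", "water please", "send food", "food and water",
              "we need a doctor", "doctor needed", "all fine here",
              "nothing to report"] * 2)
Y = np.array([[1, 0], [1, 0], [1, 0], [1, 0],
              [0, 1], [0, 1], [0, 0], [0, 0]] * 2)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0)

model = build_model()
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)
print("Test accuracy per category:", (Y_pred == Y_test).mean(axis=0))

# Round-trip through pickle in memory; a real script would write classifier.pkl.
restored = pickle.loads(pickle.dumps(model))
```

Prefixing grid keys with the step name (`clf__estimator__...`) is how parameters of a nested estimator are reached through a `Pipeline` and a `MultiOutputClassifier`.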



## 3. Flask Web App

We are providing much of the Flask web app for you, but feel free to add extra features depending on your knowledge of Flask, HTML, CSS and JavaScript. For this part, you'll need to:

- Modify file paths for database and model as needed
- Add data visualizations using Plotly in the web app. One example is provided for you
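One possible way to add a visualization is to build a figure description in `run.py` and hand it to the template as JSON. The sketch below builds a bar chart of message counts per category as a plain dict in the `data`/`layout` shape that Plotly figures use, and serializes it with the standard library. It is an assumption that the provided template renders figures from JSON in this way; the function name and the counts are made up for illustration.

```python
import json


def category_bar_figure(category_counts):
    """Return a Plotly-style figure dict (bar chart) for a {category: count} mapping."""
    names = list(category_counts)
    return {
        "data": [
            {"type": "bar", "x": names, "y": [category_counts[n] for n in names]}
        ],
        "layout": {
            "title": {"text": "Messages per category"},
            "xaxis": {"title": {"text": "Category"}},
            "yaxis": {"title": {"text": "Count"}},
        },
    }


# Made-up counts, for illustration only.
figure = category_bar_figure({"related": 12, "request": 5, "water": 3})
figure_json = json.dumps(figure)
print(figure_json)
```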



In [None]:
# Might be helpful: duplicates
# Look at the rows whose message text appears more than once
ids = messages['message']
display(messages[ids.isin(ids[ids.duplicated()])].sort_values(by=['message']).head(n=10))


# Look at the rows with identical IDs
ids = messages['id']
display(messages[ids.isin(ids[ids.duplicated()])].sort_values(by=['id']).head(n=10))


### 5. Test your model
Report the f1 score, precision and recall for each output category of the dataset. You can do this by iterating through the columns and calling sklearn's `classification_report` on each.

In [44]:
def model_metrics(actual, predicted, col_names):
    '''
    Return accuracy, precision, recall and F1 score for each output category of the dataset.

    Parameters:
    actual (np.array): Array of actual Y values
    predicted (np.array): Array of predicted Y values
    col_names (list): List containing names for each of the predicted fields.

    Returns:
    df (pd.DataFrame): Dataframe with one row per category and the columns
        Accuracy, Precision, Recall and F1
    '''
    metrics = []
    
    # Calculate evaluation metrics for each set of labels
    for i in range(len(col_names)):
        accuracy = accuracy_score(actual[:, i], predicted[:, i])
        precision = precision_score(actual[:, i], predicted[:, i])
        recall = recall_score(actual[:, i], predicted[:, i])
        f1 = f1_score(actual[:, i], predicted[:, i])
        
        metrics.append([accuracy, precision, recall, f1])
    
    # Create dataframe containing metrics
    metrics = np.array(metrics)
    df = pd.DataFrame(data = metrics, index = col_names, columns = ['Accuracy', 'Precision', 'Recall', 'F1'])
      
    return df

In [24]:
%%time
Y_pred = pipeline.predict (X_test)

In [54]:
%%time
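# Predict on the training set; these metrics show how well the model fits
# the data it was trained on, not how well it generalizes
col_names = list(Y.columns.values)
Y_train_pred = pipeline.predict(X_train)
metrics_df = model_metrics(np.array(Y_train), Y_train_pred, col_names)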

Wall time: 48.1 s


In [46]:
metrics_df

| Category | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| related | 0.999129 | 0.999265 | 0.999599 | 0.999432 |
| request | 0.999641 | 1.0 | 0.997923 | 0.99896 |
| offer | 0.999898 | 1.0 | 0.978261 | 0.989011 |
| aid_related | 0.999385 | 0.999631 | 0.998895 | 0.999263 |
| medical_help | 0.999539 | 1.0 | 0.994163 | 0.997073 |
| medical_products | 0.999744 | 1.0 | 0.994824 | 0.997405 |
| search_and_rescue | 0.999795 | 1.0 | 0.992481 | 0.996226 |
| security | 0.999846 | 1.0 | 0.991124 | 0.995542 |
| military | 0.999846 | 0.998423 | 0.99685 | 0.997636 |
| water | 0.999949 | 1.0 | 0.999195 | 0.999597 |


In [37]:
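def multioutput_fscore(y_true, y_pred, beta=1):
    # Geometric mean of the per-column weighted F-beta scores
    # (needs fbeta_score from sklearn.metrics and gmean from scipy.stats)
    score_list = []
    if isinstance(y_pred, pd.DataFrame):
        y_pred = y_pred.values
    if isinstance(y_true, pd.DataFrame):
        y_true = y_true.values
    for column in range(y_true.shape[1]):
        # beta must be passed by keyword in recent scikit-learn versions
        score = fbeta_score(y_true[:, column], y_pred[:, column], beta=beta, average='weighted')
        score_list.append(score)
    f1score_numpy = np.asarray(score_list)
    # Columns with a perfect score of 1 are left out of the mean
    f1score_numpy = f1score_numpy[f1score_numpy < 1]
    f1score = gmean(f1score_numpy)
    return f1score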

In [42]:
multi_f1 = multioutput_fscore(Y_test, Y_pred, beta=1)
# Compare the underlying arrays: comparing the two dataframes directly raises
# "ValueError: Can only compare identically-labeled DataFrame objects"
# when their row labels differ.
overall_accuracy = (np.asarray(Y_pred) == np.asarray(Y_test)).mean()

print('Average overall accuracy {0:.2f}% \n'.format(overall_accuracy*100))
print('F1 score (custom definition) {0:.2f}%\n'.format(multi_f1*100))

In [27]:
#converting to dataframe
Y_pred = pd.DataFrame(Y_pred, columns = Y_test.columns)
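The sketch below illustrates one likely reason a direct `Y_pred == Y_test` comparison can fail after this conversion: `train_test_split` shuffles rows, so `Y_test` keeps its original row labels while the new dataframe gets a fresh 0..n-1 index. The data is made up, and it is an assumption that the mismatch lies in the index rather than in the columns.

```python
import numpy as np
import pandas as pd

# Made-up test labels whose index reflects shuffled row positions.
Y_test = pd.DataFrame({"related": [1, 0, 1], "request": [0, 0, 1]}, index=[7, 2, 5])
predictions = np.array([[1, 0], [0, 1], [1, 1]])

# Same columns but a default 0..2 index: pandas refuses to compare the two frames.
Y_pred = pd.DataFrame(predictions, columns=Y_test.columns)
try:
    Y_pred == Y_test
    comparison_failed = False
except ValueError:
    comparison_failed = True

# Reusing the test index labels the rows identically, so the comparison works.
Y_pred_aligned = pd.DataFrame(predictions, columns=Y_test.columns, index=Y_test.index)
overall_accuracy = (Y_pred_aligned == Y_test).mean().mean()
print(comparison_failed, overall_accuracy)
```

Passing `index=Y_test.index` when building the dataframe keeps predictions and labels aligned row by row, which also helps if the two are later joined or compared column by column.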

In [21]:
# Report precision, recall, F1 and accuracy for each category.
Y_pred_values = np.asarray(Y_pred)  # works whether Y_pred is an array or a dataframe
for i in range(len(Y.columns)):
    y_true_col = Y_test.iloc[:, i].values
    y_pred_col = Y_pred_values[:, i]
    print('Category: {} '.format(Y.columns[i]))
    print(classification_report(y_true_col, y_pred_col))
    print('Accuracy {}\n\n'.format(accuracy_score(y_true_col, y_pred_col)))
    print('F1 {}\n\n'.format(f1_score(y_true_col, y_pred_col, average='weighted')))

Category: related 


NameError: name 'Y_pred' is not defined

In [17]:
# The first column is skipped in the labels, the predictions and the names so that all three line up
print(classification_report(Y_test.iloc[:, 1:].values, np.asarray(Y_pred)[:, 1:], target_names=Y.columns[1:]))

NameError: name 'categories' is not defined

---


In [2]:
def f1_pre_acc_evaluation(y_true, y_pred):
    """Measure the mean precision, recall and F1 for each class of a multi-output prediction.

       Returns a dataframe with one row per class and the columns:
       precision (average over all label values of that class)
       recall (average over all label values of that class)
       f1-score (average over all label values of that class)
       Keep in mind that some classes may be imbalanced, so average values can mislead.
    """
    # collecting one row of averaged metrics per class
    rows = []

    for col in y_true.columns:
        # returning dictionary from classification report
        class_dict = classification_report(output_dict=True, y_true=y_true.loc[:, col], y_pred=y_pred.loc[:, col])

        # converting from dictionary to dataframe
        eval_df = pd.DataFrame.from_dict(class_dict)

        # dropping summary columns; which of them exist depends on the scikit-learn version
        eval_df = eval_df.drop(columns=['accuracy', 'micro avg', 'macro avg', 'weighted avg'], errors='ignore')

        # dropping unnecessary row "support"
        eval_df = eval_df.drop(index='support')

        # calculating mean values over the label values of this class
        rows.append(eval_df.mean(axis=1))

    # building the report (DataFrame.append was removed in pandas 2.0)
    report = pd.DataFrame(rows)

    # renaming indexes for convenience
    report.index = y_true.columns

    return report

def f1_scorer_eval(y_true, y_pred):
    """Measure the mean F1 over all classes.

       Returns a single average F1 value, so that GridSearchCV can judge whether one model predicts better than another.
    """
    # converting y_pred from np.array to pd.DataFrame,
    # because the loop below selects columns by name
    y_pred = pd.DataFrame(y_pred, columns=y_true.columns)

    # collecting the mean F1 of each class
    f1_scores = []

    for col in y_true.columns:
        # returning dictionary from classification report
        class_dict = classification_report(output_dict=True, y_true=y_true.loc[:, col], y_pred=y_pred.loc[:, col])

        # converting from dictionary to dataframe
        eval_df = pd.DataFrame.from_dict(class_dict)

        # dropping summary columns; which of them exist depends on the scikit-learn version
        eval_df = eval_df.drop(columns=['accuracy', 'micro avg', 'macro avg', 'weighted avg'], errors='ignore')

        # keeping the mean F1 over the label values of this class
        f1_scores.append(eval_df.loc['f1-score'].mean())

    # returning the mean value for all classes. Since it's used for GridSearchCV
    # a plain mean is enough: a better model should raise the overall F1.
    return float(np.mean(f1_scores))
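For GridSearchCV to rank models by a mean F1 like the one above, the metric has to be wrapped as a scorer. The following is a minimal sketch, assuming scikit-learn's `make_scorer`; `mean_column_f1` is a simplified, hypothetical stand-in for `f1_scorer_eval` (macro F1 per column, then averaged), and the data and decision-tree model are made up to keep the example small.

```python
import numpy as np
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier


def mean_column_f1(y_true, y_pred):
    """Average the macro F1 of every output column into one number."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = [
        f1_score(y_true[:, i], y_pred[:, i], average="macro", zero_division=0)
        for i in range(y_true.shape[1])
    ]
    return float(np.mean(scores))


# Made-up data: two binary outputs that depend on the first two features.
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 3))
Y = np.column_stack([X[:, 0] > 0, X[:, 1] > 0]).astype(int)

search = GridSearchCV(
    MultiOutputClassifier(DecisionTreeClassifier(random_state=0)),
    param_grid={"estimator__max_depth": [1, 3]},
    scoring=make_scorer(mean_column_f1),  # higher is better by default
    cv=2,
)
search.fit(X, Y)
print(search.best_params_, round(search.best_score_, 3))
```

Converting both inputs with `np.asarray` inside the scoring function lets it accept dataframes or arrays, which sidesteps the column-label handling that `f1_scorer_eval` does by hand.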


In [None]:
y_pred = pipeline.predict (X_test)
#converting to dataframe
y_pred = pd.DataFrame (y_pred, columns = y_test.columns)

In [None]:
report = f1_pre_acc_evaluation (y_test, y_pred)

---


In [None]:
y_pred = pipeline.predict(X_test)

In [None]:
print(classification_report(y_test.iloc[:,1:].values, np.array([x[1:] for x in y_pred]), target_names=categories))

---


In [None]:
def display_results(y_test, y_pred):
    labels = np.unique(y_pred)
    confusion_mat = confusion_matrix(y_test, y_pred, labels=labels)
    accuracy = (y_pred == y_test).mean()

    print("Labels:", labels)
    print("Confusion Matrix:\n", confusion_mat)
    print("Accuracy:", accuracy)

In [None]:
def main():
    X, y = load_data()
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    model = model_pipeline()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    display_results(y_test, y_pred)