- Project Overview
- File Structure
  - ETL Pipeline - data
  - ML Pipeline - models
  - Flask Web App - app
- Running Instructions
- Acknowledgement
This project processes message data from Figure Eight and classifies each message into 36 categories. It includes a web application where an emergency worker can enter a message and get the classification results, and which presents visualization charts of the data.
The file '/data/process_data.py' contains the ETL pipeline, which loads, merges and cleans the 'categories' and 'messages' data and stores the result in a SQLite database.
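The load, merge, clean and store steps can be illustrated with a minimal sketch. It is not the project's actual script: it uses only the Python standard library, tiny inline CSV data, and an in-memory database. The column names ('id', 'message', 'categories'), the 'name-value;name-value' category encoding, the table name 'messages', and the helper names are all assumptions made for illustration.

```python
import csv
import io
import sqlite3


def parse_categories(raw):
    """Turn 'related-1;water-0' into {'related': 1, 'water': 0} (assumed encoding)."""
    result = {}
    for item in raw.split(";"):
        name, _, value = item.rpartition("-")
        result[name] = int(value)
    return result


def merge_and_clean(messages, categories):
    """Join both row lists on 'id', expand the categories, and drop duplicate ids."""
    cats_by_id = {row["id"]: parse_categories(row["categories"]) for row in categories}
    seen = set()
    merged = []
    for row in messages:
        if row["id"] in seen or row["id"] not in cats_by_id:
            continue
        seen.add(row["id"])
        merged.append({"id": int(row["id"]), "message": row["message"], **cats_by_id[row["id"]]})
    return merged


def save_to_sqlite(rows, conn, table="messages"):
    """Create a table from the keys of the first row and insert all rows."""
    columns = list(rows[0])
    col_defs = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()


# Toy stand-ins for the two CSV files (hypothetical content, with one duplicate row).
messages_csv = "id,message\n1,We need water\n2,Road is blocked\n2,Road is blocked\n"
categories_csv = "id,categories\n1,related-1;water-1\n2,related-1;water-0\n"

messages = list(csv.DictReader(io.StringIO(messages_csv)))
categories = list(csv.DictReader(io.StringIO(categories_csv)))
rows = merge_and_clean(messages, categories)

# An in-memory database keeps the sketch free of side effects;
# a file path would be passed to sqlite3.connect() to persist the data.
conn = sqlite3.connect(":memory:")
save_to_sqlite(rows, conn)
```

Expanding each category into its own numeric column gives the classifier one target column per category.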
The file '/models/train_classifier.py' contains the text processing and machine learning pipeline, which:
- loads data from the SQLite database
- splits the data into training and test sets
- trains and tunes a model using a Random Forest classifier and GridSearchCV
- predicts on the test set
- saves the model to a pickle file
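The steps above can be sketched with scikit-learn as follows. This is a hedged illustration on a hand-made toy dataset, not the project's training script: the TF-IDF vectorizer, the 'MultiOutputClassifier' wrapper, the two example categories, and the small parameter grid are assumptions chosen to keep the example short. Loading from SQLite is replaced by inline data.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Toy messages standing in for the data loaded from the database (hypothetical).
messages = [
    "we need water", "no water here", "send drinking water", "water supply is gone",
    "road is blocked", "bridge collapsed", "the road is flooded", "trees block the road",
    "need food and water", "food is running out", "we are hungry", "children need food",
]
# One column per category; here just two assumed categories: water, food.
labels = np.array([
    [1, 0], [1, 0], [1, 0], [1, 0],
    [0, 0], [0, 0], [0, 0], [0, 0],
    [1, 1], [0, 1], [0, 1], [0, 1],
])

# Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, random_state=0
)

# Text processing followed by one Random Forest per category.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
])

# Tune a single hyperparameter with a deliberately tiny grid.
param_grid = {"clf__estimator__n_estimators": [10, 20]}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(X_train, y_train)

# Predict on the held-out messages: one row per message, one column per category.
predictions = search.predict(X_test)

# Serialize the tuned pipeline; pickle.dump() with an open file would write it to disk.
model_bytes = pickle.dumps(search.best_estimator_)
```

Wrapping the vectorizer and classifier in one pipeline means the pickled object can classify raw message text directly, which is what a web app needs at prediction time.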
Below is a screenshot of the web app. It allows an emergency worker to enter a message and get the classification results. Below the search bar, it presents several visualization charts of the dataset.
- To run the ETL pipeline, run the following command in the project's root directory:
  python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
- To train the classifier and save it, run the following command in the project's root directory:
  python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
- To start the Flask web app, run 'python run.py' in the app's directory, then open http://0.0.0.0:3001/ in a browser.
Thanks to Figure Eight for providing the data and to Udacity for the instructions and advice.