Disaster response pipeline

A machine learning pipeline that categorizes real messages sent during disaster events, so that each message can be routed to the appropriate disaster relief agency.

Instructions

Run the following commands in the project's root directory to set up the database and model:
- To run the ETL pipeline that cleans the data and stores it in the database:
  
  python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
- To run the ML pipeline that trains the classifier and saves it as a pickle file:
  
  python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
- To generate the word cloud image with the most relevant words in the training set:
  
  python helpers/generate_wordcloud.py

Run the following command in the app's directory to start the web app:

python run.py

Then go to http://0.0.0.0:3001/

Installations

Project created using:

Python 3.6.10

Additional packages and versions used:

pandas 1.0.3
sqlalchemy 1.3.16
scikit-learn 0.22.1
nltk 3.4.5
plotly 4.6.0
flask 1.1.2
wordcloud 1.7.0

Project Motivation

A project to practice the skills acquired in course 4 (Data Engineering) of the Udacity Data Scientist Nanodegree Program: ETL, NLP, and machine learning pipelines.

File Descriptions

disaster-response-pipeline/
├── app/
│   ├── static/
│   │   └── images/
│   │       └── wordcloud.png
│   ├── templates/
│   │   ├── go.html
│   │   └── master.html
│   └── run.py
├── data/
│   ├── disaster_categories.csv
│   ├── disaster_messages.csv
│   ├── DisasterResponse.db
│   └── process_data.py
├── docs/
│   └── images/
│       ├── dataset_overview_1.jpg
│       ├── dataset_overview_2.jpg
│       └── message_classify.jpg
├── helpers/
│   ├── generate_wordcloud.py
│   └── utils.py
└── models/
    ├── classifier.pkl
    └── train_classifier.py

1. ETL Pipeline

The data folder contains two data files (disaster_messages.csv and disaster_categories.csv) and a data cleaning pipeline (process_data.py) that:

- Loads the messages and categories datasets
- Merges the two datasets
- Cleans the data
- Stores it in a SQLite database
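
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of process_data.py: it assumes both files share an `id` column and that the raw categories arrive as a single string such as `related-1;water-0`. The table name `messages` is hypothetical, and the standard-library `sqlite3` module stands in for sqlalchemy to keep the example small.

```python
import sqlite3

import pandas as pd


def clean_data(messages, categories):
    """Merge the two frames and expand the categories into 0/1 columns."""
    # Assumption: both datasets share an "id" column to merge on.
    df = messages.merge(categories, on="id")

    # Assumption: raw categories look like "related-1;water-0;floods-1".
    expanded = df["categories"].str.split(";", expand=True)
    expanded.columns = [value.rsplit("-", 1)[0] for value in expanded.iloc[0]]
    for column in expanded.columns:
        # Keep only the part after the last dash and convert it to an integer.
        expanded[column] = expanded[column].str.rsplit("-", n=1).str[-1].astype(int)

    df = pd.concat([df.drop(columns="categories"), expanded], axis=1)
    return df.drop_duplicates()


def save_data(df, connection, table_name="messages"):
    """Write the cleaned frame to a SQLite table (hypothetical table name)."""
    df.to_sql(table_name, connection, index=False, if_exists="replace")


if __name__ == "__main__":
    # Tiny made-up inputs standing in for the two CSV files.
    messages = pd.DataFrame(
        {"id": [1, 2, 2], "message": ["We need water", "Roads are flooded", "Roads are flooded"]}
    )
    categories = pd.DataFrame(
        {"id": [1, 2], "categories": ["related-1;water-1;floods-0", "related-1;water-0;floods-1"]}
    )
    cleaned = clean_data(messages, categories)
    with sqlite3.connect(":memory:") as connection:
        save_data(cleaned, connection)
    print(cleaned)
```

With real data, the two frames would come from `pd.read_csv` on the CSV paths given on the command line, and the connection would point at the database file instead of `:memory:`.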

2. Machine Learning Pipeline

The models folder contains a Python script (train_classifier.py) with a machine learning pipeline that:

- Loads data from the SQLite database
- Splits the dataset into training and test sets
- Builds a text processing and machine learning pipeline
- Trains and tunes a model using GridSearchCV
- Outputs results on the test set
- Exports the final model as a pickle file
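
As a rough illustration of these steps, the sketch below builds a scikit-learn text pipeline, tunes it with GridSearchCV, reports results on a held-out split, and pickles the best estimator. The choice of `RandomForestClassifier` wrapped in `MultiOutputClassifier`, the parameter grid, and the toy messages and categories are all assumptions made for the example; the actual estimator and grid in train_classifier.py may differ.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the table read from the database.
MESSAGES = [
    "we need drinking water",
    "no water in the village",
    "people are thirsty, send water",
    "water supply is cut off",
    "our house collapsed, we need shelter",
    "families are sleeping outside without tents",
    "need tents and shelter for children",
    "shelter destroyed by the storm",
]
CATEGORIES = ["water", "shelter"]
LABELS = np.array([[1, 0]] * 4 + [[0, 1]] * 4)


def build_model():
    """Text processing plus multi-label classifier, wrapped in a grid search."""
    pipeline = Pipeline([
        ("vect", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        # Assumed estimator: one random forest per output category.
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
    ])
    param_grid = {"clf__estimator__n_estimators": [10, 20]}  # assumed grid
    return GridSearchCV(pipeline, param_grid, cv=2)


def export_model(model, path):
    """Save the fitted model as a pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


if __name__ == "__main__":
    X_train, X_test, y_train, y_test = train_test_split(
        MESSAGES, LABELS, test_size=0.25, random_state=0
    )
    model = build_model()
    model.fit(X_train, y_train)

    # Report per-category results on the test set.
    y_pred = model.predict(X_test)
    for index, name in enumerate(CATEGORIES):
        print(name)
        print(classification_report(y_test[:, index], y_pred[:, index], zero_division=0))

    export_model(model.best_estimator_, "classifier.pkl")
```

Pickling `best_estimator_` rather than the whole grid search object keeps the saved file limited to the refitted pipeline, which is all that is needed for prediction.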

3. Flask Web App

The app folder contains a Flask web app that:

- Displays three visualizations of the training data and a word cloud.
- Takes input text and uses the trained model to return classification results.
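
The classification step can be pictured with the minimal Flask sketch below. It is not the code in run.py: the `/go` route name is only guessed from the go.html template, the response is JSON rather than a rendered template, and `create_app`, `KeywordModel`, and the category names are hypothetical stand-ins for the pickled classifier.

```python
from flask import Flask, jsonify, request


def create_app(model, category_names):
    """Build a small app that classifies the text passed as ?query=..."""
    app = Flask(__name__)

    @app.route("/go")  # assumed route name
    def go():
        query = request.args.get("query", "")
        # The model is expected to return one row of 0/1 labels per message.
        labels = model.predict([query])[0]
        results = {name: int(label) for name, label in zip(category_names, labels)}
        return jsonify(query=query, classification=results)

    return app


class KeywordModel:
    """Hypothetical stand-in for the trained classifier loaded from the pickle."""

    def predict(self, texts):
        return [[int("water" in text.lower()), int("shelter" in text.lower())] for text in texts]


if __name__ == "__main__":
    app = create_app(KeywordModel(), ["water", "shelter"])
    # Host and port match the URL given in the instructions.
    app.run(host="0.0.0.0", port=3001)
```

In the real app, the stand-in model would be replaced by the estimator unpickled from models/classifier.pkl.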

The helpers folder holds support code: a Python script that generates the word cloud image and a utility function (tokenize).
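
To give an idea of what a tokenize helper and the word counts behind a word cloud might look like, here is a dependency-free sketch. It is only a stand-in: the project lists nltk among its packages, so the real tokenize most likely relies on it, whereas this version uses regular expressions and a tiny hand-written stop word list. `word_frequencies` is a hypothetical helper name.

```python
import re
from collections import Counter

URL_PATTERN = re.compile(r"https?://\S+")
# Deliberately tiny stop word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "or", "the", "to", "we"}


def tokenize(text):
    """Lowercase the text, drop URLs and punctuation, and remove stop words."""
    text = URL_PATTERN.sub(" ", text.lower())
    tokens = re.findall(r"[a-z0-9]+", text)
    return [token for token in tokens if token not in STOP_WORDS]


def word_frequencies(messages):
    """Count tokens across all messages (hypothetical helper)."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


if __name__ == "__main__":
    sample = ["We need WATER in the village! http://example.com", "No water, no food"]
    print(tokenize(sample[0]))
    print(word_frequencies(sample).most_common(3))
```

A frequency mapping like this is the kind of input that the wordcloud package's `WordCloud.generate_from_frequencies` accepts.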

The docs folder holds support files for this README. It contains an images folder with screen captures of the app.

Results

Web App to categorize real messages that were sent during disaster events.

Message classification
Wordcloud of training dataset
Overview of training dataset 1
Overview of training dataset 2

Acknowledgements

Disaster data come from Figure Eight (now Appen).

Project idea and guidance come from the Udacity Data Scientist Nanodegree Program.
