Disaster Response Project

Overview

This multi-output classification project, built as part of the Udacity Data Scientist Nanodegree Program, analyzes and classifies messages to improve communication during disasters. It uses data provided by Appen (formerly Figure Eight) containing real messages sent during disaster events.

A web application can be hosted locally to classify new messages.

The result should look like this:

[Screenshots: Home; Visualizing data (1); Visualizing data (2); Message Classification; About the Project]

To run the app locally, as well as any other step of the project, it is recommended to create a new working environment and install the required libraries. For example, if using Anaconda:

conda create -n <env_name>
conda activate <env_name>
conda install -c anaconda pip
pip install -r requirements.txt

Or simply (although it can generate a "PackagesNotFoundError"):

conda create --name <env_name> --file requirements.txt

Directories

Config

Configuration is handled in the "config" directory through two files: core.py and config.yml. The file paths needed to run this project (messages and categories data, database file, and model pickle file) are specified in this folder and validated to ensure everything works as intended. Users therefore do not need to pass file names when running the Python scripts, which prevents errors and saves time.

To re-train the model, e.g. if new data becomes available, the "config.yml" should be updated with the new file names, or the previous data files should be replaced.
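As a rough illustration of the kind of validation described above, here is a minimal sketch using only the standard library. The key names, file names, and the `validate_config` helper are hypothetical and are not taken from core.py or config.yml; the sketch only assumes that the configuration resolves to a mapping of names to file paths.

```python
from pathlib import Path

# Hypothetical keys and expected extensions; the real config.yml may differ.
REQUIRED_SUFFIXES = {
    "messages_file": ".csv",
    "categories_file": ".csv",
    "database_file": ".db",
    "model_file": ".pkl",
}


def validate_config(raw: dict) -> dict:
    """Return a dict of Path objects, raising ValueError on a bad entry."""
    validated = {}
    for key, suffix in REQUIRED_SUFFIXES.items():
        if key not in raw:
            raise ValueError(f"missing config entry: {key}")
        path = Path(raw[key])
        if path.suffix != suffix:
            raise ValueError(f"{key} should end in {suffix}, got {path.name}")
        validated[key] = path
    return validated


# In the project, this mapping would come from parsing config.yml.
example = {
    "messages_file": "data/messages.csv",
    "categories_file": "data/categories.csv",
    "database_file": "data/disaster_response.db",
    "model_file": "models/classifier.pkl",
}
paths = validate_config(example)
```

Failing early on a missing or mistyped entry means the ETL and training scripts never start with a wrong path.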

Data

This directory contains the raw data (messages and categories) in CSV files, as well as the cleaned data in a database (.db) file.

It also contains the Python script that runs the entire ETL process, process_data.py, which extracts data from the CSV files, transforms it, and loads it into a single SQLite database.

To run this script on the command line, from the project folder:
python data/process_data.py
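The extract, transform, and load steps can be sketched as follows. This is not the code in process_data.py: it uses only the standard library so that it runs anywhere, the sample rows are invented, and it assumes that each categories entry is a semicolon-separated list of "name-0" or "name-1" pairs keyed by a shared id column. The table name "messages" is also an assumption.

```python
import csv
import io
import sqlite3

# Tiny in-memory stand-ins for the two CSV files (invented content).
MESSAGES_CSV = "id,message\n1,We need water\n2,Road is blocked\n2,Road is blocked\n"
CATEGORIES_CSV = (
    "id,categories\n1,water-1;roads-0\n2,water-0;roads-1\n2,water-0;roads-1\n"
)


def extract(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(messages, categories):
    """Join on id, expand 'name-0/1' pairs into integer columns, drop duplicates."""
    labels_by_id = {}
    for row in categories:
        pairs = (item.rsplit("-", 1) for item in row["categories"].split(";"))
        labels_by_id[row["id"]] = {name: int(value) for name, value in pairs}
    seen, rows = set(), []
    for row in messages:
        if row["id"] in seen or row["id"] not in labels_by_id:
            continue
        seen.add(row["id"])
        rows.append(
            {"id": int(row["id"]), "message": row["message"], **labels_by_id[row["id"]]}
        )
    return rows


def load(rows, connection, table="messages"):
    """Create the table from the row keys and insert every row."""
    columns = list(rows[0])
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    connection.execute(f'CREATE TABLE "{table}" ({column_sql})')
    connection.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(r[c] for c in columns) for r in rows],
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(MESSAGES_CSV), extract(CATEGORIES_CSV)), connection)
stored = connection.execute(
    "SELECT id, message, water, roads FROM messages ORDER BY id"
).fetchall()
```

Here `stored` holds one row per unique message with one integer column per category, which is the shape a multi-output classifier needs as its target.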

Models

This directory contains the Python script that handles all the machine learning steps needed for this project, train_classifier.py. It also holds the pickle file containing the best model found by the GridSearchCV run on the training set.

To run train_classifier.py from the command line, from the project folder:
python models/train_classifier.py

Training
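To make the training step more concrete, here is a minimal sketch of a multi-output text classifier tuned with GridSearchCV and serialized with pickle, using scikit-learn. The feature extractor (TF-IDF), the estimator (random forest), the parameter grid, the two category names, and the sample messages are all assumptions chosen for illustration; train_classifier.py may be built differently.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Invented messages with two hypothetical categories: [water, roads].
messages = [
    "we need water and food",
    "the bridge is destroyed",
    "people are thirsty, send water",
    "roads are blocked by debris",
    "no drinking water in the camp",
    "the road to the village collapsed",
    "water supply is cut and the road is flooded",
    "thank you for the update",
]
labels = np.array(
    [[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 1], [1, 1], [0, 0]]
)

# One binary classifier is fitted per category column.
pipeline = Pipeline(
    [
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
    ]
)

# A deliberately small grid so that the sketch runs in a few seconds.
param_grid = {"clf__estimator__n_estimators": [10, 20]}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(messages, labels)

# Serialize the best pipeline, then restore it as the web app would.
model_bytes = pickle.dumps(search.best_estimator_)
restored = pickle.loads(model_bytes)
prediction = restored.predict(["we need water"])
```

`prediction` has one row per input message and one column per category. In a real run, `pickle.dump` would write the model to the path given in the configuration rather than to an in-memory byte string.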

App

This directory contains the files needed to run the web application. These include two Python scripts:

run.py, which contains the Flask code needed to render the HTML files as well as the Plotly figures
functions.py, which contains helper functions used by run.py (to keep the code modular and clean)

Two further directories, templates and static, contain the necessary HTML and CSS files.

To access the web application on a local computer, run python app/run.py from the project folder and open the URL printed in the terminal.

Repository: pcmaldonado/Disaster_Response
