A2Amir/Analyse-Disaster-Data

In this repository, I analyze disaster data from Figure Eight to build a natural language classification model for an API that classifies disaster messages.

1. Introduction

During a disaster, response organizations typically receive millions of communications, either directly or via social media, exactly when they have the least capacity to filter them and pull out the most important messages. Machine learning is critical to helping different organizations understand which messages are relevant to them and which messages to prioritize.

In this repo, I analyze thousands of real disaster messages from Figure Eight, a dataset of pre-labeled tweets and text messages from real-life disasters, to create a model for an API that classifies disaster messages.

To get a better understanding of creating an ETL pipeline, NLP pipelines, and a machine learning pipeline, go through these repositories respectively:

2. Prerequisites

To run the Flask app, you need:

Python 3
The Python packages listed in the requirements.txt file

Install the packages with:

    pip install -r requirements.txt

Alternatively, create a conda environment (replace <env-name> with a name of your choice):

    conda create --name <env-name> --file requirements.txt

3. Project Components

This project consists of three components:

  1. ETL Pipeline: First, I prepare the data with an ETL pipeline that processes the message and category data from the CSV files and loads them into a SQLite database (a minimal sketch follows this list). In the Python script, process_data.py, you will find the data cleaning pipeline that:

    • Loads the messages and categories datasets
    • Merges the two datasets
    • Cleans the data
    • Stores it in a SQLite database
  2. ML Pipeline: Next, a machine learning pipeline reads data from the SQLite database to create and save a multi-output supervised learning model (also sketched after this list). In the Python script, train_classifier.py, you will find the machine learning pipeline that:

    • Loads data from the SQLite database
    • Splits the dataset into training and test sets
    • Builds a text processing and machine learning pipeline
    • Trains and tunes a model using GridSearchCV
    • Outputs results on the test set
    • Exports the final model as a pickle file
  3. Flask Web App: Finally, a web application uses the trained model (the pickle file) to classify incoming messages, so that an emergency worker can input a new message and get classification results in several categories (a route sketch appears at the end of Section 5).
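
Below is a minimal sketch of the kind of cleaning that process_data.py performs. It assumes the usual Figure Eight layout (both CSV files share an id column, and the categories column holds strings such as "related-1;request-0;...") and an assumed table name of "messages"; the details in the script may differ.

      # Hedged ETL sketch: load, merge, clean, and store the disaster data.
      import pandas as pd
      from sqlalchemy import create_engine

      def run_etl(messages_csv, categories_csv, database_path):
          # Load the messages and categories datasets and merge them on their shared id.
          messages = pd.read_csv(messages_csv)
          categories = pd.read_csv(categories_csv)
          df = messages.merge(categories, on="id")

          # Expand the single "categories" column into one 0/1 column per category.
          expanded = df["categories"].str.split(";", expand=True)
          expanded.columns = expanded.iloc[0].str.rsplit("-", n=1).str[0]
          for column in expanded:
              expanded[column] = expanded[column].str.rsplit("-", n=1).str[1].astype(int)

          # Replace the raw column with the expanded ones and drop duplicate rows.
          df = pd.concat([df.drop(columns="categories"), expanded], axis=1).drop_duplicates()

          # Store the cleaned table in a SQLite database.
          engine = create_engine(f"sqlite:///{database_path}")
          df.to_sql("messages", engine, index=False, if_exists="replace")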
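Likewise, here is a sketch of the core of train_classifier.py, assuming a scikit-learn pipeline with TF-IDF features and a multi-output random forest; the actual grid of tuned parameters may be larger, and the column layout noted in the comments is an assumption.

      # Hedged ML sketch: load from SQLite, build a text pipeline, tune, evaluate, save.
      import pickle
      import pandas as pd
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.model_selection import GridSearchCV, train_test_split
      from sklearn.multioutput import MultiOutputClassifier
      from sklearn.pipeline import Pipeline
      from sqlalchemy import create_engine

      def train(database_path, model_path):
          # Load the table written by the ETL pipeline (table name assumed above).
          engine = create_engine(f"sqlite:///{database_path}")
          df = pd.read_sql_table("messages", engine)
          X = df["message"]
          Y = df.iloc[:, 4:]  # assumes the category columns start at index 4

          X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

          # Text processing and multi-output classification in one pipeline.
          pipeline = Pipeline([
              ("tfidf", TfidfVectorizer(stop_words="english")),
              ("clf", MultiOutputClassifier(RandomForestClassifier())),
          ])

          # Tune the model with GridSearchCV over a small illustrative grid.
          model = GridSearchCV(pipeline, {"clf__estimator__n_estimators": [50, 100]}, cv=3)
          model.fit(X_train, Y_train)
          print("Test set accuracy:", model.score(X_test, Y_test))

          # Export the final model as a pickle file.
          with open(model_path, "wb") as f:
              pickle.dump(model, f)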

4. Structure

Below you can find the file structure of the project:



      - disaster_app
      | - template
      | |- master.html  # main page of web app
      | |- go.html  # classification result page of web app
      | - static
      | |- imgs
      | | |- githublogo.png 
      | | |- linkedinlogo.png 
      |- __init__.py  # Initial Flask file that runs the app
      |- routes.py # Flask route file

      - data
      |- disaster_categories.csv  # data to process 
      |- disaster_messages.csv  # data to process
      |- process_data.py
      |- ETL Pipeline Preparation.ipynb (details about creating the ETL Pipeline)
      |- Database.db   # database 
      
      - models
      |- train_classifier.py
      |- utils.py 
      |- ML Pipeline Preparation.ipynb (details about creating the ML Pipeline)
      |- model.pkl  # saved model (the trained model is too big to upload to the repo, so please rerun train_classifier.py to regenerate it)
      
      - README.md
      - app.py 
      

5. Instructions for running the Python scripts

Run the following commands in the directory of the respective script to set up your database and model:

  • To run the ETL pipeline, which cleans the data and stores it in the database:

           python process_data.py  --f1 disaster_messages.csv  --f2 disaster_categories.csv  --o Database.db
    
  • To run the ML pipeline, which trains the classifier and saves it:

           python train_classifier.py  --f1 ../data/Database.db
    
  • To run the web app, execute the following command in the app's directory, then open http://0.0.0.0:3001/ in your browser:

           python app.py
    
  • To learn how to deploy this app to the cloud, go through the "Deploy the web app to the cloud" step in this repository.
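
For orientation, here is a minimal sketch of how routes.py could wire the pickled model into the app. Apart from the files named in Section 4 (routes.py, go.html, model.pkl, Database.db), the route, table, and variable names are assumptions.

      # Hedged sketch of a classification route in routes.py.
      import pickle
      import pandas as pd
      from flask import Flask, render_template, request
      from sqlalchemy import create_engine

      app = Flask(__name__)

      # Load the cleaned data and the trained model once at startup.
      engine = create_engine("sqlite:///../data/Database.db")
      df = pd.read_sql_table("messages", engine)  # table name assumed, as above
      with open("../models/model.pkl", "rb") as f:
          model = pickle.load(f)

      @app.route("/go")
      def go():
          # Read the worker's message from the query string, predict one 0/1 label
          # per category, and render the classification result page.
          query = request.args.get("query", "")
          labels = model.predict([query])[0]
          results = dict(zip(df.columns[4:], labels))
          return render_template("go.html", query=query, classification_result=results)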

6. Screenshots of the web app
