This project was completed as part of the Udacity Data Science Nanodegree program.
Its goal is to correctly classify messages into disaster- and emergency-related categories.
The following sections describe how to install and run the application locally.
- Python 3.6+
- Web framework: Flask, gunicorn
- Data Visualization: plotly
- Database: SQLite (via SQLAlchemy)
- Data Preparation & Modelling: pandas, NLTK (averaged_perceptron_tagger, punkt, wordnet), scikit-learn
For a full list of requirements, please see the requirements.txt file.
To clone the project, use the following: git clone https://github.com/dagrewal/nlp-disaster-app.git
To install the dependencies using pip, make sure that you are in the root directory of the project. You can check this by running ls
in the terminal (or dir in the Windows command prompt) and confirming that requirements.txt is listed. Then install the dependencies with: pip install -r requirements.txt
Once everything has been installed correctly, start the application from the terminal (or command prompt) with gunicorn nlp-disaster-app:app
and navigate to localhost:8000 in your web browser.
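Taken together, the installation steps can be summarized as the terminal session below. The nltk.downloader line is one way to fetch the NLTK corpora listed under the dependencies (the package names are NLTK's download identifiers); it is not part of the original instructions, so skip it if your environment already has the data.

```shell
git clone https://github.com/dagrewal/nlp-disaster-app.git
cd nlp-disaster-app
pip install -r requirements.txt
# one-time download of the NLTK data used by the tokenizer
python -m nltk.downloader punkt averaged_perceptron_tagger wordnet
# serve the app on gunicorn's default port, 8000
gunicorn nlp-disaster-app:app
```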
If you experience any issues during the installation process, please log your issue using the Issues tab in the GitHub project.
There are two Python scripts, used to (a) read in and clean the raw data and store it in a SQLite database, and (b) prepare the data, engineer new features and train a multi-output supervised learning model on a training dataset.
Using the terminal (or command prompt):
- Navigate to
nlp-disaster-app/data
- Run
python process_data.py disaster_messages.csv disaster_categories.csv [insert_database_name].db
The script reads in the two .csv files, cleans the data as specified in the clean_data function, and stores the cleaned data in the database you named in the arguments above. The database is saved to the same folder as the script.
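As a rough illustration (not the actual script), the cleaning step might look like the sketch below. The clean_data function here is a simplified stand-in and the toy rows are invented; it assumes the categories column encodes labels as semicolon-separated name-0/1 pairs, as in the Figure Eight dataset.

```python
import sqlite3
import pandas as pd

def clean_data(messages, categories):
    """Merge messages with categories and expand the single
    'categories' string into one binary column per label."""
    df = messages.merge(categories, on="id")
    # e.g. "related-1;request-0" -> columns related=1, request=0
    expanded = df["categories"].str.split(";", expand=True)
    expanded.columns = [v.split("-")[0] for v in expanded.iloc[0]]
    for col in expanded.columns:
        expanded[col] = expanded[col].str[-1].astype(int)
    df = pd.concat([df.drop(columns="categories"), expanded], axis=1)
    return df.drop_duplicates()

# Tiny illustrative inputs (not the real Figure Eight data)
messages = pd.DataFrame({"id": [1, 2],
                         "message": ["help needed", "all fine"]})
categories = pd.DataFrame({"id": [1, 2],
                           "categories": ["related-1;request-1",
                                          "related-0;request-0"]})

df = clean_data(messages, categories)
# persist to SQLite, mirroring what process_data.py does with the
# database name passed on the command line ("example.db" is illustrative)
with sqlite3.connect("example.db") as conn:
    df.to_sql("messages", conn, index=False, if_exists="replace")
```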
Using the terminal (or command prompt):
- Navigate to
nlp-disaster-app/models
- Run
python train_classifier.py ../data/[insert_database_name].db [insert_saved_model_name]
The script prepares the data for training, engineers new features and trains a supervised learning model on the prepared training data, then saves the model as a .pkl file in the same folder as the script. For specifics on the feature engineering and model development, inspect the functions within the script.
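A minimal sketch of the training stage is shown below, with inline toy data in place of the database and scikit-learn's default tokenizer in place of the script's NLTK-based one. The example messages and the two label columns are invented for illustration.

```python
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Toy training data: texts plus one binary target per category
# (e.g. columns: related, request)
X = ["water needed urgently", "send food please",
     "weather is nice today", "no help required"]
y = [[1, 1], [1, 1], [0, 0], [0, 0]]

# TF-IDF features feeding one classifier per output category
model = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(LogisticRegression(max_iter=1000))),
])
model.fit(X, y)

# Save the fitted pipeline, as train_classifier.py does
with open("classifier.pkl", "wb") as f:
    pickle.dump(model, f)
```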
Once everything has finished running (note that the train_classifier.py script will take a while), navigate to nlp-disaster-app,
run gunicorn nlp-disaster-app:app
and navigate to localhost:8000 in your web browser.
Readers are invited to improve the model's performance by engineering new features and trying different supervised learning models. They could also add more visualizations to the home page of the application.
This project is licensed under the MIT License: https://opensource.org/licenses/MIT
The data used for this project was provided by Figure Eight.