roscopecoltran/multi-class-text-classification-cnn-rnn

 
 

Project: Classify Kaggle San Francisco Crime Description

Highlights:

  • This is a multi-class text classification (sentence classification) problem.
  • The goal of this project is to classify each Kaggle San Francisco crime description into one of 39 classes.
  • The model is built with a CNN, an RNN (LSTM and GRU), and word embeddings on TensorFlow.
  • Input: Descript
  • Output: Category

  • Examples:

    Descript                                    Category
    GRAND THEFT FROM LOCKED AUTO                LARCENY/THEFT
    POSSESSION OF NARCOTICS PARAPHERNALIA       DRUG/NARCOTIC
    AIDED CASE, MENTAL DISTURBED                NON-CRIMINAL
    AGGRAVATED ASSAULT WITH BODILY FORCE        ASSAULT
    ATTEMPTED ROBBERY ON THE STREET WITH A GUN  ROBBERY
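
As a rough illustration of the input/output pairing above, the sketch below reads (Descript, Category) rows and turns each category into a one-hot target vector, which is a common way to prepare labels for a multi-class classifier. It is not taken from this repository's code: the helper names (`load_examples`, `build_label_index`, `one_hot`) are hypothetical, and the inline CSV only repeats the five example rows, so it yields 5 classes rather than the full 39.

```python
import csv
import io

# Sample rows copied from the examples above (assumed CSV layout with
# "Descript" and "Category" columns); the real data has 39 categories.
SAMPLE_CSV = """Descript,Category
GRAND THEFT FROM LOCKED AUTO,LARCENY/THEFT
POSSESSION OF NARCOTICS PARAPHERNALIA,DRUG/NARCOTIC
"AIDED CASE, MENTAL DISTURBED",NON-CRIMINAL
AGGRAVATED ASSAULT WITH BODILY FORCE,ASSAULT
ATTEMPTED ROBBERY ON THE STREET WITH A GUN,ROBBERY
"""


def load_examples(handle):
    """Read (description, category) pairs from an open CSV file object."""
    reader = csv.DictReader(handle)
    return [(row["Descript"], row["Category"]) for row in reader]


def build_label_index(categories):
    """Map each distinct category to a stable integer id (sorted order)."""
    return {label: i for i, label in enumerate(sorted(set(categories)))}


def one_hot(index, num_classes):
    """Return a list of zeros with a single 1 at position `index`."""
    vector = [0] * num_classes
    vector[index] = 1
    return vector


examples = load_examples(io.StringIO(SAMPLE_CSV))
label_index = build_label_index(category for _, category in examples)
targets = [one_hot(label_index[category], len(label_index))
           for _, category in examples]
```

Sorting the categories before numbering them keeps the label ids reproducible between runs, which matters when a saved model is later used for prediction.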

Train:

  • Command: python3 train.py train_data.file train_parameters.json
  • Example: python3 train.py ./data/train.csv.zip ./training_config.json
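
The train command takes a zipped CSV and a JSON parameter file. The sketch below shows one way such inputs could be read using only the Python standard library. It is an assumption about the file layout, not a copy of what `train.py` does: `read_zipped_csv` and `load_params` are hypothetical helper names, and no particular parameter keys are assumed.

```python
import csv
import io
import json
import zipfile


def read_zipped_csv(zip_path):
    """Return the rows of the first .csv member of a zip archive as dicts."""
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [name for name in archive.namelist()
                     if name.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError(f"no .csv file found in {zip_path}")
        with archive.open(csv_names[0]) as raw:
            # Zip members are opened in binary mode; wrap them for csv.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def load_params(json_path):
    """Load training parameters from a JSON file holding a single object."""
    with open(json_path, encoding="utf-8") as handle:
        params = json.load(handle)
    if not isinstance(params, dict):
        raise ValueError("expected a JSON object mapping names to values")
    return params
```

With the example paths above, this would be called as `read_zipped_csv("./data/train.csv.zip")` and `load_params("./training_config.json")`.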

Predict:

  • Command: python3 predict.py ./trained_results_dir/ new_data.csv
  • Example: python3 predict.py ./trained_results_1478563595/ ./data/small_samples.csv

Server:

  • Command: python3 server.py ./trained_results_dir
  • Open http://127.0.0.1:5000/ (Press CTRL+C to quit)
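
Once the server is running, a quick way to confirm that it responds is to request the root page from a script. The sketch below does only that: it assumes nothing about the server's routes beyond the URL given above, and `server_is_up` is a hypothetical helper name.

```python
import http.client
import urllib.request


def server_is_up(url="http://127.0.0.1:5000/", timeout=2.0):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP error statuses
        # (urllib.error.HTTPError is an OSError) and malformed replies.
        return False
```

Calling `server_is_up()` before opening the browser can help tell a startup failure apart from a browser-side problem.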

Docker

Docker Compose Example
======================

This is a Docker Compose example. Running the command below boots a container from this repository's image, serves Jupyter on port 8888 with the password 'foobar', and shares the files in the notebook directory with the container:

docker-compose up

To stop it, please press 'Ctrl-C'.
