Based on the character-level convolutional neural network of (X. Zhang et al. 2015). Preprocessing is based on https://github.com/NVIDIA/DIGITS/blob/master/examples/text-classification/create_dataset.py
This example uses the DBPedia ontology dataset. The load_csv utility in TFLearn expects class labels in the range 0 to n_classes-1. The preprocessed dataset is available in my Google Drive storage.
Download the file 'DBPedia.tar.gz' and extract its contents into a folder that we will refer to as $DBPedia.
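As a minimal sketch of the label convention mentioned above: the original DBPedia CSV files usually carry the class index (numbered from 1) in the first column, while TFLearn's load_csv expects labels from 0 to n_classes-1, so the preprocessing shifts each class index down by one. The sample rows and column layout below are illustrative assumptions, not taken from the actual dataset files.

```python
import csv
import io

# Two made-up rows in the usual DBPedia layout: class index, title, abstract.
# (Illustrative only -- the real file has 14 classes and three columns.)
sample = io.StringIO(
    '1,"Some Company","A short abstract about a company."\n'
    '14,"Some Film","A short abstract about a film."\n'
)

# Shift the 1-based class index to the 0-based range TFLearn expects.
rows = [(int(label) - 1, title, text)
        for label, title, text in csv.reader(sample)]
```

After this shift, the first column runs from 0 to 13 and can be fed to load_csv with categorical_labels=True and n_classes=14.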
The bleeding-edge version of TFLearn (0.2.2)
tensorflow-gpu (0.12.0rc0)
NumPy
The model uses a preprocessing step that converts each sample into a NumPy array of integers, one per character, drawn from a 71-character alphabet consisting of lowercase letters, punctuation symbols, and other characters.
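The quantization step can be sketched roughly as follows. The alphabet below is an illustrative assumption (modeled on the one in Zhang et al.), not the exact 71-character set used by this model, and the helper name `quantize` and the fixed length of 1024 are hypothetical:

```python
import numpy as np

# Illustrative alphabet: lowercase letters, digits, punctuation and a few
# other characters. The real model's 71-character set may differ.
ALPHABET = ("abcdefghijklmnopqrstuvwxyz0123456789"
            "-,;.!?:'\"/\\|_@#$%^&*~`+=<>()[]{} \n")
# Index 0 is reserved for padding and out-of-alphabet characters.
CHAR_TO_INDEX = {c: i + 1 for i, c in enumerate(ALPHABET)}

def quantize(text, max_len=1024):
    """Map each character of `text` (lowercased) to its alphabet index.

    The result is padded with zeros or truncated to a fixed length,
    since character-level CNNs expect fixed-size input.
    """
    arr = np.zeros(max_len, dtype=np.int64)
    for i, ch in enumerate(text.lower()[:max_len]):
        arr[i] = CHAR_TO_INDEX.get(ch, 0)
    return arr
```

Each array can then be fed to the network, typically after one-hot encoding or an embedding layer.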
$ cd $DBPedia
$ python CNN.py
Training Accuracy
Validation Accuracy
On the validation set, the model reaches up to 96% accuracy.
Test your model by loading the best checkpoint file into the same model architecture used for training.