wadhwasahil/Relation_Extraction

Relation_Extraction

Relation Classification via Convolutional Deep Neural Network

This code implements the paper at http://www.aclweb.org/anthology/C14-1220 using TensorFlow.

##Algorithm

  • I largely followed the technique described in the paper above, tweaking only a few parameters such as the dimensions of the word vectors and position vectors, the optimization function, and so on.
  • The basic architecture is a convolution layer, a max-pool layer, and a final softmax layer. More conv and max-pool layers can be added between the input layer and the final softmax layer (or removed); I used only 1 conv and 1 max-pool layer.
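
As a rough illustration of the conv, max-pool, softmax pipeline described above, here is a minimal forward-pass sketch in plain Python. It is not the code from text_cnn.py: the function names, the ReLU activation, the window size, and the tiny dimensions are all assumptions chosen for readability, and the weights are random rather than trained.

```python
import math
import random


def conv1d(tokens, filters, bias, window):
    """Slide each filter over consecutive windows of token vectors.

    Uses "valid" padding, so the sentence must have at least `window` tokens.
    ReLU is an assumed activation for this sketch.
    """
    n_windows = len(tokens) - window + 1
    feature_maps = []
    for weights, b in zip(filters, bias):
        fmap = []
        for start in range(n_windows):
            # Flatten the window of token vectors into one long vector.
            flat = [x for tok in tokens[start:start + window] for x in tok]
            act = sum(w * x for w, x in zip(weights, flat)) + b
            fmap.append(max(0.0, act))
        feature_maps.append(fmap)
    return feature_maps


def max_over_time(feature_maps):
    """Keep the strongest response of each filter across the sentence."""
    return [max(fmap) for fmap in feature_maps]


def dense(features, weights, bias):
    """Fully connected layer producing one logit per class."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(tokens, params, window=3):
    """One conv layer, one max-pool layer, one softmax layer."""
    fmaps = conv1d(tokens, params["filters"], params["filter_bias"], window)
    pooled = max_over_time(fmaps)
    logits = dense(pooled, params["out_weights"], params["out_bias"])
    return softmax(logits)


def random_params(token_dim, window, num_filters, num_classes, seed=0):
    """Hypothetical helper: small random weights standing in for trained ones."""
    rng = random.Random(seed)

    def rand_vec(n):
        return [rng.uniform(-0.1, 0.1) for _ in range(n)]

    return {
        "filters": [rand_vec(window * token_dim) for _ in range(num_filters)],
        "filter_bias": [0.0] * num_filters,
        "out_weights": [rand_vec(num_filters) for _ in range(num_classes)],
        "out_bias": [0.0] * num_classes,
    }


# A 6-token "sentence" where each token is a 4-dimensional vector
# (in the real model this would be word and position vectors combined).
rng = random.Random(1)
sentence = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
params = random_params(token_dim=4, window=3, num_filters=5, num_classes=2)
probs = forward(sentence, params, window=3)
print(probs)  # one probability per class
```

Stacking more layers would mean feeding the output of one conv/pool stage into another before the final dense and softmax step; with a single stage, max-over-time pooling conveniently yields a fixed-size vector regardless of sentence length.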

##Files

  • text_cnn.py - A class that implements the model architecture. It accepts the input, contains all the layers that process the input vector, such as conv2d (the convolution layer) and max_pool, and outputs predictions for each class.
  • data_helpers.py - A generic script containing helpers for tasks such as generating batches and loading the training data.
  • train.py - This module creates the input vectors from the training data, trains the model on them, and saves the trained model to disk.
  • temp.py - PySpark code that fetches data from an HBase table and predicts the class of each row using the trained model.
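
To make the helper and input-building steps above more concrete, here is a small sketch of two pieces such a pipeline typically needs: relative position features for a marked entity (the position vectors mentioned under Algorithm are embeddings of these distances) and a mini-batch generator. The names `relative_positions` and `batch_iter` and their signatures are assumptions for illustration; the actual helpers in data_helpers.py and train.py may differ.

```python
import random


def relative_positions(sentence_len, entity_index):
    """Distance of every token to one marked entity (negative = to its left)."""
    return [i - entity_index for i in range(sentence_len)]


def batch_iter(data, batch_size, num_epochs, shuffle=True, seed=None):
    """Yield mini-batches covering `data` once per epoch.

    The last batch of an epoch may be smaller than `batch_size`.
    """
    data = list(data)
    rng = random.Random(seed)
    for _ in range(num_epochs):
        order = list(range(len(data)))
        if shuffle:
            # Reshuffle every epoch so batches differ between passes.
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            yield [data[i] for i in order[start:start + batch_size]]


# Positions of each token relative to an entity at index 2 in a 5-token sentence.
print(relative_positions(5, 2))  # [-2, -1, 0, 1, 2]

# Two epochs over seven examples in batches of three, without shuffling.
for batch in batch_iter(range(7), batch_size=3, num_epochs=2, shuffle=False):
    print(batch)
```

With two entities per sentence, each token would get two such distances, one per entity, which are then looked up in position embedding tables and combined with the word vector.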

##Challenges

  • My training data has only about 7K rows, which limits accuracy to about 70.34% on the test set. I expect the model to perform better as the training set grows.
  • My data set consists of inter-sentential entities linked by a cause-effect relationship. However, this model can be extended to an n-class problem.

##TODO

  • Use RNNs, possibly LSTMs, for training.
  • Fine-tune the model.

###PySpark can now be used with TensorFlow for online training and testing. In my case, I am using PySpark for online testing.
