This is the code used in the paper "Discrete-State Variational Autoencoders for Joint Discovery and Factorization of Relations" by Diego Marcheggiani and Ivan Titov.

If you use this code, please cite us.


Data Processing

To run the model, first create a dataset. You need an input file like data-sample.txt. The file must be tab-separated, with the following fields:

1. lexicalized dependency path between arguments (entities) of the relation
2. first entity
3. second entity
4. entity types of the first and second entity
5. trigger word
6. id of the sentence
7. raw sentence
8. pos tags of the entire sentence
9. relation between the two entities, if any (used only for evaluation)
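
As a quick sanity check before running the processing script, the sketch below verifies that every line of an input file splits into the expected number of tab-separated columns. It assumes the description above corresponds to nine columns in the order given. The column names in FIELDS and the helpers parse_line and check_dataset_lines are hypothetical labels chosen for illustration and are not part of this repository.

```python
# Hypothetical column names for the nine fields described above
# (illustrative only; these names are not taken from the repository).
FIELDS = [
    "dependency_path",
    "first_entity",
    "second_entity",
    "entity_types",
    "trigger_word",
    "sentence_id",
    "raw_sentence",
    "pos_tags",
    "relation",
]


def parse_line(line):
    """Split one tab-separated line into a dict keyed by FIELDS."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def check_dataset_lines(lines):
    """Return the 1-based numbers of lines with the wrong field count."""
    bad = []
    for number, line in enumerate(lines, start=1):
        try:
            parse_line(line)
        except ValueError:
            bad.append(number)
    return bad
```

For example, calling check_dataset_lines on an open file handle for data-sample.txt returns an empty list when every line has nine columns. A line with no gold relation still needs its trailing tab so that the last column is present but empty.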

To create the dataset, run the processing script once for each dataset partition: train, dev, and test.

python processing/ --batch-name train data-sample.txt 
python processing/ --batch-name dev data-sample.txt
python processing/ --batch-name test data-sample.txt

Now, your dataset with all the indexed features is in

Training Models

To train the model, run the training file with all the required arguments:

python learning/ --pickled_dataset --model_name discrete-autoencoder --model AC --optimization 1 --epochs 10 --batch_size 100 --relations_number 10 --negative_samples_number 5 --l2_regularization 0.1 --alpha 0.1 --seed 2 --embed_size 10 --learning_rate 0.1
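
When trying several hyperparameter settings, it can be convenient to assemble this command programmatically. The sketch below is one possible way to do that and is not part of the repository: build_train_command is a hypothetical helper, and both the training script path and the pickled dataset path are left for the caller to supply, since they are not spelled out above. The flag names and default values are copied from the example command.

```python
def build_train_command(script_path, pickled_dataset, **overrides):
    """Build the training command as an argument list.

    script_path and pickled_dataset must be supplied by the caller;
    every other flag defaults to the value in the example command.
    """
    params = {
        "model_name": "discrete-autoencoder",
        "model": "AC",
        "optimization": 1,
        "epochs": 10,
        "batch_size": 100,
        "relations_number": 10,
        "negative_samples_number": 5,
        "l2_regularization": 0.1,
        "alpha": 0.1,
        "seed": 2,
        "embed_size": 10,
        "learning_rate": 0.1,
    }
    # Reject flags that do not appear in the example command.
    unknown = set(overrides) - set(params)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    params.update(overrides)

    command = ["python", script_path, "--pickled_dataset", pickled_dataset]
    for flag, value in params.items():
        command += [f"--{flag}", str(value)]
    return command
```

The returned list can be passed to subprocess.run(command, check=True). Overriding a value, for instance epochs=20, changes only that flag and leaves the rest at the values shown in the example command.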

For any questions, please drop me a mail at marcheggiani [at] uva [dot] nl.