
Decide the best ruleset for FB15k #11

Closed
pminervini opened this issue Feb 7, 2017 · 9 comments

@pminervini
Collaborator

pminervini commented Feb 7, 2017

To generate candidate rule sets, try e.g.:

$ ./tools/amie-to-clauses.py -t 0.9 data/fb15k/rules/fb15k-rules_mins=1000_minis=1000.txt
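For reference, a minimal sketch of the kind of conversion such a script performs, assuming -t is a confidence threshold and that the AMIE+ output is tab-separated with the rule string in the first column and a confidence score in the fourth; the column layout and the clause syntax are assumptions for illustration, not the actual amie-to-clauses.py implementation:

# Hypothetical sketch (not the actual amie-to-clauses.py): keep AMIE+ rules
# whose confidence clears the threshold and rewrite them as clauses of the
# form "head(A, B) :- body1(A, B)". Assumes a tab-separated AMIE+ output with
# the rule string in column 0 and a confidence score in column 3; the real
# tool may use different columns and a different clause syntax.

def atom_to_literal(atom):
    """['?a', '/film/film/genre', '?b'] -> '/film/film/genre(A, B)'."""
    subj, pred, obj = atom
    return '{}({}, {})'.format(pred, subj.lstrip('?').upper(), obj.lstrip('?').upper())

def parse_rule(rule_str):
    """Split an AMIE rule string 'body atoms => head atom' into atoms."""
    body_str, head_str = rule_str.split('=>')
    group = lambda tokens: [tokens[i:i + 3] for i in range(0, len(tokens), 3)]
    return group(body_str.split()), group(head_str.split())[0]

def amie_to_clauses(path, threshold=0.9):
    clauses = []
    with open(path) as f:
        for line in f:
            fields = line.rstrip('\n').split('\t')
            if len(fields) < 4 or '=>' not in fields[0]:
                continue  # skip header and malformed lines
            try:
                confidence = float(fields[3])
            except ValueError:
                continue
            if confidence >= threshold:
                body, head = parse_rule(fields[0])
                clauses.append('{} :- {}'.format(
                    atom_to_literal(head),
                    ', '.join(atom_to_literal(a) for a in body)))
    return clauses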
@pminervini
Collaborator Author

All the rule sets are available here:

https://github.com/uclmr/inferbeddings/tree/master/data/fb15k/clauses

@tdmeeste
Collaborator

Currently, @pminervini is running experiments on 4 different rule sets, on the different models, with limited hyperparam tuning:

  • clauses_highconf_highsupp.pl (34 rules)
  • clauses_lowconf_highsupp.pl (41 rules)
  • clauses_highconf_lowsupp.pl (311 rules)
  • clauses_lowconf_lowsupp.pl (584 rules)

Motivation for these rule sets:

  • no longer filter the large set of AMIE+ rules by head coverage (which favours more obvious and symmetric rules), but instead by support (which reflects the impact of a rule in terms of the number of training facts that satisfy its body) and confidence (the ratio of the number of available true facts that satisfy both body and head to the number of facts that satisfy the body).
  • chosen values (to limit the maximum number of rules; see the sketch below):
    • highconf: confidence >= 0.99 (assumes the data is very dense)
    • lowconf: confidence >= 0.85 (which won't throw valid rules away if there are many missing facts)
    • lowsupp: body support >= 50 (also allows rules with very small support)
    • highsupp: body support >= 1000 (only rules with large support)
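A minimal sketch of how these four cut-off combinations select rules; the Rule container and its attribute names are assumptions for illustration, not the repository's actual code:

# Illustrative only: the four rule sets above correspond to the four
# combinations of confidence and body-support thresholds.
from collections import namedtuple

Rule = namedtuple('Rule', ['clause', 'confidence', 'body_support'])  # assumed container

THRESHOLDS = {
    'clauses_highconf_highsupp.pl': dict(min_confidence=0.99, min_support=1000),
    'clauses_lowconf_highsupp.pl':  dict(min_confidence=0.85, min_support=1000),
    'clauses_highconf_lowsupp.pl':  dict(min_confidence=0.99, min_support=50),
    'clauses_lowconf_lowsupp.pl':   dict(min_confidence=0.85, min_support=50),
}

def select_rules(rules, min_confidence, min_support):
    """Keep rules whose confidence and body support both clear the cut-offs."""
    return [r for r in rules
            if r.confidence >= min_confidence and r.body_support >= min_support]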

@pminervini any idea what the relation /dataworld/gardening_hint/split_to means? It appears in many rules, including high-confidence rules.

We can allow for more rules (if adding confident rules with low support turns out to help), but when we decide on a fixed rule set, maybe we need to filter redundant rules as discussed.

Curious to see what it'll bring; let's discuss in this issue.

@pminervini
Collaborator Author

pminervini commented Feb 13, 2017

@tdmeeste told me to execute the following experiments: https://github.com/uclmr/inferbeddings/blob/master/scripts/fb15k/UCL_FB15K_clauses_v1.py

So far the best results have been obtained with either clauses_highconf_highsupp.pl or clauses_lowconf_highsupp.pl:

$ ./tools/parse_results_filtered.sh logs/ucl_fb15k_clauses_v1/*.log
144
Best MR, Filt: logs/ucl_fb15k_clauses_v1/ucl_fb15k_clauses_v1.adv_batch_size=10_adv_epochs=10_adv_lr=0.1_adv_weight=1_batches=10_clausefile=clauses_lowconf_highsupp.pl_disc_epochs=1_embedding_size=100_epochs=100_lr=0.1_margin=1_model=DistMult_optimizer=adagrad_similarity=dot.log
Test - Best Filt MR: 87.76886

Best MRR, Filt: logs/ucl_fb15k_clauses_v1/ucl_fb15k_clauses_v1.adv_batch_size=10_adv_epochs=0_adv_lr=0.1_adv_weight=1_batches=10_clausefile=clauses_lowconf_highsupp.pl_disc_epochs=10_embedding_size=100_epochs=100_lr=0.1_margin=1_model=ComplEx_optimizer=adagrad_similarity=dot.log
Test - Best Filt MRR: 0.519

Best H@1, Filt: logs/ucl_fb15k_clauses_v1/ucl_fb15k_clauses_v1.adv_batch_size=10_adv_epochs=0_adv_lr=0.1_adv_weight=1_batches=10_clausefile=clauses_highconf_highsupp.pl_disc_epochs=10_embedding_size=100_epochs=100_lr=0.1_margin=1_model=ComplEx_optimizer=adagrad_similarity=dot.log
Test - Best Filt Hits@1: 38.591%

Best H@3, Filt: logs/ucl_fb15k_clauses_v1/ucl_fb15k_clauses_v1.adv_batch_size=10_adv_epochs=0_adv_lr=0.1_adv_weight=1_batches=10_clausefile=clauses_lowconf_highsupp.pl_disc_epochs=10_embedding_size=100_epochs=100_lr=0.1_margin=1_model=ComplEx_optimizer=adagrad_similarity=dot.log
Test - Best Filt Hits@3: 60.408%

Best H@5, Filt: logs/ucl_fb15k_clauses_v1/ucl_fb15k_clauses_v1.adv_batch_size=10_adv_epochs=0_adv_lr=0.1_adv_weight=1_batches=10_clausefile=clauses_lowconf_highsupp.pl_disc_epochs=10_embedding_size=100_epochs=100_lr=0.1_margin=1_model=ComplEx_optimizer=adagrad_similarity=dot.log
Test - Best Filt Hits@5: 68.102%

Best H@10, Filt: logs/ucl_fb15k_clauses_v1/ucl_fb15k_clauses_v1.adv_batch_size=10_adv_epochs=0_adv_lr=0.1_adv_weight=1_batches=10_clausefile=clauses_lowconf_highsupp.pl_disc_epochs=10_embedding_size=100_epochs=100_lr=0.1_margin=1_model=ComplEx_optimizer=adagrad_similarity=dot.log
Test - Best Filt Hits@10: 76.349%

Log files for the experiments are available at http://data.neuralnoise.com/inferbeddings/ucl_fb15k_clauses_v1.tar.gz

@pminervini any idea what the relation /dataworld/gardening_hint/split_to means? It appears in many rules, including high-confidence rules.

@tdmeeste I really have no idea

@riedelcastro
Contributor

What are the numbers without rules?

@pminervini
Collaborator Author

What are the numbers without rules?

adv_weight=0 was not in UCL_FB15K_clauses_v1.py; adding it right now.

@riedelcastro
Contributor

Isn't it better to remove --adv-lr to get rid of the adversarial properly (or is this now happening with --adv-weight 0?)

@pminervini
Collaborator Author

Isn't it better to remove --adv-lr to get rid of the adversarial properly (or is this now happening with --adv-weight 0?)

Yes - it happens with --adv-weight 0
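For clarity, a minimal sketch of why --adv-weight 0 is enough (the function and variable names are illustrative, not inferbeddings' actual code): the adversarial loss only enters the objective through that weight, so setting it to zero removes the adversary's contribution regardless of --adv-lr.

# Illustrative only: with adv_weight == 0 the adversarial term vanishes, so the
# discriminator is trained exactly as it would be without any adversary,
# whatever the adversary's learning rate is set to.
def discriminator_objective(fact_loss, adversarial_loss, adv_weight=1.0):
    return fact_loss + adv_weight * adversarial_loss

# e.g. discriminator_objective(fact_loss=0.42, adversarial_loss=0.17, adv_weight=0.0) == 0.42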

@tdmeeste
Collaborator

@pminervini I created several high-support rule sets for various confidence thresholds, based on a new high-recall AMIE+ run on FB15k that uses only the FB15k training data. If the latest results are promising (I still have to check), I can try to further improve the clause files by manually filtering them.

@pminervini
Collaborator Author

I think we agreed to use Guo et al.'s FB122: https://github.com/uclmr/inferbeddings/tree/master/data/guo-emnlp16/fb122
