forked from ibeltagy/pl-semantics
-
Notifications
You must be signed in to change notification settings - Fork 0
/
issues.txt
506 lines (457 loc) · 26.3 KB
/
issues.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
bin/mlnsem qa qa/resources/cnn/validation/ 2 -log TRACE -withNegT false -soap localhost:9000 -negativeEvd false -softLogic mln -diffRules false -keepUniv false -focusGround true
bin/mlnsem qa qa/resources/cnn/validation/ 1 -log TRACE -withNegT false -negativeEvd false -softLogic psl -diffRules false -keepUniv false -ner gs -groundLimit 10000
//Non-handled case:
T: An ogre is not attacking
defend => not attack
H: An ogre is defending
T should entail H (given the KB), but nothing create an event constant for attacking/defending.
This will not work with the CWA because without an entity for attacking/defending, all their
ground atoms are false.
//Make the system works with STS + DiffRules
//Text has a negation if it has a predicate "nobody"
//Sure rules for SICK-train-contradictions.
motorcycle-n-5 black-a-5 white-a-7 motorcycle-n-8 578
steak-n-5 large-a-6 steak-n-7 1104
lady-n-2 woman-n-2 1555
squirrel-n-2 squirrel-n-2 around-a-5 2066
beating-v-3 stirring-v-4 in-r-6 bowl-n-8 1995
food-n-6 chicken-n-5 2918
ingredients-n-4 vegetables-n-5 3182
black-a-2 dog-n-3 keeping-v-10 mouth-n-12 tongue-n-11 hanging-v-13 out-r-14 3339
grass-n-8 tall-a-8 grass-n-9 3735
looks-v-14 looking-v-15 on-r-16 4353
//Errors in SICK-Train
21: "boy => young boy" is not generated because of the simplified assumption of generating diffRules in case of negation
34,136 : -6
100: complex coref and contradiction
133,259,292,642,984: wrong gs
170: equality in H
245,276,348,451,551,610,727: Phrasal Antonymy rule
343,754,755,797: timeout because of a very very long query
390: wrong handling of notEqual in H in inference
402: probably wrong handling of notEqual in inference
502,509,517,858,905: looks like a problem with the coreference resollution
703: disjunction
796,830,937: long rule
1240: "nobody" translated to "person"
////////////////////////
Important: for rules with long LHS, generating the inference problem timesout. This should be fixed
//
Rules need to be handcoded:
group of
made of
bowl of
pieces of
piece of
slice of
body of
//////
This is why small(x) ^ elephant(x):
T1: Kaline is a small elephant
T2: All elephents are mammals
==>Kaline is a small mammal. (Wrong)
That is why: small(x) ^ elephant(x) is wrong
and it should be: elephent(x) ^ rel(x,y) ^ small(y)
-----------
TODO:
->>New parser
-Weight learning (good way to remove bad rules)
->>Phrasal rules (read Mike Lewis)
-PPDB: rules with exist on the RHS
->>More handwritten rules
-Z(Q+MLN)/Z(MLN) using Vibhav's suggestion
-Formal MCWA (compare it with PSL, try it on STS on MLN)
-Most, Few
->>Use Katrin's robinson's resolution to find differences between two rules, and build vector only for the difference
-Coref and Contradictions errors below
-Bad IR
-In the next paper, talk why CWA is good (because this is how RTE is defined)
Contradiction: tmp/ratio-60-analysis lists problems with doing coreference resolution
Examples:
1)476: the man is denying an interview the man is granting an interview
2)1098: a person is not playing the keyboard a man is playing a keyboard
3)4036: there is no brown dog jumping in the air the brown dog is jumping in the air
IR:
Investigate errors in tmp/ratio-56-2-analysis-investigateLater again. THey are the difference between ratio-55 and ratio-56-2
Hyponem rules from PPDB to.
Use Katrin's resolution algorithm to find the difference between two formulas, then build and IR for that differnece
**Ratio**
-propagation should also propagate for the goal.
-even propagating for the goal will not solve the problem. Eg: T: a man is not driving, H: a man is driving a car with one hand. Propagation will not create "a car with one hand".
-ratio can easily get confused between neutral and entail in case of sentneces with a big overlap.
-introduction of entities should take care of cases where there is a discourse variable, but not an entity like pair 4285,4823
-Delete ground clauses that are not necessery, or that are not affecting the ratio, or that are affecting the ratio but not affecting flipping it.
-H|T, H, notT|H, notT
-Fracas pair 6 is an example of what goes wrong becaues of the missed up handling of existence
-FraCas pair 44,26,16 is an example for why applying the MCW does not work for Ratio (monotonicity missed up)
-FraCas pair 42 is an example for a problem with the generation of Skolem functions. Generated constatns for T|H and H are different
================
-Flipping the direction of the rules in GivenNotTextProbabilisticTheoremProver is wrong. E.g: Pair 2319 (food => pizza). GivenNotTextProbabilisticTheoremProver should be moved above InferenceRuleInjectingProbabilisticTheoremProver
-I have to do repeat in AutoTyping. Example: Pair 501
-Negate or not negate presupposed ? Example for negate: Fracas 26, Example for NotNegate: SICK-rte 3096
-Take presuppposed back into negation. Example: SICK-rte 1125
============================
-Correference resolution is so missedup for FraCas
===========================
-A paper comparing between MLN+MCW vs. PSL. Use my email to the PSL group as a guideline
=======================
-When generating R, how to deal with pred(Univ, Exist) ??
-When generating R, do we need to check if negated or not negated predicates ? e.g. pair 13
-Think how "subset" predicate affect the infernece and see if we can have hard coded rules to fix this issue.
-Better handling of multi=word expressions like: clean up, in front of, ...
Use the index of predicates to correctly pick the subject and solve the bug I have now.
Enhancement (fast system without the GEN phase): Use Lucene to search the vector space for the vectors, instead of linear search
Enhancement (fast system without the GEN phase): User EasyCCG instead of C&C
SOLVED: lemmatizing sentense does not work when called bin/mlnsem sen T H
Pair 85, yet another problem with detecting Contradiction. The idea of removing negation from the negated sentense it WRONG
bug in AutoType. Check pair: 4599. Inference rule that does not propagate evidence
H|T, H|NotT, T|NotH
-It covers the A vs. THE problem, and the negation scope problem
-Negating a formula that already has a negation is by removing that negation
-Do I still need to detect the subject, or the presuposition ? Try it
-Does it cover Contradition because of antonyms ? I hope so
-Suggestion: if only T contains negations, use H|NotT, If only H does, use T|NotH, If both do, then use both T|NotH and H|NotT. If none of them does, ......???
Antonyms rules problems: try pair 538
1-how to negate T ? (same problem as the A and THE)
2-AutoType forces everything to False. This should be taken into account.
3-AutoType should be checked to support the new types of inference rules
New testing and development phase starts end of July 2014
** Summery of the short TODO list:
*)Implment AutoTyping with Skolem functions. Test it on prb 25, sick-rte 791
*)Fix the handling of empty query in alchemy/src/infer/Clause.cpp
*)Use IJGP for ground networks with long formulas (more than 21 )
*)Negation and alignment of negation scope and how to negate T. This should cover the subproblems below: (Examples below)
-)None-0/1 probabilites because of empty domain. Test it on 1003 "A sea turtle is hunting for fish" "The turtle isn't following the fish"
-)Scope of negation: e.g
-)A vs The: eg.
+)Implemented by negating what is at the right of AGENT relation. This is wrong in cases like 307,3301,3436
+)Still many negation scope pairs fail. Some of them are listed below
*)Unknown: 458,2049,2195
---------
Changes of the new Boxer:
-Variables types: entity, event, prop, s
-Hanlding of THERE is slightly different
-Always use MERGE, and ALPHA was never used.
-A very frequent meta predicate called "topic"
------
-Suggestion: hard-coded rules. It would be great if I can get 100% accuracy with the hard-coded rules
-Suggestion: MOST and FEW problem (what about NOT, EXACTLY, MANY): Check entailment as if it is EXIST, then apply modality rules in a separate step.
-Remember to test also on the test-set of SICK, not only the training set
RUntime errors: 363,523,2432,2434,2561,3368,3374,3433,3590,3953,3954,3955,4340,4341,4342,4343,4344,4345,4570,4571
SCRIPT: grep -E "actual:.*-|Pair" condor/afterconf-sick-test-rte-fixpermute-noir-lhsonlyintro/*.out | grep actual -B 1 | grep '\-\-' -v | tr '\n' ' ' | tr ']' '\n' | awk '{print $3 "\t" $6}' | tr ',' ' ' | sort -n
363 0.0:-3.0 Thing->BOX
523 -2.0:-2.0 Parsing error
2432 -2.0:-2.0 Parsing error
2434 -2.0:-2.0 Parsing error
2561 0.06:-5.0 rerun solved it
3368 0.0:-3.0 Longer timeout
3374 0.0:-3.0 Longer timeout
3433 0.0:-1.0 Rerun solved it
3590 0.0:-3.0 Implication with large number of constants
3953 0.0:-1.0 Rerun from local machine solved it, but breaks on Condor
3954 0.0:-1.0 Rerun from local machine solved it, but breaks on Condor
3955 0.0:-1.0 Rerun from local machine solved it, but breaks on Condor
4340 -1.0:-1.0 Undetected parsing errors, empty boxes
4341 -1.0:-1.0 Undetected parsing errors, empty boxes
4342 -1.0:-1.0 Undetected parsing errors, empty boxes
4343 -1.0:-1.0 Undetected parsing errors, empty boxes
4344 0.0:-1.0 Undetected parsing errors, empty boxes
4345 0.0:-1.0 Undetected parsing errors, empty boxes
4570 0.0:-3.0 Longer timeout
4571 -3.0:1.0 Thing->BOX
Inference errors: 86,448,812,964,976,1186,1207,1212,1237,1303,1362,1465,1528,1575,1799,1815,2179,2277,2434,2521,3084,3209,3228,3635,3676,3693,3809,4408,4502,4602 (same errors as in the Training set, like "A" vs "The", scope of negation, misparse, missing rules)
86 1:0 3:1 + 1
448 1:0 3:1 + 1
812 1:0 3:1 + 1
964 1:0 3:1 + 1
976 2:0.5 3:1 + 1
1186 2:0.5 3:1 + 1
1207 1:0 3:1 + 1
1212 2:0.5 3:1 + 1
1237 2:0.5 1:0 + 1
1303 2:0.5 1:0 + 1
1362 2:0.5 1:0 + 1
1465 2:0.5 3:1 + 1
1528 1:0 3:1 + 1
1575 2:0.5 3:1 + 1
1799 2:0.5 1:0 + 1
1815 2:0.5 1:0 + 1
2179 2:0.5 1:0 + 1
2277 2:0.5 1:0 + 1
2434 2:0.5 3:1 + 1
2521 1:0 3:1 + 1
3084 1:0 3:1 + 1
3209 1:0 3:1 + 1
3228 2:0.5 3:1 + 1
3635 1:0 3:1 + 1
3676 1:0 3:1 + 1
3693 2:0.5 3:1 + 1
3809 3:1 1:0 + 1
4408 1:0 3:1 + 1
4502 1:0 3:1 + 1
4602 1:0 3:1 + 1
-Check the error code -7 in alchemy/src/logic/clause.cpp becuase it looks like it is not needed. For example, pair 9 in prb after disabling groundExist and negating hypothesis
-Check the effect of combination-permutation vs combination only. combination only is wrong. It is applied in HardAssumptionAsEvidenceProbabilisticTheoremProver and NoExistProbabilisticTheoremProver. Disable it, run experiment, then enable it and test again. Done. Works better with same accuracy
-run experiment with -lhsOnlyIntro true. Done. Works extreamly faster with the same accuracy.
-run experiment with -lhsOnlyIntro true. Done. Works extreamly faster with the same accuracy.
-Implement the AutoTyping on the HardAssumptionAsEvidenceProbabilisticTheoremProver then run pair 25 in prb.
-Cases of domain with no constants but the default need a second look because they give none 0/1 answers. For exmple, sick-rte 4,6 and prb 29,31. Also, add to these examples the non-0/1 results of the run afterconference-sick-rte-noIR-fixPermute-lhsOnlyIntro
-examples with runtime errors: 167,288,366,412,788,790,791,792,949,2707,2709,2710,3618,3767,4476,4723,4725 (all of them are classified correctly)
Some answered, remanining: 288,366,412,788,791,792,3618,4476,4723,4725
167 0.0:-3.0 Need longer timeout
288 -1.0:0.0 At least one VEC iteration
366 -1.0:0.0 At least one VEC iteration
412 -3.0:0.0 Need longer timeout
788 0.0:-3.0 Insain number of constants because of thing->BOX. Needs the auto-typing with the skolemization
790 1.0:-1.0 Need longer timeout
791 1.0:-3.0 Insain number of constatns because of thing->BOX. Needs the auto-typing with skolemization
792 0.0:-3.0 Insain number of constatns because of thing->BOX. Needs the auto-typing with skolemization
949 -2.0:-2.0 Parsing error
2707 -2.0:-2.0 Parsing error
2709 -2.0:-2.0 Parsing error
2710 -2.0:-2.0 Parsing error
3618 -1.0:0.0 At least one VEC iteration
3767 -1.0:0.0 Need longer timeout
4476 -1.0:-1.0 Undetected Parsing error produces empty boxing
4723 -1.0:-1.0 Undetected Parsing error produces empty boxing
4725 -1.0:-1.0 Undetected Parsing error produces empty boxing
-------
Same as above with the new Boxer:
412 -3.0:0.0, Very long query (27 predicates). SampleSearch needs to call IJGP instead of VEC
667 0.0:-1.0, Works locally
703 -3.0:-3.0, Disjunction gets insane
754 0.0:-1.0, Works locally
756 0.0:-1.0, Works locally
757 0.0:-1.0, Works locally
788 0.0:-1.0, THING->BOX. Needs Skolem functions with AutoType
790 1.0:-1.0, Works locally
791 1.0:-3.0, THING->BOX. Needs Skolem functions with AutoType
792 0.0:-3.0, THING->BOX. Needs Skolem functions with AutoType
949 -2.0:-2.0, Misparse
2707 -2.0:-2.0, Misparse
2709 -2.0:-2.0, Misparse
2710 -2.0:-2.0, Misparse
3618 -3.0:0.0, Use IJGP not VEC
3624 0.0:-1.0, Very weird boxing creats a very large ground network. Memory exhausted
3626 0.0:-1.0, Memory exhausted
4111 0.0:-3.0, THING->BOX. Needs Skolem functions with AutoType
4135 0.0:-3.0, Use IJGP not VEC
4137 0.0:-1.0, Works locally
-examples that are predected to be Neutral but supposed to be otherwise are probably because lake of IR. Few examples are not in this class, and they should be carefully studied. They are: 458,642,974,1003,1145,1196,1227,1240,1371,1531,1653,1991,2049,2195,2521,3346,3738,3897,3899,4505
Details:
458 2:0.5 1:0 + 1 Unknown. "There is no cat eating corn on the cob A cat is eating some corn. Inference makes sense", annotation makes sense, do not know where the problem is.
642 2:0.5 1:0 + 1 Wrong annotation, our system gets it right
974 1:0 3:1 + 1 Misparse. What is peacefull, the girl or the lying ?
1003 2:0.5 3:1 + 1 None 0/1 Pr. No constants generated except the default, and one default is not enough to get the right result.
1145 1:0 3:1 + 1 Misparse or missing rule: r_over_dh(x, y) <=> (over_a_dh(x) ^ r_patient_dh(x, y)).
1196 1:0 3:1 + 1 Weird interaction between "A" and "The" + missing rule. With the rule, problem solved
1227 2:0.5 3:1 + 1 1)Missing rule r_onto_dh(x,y)<=>r_into_dh(x,y), 2)wrong handling of inconsistent mln, 3)Weird interaction between "A" and "The"
1240 1:0 3:1 + 1 1)Missing rule man_n_dh(x)<=>person_n_dh(x), 2)Scope of negation, the same like the weird "A", "The"
1371 1:0 3:1 + 1 1)Misparse or missing rule: man(x) ^ intensely(x) ^ agent(y,x) ^ play(y) = man(x) ^ agent(y,x) ^ play(y)^ intensely(y) 2)Negation scope like the weird interaction mentioned above
1531 1:0 3:1 + 1 1)Missing rule: wakeboard_v_dt(x)<=>waterskie_v_dh(x) 2)None 0/1 pr that can be solved by lhsIntro for negation
1653 1:0 3:1 + 1 1)Missing rule: dog_n_dh(x) => exist y r_of_dh(x,y) ^ dog_n_dh(x) ^ bull_n_dh(y), 2)Negation scope
1991 1:0 3:1 + 1 1)Misparse 2)None 0/1 pr that can be solved by lhsIntro for negation
2049 2:0.5 1:0 + 1 Unknown like 458
2195 2:0.5 1:0 + 1 Unknown like 458. Or may be it is the negation scope
2521 2:0.5 1:0 + 1 Generalized quantifier: Exaclty 2 vs Some
3346 2:0.5 3:1 + 1 Unknown
3738 1:0 3:1 + 1 Scope of negation
3897 1:0 3:1 + 1 Scope of negation
3899 1:0 3:1 + 1 Scope of negation
4505 2:0.5 1:0 + 1 Wrong annotation, our system gets it right
-----------------
With new BOXER + introduction to LHS of Negated Text, problmes are:
*)Unknonw:
458
2049
2195
3897
3899
Representative example:
-A person is not writing with a pencil A person is writing
*)A vs The
1003
1145
1227
1240
3738
565
1719
2899
Representative examples:
-A man is driving a car A man is NOT driving THE car
-A man is driving a new car A man is driving a car that is NOT new
*)Un-even Negation scope
4325
4610
4326
---------------------------
===========================
Things to try: fixed AutoType, repeat, noRelIntro, IR W * 3
-"repeat" in AutoConst is wrong. Fix it <<<<<<<<<<<
-Introduction in SetGoal should reduce number of constants somehow <<<<<<
-AutoConst is missing a case <<<<<<<<
-There are more "there" to be removed <<<<<<<
-pair 790 throws StackOverflow exception <<<<<<<<<<,
-pairs 67,260,368,3197 throw StackOverflow exception
-A false sub-formula in Q should not be removed. Try pair 1356 with and without NoExist. Done. Alchemy breaks in this case, and NoExist supposed to be avoiding these cases.
-Check the case in AutoConst where agent propagates evidence to predicates then these predicates mistakenly propagate more evidnece back to agent <<<
-Get latest vesrion of Boxer for handling of "there"
-Again, it seems that the handling of negation is wrong again. Try pair 2324
-There is a problem similar to the unicorn case in (prb-23-h|NotT)
-Output of pairs: 20-32 showing errors: [1.0000,0.0000 1.0000,0.0000 1.0000,0.0000 1.0000,-1.0000 -3.0000,0.0000 -3.0000,-1.0000 1.0000,0.0000 1.0000,-3.0000 1.0000,-3.0000 1.0000,0.0000 1.0000,0.0000 1.0000,0.0000 1.0000,0.0000]
-I think equalities are messed up in NoExistProbabilisticTheoremProver
-PositiveEqEliminatingProbabilisticTheoremProver has to be removed. THen, equality handling goes into SetGoal
-P(H|-T) takes forever for some examples like : 34,190,502,682,717
--run the following and explain why hundereds of skolem_1 are generated
alchemy/bin/infer -i /tmp/temp-3270558921426446116.mln -e /tmp/temp-3666499365217559437.db -r /tmp/temp-8210559381267654219.res -ss -maxSeconds 5 -ssq /tmp/temp-2905646586885501927.q -q entailment_h,compnay_n_dh,guy_n_dh,own_v_dh,r_patient_dh,r_agent_dh,skolem_1 -focusGround
-Add the semantic role uniqueness constraint. PSL and MLN.
-Semantic role uniqueness constraint is not enough. We actually need a NotEqual in many places in the hypothesis, in both RTE and STS.
-Semantic role uniqueness constrain added to PSL but it is not working because weight of relation predicates is zero. Fix this
-add a negation predicate in PSL-STS.
-Try multiple weights between relation predicates and entities predicates
Make PSL works for RTE. Negation in Premise is fine. Negation in hypothesis is very wrong
-----
Alchemy:
-probably I will have to generate constants in the function "propagate"
-predicates of more than two args are possible
-intersection of possible constats is wrong. It does not handle ANY right
-take source of consts into acount
-why "all birds do not fly no bird flies" should equal 1
-generate negative evidence also for the negated existentially quantified variables.
-more negative evd can be generated or a^r1^b=>c^r2^d. Arrows have conjunction.
-make the application of the negative evidence prior to converting to CNF. May be this can be done with some manipulation of the typing system.
-make sure it is strict subset
-There is a case where picking the smaller-sized type is not enough.
-Do not forget the typing problem when handling existential quantifier
-in progress two exp, condor/sick-rte-ir0-noIntro and condor/sick-rte-ir0-noIntroVerylong. Both Introduction is disabled(this messes negation and universal)
-count univ and exist to decide negate or not
-find the rule of everything, where NotExist is treated as Univ
-Do not forget the unicorn. It has examples in the SICK ds
-In sample search, it is better to go back to log10. No
-What to do with relation predicates with one univ, one exist var. Ignore it.
-Contradicting MLNs return Z=1 but it should return Z=0
-How to do negation in text and hypothesis ?
-When to negate H and when not ? (for computational efficieny). Looks like always negate
-Parsing bug in deallocation alchemy/src/parser/listobj.h. For now, deallocation disabled
-FIXED: Why IR changes affects p(-H) and p(H) differently ? (They do not).
-FIXED: Why p(H) not equal 1-p(-H) ? Beucase of bug in parsing. Bug fixed
-FIXED: try examples 24,25,27,28 for more bugs in alchemy. Done
-FIXED: Infinite loop in HardAssumptionAsEvidenceProbabilisticTheoremProver in example 13.
-FIXED: wierd handling of a formula like a(x)^b(y).
-FIXED: wrong handling of weights: infinity + w = 10 + w
-----------
SampleSearch:
-GM::reduceDomains can be better by removing functions that are always true
-what to do with LogFunction.h
-what to do with CPT.h
-what to do with SF.h
-------------------
-still, there are types problems in inference rules. I will disable recovering types for the current experiments
-paraphrase to logic
-prenex normal form
-multiparses and svm
-get latest candc and boxer
-try to tune candc parameters to reduce no-parse
-Can we use !H instead of H<->ent to avoid the problem of H- ?
-H- now is bad bad bad.
-OR in PositiveEqEliminating could be wrong
-Negation in PositiveEqEliminating and HardAssumptionAsEvidence is absolutely wrong.
1-Naming of HardAssumptionAsEvidence and PositiveEqEliminating is not decriptive
5-fixDCA or noFixDCA is not implemented in PositiveEqEliminating (well, it automatically works if "-keepUniv false")
3-Existance is done for the outer most group of universals, not for embeded univs under exists
SUggestion:
-STS is running RTE twice then average. Do not do both of them in one big MLN
how to parse multi-sentences T or H
//==========================================
Merging Cuong's fork with main(ToDo)
------------------------------
* Code of running condor is replaced with my scripts for condor. What is still missing is organizging the output and running the scripts on it.
-Different sets of scripts are required for different data sets, STS, RTE, and Baroni
* Run the system on Torrento's dataset with their phrases.
* In AlchemyTheoremProver.scala, there is a function to generate rules based on wordnet, it is commented for now but it should not
//==========================================
-Function: InferenceRuleInjectingProbabilisticTheoremProver.convertParaphraseToFOL is completely wrong.
espcially, How to map variables from LHS to RHS ? do we use a new Existential Quantifier ?
Now, the mapping is so messed up. It is as it was implemented by coung
It is even worse. Now, it assumes that the phrases are a list of conjuncted predicates. This is wrong, we have rules contain negations.
-How to generate rules for phrases. We need a principled way to extract phrases and map them to logical expressions.
Assuming we follow the technique we have now, there still a problem with it. e.g:
A little boy is playing. A fat girl is eating.
Phrases are:
little boy (correct)
fat girl (correct)
little boy play (not sure)
little play (not sure)
boy play (not sure)
fat girl eat (not sure)
girl eat (not sure)
fat eat (not sure)
For now, all of them are possible phrases.
-Locking in Lucene will prevent parallalization in condor.
-Tokenization:
Tokenization details and different modes of tokenization are in utcompling/mlnsemantics/datagen.Tokenize.scala
-Removing special characters: It is two steps,
(a) removing quotes if the text is quoted before generating inference rules.
(b) replace special characters with _ before convert to FOL in PredicateCleaningBoxerExpressionInterpreterDecorator
In all cases, candc gets somehow confused when parsing special characters.
//==========================================Future enhancements
-When supporting embeded proposition, UnnecessarySubboxRemovingBoxerExpressionInterpreter has to change (unremove theme and propositions)
-When parsing, WH questions are removed. This can be fixed here: BoxerExpressionParser
-Weighted universal rule to simulate generalized quantifiers.
//==========================================
830: takes 18 minutes: it is slightly changed to avoid this loong wait
597: type
610: type
679: type
803: type
1399: type
825: parsing (%) should be added to the forbidden char
904: parsing (#) should be added to the forbidden char
905: NotEqual: two variables unmatching types
1067: parsing (%) should be added to the forbidden char
1171: parsing (%) should be added to the forbidden char
1341: parsing (@) should be added to the forbidden char
1446: parsing (@) should be added to the forbidden char
remove predicates named topic_
remove _loc_, _nam_ ...
133: wrong parse
564: need a long paraphrasing rule
918: add a rule between two predicates if their characters are matching. E.g sew and sewing.
940: we do not fix spelling mistakes. Maybe the hack above could help
152: kengaroo. Why do not we remove _POS_ ?? THink carefully about it.
1321: people vs group of people
1024: wrong parse. pizza vs slice of pizza
245: wrong parse
852: wrong parse
951: wrong spelling of kangaroo. Slightly low weights. Solution on 918 may help a bit
DO WE ALLOW rules between non-matching POS predicates ???
963: terrible wrong parse for a very easy sentence. easy paraphrasing rule: chop up vs dice
1305: terrible wrong parse. No workaround can fix it.
95: wrong parse
1488: boxer's nasty way of doing AND
1040: wrong parse
9: crappy boxing. It needs clever code.
1131: paraprasing: apply butter = puttering, peice of bread vs bread
1449: missed correference resolution. how do handle "neuter" ?
223: wrong parse.
7: verctor space weight is very very low.
11: very low combination weights in case of very few lines
1072: try to replace all relations with REL instead of the name
r_of_dt(x, y) <=> r_on_dt(y, x)
r_and_dh(x, y) => r_and_dh(y,x)
Try BP
Try removing all rel except agent, patient
terrible overestimated in PAR: 674, 531, 541, 712
=========================================================
PSL
2-Is it better to limit number of nulls ?
3-replace "all" with "default".
4-grounding of avg, entailment_h & entailment_t >> entailment does not work correctly if one of the two predicates in the LHS is zero. It works as AND not AVG
5-Timeout: 498, 517, 664
Ill-posed optimization: 960, 1431
Zero because of error 4 above: 592, 924, 1418
============================================================
Logic:
-Handling negation is wrong. For now, it is replaced with dummy predicate
-Handling some of the equality contstraints is also wrong