interpretation of TOP property #25

oepen · 2020-01-05T12:22:18Z

when comparing SMATCH results with the scorer from the 2019 CoNLL Shared Task on Cross-Framework Meaning Representation Parsing (MRP), we discovered that SMATCH will only consider the TOP property correct if the node labels also match. this appears to double-penalize for label mismatches and is maybe not the intended behavior? for more technical detail and a minimal test case, please see the MRP mtool issue.

The text was updated successfully, but these errors were encountered:

goodmami · 2020-01-06T08:16:39Z

Good catch. Here is an even-more-minimal test case:

$ cat i25a
(a / alpha
   :ARG0 (b / beta))
$ cat i25b  # only top concept different
(a / alternative
   :ARG0 (b / beta))
$ cat i25c  # only non-top concept different
(a / alpha
   :ARG0 (b / b-side))
$ python smatch.py --pr -f i25a i25b
Precision: 0.50
Recall: 0.50
F-score: 0.50
$ python smatch.py --pr -f i25a i25c
Precision: 0.75
Recall: 0.75
F-score: 0.75

When smatch is run with -v it shows how it considers the top triple an "attribute":

[...]
Instance triples of AMR 1: 2
[('instance', 'a0', 'alpha'), ('instance', 'a1', 'beta')]
Attribute triples of AMR 1: 1
[('TOP', 'a0', 'alpha')]
[...]

Note that the concept alpha is part of that TOP triple, when all it should probably do is indicate the top node. I tried making the attribute triple into a simple cycle on the top node (i.e., so the triple looks like ('TOP', 'a0', 'a0')) and seem to have resolved the ~~error~~ difference:

--- a/amr.py
+++ b/amr.py
@@ -418,7 +418,7 @@ class AMR(object):
             relation_list.append(node_rel_list)
             attribute_list.append(node_attr_list)
         # add TOP as an attribute. The attribute value is the top node value
-        attribute_list[0].append(["TOP", node_value_list[0]])
+        relation_list[0].append(["TOP", node_name_list[0]])
         result_amr = AMR(node_name_list, node_value_list, relation_list, attribute_list)
         return result_amr

Results in:

$ python smatch.py --pr -f i25a i25b
Precision: 0.75
Recall: 0.75
F-score: 0.75

Now it gets the score of 0.75 for both when the top label is different (i25b) and when the non-top label is different (i25c). In addition, it gets 0.75 when i25a is compared to the following variant which has the same triples but a different top:

(b / beta
   :ARG0-of (a / alpha))

oepen · 2020-01-12T21:20:30Z

i believe it has not consequences for the overall F1 score, but in my view the TOP property would better be considered a node attribute rather than a relation between a node and itself. treating it as an attribute would avoid (vacuously) stipulating a self-loop in the graph, and it preserves a one-to-one correspondence between relation triples and actual graph edges.

goodmami · 2020-01-13T01:02:41Z

If that matters to you, then here's a diff that creates an attribute triple instead. It is probably equally unlikely to be confused with an actual graph triple (since we have no role called TOP). It leads to the same scores reported above for the TOP relation triple, at least for the 4 test files defined above.

--- a/amr.py
+++ b/amr.py
@@ -421,7 +421,7 @@ class AMR(object):
             relation_list.append(node_rel_list)
             attribute_list.append(node_attr_list)
         # add TOP as an attribute. The attribute value is the top node value
-        attribute_list[0].append(["TOP", node_value_list[0]])
+        attribute_list[0].append(["TOP", 'top'])
         result_amr = AMR(node_name_list, node_value_list, relation_list, attribute_list)
         return result_amr

oepen · 2020-05-28T17:04:10Z

hi again, @goodmami: the above looks just right to me (treating the TOP property as an attribute with a vacuous value). would you be set up to submit that as a pull request?

i am wondering whether we could try to work through some more of the known issues and hope to arrive at a re-release of an improved SMATCH sometime over the next few weeks?

goodmami · 2020-05-29T01:22:28Z

Yes, I can put together a PR for that, but for a re-release what do people think of replacing amr.py with Penman? Doing so would solve several existing issues here. It's on PyPI, and I've recently released a 1.0 version to go with my ACL 2020 demo of the library. The 1.0 version isn't so much a big change from the previous version as it is a commitment to support its current API.

Fixes snowblink14#25

* Add tests/ subdirectory with one test file * Make TOP relation value constant Fixes #25 * Clarify CHANGELOG entry

goodmami added a commit to goodmami/smatch that referenced this issue May 29, 2020

Make TOP relation value constant

57b61f6

Fixes snowblink14#25

goodmami mentioned this issue May 29, 2020

Make TOP relation's value constant #30

Merged

snowblink14 closed this as completed in #30 May 30, 2020

snowblink14 pushed a commit that referenced this issue May 30, 2020

Make TOP relation's value constant (#30)

842bc9f

* Add tests/ subdirectory with one test file * Make TOP relation value constant Fixes #25 * Clarify CHANGELOG entry

rafaelanchieta mentioned this issue Jun 10, 2020

TOP problem #36

Closed

goodmami mentioned this issue Nov 28, 2023

difference with Smatch flipz357/smatchpp#8

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

interpretation of TOP property #25

interpretation of TOP property #25

oepen commented Jan 5, 2020 •

edited

Loading

goodmami commented Jan 6, 2020 •

edited

Loading

oepen commented Jan 12, 2020

goodmami commented Jan 13, 2020

oepen commented May 28, 2020

goodmami commented May 29, 2020

interpretation of TOP property #25

interpretation of TOP property #25

Comments

oepen commented Jan 5, 2020 • edited Loading

goodmami commented Jan 6, 2020 • edited Loading

oepen commented Jan 12, 2020

goodmami commented Jan 13, 2020

oepen commented May 28, 2020

goodmami commented May 29, 2020

oepen commented Jan 5, 2020 •

edited

Loading

goodmami commented Jan 6, 2020 •

edited

Loading