1.11 F1 score #15

Open
miguelballesteros opened this issue Sep 27, 2018 · 3 comments

@miguelballesteros
commented Sep 27, 2018

I tried evaluating a single sentence against itself and got a Smatch score greater than one (!!). Any idea why?
Thank you.

Details below:

python smatchnew/smatch/smatch.py -f q3.txt q3.txt
F-score: 1.11

cat q3.txt
# ::snt How many white settlers were living in Kenya in the 1950's ?
(l / live-01
      :ARG0 (p / person
            :ARG1-of (s / settle-03
                  :ARG1 p
                  :ARG4 c)
            :ARG1-of (w / white-02)
            :quant (a / amr-unknown))
      :location (c / country :name "Kenya")
      :time (d / date-entity :decade 1950))

@snowblink14

Owner

commented Sep 30, 2018

@miguelballesteros I think it's because smatch assumes that the same triple occurs only once. In your example, you have :ARG0 (p / person :ARG1-of (s / settle-03 :ARG1 p, which results in two identical triples <ARG1, settle, person>.

I am not sure whether this duplication is a mistake or intended behavior, but we could add a fix for graphs that contain the same triple more than once.

If you remove :ARG1 p from your example, the score will be 1.0.
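
To make the overcounting concrete, here is a minimal sketch. It is not smatch's actual code: the triple list is derived by hand from q3.txt (with `-of` relations inverted), the variable mapping is assumed to be the identity because the file is compared with itself, and `naive_match_count` and `f_score` are hypothetical helper names. Under those assumptions the duplicated triple is matched pairwise (2 x 2 = 4 matches instead of 2), giving 20 matches over 18 triples, which agrees with the 1.11 reported above.

```python
# Triples written as (relation, source, target), hand-derived from q3.txt.
# This reading of the graph is an assumption, not output from smatch.
TRIPLES = [
    # instance triples
    ("instance", "l", "live-01"),
    ("instance", "p", "person"),
    ("instance", "s", "settle-03"),
    ("instance", "w", "white-02"),
    ("instance", "a", "amr-unknown"),
    ("instance", "c", "country"),
    ("instance", "d", "date-entity"),
    # attribute triples
    ("TOP", "l", "live-01"),
    ("name", "c", "Kenya"),
    ("decade", "d", "1950"),
    # relation triples
    ("ARG0", "l", "p"),
    ("ARG1", "s", "p"),  # from :ARG1-of (s / settle-03 ...), inverted
    ("ARG1", "s", "p"),  # from the explicit :ARG1 p (the duplicate)
    ("ARG4", "s", "c"),
    ("ARG1", "w", "p"),  # from :ARG1-of (w / white-02), inverted
    ("quant", "p", "a"),
    ("location", "l", "c"),
    ("time", "l", "d"),
]


def naive_match_count(gold, pred):
    """Count every equal (gold, pred) pair, so duplicates match pairwise."""
    return sum(1 for g in gold for p in pred if g == p)


def f_score(match, n_pred, n_gold):
    """Harmonic mean of precision and recall computed from raw counts."""
    precision = match / n_pred
    recall = match / n_gold
    return 2 * precision * recall / (precision + recall)


naive = naive_match_count(TRIPLES, TRIPLES)
print(naive, len(TRIPLES), round(f_score(naive, len(TRIPLES), len(TRIPLES)), 2))

# Dropping the repeated triple (order preserved) brings the score back to 1.0.
deduped = list(dict.fromkeys(TRIPLES))
clean = naive_match_count(deduped, deduped)
print(clean, len(deduped), f_score(clean, len(deduped), len(deduped)))
```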

@miguelballesteros

Author

commented Sep 30, 2018

I see, that makes sense. I understand that this needs to occur in both the gold graph and the predicted graph; if it only happens in the predicted graph, it won't have that effect. Is this right?

@snowblink14

Owner

commented Oct 1, 2018

Currently smatch treats the gold graph and the predicted graph equally, so if the duplication happens in the predicted graph it will also cause some overcounting. Until a fix is applied, a workaround is to check your graphs for duplicate triples.
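
One way to run that check, as a rough sketch: it assumes the triples have already been extracted from each graph as (relation, source, target) tuples by whatever parser you use, and `find_duplicate_triples` is a hypothetical helper name, not part of smatch.

```python
from collections import Counter


def find_duplicate_triples(triples):
    """Return a dict mapping each repeated triple to its occurrence count."""
    counts = Counter(triples)
    return {triple: n for triple, n in counts.items() if n > 1}


# Relation triples around settle-03 from the example graph (hand-derived).
example = [
    ("ARG0", "l", "p"),
    ("ARG1", "s", "p"),  # from :ARG1-of (s / settle-03 ...), inverted
    ("ARG1", "s", "p"),  # from the explicit :ARG1 p
    ("ARG4", "s", "c"),
]

dups = find_duplicate_triples(example)
for triple, n in dups.items():
    print(f"duplicate triple {triple} occurs {n} times")
```

Running this on both the gold and the predicted graphs before scoring flags any graph whose score might be inflated.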
