Different results #1

Closed
kovvalsky opened this issue Jun 15, 2016 · 1 comment

@kovvalsky

Hi, thanks for the nice work!
I installed ccg2lambda, but it gives lower results than those reported in the README. I am using candc 1.00.
Solving and evaluation completed without any error reports.

                          |   all premi.   |     single     |     multi
generalized_quantifiers   |      0.68      |      0.70      |      0.63     
plurals                   |      0.58      |      0.54      |      0.67     
adjectives                |      0.68      |      0.87      |      0.29     
comparatives              |      0.48      |      0.62      |      0.33     
attitudes                 |      0.69      |      0.78      |      0.50     
verbs                     |      0.62      |      0.62      |      ----     
total                     |      0.62      |      0.68      |      0.52 

Could you include the fracas.xml_results, fracas.xml_plain and fracas.xml_parsed directories for the reported results, so that users can compare their own results and find out where the systems differ?

For example, the reported results miss one multi-premise problem in attitudes. This is what I got for the multi-premise attitude problems:

fracas_340_attitudes    unk     unk
fracas_341_attitudes    unk     unk
fracas_343_attitudes    yes     unk
fracas_344_attitudes    yes     unk
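
The four rows above are consistent with the 0.50 shown for multi-premise attitudes in the table, provided the two columns are read as gold answer followed by system answer. The issue does not label the columns, so that reading is an assumption, and the `accuracy` helper below is a hypothetical sketch rather than ccg2lambda's own evaluation code.

```python
def accuracy(rows):
    """Fraction of rows whose two answer columns agree.

    Assumption: each row is 'problem_id  gold  system'. The columns are
    not labeled in the issue, so this layout is a guess.
    """
    pairs = [line.split()[1:3] for line in rows if line.strip()]
    correct = sum(1 for gold, system in pairs if gold == system)
    return correct / len(pairs)


# Rows copied from the listing above.
rows = [
    "fracas_340_attitudes    unk     unk",
    "fracas_341_attitudes    unk     unk",
    "fracas_343_attitudes    yes     unk",
    "fracas_344_attitudes    yes     unk",
]

print(f"{accuracy(rows):.2f}")  # prints 0.50
```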

The .err files in the results directory are mostly empty, including those for 343 and 344, but the ones for 340 and 341 are not.
fracas_340_attitudes.err is:

WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x)) with signature {'x': e, '_heart': <e,t>, "_'": e}
WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones))) with signature {'F2': <e,t>, '_heart': <e,t>, 'F1': <e,t>, 'x': e, '_jones': e, "_'": e}
WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones)),\w.TrueP) with signature {'F2': <e,t>, '_heart': <e,t>, 'F1': <e,t>, 'x': e, '_jones': e, "_'": e}
WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones)),\w.TrueP,\x.X0(\w.TrueP,\y._beat(x,y))) with signature {'F2': <e,t>, '_heart': <e,t>, '_beat': <e,<e,t>>, 'X0': <<t,e>,<<e,t>,t>>, 'F1': <e,t>, 'x': e, 'y': e, '_jones': e, "_'": e}
WARNING:root:There is probably a problem in the typecheck resolution of expression _see(_smith,_'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones)),\w.TrueP,\x.X0(\w.TrueP,\y._beat(x,y)))) with signature {'F2': <e,t>, '_heart': <e,t>, '_see': <e,t>, 'X0': <<t,e>,<<e,t>,t>>, '_beat': <e,<e,t>>, '_smith': e, 'F1': <e,t>, 'x': e, 'y': e, '_jones': e, "_'": e}

fracas_341_attitudes.err is:

WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x)) with signature {'_heart': <e,t>, 'x': e, "_'": e}
WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones))) with signature {'_jones': e, '_heart': <e,t>, 'x': e, 'F2': <e,t>, "_'": e, 'F1': <e,t>}
WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones)),\w.TrueP) with signature {'_jones': e, '_heart': <e,t>, 'x': e, 'F2': <e,t>, "_'": e, 'F1': <e,t>}
WARNING:root:There is probably a problem in the typecheck resolution of expression _'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones)),\w.TrueP,\x.X0(\w.TrueP,\y._beat(x,y))) with signature {'X0': <<t,e>,<<e,t>,t>>, '_jones': e, 'y': e, '_heart': <e,t>, 'x': e, 'F2': <e,t>, "_'": e, 'F1': <e,t>, '_beat': <e,<e,t>>}
WARNING:root:There is probably a problem in the typecheck resolution of expression _see(_smith,_'(\x._heart(x),\F1 F2.exists x.((x = _jones) & F1(_jones) & F2(_jones)),\w.TrueP,\x.X0(\w.TrueP,\y._beat(x,y)))) with signature {'_beat': <e,<e,t>>, 'F2': <e,t>, 'X0': <<t,e>,<<e,t>,t>>, '_jones': e, 'y': e, '_heart': <e,t>, '_smith': e, '_see': <e,t>, "_'": e, 'x': e, 'F1': <e,t>}
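
To find every problem whose .err file is non-empty, rather than opening the files one by one, a small script along these lines could help. It assumes only that the .err files sit directly inside the results directory, as described above; the function name `nonempty_err_files` is hypothetical.

```python
from pathlib import Path


def nonempty_err_files(results_dir):
    """Return the sorted names of .err files in results_dir that are not empty."""
    return sorted(
        path.name
        for path in Path(results_dir).glob("*.err")
        if path.stat().st_size > 0
    )
```

Calling `nonempty_err_files("fracas.xml_results")` would then list the problems worth inspecting first.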
@pasmargo
Contributor

Hi! Thank you for taking a look at ccg2lambda!

I have re-run the pipeline on a fresh installation (using candc 1.00) and I still obtain the same results as we reported in the README. I have just uploaded a new directory named fracas_intermediate_results (it contains its own README explaining what the file extensions mean) that you can use to compare your output. Please note that fracas.xml_results/main.html has a summary of the results, which may be of help.
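
One way to compare per-problem answers between two runs is sketched below. It assumes each run has been reduced to lines of the form `problem_id gold system`, like the listing quoted in this issue; that layout and both function names are hypothetical, and the files in fracas_intermediate_results may need a different parser.

```python
def parse_answers(lines):
    """Map problem id -> system answer from 'id gold system' lines (assumed layout)."""
    answers = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:
            answers[fields[0]] = fields[2]
    return answers


def differing_problems(mine, reference):
    """Return sorted ids present in both runs whose system answers differ."""
    shared = mine.keys() & reference.keys()
    return sorted(pid for pid in shared if mine[pid] != reference[pid])
```

For instance, with `mine = parse_answers(["fracas_343_attitudes yes unk"])` and `reference = {"fracas_343_attitudes": "yes"}`, `differing_problems(mine, reference)` returns `["fracas_343_attitudes"]`.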

In my run, fracas_343_attitudes appears in the results directory with the entailment result "yes", along with its HTML file. However, I also get the same warnings in fracas_340_attitudes.err and fracas_341_attitudes.err. They seem to be semantic parsing errors (inaccuracies in our templates or wrong type resolutions) that we would like to fix someday.

Sometimes I also get lower results than usual. In those cases, the reason is that I forgot to activate the Python 3 virtual environment or to compile coqlib.v. Please check whether that happened to you as well. If it did not, send us the files related to the problems for which you get different results, and I will try to find the cause.
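
Both checks can be automated before running the pipeline. The sketch below assumes that coqlib.v lives in the directory passed as `repo_dir` and that compiling it with coqc leaves a coqlib.vo file next to it; the function names are hypothetical and not part of ccg2lambda.

```python
import sys
from pathlib import Path


def in_virtualenv():
    """True if the interpreter is running inside a venv or virtualenv."""
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead of base_prefix.
    return sys.prefix != base or hasattr(sys, "real_prefix")


def coqlib_compiled(repo_dir="."):
    """True if coqlib.vo exists and is not older than coqlib.v (assumed location)."""
    source = Path(repo_dir) / "coqlib.v"
    compiled = Path(repo_dir) / "coqlib.vo"
    if not compiled.exists():
        return False
    if not source.exists():
        return True
    return compiled.stat().st_mtime >= source.stat().st_mtime
```

A run script could refuse to start unless `sys.version_info[0] == 3`, `in_virtualenv()` and `coqlib_compiled()` all hold.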

Thank you again!
Pascual

@pasmargo pasmargo self-assigned this Jun 16, 2016