Syntactic behaviour should be better modelled #8

jmccrae · 2019-11-06T08:45:23Z

Why don't we add syntactic behavior to senses (and possibly synsets), which is where it is in PWN?
It should not be on the lexical entry, ...

jmccrae · 2020-07-30T10:12:30Z

I would propose the following

The tag <SyntacticBehaviour> can now also appear under the <Lexicon> tag
<Sense> and <LexicalEntry> can refer to syntactic behaviours by ID

For example

<Lexicon>
  <SyntacticBehaviour id="transitive" subcategorizationFrame="Someone %s something"/>
  <LexicalEntry id="ewn-do-v" syntacticBehaviour="transitive">
     ...
    <!-- Ideally we either indicate syntactic behaviour on the entry OR the sense... no need to do both -->
    <Sense id="sense1" syntacticBehaviour="transitive"/>
  </LexicalEntry>
</Lexicon>
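
A minimal sketch of how a tool might resolve such a reference, using Python's standard `xml.etree.ElementTree`. The attribute name `syntacticBehaviour` follows the example above; the helper `frame_for_sense` is hypothetical, and the elided `...` content and the comment are left out so the snippet parses.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example above (the elided "..." content is omitted).
LEXICON_XML = """
<Lexicon>
  <SyntacticBehaviour id="transitive" subcategorizationFrame="Someone %s something"/>
  <LexicalEntry id="ewn-do-v" syntacticBehaviour="transitive">
    <Sense id="sense1" syntacticBehaviour="transitive"/>
  </LexicalEntry>
</Lexicon>
"""


def frame_for_sense(lexicon, sense_id):
    """Return the frame text referenced by a sense, or None (hypothetical helper)."""
    # Index the lexicon-level <SyntacticBehaviour> elements by their id.
    frames = {
        sb.get("id"): sb.get("subcategorizationFrame")
        for sb in lexicon.findall("SyntacticBehaviour")
    }
    for sense in lexicon.iter("Sense"):
        if sense.get("id") == sense_id:
            return frames.get(sense.get("syntacticBehaviour"))
    return None


lexicon = ET.fromstring(LEXICON_XML)
print(frame_for_sense(lexicon, "sense1"))
```

This handles a single reference only; it does not cover a sense that points at several frames.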

fcbond · 2020-07-31T06:04:24Z

I think <Synset> and <Sense> should have syntacticBehaviour, not <Sense> and <LexicalEntry>.

Otherwise I agree (although can we call it subCat, to make it easier to fit things on our screens?).

lmorgadodacosta · 2020-07-31T07:08:38Z

Correct me if I am wrong, but a single sense should be able to have multiple values for SyntacticBehaviour.
See, for example, here: 'give' in 02199590-v (OMW).
This being the case, wouldn't it be preferable to use nested elements instead of an attribute?

1313ou · 2020-07-31T08:54:05Z

Use IDREFS (note the S), meaning a sense can have multiple verb frames

jmccrae · 2020-07-31T09:07:11Z

Yes, I was proposing using IDREFS to give multiple links.

We could certainly use subCat as the attribute name... shorter can be better
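
A rough sketch of what the IDREFS approach could look like on the reading side: the attribute value is split on whitespace and each token is looked up among the declared frames. The attribute name `subcat`, all ids, and the frame texts are illustrative assumptions. `ElementTree` does not validate against a DTD, so dangling references are collected by hand here.

```python
import xml.etree.ElementTree as ET

# Illustrative data only: ids and frames are made up for this sketch.
LEXICON_XML = """
<Lexicon>
  <SyntacticBehaviour id="vtrans" subcategorizationFrame="Somebody %s something"/>
  <SyntacticBehaviour id="vditrans" subcategorizationFrame="Somebody %s somebody something"/>
  <LexicalEntry id="example-give-v">
    <Sense id="give-sense1" subcat="vtrans vditrans"/>
    <Sense id="give-sense2" subcat="vtrans missing"/>
  </LexicalEntry>
</Lexicon>
"""


def resolve_subcat(lexicon):
    """Map each sense id to (resolved frames, dangling references)."""
    declared = {
        sb.get("id"): sb.get("subcategorizationFrame")
        for sb in lexicon.findall("SyntacticBehaviour")
    }
    result = {}
    for sense in lexicon.iter("Sense"):
        # An IDREFS value is a whitespace-separated list of ids.
        refs = sense.get("subcat", "").split()
        frames = [declared[ref] for ref in refs if ref in declared]
        dangling = [ref for ref in refs if ref not in declared]
        result[sense.get("id")] = (frames, dangling)
    return result


resolved = resolve_subcat(ET.fromstring(LEXICON_XML))
for sense_id, (frames, dangling) in resolved.items():
    print(sense_id, frames, dangling)
```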

goodmami · 2020-08-24T09:21:13Z

Two things here:

1. (emphasis added)

The tag <SyntacticBehaviour> can now *also* appear under the <Lexicon> tag

The <LexicalEntry> element can now (in Improve representation of sense subcategorizations #29) take a subcat attribute. Why should we continue to allow <SyntacticBehaviour> elements to be defined in <LexicalEntry> elements? If you're concerned about backward compatibility, can we at least deprecate the old pattern (e.g., document it as such, tools can generate a warning) and then properly remove it in the future?

2. Actually, what is the purpose of allowing subcat frames on both lexical entries and senses? Is the intuition that a frame on a lexical entry is shared by all its senses? If so, instead of adding this layer of interpretation onto the data, why don't we just be explicit and specify the frames on senses only?
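
As a sketch of the "tools can generate a warning" idea, assuming the nested pattern were documented as deprecated: a loader could flag every <LexicalEntry> that still contains a <SyntacticBehaviour> child. The function name, the sample ids, and the warning text are hypothetical, and the sample XML is trimmed (no <Lemma>) because `ElementTree` does not validate.

```python
import warnings
import xml.etree.ElementTree as ET

# Trimmed, made-up sample: one entry uses the old nested pattern, one does not.
OLD_STYLE_XML = """
<Lexicon>
  <LexicalEntry id="old-entry">
    <Sense id="old-sense1"/>
    <SyntacticBehaviour subcategorizationFrame="Somebody %s something" senses="old-sense1"/>
  </LexicalEntry>
  <LexicalEntry id="new-entry">
    <Sense id="new-sense1"/>
  </LexicalEntry>
</Lexicon>
"""


def warn_on_entry_level_frames(lexicon):
    """Warn once per entry that nests <SyntacticBehaviour>; return the flagged ids."""
    flagged = []
    for entry in lexicon.iter("LexicalEntry"):
        if entry.find("SyntacticBehaviour") is not None:
            warnings.warn(
                f"<SyntacticBehaviour> inside <LexicalEntry id={entry.get('id')!r}> "
                "is treated as deprecated in this sketch",
                DeprecationWarning,
                stacklevel=2,
            )
            flagged.append(entry.get("id"))
    return flagged


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    flagged = warn_on_entry_level_frames(ET.fromstring(OLD_STYLE_XML))
print(flagged, [str(w.message) for w in caught])
```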

fcbond · 2020-08-28T03:17:13Z

I agree that they should only be specified on senses or synsets (where all senses in the synset share the same syntactic behaviour).

arademaker · 2020-08-28T03:56:16Z

Why not only in senses, to avoid extra confusion? Even if all senses of a given synset have the same syntactic behavior.

goodmami · 2020-08-28T04:20:18Z

[...] or synsets (where all senses in the synset share the same syntactic behaviour).

Whether on <LexicalEntry> or <Synset>, this kind of interpretation needs to be implemented by the software and isn't explicit in the data.
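
To make that interpretation layer concrete, here is a sketch of the expansion a tool would have to perform if frames could be referenced on the entry as well. The inheritance rule (an entry-level reference applies to every sense of that entry) is the assumed intuition under discussion, not something the format specifies; the ids are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: one reference on the entry, one on a single sense.
LEXICON_XML = """
<Lexicon>
  <LexicalEntry id="e1" subcat="f1">
    <Sense id="s1" subcat="f2"/>
    <Sense id="s2"/>
  </LexicalEntry>
</Lexicon>
"""


def effective_subcat(lexicon):
    """Expand entry-level references onto every sense of that entry."""
    expanded = {}
    for entry in lexicon.iter("LexicalEntry"):
        inherited = entry.get("subcat", "").split()
        for sense in entry.findall("Sense"):
            own = sense.get("subcat", "").split()
            # Own references first, then inherited ones not already listed.
            expanded[sense.get("id")] = own + [r for r in inherited if r not in own]
    return expanded


print(effective_subcat(ET.fromstring(LEXICON_XML)))
```

With references on senses only, none of this logic is needed: the data already states which frames each sense has.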

To be more precise, here's what I (and @arademaker, it seems) are proposing (subcat only on <Sense>) for LMF:

--- a/WN-LMF-1.0.dtd
+++ b/WN-LMF-1.0.dtd
@@ -2,7 +2,7 @@
 <!ELEMENT LexicalResource (Lexicon+)>
 <!ATTLIST LexicalResource
     xmlns:dc CDATA #FIXED "http://purl.org/dc/elements/1.1/">
-<!ELEMENT Lexicon (LexicalEntry+, Synset*)>
+<!ELEMENT Lexicon (LexicalEntry+, Synset*, SyntacticBehaviour*)>
 <!ATTLIST Lexicon
     id ID #REQUIRED
     label CDATA #REQUIRED
@@ -29,7 +29,7 @@
     status CDATA #IMPLIED
     note CDATA #IMPLIED
     confidenceScore CDATA "1.0">
-<!ELEMENT LexicalEntry (Lemma, Form*, Sense*, SyntacticBehaviour*)>
+<!ELEMENT LexicalEntry (Lemma, Form*, Sense*)>
 <!ATTLIST LexicalEntry
     id ID #REQUIRED
     dc:contributor CDATA #IMPLIED
@@ -83,7 +83,8 @@
     note CDATA #IMPLIED
     confidenceScore CDATA #IMPLIED
     lexicalized (true|false) "true"
-    adjposition (a|ip|p) #IMPLIED>
+    adjposition (a|ip|p) #IMPLIED
+    subcat IDREFS #IMPLIED>
 <!ELEMENT Synset (Definition*, ILIDefinition?, SynsetRelation*, Example*)>
 <!ATTLIST Synset
     id ID #REQUIRED
@@ -211,6 +212,7 @@
     confidenceScore CDATA #IMPLIED>
 <!ELEMENT SyntacticBehaviour EMPTY>
 <!ATTLIST SyntacticBehaviour
+  id ID #REQUIRED
   subcategorizationFrame CDATA #REQUIRED
   senses IDREFS #IMPLIED>
 <!ELEMENT Count (#PCDATA)>
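
A sketch of how existing data might be migrated to the layout in this diff: nested <SyntacticBehaviour> elements are hoisted to the end of <Lexicon>, given an id, and referenced from each sense via `subcat`. The `frame-N` id scheme and the function name are assumptions of this sketch. Frames without a `senses` attribute are hoisted but not linked to any sense, since how those should be interpreted is not settled in this thread.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed sample in the old layout (no <Lemma>; ElementTree does not validate).
OLD_XML = """
<Lexicon id="ex" label="Example">
  <LexicalEntry id="e1">
    <Sense id="s1"/>
    <Sense id="s2"/>
    <SyntacticBehaviour subcategorizationFrame="Somebody %s something" senses="s1 s2"/>
    <SyntacticBehaviour subcategorizationFrame="Somebody %s" senses="s2"/>
  </LexicalEntry>
</Lexicon>
"""


def migrate(lexicon):
    """Hoist entry-level frames to the lexicon and reference them from senses."""
    senses = {s.get("id"): s for s in lexicon.iter("Sense")}
    frame_ids = {}  # frame text -> generated id, so identical frames are shared
    for entry in lexicon.findall("LexicalEntry"):
        for sb in entry.findall("SyntacticBehaviour"):
            frame = sb.get("subcategorizationFrame")
            frame_id = frame_ids.setdefault(frame, f"frame-{len(frame_ids) + 1}")
            for sense_id in sb.get("senses", "").split():
                sense = senses[sense_id]
                refs = sense.get("subcat", "").split()
                if frame_id not in refs:
                    sense.set("subcat", " ".join(refs + [frame_id]))
            entry.remove(sb)
    # Shared definitions go after the entries and synsets, matching the content model.
    for frame, frame_id in frame_ids.items():
        ET.SubElement(lexicon, "SyntacticBehaviour", id=frame_id, subcategorizationFrame=frame)
    return lexicon


migrated = migrate(ET.fromstring(OLD_XML))
print(ET.tostring(migrated, encoding="unicode"))
```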

jmccrae · 2020-09-01T09:28:32Z

There are other models like OntoLex/LMF, which model syntactic behaviour solely on the entry level. However, for the moment, I only know of wordnets that model this on the sense level so we can introduce this modelling in v1.1. If there is a demand for modelling at the entry level too later, we can easily add this.

I have updated the PR.

jmccrae · 2020-09-01T09:29:45Z

NB. Small note on @goodmami's version. I think to keep backwards compatibility we should still allow <SyntacticBehaviour> to appear under <LexicalEntry>

goodmami · 2020-09-01T09:33:30Z

I think to keep backwards compatibility we should still allow <SyntacticBehaviour> to appear under <LexicalEntry>

Fair enough. Is this equivalent to putting it under <Lexicon>? That is, it only introduces a syntactic behavior that we can refer to, and doesn't carry any meaning about it being associated with the <LexicalEntry>?

And relatedly, do we have a process for breaking backward compatibility (e.g., "deprecate, then remove after a year")? If we keep everything backward compatible, the format will accumulate a lot of cruft.

jmccrae · 2020-10-05T09:59:22Z

On backwards compatibility, I think we should go by version numbering. e.g., 1.x is fully backwards compatible with 1.y (where x > y) but 2.0 can introduce breaking changes.
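
A small sketch of that rule as a check a tool could run before loading a file. The function names are hypothetical, and treating every major-version change as incompatible is a conservative assumption (2.0 *can* break things, it need not).

```python
def parse_version(version):
    """Split a 'major.minor' string into a tuple of ints."""
    major, minor = version.split(".")
    return int(major), int(minor)


def can_read(tool_version, document_version):
    """True if a tool written for tool_version should accept document_version."""
    tool_major, tool_minor = parse_version(tool_version)
    doc_major, doc_minor = parse_version(document_version)
    # Same major version, and the tool is at least as new as the document.
    return tool_major == doc_major and tool_minor >= doc_minor


print(can_read("1.1", "1.0"), can_read("1.0", "1.1"), can_read("2.0", "1.1"))
```

Comparing the parts as integers rather than strings keeps a version such as 1.10 ordered after 1.9.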

fcbond · 2020-10-06T01:44:00Z

That sounds good to me.


jmccrae · 2021-04-20T13:49:07Z

Closed by #38

jmccrae mentioned this issue Feb 27, 2020

2.10 #13

Closed

jmccrae added the enhancement label May 25, 2020

jmccrae mentioned this issue May 26, 2020

Order of words inside a synset #17

Closed

jmccrae mentioned this issue Jul 14, 2020

Validation Schema (dc + foreign keys + namespaces + sensekeys) #5

Closed

jmccrae added this to the v1.1 milestone Jul 30, 2020

jmccrae mentioned this issue Aug 4, 2020

Improve representation of sense subcategorizations #29

Merged

jmccrae closed this as completed Apr 20, 2021
