Syntactic behaviour should be better modelled #8
Comments
I would propose the following
For example:
<Lexicon>
  <SyntacticBehaviour id="transitive" subcategorizationFrame="Someone %s something"/>
  <LexicalEntry id="ewn-do-v" syntacticBehaviour="transitive">
    ...
    <!-- Ideally we either indicate syntactic behaviour on the entry OR the sense... no need to do both -->
    <Sense id="sense1" syntacticBehaviour="transitive"/>
  </LexicalEntry>
</Lexicon> |
I think otherwise I agree (although can we call it |
Correct me if I am wrong, but a single sense should be able to have multiple values for |
Use IDREFS (note the S), meaning a sense can have multiple verb frames |
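As a rough illustration of the IDREFS idea, the sketch below resolves a whitespace-separated list of frame ids on a sense and reports any id that has no matching definition. The attribute name `subcat`, the frame ids, and the helper `frames_for_senses` are assumptions made for this example only; the fragment is trimmed and not a complete lexicon.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the attribute name "subcat" and the frame ids are
# illustrative assumptions, not a confirmed part of the schema.
XML = """
<Lexicon id="example">
  <SyntacticBehaviour id="intransitive" subcategorizationFrame="Someone %s"/>
  <SyntacticBehaviour id="transitive" subcategorizationFrame="Someone %s something"/>
  <LexicalEntry id="ewn-do-v">
    <Sense id="sense1" subcat="intransitive transitive"/>
  </LexicalEntry>
</Lexicon>
"""


def frames_for_senses(root):
    """Map each sense id to the frame strings its IDREFS value points at."""
    frames = {
        sb.get("id"): sb.get("subcategorizationFrame")
        for sb in root.iter("SyntacticBehaviour")
    }
    result = {}
    for sense in root.iter("Sense"):
        # An IDREFS value is a whitespace-separated list of ids.
        refs = sense.get("subcat", "").split()
        missing = [ref for ref in refs if ref not in frames]
        if missing:
            raise ValueError(f"{sense.get('id')}: unknown frame ids {missing}")
        result[sense.get("id")] = [frames[ref] for ref in refs]
    return result


sense_frames = frames_for_senses(ET.fromstring(XML))
print(sense_frames)
```

A validating XML parser would normally check dangling IDREFS against the DTD itself; the explicit check here only covers tools that read the file without validation.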
Yes, I was proposing using We could certainly use |
Two things here:
|
I agree that they should only be specified on senses or synsets (where all
senses in the synset share the same syntactic behaviour).
On Mon, Aug 24, 2020 at 5:36 PM Michael Wayne Goodman wrote:
Two things here:
1. (emphasis added)
   The tag <SyntacticBehaviour> can now *also* appear under the <Lexicon> tag
   The <LexicalEntry> element can now (in #29) take a subcat attribute. Why should we continue to allow <SyntacticBehaviour> elements to be defined in <LexicalEntry> elements? If you're concerned about backward compatibility, can we at least deprecate the old pattern (e.g., document it as such, tools can generate a warning) and then properly remove it in the future?
2. Actually, what is the purpose of allowing subcat frames on both lexical entries and senses? Is the intuition that a frame on a lexical entry is shared by all its senses? If so, instead of adding this layer of interpretation onto the data, why don't we just be explicit and specify the frames on senses only?
--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University
|
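The "deprecate, then remove" suggestion quoted above (tools generating a warning for the old pattern) could look roughly like the following in a reader. This is only a sketch: the function name `check_deprecated_frames` and the warning text are made up for illustration, and the sample fragment is trimmed rather than a valid lexicon.

```python
import warnings
import xml.etree.ElementTree as ET


def check_deprecated_frames(root):
    """Warn for each <LexicalEntry> that still nests <SyntacticBehaviour>."""
    flagged = []
    for entry in root.iter("LexicalEntry"):
        # find() only looks at direct children, which is the old pattern.
        if entry.find("SyntacticBehaviour") is not None:
            warnings.warn(
                f"{entry.get('id')}: <SyntacticBehaviour> inside <LexicalEntry> "
                "is deprecated; define it under <Lexicon> instead",
                DeprecationWarning,
                stacklevel=2,
            )
            flagged.append(entry.get("id"))
    return flagged


# Hypothetical old-style fragment used only to trigger the warning.
OLD_STYLE = """
<Lexicon id="example">
  <LexicalEntry id="ewn-do-v">
    <Sense id="sense1"/>
    <SyntacticBehaviour subcategorizationFrame="Someone %s something" senses="sense1"/>
  </LexicalEntry>
</Lexicon>
"""

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = check_deprecated_frames(ET.fromstring(OLD_STYLE))

print(flagged, [str(w.message) for w in caught])
```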
Why not only in senses to avoid extra confusion? Even if all senses of a given synset have the same syntactic behavior |
Whether on To be more precise, here's what I (and @arademaker, it seems) are proposing (
--- a/WN-LMF-1.0.dtd
+++ b/WN-LMF-1.0.dtd
@@ -2,7 +2,7 @@
 <!ELEMENT LexicalResource (Lexicon+)>
 <!ATTLIST LexicalResource
     xmlns:dc CDATA #FIXED "http://purl.org/dc/elements/1.1/">
-<!ELEMENT Lexicon (LexicalEntry+, Synset*)>
+<!ELEMENT Lexicon (LexicalEntry+, Synset*, SyntacticBehaviour*)>
 <!ATTLIST Lexicon
     id ID #REQUIRED
     label CDATA #REQUIRED
@@ -29,7 +29,7 @@
     status CDATA #IMPLIED
     note CDATA #IMPLIED
     confidenceScore CDATA "1.0">
-<!ELEMENT LexicalEntry (Lemma, Form*, Sense*, SyntacticBehaviour*)>
+<!ELEMENT LexicalEntry (Lemma, Form*, Sense*)>
 <!ATTLIST LexicalEntry
     id ID #REQUIRED
     dc:contributor CDATA #IMPLIED
@@ -83,7 +83,8 @@
     note CDATA #IMPLIED
     confidenceScore CDATA #IMPLIED
     lexicalized (true|false) "true"
-    adjposition (a|ip|p) #IMPLIED>
+    adjposition (a|ip|p) #IMPLIED
+    subcat IDREFS #IMPLIED>
 <!ELEMENT Synset (Definition*, ILIDefinition?, SynsetRelation*, Example*)>
 <!ATTLIST Synset
     id ID #REQUIRED
@@ -211,6 +212,7 @@
     confidenceScore CDATA #IMPLIED>
 <!ELEMENT SyntacticBehaviour EMPTY>
 <!ATTLIST SyntacticBehaviour
+    id ID #REQUIRED
     subcategorizationFrame CDATA #REQUIRED
     senses IDREFS #IMPLIED>
 <!ELEMENT Count (#PCDATA)> |
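To make the effect of the diff concrete, here is a hedged sketch of a reader that accepts both encodings the patched DTD still permits: the existing `senses` IDREFS on a frame, and the new `subcat` IDREFS, which judging by the neighbouring `adjposition` and `lexicalized` attributes is presumably added to `Sense`. The helper `collect_frames` and the sample fragment are hypothetical; the fragment leaves out required children such as `Lemma`, so it would not validate.

```python
import xml.etree.ElementTree as ET


def collect_frames(root):
    """Return {sense_id: [frame, ...]} from either encoding."""
    by_sense = {}
    frame_by_id = {}
    for sb in root.iter("SyntacticBehaviour"):
        frame = sb.get("subcategorizationFrame")
        if sb.get("id"):
            frame_by_id[sb.get("id")] = frame
        # Existing pattern: the frame lists the senses it applies to.
        for sense_id in sb.get("senses", "").split():
            by_sense.setdefault(sense_id, []).append(frame)
    # Proposed pattern: the sense lists the frames it takes.
    for sense in root.iter("Sense"):
        for ref in sense.get("subcat", "").split():
            frame = frame_by_id[ref]
            frames = by_sense.setdefault(sense.get("id"), [])
            if frame not in frames:  # avoid duplicates if both forms are used
                frames.append(frame)
    return by_sense


# Hypothetical fragment mixing both patterns; not a complete lexicon.
DOC = """
<Lexicon id="example">
  <LexicalEntry id="ewn-do-v">
    <Sense id="sense1" subcat="transitive"/>
    <Sense id="sense2"/>
  </LexicalEntry>
  <SyntacticBehaviour id="transitive" subcategorizationFrame="Someone %s something"/>
  <SyntacticBehaviour id="intransitive" subcategorizationFrame="Someone %s" senses="sense2"/>
</Lexicon>
"""

frames = collect_frames(ET.fromstring(DOC))
print(frames)
```

Normalising to a single sense-to-frames mapping at load time keeps the rest of a tool independent of which of the two attributes a given file happens to use.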
There are other models like OntoLex/LMF, which model syntactic behaviour solely on the entry level. However, for the moment, I only know of wordnets that model this on the sense level so we can introduce this modelling in v1.1. If there is a demand for modelling at the entry level too later, we can easily add this. I have updated the PR. |
NB. Small note on @goodmami's version. I think to keep backwards compatibility we should still allow |
Fair enough. Is this equivalent to putting it under And relatedly, do we have a process for breaking backward compatibility (e.g., "deprecate, then remove after a year")? If we keep everything backward compatible, the format will accumulate a lot of cruft. |
On backwards compatibility, I think we should go by version numbering. e.g., 1.x is fully backwards compatible with 1.y (where x > y) but 2.0 can introduce breaking changes. |
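Read as a rule for tools, the version policy above might be checked as in this small sketch. The function name `can_read` and the "major.minor" string format are assumptions for illustration; the thread does not define how a file declares its version.

```python
def can_read(reader_version: str, file_version: str) -> bool:
    """True if a tool written for reader_version should accept file_version.

    Assumes plain "major.minor" strings such as "1.1".
    """
    r_major, r_minor = (int(part) for part in reader_version.split("."))
    f_major, f_minor = (int(part) for part in file_version.split("."))
    # Within one major version, a newer schema accepts every older document.
    # Across major versions nothing is promised, since 2.0 may break things.
    return r_major == f_major and r_minor >= f_minor


print(can_read("1.1", "1.0"))
print(can_read("1.0", "1.1"))
print(can_read("2.0", "1.1"))
```

The second call returns `False` because the policy only promises that newer 1.x schemas accept older 1.y files, not the reverse.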
That sounds good to me.
|
Closed by #38 |
@fcbond