Access to tagset in ShiftReduceParser #41
Comments
Is the tagset() method the most useful thing to expose? What information do you need to properly interpret the contents of the tagset, and is it already exposed? Will exposing the tagset require exposing other information simply to make it useful? |
I just need access to the raw tags in the model. I already know how to use the LanguagePack to convert them into basic categories, that I have to ignore the "@" tags because these are part of the internal binarized trees, and that I may have to strip the grammatical function from the tag (also using the LanguagePack). Knowing the tagset gives us a hint (not certainty!) as to whether a model is semantically compatible with another model. In DKPro Core, we try to extract tagset information from all models. Cf. the DKPro Core UIMA wrapper code for the Stanford parser [1]. We use the extracted tagset information to:
|
Done.
|
Thanks! :) (I guess the commit comes later). |
I recall something about pushing to GitHub no longer working. I don't
|
The script pushing to GitHub should be working again -- is the commit still not showing up? |
It is possible to tie commits to issues by including the issue number in the commit message (cf. link below). Doing so causes commits to show up in an issue. I assumed you do that, so I didn't even check the actual commit list to search for a related commit. https://guides.github.com/features/issues/ |
I see you added a |
Perhaps mistakenly I assumed Richard wanted phrasal categories. POS tags The srparser could theoretically add the list of expected tags at training John
|
I admit the way I have done it isn't great, but an srparser model does have an implicit tag set, reflecting the set of tags it was trained on. And it has proven to be a great data integrity/compatibility check to have this available. For instance, I now know that the Spanish SR parser models have a tag set incompatibility problem versus the PCFG and tagger models (perhaps because they are older?). They're missing the tags de0000, faa, fia, pe000000, vaic000, and vsic000, which are present in the latter two. |
I'm fine with the states (knownStates). Through earlier conversations with you, I (think I) know pretty well how to derive the actual tagset from those. At least I get consistent tagsets extracted across all the different parsers using different APIs (shift-reduce, pcfg, rnn, etc.). |
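The compatibility check described in this comment amounts to a set difference between two extracted tag sets. A minimal sketch, assuming both tag sets are already available as plain string sets; `TagsetDiff` and `missingFrom` are hypothetical names, not part of CoreNLP or DKPro Core, and the sample data is illustrative only:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (not a CoreNLP or DKPro Core class): compares the tag
// sets extracted from two models.
final class TagsetDiff {
    private TagsetDiff() {}

    /** Returns the tags present in {@code reference} but absent from {@code candidate}, sorted. */
    static Set<String> missingFrom(Set<String> candidate, Set<String> reference) {
        Set<String> missing = new TreeSet<>(reference);
        missing.removeAll(candidate);
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative data only: a tiny made-up subset, not the real model tag sets.
        Set<String> taggerTags = Set.of("de0000", "faa", "fia", "nc0s000");
        Set<String> srParserTags = Set.of("nc0s000");
        System.out.println("Missing from SR parser: " + missingFrom(srParserTags, taggerTags));
    }
}
```

An empty result is only a hint that two models agree on tag names; it says nothing about whether they use the tags the same way.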
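Deriving a tagset from the raw states, as discussed earlier in the thread, means dropping the internal "@" states and stripping grammatical functions. A rough sketch of that idea follows. It is an assumption-laden stand-in: the real conversion goes through the LanguagePack, whereas this code simply cuts at the first "-" (and keeps bracket tags such as -LRB- whole). `TagsetNormalizer` is a hypothetical name.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: a crude approximation of what the LanguagePack-based
// conversion does, written without any CoreNLP dependency.
final class TagsetNormalizer {
    private TagsetNormalizer() {}

    static Set<String> normalize(Collection<String> rawStates) {
        Set<String> tags = new TreeSet<>();
        for (String state : rawStates) {
            if (state.isEmpty() || state.startsWith("@")) {
                continue; // "@" states belong to the internal binarized trees
            }
            if (state.startsWith("-")) {
                tags.add(state); // e.g. -LRB-: keep whole, the dashes are part of the tag
                continue;
            }
            // Assumption: a grammatical function is appended after a dash, e.g. NP-SBJ.
            int cut = state.indexOf('-');
            tags.add(cut > 0 ? state.substring(0, cut) : state);
        }
        return tags;
    }
}
```

For example, `normalize(List.of("NP", "@NP", "NP-SBJ", "VP"))` yields a set containing only NP and VP.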
It would be nice if the ShiftReduceParser exposed a tagSet() method which would basically do a
Currently, I need to use reflection to access ShiftReduceParser.model and BaseModel.knownStates to extract the tag set.
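The reflection workaround mentioned here can be sketched roughly as follows. Only the field names model and knownStates come from this issue; `StubParser` and `StubBaseModel` are hypothetical stand-ins so the example compiles without CoreNLP, and the field types in the real classes are an assumption.

```java
import java.lang.reflect.Field;
import java.util.Set;

// Stand-ins for the real parser and model classes (assumption: the real
// classes hold the states in fields with these names).
class StubBaseModel {
    private final Set<String> knownStates = Set.of("NP", "@NP", "VP");
}

class StubParser {
    private final StubBaseModel model = new StubBaseModel();
}

final class ReflectiveTagsetAccess {
    private ReflectiveTagsetAccess() {}

    /** Reads a possibly non-public field, searching the class and its superclasses. */
    static Object readField(Object target, String name) {
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field field = c.getDeclaredField(name);
                field.setAccessible(true); // bypass Java access checks
                return field.get(target);
            } catch (NoSuchFieldException e) {
                // not declared here, try the superclass
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + name, e);
            }
        }
        throw new IllegalStateException("No field named " + name);
    }

    /** Follows parser.model.knownStates via reflection. */
    @SuppressWarnings("unchecked")
    static Set<String> knownStates(Object parser) {
        Object model = readField(parser, "model");
        return (Set<String>) readField(model, "knownStates");
    }
}
```

This kind of access breaks silently whenever a field is renamed, which is the main argument for a public accessor.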