avoid clash between Tag as class and Tag as type #33

Open
andrewufrank opened this issue Jan 14, 2018 · 3 comments
Comments

@andrewufrank
I have difficulties using chatter when I extend it with other POS tags, and had to copy the data Tag .. definition verbatim into my code. The name Tag is used multiple times: first as a class, and then as a type in Conll and Brown. Exporting Tag(..) from both NLP.Types and NLP.Corpora.Conll produces the clash.

I would recommend renaming the class to Tags and the types to ConllTag and BrownTag.
(By the way, NLP.Corpora.Conll has two qualified imports that both use the alias T: Data.Text and NLP.Types.Tags.)
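
A minimal sketch of what the proposed renaming could look like, assuming a single module and only base. The names Tags, ConllTag and BrownTag come from the suggestion above; the method fromTag, the use of String, the helper describe and the handful of constructors are assumptions for illustration, not chatter's actual definitions. The constructors are picked so that they do not overlap; if the real tag sets share constructor names, those would still need qualification.

```haskell
-- Sketch only: Tags, ConllTag and BrownTag follow the proposal in this
-- issue; they are not chatter's current API.

-- | The class, renamed so it no longer shares a name with any tag type.
class Tags a where
  fromTag :: a -> String  -- String instead of Text, to stay within base

-- | A few illustrative constructors; the real tag sets are much larger.
data ConllTag = PRP | WRB
  deriving (Eq, Show)

data BrownTag = AT | PPS
  deriving (Eq, Show)

instance Tags ConllTag where
  fromTag = show

instance Tags BrownTag where
  fromTag = show

-- | Hypothetical helper: the class and both tag types are used
-- unqualified in the same scope without any name clash.
describe :: (ConllTag, BrownTag) -> String
describe (c, b) = fromTag c ++ "/" ++ fromTag b
```

Loaded into GHCi, `describe (PRP, AT)` evaluates to `"PRP/AT"`.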
thank you!
andrew

@creswick
Owner

Can you share the code you're working on? I'd like to see how you ran into the clash with types.

My intent was that the tag types would either be used qualified (since you need to know which tag type you're using) or through the type class API (since if you don't know the specific type, you can't rely on anything more specific). However, that's skewed by my perspective: it probably makes more sense while working on the library than in a specific app, where you likely use only one tag type.

When adding a new POS tag set, you will have to duplicate the data Tag (although you can call it whatever you want). That data structure represents the tag set: the difference between two POS tag sets is grounded in that type. E.g., Conll and Brown have dramatically different sets of POS tags, and as such, they have dramatically different Tag data types.
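
The two points above (a dedicated data type per tag set, and generic code going through the type class) might be sketched roughly as follows. Everything here is a hypothetical stand-in: the class TagLike, its methods, the made-up MiniTag tag set and both helper functions are assumptions for illustration and are not taken from chatter.

```haskell
import Data.Maybe (fromMaybe)

-- | Hypothetical stand-in for the library's tag class; the method
-- names are assumptions, not chatter's actual API.
class (Eq a, Show a) => TagLike a where
  fromTag  :: a -> String
  parseTag :: String -> a
  tagUNK   :: a

-- | A made-up tag set, standing in for a newly added POS tag set.
data MiniTag = Noun | Verb | Unk
  deriving (Eq, Show, Enum, Bounded)

instance TagLike MiniTag where
  fromTag Noun = "N"
  fromTag Verb = "V"
  fromTag Unk  = "UNK"
  -- Look the string up among all constructors; fall back to the unknown tag.
  parseTag s =
    fromMaybe tagUNK (lookup s [(fromTag t, t) | t <- [minBound .. maxBound]])
  tagUNK = Unk

-- | Generic code: relies only on the class, so it works for any tag set.
countKnown :: TagLike t => [t] -> Int
countKnown = length . filter (/= tagUNK)

-- | Specific code: names a constructor, so it is tied to one tag set.
isNoun :: MiniTag -> Bool
isNoun = (== Noun)
```

Here countKnown never mentions MiniTag and would work unchanged for another instance, while isNoun only makes sense for this one tag set, which is the situation where importing the concrete type (qualified or not) is unavoidable.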

I can probably help more if I could take a look at your code! I want to be sure I understand how you're using the library before making changes.

Thanks for pointing out the qualified imports in Conll! I'll fix that shortly.

@andrewufrank
Author

andrewufrank commented Jan 14, 2018

Dear creswick, thank you for looking into it. I understand how to add POS tag sets and how to make them instances of the class Tag. Obviously I choose different names for each tag set (at the moment I am trying to find out which tag set a coreNLP model uses :-) and building them).
The clash occurs when I have a module in which I import the Tag (type) and the Tag (class) and then want to re-export both, to reduce the number of imports needed elsewhere. The export list uses the qualified names, but (as I understand it) what is actually exported are the unqualified names, hence the clash. (I do not remember the GHC error message, sorry.) I avoided the issue by duplicating your Conll tag set definition under a different name.
My code is a bit large and not up to date on GitHub (yet); I will come back when I have the additional tag sets, which you could incorporate.

@andrewufrank
Author

I have a follow-up question: how should one deal with combined POS tags like
EplusRD | -- dalla
At the moment my code requires a defined enumeration value for each combination; these are recognized from E+RD with a readTag function similar to the one you wrote (replacing + with plus). I am not sure whether this is satisfactory, and wonder whether one should instead recognize each tag separately and return a set of tags [E,RD]. What is your opinion?
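
For comparison, the second option (recognizing each part separately) could look roughly like the sketch below. It is only an illustration under assumptions: AtomTag, splitPlus and readCombined are hypothetical names, only the two tags from the example are defined, and the derived Read instance stands in for whatever readTag does in the real code.

```haskell
-- | Hypothetical atomic tags; only the two from the example are shown.
data AtomTag = E | RD
  deriving (Eq, Show, Read)

-- | Split a string on '+', so "E+RD" becomes ["E", "RD"].
splitPlus :: String -> [String]
splitPlus s = case break (== '+') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitPlus rest

-- | Parse a possibly combined tag into its parts.
-- Returns Nothing if any part is not a known atomic tag.
readCombined :: String -> Maybe [AtomTag]
readCombined = traverse readAtom . splitPlus
  where
    readAtom str = case reads str of
      [(t, "")] -> Just t
      _         -> Nothing
```

With this, `readCombined "E+RD"` gives `Just [E,RD]` and a plain `readCombined "RD"` gives `Just [RD]`, so no separate constructor such as EplusRD is needed for each combination. The trade-off is that every consumer then has to handle a list of tags per token rather than a single tag.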
