Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #63 Consider mn and mfn genders #64

Merged
merged 4 commits into from Aug 9, 2018

Conversation

highsource
Copy link
Contributor

This will fix the #63.

This is a WIP PR, please do not merge yet.

  - Added a reproducing tests for Tetragraph (mn) and Flipchart (mfn)
  - Moved gender text parsing to a special-purpose enum.
  - Added recognition of mn
  - Added recognition of mfn
  - Small fixes in tests
@codecov-io
Copy link

codecov-io commented Aug 6, 2018

Codecov Report

Merging #64 into master will increase coverage by 0.27%.
The diff coverage is 83.87%.

Impacted file tree graph

@@             Coverage Diff             @@
##             master     #64      +/-   ##
===========================================
+ Coverage     76.03%   76.3%   +0.27%     
  Complexity     1323    1323              
===========================================
  Files           101     102       +1     
  Lines          4631    4643      +12     
  Branches        802     795       -7     
===========================================
+ Hits           3521    3543      +22     
+ Misses          895     890       -5     
+ Partials        215     210       -5
Impacted Files Coverage Δ Complexity Δ
...tl/parser/de/components/DEPartOfSpeechHandler.java 80.76% <57.14%> (+13.12%) 11 <1> (-4) ⬇️
.../ukp/jwktl/parser/de/components/DEGendersText.java 91.66% <91.66%> (ø) 5 <5> (?)
.../jwktl/parser/de/components/DEWordFormHandler.java 78.06% <0%> (+0.58%) 72% <0%> (-1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 50a37bb...529e057. Read the comment docs.

  - Renamed DEGrammaticalGendersText to DEGendersText as it is closer to
the gender text rather than grammatical genders
@highsource
Copy link
Contributor Author

This is how I'd work with genders x, 0, pl as well:

  • First parse gender text to an intermediary enum like DEGendersText.
  • Use this enum for internal purposes. It could include X, _0, PL as values.
  • Finally, when needed, convert the enum to GrammaticalGender or collection thereof.

This allows considering "raw" values during the parsing, but does not expose those in the resulting model.

@highsource
Copy link
Contributor Author

@chmeyer This PR is ready to be merged.

NM("nm", GrammaticalGender.NEUTER, GrammaticalGender.MASCULINE),
NF("nf", GrammaticalGender.NEUTER, GrammaticalGender.FEMININE),
MFN("mfn", GrammaticalGender.MASCULINE, GrammaticalGender.FEMININE, GrammaticalGender.NEUTER);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to keep compatibility to DEPartOfSpeechHandler, let's add "w" => FEM as well (see line 102 in the original code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, w just does not seem to occur anymore.

  - Added `w` to `DEGendersText` for backwards compatibility
@highsource
Copy link
Contributor Author

Added w to DEGendersText for backwards compatibility.

I think this can be merged now.

@chmeyer chmeyer merged commit 5937f5d into dkpro:master Aug 9, 2018
@highsource highsource deleted the bugfix/63-Consider-mn-mfn-genders branch December 19, 2018 09:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants