NEEDAFFIX with both PFX and SFX #932

Open
srtxg opened this issue May 1, 2023 · 0 comments

srtxg commented May 1, 2023

Hello,

In Walloon, the beginning of a word changes depending on the phonetics of the previous word;
I implemented that with PFX rules.

Then I use various SFX rules.
For verbs I use second-level SFX rules to reduce the number of rules.

It works quite well. However, when both SFX and PFX flags are used, the stripped stem is accepted as a valid word, despite being explicitly flagged with NEEDAFFIX.
(At least that is the behaviour in version 1.7.0.)

---- x.aff ----

SET UTF-8
FLAG UTF-8

TRY ersainthocuxdlpéymbzîvjåfèwgkêôûçERSAINTHOCUXDLPÉYMBZÎVJÅFÈWGKÊÔÛ’'Ç-

NEEDAFFIX *

# "v" flag is for verbs;
# if the stem given in the dic file ends in "é" it is a verb of the 1st group (flag "1"),
# and it is also "stem A" of verbs (they can have several stems, but it is kept simple here);
# the ending "é" is stripped, but I use the "*" flag to tell that this stripped stem is not a word
SFX v Y 2
SFX v   é       /1*     é       po:v
SFX v   é       /A*     é       po:v

# rules for the 1st group of verbs: the ending "é" is added back; the form is stemA (bdjA) and past participle (p.p.)
SFX 1 Y 1
SFX 1   0       é       .       is:bdjA is:p.p.

# rules for "stemA" of verbs (here a single rule for the 1st person plural of the present tense)
SFX A Y 2
SFX A   0       ans     [^k]    is:bdjA is:pr. is:1pl
SFX A   k       cans    k       is:bdjA is:pr. is:1pl

# and prefix rule, di- can be elided to d- (eg: diné -> dné)
PFX i Y 2
PFX i   0       0       di      sp:plin
PFX i   di      d       di      sp:spotch


------- x.dic -----

2
diné/iv*        st:diner
viké/v*         st:viker

------ test.txt -----

diné
dné
dinans
dnans
din
dn

viké
vicans
vik


When only SFX is used (e.g. viké/v*), the stripped stem "vik" is correctly rejected as a word;
however, when PFX is also used (e.g. diné/iv*), the stripped stems "din" and "dn" are incorrectly accepted as valid:

$ hunspell -d x -m test.txt
diné  sp:plin st:diner
diné  st:diner po:v is:bdjA is:p.p.
diné sp:plin  st:diner po:v is:bdjA is:p.p.

dné  sp:spotch st:diner
dné sp:spotch  st:diner po:v is:bdjA is:p.p.

dinans  st:diner po:v is:bdjA is:pr. is:1pl
dinans sp:plin  st:diner po:v is:bdjA is:pr. is:1pl

dnans sp:spotch  st:diner po:v is:bdjA is:pr. is:1pl

din sp:plin  st:diner po:v   <=== wrong

dn sp:spotch  st:diner po:v <==== wrong

viké    st:viker po:v is:bdjA is:p.p.

vicans          st:viker po:v is:bdjA is:pr. is:1pl

vik

I would have expected

diné/iv* st:diner

to be equivalent to

diné/v* sp:plin st:diner
dné/v* sp:spotch st:diner

(If I write it like that, it works, but that would require rewriting a thousand lines.)
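
To make the expectation concrete, here is a small Python sketch of the expansion I have in mind. It is only a toy model of the rules in x.aff above, not Hunspell's actual algorithm: the function names, the table layout, and the choice to expand prefixes first and then treat each result as a separate entry are my own assumptions for illustration.

```python
import re

NEEDAFFIX = "*"  # same flag as in x.aff

# Toy copy of the SFX rules: flag -> (strip, add, continuation flags, condition)
SFX = {
    "v": [("é", "", "1*", "é$"), ("é", "", "A*", "é$")],
    "1": [("", "é", "", ".$")],
    "A": [("", "ans", "", "[^k]$"), ("k", "cans", "", "k$")],
}

# Toy copy of the PFX rules: flag -> (strip, add, condition)
PFX = {
    "i": [("", "", "^di"), ("di", "d", "^di")],
}


def apply_prefixes(word, flags):
    """Yield every form produced by one prefix rule allowed by the flags."""
    for flag in flags:
        for strip, add, cond in PFX.get(flag, []):
            if re.search(cond, word) and word.startswith(strip):
                yield add + word[len(strip):]


def apply_suffixes(word, flags):
    """Yield (form, continuation flags) for every reachable suffix chain."""
    for flag in flags:
        for strip, add, cont, cond in SFX.get(flag, []):
            if re.search(cond, word) and word.endswith(strip):
                form = word[:len(word) - len(strip)] + add
                yield form, cont
                # follow the continuation flags (second-level suffixes)
                yield from apply_suffixes(form, cont)


def expected_forms(word, flags):
    """Forms I would expect to be accepted for one dic entry (hypothetical model)."""
    # Assumption: each prefixed variant behaves like its own dic entry
    # carrying the same suffix flags and the same NEEDAFFIX flag.
    stems = {word} | set(apply_prefixes(word, flags))
    accepted = set()
    for stem in stems:
        if NEEDAFFIX not in flags:
            accepted.add(stem)
        for form, cont in apply_suffixes(stem, flags):
            # an intermediate form still flagged NEEDAFFIX is not a word
            if NEEDAFFIX not in cont:
                accepted.add(form)
    return accepted


valid = expected_forms("diné", "iv*") | expected_forms("viké", "v*")
for w in ["diné", "dné", "dinans", "dnans", "din", "dn", "viké", "vicans", "vik"]:
    print(w, "accepted" if w in valid else "rejected")
```

Under this model "din", "dn" and "vik" are all rejected, because the only way to reach them leaves the NEEDAFFIX flag in the continuation flags; the prefix on its own does not lift it.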

Am I missing something, or is this behaviour incorrect?

Thanks
