two neg.ifi paradigms #12

Open
jonorthwash opened this issue Mar 30, 2019 · 8 comments

@jonorthwash
Member

Similar to #10, Kazakh has the issue of two neg.ifi paradigms.

First-person singular (neg.ifi.p1.sg) looks like this:

  • мен барған жоқпын
  • мен бармадым

The question is whether these two forms differ in usage or are identical. The answer will determine what needs to be done in the transducer.

@IlnarSelimcan
Member

I'm not sure whether handling analytical tense forms like барған жоқпын in transducers and not in transfer is a good idea.

@IlnarSelimcan
Member

IlnarSelimcan commented Mar 30, 2019

Especially if the transducer's not weighted and it will just take one analysis in a greedy manner and go on with that.

@IlnarSelimcan
Member

IlnarSelimcan commented Mar 30, 2019

"Мен ол кітапты көрген де, алған да, оқыған да жоқпын." My knowledge of Tatar suggests to me that this sentence should be valid in Kazakh. If so, why would we treat GAN жоқ in the morph. transducer? There is simply not enough information at this stage. That's only my opinion.

@jonorthwash
Member Author

> Especially if the transducer's not weighted and it will just take one analysis in a greedy manner and go on with that.

I don't think there's ever ambiguity with these forms...?

> "Мен ол кітапты көрген де, алған да, оқыған да жоқпын." My knowledge of Tatar suggests to me that this sentence should be valid in Kazakh.

This is basically the same problem as suspended affixation in Turkish, e.g. "Ben şu kitabı göre de okuyabiliyorum." (@MemduhG halp?) or maybe "Ben şu kitabı göriyor da okuyorum" (??). Or a different sort of example might be "Ben kitaplar okur, makaleler yazarım" (??). Maybe have a look at how we ~deal with this problem in apertium-tur.

> If so, why would we treat GAN жоқ in the morph. transducer?

Linguistically it makes some sense to just throw in a <neg> tag, especially if your primary goal is analysis and not translation. The morphology gets messier if you treat it as two separate words. What might you propose, though?

> There is simply not enough information at this stage. That's only my opinion.

What do you mean about not enough information?

@IlnarSelimcan
Member

You might find this to be a contrived example, but I think that it demonstrates what I'm trying to say:

"<Мұнда>"
	"бұл" prn dem loc
	"мұнда" adv
	"е" cop aor p3 pl
		"бұл" prn dem loc
	"е" cop aor p3 sg
		"бұл" prn dem loc
"<бұл>"
	"бұл" det dem
	"бұл" prn dem nom
	"е" cop aor p3 pl
		"бұл" prn dem nom
	"е" cop aor p3 sg
		"бұл" prn dem nom
"<кітапты>"
	"кітап" n acc
	"лы" post
		"кітап" n
"<оқыған>"
	"оқы" v tv past p3 sg
	"е" cop aor p3 sg
		"оқыған" adj subst nom
	"е" cop aor p3 pl
		"оқыған" adj subst nom
	"е" cop aor p3 sg
		"оқыған" adj
	"е" cop aor p3 pl
		"оқыған" adj
	"оқыған" adj subst nom
	"оқы" v iv past p3 sg
	"оқы" v iv past p3 pl
	"оқы" v iv gpr_past subst nom
	"оқы" v tv gpr_past
	"оқы" v tv past p3 pl
	"оқы" v tv gpr_past subst nom
	"оқыған" adj advl
	"оқы" v iv ger_past nom
	"оқы" v tv ger_past nom
	"оқыған" adj
	"оқы" v iv gpr_past
"<адам>"
	"ада" n px1sg nom
	"адам" n nom
	"адам" n attr
	"е" cop aor p3 pl
		"ада" n px1sg nom
	"е" cop aor p3 sg
		"ада" n px1sg nom
	"е" cop aor p3 pl
		"адам" n nom
	"е" cop aor p3 sg
		"адам" n nom
"<бар ма>"
	"ма" qst
		"бар" adj
	"ма" qst
		"бар" n nom
	"ма" qst
		"бар" adj subst nom
	"ма" qst
		"е" cop aor p3 pl
			"бар" adj
	"ма" qst
		"е" cop aor p3 sg
			"бар" adj
	"ма" qst
		"е" cop aor p3 pl
			"бар" n nom
	"ма" qst
		"е" cop aor p3 sg
			"бар" n nom
	"ма" qst
		"е" cop aor p3 pl
			"бар" adj subst nom
	"ма" qst
		"е" cop aor p3 sg
			"бар" adj subst nom
"<?>"
	"?" sent
"<Жоқ>"
	"жоқ" ij
	"жоқ" adj
	"жоқ" adj subst nom
	"е" cop aor p3 pl
		"жоқ" adj
	"е" cop aor p3 sg
		"жоқ" adj
	"е" cop aor p3 pl
		"жоқ" adj subst nom
	"е" cop aor p3 sg
		"жоқ" adj subst nom
"<,>"
	"," cm
"<оқыған жоқ>"
	"оқы" v tv neg ifi p3 pl
	"оқы" v tv neg ifi p3 sg
	"оқы" v iv neg ifi p3 pl
	"оқы" v iv neg ifi p3 sg
"<.>"
	"." sent

@IlnarSelimcan
Member

IlnarSelimcan commented Apr 8, 2019

You simply don't know whether <GAN жоқ> is neg ifi or not without seeing the full sentence. In the above example, the transducer took the longest analysis and the gpr_subst analysis is gone.

A better way would be to give only the GAN form (not including joq) the neg.ifi analysis, and to answer the "neg.ifi or not" question later in CG.
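
The longest-match effect described here can be illustrated with a small sketch. This is a toy model, not the actual lttoolbox or HFST implementation: the lexicon entries, the `analyse_longest_match` function, and the `<...>` tag spelling are all hypothetical and only loosely modelled on the output above.

```python
# Toy lexicon: surface form -> analyses. Entries are illustrative only
# and are NOT taken from apertium-kaz.
LEXICON = {
    "оқыған": [
        "оқы<v><tv><gpr_past>",
        "оқы<v><tv><gpr_past><subst><nom>",
        "оқы<v><tv><past><p3><sg>",
    ],
    "жоқ": ["жоқ<adj>", "жоқ<ij>"],
    # Multiword entry, as if the analytic negative were in the transducer.
    "оқыған жоқ": ["оқы<v><tv><neg><ifi><p3><sg>"],
}


def analyse_longest_match(tokens, lexicon, max_len=2):
    """Greedy left-to-right lookup that always prefers the longest entry."""
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            surface = " ".join(tokens[i:i + n])
            if surface in lexicon:
                out.append((surface, lexicon[surface]))
                i += n
                break
        else:
            # Unknown token: mark it and move on.
            out.append((tokens[i], ["*" + tokens[i]]))
            i += 1
    return out


result = analyse_longest_match(["оқыған", "жоқ"], LEXICON)
```

With the multiword entry present, `result` holds a single unit for "оқыған жоқ", so the `gpr_past` readings of "оқыған" never reach a later disambiguation step. Removing that entry from the toy lexicon yields two units, each with its full set of readings.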

@IlnarSelimcan
Member

IlnarSelimcan commented Apr 8, 2019

In my opinion, an even better way would be not to diverge from the "one affix = one tag" principle (which I think is much more approachable for most users of a morphological analyser): rather than giving the GAN form yet another competing analysis, add a monolingual, sl-to-sl transfer rule which maps `^GAN$ ^жоқ$` to `^GAN<neg.ifi>$`.
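
A minimal sketch of such a merge step, written in Python rather than as an actual Apertium transfer rule. It assumes the input is already disambiguated to one reading per token, that the GAN reading carries a `<gpr_past>` tag, and that `<neg><ifi>` replaces that tag in place; all three are assumptions made for illustration, and `parse` and `merge_neg_ifi` are hypothetical names.

```python
import re

TAG = re.compile(r"<([^>]+)>")


def parse(unit):
    """Split 'оқы<v><tv><gpr_past>' into ('оқы', ['v', 'tv', 'gpr_past'])."""
    lemma = unit.split("<", 1)[0]
    return lemma, TAG.findall(unit)


def merge_neg_ifi(units, gan_tag="gpr_past"):
    """Merge a GAN form immediately followed by жоқ into one neg.ifi unit."""
    out = []
    i = 0
    while i < len(units):
        lemma, tags = parse(units[i])
        next_is_joq = i + 1 < len(units) and parse(units[i + 1])[0] == "жоқ"
        if gan_tag in tags and next_is_joq:
            # Assumption: <neg><ifi> takes the place of the GAN tag.
            new_tags = []
            for t in tags:
                new_tags.extend(["neg", "ifi"] if t == gan_tag else [t])
            out.append(lemma + "".join(f"<{t}>" for t in new_tags))
            i += 2  # consume both the GAN form and жоқ
        else:
            out.append(units[i])
            i += 1
    return out


merged = merge_neg_ifi(["кітап<n><acc>", "оқы<v><tv><gpr_past>", "жоқ<adj>"])
```

The sketch only handles strict adjacency. It deliberately leaves open how person marking on жоқпын would be carried over, and how coordinated forms such as "көрген де, алған да, оқыған да жоқпын" would be matched.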

@jonorthwash
Member Author

jonorthwash commented Aug 22, 2020

I think your examples are okay, though I'm probably not the person to ask.

So what do you propose for the two analyses of a form like "оқыған жоқ"? And what analysis would you propose for a form like оқыған жоқпын?
