Make a postprocess to handle capitalisation #67

ftyers · 2020-03-09T01:43:39Z

Capitalisation should not be done in transfer; it should be done in a postprocess, much like "recasing" in SMT.

hectoralos · 2020-03-09T07:38:26Z

At what stage exactly, and on the basis of which information? I'm thinking about how to deal with the difference between French nouns like "allemand" (the language) and "Allemand" (a person). Currently, I do this in transfer.

khannatanmai · 2020-05-12T21:03:02Z

@ftyers we can use secondary tags to propagate the case until the post-generator and then apply it there if needed.

ftyers · 2020-07-03T16:22:39Z

This is related: #75

ftyers · 2020-07-03T16:24:23Z

@hectoralos I would do it in posttransfer using the LU and perhaps a 1-2 word context window.

unhammer · 2021-04-25T17:13:20Z

@ftyers so basically using only dictionary case plus "is this a sentence end" context, and ignoring input case? We'd lose the ability to keep UPPER CASE and Titles with Titlecase but maybe that's worth the code simplification …
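
A minimal sketch of that simplification, assuming the tokens already carry their dictionary case and that ".", "!" and "?" are the only sentence-final marks. The function name `recase` and the token-list interface are hypothetical and not part of any Apertium tool:

```python
# Hypothetical sketch: recase using only dictionary case and sentence position.
# Input case is ignored; tokens are assumed to arrive in dictionary case.

SENTENCE_END = {".", "!", "?"}  # assumption: the only sentence-final marks


def recase(tokens):
    """Capitalise the first word of each sentence; leave everything else as-is."""
    out = []
    at_start = True  # the first token opens a sentence
    for tok in tokens:
        if tok in SENTENCE_END:
            out.append(tok)
            at_start = True
            continue
        if at_start and tok[:1].isalnum():
            # Only letters can be capitalised; a leading number still
            # counts as the first word of the sentence.
            if tok[0].isalpha():
                tok = tok[0].upper() + tok[1:]
            at_start = False
        out.append(tok)
    return out


print(recase(["il", "parle", "allemand", ".", "un", "Allemand", "arrive", "."]))
# ['Il', 'parle', 'allemand', '.', 'Un', 'Allemand', 'arrive', '.']
```

Note that "allemand" and "Allemand" keep their dictionary case, while any UPPER CASE or Titlecase in the input would be lost, which is the trade-off described above.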

mr-martian · 2021-04-25T18:28:15Z

lt-proc could record the original capitalization and put it in word-bound blanks, which could then be used to determine the output case.

unhammer · 2021-04-25T18:38:49Z

@mr-martian lt-proc outputs the original word form anyway, so a separate step can do the job. I actually have a branch of nno-nob that adds the tags aa/Aa/AA to all words that way (capstag.rlx runs after morphological analysis and disambiguation); the tags are removed again in transfer. I'm considering switching to this system so we can get dictionary-based correction but keep input caps (at the start of a sentence, or where there are several upper-cased words in a row), but I have to make sure it doesn't lead to regressions first.
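
A rough illustration of the aa/Aa/AA idea, written in Python rather than as the CG rules in capstag.rlx. The helper names `case_tag` and `apply_case` are hypothetical, and the treatment of single capital letters as Aa is an assumption, not something taken from the nno-nob branch:

```python
# Hypothetical sketch: derive a case tag from the input surface form and
# re-apply it to a target-language form later in the pipeline.


def case_tag(surface):
    """Classify a surface form as 'AA' (all caps), 'Aa' (initial cap) or 'aa'."""
    letters = [c for c in surface if c.isalpha()]
    # Assumption: a single capital letter counts as Aa, not AA.
    if len(letters) > 1 and all(c.isupper() for c in letters):
        return "AA"
    if surface[:1].isupper():
        return "Aa"
    return "aa"


def apply_case(tag, form):
    """Apply a case tag to a form that is already in dictionary case."""
    if tag == "AA":
        return form.upper()
    if tag == "Aa":
        return form[:1].upper() + form[1:]
    return form  # 'aa': keep dictionary case, so proper nouns stay capitalised


print(case_tag("USA"), case_tag("Oslo"), case_tag("hus"))
print(apply_case("Aa", "allemand"), apply_case("aa", "Allemand"))
```

Because the `aa` tag leaves the form untouched, the dictionary still decides the case of words like "Allemand", while `Aa` and `AA` carry over the input capitalisation.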

mr-martian · 2022-12-22T21:12:49Z

Processor added in 7e7004d

ftyers added the enhancement and help wanted labels Mar 9, 2020

ftyers mentioned this issue Jul 7, 2020

modify-case unknown expression apertium/apertium-recursive#60

Closed

mr-martian mentioned this issue Apr 25, 2021

How to keep caps? apertium/apertium-separable#36

Closed

mr-martian mentioned this issue Sep 19, 2021

Breaking change in case treatment in apertium-transfer #135

Open

mr-martian mentioned this issue Jul 17, 2022

Capitalization Post-processor #170

Merged

mr-martian closed this as completed Dec 22, 2022

unhammer added the capitalisation label Jun 18, 2024
