Not shortest relations path for Sk language #3819

Open
slavkoja opened this issue Apr 2, 2021 · 51 comments

Comments

@slavkoja

slavkoja commented Apr 2, 2021

In this case, the relations path is the shortest one in English, but it differs from the path shown in Slovak (and in Czech too):

2021-04-02-162229_587x376_scrot

In English it shows "partner’s second cousin" (red path), but in Slovak it shows "vnuk od teta od svokor" (BTW, a horrible form), which can be roughly translated as "father-in-law's aunt's grandson" (blue path). It took me some hours to check the translation files, but then I found that it comes from the code, where I got lost in the shortest path selection...

@ric2016
Contributor

ric2016 commented Apr 2, 2021

This works as designed: The actual path is the same in both cases (a path of length 6, as indicated by the dotted lines in your image, i.e. S1-C1-P1-GP1-GP2-P2-C2).

The algorithm actually works like this: It determines for each language which term that describes this path has the fewest characters. It does not (as apparently expected) break down the path into a minimal number of sub-paths (determined according to some criteria) and then translates those.

In case of English, both approaches lead to the same result: "partner’s second cousin" is the term with the fewest characters, and also the combined term for the two components 'partner' and 'second cousin'.

In case of Slovak, you'd get a term with more characters for the second approach (something like 'partner od druhostupňová sesternica'?). For more complex relationships, this approach becomes infeasible.
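The effect described above can be reproduced with a small sketch. It is not the webtrees code: the function name `shortestName` and the candidate lists are assumptions for illustration, and the terms are simply the ones quoted in this thread. It keeps whichever candidate string is shortest, per language.

```php
<?php
// Candidate names for the same path, per language (terms quoted in this thread).
$candidates = [
    'en' => ["partner's second cousin", "father-in-law's aunt's grandson"],
    'sk' => ['partner od druhostupňová sesternica', 'vnuk od teta od svokor'],
];

// Hypothetical helper: keep the candidate with the fewest bytes (strlen).
function shortestName(array $names): string
{
    $best = '';
    foreach ($names as $name) {
        if ($best === '' || strlen($name) < strlen($best)) {
            $best = $name;
        }
    }
    return $best;
}

echo shortestName($candidates['en']), PHP_EOL; // partner's second cousin
echo shortestName($candidates['sk']), PHP_EOL; // vnuk od teta od svokor
```

The same rule picks the "red" description in English and the "blue" one in Slovak, purely because of how long the words happen to be.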

That said, improving the language-specific relationship names has been an open issue for a long time now, see e.g. here: #2331.

@slavkoja
Author

slavkoja commented Apr 3, 2021

I misused the term "relations path", but it seems that you understood my point. You are right that the translation happens via "%1$s’s %2$s" -> "%2$s od %1$s" (it took me a long time to find this).

I am not sure if I understand the last paragraph properly. Do you mean that in English "partner’s second cousin" is chosen because it is a shorter string than "father-in-law's aunt's grandson", and in Slovak it is vice versa?

If yes, then IMO it is a bad algorithm/condition and must lead to unpredictable results in different languages, as the length of the term depends on the words themselves, not on the relation. Nobody would describe this relation by the blue path, simply because it is horribly complicated (in fact, I understood it only after I drew the image). IMO we need to count the "path" steps (blue vs red) used to construct the string, not the length of the string itself.

In other words, the term "druhostupňová sesternica" must have the same weight (preference) as "vnuk" or "teta", because each describes one person. But "druhostupňová sesternica" (a one-step relation) must have a higher weight than "vnuk od teta" (a two-step relation), because one step is shorter than two steps, regardless of the string length... (eh, terrible for me to describe at all, especially in English)
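A minimal sketch of this proposal, under the assumption that each candidate is available as a list of named components rather than as one finished string. The function name `fewestComponents` and the data layout are hypothetical. The number of components decides, and string length is used only to break ties.

```php
<?php
// Hypothetical helper: prefer the candidate built from the fewest named
// components; compare string length only when the counts are equal.
function fewestComponents(array $candidates): array
{
    $best = null;
    foreach ($candidates as $parts) {
        $key = [count($parts), strlen(implode(' ', $parts))];
        if ($best === null || $key < $best['key']) {
            $best = ['key' => $key, 'parts' => $parts];
        }
    }
    return $best === null ? [] : $best['parts'];
}

// The two Slovak descriptions from this issue, split into components.
$slovak = [
    ['vnuk', 'teta', 'svokor'],               // three components (blue path)
    ['druhostupňová sesternica', 'partner'],  // two components (red path)
];

$chosen = fewestComponents($slovak);
```

With this ordering the two-component description wins even though its words are longer, which is the behaviour asked for above.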

Anyway, the construct "partner od druhostupňová sesternica" is grammatically bad, but I do not want to discuss this; I understand the limitations...

@ro-la
Contributor

ro-la commented Apr 3, 2021

Anyway, the construct "partner od druhostupňová sesternica" is grammatically bad, but I do not want to discuss this; I understand the limitations...

I agree with you - it looks odd, and nobody in the real world would use such a grammatical construction. This translation was inspired by the Czech translation. Some languages use only a simple arrow. There is no really good solution for the translation.

But I agree with Slavko that the length of a string alone is not the best solution.

Ladislav

@ric2016
Contributor

ric2016 commented Apr 3, 2021

I am not sure if I understand the last paragraph properly. Do you mean that in English "partner’s second cousin" is chosen because it is a shorter string than "father-in-law's aunt's grandson", and in Slovak it is vice versa?

Yes, that's how the algorithm works currently. I guess it has been chosen because it works (somewhat) for all languages. It wasn't easily possible in earlier webtrees versions to use different algorithms for different languages.

It would be possible now (in webtrees 2.x) to choose the actual algorithm per language, or even offer a choice of different algorithms. Once we have defined a proper interface for this functionality, it will be possible to develop language-specific (custom) modules for this.

(I'm not sure though how to best handle the language-specific relationship terms - I don't think this can be done via weblate, so these terms would probably be part of the language-specific module, and only be editable there)

@ro-la
Contributor

ro-la commented Apr 3, 2021

On the fly I have played a little bit with the Slovak translations:

| English | Slovak (OK) | Slovak (may be) | Slovak (OK) |
|---|---|---|---|
| (2nd) cousin | "bratranec" | | "druhostupňový bratranec" |
| (2nd) cousin’s wife | "bratrancova žena" | "žena bratranca" | "žena druhostupňového bratranca" |
| (2nd) cousin’s wife’s son | "syn bratrancovej ženy" | "syn ženy bratranca" | "syn ženy druhostupňového bratranca" |
| 2nd cousin’s wife’s son’s partner | "partner syna bratrancovej ženy" | "partner syna ženy bratranca" | "partner syna ženy druhostupňového bratranca" |

It seems a better "translation" for "%1$s’s %2$s" would be something like "%2$s GENITIVE(%1$s)" for the 1st level of recursion and "GENITIVE(%2$s) GENITIVE(%1$s)" for deeper calls. I have no idea how this recursion works, so it may not be so easy to decide which is the very first (or last) part of the string, the only one that should be in the nominative. The other parts of the string should be in the genitive. I know this is not only a matter of translation, but it is perhaps easier than a completely new module. And this could also be helpful for other Slavic languages.
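A rough sketch of such a genitive joiner, assuming the path is already split into named terms. The function `genitiveChain` and the `$forms` lexicon are hypothetical. The word forms come from the last column of the table above, except the genitive of "partner", which is an assumption and is not used in the examples.

```php
<?php
// Hypothetical lexicon: [nominative, genitive] per English term.
$forms = [
    'second cousin' => ['druhostupňový bratranec', 'druhostupňového bratranca'],
    'wife'          => ['žena', 'ženy'],
    'son'           => ['syn', 'syna'],
    'partner'       => ['partner', 'partnera'], // genitive assumed, not from the thread
];

// $path is in English possessive order, e.g. "second cousin's wife's son".
function genitiveChain(array $path, array $forms): string
{
    $words = [];
    $last = count($path) - 1;
    foreach ($path as $i => $term) {
        // Only the final relative stays in the nominative.
        $words[] = $forms[$term][$i === $last ? 0 : 1];
    }
    // The Slovak word order is the reverse of the English possessive chain.
    return implode(' ', array_reverse($words));
}

echo genitiveChain(['second cousin', 'wife', 'son', 'partner'], $forms), PHP_EOL;
// partner syna ženy druhostupňového bratranca
```

Because the joiner sees the whole list at once, it does not need to know how deep the recursion is; the nominative slot is simply the last element.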

Ladislav

@ric2016
Contributor

ric2016 commented Apr 3, 2021

Using the genitive case for specific languages has also been suggested before. See Greg's reply here: #947 (comment). This was about five years ago - there has been no progress at all on these issues since then unfortunately. For a while, it was at least on the 2.1 milestone list, but it has been indefinitely postponed apparently.

The interface I mentioned earlier may not even be that complex. Instead of using getRelationshipNameFromPath from functions.php, we need an additional method in ModuleLanguageInterface (or LocaleInterface?). Its default implementation via ModuleLanguageTrait could use the existing implementation.
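To make the shape of that idea concrete, here is a standalone sketch. All names in it (`RelationshipNamer`, `DefaultRelationshipNamer`, `EnglishNamer`, `SlovakNamer`, `relationshipName`) are hypothetical and are not the webtrees API; the default body is only a stand-in for the existing implementation.

```php
<?php
// Hypothetical interface: one method a language module may override.
interface RelationshipNamer
{
    /** @param string[] $path named components, e.g. ['partner', 'cousin'] */
    public function relationshipName(array $path): string;
}

// Hypothetical trait providing the default, language-independent behaviour.
trait DefaultRelationshipNamer
{
    public function relationshipName(array $path): string
    {
        // Stand-in for the existing generic implementation.
        return implode("'s ", $path);
    }
}

// A language that is happy with the default just uses the trait.
class EnglishNamer implements RelationshipNamer
{
    use DefaultRelationshipNamer;
}

// A language with its own rules replaces the method.
class SlovakNamer implements RelationshipNamer
{
    public function relationshipName(array $path): string
    {
        return implode(' od ', array_reverse($path));
    }
}

echo (new EnglishNamer())->relationshipName(['partner', 'cousin']), PHP_EOL; // partner's cousin
```

Languages that need nothing special pay no cost, while others can swap in their own algorithm without touching shared code.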

@fisharebest: Any thoughts on this?

@slavkoja
Author

slavkoja commented Apr 4, 2021

I guess it has been chosen because it works (somewhat) for all languages.

I am curious which languages those "all languages" are. It doesn't work at least for Slovak and Czech (and I am afraid for other Slavic languages too), thus your statement is not right. I looked at the statistics page and I see that most installations are in the USA and DE. Do you mean English and German as "all languages"?

(I'm not sure though how to best handle the language-specific relationship terms - I don't think this can be done via weblate, so these terms would probably be part of the language-specific module, and only be editable there)

weblate is not the issue here. I am not a translator (though I collaborate with ours). The issue is the bad algorithm: as I stated before, the string length is not a measure, at least not a universal one.

@ric2016
Contributor

ric2016 commented Apr 4, 2021

It doesn't work at least for Slovak and Czech

Again, it works as designed. We don't disagree there could be better solutions! There are different aspects to be addressed, the algorithm itself is just one of them. I would be interested in tackling this, but it won't be possible without Greg's input.

@ro-la
Contributor

ro-la commented Apr 4, 2021

I think the idea of using the length of the strings is not a very good one.

For example, a very simple change would in my opinion be better: use the number of blanks instead of the length of the strings. At least for Slovak it gives better results.

So I changed line 2400 (close to the very end) of the file Functions.php (in the folder /app/Functions)

from

    if (!$relationship || strlen($tmp) < strlen($relationship)) {
        $relationship = $tmp;
    }

to

    if (!$relationship || substr_count($tmp, ' ') < substr_count($relationship, ' ')) {
        $relationship = $tmp;
    }

Ladislav

@slavkoja
Author

slavkoja commented Apr 5, 2021

I am afraid that counting spaces is not universal either. What about "bratranec z 3. kolena"? It has more spaces than "vnuk od tety"...
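The two criteria can be compared side by side with a small sketch. The helper `pick` and the two cost closures are hypothetical; only `strlen` and `substr_count` are taken from the change proposed above, and the sample strings are quoted from this thread.

```php
<?php
// Hypothetical helper: keep the name with the lowest cost.
function pick(array $names, callable $cost): ?string
{
    $best = null;
    foreach ($names as $name) {
        if ($best === null || $cost($name) < $cost($best)) {
            $best = $name;
        }
    }
    return $best;
}

$byLength = fn (string $s): int => strlen($s);
$bySpaces = fn (string $s): int => substr_count($s, ' ');

// Case from the original report: the two criteria disagree.
$report = ['vnuk od teta od svokor', 'partner od druhostupňová sesternica'];
// Counterexample above: a one-step term that contains three spaces.
$counter = ['bratranec z 3. kolena', 'vnuk od tety'];

echo pick($report, $byLength), PHP_EOL;  // vnuk od teta od svokor
echo pick($report, $bySpaces), PHP_EOL;  // partner od druhostupňová sesternica
echo pick($counter, $bySpaces), PHP_EOL; // vnuk od tety
```

Counting spaces fixes the reported case, but it still prefers the two-step "vnuk od tety" over the one-step "bratranec z 3. kolena", so it depends on how each single term happens to be worded.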

@ric2016
Contributor

ric2016 commented Apr 5, 2021

Regarding the algorithm, I think the original suggestion is more suitable than counting spaces. Anyway, I'm working on all of this over here as long as there is no progress on this in webtrees itself. I hope to have a first version (for German) available as a showcase shortly.

@ro-la
Contributor

ro-la commented Apr 5, 2021

Regarding the algorithm, I think the original suggestion is more suitable than counting spaces.

It depends on how the string "%1$s’s %2$s" is translated. In English (and German) it adds only one blank; in other languages (Slovak, Czech, ...) it adds 2 blanks. For such languages it will give better results, and the longer the relationship, the better. Of course, counting the sub-paths is far better.

I am afraid that counting spaces is not universal either. What about "bratranec z 3. kolena"? It has more spaces than "vnuk od tety"...

I am not saying it is universal, but I am sure that for some languages it gives better results, and for the rest it is no worse. And AFAIK a "3rd cousin" is "directly translated", so this will not happen. And a second thing: when a translator knows that the length of the string or the count of blanks is important for the algorithm, the term can be changed. "bratranec z 3. kolena" can also be "treťostupňový bratranec" or even "3-stupňový bratranec".

Anyway, I'm working on all of this over here as long as there is no progress on this in webtrees itself. I hope to have a first version (for German) available as a showcase shortly.

This is very good news. If you need a tester for other languages ...

@slavkoja
Author

slavkoja commented Apr 6, 2021

IMO, because the translation happens in "%1$s’s %2$s", there must be some recursion (I didn't check the code; I would get lost in it anyway). The simplest approach seems to be to count the recursion depth, which can be truly universal, to find the shortest path (relation).

@ric2016
Contributor

ric2016 commented Apr 6, 2021

@slavkoja: I have already implemented an improved algorithm, and an option to use more grammatical genitive-based constructions. What's left to do is to create the language-specific files, e.g. in a format similar to the file for German.

We should continue the discussion here.

@fisharebest
Owner

fisharebest commented Apr 6, 2021

@ric2016 - I have tried a similar approach to yours before.

Here are some issues that you might consider...

The relationship and relationship-name can depend on marriage.

For example, the relationship between myself and my girlfriend's brother depends on whether we are unmarried, married or divorced.

unmarried => partner's brother
married => brother-in-law
divorced => ex-brother-in-law

To solve this, I used closures instead of strings for the names.
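A minimal sketch of what "closures instead of strings" could look like. The status values, the function `resolveName`, and the variable names are assumptions for illustration; only the three resulting names come from the list above.

```php
<?php
// A name that depends on marriage status is stored as a closure.
// The status strings are hypothetical.
$partnersBrother = function (string $status): string {
    switch ($status) {
        case 'married':
            return 'brother-in-law';
        case 'divorced':
            return 'ex-brother-in-law';
        default:
            return "partner's brother";
    }
};

// Hypothetical resolver: plain strings are returned unchanged,
// closures are called with the extra context.
function resolveName($name, string $status): string
{
    return $name instanceof Closure ? $name($status) : $name;
}

echo resolveName($partnersBrother, 'married'), PHP_EOL; // brother-in-law
echo resolveName('brother', 'married'), PHP_EOL;        // brother
```

Checking for `Closure` rather than using `is_callable` avoids treating a plain name such as 'count' as a function to call.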

The relationship name can depend on pedigree.

e.g. father, foster-father, adopted-great-grandfather, etc.

Also, the relationship name can depend on the sex of the first individual. In some languages (Polish?) the name for a man's uncle is different to the name for a woman's uncle.

I am very interested to see what progress you make.

@slavkoja - the example above ("partner's brother") shows why it would be difficult to count recursion depths. This relationship name already contains two parts.

If we can remove all the "exotic" names which exist for only one language, then I think we can use a non-recursive solution. Simply match as much of the first part of the relationship as possible.

For very long paths, the recursive algorithm to find the "best" name is slower than the code to find the relationship.

@fisharebest
Owner

there is no progress on this in webtrees itself

I have been working on this intermittently. But I do not have a solution that is good enough to commit to the main repository.

@fisharebest
Owner

@ro-la, @ric2016 - Is it true that the pattern "GENITIVE GENITIVE GENITIVE ... NOMINATIVE" is used for all languages?

@ric2016
Contributor

ric2016 commented Apr 6, 2021

Is it true that the pattern "GENITIVE GENITIVE GENITIVE ... NOMINATIVE" is used for all languages?

My goal is to split this joiner from the rest of the code. The joiner can then be implemented per language, as required. By the way, in German it's more grammatical to use a reversed construction i.e. "NOMINATIVE ... GENITIVE GENITIVE" ("husband of the cousin of the partner" rather than "partner's cousin's husband") but that is just a detail (drawback is that the respective diagram has to be parsed from right to left when using this construction, which may be confusing as well).

For very long paths, the recursive algorithm to find the "best" name is slower than the code to find the relationship.

If we define the criteria differently, the recursive algorithm performs better as well. In the first modification, it is first attempted to split in specific places only (before/after spouse), because the resulting sub-paths are often available directly (without further recursion).

The result preserves common-ancestor-based relationships, so that we actually get "partner’s second cousin" rather than e.g. "father-in-law's aunt's grandson" in all languages regardless of string length (see the original issue).
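The splitting step can be sketched as follows. This is a simplified illustration, not the Vesta code: the function `splitAtSpouse` and the step names are assumptions, and it cuts at every spouse step rather than trying a few specific places first.

```php
<?php
// Hypothetical helper: cut a path into sub-paths before and after each
// 'spouse' step, so that blood-relative runs stay together.
function splitAtSpouse(array $path): array
{
    $parts = [];
    $current = [];
    foreach ($path as $step) {
        if ($step === 'spouse') {
            if ($current !== []) {
                $parts[] = $current;
                $current = [];
            }
            $parts[] = ['spouse'];
        } else {
            $current[] = $step;
        }
    }
    if ($current !== []) {
        $parts[] = $current;
    }
    return $parts;
}

// The path of length 6 from the original issue (S1-C1-P1-GP1-GP2-P2-C2).
$path = ['spouse', 'parent', 'parent', 'sibling', 'child', 'child'];
$parts = splitAtSpouse($path);
// $parts[0] is the partner step; $parts[1] is the common-ancestor run
// that can be named directly as a second cousin.
```

Each sub-path can then be looked up as a whole, which is why the common-ancestor relationship survives regardless of how long its name is in a given language.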

Also, the relationship name can depend on the sex of the first individual.

I have this case covered - I'm not sure about the other cases though (pedigree/marriage): This additional information is currently not used either, except for simple relationships (path length of one). So the input to the function (currently implemented in Functions::getRelationshipNameFromPath) would have to be extended.

@slavkoja
Author

slavkoja commented Apr 6, 2021

@ro-la, @ric2016 - Is it true that the pattern "GENITIVE GENITIVE GENITIVE ... NOMINATIVE" is used for all languages?

No, at least in our language it is impossible to generalize, as words change between noun, pronoun, or even adjective as the genitives chain.

@ric2016
Contributor

ric2016 commented Apr 6, 2021

@slavkoja from the example given in the table above, it seemed that it would be possible to chain, as only the second part of the overall term has to be adjusted in each step? Or is that a special case?

@ro-la
Contributor

ro-la commented Apr 6, 2021

@ro-la, @ric2016 - Is it true that the pattern "GENITIVE GENITIVE GENITIVE ... NOMINATIVE" is used for all languages?

I don't know exactly, but I think this should work in Slavic languages, perhaps in all European languages. On the other hand, Slovak also has other patterns for complex relationships, with possessive adjectives, which sound better, but the pattern is far more complex. See the column Slovak (OK) in #3819 (comment); this form is, for example, gender sensitive.

Also, the question (from @slavkoja here) "Do we really need a description for very complex relationships?" should be discussed. Wouldn't it be better to simply say "This relationship is too complex - see the picture"? Or, when there is no picture, a text string "INDI1 simple_relationship1-2 INDI2 simple_relationship2-3 INDI3 ..... simple_relationship(n-1)-n INDIN"?

@ro-la
Contributor

ro-la commented Apr 6, 2021

it seemed that it would be possible to chain, as only the second part of the overall term has to be adjusted in each step? Or is that a special case?

I would say the GENITIVE chain is acceptable. For "long" chains it is even good. For "short" chains the possessive adjectives sound better, but some of them are defined directly, not through the recursive routine.

But I hope @slavkoja would also agree that the "genitive chain" is a better solution than what we have today.

@slavkoja
Author

slavkoja commented Apr 7, 2021

IMO, the whole discussion is going in the wrong direction. We cannot solve the genitive problem until we have a decision about the underlying shortest relation path identification. The currently implemented algorithm is bad, and this one case with the Sk and Cs languages shows that the localized string length cannot be used as a measure. The genitive discussion should be filed as a separate issue, dependent on the result of this one.

From my point of view, the depth of the "translation" recursion can be measured directly on the English string to get a better decision. But I have contacted my mathematician friends (graph theory), and I hope to get an answer soon. I am not sure whether they will be able to help with this directly in PHP, as none of us is a PHP person, but I hope they can help to move this forward.

@ric2016
Contributor

ric2016 commented Apr 7, 2021

IMO, the whole discussion is going in the wrong direction. We cannot solve the genitive problem until we have a decision about the underlying shortest relation path identification

These are separate issues - both can be handled independently (I know this issue originally was only about one of them).

Again, I have already implemented an improved solution for the path algorithm, which seems to work well. There is no need to over-complicate this aspect.

@hartenthaler

[Translated from German] I am writing in German because the topic is very complex. I think the approach of generating relationship terms per language is not so good. Perhaps one would first have to define which kinship system is to be used. We use the Eskimo system (cognatic, bilateral descent), but there are others (the Hawaiian, Iroquois, Crow/Omaha, and Sudanese systems). Within these systems there are further variants per language, but grouping into kinship systems might help to give the variety some structure. I find these helpful:

  • wikipedia: Verwandtschaftsbezeichnungen_in_Moieties
  • wikipedia: Verwandtschaftsterminologie
  • wikipedia Verwandtschaftssystem
  • Kinship Terminologies
  • Atlas der Verwandschaftssysteme

I think the kinship system to be used depends on the user, but the user would be overwhelmed if asked to decide which kinship system (Eskimo, Hawaiian, ...) they follow. Perhaps the kinship system customary in a language can be inferred from the user's language. It may not be worth the effort, though, because at most one percent of webtrees installations (according to the server statistics) are in countries with a system other than the Eskimo system, and I assume the share of users is of a similar order of magnitude. For practical reasons, one could then simply start from the Eskimo system, and the operators of servers in, say, Turkey and China, or with a focus on Jewish relatives, would have to come up with something special.

@ric2016
Contributor

ric2016 commented Apr 12, 2021

The first version of the new approach is now available in Vesta release 2.0.15.1.0 (module "Extended Relationships" in particular).

To solve this, I used closures instead of strings for the names.
The relationship name can depend on pedigree.

@fisharebest Thank you for your input - I have kept the common cases, but a closure/callback is now used internally, and can also be used directly to implement additional strategies of arbitrary complexity. As an example of this, I have implemented modified step-father etc. relationships (in the English language version) which take into account the date of marriage vs the child's birthdate, as suggested on the forum a while ago.

@hartenthaler Additional kinship systems are an interesting but rather academic topic (at least, as you point out, for the large majority of users). In any case we need a solution for the existing issues, which definitely require a language-specific approach. That's what this is about - Everything beyond is a different issue that should be handled separately, if at all.

@fisharebest
Owner

@slavkoja - I have now found my original solution to this problem, and further developed it.

Relationships are defined using code like this:

Relationship::fixed('aunt', 'aunt’s %s')->parent()->sister(),
Relationship::fixed('aunt', 'aunt’s %s')->parent()->sibling()->wife(),
Relationship::fixed('uncle', 'uncle’s %s')->parent()->brother(),
Relationship::fixed('uncle', 'uncle’s %s')->parent()->sibling()->husband(),

We can distinguish between married and unmarried partners.
->wife() is a shortcut for ->married()->partner()->female()

Variable relationships are defined using some callback functions. e.g.

$cousin = function () { ... };
$great = function () { ... };
Relationship::dynamic(fn (int $n) => $great($n - 2, 'granddaughter'))->descendant()->female(),
Relationship::dynamic(fn (int $n) => $great($n - 2, 'grandson'))->descendant()->male(),
Relationship::dynamic(fn (int $n) => $great($n - 2, 'grandchild'))->descendant(),
Relationship::dynamic($cousin)->ancestor()->sibling()->descendant(),

It matches relationships using a "longest substring" algorithm.
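The body of `$great` is elided in the definitions above. As a guess at its shape only, here is one possible English version, assuming that the first argument is the number of "great-" prefixes and that `$n` in the dynamic definitions counts generations. This is an assumption, not the actual webtrees code.

```php
<?php
// Assumed body for the elided closure: prefix the base name with
// "great-" repeated $n times (negative counts are treated as zero).
$great = function (int $n, string $base): string {
    return str_repeat('great-', max(0, $n)) . $base;
};

// With the call shape $great($n - 2, 'grandson') used above:
$generations = 4;
echo $great($generations - 2, 'grandson'), PHP_EOL; // great-great-grandson
```

A translation for another language would supply its own closure, which is where numbering schemes that differ from English could be handled.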

Are you able to help me by providing definitions for Slovak?

@ric2016
Contributor

ric2016 commented Apr 14, 2021

@fisharebest have you looked at the code in the 'Extended Relationship' module? It seems rather similar regarding general concepts. We don't have to start to duplicate development on this now after years of no progress? The current Vesta release has a finished solution for English, German and Slovak.

Start e.g. here.

@ro-la
Contributor

ro-la commented Apr 15, 2021

@slavkoja - I have now found my original solution to this problem, and further developed it.

Are you able to help me by providing definitions for Slovak?

As Ric wrote, the Slovak definitions are in LanguageSlovakExt.php. Some translations can be discussed (for some relationships we have two equivalent names), and perhaps I forgot to add some relationships. Step/adopted relationships are (intentionally) minimised, which can also be discussed. But I am very happy that with this approach we are no longer using the "n-th cousin y-removed" relationship.

Of course, if you need further assistance with the Slovak translation, I can help as well.

Ladislav

@slavkoja
Author

@fisharebest I can, of course, though I am not sure what you want from me.

I am still afraid about the decision by string length. Is it a problem to count the recursion depth? When I return to the initial image, the red "path" has 2 recursions (relations), but the blue path has 3. The shortest one is the one with less recursion depth...

@ric2016
Contributor

ric2016 commented Apr 15, 2021

@slavkoja

I am still afraid about the decision by string length.

There is a working solution available in Vesta based on criteria like the one you suggest. I'd be interested in your feedback!

@fisharebest
Owner

have you looked at the code in the 'Extended Relationship' module?

I have looked at the code now...
If we have both found similar solutions, then I guess it must be a good technique ;-)
But I want to test my code with some different languages before I upload it to github.

I am still afraid about the decision by string length. Is it a problem to count the recursion depth?

I am currently matching the longest substring with a named-translation. (This part of the algorithm can be changed easily.)

So, in your example, we have spouse-parent-sibling-child (4 steps).

We can match 3 steps with the definition for "cousin" (parent-sibling-child).
Depending on the language's order, this will translate as either:

  • spouse's cousin
  • cousin of the spouse
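The matching described above can be sketched like this. It is a simplified illustration under assumptions: the dictionary `$names`, its dash-joined keys, and the function `nameParts` are hypothetical, and steps without any dictionary entry are silently dropped.

```php
<?php
// Hypothetical dictionary: dash-joined steps => name.
$names = [
    'spouse'               => 'spouse',
    'parent'               => 'parent',
    'sibling'              => 'sibling',
    'child'                => 'child',
    'spouse-parent'        => 'parent-in-law',
    'parent-sibling-child' => 'cousin',
];

// Find the longest named window anywhere in the path, then name what is
// left on either side of it in the same way.
function nameParts(array $path, array $names): array
{
    $n = count($path);
    for ($len = $n; $len >= 1; $len--) {
        for ($start = 0; $start + $len <= $n; $start++) {
            $key = implode('-', array_slice($path, $start, $len));
            if (isset($names[$key])) {
                return array_merge(
                    nameParts(array_slice($path, 0, $start), $names),
                    [$names[$key]],
                    nameParts(array_slice($path, $start + $len), $names)
                );
            }
        }
    }
    return []; // empty path, or nothing in the dictionary matched
}

$parts = nameParts(['spouse', 'parent', 'sibling', 'child'], $names);
echo implode("'s ", $parts), PHP_EOL;                    // spouse's cousin
echo implode(' of the ', array_reverse($parts)), PHP_EOL; // cousin of the spouse
```

Matching the longest window first is what keeps "cousin" intact here; a matcher that consumed the path greedily from the left would take "spouse-parent" as "parent-in-law" and never see the cousin.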

@slavkoja - I have taken most of the relationship definitions from @ric2016's code. But I cannot work out the numbering system for cousins.

A cousin is: parent-parent-parent...sibling...child-child-child.

If you have X parents and Y children, then how do you calculate N in bratranec zo N. kolena?

@ric2016
Contributor

ric2016 commented Apr 16, 2021

@fisharebest

But I want to test my code with some different languages before I upload it to github.

The main question for me is whether we should continue to work on this in parallel. Seems like unnecessary effort - There are so many other open issues. Why do you (re-)start to work on this now? Again, I have a finished solution. You can download the Vesta modules and test it with the relationship chart.

I had hoped we could do it like this: We test additional languages in the custom module, and then we move its code to webtrees eventually.

Edit: Maybe we should discuss this further elsewhere?

@slavkoja
Author

I am currently matching the longest substring with a named-translation. (This part of the algorithm can be changed easily.)

OK, then we can discuss this algorithm later.

A cousin is: parent-parent-parent...sibling...child-child-child.
If you have X parents and Y children, then how do you calculate N in bratranec zo N. kolena?

First, I need to state that I am not interested in genealogy at all; I only installed and manage webtrees on my server for my mother ;-) Thus my info may be wrong, and it would be great if @ro-la could confirm or correct it.

AFAIK, our cousin levels require that X = Y; thus cousins have a common grandparent, second cousins a common great-grandparent, etc... In our language the cousin's name depends on sex, so we have "bratranec" for male and "sesternica" for female (and there must be something for unknown too).

  • for two levels (parent-parent -> sibling -> child-child) it is "bratranec"
  • for three levels (parent-parent-parent -> sibling -> child-child-child) it can be "bratranec z 2. kolena" (the current translation is "druhostupňový bratranec", but "z 2..." is acceptable too)
  • etc.

Thus, when X = Y, in our language N = X - 1 = Y - 1 holds for cousin levels.

IMO, you can inspect the different languages in the messages.php file, where I found (among other forms):

'FEMALE%s × cousin' => 'sesternica z %s. kolena',
'MALE%s × cousin' => 'bratranec z %s. kolena',

For all languages, you can run this in the lang directory:

grep -r '%s × cousin' */

A quick look at the result shows that all languages simply reuse the %s value, but this may simply be due to a gettext limitation. However, e.g.:

grep -r 'sixth cousin' */

shows that not all languages use the number 6 in the translation, but I am not able to understand most of them, so some may use it as a word (six)...

Another complication is that the right translation is e.g. "z 2...", "z 3...", but "zo 4..." or "zo 6...". AFAIK this is impossible to achieve with gettext, so I often solve it with "z/zo %s..." or I simply ignore it (as this is not the only i18n limitation).

Feel free to ask if other info is needed.


As an experienced free/open software (mostly gettext) translator, I am strictly against hardcoding string translations into code, as it is unmaintainable and most translators will not update them. One can use a dummy PHP file (even with hand-generated "dynamic" values) for them, so that they go into the POT file and are translatable in the usual way.

@ro-la
Contributor

ro-la commented Apr 17, 2021

A cousin is: parent-parent-parent...sibling...child-child-child.
If you have X parents and Y children, then how do you calculate N in bratranec zo N. kolena?

AFAIK, our cousin levels require that X = Y; thus cousins have a common grandparent, second cousins a common great-grandparent, etc... In our language the cousin's name depends on sex, so we have "bratranec" for male and "sesternica" for female (and there must be something for unknown too).

In Slovak, if X<>Y, we don't use a description like in English. One would mostly say something like "3rd cousin of my grandfather".

Another complication is that the right translation is e.g. "z 2...", "z 3...", but "zo 4..." or "zo 6...". AFAIK this is impossible to achieve with gettext, so I often solve it with "z/zo %s..." or I simply ignore it (as this is not the only i18n limitation).

In Ric's approach we used a special translation with "zo" for 4, 6, 7, 14, 16, 17; the rest is with "z". It is hard to imagine that somebody would need a 40th cousin.
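The preposition rule described here is easy to express in code, which is one reason it is hard to express in gettext. A minimal sketch, with a hypothetical function name and only the numbers listed above treated as special:

```php
<?php
// Hypothetical helper: 'zo' before 4, 6, 7, 14, 16 and 17, 'z' otherwise
// (only the numbers listed in this thread are covered).
function slovakMaleCousin(int $n): string
{
    $preposition = in_array($n, [4, 6, 7, 14, 16, 17], true) ? 'zo' : 'z';
    return sprintf('bratranec %s %d. kolena', $preposition, $n);
}

echo slovakMaleCousin(3), PHP_EOL; // bratranec z 3. kolena
echo slovakMaleCousin(4), PHP_EOL; // bratranec zo 4. kolena
```

The list deliberately stops at 17; larger numbers would need the rule extended rather than enumerated.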

As an experienced free/open software (mostly gettext) translator, I am strictly against hardcoding string translations into code, as it is unmaintainable and most translators will not update them. One can use a dummy PHP file (even with hand-generated "dynamic" values) for them, so that they go into the POT file and are translatable in the usual way.

Generally I agree.

But we have, for example, the Census Assistant, where the header of the table is not translated and is used in the original language.

I think the relationship names are very language specific: different languages (and language groups) use quite different systems of family relationships. But even inside a language group, some relationships have a specific name in one language and not in another. So for a translator it is not good to have all possible relationship names in one file with the other translations. These "relationship messages" should therefore not be added to "messages.php" in the folder "sk", but to a separate file, perhaps even in a separate directory.

In this case I would also accept "hard-coded" text, because the translation is not just a simple translation: the translator must identify which messages are needed, what should be used, and what not.

Ladislav

@ric2016
Contributor

ric2016 commented Apr 17, 2021

@slavkoja

As an experienced free/open software (mostly gettext) translator, I am strictly against hardcoding string translations into code, as it is unmaintainable

I tried to explain the reason for this here. As Ladislav also pointed out, you cannot solve this issue with only a translator's usual tools (you noticed yourself some things like "z/zo %s..." are not solvable via gettext).

So, do you really think it is easier to create a file like this, which has to be language-specific, using strings that are then to be translated in a separate file? In case of updates, usually both files would have to be adjusted anyway.

@fisharebest
Owner

  • for two levels (parent-parent -> sibling -> child-child) it is "bratranec"

What is the name for one level? e.g. parent - sibling - child?

  • for three levels (parent-parent-parent -> sibling -> child-child-child) it can be "bratranec z 2. kolena" (the current translation is "druhostupňový bratranec", but "z 2..." is acceptable too)

Do other levels have "names" that are preferred to numbers?

@ric2016
Contributor

ric2016 commented Apr 19, 2021

The definitions in LanguageSlovakExt.php are based on Ladislav's input, so you should be able to reuse them.

But note that slavkoja's definitions do not match these (they seem to be off by one level):

for three levels (parent-parent-parent -> sibling -> child-child-child) it can be "bratranec z 2. kolena"

shouldn't this be "bratranec z 3. kolena"?

Regarding your earlier question:

But I cannot work out the numbering system for cousins.

//IMPL NOTE: used as back-reference (i.e. count must match in '->child($ref)')
$ref = Times::min(1, 1); 
$defs []= RelDefBuilder::def()->parent()->parent($ref)->sibling()->child($ref)->son()->is('bratranec z %s. kolena', 'bratranca z %s. kolena');

When the same object $ref is used in a definition, it collects the count in the first matcher (in this case ->parent($ref)), and the count is then used in subsequent matchers (in this case ->child($ref)). The second argument in 'Times::min(1, 1)' is an offset, so a count of 2 would result in a value of 3 to be used for '%s'.
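The back-reference idea can be illustrated with a small standalone sketch. This is Python rather than the PHP used by the module, and it does not use the real `RelDefBuilder` or `Times` classes: the function name `match_cousin`, the step labels, and the reading of `Times::min(1, 1)` as "at least one repetition, offset 1" are assumptions made only for illustration.

```python
def match_cousin(path, offset=1):
    """Match: parent, parent x n (n >= 1), sibling, child x n, son.

    Returns the formatted label, or None if the path does not match.
    Hypothetical helper; not part of webtrees.
    """
    if len(path) < 2 or path[0] != "parent":
        return None
    # Count the repeated 'parent' steps after the first one.
    # This plays the role of the count collected by the first matcher.
    i = 1
    while i < len(path) and path[i] == "parent":
        i += 1
    count = i - 1
    if count < 1:  # assumed meaning of the minimum of 1
        return None
    # The same count is reused for the 'child' steps (the back-reference).
    expected_tail = ["sibling"] + ["child"] * count + ["son"]
    if list(path[i:]) != expected_tail:
        return None
    # The offset is added before the value is substituted for '%s'.
    return "bratranec z %s. kolena" % (count + offset)


three_levels = ["parent", "parent", "parent", "sibling", "child", "child", "son"]
print(match_cousin(three_levels))  # count 2 + offset 1 -> "bratranec z 3. kolena"
```

A path whose number of child steps differs from the collected parent count returns `None`, which is the point of sharing one reference object between the two matchers.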

@ro-la
Contributor

ro-la commented Apr 19, 2021

  • for two levels (parent-parent -> sibling -> child-child) it is "bratranec"

What is the name for one level? e.g. parent - sibling - child?

bratranec (male)
sesternica (female)

  • for three levels (parent-parent-parent -> sibling -> child-child-child) can be "bratranec z 2. kolena" (current translation is "druhostupňový bratranec", but "z 2..." is acceptable too)

I think this is an error: this is "bratranec z 3. kolena".

Do other levels have "names" that are preferred to numbers?

Not as far as I know.

Ladislav

@fisharebest
Owner

Screenshot 2021-04-19 at 13 23 18
Screenshot 2021-04-19 at 13 23 07

The Slovak language definitions need to be checked at:

https://github.com/fisharebest/webtrees/blob/main/app/Module/LanguageSlovakian.php

Tests need to be written at:

@ric2016
Contributor

ric2016 commented Apr 19, 2021

Note that the Slovak relationships from LanguageSlovakExt.php are designed to be used with a particular path-splitting algorithm, for which see that file starting from line 21. Using them with the standard shortest-string-based algorithm won't produce the intended results in all cases.

@slavkoja
Author

Yes, sometimes it is terribly hard to get simple math right:
2021-04-19-155303_610x610_scrot

@ro-la
Contributor

ro-la commented Apr 19, 2021

I have tried to make some changes to LanguageSlovakian.php. This is the first time I have done this using GitHub.

Where and how can I test the translations?

Ladislav

@fisharebest
Owner

This is the first time I have done this using GitHub.

Yes - this worked OK!

Where and how can I test the translations?

https://dev.webtrees.net/demo-dev

@fisharebest
Owner

Note that the Slovak relationships from LanguageSlovakExt.php are designed to be used with a particular path-splitting algorithm, for which see that file starting from line 21. Using them with the standard shortest-string-based algorithm won't produce the intended results in all cases.

We no longer have a "shortest-string" algorithm.

Instead, we match the longest sub-path that has a single name.

For the example at the top of this issue, we have spouse-parent-sibling-child

We could split this in several ways:

  1. (spouse-parent)-(sibling-child)
  2. spouse-(parent-sibling)-child
  3. spouse-(parent-sibling-child)

Option 3 has the longest subpath (3 steps), and so gets extracted first.
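A minimal sketch of this "longest named sub-path first" splitting, written in Python rather than the project's PHP. The `NAMES` table, the function name `split_longest_first`, and the leftmost-first tie-break for equal lengths are all assumptions for illustration, not the actual webtrees implementation.

```python
# Hypothetical name table: a tuple of steps -> a single relationship name.
NAMES = {
    ("spouse",): "spouse",
    ("parent",): "parent",
    ("sibling",): "sibling",
    ("child",): "child",
    ("spouse", "parent"): "parent-in-law",
    ("parent", "sibling"): "aunt/uncle",
    ("sibling", "child"): "nephew/niece",
    ("parent", "sibling", "child"): "cousin",
}


def split_longest_first(path):
    """Split a path into named pieces, extracting the longest named sub-path first."""
    path = tuple(path)
    if not path:
        return []
    # Try the longest candidate lengths first; among equal lengths,
    # the leftmost match wins (an assumed tie-break).
    for length in range(len(path), 0, -1):
        for start in range(len(path) - length + 1):
            piece = path[start:start + length]
            if piece in NAMES:
                # Name the remainders on each side the same way.
                left = split_longest_first(path[:start])
                right = split_longest_first(path[start + length:])
                return left + [NAMES[piece]] + right
    raise ValueError(f"no named sub-path found in {path!r}")


print(split_longest_first(["spouse", "parent", "sibling", "child"]))
# ['spouse', 'cousin'], i.e. option 3 above
```

With this table the three-step sub-path parent-sibling-child is found before any two-step candidate, so the path is named as spouse + cousin.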

@ric2016
Contributor

ric2016 commented Apr 20, 2021

OK, but 'longest subpath' isn't the desired criterion in all cases either (in Slovak): 'first cousin 3 times ascending' would split into 'great-great-grandfather' + 'niece', where 'great-grandfather' + 'cousin' would be preferred.

Also, common-ancestor-based sub-paths are preferred, which you don't always get with 'longest subpath' either.
'spouse-parent-sibling' would split into '(spouse-parent)-sibling', not 'spouse-(parent-sibling)' (or what is the tiebreaker in case of equal length?)

@ro-la
Contributor

ro-la commented Apr 20, 2021

I have tried this:
https://dev.webtrees.net/demo-dev/tree/demo/relationships-0-99/X9275/i13

The first relationship does not get the best description:
image
It should be "exmanželka 3-krát pravnuka"
We should not divide an n-th grandchild into subpaths.

@ro-la
Contributor

ro-la commented Apr 20, 2021

On the same page the second relationship is also not the best. And it has an error.
image
The red relationship should be "synovca" = nephew's (currently the feminine form "netere" = niece's is used). This is an error either in the translation or in the algorithm.

I would prefer the green line "ex-manželka 2-krát pravnuka švagra syna"

@ro-la
Contributor

ro-la commented Apr 20, 2021

I have added a new pull request for LanguageSlovakExt.php: the definitions for nephew and niece are exchanged.

@ro-la
Contributor

ro-la commented Apr 23, 2021

@fisharebest as long as the problem with the n-th grandparent is not solved, this issue should be reopened. Or should I raise a new bug report?

@fisharebest fisharebest reopened this Apr 23, 2021
@fisharebest
Owner

Hi ro-la - I have a solution for this. I hope to submit it soon.
