Consider umlaut forms when building tokenized map
Allows broadcast player replacements to be written in umlaut form (e.g.
"Blübaum" instead of "Bluebaum") and still be applied when the name in
the PGN is spelled "Bluebaum".

Previously each player replacement name was mapped into a single token
string which identifies the replacement info:
"Matthias Blübaum" -> Map("blubaum matthias" -> ReplacementInfo)

Now names with umlauts produce an additional mapping:
"Matthias Blübaum" -> Map("blubaum matthias"  -> ReplacementInfo,
                          "bluebaum matthias" -> ReplacementInfo)

Relates: lichess-org#15152
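
The before/after mapping described in the commit message can be sketched in isolation. This is only an illustration under stated assumptions: `ReplacementInfo`, `tokenize`, `expandUmlauts`, `before` and `after` are hypothetical stand-ins written for this sketch, not the actual lila definitions. The tokenizer is assumed to strip diacritics, lowercase, and sort the words, which reproduces the "blubaum matthias" token shown above.

```scala
import java.text.Normalizer
import java.util.Locale

// Hypothetical stand-ins for illustration; the real lila types and tokenizer differ.
object TokenMapSketch {
  final case class ReplacementInfo(displayName: String)

  // Assumed tokenizer: strip combining marks, lowercase, sort the words.
  def tokenize(name: String): String =
    Normalizer
      .normalize(name, Normalizer.Form.NFD)
      .replaceAll("\\p{M}", "")
      .toLowerCase(Locale.ROOT)
      .split("\\s+")
      .filter(_.nonEmpty)
      .sorted
      .mkString(" ")

  // Spell out the umlauts the way a PGN source might ("ü" -> "ue", ...).
  def expandUmlauts(name: String): String =
    name.replace("ö", "oe").replace("ä", "ae").replace("ü", "ue").replace("ß", "ss")

  // Previous behaviour: one token string per replacement name.
  def before(name: String, info: ReplacementInfo): Map[String, ReplacementInfo] =
    Map(tokenize(name) -> info)

  // New behaviour: also register the token of the umlaut-expanded spelling.
  // For names without umlauts both spellings coincide and the Set collapses to one.
  def after(name: String, info: ReplacementInfo): Map[String, ReplacementInfo] =
    Set(name, expandUmlauts(name)).map(n => tokenize(n) -> info).toMap
}
```

With this sketch, `TokenMapSketch.after("Matthias Blübaum", info)` contains both token keys, so a PGN name spelled "Matthias Bluebaum" tokenizes to a key that is present in the map.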
tors42 committed May 5, 2024
1 parent 6bd1bcd commit dd6aba9
Showing 1 changed file with 16 additions and 1 deletion.
17 changes: 16 additions & 1 deletion modules/relay/src/main/RelayPlayers.scala
@@ -42,7 +42,22 @@ private class RelayPlayersTextarea(val text: String):

   // With tokenized player names
   private lazy val tokenizedPlayers: Map[PlayerToken, RelayPlayer] =
-    players.mapKeys(name => tokenize(name.value))
+    players.iterator
+      .flatMap((name, player) => Set(name, umlautify(name)).map((_, player)))
+      .map((name, player) => (tokenize.apply(name.value), player))
+      .toMap
+
+  private def umlautify: PlayerName => PlayerName =
+    diacritics.iterator.foldLeft(_):
+      case (name, (k, v)) =>
+        if name.value.contains(k) then PlayerName(name.value.replaceAll(k, v)) else name
+
+  val diacritics = Map(
+    "ö" -> "oe",
+    "ä" -> "ae",
+    "ü" -> "ue",
+    "ß" -> "ss"
+  )
 
   // With player names combinations.
   // For example, if the tokenized player name is "A B C D", the combinations will be:
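
The `umlautify` fold added in this hunk can also be tried outside the codebase. The following is a minimal sketch, assuming plain `String` in place of lila's `PlayerName`; the object name `UmlautifySketch` and the helper `spellings` are hypothetical and exist only for this example. It uses `String.replace` (literal replacement) where the diff uses `replaceAll` (regex); for these four keys the two behave the same, since none of them contains regex metacharacters.

```scala
// Standalone illustration only; names here are not part of the commit.
object UmlautifySketch {
  // Same replacement table as in the diff.
  val diacritics: Map[String, String] = Map(
    "ö" -> "oe",
    "ä" -> "ae",
    "ü" -> "ue",
    "ß" -> "ss"
  )

  // Fold over the table, applying each replacement to the accumulated name.
  // A name containing none of the keys is returned unchanged.
  def umlautify(name: String): String =
    diacritics.foldLeft(name) { case (acc, (from, to)) => acc.replace(from, to) }

  // The spellings that get tokenized: the original plus the expanded form.
  // The Set removes the duplicate when the name has no umlauts.
  def spellings(name: String): Set[String] = Set(name, umlautify(name))
}
```

In this sketch, `UmlautifySketch.umlautify("Matthias Blübaum")` yields `"Matthias Bluebaum"`, while a name without umlauts yields a single spelling. Because the table lists only lower-case keys, an upper-case "Ü" passes through `umlautify` unchanged.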