Commit

Merge pull request #13483 from annando/languages
Additional languages / new hook for additional language detection
MrPetovan committed Oct 1, 2023
2 parents ad414f3 + 0840086 commit 6503016
Showing 4 changed files with 52 additions and 8 deletions.
8 changes: 8 additions & 0 deletions doc/Addons.md
@@ -221,6 +221,13 @@ Please note: body contents are bbcode - not HTML
Called when receiving a post from another source. This may also be used to post local activity or system generated messages.
`$b` is the item array of information to be stored in the database and the item body is bbcode.

+ ### detect_languages
+ Called after the language detection. This can be used for alternative language detection methods.
+ `$data` is an array:
+
+ - **text**: The text that is analyzed.
+ - **detected**: (input/output) Array of language codes detected in the related text.
+
### addon_settings
Called when generating the HTML for the addon settings page.
`$data` is an array containing:
@@ -800,6 +807,7 @@ Here is a complete list of all hook callbacks with file locations (as of 24-Sep-

### src/Model/Item.php

+ Hook::callAll('detect_languages', $item);
Hook::callAll('post_local', $item);
Hook::callAll('post_remote', $item);
Hook::callAll('post_local_end', $posted_item);
8 changes: 8 additions & 0 deletions doc/de/Addons.md
@@ -103,6 +103,13 @@ Derzeitige Hooks
$b ist das Item-Array einer Information, die in der Datenbank und im Item gespeichert ist.
{Bitte beachte: der Seiteninhalt ist bbcode - nicht HTML)

+ **'detect_languages'**
+ Wird nach der Sprachenerkennung aufgerufen.
+ Dieser Hook kann dafür verwendet werden, alternative Erkennungsfunktionen einzubinden.
+ `$data` ist ein Array:
+ 'text' => Der analysierte Text.
+ 'detected' => (Eingabe/Ausgabe) Das Array mit den erkannten Sprachen.
+
**'addon_settings'** - wird aufgerufen, wenn die HTML-Ausgabe der Addon-Einstellungsseite generiert wird.
$b ist die HTML-Ausgabe (String) der Addon-Einstellungsseite vor dem finalen "</form>"-Tag.

@@ -316,6 +323,7 @@ Eine komplette Liste aller Hook-Callbacks mit den zugehörigen Dateien (am 01-Ap

### src/Model/Item.php

+ Hook::callAll('detect_languages', $item);
Hook::callAll('post_local', $item);
Hook::callAll('post_remote', $item);
Hook::callAll('post_local_end', $posted_item);
30 changes: 24 additions & 6 deletions src/Core/L10n.php
@@ -397,13 +397,31 @@ public function getAvailableLanguages(bool $additional = false): array
// See https://github.com/friendica/friendica/issues/10511
// Persian is manually added to language detection until a persian translation is provided for the interface, at
// which point it will be automatically available through `getAvailableLanguages()` and this should be removed.
- // Additionally Portuguese, Ukrainian, traditional Chinese and Welsh are added to that list.
+ // Additionally some more languages are added to that list that are used in the Fediverse.
$additional_langs = [
- 'cy' => 'Cymraeg',
- 'uk' => 'Українська',
- 'pt-PT' => 'Português',
- 'zh-hant' => '繁體',
- 'fa' => 'فارسی'
+ 'af' => 'Afrikaans',
+ 'cy' => 'Cymraeg',
+ 'el-monoton' => 'Ελληνικά',
+ 'eu' => 'euskara',
+ 'fa' => 'فارسی',
+ 'gl' => 'Galego',
+ 'hi' => 'हिन्दी',
+ 'hr' => 'Hrvatski',
+ 'id' => 'bahasa Indonesia',
+ 'ko' => '한국인',
+ 'lt' => 'lietuvių',
+ 'lv' => 'latviešu',
+ 'sk' => 'slovenský',
+ 'sl' => 'Slovenščina',
+ 'sw' => 'Kiswahili',
+ 'th' => 'แบบไทย',
+ 'tl' => 'Wikang Tagalog',
+ 'tr' => 'Türkçe',
+ 'pt-PT' => 'Português',
+ 'uk' => 'Українська',
+ 'uz' => 'Ўзбек',
+ 'vi' => 'Tiếng Việt',
+ 'zh-hant' => '繁體',
];
$langs = array_merge($additional_langs, $langs);
ksort($langs);
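The merge order in the `src/Core/L10n.php` hunk above matters: with string keys, `array_merge()` lets the later array win, so an entry already present in `$langs` overrides the hard-coded name in `$additional_langs`. A minimal sketch of that behaviour follows. The contents of `$langs` here are hypothetical and chosen only to show the precedence and the `ksort()` ordering.

```php
<?php
// Shortened copy of the hard-coded fallback list from the diff.
$additional_langs = [
    'cy'    => 'Cymraeg',
    'pt-PT' => 'Português',
];

// Hypothetical result of scanning the available interface translations.
$langs = [
    'en'    => 'English',
    'pt-PT' => 'Português (Portugal)',
];

// For string keys, values from the later argument replace earlier ones.
$merged = array_merge($additional_langs, $langs);

// Sort by language code, as getAvailableLanguages() does.
ksort($merged);

print_r($merged);
```

Under these assumptions the `pt-PT` label comes from `$langs`, while `cy` survives from the fallback list because nothing replaces it.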
14 changes: 12 additions & 2 deletions src/Model/Item.php
@@ -2017,7 +2017,7 @@ public static function getLanguageArray(string $body, int $count): array
$naked_body = BBCode::toPlaintext($naked_body);

// Remove possibly remaining links
- $naked_body = preg_replace(Strings::autoLinkRegEx(), '', $naked_body);
+ $naked_body = trim(preg_replace(Strings::autoLinkRegEx(), '', $naked_body));

if (empty($naked_body)) {
return [];
@@ -2029,7 +2029,17 @@ public static function getLanguageArray(string $body, int $count): array
$availableLanguages = DI::l10n()->convertForLanguageDetection($availableLanguages);

$ld = new Language(array_keys($availableLanguages));
- return $ld->detect($naked_body)->limit(0, $count)->close() ?: [];
+ $languages = $ld->detect($naked_body)->limit(0, $count)->close() ?: [];
+
+ $data = [
+ 'text' => $naked_body,
+ 'detected' => $languages,
+ ];
+
+ Hook::callAll('detect_languages', $data);
+ $languages = $data['detected'];
+
+ return $languages;
}

/**
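To illustrate how an addon might consume the new `detect_languages` hook, here is a minimal sketch. The function name `example_detect_languages` and the Hangul fallback are hypothetical and not part of this commit. The sketch also assumes that `detected` holds a language code => score map, as returned by the detector call in `getLanguageArray()`. The direct call at the bottom is a stand-in for `Hook::callAll('detect_languages', $data)` so the snippet runs outside Friendica.

```php
<?php
/**
 * Hypothetical addon callback for the 'detect_languages' hook.
 * In a real addon it would presumably be registered from the install
 * function, e.g. Hook::register('detect_languages', __FILE__, 'example_detect_languages').
 *
 * $data['text']     is the plain text that was analyzed.
 * $data['detected'] is the detection result and may be modified (input/output).
 */
function example_detect_languages(array &$data): void
{
    // Keep the core result when the built-in detection found something.
    if (!empty($data['detected'])) {
        return;
    }

    // Toy fallback (assumption): treat text containing Hangul as Korean.
    if (preg_match('/\p{Hangul}/u', $data['text'])) {
        $data['detected'] = ['ko' => 1.0];
    }
}

// Stand-in for the hook dispatch in Item::getLanguageArray().
$data = [
    'text'     => '안녕하세요',
    'detected' => [],
];
example_detect_languages($data);

print_r($data['detected']);
```

Because `$data` is passed by reference, whatever the callback leaves in `detected` is what `getLanguageArray()` returns after the hook has run.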
