[GTK] [EFL] Hyphenation can never work in practice due to requirements on lang tags

https://bugs.webkit.org/show_bug.cgi?id=147310

Patch by Martin Robinson <mrobinson@igalia.com> on 2016-01-14
Reviewed by Michael Catanzaro.

Source/WebCore:

Test: platform/gtk/fast/text/hyphenate-flexible-locales.html

* platform/text/hyphen/HyphenationLibHyphen.cpp: Make locale matching for dictionary
selection a lot looser by matching case insensitively, matching multiple dictionaries
when only the language is specified, and ignoring the difference between '_' and '-' in
the locale name.
(WebCore::scanDirectoryForDicionaries): Now produce HashMap of Vectors instead of a single
path for each locale. Also add alternate entries to handle different ways of specifying
the locale.
(WebCore::scanTestDictionariesDirectoryIfNecessary): Update to handle the difference
in HashMap type.
(WebCore::availableLocales): Ditto.
(WebCore::canHyphenate): Also look for the lowercased version of the locale.
(WebCore::AtomicStringKeyedMRUCache<RefPtr<HyphenationDictionary>>::createValueForKey):
Key on the dictionary path now so that we can load more than one dictionary per locale.
(WebCore::lastHyphenLocation): Iterate through each matched dictionary in turn.
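
The looser matching described above can be sketched outside WebKit with standard-library types. This is a minimal illustration under stated assumptions, not the patch itself: `LocaleMap`, `toASCIILower`, `registerDictionary`, and `dictionariesFor` are hypothetical names, and `std::map`, `std::vector`, and `std::string` stand in for WebCore's `HashMap`, `Vector`, and `String`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for HashMap<AtomicString, Vector<String>>.
using LocaleMap = std::map<std::string, std::vector<std::string>>;

// ASCII-only lowercasing; non-ASCII bytes are left untouched.
std::string toASCIILower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : static_cast<char>(c);
    });
    return s;
}

// Registers one dictionary file under every key the ChangeLog describes:
// the lowercased locale ("en_us"), the '-' variant ("en-us"), and the bare
// language ("en"), so several files can share the language-only key.
void registerDictionary(LocaleMap& locales, const std::string& localeFromFileName, const std::string& filePath)
{
    assert(!filePath.empty());

    std::string locale = toASCIILower(localeFromFileName);
    locales[locale].push_back(filePath);

    std::string withHyphens = locale;
    std::replace(withHyphens.begin(), withHyphens.end(), '_', '-');
    if (withHyphens != locale)
        locales[withHyphens].push_back(filePath);

    std::size_t divider = withHyphens.find('-');
    if (divider != std::string::npos)
        locales[withHyphens.substr(0, divider)].push_back(filePath);
}

// Looks up the candidate dictionary paths for a requested locale, ignoring case.
// Returns nullptr when no dictionary was registered for it.
const std::vector<std::string>* dictionariesFor(const LocaleMap& locales, const std::string& requested)
{
    auto it = locales.find(toASCIILower(requested));
    return it == locales.end() ? nullptr : &it->second;
}
```

Under these assumptions, registering `en_US` and `en_GB` dictionaries makes a lookup for `EN-us` yield one path and a lookup for `en` yield both, which is the behaviour the new layout test exercises for the `en` family of spellings.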

LayoutTests:

Update some baselines and add a GTK+ specific test for locale variations.

* platform/gtk/fast/text/hyphenate-flexible-locales-expected.html: Added.
* platform/gtk/fast/text/hyphenate-flexible-locales.html: Added.
* platform/gtk/fast/text/hyphenate-locale-expected.png: We now properly hyphenate
text with the 'en' locale.
* platform/gtk/fast/text/hyphenate-locale-expected.txt:

Canonical link: https://commits.webkit.org/171185@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@195058 268f45cc-cd09-0410-ab3c-d52691b4dbfc
mrobinson authored and webkit-commit-queue committed Jan 14, 2016
1 parent cbeadab commit 1dcb7d55df12d5dc11afc7ada473855aa8dbb0ea
--- a/LayoutTests/ChangeLog
+++ b/LayoutTests/ChangeLog
@@ -1,3 +1,18 @@
+2016-01-14 Martin Robinson <mrobinson@igalia.com>
+
+[GTK] [EFL] Hyphenation can never work in practice due to requirements on lang tags
+https://bugs.webkit.org/show_bug.cgi?id=147310
+
+Reviewed by Michael Catanzaro.
+
+Update some baselines and add a GTK+ specific test for locale variations.
+
+* platform/gtk/fast/text/hyphenate-flexible-locales-expected.html: Added.
+* platform/gtk/fast/text/hyphenate-flexible-locales.html: Added.
+* platform/gtk/fast/text/hyphenate-locale-expected.png: We now properly hyphenate
+text with the 'en' locale.
+* platform/gtk/fast/text/hyphenate-locale-expected.txt:
+
 2016-01-14 Youenn Fablet <youenn.fablet@crf.canon.fr>

 Fix problems with cross-origin redirects
--- /dev/null
+++ b/LayoutTests/platform/gtk/fast/text/hyphenate-flexible-locales-expected.html
@@ -0,0 +1,11 @@
+<div style="-webkit-hyphens: auto; font-size: 36px; width: 130px;">
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+</div>
--- /dev/null
+++ b/LayoutTests/platform/gtk/fast/text/hyphenate-flexible-locales.html
@@ -0,0 +1,11 @@
+<div style="-webkit-hyphens: auto; font-size: 36px; width: 130px;">
+<div style="-webkit-locale: 'en';">throughout</div>
+<div style="-webkit-locale: 'en_US';">throughout</div>
+<div style="-webkit-locale: 'en-US';">throughout</div>
+<div style="-webkit-locale: 'en';">throughout</div>
+<div style="-webkit-locale: 'en_us';">throughout</div>
+<div style="-webkit-locale: 'en-us';">throughout</div>
+<div style="-webkit-locale: 'EN';">throughout</div>
+<div style="-webkit-locale: 'EN_US';">throughout</div>
+<div style="-webkit-locale: 'EN-US';">throughout</div>
+</div>
Binary files a/LayoutTests/platform/gtk/fast/text/hyphenate-locale-expected.png and b/LayoutTests/platform/gtk/fast/text/hyphenate-locale-expected.png differ
--- a/LayoutTests/platform/gtk/fast/text/hyphenate-locale-expected.txt
+++ b/LayoutTests/platform/gtk/fast/text/hyphenate-locale-expected.txt
@@ -1,40 +1,42 @@
-layer at (0,0) size 800x600
-  RenderView at (0,0) size 800x600
-layer at (0,0) size 800x600
-  RenderBlock {HTML} at (0,0) size 800x600
-    RenderBody {BODY} at (8,8) size 784x584
-      RenderBlock {DIV} at (0,0) size 130x240
+layer at (0,0) size 785x616
+  RenderView at (0,0) size 785x600
+layer at (0,0) size 785x616
+  RenderBlock {HTML} at (0,0) size 785x616
+    RenderBody {BODY} at (8,8) size 769x600
+      RenderBlock {DIV} at (0,0) size 130x280
         RenderBlock {DIV} at (0,0) size 130x40
           RenderText {#text} at (0,0) size 158x40
             text run at (0,0) width 158: "throughout"
-        RenderBlock {DIV} at (0,40) size 130x40
-          RenderText {#text} at (0,0) size 158x40
-            text run at (0,0) width 158: "throughout"
-        RenderBlock {DIV} at (0,80) size 130x80
+        RenderBlock {DIV} at (0,40) size 130x80
           RenderText {#text} at (0,0) size 106x80
             text run at (0,0) width 106: "throug" + hyphen string "\x{2010}"
             text run at (0,40) width 64: "hout"
+        RenderBlock {DIV} at (0,120) size 130x80
+          RenderText {#text} at (0,0) size 106x80
+            text run at (0,0) width 106: "throug" + hyphen string "\x{2010}"
+            text run at (0,40) width 64: "hout"
-        RenderBlock {DIV} at (0,160) size 130x40
+        RenderBlock {DIV} at (0,200) size 130x40
           RenderText {#text} at (0,0) size 158x40
             text run at (0,0) width 158: "throughout"
-        RenderBlock {DIV} at (0,200) size 130x40
+        RenderBlock {DIV} at (0,240) size 130x40
           RenderText {#text} at (0,0) size 158x40
             text run at (0,0) width 158: "throughout"
-      RenderBlock {DIV} at (0,240) size 135x280
+      RenderBlock {DIV} at (0,280) size 135x320
         RenderBlock {DIV} at (0,0) size 135x40
           RenderText {#text} at (0,0) size 156x40
             text run at (0,0) width 156: "reciprocity"
-        RenderBlock {DIV} at (0,40) size 135x40
-          RenderText {#text} at (0,0) size 156x40
-            text run at (0,0) width 156: "reciprocity"
-        RenderBlock {DIV} at (0,80) size 135x80
+        RenderBlock {DIV} at (0,40) size 135x80
           RenderText {#text} at (0,0) size 114x80
             text run at (0,0) width 114: "recipro" + hyphen string "\x{2010}"
             text run at (0,40) width 54: "city"
+        RenderBlock {DIV} at (0,120) size 135x80
+          RenderText {#text} at (0,0) size 114x80
+            text run at (0,0) width 114: "recipro" + hyphen string "\x{2010}"
+            text run at (0,40) width 54: "city"
-        RenderBlock {DIV} at (0,160) size 135x80
+        RenderBlock {DIV} at (0,200) size 135x80
           RenderText {#text} at (0,0) size 96x80
             text run at (0,0) width 96: "recipr" + hyphen string "\x{2010}"
             text run at (0,40) width 72: "ocity"
-        RenderBlock {DIV} at (0,240) size 135x40
+        RenderBlock {DIV} at (0,280) size 135x40
           RenderText {#text} at (0,0) size 156x40
             text run at (0,0) width 156: "reciprocity"
--- a/Source/WebCore/ChangeLog
+++ b/Source/WebCore/ChangeLog
@@ -1,3 +1,27 @@
+2016-01-14 Martin Robinson <mrobinson@igalia.com>
+
+[GTK] [EFL] Hyphenation can never work in practice due to requirements on lang tags
+https://bugs.webkit.org/show_bug.cgi?id=147310
+
+Reviewed by Michael Catanzaro.
+
+Test: platform/gtk/fast/text/hyphenate-flexible-locales.html
+
+* platform/text/hyphen/HyphenationLibHyphen.cpp: Make locale matching for dictionary
+selection a lot looser by matching case insensitively, matching multiple dictionaries
+when only the language is specified, and ignoring the difference between '_' and '-' in
+the locale name.
+(WebCore::scanDirectoryForDicionaries): Now produce HashMap of Vectors instead of a single
+path for each locale. Also add alternate entries to handle different ways of specifying
+the locale.
+(WebCore::scanTestDictionariesDirectoryIfNecessary): Update to handle the difference
+in HashMap type.
+(WebCore::availableLocales): Ditto.
+(WebCore::canHyphenate): Also look for the lowercased version of the locale.
+(WebCore::AtomicStringKeyedMRUCache<RefPtr<HyphenationDictionary>>::createValueForKey):
+Key on the dictionary path now so that we can load more than one dictionary per locale.
+(WebCore::lastHyphenLocation): Iterate through each matched dictionary in turn.
+
 2016-01-14 Per Arne Vollan <peavo@outlook.com>

 [Win] Remove workarounds for fixed bugs in fmod and pow.
--- a/Source/WebCore/platform/text/hyphen/HyphenationLibHyphen.cpp
+++ b/Source/WebCore/platform/text/hyphen/HyphenationLibHyphen.cpp
@@ -60,14 +60,27 @@ static String extractLocaleFromDictionaryFilePath(const String& filePath)
     return fileName.substring(prefixLength, fileName.length() - prefixLength - suffixLength);
 }

-static void scanDirectoryForDicionaries(const char* directoryPath, HashMap<AtomicString, String>& availableLocales)
+static void scanDirectoryForDicionaries(const char* directoryPath, HashMap<AtomicString, Vector<String>>& availableLocales)
 {
-    for (const auto& filePath : listDirectory(directoryPath, "hyph_*.dic"))
-        availableLocales.set(AtomicString(extractLocaleFromDictionaryFilePath(filePath)), filePath);
+    for (const auto& filePath : listDirectory(directoryPath, "hyph_*.dic")) {
+        String locale = extractLocaleFromDictionaryFilePath(filePath).convertToASCIILowercase();
+        availableLocales.add(locale, Vector<String>()).iterator->value.append(filePath);
+
+        String localeReplacingUnderscores = String(locale);
+        localeReplacingUnderscores.replace('_', '-');
+        if (locale != localeReplacingUnderscores)
+            availableLocales.add(localeReplacingUnderscores, Vector<String>()).iterator->value.append(filePath);
+
+        size_t dividerPosition = localeReplacingUnderscores.find('-');
+        if (dividerPosition != notFound) {
+            localeReplacingUnderscores.truncate(dividerPosition);
+            availableLocales.add(localeReplacingUnderscores, Vector<String>()).iterator->value.append(filePath);
+        }
+    }
 }

 #if ENABLE(DEVELOPER_MODE)
-static void scanTestDictionariesDirectoryIfNecessary(HashMap<AtomicString, String>& availableLocales)
+static void scanTestDictionariesDirectoryIfNecessary(HashMap<AtomicString, Vector<String>>& availableLocales)
 {
     // It's unfortunate that we need to look for the dictionaries this way, but
     // libhyphen doesn't have the concept of installed dictionaries. Instead,
@@ -89,10 +102,10 @@ static void scanTestDictionariesDirectoryIfNecessary(HashMap<AtomicString, Strin
 }
 #endif

-static HashMap<AtomicString, String>& availableLocales()
+static HashMap<AtomicString, Vector<String>>& availableLocales()
 {
     static bool scannedLocales = false;
-    static HashMap<AtomicString, String> availableLocales;
+    static HashMap<AtomicString, Vector<String>> availableLocales;

     if (!scannedLocales) {
         for (size_t i = 0; i < WTF_ARRAY_LENGTH(gDictionaryDirectories); i++)
@@ -112,7 +125,9 @@ bool canHyphenate(const AtomicString& localeIdentifier)
 {
     if (localeIdentifier.isNull())
         return false;
-    return availableLocales().contains(localeIdentifier);
+    if (availableLocales().contains(localeIdentifier))
+        return true;
+    return availableLocales().contains(AtomicString(localeIdentifier.string().convertToASCIILowercase()));
 }

 class HyphenationDictionary : public RefCounted<HyphenationDictionary> {
@@ -158,10 +173,9 @@ RefPtr<HyphenationDictionary> AtomicStringKeyedMRUCache<RefPtr<HyphenationDictio
 }

 template<>
-RefPtr<HyphenationDictionary> AtomicStringKeyedMRUCache<RefPtr<HyphenationDictionary>>::createValueForKey(const AtomicString& localeIdentifier)
+RefPtr<HyphenationDictionary> AtomicStringKeyedMRUCache<RefPtr<HyphenationDictionary>>::createValueForKey(const AtomicString& dictionaryPath)
 {
-    ASSERT(availableLocales().get(localeIdentifier));
-    return HyphenationDictionary::create(fileSystemRepresentation(availableLocales().get(localeIdentifier)));
+    return HyphenationDictionary::create(fileSystemRepresentation(dictionaryPath.string()));
 }

 static AtomicStringKeyedMRUCache<RefPtr<HyphenationDictionary>>& hyphenDictionaryCache()
@@ -190,9 +204,6 @@ static void countLeadingSpaces(const CString& utf8String, int32_t& pointerOffset

 size_t lastHyphenLocation(StringView string, size_t beforeIndex, const AtomicString& localeIdentifier)
 {
-    ASSERT(availableLocales().contains(localeIdentifier));
-    RefPtr<HyphenationDictionary> dictionary = hyphenDictionaryCache().get(localeIdentifier);
-
     // libhyphen accepts strings in UTF-8 format, but WebCore can only provide StringView
     // which stores either UTF-16 or Latin1 data. This is unfortunate for performance
     // reasons and we should consider switching to a more flexible hyphenation library
@@ -211,32 +222,38 @@ size_t lastHyphenLocation(StringView string, size_t beforeIndex, const AtomicStr
     Vector<char> hyphenArray(utf8StringCopy.length() - leadingSpaceBytes + 5);
     char* hyphenArrayData = hyphenArray.data();

-    char** replacements = nullptr;
-    int* positions = nullptr;
-    int* removedCharacterCounts = nullptr;
-    hnj_hyphen_hyphenate2(dictionary->libhyphenDictionary(),
-        utf8StringCopy.data() + leadingSpaceBytes,
-        utf8StringCopy.length() - leadingSpaceBytes,
-        hyphenArrayData,
-        nullptr, /* output parameter for hyphenated word */
-        &replacements,
-        &positions,
-        &removedCharacterCounts);
-
-    if (replacements) {
-        for (unsigned i = 0; i < utf8StringCopy.length() - leadingSpaceBytes - 1; i++)
-            free(replacements[i]);
-        free(replacements);
-    }
-
-    free(positions);
-    free(removedCharacterCounts);
-
-    for (int i = beforeIndex - leadingSpaceCharacters - 1; i >= 0; i--) {
-        // libhyphen will put an odd number in hyphenArrayData at all
-        // hyphenation points. A number & 1 will be true for odd numbers.
-        if (hyphenArrayData[i] & 1)
-            return i + leadingSpaceCharacters;
+    String lowercaseLocaleIdentifier = AtomicString(localeIdentifier.string().convertToASCIILowercase());
+    ASSERT(availableLocales().contains(lowercaseLocaleIdentifier));
+    for (const auto& dictionaryPath : availableLocales().get(lowercaseLocaleIdentifier)) {
+        RefPtr<HyphenationDictionary> dictionary = hyphenDictionaryCache().get(AtomicString(dictionaryPath));
+
+        char** replacements = nullptr;
+        int* positions = nullptr;
+        int* removedCharacterCounts = nullptr;
+        hnj_hyphen_hyphenate2(dictionary->libhyphenDictionary(),
+            utf8StringCopy.data() + leadingSpaceBytes,
+            utf8StringCopy.length() - leadingSpaceBytes,
+            hyphenArrayData,
+            nullptr, /* output parameter for hyphenated word */
+            &replacements,
+            &positions,
+            &removedCharacterCounts);
+
+        if (replacements) {
+            for (unsigned i = 0; i < utf8StringCopy.length() - leadingSpaceBytes - 1; i++)
+                free(replacements[i]);
+            free(replacements);
+        }
+
+        free(positions);
+        free(removedCharacterCounts);
+
+        for (int i = beforeIndex - leadingSpaceCharacters - 1; i >= 0; i--) {
+            // libhyphen will put an odd number in hyphenArrayData at all
+            // hyphenation points. A number & 1 will be true for odd numbers.
+            if (hyphenArrayData[i] & 1)
+                return i + leadingSpaceCharacters;
+        }
     }

     return 0;
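
The control flow of the rewritten `lastHyphenLocation` (try each matched dictionary in turn, stop at the first one that yields a break before `beforeIndex`) can be sketched without libhyphen. This is a simplified illustration under stated assumptions: `Hyphenator` and `lastBreakBefore` are hypothetical names, the callback stands in for a loaded dictionary plus the `hnj_hyphen_hyphenate2` call, leading-space handling is omitted, and each dictionary gets a fresh buffer, whereas the patch reuses `hyphenArrayData`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one loaded dictionary: it writes an odd value into
// marks[i] for every index i that is a hyphenation point, mirroring the
// convention the patch's comment describes for hyphenArrayData.
using Hyphenator = std::function<void(const std::string& word, std::vector<char>& marks)>;

// Tries each candidate dictionary in order and returns the highest marked
// index below beforeIndex from the first dictionary that has one.
// Returns 0 when no dictionary produces a usable break.
std::size_t lastBreakBefore(const std::string& word, std::size_t beforeIndex, const std::vector<Hyphenator>& dictionaries)
{
    if (beforeIndex > word.size())
        beforeIndex = word.size();

    for (const auto& hyphenate : dictionaries) {
        std::vector<char> marks(word.size(), 0); // fresh buffer per dictionary
        hyphenate(word, marks);
        assert(marks.size() == word.size()); // callbacks must not resize the buffer

        // Scan backwards from beforeIndex - 1 down to 0.
        for (std::size_t i = beforeIndex; i-- > 0;) {
            if (marks[i] & 1)
                return i;
        }
    }
    return 0;
}
```

As in the patch, a return value of 0 does not distinguish "no break found" from a break at index 0; callers would need to treat 0 as "do not hyphenate".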
