Skip to content

Commit

Permalink
Speculative fix for crash at WebCore: PAL::newTextCodec
Browse files Browse the repository at this point in the history
https://bugs.webkit.org/show_bug.cgi?id=263084
rdar://110454455

Reviewed by Chris Dumez.

This is a speculative fix for CrashTracer: com.apple.WebKit.WebContent at WebCore: PAL::newTextCodec.

Text encodings work with the use of 2 static maps:
- Name map, TextEncodingNameMap: maps encoding names to a canonical encoding name.
- Codec map, TextCodecMap: maps canonical encoding names to a function that creates
the desired codec. All possible canonical name has an entry in the map.

When a TextEncoding object is created, it receives a string name that
is transformed in a canonical name by atomCanonicalTextEncodingName().
A TextEncoding name can only be null, in which case it is not valid, or
a valid canonical name. Therefore, it should not be possible to create a
valid name which is not canonical.

When a canonical name is being created, we populate both Name and Codec
maps. First we try to populate them with a reduced set (buildBaseTextCodecMaps()),
if not sufficient we populate them with an extended set (extendTextCodecMaps()).

Since it is not possible to create a valid non-canonical name and that
all canonical names have an entry in the Codec maps, it should not be
possible to have no entry in the Codec map for a valid TextEncoding, i.e.
a TextEncoding with a valid name.

For that reason, the callers of newTextCodec() only assert that m_encoding,
which is a TextEncoding, isValid(). This should theoretically be enough,
but since it is crashing, it is not. The crash seems to happen because
we are trying to deref a value from the TextEncodingNameMap that is null,
that can happen if:
[1]. we hace an entry for the encoding name in the map but the value is null.
[2]. we have a valid encoding name that has no entry in the codec map.

[1] is most likely not what is happening because we would need:
1.a: Having one of the functions that populate the codec maps (registerCodecs())
registering a null value. This shouldn't be possible becuse all of these function
register lambda functions.

1.b: race-condition: This could also happen if we would get a valid entry in the map and
between this point in time and the time we access the value of such iterator,
the entry would be invalidated. That also shouldn't be possible, because
all functions that mutate/access the maps are guarded by a static lock (encodingRegistryLock).

[2] For testing this hypothesis, I've forced WebKit to populate
the name and codec maps in the extended version, so I could verify a state
in which all possible canonical names (and its codecs) are registered.
I've then checked that in fact there is no canonical name without an
entry in the codec map. Therefore, theoretically this should also be ruled
out.

As I haven't found a valid hypothesis for now, and also as I can't reproduce
this issue, I'm trying a speculative fix, in which, we no longer just
assert that m_encoding is valid and assume that there will be a
an entry for it on the codec map. Instead, newTextCodec() will prevent a
null deref by returning a default UTF-8 codec for:
- a invalid encoding
- if there is no entry for a valid encoding in the codec map
- if there is an entry for a valid encoding in the codec map, but that
entry has a null value.

We also generate proper log that will help on further investigation.

* Source/WebCore/PAL/pal/Logging.h:
* Source/WebCore/PAL/pal/text/TextCodecUTF8.cpp:
(PAL::TextCodecUTF8::codec):
(PAL::TextCodecUTF8::registerCodecs):
* Source/WebCore/PAL/pal/text/TextCodecUTF8.h:
* Source/WebCore/PAL/pal/text/TextEncodingRegistry.cpp:
(PAL::newTextCodec):

Canonical link: https://commits.webkit.org/269304@main
  • Loading branch information
vitorroriz committed Oct 13, 2023
1 parent 0263797 commit 7d0bddb
Show file tree
Hide file tree
Showing 4 changed files with 22 additions and 3 deletions.
3 changes: 2 additions & 1 deletion Source/WebCore/PAL/pal/Logging.h
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,8 @@ PAL_EXPORT void registerNotifyCallback(ASCIILiteral, Function<void()>&&);
#endif

#define PAL_LOG_CHANNELS(M) \
M(Media)
M(Media) \
M(TextEncoding)

#undef DECLARE_LOG_CHANNEL
#define DECLARE_LOG_CHANNEL(name) \
Expand Down
7 changes: 6 additions & 1 deletion Source/WebCore/PAL/pal/text/TextCodecUTF8.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -54,10 +54,15 @@ void TextCodecUTF8::registerEncodingNames(EncodingNameRegistrar registrar)
registrar("x-unicode20utf8", "UTF-8");
}

std::unique_ptr<TextCodecUTF8> TextCodecUTF8::codec()
{
return makeUnique<TextCodecUTF8>();
}

void TextCodecUTF8::registerCodecs(TextCodecRegistrar registrar)
{
registrar("UTF-8", [] {
return makeUnique<TextCodecUTF8>();
return codec();
});
}

Expand Down
1 change: 1 addition & 0 deletions Source/WebCore/PAL/pal/text/TextCodecUTF8.h
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ class TextCodecUTF8 final : public TextCodec {
static void registerCodecs(TextCodecRegistrar);

static Vector<uint8_t> encodeUTF8(StringView);
static std::unique_ptr<TextCodecUTF8> codec();

private:
void stripByteOrderMark() final { m_shouldStripByteOrderMark = true; }
Expand Down
14 changes: 13 additions & 1 deletion Source/WebCore/PAL/pal/text/TextEncodingRegistry.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
#include "config.h"
#include "TextEncodingRegistry.h"

#include "Logging.h"
#include "TextCodecCJK.h"
#include "TextCodecICU.h"
#include "TextCodecLatin1.h"
Expand Down Expand Up @@ -254,8 +255,19 @@ std::unique_ptr<TextCodec> newTextCodec(const TextEncoding& encoding)
Locker locker { encodingRegistryLock };

ASSERT(textCodecMap);
if (!encoding.isValid()) {
RELEASE_LOG_ERROR(TextEncoding, "Trying to create new text codec with invalid (null) encoding name. Will default to UTF-8.");
return TextCodecUTF8::codec();
}
auto result = textCodecMap->find(encoding.name());
ASSERT(result != textCodecMap->end());
if (result == textCodecMap->end()) {
RELEASE_LOG_ERROR(TextEncoding, "Can't find codec for valid encoding %" PUBLIC_LOG_STRING ". Will default to UTF-8.", encoding.name());
return TextCodecUTF8::codec();
}
if (!result->value) {
RELEASE_LOG_ERROR(TextEncoding, "Codec for encoding %" PUBLIC_LOG_STRING " is null. Will default to UTF-8", encoding.name());
return TextCodecUTF8::codec();
}
return result->value();
}

Expand Down

0 comments on commit 7d0bddb

Please sign in to comment.