Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Cherry-pick 7d0bddb. rdar://110454455
Speculative fix for crash at WebCore: PAL::newTextCodec https://bugs.webkit.org/show_bug.cgi?id=263084 rdar://110454455 Reviewed by Chris Dumez. This is a speculative fix for CrashTracer: com.apple.WebKit.WebContent at WebCore: PAL::newTextCodec. Text encodings work with the use of 2 static maps: - Name map, TextEncodingNameMap: maps encoding names to a canonical encoding name. - Codec map, TextCodecMap: maps canonical encoding names to a function that creates the desired codec. All possible canonical name has an entry in the map. When a TextEncoding object is created, it receives a string name that is transformed in a canonical name by atomCanonicalTextEncodingName(). A TextEncoding name can only be null, in which case it is not valid, or a valid canonical name. Therefore, it should not be possible to create a valid name which is not canonical. When a canonical name is being created, we populate both Name and Codec maps. First we try to populate them with a reduced set (buildBaseTextCodecMaps()), if not sufficient we populate them with an extended set (extendTextCodecMaps()). Since it is not possible to create a valid non-canonical name and that all canonical names have an entry in the Codec maps, it should not be possible to have no entry in the Codec map for a valid TextEncoding, i.e. a TextEncoding with a valid name. For that reason, the callers of newTextCodec() only assert that m_encoding, which is a TextEncoding, isValid(). This should theoretically be enough, but since it is crashing, it is not. The crash seems to happen because we are trying to deref a value from the TextEncodingNameMap that is null, that can happen if: [1]. we hace an entry for the encoding name in the map but the value is null. [2]. we have a valid encoding name that has no entry in the codec map. [1] is most likely not what is happening because we would need: 1.a: Having one of the functions that populate the codec maps (registerCodecs()) registering a null value. This shouldn't be possible becuse all of these function register lambda functions. 1.b: race-condition: This could also happen if we would get a valid entry in the map and between this point in time and the time we access the value of such iterator, the entry would be invalidated. That also shouldn't be possible, because all functions that mutate/access the maps are guarded by a static lock (encodingRegistryLock). [2] For testing this hypothesis, I've forced WebKit to populate the name and codec maps in the extended version, so I could verify a state in which all possible canonical names (and its codecs) are registered. I've then checked that in fact there is no canonical name without an entry in the codec map. Therefore, theoretically this should also be ruled out. As I haven't found a valid hypothesis for now, and also as I can't reproduce this issue, I'm trying a speculative fix, in which, we no longer just assert that m_encoding is valid and assume that there will be a an entry for it on the codec map. Instead, newTextCodec() will prevent a null deref by returning a default UTF-8 codec for: - a invalid encoding - if there is no entry for a valid encoding in the codec map - if there is an entry for a valid encoding in the codec map, but that entry has a null value. We also generate proper log that will help on further investigation. * Source/WebCore/PAL/pal/Logging.h: * Source/WebCore/PAL/pal/text/TextCodecUTF8.cpp: (PAL::TextCodecUTF8::codec): (PAL::TextCodecUTF8::registerCodecs): * Source/WebCore/PAL/pal/text/TextCodecUTF8.h: * Source/WebCore/PAL/pal/text/TextEncodingRegistry.cpp: (PAL::newTextCodec): Canonical link: https://commits.webkit.org/269304@main Identifier: 267815.311@safari-7617.1.11.13-branch
- Loading branch information