Potential bug: Decoding partial response may result in corrupt string in case of multi-byte encoding #12

@Makaopior

This is a rather theoretical bug (I haven't encountered it myself), but it may surface someday.
Consider this part of the code, which reads the response from the socket:

do
{
    cbRead = s.Read(readBuff, 0, readBuff.Length);
    res.Append(options.Encoding.GetString(readBuff, 0, cbRead));

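With a single-byte encoding such as ASCII, this works fine. But some registries use other encodings. For example, JPNIC uses ISO-2022-JP, which encodes Japanese characters as two-byte sequences. In that case, if a chunk of response bytes ends in the middle of a multi-byte character, the result will be corrupted. ISO-2022-JP is also stateful: escape sequences switch between character sets, so a chunk decoded on its own loses the mode selected in an earlier chunk and its bytes are read as plain ASCII.
This test demonstrates the potential behavior: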

[Test]
public void StringWithDualByteChars_BrokenInTheMiddleOfTheChar_DecodingBothPartsDerivesIncorrectResult()
{
#if NETCOREAPP
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
#endif

    var isoEncoding = Encoding.GetEncoding("iso-2022-jp");

    var testBytes = isoEncoding.GetBytes("こちらはテスト文字列である");
    var firstBytes = testBytes.Take(9).ToArray();
    var lastBytes = testBytes.Skip(9).ToArray();

    var finalString = isoEncoding.GetString(firstBytes) + isoEncoding.GetString(lastBytes);

    testBytes.Length.Is(firstBytes.Length + lastBytes.Length);
    finalString.Is("こちらはテスト文字列である"); //fails
}

Result:

Assert.That(actual, Is.EqualTo(expected))
Expected string length 13 but was 23. Strings differ at index 3.
Expected: "こちらはテスト文字列である"
But was:  "こちら$O%F%9%HJ8;zNs$G$"$k"
--------------^

 at NUnit.Framework.Legacy.ClassicAssert.AreEqual(Object expected, Object actual, String message, Object[] args)

The fix would be to read the full response into something like a List<byte> and decode it only afterwards.
I'll create a PR later so that it doesn't get mixed up with other fixes.
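
A minimal sketch of the fix proposed above, not the actual patch. It assumes the server signals the end of the response by closing the connection, so that Read eventually returns 0. The names ResponseReader, ReadAllAndDecode and bufferSize are hypothetical, and a MemoryStream is used in place of List<byte> as the byte accumulator.

```csharp
using System;
using System.IO;
using System.Text;

static class ResponseReader
{
    // Hypothetical helper: collects every chunk as raw bytes and decodes once,
    // so no character or escape sequence is ever split across two decode calls.
    public static string ReadAllAndDecode(Stream s, Encoding encoding, int bufferSize = 8192)
    {
        var readBuff = new byte[bufferSize];
        using (var allBytes = new MemoryStream())
        {
            int cbRead;
            // Assumption: Read returns 0 once the remote side has closed the connection.
            while ((cbRead = s.Read(readBuff, 0, readBuff.Length)) > 0)
            {
                allBytes.Write(readBuff, 0, cbRead);
            }

            // Decode the complete response in a single call.
            return encoding.GetString(allBytes.ToArray());
        }
    }
}
```

The trade-off is that the whole response is held in memory before decoding, which should be acceptable for typical WHOIS responses. If incremental decoding is preferred, a System.Text.Decoder obtained from Encoding.GetDecoder() is designed to carry state between calls and might be an alternative worth testing against ISO-2022-JP.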
