forked from keithjjones/WhoisClient.NET
Closed
Description
This is a rather theoretical bug (I haven't encountered it in practice), but it may emerge someday.
Consider this part of the code, which reads the response from the socket:
do
{
    cbRead = s.Read(readBuff, 0, readBuff.Length);
    res.Append(options.Encoding.GetString(readBuff, 0, cbRead));
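With a single-byte encoding such as ASCII, this works well. But some registries use other encodings. For example, JPNIC uses ISO-2022-JP to encode double-byte characters. ISO-2022-JP is also stateful: escape sequences switch between character sets. In this case, if a chunk of response bytes ends in the middle of a multi-byte character, or a later chunk is decoded without the escape sequence that preceded it, the result will be corrupted.
This test demonstrates the potential behavior: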
[Test]
public void StringWithDualByteChars_BrokenInTheMiddleOfTheChar_DecodingBothPartsDerivesIncorrectResult()
{
#if NETCOREAPP
    // Code page encodings such as ISO-2022-JP must be registered explicitly on .NET Core.
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
#endif
    var isoEncoding = Encoding.GetEncoding("iso-2022-jp");
    var testBytes = isoEncoding.GetBytes("こちらはテスト文字列である");

    // Simulate a response that arrives in two chunks, split after the ninth byte.
    var firstBytes = testBytes.Take(9).ToArray();
    var lastBytes = testBytes.Skip(9).ToArray();

    // Decode each chunk independently, as the read loop does.
    var finalString = isoEncoding.GetString(firstBytes) + isoEncoding.GetString(lastBytes);

    testBytes.Length.Is(firstBytes.Length + lastBytes.Length);
    finalString.Is("こちらはテスト文字列である"); // fails
}
Result:
Assert.That(actual, Is.EqualTo(expected))
Expected string length 13 but was 23. Strings differ at index 3.
Expected: "こちらはテスト文字列である"
But was: "こちら$O%F%9%HJ8;zNs$G$"$k"
--------------^
at NUnit.Framework.Legacy.ClassicAssert.AreEqual(Object expected, Object actual, String message, Object[] args)
A fix would be to read the full response into something like a List<byte> and decode it once afterwards.
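As a rough illustration of that idea, here is a minimal sketch. `WhoisResponseReader`, `ReadAllAndDecode`, and `ReadWithDecoder` are hypothetical names, not part of WhoisClient.NET, and the stream and encoding parameters stand in for the `s` and `options.Encoding` used in the loop above. The first method buffers all bytes in a `MemoryStream` (in place of `List<byte>`) and decodes once. The second is an alternative that keeps the chunked loop but uses a single `Decoder` from `Encoding.GetDecoder()`, which is designed to carry leftover bytes between calls.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not the library's actual code.
public static class WhoisResponseReader
{
    // Collects the raw bytes first and decodes them in a single call,
    // so no character can be split across two decode operations.
    public static string ReadAllAndDecode(Stream s, Encoding encoding, int bufferSize = 8192)
    {
        var readBuff = new byte[bufferSize];
        using (var allBytes = new MemoryStream())
        {
            int cbRead;
            while ((cbRead = s.Read(readBuff, 0, readBuff.Length)) > 0)
            {
                allBytes.Write(readBuff, 0, cbRead);
            }
            return encoding.GetString(allBytes.ToArray());
        }
    }

    // Alternative: decode chunk by chunk, but through one Decoder instance
    // that remembers incomplete byte sequences from the previous chunk.
    public static string ReadWithDecoder(Stream s, Encoding encoding, int bufferSize = 8192)
    {
        var decoder = encoding.GetDecoder();
        var readBuff = new byte[bufferSize];
        var charBuff = new char[encoding.GetMaxCharCount(bufferSize)];
        var res = new StringBuilder();
        int cbRead;
        while ((cbRead = s.Read(readBuff, 0, readBuff.Length)) > 0)
        {
            int cch = decoder.GetChars(readBuff, 0, cbRead, charBuff, 0, false);
            res.Append(charBuff, 0, cch);
        }
        return res.ToString();
    }
}
```

Buffering the whole response is the simpler of the two and matches the fix proposed here. The `Decoder` variant avoids holding the raw bytes in memory, but it relies on the decoder of the chosen encoding preserving its state between calls, which would need to be verified for ISO-2022-JP specifically.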
I'll create a PR later so that it doesn't interfere with other fixes.
jsakamoto