Description
I ran into an issue in an application where an XmlDocument had a node with 150000 text-node children.
The element contained a base64-encoded PDF which, for some reason, had been split into text nodes of 256 characters each.
The problem is not the split itself, but that iterating over all the children takes a long time.
for (var node = root.FirstChild; node != null; node = node.NextSibling)
{
    count += 1;
}
(and also node.ChildNodes.Count)
In my test, the iteration time also appears to grow much faster than linearly (roughly quadratically, judging by these timings):
50000 text nodes = 1.8s
100000 text nodes = 7s
150000 text nodes = 16s
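As a rough check on the growth rate, using only the three timings above, the ratios relative to the 50000-node run can be compared with the squared ratios of the node counts:

```latex
\[
\frac{t(100000)}{t(50000)} = \frac{7}{1.8} \approx 3.9 \approx 2^2,
\qquad
\frac{t(150000)}{t(50000)} = \frac{16}{1.8} \approx 8.9 \approx 3^2
\]
```

Doubling and tripling the node count multiplies the time by roughly 4 and 9, which is what quadratic growth would predict. Exponential growth would instead multiply the time by the same factor for each additional 50000 nodes, but the observed factors are about 3.9 and then about 2.3.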
My test code
// LINQPad script: Dump() is a LINQPad extension method.
void Main()
{
    var doc = new XmlDocument();
    var root = doc.CreateElement("Root");
    doc.AppendChild(root);
    AddChildren(doc, root, 150000);
    CountChildren(doc.FirstChild).Dump();
}

// Appends `count` text nodes of 256 characters each to `node`.
void AddChildren(XmlDocument doc, XmlNode node, int count)
{
    string text = new string('x', 256);
    for (int ii = 0; ii < count; ++ii)
    {
        var textNode = doc.CreateTextNode(text);
        node.AppendChild(textNode);
    }
}

// Walks the children via NextSibling and prints the elapsed time.
int CountChildren(XmlNode root)
{
    int count = 0;
    var sw = new Stopwatch();
    sw.Start();
    for (var node = root.FirstChild; node != null; node = node.NextSibling)
    {
        count += 1;
    }
    Console.WriteLine($"{count} - {sw.ElapsedMilliseconds}ms");
    return count;
}
I could not figure out why, but in my real application it takes much longer.
The source for the XmlDocument in that case is a WCF message, created from:
var doc = new XmlDocument();
using (var docWriter = doc.CreateNavigator().AppendChild())
{
    message.WriteBody(docWriter);
}
Configuration
Benchmark Process Environment Information:
BenchmarkDotNet v0.13.8
Runtime=.NET 9.0.9 (9.0.925.41916), X64 RyuJIT AVX2
GC=Concurrent Workstation
HardwareIntrinsics=AVX2,AES,BMI1,BMI2,FMA,LZCNT,PCLMUL,POPCNT,AvxVnni,SERIALIZE VectorSize=256
Regression?
I got the same result in .NET Core 3.1.32.
Data
Analysis
I looked at the source code, and the children seem to be stored in a singly linked list. I have no idea how that could result in iteration time that grows faster than linearly.
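One way to narrow this down, offered as a sketch rather than a diagnosis, is to time the same NextSibling loop at several sizes, once with text-node children and once with element children. The helper names BuildRoot and CountChildren, the element name "Item", and the chosen sizes are made up for this sketch; only the XmlDocument, XmlNode and Stopwatch calls are existing APIs.

```csharp
using System;
using System.Diagnostics;
using System.Xml;

// Hypothetical helper: builds a root element with `count` children,
// either 256-character text nodes or empty <Item/> elements.
static XmlNode BuildRoot(int count, bool useTextNodes)
{
    var doc = new XmlDocument();
    var root = doc.CreateElement("Root");
    doc.AppendChild(root);
    string text = new string('x', 256);
    for (int i = 0; i < count; i++)
    {
        if (useTextNodes)
            root.AppendChild(doc.CreateTextNode(text));
        else
            root.AppendChild(doc.CreateElement("Item"));
    }
    return root;
}

// Same traversal as in the report: follow NextSibling until null.
static int CountChildren(XmlNode root)
{
    int count = 0;
    for (var node = root.FirstChild; node != null; node = node.NextSibling)
        count++;
    return count;
}

// Sizes are arbitrary; each step doubles n so the time ratio is easy to read.
foreach (bool useTextNodes in new[] { false, true })
{
    foreach (int n in new[] { 10_000, 20_000, 40_000 })
    {
        var root = BuildRoot(n, useTextNodes);
        var sw = Stopwatch.StartNew();
        int counted = CountChildren(root);
        sw.Stop();
        string kind = useTextNodes ? "text" : "element";
        Console.WriteLine($"{kind} n={counted}: {sw.ElapsedMilliseconds} ms");
    }
}
```

If the element runs scale linearly while the text-node runs do not, that would suggest the cost comes from how adjacent text nodes are handled rather than from the linked list itself.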