Description
HtmlNodeNavigator assigns new instances to the _doc and _nametable fields in their declarations:
private readonly HtmlDocument _doc = new HtmlDocument();
private readonly HtmlNameTable _nametable = new HtmlNameTable();
However, some of the constructors (including the copy constructor) immediately assign new values to these fields, so the instances created by the field initializers are discarded right away. This unnecessary creation and destruction of objects can cause slowness when the constructors are called repeatedly. For example, HtmlNodeNavigator.Clone() is indirectly called by HtmlNode.SelectNodes.
Suggested fix
Move the assignments from the field declaration line to the empty constructor:
internal HtmlNodeNavigator()
{
    _doc = new HtmlDocument();
    _nametable = new HtmlNameTable();
    Reset();
}
and call this constructor from all other constructors except the ones that initialize _doc and _nametable themselves.
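The effect can be illustrated in isolation with a minimal sketch. The types below (Doc, EagerNavigator, LazyNavigator, Demo) are hypothetical stand-ins, not the HtmlAgilityPack classes. They only mirror the pattern described above: a field initializer whose result is overwritten by the copy constructor, versus initialization moved into the parameterless constructor.

```csharp
using System;

// Hypothetical stand-in for an expensive object; NOT HtmlAgilityPack.HtmlDocument.
public sealed class Doc
{
    public static int Created;      // counts constructed instances for the demo
    public Doc() { Created++; }
}

// Mirrors the reported pattern: the field initializer runs before every
// constructor body, including the copy constructor that overwrites the field.
public sealed class EagerNavigator
{
    private readonly Doc _doc = new Doc();

    public EagerNavigator() { }

    public EagerNavigator(EagerNavigator other)
    {
        _doc = other._doc;          // the Doc made by the initializer is discarded
    }
}

// Mirrors the suggested fix: only the parameterless constructor allocates.
public sealed class LazyNavigator
{
    private readonly Doc _doc;

    public LazyNavigator()
    {
        _doc = new Doc();
    }

    public LazyNavigator(LazyNavigator other)
    {
        _doc = other._doc;          // no throwaway allocation
    }
}

public static class Demo
{
    // Returns how many Doc instances were created while making `copies` copies.
    public static int CountEagerCopies(int copies)
    {
        var source = new EagerNavigator();
        Doc.Created = 0;
        for (int i = 0; i < copies; i++) { new EagerNavigator(source); }
        return Doc.Created;
    }

    public static int CountLazyCopies(int copies)
    {
        var source = new LazyNavigator();
        Doc.Created = 0;
        for (int i = 0; i < copies; i++) { new LazyNavigator(source); }
        return Doc.Created;
    }

    public static void Main()
    {
        Console.WriteLine(CountEagerCopies(1000)); // prints 1000
        Console.WriteLine(CountLazyCopies(1000));  // prints 0
    }
}
```

In this sketch each copy made through EagerNavigator allocates one Doc that is never used, while LazyNavigator allocates none. The real cost in HtmlNodeNavigator depends on how expensive HtmlDocument and HtmlNameTable are to construct.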
How to reproduce
I added the following unit test:
[Test]
public void SelectEventAttributesTest()
{
    String xpath = "//* [@onkeypress or @onkeydown or @onkeyup or @onclick or @ondblclick or @onmousedown or @onmouseup or @onmouseover or @onmousemove or @onmouseout or @onmouseenter or @onmouseleave or @onmousewheel or @oncontextmenu or @onabort or @onbeforeunload or @onerror or @onload or @onmove or @onresize or @onscroll or @onstop or @onunload or @onreset or @onsubmit or @onblur or @onchange or @onfocus or @onfocusin or @onfocusout or @oninput or @onbeforeactivate or @onactivate or @onbefordeactivate or @ondeactivate or @onbounce or @onfinish or @onstart or @onbeforecopy or @onbeforecut or @onbeforeeditfocus or @onbeforepaste or @onbeforeupdate or @oncopy or @oncut or @ondrag or @ondragdrop or @ondragend or @ondragenter or @ondragleave or @ondragover or @ondragstart or @ondrop or @onlosecapture or @onpaste or @onselect or @onselectstart or @oncontrolselect or @onmovestart or @onmoveend or @onafterupdate or @oncellchange or @ondataavailable or @ondatasetchanged or @ondatasetcomplete or @onerrorupdate or @onrowenter or @onrowexit or @onrowsdelete or @onrowsinserted or @onafterprint or @onbeforeprint or @onfilterchange or @onhelp or @onpropertychange or @onreadystatechange]";
    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(@"<html><body><div id='foo' onclick='bar'><span> some</span> text</div></body></html>");
    for (int i = 0; i < 100000; i++)
    {
        doc.DocumentNode.SelectNodes(xpath).ToList();
    }
}
Runtime with and without the fix:
Without the fix: 8 seconds.
With the fix: 5 seconds.
Further technical details
HAP version: 1.11.1.0
.NET version (net472, netcore, etc.): 4.7.2