I am writing what I would think is a fairly simple use of AngleSharp[.Css]: I am extracting an HTML table of COVID-19 cases etc. by country. The headers (or other cells) can contain HTML <br>. INode.Text() (an extension) and INode.TextContent() remove the <br>, returning values like “TotalCases”. My implementation parses the 3000-ish cells in 4.6 seconds. Using AngleSharp.Css’s ElementExtensions method string GetInnerText(this IElement element); takes over 8 minutes, making it unusable.
I debugged it; what slows it down is basically the computation of the style rules. Because I also don't need styles for InnerText, apart from the default rules (paragraphs or divs breaking lines and so on), I added 2 null checks.
With that change I can use InnerText without specifying .WithCss and without calling WithRenderDevice; this makes your code parse in 25 ms instead of 8 minutes.
I will use my fork for now, because this is probably not an acceptable solution for Florian.
Bug Report
I assume you must implement CSS’s display:none and visibility:hidden. I do not require that functionality, just as I do not require an implementation of JavaScript. If GetInnerText() cannot be sped up, a reasonable solution would be to use something like my code together with your implementation of HTML entities such as &copy; etc.
The attached project’s interesting code is in AngleSharpCssSpeedFault.cs.
AngleSharpCssSpeedFault.zip
The last method InnerText(IElement) has a #if to switch between the two implementations of InnerText().
Prerequisites
Run the attached solution.
Description
see above
Steps to Reproduce
Possible Solution
Use my InnerText(), but add expansion of all HTML & entities, as that is missing.
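For illustration, a CSS-free inner-text helper along these lines might look roughly like the sketch below. This is not the code from the attached project: `SimpleInnerText` and its methods are hypothetical names, and it uses only AngleSharp core (no AngleSharp.Css). It walks the DOM, emits a newline for each <br>, and skips script and style elements. It assumes, as I understand it, that the parser has already decoded character references in text nodes, so no separate entity expansion step is shown.

```csharp
using System.Text;
using AngleSharp.Dom;

// Hypothetical helper; not part of AngleSharp or the attached project.
public static class SimpleInnerText
{
    // Returns the text under `node`, turning each <br> into a newline.
    // No CSS is evaluated, so display:none and visibility:hidden are ignored.
    public static string Get(INode node)
    {
        var sb = new StringBuilder();
        Append(node, sb);
        return sb.ToString();
    }

    private static void Append(INode node, StringBuilder sb)
    {
        foreach (var child in node.ChildNodes)
        {
            if (child.NodeType == NodeType.Text)
            {
                // Text nodes carry the text as produced by the parser.
                sb.Append(child.TextContent);
            }
            else if (child is IElement element)
            {
                if (element.LocalName == "br")
                {
                    sb.Append('\n');
                }
                else if (element.LocalName != "script" && element.LocalName != "style")
                {
                    // Recurse into ordinary elements; comments and other
                    // node types are skipped by the checks above.
                    Append(element, sb);
                }
            }
        }
    }
}
```

A call could look like `SimpleInnerText.Get(document.QuerySelector("th"))`, where `document` comes from `new AngleSharp.Html.Parser.HtmlParser().ParseDocument(html)`. The trade-off is that block-level line breaks (p, div) and hidden elements are not handled, which GetInnerText() does account for.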