Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Occasional System.ArgumentOutOfRangeException .Render() #164

Open
wjchristenson2 opened this issue Oct 6, 2014 · 15 comments
Open

Occasional System.ArgumentOutOfRangeException .Render() #164

wjchristenson2 opened this issue Oct 6, 2014 · 15 comments

Comments

@wjchristenson2
Copy link

I'm getting the below error occasionally. It's random which is strange. It may work 95% of the time and then fail out of nowhere...

var divPricing = this.MyCQ["div.info div.pricing"].FirstOrDefault();
if (divPricing != null)
{
    CQ divPricingCQ = divPricing.Render();
}

Details: System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
at CsQuery.Implementation.DomElement.get_ClassName()
at CsQuery.Implementation.DomElement.d__38.MoveNext()
at CsQuery.Output.FormatDefault.RenderElementInternal(IDomObject element, TextWriter writer, Boolean includeChildren)
at CsQuery.Output.FormatDefault.RenderStack(TextWriter writer)
at CsQuery.Output.FormatDefault.Render(IDomObject node)

@jamietre
Copy link
Owner

jamietre commented Oct 6, 2014

Can you try using the prerelease package on Nuget if you are not now? This
appears to be an obscure bug that was fixed some time ago but never made it
to release. I will make an effort to finally get the current package into
release soon.
On Oct 6, 2014 10:14 AM, "wjchristenson2" notifications@github.com wrote:

I'm getting the below error occasionally. It's random which is strange. It
may work 95% of the time and then fail out of nowhere...

var divPricing = this.MyCQ["div.info div.pricing"].FirstOrDefault();
if (divPricing != null)
{
CQ divPricingCQ = divPricing.Render();
}

Details: System.ArgumentOutOfRangeException: Index was out of range. Must
be non-negative and less than the size of the collection.
Parameter name: index
at CsQuery.Implementation.DomElement.get_ClassName()
at CsQuery.Implementation.DomElement.d__38.MoveNext()
at CsQuery.Output.FormatDefault.RenderElementInternal(IDomObject element,
TextWriter writer, Boolean includeChildren)
at CsQuery.Output.FormatDefault.RenderStack(TextWriter writer)
at CsQuery.Output.FormatDefault.Render(IDomObject node)


Reply to this email directly or view it on GitHub
#164.

@wjchristenson2
Copy link
Author

Thanks. I'll use 1.3.4 for now and will wait for 1.3.5 to be released. It's good to know I can bounce to the pre-release version which should fix the issue if it is escalated.

@wjchristenson2
Copy link
Author

FYI - This just occurred again tonight running on CsQuery 1.3.5-beta5 (Prerelease).

Details: System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
at CsQuery.Implementation.DomElement.get_ClassName()
at CsQuery.Implementation.DomElement.d__35.MoveNext()
at CsQuery.Output.FormatDefault.RenderElementInternal(IDomObject element, TextWriter writer, Boolean includeChildren)
at CsQuery.Output.FormatDefault.RenderStack(TextWriter writer)
at CsQuery.Output.FormatDefault.Render(IDomObject node)

@jamietre
Copy link
Owner

My bad - I didn't look at the stack trace carefully when you posted this, this is not the same old bug I had been thinking of that was fixed in the prerelease. I will look into this today.

@wjchristenson2
Copy link
Author

Have you had a chance to look into this? We are getting this error daily several times a day. Let me know if you need anything from me.

Thanks!

@jamietre
Copy link
Owner

Sorry I didn't look at it yet... I am looking at the relevant code right now and see nothing obvious, in normal use, though I do see that this method could be susceptible to thread safety issues. That is, if the CSS classes of an element were modified at the same time as it was being rendered, this error could occur. Is this scenario possible in the way you're using it?

If this is a thread safety problem try this change to Implementation/DomElement.cs

public override string ClassName
    {
        get
        {
            if (HasClasses)
            {
                string className = "";
                lock (((ICollection)_Classes).SyncRoot)
                {
                    foreach (ushort clsId in _Classes)
                    {
                        className += (className == "" ? "" : " ") + HtmlData.TokenName(clsId);
                    }
                }
                return className;
            }
            else
            {
                return "";
            }
        }
        set
        {
            SetClassName(value);
        }
    }

.. it will probably be hard for me to repro your error so the best way to figure out this is for you to compile your own -- if this presents any problem for you I can send you a DLL

@wjchristenson2
Copy link
Author

We are definitely using CsQuery on multiple threads, although each instance of CsQuery is created and used on one thread. So, if there are static variables/etc that are not thread safe then that could be a cause.

We are not altering any of the DOM. We are simply extracting relevant information (read-only).

If this error occurs, then we take the HTML page in question and requeue it for processing later and then it typically goes through fine w/o exception a subsequent time. Its very intermittent, but definitely happens frequently.

Not to confuse the matter, but here's another error we just got.

Details: System.NullReferenceException: Object reference not set to an instance of an object.
at CsQuery.Implementation.AttributeCollection.d__0.MoveNext()
at CsQuery.Implementation.DomElement.d__35.MoveNext()
at CsQuery.Output.FormatDefault.RenderElementInternal(IDomObject element, TextWriter writer, Boolean includeChildren)
at CsQuery.Output.FormatDefault.RenderStack(TextWriter writer)
at CsQuery.Implementation.DomElement.get_InnerHTML()

If you wanted us to test changes on our systems for awhile, sending the DLL would probably work best.

Thanks for your help.

@jamietre
Copy link
Owner

CsQuery is fine for use in a multithreaded environment generally - all static data is managed using threadsafe types - but individual instances of a CQ object are definitely not. Both of these errors are occurring while iterating over internal collections of their respective objects, which is why I suspect some kind of threading problem.

This is a long shot, but are you using weak references to CQ objects in your app?

We can try a build with some thread protection on the low level access methods (though there could be a lot of them) to see if it changes things but want to be sure there's nothing unusual architecturally.

@wjchristenson2
Copy link
Author

We are not using weak references to CQ objects. Sounds good on the thread protection.

@wjchristenson2
Copy link
Author

Here's another exception that works fine 99% of the time then just blew up a minute ago:

        var ulDetails = myCQ["ul.details"];
        if (ulDetails != null)
        {
            CQ ulDetailsCQ = ulDetails.RenderSelection();

            var span = ulDetailsCQ["li.stockNumber span.value"].FirstOrDefault();
        }

Details: System.NullReferenceException: Object reference not set to an instance of an object.
at CsQuery.Implementation.AttributeCollection.d__0.MoveNext()
at CsQuery.Implementation.DomElement.d__35.MoveNext()
at CsQuery.Output.FormatDefault.RenderElementInternal(IDomObject element, TextWriter writer, Boolean includeChildren)
at CsQuery.Output.FormatDefault.RenderStack(TextWriter writer)
at CsQuery.CQ.RenderSelection(IOutputFormatter outputFormatter, StringWriter writer)

Here's the HTML:

<ul class="details">
    <li class="bodyStyle">
        <strong class="title">Bodystyle:</strong>
        <span class="value">SUV</span>
    </li>
    <li class="engine">
        <strong class="title">Engine:</strong>
        <span class="value">5.6LL V-8 cyl</span>
    </li>
    <li class="transmission">
        <strong class="title">Transmission:</strong>
        <span class="value">7-Speed Automatic</span>
    </li>
    <li class="driveLine">
        <strong class="title">Drive Line:</strong>
        <span class="value">4x4</span>
    </li>
    <li class="exteriorColor">
        <strong class="title">Exterior Color:</strong>
        <span class="value">Black Obsidian</span>
    </li>
    <li class="interiorColor">
        <strong class="title">Interior Color:</strong>
        <span class="value">Graphite w/Leather-Appointed Seat Trim w/Tuscan Bu</span>
    </li>
    <li class="stockNumber">
        <strong class="title">Stock #:</strong>
        <span class="value">A46045</span>
    </li>
</ul>

@jamietre
Copy link
Owner

There is something a little bit weird about that code (though nothing that should make it blow up) -

CQ ulDetailsCQ = ulDetails.RenderSelection()

RenderSelection returns a string - and assigning a string to CQ object invokes an implicit static overload of the = operator to parse the string as HTML, this is really the same as

CQ ulDetailsCQ = CQ.Create(ulDetails.RenderSelection())

It would be a lot more efficient to do:

CQ ulDetailsCQ = ulDetails.Clone();

.. and if all you are trying to do is do a sub-query then you really don't even need to do that..

var span = ulDetails.Find("li.stockNumber span.value").FirstOrDefault();

would get you the same thing without having to create a new CQ.

This is just a code review, though, I really can't think why this would be in any way responsible for the problem you are seeing.

@wjchristenson2
Copy link
Author

Thanks for the code review! Always helpful to see new approaches to getting at the same result. If the problem is with creating a new CQ object, I can try to avoid such and use the Find() method instead.

@jamietre
Copy link
Owner

Well, I doubt this is the problem, but it might at least shed some light on what the problem is by trying different things out.

@wjchristenson2
Copy link
Author

Update: I've changed all code with CsQuery to use lock {} to ensure that any CsQuery interaction is single-threaded. The quantity of errors has gone down slightly, but they still occur. For whatever reason, retrying the web page at a later time works. On a web page that had the error occur, I've taken the HTML which CsQuery errored on and taken it offline and ran the exact same code on it in a loop of 1,000 times and it did not error. At this point, I am out things to try.

My only thought is it may be something with memory (garbage collection) or timing.

@wjchristenson2
Copy link
Author

Another Update: After reviewing all exceptions, it appears they randomly occur with using the IDomObject.Render() method or calling the IDomObject.InnerHTML property.

We get random NullReferenceExceptions or Index out of range exceptions. I've began refactoring our code to avoid them.

lukasbob added a commit to Siteimprove/CsQuery that referenced this issue May 3, 2016
…mplemented in HtmlData.cs.

This addresses issues where the `nextID` counter overflows ushort.MaxValue and wraps around to return the wrong index value, resulting in wrong strings output from the `Render` method.

Fix jamietre#204, jamietre#205, jamietre#189, jamietre#164
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants