
Azure webjob not appearing to respect MaxDequeueCount property #1045

Closed
mflps opened this issue Mar 7, 2017 · 17 comments
@mflps

mflps commented Mar 7, 2017

Repro steps

We are having the same issue as described here:

http://stackoverflow.com/questions/42260068/azure-webjob-not-appearing-to-respect-maxdequeuecount-property

We have a function triggered by a queue, but the webjob is not respecting the MaxDequeueCount property of 5.

It's been running for days now.

Expected behavior

The webjob function should respect the MaxDequeueCount property.

Actual behavior

Webjob is not respecting the MaxDequeueCount property of 5. In the screenshot you can see the message has been dequeued 142 times.


Known workarounds

None known.

Related information


  • Package version

Microsoft.Azure.WebJobs v2.0.0

  • Links to source:

http://stackoverflow.com/questions/42260068/azure-webjob-not-appearing-to-respect-maxdequeuecount-property

@christopheranderson
Contributor

@soninaren - Assigning to Naren to investigate

@mathewc
Member

mathewc commented Mar 7, 2017

Here is a pointer to the code containing our logic for handling max dequeue count.

As you can see, we first copy the queue message, then we delete it. Perhaps the delete is failing for some reason. That would explain things.

Do you see the message in the poison queue - did it get copied there?
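The copy-then-delete sequence described above can be sketched as follows. This is an illustrative in-memory model, not the actual WebJobs SDK source; the queue and message shapes are assumptions:

```python
# Illustrative model of the poison-message handling described above:
# copy the message to the poison queue first, then delete it from the
# main queue. If step 2 silently fails, the message becomes visible
# again and its dequeue count keeps climbing -- the reported symptom.

def poison_if_exceeded(message, main_queue, poison_queue, max_dequeue_count=5):
    """Move a repeatedly failing message to the poison queue."""
    if message["dequeue_count"] <= max_dequeue_count:
        return False  # under the limit; let the normal retry happen
    poison_queue.append(dict(message))  # step 1: copy to poison queue
    main_queue.remove(message)          # step 2: delete from main queue
    return True
```

If step 2 ever fails without being noticed, the copy in the poison queue exists but the original stays in the main queue, which matches what the reporter is seeing.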

@mflps
Author

mflps commented Mar 7, 2017 via email

@mathewc
Copy link
Member

mathewc commented Mar 7, 2017

Yes, good question. As you can see in our delete code here, we have some specific handling for various exception types.

What would really help diagnose this would be if you could write a quick app using the storage SDK that tries to delete this message and share the results. That would pinpoint the issue and we could get a fix in for it. I haven't been able to repro this, but since you have a repro already that would help.
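The kind of delete-with-exception-handling being described can be sketched like this. It is a hedged stand-in, not the SDK's code: a MessageNotFound-style storage error is modeled with `ValueError`, and the queue is a plain list rather than Azure Storage:

```python
# Sketch of deleting a queue message where "already deleted / not found"
# is treated as benign, while any other failure would propagate. If a
# real delete failed without surfacing an error, the message would keep
# reappearing in the queue -- consistent with this issue's symptom.

def try_delete(queue, message):
    """Return True if deleted, False if the message was already gone."""
    try:
        queue.remove(message)   # stand-in for the storage delete call
    except ValueError:          # stand-in for a MessageNotFound response
        return False            # someone else already deleted it; fine
    return True
```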

@mflps
Author

mflps commented Mar 8, 2017 via email

@Bio2hazard

Symptoms that @mflps describes are identical to #985, so it's probably related to Azure Storage version 8.0.1.

@mathewc
Member

mathewc commented Mar 8, 2017

Good point @Bio2hazard. @mflps did you by chance move to 8.0.1?

@fabiomaulo

fabiomaulo commented Mar 9, 2017

I have the same problem (message in the poison queue and message still in the main queue forever) and another very similar one. The first is probably the most difficult to reproduce because, IMO, you need a multi-instance process consuming the queue (4 or 5 instances, for example).
The second, very similar, problem is when you have an OutOfMemory or never-finishing job.
The solution could be very easy to implement (for us users, or better, in the SDK itself): the dequeue count should be checked even before the message is passed to the job (before or in the default BeginProcessingMessageAsync implementation).

Note: I'm using the 8.1.1 Storage SDK and the 2.0.0 WebJobs SDK.
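The pre-check being suggested — inspect the dequeue count before the job ever runs, so an uncatchable failure like OutOfMemory cannot block poisoning — could be sketched as follows. This is illustrative only; it is not the SDK's BeginProcessingMessageAsync, and the queue/message shapes are assumptions:

```python
# Sketch of checking the dequeue count *before* invoking the job, so a
# crash during the job can never prevent the message from being poisoned
# on a later delivery.

def process_message(message, job, main_queue, poison_queue, max_dequeue_count=5):
    if message["dequeue_count"] > max_dequeue_count:
        poison_queue.append(dict(message))  # poison without re-running the job
        main_queue.remove(message)
        return "poisoned"
    try:
        job(message)
        main_queue.remove(message)          # success: delete normally
        return "completed"
    except Exception:
        # Leave the message; its dequeue count rises when it reappears.
        return "retry"
```

The key property is that the poisoning branch runs before `job(message)`, so even a job that kills the process only delays poisoning until the next delivery.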

@mathewc
Member

mathewc commented Mar 10, 2017

Note that storage 8.x is not really supported. We only claim support for the version we ship with, currently 7.2.1; that's all we're testing against. If you upgrade to a new major version you may have issues. That said, we're planning on doing a test run with 8.x so that we can unblock this, but in general, any time you force an upgrade to a later version, you might hit issues that we can't anticipate.

@fabiomaulo

Mat,
there is no problem with issues; the more people move forward, the more opportunities there are to make the SDK better and more stable.
The matter here is the risk of checking the DequeueCount only after the job runs. If the job causes a situation that prevents the exception from being caught (for example OutOfMemory), you can't check the DequeueCount; if that situation happens again and again on each run, you will never move the message to the poison queue.

@tomeastham

I had this issue and reverted to storage 7.2.1, and it all works fine now.

@mflps
Author

mflps commented Mar 17, 2017 via email

@chadwackerman

@christopheranderson If there are issues like this, please, please mark your NuGet packages as requiring WindowsAzure.Storage < 8.0 until things get fixed. You could put a company out of business with a bug like this. Also given the obvious Sev 1 nature of this problem I'm surprised to see a lack of followup. I'd prefer not to discover critical bugs by reading issue one-thousand-something on a Sunday morning.

Open source is great, but I'm noticing that the primitive GitHub issue management system doesn't scale well to projects with an audience as large as Microsoft technologies. It seems to be flooding Program Managers with so many little issues that big ones like this keep getting ignored.

@christopheranderson
Contributor

Sorry for the confusion here, folks. We're working on updating all our package versions in another major version release, which will likely start to have some pre-release bits this summer; that will unblock us to run on .NET Standard 2.0 / .NET Core, etc.

Unfortunately, since some folks can use 8 (if they aren't using Storage bindings), we can't add the version cap now, since that would technically be a breaking change. This is our bad for not finding this before we released 2.0. After discussing it, we think the least impactful thing is to leave the dependency version the same and document the issue with Azure Storage 8.0.

I've opened up a separate issue to track adding 8.0 support - #1091

This issue will remain open until we address that one and/or add a proper version cap to our dependency version.

@chadwackerman

A package manager refusing to install or upgrade something because there's a known compatibility issue is a feature. It's not something to hide from.

The goal is maximal transparency -- not keeping version numbers low and issues buried. It's just numbers. Unless a new version is imminent (days away), I'd release 3.0 and fix the NuGet dependency.

Here's an example of what not to do. The CoreFX team broke the entire .NET HTTP stack with 4.0. People screamed but nobody took it seriously. They released System.Net.Http 4.1, 4.2 and even 4.3 without fixing the bug. Six months later they had to roll back features and remove API. Did a beta as 4.4. Then released the final version as... 4.3.1.

Breaking changes, API removed, and they tacked on 0.0.1. Yikes.

I think some of this is driven by ego and embarrassment because it makes little technical sense. I really don't know what's going on with versioning but I'd encourage you to champion SemVer internally because the random versioning is really causing problems for devs on the outside.

http://semver.org/

Sorry for the lecture but somebody has to wave this flag.

@brettsam
Member

Addressed with #1141.

@claudio-yuri

Hi @mflps, is this fix included in the latest version of the SDK?
