Adding extendedSessionsEnabled configuration causes the orchestration to be stuck in pending / running state. #98

Closed
soninaren opened this issue Jul 30, 2019 · 20 comments
Labels: bug (Something isn't working), P2 (Priority 2 item)

@soninaren
Member

Investigative information

  • Durable Functions extension version: 1.8.0 and 1.8.3
  • durable-functions npm module version: 1.2.2
  • Language (JavaScript/TypeScript) and version: JavaScript
  • Node.js version: 10.14.1

Describe the bug
Adding extendedSessionsEnabled configuration to host.json causes the orchestration to be stuck in pending / running state.

To Reproduce

  • Create a function app in portal with node runtime
  • Set host.json to
{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[1.*, 2.0.0)"
  },
  "extensions": {
    "durableTask": {
      "extendedSessionsEnabled": true,
      "extendedSessionIdleTimeoutInSeconds": 30
    }
  }
}
  • Install the durable-functions npm package
  • Create a function via durable functions templates
  • Execute the function

Observed Behavior:
After starting the durable function via the httpStarter function, the status URI shows that the orchestration is stuck in the pending / running state. Removing extendedSessionsEnabled from host.json resolves the problem.
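
For anyone applying the workaround programmatically, here is a minimal sketch that strips the two extended-session settings from a parsed host.json object. The helper name `withoutExtendedSessions` is hypothetical and not part of any Azure tooling; editing host.json by hand achieves the same result.

```javascript
// Hypothetical helper (not part of any Azure tooling): returns a copy of a
// parsed host.json object with the extended-session settings removed.
function withoutExtendedSessions(hostJson) {
  // host.json is plain JSON, so a JSON round trip is a safe deep copy.
  const copy = JSON.parse(JSON.stringify(hostJson));
  const durableTask = copy.extensions && copy.extensions.durableTask;
  if (durableTask) {
    delete durableTask.extendedSessionsEnabled;
    delete durableTask.extendedSessionIdleTimeoutInSeconds;
  }
  return copy;
}

// The configuration from the repro steps above.
const hostJson = {
  version: "2.0",
  extensionBundle: {
    id: "Microsoft.Azure.Functions.ExtensionBundle",
    version: "[1.*, 2.0.0)",
  },
  extensions: {
    durableTask: {
      extendedSessionsEnabled: true,
      extendedSessionIdleTimeoutInSeconds: 30,
    },
  },
};

console.log(JSON.stringify(withoutExtendedSessions(hostJson), null, 2));
```

The input object is left untouched, so the original settings can be restored later if extended sessions become supported for the language in use.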

@kasravi

kasravi commented Nov 13, 2019

Still reproducible, is there any plan to fix this?

@soninaren soninaren added this to the Triaged milestone Nov 22, 2019
@ConnorMcMahon ConnorMcMahon added bug Something isn't working P2 Priority 2 item labels Jan 24, 2020
@ConnorMcMahon

@kasravi

We need to investigate the root cause of the bug. Hopefully it will be a straightforward fix, but it is possible that the out-of-process model that JavaScript relies on will require significant reworking to allow for extended sessions.

In that case, we will have to balance the engineering effort for this fix vs other improvements to the JS experience.

@andreujuanc

I just enabled it to tune performance a bit more, and now they are all stuck. I'll turn it off.
Hope this gets addressed soon!
Let me know if I can be of any help.

@ConnorMcMahon

Note that I have a PR for our documentation to make it more clear that this is currently unsupported.

@cgillum, we should have a discussion about the feasibility of this scenario. My understanding of extended sessions is that it heavily uses .NET constructs, and I'm not sure our current approach for out-of-proc languages supports receiving events after the execution has already started.

@cgillum
Collaborator

cgillum commented Feb 14, 2020

@ConnorMcMahon More than just documentation, if we detect that extended sessions are enabled for an out-of-proc language, we should probably throw a runtime exception of some kind.

But yes, I suggest we open an item to track making extended sessions work properly for out-of-proc languages. We certainly won't be able to resume in-progress tasks without language worker support, but I think we should be able to keep some of the other benefits, such as not reloading the history from Azure Storage on each replay.
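
To illustrate the benefit mentioned above (not reloading the history on each replay), here is a rough sketch. Everything in it is hypothetical: `FakeHistoryStore` is an in-memory stand-in for the history table in Azure Storage, and `runEpisodes` and `replay` are illustrative names, not extension or SDK APIs. It only counts how many full history reads happen with and without an in-memory cache.

```javascript
// Hypothetical in-memory stand-in for the history table in Azure Storage.
class FakeHistoryStore {
  constructor() {
    this.events = [];
    this.loads = 0; // number of full history reads
  }
  append(event) {
    this.events.push(event);
  }
  load() {
    this.loads += 1;
    return this.events.slice();
  }
}

// Placeholder for re-running the orchestrator code against the history.
function replay(history) {
  return history.length;
}

// Treats each incoming event as one orchestrator episode.
// cacheHistory = false: reload the full history before every replay.
// cacheHistory = true: load once, then keep appending to the in-memory copy.
function runEpisodes(store, incomingEvents, cacheHistory) {
  let cached = null;
  for (const event of incomingEvents) {
    store.append(event);
    let history;
    if (!cacheHistory) {
      history = store.load();
    } else if (cached === null) {
      cached = store.load();
      history = cached;
    } else {
      cached.push(event);
      history = cached;
    }
    replay(history);
  }
  return store.loads;
}

const events = ["ExecutionStarted", "TaskCompleted", "TaskCompleted", "TaskCompleted"];
console.log(runEpisodes(new FakeHistoryStore(), events, false)); // 4
console.log(runEpisodes(new FakeHistoryStore(), events, true)); // 1
```

The replay itself still happens on every episode in both modes; only the storage reads differ, which is the part that should not depend on language worker support.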

@andreujuanc

Maybe at least it should just ignore it? And log it out obviously. At least until you guys decide if it's possible to implement or not.

@humanamburu

humanamburu commented Jun 16, 2020

Hi @ConnorMcMahon,

Do you have any plans to support this feature for JS? Currently, I have a case with many small activity function calls, and the performance of the orchestrator degrades after some time.

Looks like these options will help me to improve performance. Thank you.

@ConnorMcMahon

@humanamburu,

This work item will take a bit of time, and as Chris mentioned, it won't be as complete of a solution as for C# just due to some inherent limitations with our architecture.

If you are having performance issues, I would recommend opening an issue for that. If you give us an instance id, region, and timestamp then we can look at our telemetry. If you could share the performance numbers, that would also be helpful.

I have a feeling that if you just want to improve your orchestrator performance, there is some much lower-hanging fruit that can get it to where you want sooner than extended sessions support will be finished.

@humanamburu

@ConnorMcMahon thank you for the quick response!

OK, I will try to refactor my logic, and I will create an issue with an example if refactoring does not help.

BTW, could you please add a note to the documentation about the availability of this feature for JS? I spent quite a bit of time debugging before finding this issue.

Thanks!

@Khadgar

Khadgar commented Aug 12, 2020

I've also tried to enable extended sessions, but it made my orchestrator hang. I've created a bug report: #214

@terem42

terem42 commented Dec 3, 2020

I observe similar behaviour with the PowerShell runtime for Azure Durable Functions.
The orchestrator function hangs and does not progress.

@ConnorMcMahon

Extended sessions don't work with any non-.NET function app.

I believe we have made it so if "extendedSessions" is set to true for non-.NET apps, we just ignore that value to prevent hanging orchestrations. That change should be deployed to v2 of extension bundles shortly.

@terem42

terem42 commented Dec 5, 2020

Pity. In my orchestrator function I use a looped polling mechanism to check the statuses of Azure container groups (Running/Failed/Terminated), because no events are sent on container group changes, so event callbacks cannot be used yet.
With 10-50 status-checking threads, this leads to a massive influx of entries in the task hub history table. Having the extendedSessions setting work for PowerShell would be very beneficial.

@ConnorMcMahon

@terem42

I'm a bit confused about why extended sessions would impact how many orchestration instances you would have. Extended sessions are just a performance optimization; enabling or disabling them should not result in any more orchestrations being scheduled.

@terem42

terem42 commented Dec 7, 2020

Performance, of course. The polling loop on containers is expected to run faster with extended sessions.

@brandonh-msft
Member

@ConnorMcMahon Seems like this issue's going to become even more of a showstopper with .NET 5 being out-of-proc, right? .NET customers who may be using this successfully today will find that, upon moving to .NET 5 on AF, their Functions no longer work with it enabled...

@ConnorMcMahon

This should no longer be an issue.

The extension now checks for out-of-proc status. If it is out-of-proc, extendedSessions is automatically set to false.

I believe this should be widely deployed now via extension bundles, but I will check.

@brandonh-msft
Member

Does it prevent it from being set to true, though? Or does it ignore it if it is?

@ConnorMcMahon

It just ignores it if it is true for now (I believe it does log a warning though). If/when we add some sort of extended session support for out-of-proc, we will resume respecting values of true.
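
As a rough illustration of that behavior, and not the extension's actual code (which is not shown in this thread), the check could look something like the sketch below. The function name `resolveExtendedSessions`, the `isOutOfProc` flag, and the warning text are all assumptions made for the example.

```javascript
// Hypothetical sketch of "ignore the setting and log a warning".
// The names and the warning text are illustrative assumptions, not the
// Durable Functions extension's real implementation.
function resolveExtendedSessions(options, isOutOfProc, logWarning) {
  if (options.extendedSessionsEnabled && isOutOfProc) {
    logWarning(
      "extendedSessionsEnabled is not supported for out-of-proc languages; ignoring it."
    );
    // Return a copy so the configured value itself is left as the user wrote it.
    return { ...options, extendedSessionsEnabled: false };
  }
  return options;
}

const warnings = [];
const resolved = resolveExtendedSessions(
  { extendedSessionsEnabled: true, extendedSessionIdleTimeoutInSeconds: 30 },
  true, // e.g. a JavaScript or PowerShell function app
  (message) => warnings.push(message)
);
console.log(resolved.extendedSessionsEnabled, warnings.length); // false 1
```

With this shape, an in-process app passes through unchanged, and an out-of-proc app keeps running with the setting effectively off.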

@davidmrdavid
Collaborator

Closing, as this should no longer be an issue.
