.StartNewAsync() and .RestartAsync() fail, when running locally and TaskHubName='TestHubName' #1926

Open
scale-tone opened this issue Aug 20, 2021 · 16 comments

@scale-tone

This is a follow-up on #1592. It turned out that, when running outside Azure, DurableFunctionsMonitor fails to .RestartAsync() and/or .StartNewAsync() if the Task Hub it is attached to is named 'TestHubName'. With other Task Hubs it works fine. It also works fine in Azure, no matter what the Task Hub is called.

I believe it is due to this logic here, which turns orchestration name validation on or off.

Can we think of a better way to invoke that logic?

FYI, DfMon's code that makes the call is here.

Expected behavior

Both .StartNewAsync() and .RestartAsync() succeed.

Actual behavior

System.Private.CoreLib: Exception while executing function: DfmPostOrchestrationFunction. Microsoft.Azure.WebJobs.Extensions.DurableTask: The function 'MapReduceOrchestrator' doesn't exist, is disabled, or is not an orchestrator function. 
Additional info: No orchestrator functions are currently registered!.

is thrown when the Task Hub is named 'TestHubName'.

App Details

  • Durable Functions extension version (e.g. v1.8.3): 2.4.3
  • Azure Functions runtime version (1.0 or 2.0): 3.0
  • Programming language used: C#
@ghost ghost added the Needs: Triage 🔍 label Aug 20, 2021
@cgillum
Collaborator

cgillum commented Aug 20, 2021

Have you tried setting ExternalClient = true?

[DurableClient(TaskHub = Globals.TaskHubRouteParamName, ExternalClient = true)] IDurableClient durableClient, 

@scale-tone
Author

scale-tone commented Aug 20, 2021

Just tried. Weirdly, that value gets discarded when the Task Hub is 'TestHubName' (and still takes effect otherwise):

image

And yes, it still throws then.

@cgillum
Collaborator

cgillum commented Aug 23, 2021

Interesting, I think we'll need to create a local repro for this to see why the attribute value isn't sticking. I couldn't immediately see anything in the code that would cause this behavior.

@scale-tone
Author

You can use DfMon as a repro. Just run it locally and press the 'Restart' button once connected to a 'TestHubName' hub.

[image: screenshot of the 'Restart' button]

But that ExternalClient setting - is it documented somewhere? What's the expected behavior of it?

@scale-tone
Author

I think I've found a workaround.
Setting the WEBSITE_SITE_NAME environment variable to something, e.g. http://localhost:<my-port-nr>, does seem to turn that validation off.
But I'm really curious whether it could break something else. Do you think it is safe, @cgillum?
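
For illustration only, one way to apply that workaround during local development is through local.settings.json. This is a hedged sketch rather than a confirmed fix: it assumes that the entries under "Values" are exposed to the local Functions host as environment variables, and the port 7071, the storage connection string and the worker runtime value are placeholders that must match your own setup.

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "WEBSITE_SITE_NAME": "http://localhost:7071"
  }
}
```

Setting the same variable in the shell before starting the host should be equivalent, under the same assumption.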

@cgillum
Collaborator

cgillum commented Aug 30, 2021

That workaround sounds very counter-intuitive to me, but then again, the problem you're running into also seems counter-intuitive to me :). FWIW, that environment variable is already set when deployed to Azure Functions. The only problem I can think of is if your port number in the environment variable is different from the port number used in the Functions core tools (which defaults to 7071).

@scale-tone
Author

> that environment variable is already set when deployed to Azure Functions

Yes, which is probably the reason why it all works fine in Azure.

> That workaround sounds very counter-intuitive to me, but then again, the problem you're running into also seems counter-intuitive to me :).

Two minuses usually make a plus, so let's assume you're somewhat positive about this :)
All right then, I will issue a patch for DfMon until this one is properly resolved.

@davidmrdavid
Contributor

Just doing some triaging here :-)

@scale-tone is this issue still reproducible on your end? If so, is there any chance you could attach a minimal repro to this thread? That would help accelerate / re-activate this investigation. Thanks!

@davidmrdavid davidmrdavid added Needs: Author Feedback Waiting for the author of the issue to respond to a question and removed Needs: Triage 🔍 labels Apr 6, 2022
@scale-tone
Author

@davidmrdavid, this behavior is still the same, even with the latest Microsoft.Azure.WebJobs.Extensions.DurableTask v2.6.1.

To reproduce, create a Function project, use "TestHubName" as the Task Hub name, get a default instance of DurableClient and call its .StartNewAsync() method with random parameters. Run all of this locally (not in Azure).

@ghost ghost added Needs: Attention 👋 and removed Needs: Author Feedback Waiting for the author of the issue to respond to a question labels Apr 7, 2022
@davidmrdavid davidmrdavid added Needs: Investigation 🔍 A deeper investigation needs to be done by the project maintainers. and removed Needs: Attention 👋 labels Apr 7, 2022
@davidmrdavid
Contributor

davidmrdavid commented Apr 7, 2022

@scale-tone: thanks for the quick response. I'm trying to repro this right now and I'm not able to.
Just to make sure I follow, let me share my code with you.

So this is my host.json

{
    "version": "2.0",
    "logging": {
        "applicationInsights": {
            "samplingSettings": {
                "isEnabled": true,
                "excludedTypes": "Request"
            }
        }
    },
    "extensions": {
        "durableTask": {
            "hubName": "TestHubName"
        }
    }
}

And I have a default DurableClient coming from the default C# DF template, so it looks like this:

        [FunctionName("DurableFunctionsOrchestrationCSharp_HttpStart")]
        public static async Task<HttpResponseMessage> HttpStart(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestMessage req,
            [DurableClient] IDurableOrchestrationClient starter,
            ILogger log)
        {
            // Function input comes from the request content.
            string instanceId = await starter.StartNewAsync("DurableFunctionsOrchestrationCSharp", null);

            log.LogInformation($"Started orchestration with ID = '{instanceId}'.");

            return starter.CreateCheckStatusResponse(req, instanceId);
        }

Is this what you were suggesting as a repro? Or am I supposed to construct the client some other way? Currently, when I execute this, I'm able to run the default orchestrator template code with no exceptions.

@davidmrdavid
Contributor

This is all running locally, with Azurite as my storage emulator.

@davidmrdavid davidmrdavid added the Needs: Author Feedback Waiting for the author of the issue to respond to a question label Apr 7, 2022
@scale-tone
Author

The orchestration name that you're trying to .StartNewAsync() should not exist in the current project.
The DurableClient instance should act in "External Client" mode (whatever that means).

To clarify again, this scenario is all about using DurableClient for managing orchestration instances that run in an external Function app, just like DfMon does.
In this scenario DurableClient does allow starting/restarting orchestrations, unless the task hub is called "TestHubName".

Also just to mention, the current DfMon version includes a workaround for this issue, so you couldn't reproduce it there.

@ghost ghost added Needs: Attention 👋 and removed Needs: Author Feedback Waiting for the author of the issue to respond to a question labels Apr 7, 2022
@davidmrdavid
Contributor

Thanks @scale-tone.

I think I was able to repro this now. This is what I did:
I have two local apps: A and B.

App A has an external client which starts an orchestrator in app B.
App B has "TestHubName" as its hubName.

Now here's the interesting bit.

If app A doesn't specify its own hubName in host.json, then I can repro your behavior.
However, if app A does specify its own hubName (say, TestHubName2), then I cannot repro it.

Which leads me to believe the issue here is with the default value of "hubName" which, according to this, is equal to "TestHubName"!

Here's my hypothesis: I think that if you have 2 apps both of which share the same taskhub name, an external client in app A will not be able to recognize Orchestrators defined in app B because A thinks there's only one app running due to it using the same taskhub as the other one! Now, I'm not particularly familiar with this part of the codebase, but this seems like a likely explanation from what I know thus far.
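
As an illustration of the observation above, a host.json for app A that sets its own task hub name might look like the sketch below. It is based only on the report that the failure did not reproduce once app A used a hubName other than the default; "TestHubName2" is just the example name from this comment, and any name that differs from app B's hub should serve the same purpose under that hypothesis.

```json
{
    "version": "2.0",
    "extensions": {
        "durableTask": {
            "hubName": "TestHubName2"
        }
    }
}
```

App B keeps "TestHubName" as its hubName, and the external client in app A still targets that hub explicitly.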

@davidmrdavid
Contributor

davidmrdavid commented Apr 7, 2022

Let me add this to our backlog so we can start prioritizing it, although it does seem like a rare edge case so it may take us a bit to get there.

@bachuv
Collaborator

bachuv commented Jul 19, 2022

I was also able to reproduce this locally and confirmed that if the default task hub name is used when configuring DurableClient, then it acts like there is only one app.

To add to the discussion above, here's the default task hub name behavior (from these docs):

> When deployed in Azure, the task hub name is derived from the name of the function app. When running outside of Azure, the default task hub name is TestHubName.

@bachuv bachuv removed the Needs: Investigation 🔍 A deeper investigation needs to be done by the project maintainers. label Jul 19, 2022
@davidmrdavid
Copy link
Contributor

@bachuv: any chance we could improve the error message so that it would suggest to customers that they need different TaskHub names per app for this scenario?


4 participants