Orchestrator fails with multilingual model on Azure Sites #1287
Comments
@daveta - can you please investigate, advise? |
@nephinj, just to confirm: does the multilingual model work when running locally? What is the size of the Azure VM? |
@nephinj: Specifically, is it x64? And how much memory does the VM have? We recommend x64 VMs. |
Hi @nephinj, can you list the file sizes of the unzipped model folder? In the error message dump, "bad allocation" indicates that the machine does not have sufficient (contiguous) memory to load the Orchestrator multilingual ONNX model file. Please provision a larger Azure VM and try again. `std::bad_alloc` is the standard C++ exception thrown when a program cannot allocate enough memory; see https://www.cplusplus.com/reference/new/bad_alloc/ for details. Thanks. |
Model works when running locally. The only multilingual model I could get to work on Azure was pretrained.20210608.microsoft.dte.01.06.int.unicoder_multilingual.onnx. I was running the 32bit with the P1V2 Production Sku which has 3.5 GB of memory. I can try changing it to x64 to see if that helps and playing with the various plans. |
Hi @nephinj, the "int" model is quantized and is 1/4 the size of the model it was quantized from. The quantized model can perform slightly worse than the original, but in our experience the drop is only about 1% (micro-average accuracy). |
Hi @nephinj, please give us the exact Azure VM config you are using, so we can repro. Thanks. |
Hi @nephinj, thanks for the Azure configuration. We can repro the issue on an Azure VM. Will debug. |
Hi @nephinj, we suspect that you might be using an x86 Node.js installation on your Azure VM, in which case the bf-cli packages were also installed as their 32-bit versions. Please double-check whether your Azure VM has the x64 version installed. |
Hi @nephinj, we have tested several Azure VM scenarios, and the bigger multilingual models (800MB+) can only be loaded with the x64 build, which is installed when the Node.js installation is x64. We were able to run Orchestrator on the even bigger 12L multilingual model (1GB+) using a VM provisioned with only 4GB of main memory (B2s SKU). We also tested x86 Orchestrator (installed by an x86 Node.js installation) on a variety of x64 Windows Azure VMs with memory provisioned from 8GB to 64GB. Just as you described, x86 Orchestrator could not load the larger multilingual models (6L or 12L) but was able to load the smaller ones (quantized or EN-only models). Since you started with "32bit with the P1V2 Production Sku which has 3.5 GB of memory", I suspect the Node.js installation was still x86 even after you upgraded the VM to x64. In my experience, the 3.5GB P1V2 VM should be sufficient to load any Orchestrator model as long as its Node.js and Orchestrator packages are x64. |
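One way to check the hypothesis above is to ask Node.js which architecture it was built for and to look at the size of the model file on disk. The following is a minimal sketch, not an official diagnostic: the helper names `is64BitArch` and `describeModel` are made up for illustration, and the default path `model/model.onnx` is an assumption taken from the log in this issue. It only inspects the Node.js process that runs the script; it does not inspect the bot's own host process.

```javascript
// Sketch: report the Node.js build architecture and the model file size.
// Helper names and the default model path are illustrative assumptions.
const fs = require("fs");

// process.arch is "ia32" for 32-bit x86 builds and "x64" for 64-bit x86 builds.
function is64BitArch(arch) {
  return arch === "x64" || arch === "arm64";
}

// Return a one-line description of the model file, or say it is missing.
function describeModel(modelPath) {
  if (!fs.existsSync(modelPath)) {
    return `${modelPath}: not found`;
  }
  const sizeMb = fs.statSync(modelPath).size / (1024 * 1024);
  return `${modelPath}: ${sizeMb.toFixed(1)} MB`;
}

// Optional first argument: path to the ONNX model file (assumed default below).
const modelPath = process.argv[2] || "model/model.onnx";

console.log(`node ${process.version}, arch=${process.arch}, 64-bit=${is64BitArch(process.arch)}`);
console.log(describeModel(modelPath));
```

Run it on the VM with the same Node.js installation that installed the bf-cli packages. If it reports `arch=ia32`, that would be consistent with the x86 scenario described above, in which the 800MB+ models failed to load.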
@hcyang Thanks for helping to look into this. I can verify by running |
Hi @nephinj, I think this issue is well understood and can be closed. Let us know if you have more questions. |
Hi @nephinj I'm closing this due to inactivity. Feel free to reopen if needed |
Versions
@microsoft/botframework-cli/4.14.1
win32-x64
node-v14.13.0
Describe the bug
When building a model with pretrained.20210205.microsoft.dte.00.06.unicoder_multilingual.onnx, Azure Sites does not seem to find the model or thinks it is corrupt. The code works in the emulator and with ngrok when running locally. If I switch to the default English model, everything works fine both locally and on Azure Sites.
To Reproduce
bf orchestrator:add -t qna --id "<your id here>" -k "<key here>" --routingName <routing name>
Expected behavior
Expect code to find the model and return the right result if using the English or multi-lingual model.
Screenshots
Error in the log file:
Hosting environment: Production
Content root path: D:\home\site\wwwroot
Now listening on: http://127.0.0.1:45432
Application started. Press Ctrl+C to shut down.
EXCEPTION THROWN - utility_onnx::OnnxUtility::InitOnnxSession(): e.what()=Load model from D:\home\site\wwwroot\model\model.onnx failed:bad allocation, FILE=D:\a\1\s\oc\utility\OnnxUtility.h, LINE=117
EXCEPTION THROWN - OC - EmbedderBase::EmbedderBase(json const& config, const string onnxVocabFileDefault, const string onnxModelFileDefault): e.what()=Load model from D:\home\site\wwwroot\model\model.onnx failed:bad allocation, FILE=D:\a\1\s\oc\EmbedderBase.cc, LINE=57
fail: Microsoft.Bot.Builder.Integration.AspNet.Core.BotFrameworkHttpAdapter[0]
[OnTurnError] unhandled error : Failed to find or load Model with path D:\home\site\wwwroot\model
System.InvalidOperationException: Failed to find or load Model with path D:\home\site\wwwroot\model
---> System.ApplicationException: Load model from D:\home\site\wwwroot\model\model.onnx failed:bad allocation
at Microsoft.BotFramework.Orchestrator.Orchestrator..ctor(String baseModelConfigOrPath)
at Microsoft.Bot.Builder.AI.Orchestrator.OrchestratorRecognizer.b__39_0(String path)
--- End of inner exception stack trace ---
at Microsoft.Bot.Builder.AI.Orchestrator.OrchestratorRecognizer.b__39_0(String path)
at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
at Microsoft.Bot.Builder.AI.Orchestrator.OrchestratorRecognizer.InitializeModel()
at Microsoft.Bot.Builder.AI.Orchestrator.OrchestratorRecognizer.RecognizeAsync(DialogContext dc, Activity activity, CancellationToken cancellationToken, Dictionary`2 telemetryProperties, Dictionary`2 telemetryMetrics)
at SSC.Chatbot.QnABot`1.OnMessageActivityAsync(ITurnContext`1 turnContext, CancellationToken cancellationToken) in D:\a\1\s\Bots\QnABot.cs:line 121
at Microsoft.Bot.Builder.ActivityHandler.OnTurnAsync(ITurnContext turnContext, CancellationToken cancellationToken)
at SSC.Chatbot.QnABot`1.OnTurnAsync(ITurnContext turnContext, CancellationToken cancellationToken) in D:\a\1\s\Bots\QnABot.cs:line 97
at Microsoft.Bot.Builder.TelemetryLoggerMiddleware.OnTurnAsync(ITurnContext context, NextDelegate nextTurn, CancellationToken cancellationToken)
at Microsoft.Bot.Builder.Integration.ApplicationInsights.Core.TelemetryInitializerMiddleware.OnTurnAsync(ITurnContext context, NextDelegate nextTurn, CancellationToken cancellationToken)
at Microsoft.Bot.Builder.BotFrameworkAdapter.TenantIdWorkaroundForTeamsMiddleware.OnTurnAsync(ITurnContext turnContext, NextDelegate next, CancellationToken cancellationToken)
at Microsoft.Bot.Builder.MiddlewareSet.ReceiveActivityWithStatusAsync(ITurnContext turnContext, BotCallbackHandler callback, CancellationToken cancellationToken)
at Microsoft.Bot.Builder.BotAdapter.RunPipelineAsync(ITurnContext turnContext, BotCallbackHandler callback, CancellationToken cancellationToken)
Bot displayed error:
File seems to exist at the location it is looking:
Additional details
Tested with the following models:
(fails) --versionId=pretrained.20210205.microsoft.dte.00.06.unicoder_multilingual.onnx
(fails) --versionId=pretrained.20201210.microsoft.dte.00.12.unicoder_multilingual.onnx
(works) --versionId=pretrained.20210608.microsoft.dte.01.06.int.unicoder_multilingual.onnx
(works) no versionId specified.
[bug][Orchestrator]