Second time prediction using PredictionEngine in ML.Net throws SEHException in Azure Function deployed to Cloud. Works fine locally. #5361

@praveenraghuvanshi

System information

  • OS version/distro: Windows 10 64 bit
  • .NET Version (eg., dotnet --info):
    C:\Users\praghuvanshi>dotnet --info
    .NET Core SDK (reflecting any global.json):
      Version: 3.1.401
      Commit: 5b6f5e5005

    Runtime Environment:
      OS Name: Windows
      OS Version: 10.0.19041
      OS Platform: Windows
      RID: win10-x64
      Base Path: C:\Program Files\dotnet\sdk\3.1.401\

    Host (useful for support):
      Version: 3.1.7
      Commit: fcfdef8d6b

    .NET Core SDKs installed:
      3.1.401 [C:\Program Files\dotnet\sdk]

    .NET Core runtimes installed:
      Microsoft.AspNetCore.All 2.1.21 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
      Microsoft.AspNetCore.App 2.1.21 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
      Microsoft.AspNetCore.App 3.1.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
      Microsoft.NETCore.App 2.1.21 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
      Microsoft.NETCore.App 3.1.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
      Microsoft.WindowsDesktop.App 3.1.7 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

    To install additional .NET Core runtimes or SDKs:
      https://aka.ms/dotnet-download

Issue

  • What did you do?

I am working on a sample image classification problem: classifying dogs and cats. I used AutoML for Image Classification with 10 images each of cats and dogs and generated the model (MLModel.zip, ~93 MB) and C# code. ... I have successfully loaded the model in a function app locally, where it returns correct predictions. Source code attached.

Steps:

  • Create Image Classification project using AutoML-Model Builder
  • Generate Model(MLModel.zip) and C# Code
  • Use MLModel.zip in Azure function
  • Run it locally - Works fine
  • Publish to Azure Function (Cloud) - Function App (Windows)
  • A DllNotFound exception for 'tensorflow' is thrown
  • Change 'Deployment Mode' to 'Self-Contained' and Target Runtime to 'win-x64'. Also change Platform Settings from 32-bit to 64-bit in the Azure Function settings in the portal.
  • Hit the API from a REST client (Postman): classification succeeds
  • Hit the API a second time or consecutively: SEHException is thrown at the PredictionEngine.Predict() method.
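
For reference, the publish settings changed in the steps above could also be written as MSBuild properties. This is only a sketch: it assumes the 'Deployment Mode' and 'Target Runtime' choices correspond to the standard `SelfContained` and `RuntimeIdentifier` properties, and `PlatformTarget` is an extra assumption that was not part of the original steps. The 32-bit to 64-bit Platform Setting in the portal is not covered by the project file and still has to be changed there.

```xml
<!-- Sketch only: hypothetical additions to the function app's .csproj,
     assuming the publish-dialog choices map to these MSBuild properties. -->
<PropertyGroup>
  <!-- Target Runtime 'win-x64': publish the 64-bit Windows native binaries -->
  <RuntimeIdentifier>win-x64</RuntimeIdentifier>
  <!-- Assumption: build for x64 rather than AnyCPU -->
  <PlatformTarget>x64</PlatformTarget>
  <!-- Deployment Mode 'Self-Contained' -->
  <SelfContained>true</SelfContained>
</PropertyGroup>
```

These properties only address the tensorflow DllNotFound error from the earlier step; they are not claimed to affect the SEHException on the second prediction.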

What happened?

  • Deploying the same function to Azure Functions (Cloud) gave a DllNotFound exception for the tensorflow DLL while loading the model.
  • Referred to https://developers.de/2019/10/25/hosting-ml-net-in-appservice/ and changed Target Runtime to x64. The tensorflow DLL-not-found error disappeared.
  • Hit the Function API through Postman for the first time after the above change; it succeeded with a proper prediction.
  • However, when the API is hit a second time or consecutively, SEHException is thrown during prediction using PredictionEngine.

What did you expect?

Source code / logs

Source Code: Attached zip file: src.zip
Azure Function Project: TestImageClassificationFunctionApp
Steps, call stack, snippets, logs: ./testimageclassification/Readme.md
Diagnostic logs: ./testimageclassification/diagnosis/

Exception: System.Runtime.InteropServices.SEHException
FailedMethod: Tensorflow.c_api.TF_SessionRun

System.Runtime.InteropServices.SEHException:
   at Tensorflow.c_api.TF_SessionRun (TensorFlow.NET, Version=0.11.8.1, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.TensorFlow.TensorFlowUtils+Runner.Run (Microsoft.ML.TensorFlow, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.Vision.ImageClassificationModelParameters+Classifier.Score (Microsoft.ML.Vision, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.Vision.ImageClassificationModelParameters+<>c__DisplayClass22_0`2.<Microsoft.ML.Data.IValueMapper.GetMapper>b__0 (Microsoft.ML.Vision, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.Data.PredictedLabelScorerBase.EnsureCachedPosition (Microsoft.ML.Data, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.Data.MulticlassClassificationScorer+<>c__DisplayClass16_0.<GetPredictedLabelGetter>b__0 (Microsoft.ML.Data, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.Transforms.KeyToValueMappingTransformer+Mapper+KeyToValueMap`2+<>c__DisplayClass8_0.<GetMappingGetter>b__0 (Microsoft.ML.Data, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.Data.TypedCursorable`1+TypedRowBase+<>c__DisplayClass9_0`2.<CreateConvertingActionSetter>b__0 (Microsoft.ML.Data, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at Microsoft.ML.Data.TypedCursorable`1+TypedRowBase.FillValues (Microsoft.ML.Data, Version=1.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51)
   at TestImageClassificationFunctionApp.ClassifyImage+<Run>d__0.MoveNext (TestImageClassificationFunctionApp, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null)

Azure Function *.csproj

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <AzureFunctionsVersion>v3</AzureFunctionsVersion>
    <UserSecretsId>xxxxx-xxxx-xxxxx-xxxxxx</UserSecretsId>
    <Platforms>AnyCPU;x64</Platforms>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Azure.Storage.Blobs" Version="12.5.1" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.Storage" Version="3.0.10" />
    <PackageReference Include="Microsoft.ML" Version="1.5.1" />
    <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.5.1" />
    <PackageReference Include="Microsoft.ML.Vision" Version="1.5.1" />
    <PackageReference Include="Microsoft.NET.Sdk.Functions" Version="3.0.7" />
    <PackageReference Include="SciSharp.TensorFlow.Redist" Version="2.1.0" />
  </ItemGroup>
  <ItemGroup>
    <None Update="host.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </None>
    <None Update="local.settings.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <CopyToPublishDirectory>Never</CopyToPublishDirectory>
    </None>
  </ItemGroup>
</Project>

Let me know in case any more information is required.

src.zip

Labels: AutoML.NET, P1, bug, image
