Text-Classification: Invocation exception when cross-validation #2718

Closed
Symbai opened this issue Jun 16, 2023 · 10 comments · Fixed by dotnet/machinelearning#6768
Labels: Bug (Something isn't working), Priority:0 (Work that we can't release without)

Comments

@Symbai

Symbai commented Jun 16, 2023

System Information (please complete the following information):

  • Model Builder Version (available in Manage Extensions dialog): (screenshot in original issue)

  • Visual Studio Version 17.7.0 Preview 2.0

Describe the bug
Every time I reopen the project (this also happens when creating a new project), select a DIFFERENT CSV file (small or big, it doesn't matter), and try to pick a label column, VS crashes completely, even though "Preview data" shows my new CSV content. It only occurs when a CSV was already selected and a model trained, and I then select a different CSV file and try to pick the first column. Selecting a CSV file for the first time always works.

My mbconfig file: #2716 (comment)

To Reproduce

  1. Create a new text classification model based on a CSV file, train this model
  2. Go to Data tab, pick a new CSV file
  3. Wait until VS shows the content
  4. Open the combobox of the first column (label).
  5. VS crashes. I can reproduce this over and over again

Expected behavior
No crash?

Additional context

Application: devenv.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.Exception
   at Microsoft.ML.ModelBuilder.Configuration.Extension.TrainingConfigurationExtension.SetColumnPurpose(Microsoft.ML.ModelBuilder.Configuration.ITrainingConfiguration, System.String, Microsoft.ML.ModelBuilder.Configuration.ColumnPurposeType)
   at Microsoft.ML.ModelBuilder.Configuration.Extension.TrainingConfigurationExtension.SetLabelName(Microsoft.ML.ModelBuilder.Configuration.ITrainingConfiguration, System.String)
   at Microsoft.ML.ModelBuilder.ToolWindows.LabelColumnComponentControl.PredictColumnSelectionCombo_SelectionChanged(System.Object, System.Windows.Controls.SelectionChangedEventArgs)
   at System.Windows.RoutedEventArgs.InvokeHandler(System.Delegate, System.Object)
   at System.Windows.RoutedEventHandlerInfo.InvokeHandler(System.Object, System.Windows.RoutedEventArgs)
   at System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)
   at System.Windows.UIElement.RaiseEventImpl(System.Windows.DependencyObject, System.Windows.RoutedEventArgs)
   at System.Windows.Controls.ComboBox.OnSelectionChanged(System.Windows.Controls.SelectionChangedEventArgs)
   at System.Windows.Controls.Primitives.Selector+SelectionChanger.End()
   at System.Windows.Controls.Primitives.Selector+SelectionChanger.SelectJustThisItem(ItemInfo, Boolean)
   at System.Windows.Controls.ComboBox.NotifyComboBoxItemMouseUp(System.Windows.Controls.ComboBoxItem)
   at System.Windows.Controls.ComboBoxItem.OnMouseLeftButtonUp(System.Windows.Input.MouseButtonEventArgs)
   at System.Windows.RoutedEventArgs.InvokeHandler(System.Delegate, System.Object)
   at System.Windows.RoutedEventHandlerInfo.InvokeHandler(System.Object, System.Windows.RoutedEventArgs)
   at System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)
   at System.Windows.UIElement.ReRaiseEventAs(System.Windows.DependencyObject, System.Windows.RoutedEventArgs, System.Windows.RoutedEvent)
   at System.Windows.UIElement.OnMouseUpThunk(System.Object, System.Windows.Input.MouseButtonEventArgs)
   at System.Windows.RoutedEventArgs.InvokeHandler(System.Delegate, System.Object)
   at System.Windows.RoutedEventHandlerInfo.InvokeHandler(System.Object, System.Windows.RoutedEventArgs)
   at System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)
   at System.Windows.UIElement.RaiseEventImpl(System.Windows.DependencyObject, System.Windows.RoutedEventArgs)
   at System.Windows.UIElement.RaiseTrustedEvent(System.Windows.RoutedEventArgs)
   at System.Windows.Input.InputManager.ProcessStagingArea()
   at System.Windows.Input.InputManager.ProcessInput(System.Windows.Input.InputEventArgs)
   at System.Windows.Input.InputProviderSite.ReportInput(System.Windows.Input.InputReport)
   at System.Windows.Interop.HwndMouseInputProvider.ReportInput(IntPtr, System.Windows.Input.InputMode, Int32, System.Windows.Input.RawMouseActions, Int32, Int32, Int32)
   at System.Windows.Interop.HwndMouseInputProvider.FilterMessage(IntPtr, MS.Internal.Interop.WindowMessage, IntPtr, IntPtr, Boolean ByRef)
   at System.Windows.Interop.HwndSource.InputFilterMessage(IntPtr, Int32, IntPtr, IntPtr, Boolean ByRef)
   at MS.Win32.HwndWrapper.WndProc(IntPtr, Int32, IntPtr, IntPtr, Boolean ByRef)
   at MS.Win32.HwndSubclass.DispatcherCallbackOperation(System.Object)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(System.Delegate, System.Object, Int32)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(System.Object, System.Delegate, System.Object, Int32, System.Delegate)
   at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(System.Windows.Threading.DispatcherPriority, System.TimeSpan, System.Delegate, System.Object, Int32)
   at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr, Int32, IntPtr, IntPtr)
@LittleLittleCloud LittleLittleCloud added Bug Something isn't working Priority:0 Work that we can't release without labels Jun 16, 2023
@LittleLittleCloud LittleLittleCloud self-assigned this Jun 16, 2023
@LittleLittleCloud LittleLittleCloud added this to the June 2023 milestone Jun 16, 2023
@LittleLittleCloud
Contributor

Should already be fixed in latest main

@scott-weeden

I have the latest version and am running Visual Studio 17.6 (preview beta). The Data Classification one crashes Visual Studio for me.
Is the VSIX code available as open source? I really like the design and layout you used. I am developing another Visual Studio extension for e-commerce templates and am curious whether you used XAML or Windows Forms.

Unfortunately it does seem a little unstable.

@LittleLittleCloud
Contributor

LittleLittleCloud commented Jun 21, 2023 via email

@daikoz

daikoz commented Jul 10, 2023

same issue with same scenario with:
ML.NET version 17.17.0.2332602
Microsoft Visual Studio 2022 Version 17.6.4

@LittleLittleCloud
Contributor

Reopening this, since it is not fixed.

@LittleLittleCloud
Contributor

Hey @Symbai

I just can't reproduce the error. I tried text classification on CPU and GPU, using wiki-detox as the first dataset and tweet.txt as the second dataset, and both training runs worked fine.

Could you provide more details on how to reproduce this issue? Thanks!

@daikoz

daikoz commented Jul 11, 2023

Hi,

This is after removing everything and reinstalling VS 2022.

data
test.csv

Scenario

bugvsml.mp4

Exception:

   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
   at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Microsoft.ML.Runtime.ComponentCatalog.LoadableClassInfo.CreateInstanceCore(Object[] ctorArgs)
   at Microsoft.ML.Runtime.ComponentCatalog.TryCreateInstance[TRes](IHostEnvironment env, Type signatureType, TRes& result, String name, String options, Object[] extra)
   at Microsoft.ML.Runtime.ComponentCatalog.TryCreateInstance[TRes,TSig](IHostEnvironment env, TRes& result, String name, String options, Object[] extra)
   at Microsoft.ML.ModelLoadContext.TryLoadModelCore[TRes,TSig](IHostEnvironment env, TRes& result, Object[] extra)
   at Microsoft.ML.ModelLoadContext.TryLoadModel[TRes,TSig](IHostEnvironment env, TRes& result, RepositoryReader rep, Entry ent, String dir, Object[] extra)
   at Microsoft.ML.ModelLoadContext.LoadModel[TRes,TSig](IHostEnvironment env, TRes& result, RepositoryReader rep, Entry ent, String dir, Object[] extra)
   at Microsoft.ML.ModelLoadContext.LoadModelOrNull[TRes,TSig](IHostEnvironment env, TRes& result, RepositoryReader rep, String dir, Object[] extra)
   at Microsoft.ML.ModelLoadContext.LoadModel[TRes,TSig](IHostEnvironment env, TRes& result, RepositoryReader rep, String dir, Object[] extra)
   at Microsoft.ML.ModelOperationsCatalog.Load(Stream stream, DataViewSchema& inputSchema)
   at Microsoft.ML.ModelOperationsCatalog.Load(String filePath, DataViewSchema& inputSchema)
   at Microsoft.ML.ModelBuilder.AutoMLService.ServiceFactory.CodeGeneratorService.SetTorchRunTimeFolderAndLoadModel(ITrainingConfiguration configuration, String modelPath, MLContext& context, ITransformer& model, DataViewSchema& inputSchema) in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/ServiceFactory/CodeGeneratorService.cs:line 139
   at Microsoft.ML.ModelBuilder.AutoMLService.ServiceFactory.CodeGeneratorService.GenerateConsumptionAsync(ITrainingConfiguration configuration, String trainingConfigurationFolder, String nameSpace, String className, TargetType target, String[] labels, CancellationToken ct) in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/ServiceFactory/CodeGeneratorService.cs:line 155
   at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__151`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.ML.ModelBuilder.ViewModels.TrainViewModel.<GenerateCodeBehindFilesAsync>d__100.MoveNext()

Log output:

Set log file path to C:\Users\CC\AppData\Local\Temp\MLVSTools\logs\MLModel1-AXB532.txt
start text classification 
restore "c:\program files\microsoft visual studio\2022\community\common7\ide\extensions\aeowiwmr.5oo\AutoMLService\RuntimeManager\torchsharp.cpu.csproj" --configfile "c:\program files\microsoft visual studio\2022\community\common7\ide\extensions\aeowiwmr.5oo\AutoMLService\RuntimeManager\NuGet.config" -r win-x64 /p:UsingToolXliff=false /p:TorchSharpVersion=0.99.5 /p:TorchSharpCudaRuntimeVersion=1.13.0.1 /p:TensorflowRuntimeVersion=2.3.1 /p:BaseIntermediateOutputPath="C:\Users\CC\AppData\Local\Temp\ModelBuilder\torchsharp-cpu-0.99.5\obj"
publish "c:\program files\microsoft visual studio\2022\community\common7\ide\extensions\aeowiwmr.5oo\AutoMLService\RuntimeManager\torchsharp.cpu.csproj" -r win-x64 -c Release --no-self-contained -o "C:\Users\CC\AppData\Local\Temp\ModelBuilder\torchsharp-cpu-0.99.5" --no-restore /p:UsingToolXliff=false /p:TorchSharpVersion=0.99.5 /p:TorchSharpCudaRuntimeVersion=1.13.0.1 /p:TensorflowRuntimeVersion=2.3.1 /p:BaseOutputPath="C:\Users\CC\AppData\Local\Temp\ModelBuilder\torchsharp-cpu-0.99.5\bin\\" /p:BaseIntermediateOutputPath="C:\Users\CC\AppData\Local\Temp\ModelBuilder\torchsharp-cpu-0.99.5\obj\\"
start installing runtime in C:\Users\CC\AppData\Local\Temp\ModelBuilder\torchsharp-cpu-0.99.5
Failed to read environment variable [DOTNET_STARTUP_HOOKS], HRESULT: 0x800700CB
  Determining projects to restore...
  All projects are up-to-date for restore.
Failed to read environment variable [DOTNET_STARTUP_HOOKS], HRESULT: 0x800700CB
MSBuild version 17.6.8+c70978d4d for .NET
  torchsharp.cpu -> C:\Users\CC\AppData\Local\Temp\ModelBuilder\torchsharp-cpu-0.99.5\bin\Release\netstandard2.0\win-x64\torchsharp.cpu.dll
  torchsharp.cpu -> C:\Users\CC\AppData\Local\Temp\ModelBuilder\torchsharp-cpu-0.99.5\
install runtime successfully
Use cross validation with fold: 5
|      Trainer                             MacroAccuracy Duration    |
|--------------------------------------------------------------------|
|0     TextClassificationMulti             0.0867     35.2520        |
|--------------------------------------------------------------------|
|                          Experiment Results                        |
|--------------------------------------------------------------------|
|                               Summary                              |
|--------------------------------------------------------------------|
|ML Task: text classification                                        |
|Dataset: C:\Users\CC\Documents\aaa\test.csv                 |
|Total experiment time :    35.2520 Secs                             |
|Label : Category                                                    |
|Total number of models explored: 1                                  |
|--------------------------------------------------------------------|
|                        Top 1 models explored                       |
|--------------------------------------------------------------------|
|      Trainer                             MacroAccuracy Duration    |
|--------------------------------------------------------------------|
|0     TextClassificationMulti             0.0867     35.2520        |
|--------------------------------------------------------------------|
Generate code behind files

@LittleLittleCloud
Contributor

OK, now I'm able to reproduce this error. I will get back to you once we have found a fix.

@LittleLittleCloud
Contributor

Work-around: use train-validation split ratio instead of cross-validation

@LittleLittleCloud LittleLittleCloud changed the title from "Visual Studio crashes after choosing a different CSV file and trying to pick a column" to "Text-Classification: Invocation exception when cross-validation" Jul 19, 2023
@LittleLittleCloud
Contributor

The error occurs because the shape of the deep learning model is fixed the first time the pipeline calls Fit. Since that shape is determined by the dataset (for example, the number of labels), a shape mismatch exception can occur when this meta-information differs between folds of the cross-validation split.

The current workaround is to use a train-validation split when you encounter this issue, while the ML.NET team works on a fix.
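
To illustrate the explanation above, here is a minimal Python sketch (not ML.NET code) of how the number of distinct labels seen during training can differ from fold to fold. The helper names `kfold_indices` and `label_counts_per_fold`, the toy label list, and the contiguous unshuffled split are all assumptions made for illustration; the actual splitting logic in ML.NET may differ.

```python
def kfold_indices(n_rows, k):
    """Yield (train_idx, test_idx) for contiguous, unshuffled k-fold splits.

    Hypothetical helper for illustration; not how ML.NET splits data.
    """
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, test
        start += size


def label_counts_per_fold(labels, k):
    """Return the number of distinct labels in each fold's training portion."""
    return [
        len({labels[i] for i in train})
        for train, _ in kfold_indices(len(labels), k)
    ]


# Toy dataset (assumed): the label "rare" appears only in the last two rows.
labels = ["a", "b", "a", "b", "a", "b", "a", "b", "rare", "rare"]
counts = label_counts_per_fold(labels, 5)
print(counts)  # [3, 3, 3, 3, 2]
```

In this sketch, a model whose output layer was sized for 3 labels on the first fold would then be fitted on a fold that contains only 2, which is the kind of mismatch described above. A single train-validation split calls Fit once, so it does not hit this particular mismatch.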

4 participants