Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent LatentDirichletAllocationTransformer from disposal prior to end of test #4620

Merged
merged 1 commit into from Jan 3, 2020

Conversation

@sharwell
Copy link
Member

sharwell commented Jan 3, 2020

Fixes test exceptions which look like this:

  X Microsoft.ML.RunTests.TestDataPipeNoBaseline.TestLDATransform(iteration: 92) [198ms]
  Error Message:
   System.InvalidOperationException : Splitter/consolidator worker encountered exception while consuming source data
---- System.ObjectDisposedException : Safe handle has been closed
  Stack Trace:
     at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes) in D:\a\1\s\src\Microsoft.ML.Data\Data\DataViewUtils.cs:line 832
   at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore() in D:\a\1\s\src\Microsoft.ML.Data\Data\DataViewUtils.cs:line 1102
   at Microsoft.ML.Data.RootCursorBase.MoveNext() in D:\a\1\s\src\Microsoft.ML.Core\Data\RootCursorBase.cs:line 65
   at Microsoft.ML.RunTests.TestDataPipeNoBaseline.TestLDATransform(Int32 iteration) in D:\a\1\s\test\Microsoft.ML.TestFramework\DataPipe\TestDataPipe.cs:line 1462
----- Inner Stack Trace -----
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
   at Microsoft.ML.TextAnalytics.LdaInterface.InitializeBeforeTest(SafeLdaEngineHandle engine)
   at Microsoft.ML.Transforms.Text.LatentDirichletAllocationTransformer.LdaState.Output(VBuffer`1& src, VBuffer`1& dst, Int32 numBurninIter, Boolean reset) in D:\a\1\s\src\Microsoft.ML.Transforms\Text\LdaTransform.cs:line 470
   at Microsoft.ML.Transforms.Text.LatentDirichletAllocationTransformer.Mapper.<>c__DisplayClass5_0.<GetTopic>b__0(VBuffer`1& dst) in D:\a\1\s\src\Microsoft.ML.Transforms\Text\LdaTransform.cs:line 612
   at Microsoft.ML.Data.DataViewUtils.Splitter.InPipe.Impl`1.Fill() in D:\a\1\s\src\Microsoft.ML.Data\Data\DataViewUtils.cs:line 723
   at Microsoft.ML.Data.DataViewUtils.Splitter.<>c__DisplayClass5_1.<ConsolidateCore>b__2() in D:\a\1\s\src\Microsoft.ML.Data\Data\DataViewUtils.cs:line 415

The issue occurred because the finalizer for LatentDirichletAllocationTransformer was causing states to get disposed while they were still in use.

@sharwell sharwell requested a review from dotnet/mlnet-core as a code owner Jan 3, 2020
@codecov

This comment has been minimized.

Copy link

codecov bot commented Jan 3, 2020

Codecov Report

Merging #4620 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4620      +/-   ##
==========================================
- Coverage   75.66%   75.65%   -0.02%     
==========================================
  Files         938      938              
  Lines      168623   168626       +3     
  Branches    18206    18206              
==========================================
- Hits       127582   127567      -15     
- Misses      36005    36025      +20     
+ Partials     5036     5034       -2
Flag Coverage Δ
#Debug 75.65% <100%> (-0.02%) ⬇️
#production 71.27% <ø> (-0.02%) ⬇️
#test 90.43% <100%> (ø) ⬆️
Impacted Files Coverage Δ
...icrosoft.ML.TestFramework/DataPipe/TestDataPipe.cs 91.99% <100%> (+0.02%) ⬆️
src/Microsoft.ML.Core/Data/ProgressReporter.cs 70.95% <0%> (-6.99%) ⬇️
....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs 83.19% <0%> (-3.37%) ⬇️
src/Microsoft.ML.Maml/MAML.cs 24.75% <0%> (-1.46%) ⬇️
src/Microsoft.ML.AutoML/Sweepers/Parameters.cs 84.32% <0%> (-0.85%) ⬇️
src/Microsoft.ML.Transforms/Text/LdaTransform.cs 85.38% <0%> (+0.93%) ⬆️
...rosoft.ML.AutoML/ColumnInference/TextFileSample.cs 62.25% <0%> (+2.64%) ⬆️
Copy link
Member

frank-dong-ms left a comment

:shipit:

@eerhardt

This comment has been minimized.

Copy link
Member

eerhardt commented Jan 3, 2020

Nice investigation, @sharwell!

What do you think about fixing the LatentDirichletAllocationTransformer class itself as well? Basically, it shouldn't be touching other managed objects during Dispose(disposing: false). Instead, the LdaSingleBox class should have a finalizer on it, and clean up the native resources using the dispose pattern appropriately.

@sharwell

This comment has been minimized.

Copy link
Member Author

sharwell commented Jan 3, 2020

I can make the update. Note that the unmanaged resource is held in a safe handle, so no finalizers are necessary (or desirable).

@eerhardt

This comment has been minimized.

Copy link
Member

eerhardt commented Jan 3, 2020

Good point. I was looking at an old version of the code, before it used a safe handle.

@sharwell

This comment has been minimized.

Copy link
Member Author

sharwell commented Jan 3, 2020

Submitted #4621 to remove the unnecessary finalizer.

Copy link
Member

eerhardt left a comment

Looks good. Thanks @sharwell.

@@ -1475,6 +1477,10 @@ public void TestLDATransform()
Assert.True(resultThirdRow.GetItemOrDefault(1) == 0);
Assert.True(resultThirdRow.GetItemOrDefault(2) == 1.0);
}
finally
{
ldaTransformer.Dispose();

This comment has been minimized.

Copy link
@eerhardt

eerhardt Jan 3, 2020

Member

Why not use a using?

This comment has been minimized.

Copy link
@sharwell

sharwell Jan 3, 2020

Author Member

The type does not implement IDisposable.

This comment has been minimized.

Copy link
@eerhardt

eerhardt Jan 3, 2020

Member

Should it?

This comment has been minimized.

Copy link
@sharwell

sharwell Jan 3, 2020

Author Member

It could, but that would be part of #4621 if the change is made.

@sharwell sharwell merged commit 5c949a9 into dotnet:master Jan 3, 2020
19 checks passed
19 checks passed
MachineLearning-CI Build #20200102.7 succeeded
Details
MachineLearning-CI (Centos_x64_NetCoreApp30 Debug_Build) Centos_x64_NetCoreApp30 Debug_Build succeeded
Details
MachineLearning-CI (Centos_x64_NetCoreApp30 Release_Build) Centos_x64_NetCoreApp30 Release_Build succeeded
Details
MachineLearning-CI (MacOS_x64_NetCoreApp21 Debug_Build) MacOS_x64_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (MacOS_x64_NetCoreApp21 Release_Build) MacOS_x64_NetCoreApp21 Release_Build succeeded
Details
MachineLearning-CI (Ubuntu_x64_NetCoreApp21 Debug_Build) Ubuntu_x64_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (Ubuntu_x64_NetCoreApp21 Release_Build) Ubuntu_x64_NetCoreApp21 Release_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp21 Debug_Build) Windows_x64_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp21 Release_Build) Windows_x64_NetCoreApp21 Release_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp30 Debug_Build) Windows_x64_NetCoreApp30 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp30 Release_Build) Windows_x64_NetCoreApp30 Release_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetFx461 Debug_Build) Windows_x64_NetFx461 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetFx461 Release_Build) Windows_x64_NetFx461 Release_Build succeeded
Details
MachineLearning-CI (Windows_x86_NetCoreApp21 Debug_Build) Windows_x86_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x86_NetCoreApp21 Release_Build) Windows_x86_NetCoreApp21 Release_Build succeeded
Details
MachineLearning-CodeCoverage Build #20200102.7 succeeded
Details
MachineLearning-CodeCoverage (Windows_x64 Build_Debug) Windows_x64 Build_Debug succeeded
Details
WIP Ready for review
Details
license/cla All CLA requirements met.
@sharwell sharwell deleted the sharwell:early-finalization branch Jan 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.