Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize the LR test #4446

Merged
merged 3 commits into from Nov 11, 2019

Conversation

@bpstark
Copy link
Member

bpstark commented Nov 6, 2019

Found issue with how we were using random for our
ImageClassificationTrainer. This caused instability in our unit test, as
we were not able to control the random seed. Modified the code to now
use the same random object throughout, the trainer, thus allowing us to
control the seed and therefor have predictable output.

@bpstark bpstark requested review from dotnet/mlnet-core as code owners Nov 6, 2019
@codecov

This comment has been minimized.

Copy link

codecov bot commented Nov 6, 2019

Codecov Report

Merging #4446 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4446      +/-   ##
==========================================
- Coverage   74.72%   74.71%   -0.02%     
==========================================
  Files         906      906              
  Lines      159288   159285       -3     
  Branches    17145    17145              
==========================================
- Hits       119029   119003      -26     
- Misses      35452    35475      +23     
  Partials     4807     4807
Flag Coverage Δ
#Debug 74.71% <100%> (-0.02%) ⬇️
#production 70.07% <100%> (-0.03%) ⬇️
#test 90.12% <100%> (ø) ⬆️
Impacted Files Coverage Δ
.../Microsoft.ML.Vision/ImageClassificationTrainer.cs 92.18% <100%> (ø) ⬆️
test/Microsoft.ML.AutoML.Tests/AutoFitTests.cs 82.26% <100%> (ø) ⬆️
...cenariosWithDirectInstantiation/TensorflowTests.cs 91.35% <100%> (+0.2%) ⬆️
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 79.48% <0%> (-20.52%) ⬇️
src/Microsoft.ML.AutoML/Sweepers/ISweeper.cs 59.49% <0%> (-7.6%) ⬇️
src/Microsoft.ML.AutoML/Sweepers/Parameters.cs 77.96% <0%> (-5.09%) ⬇️
src/Microsoft.ML.Maml/MAML.cs 24.75% <0%> (-1.46%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs 84.9% <0%> (-0.21%) ⬇️
...L.AutoML/TrainerExtensions/TrainerExtensionUtil.cs 86.89% <0%> (+1.74%) ⬆️
#Closed
@codemzs

This comment has been minimized.

Copy link
Member

codemzs commented Nov 7, 2019

@bpstark The polynomial LR test still seems to be failing, see below snapshot from test results.

image
#Resolved

Brian Stark
Found issue with how we were using random for our
ImageClassificationTrainer. This caused instability in our unit test, as
we were not able to control the random seed. Modified the code to now
use the same random object throughout, the trainer, thus allowing us to
control the seed and therefor have predictable output.
@bpstark bpstark force-pushed the bpstark:stabilizeTest branch from 240758b to 217c1af Nov 7, 2019
@@ -937,7 +937,7 @@ private int GetNumSamples(string path)
metrics.Train.LearningRate = learningRate;
// Update train state.
trainstate.CurrentEpoch = epoch;
using (var cursor = trainingSet.GetRowCursor(trainingSet.Schema.ToArray(), new Random()))
using (var cursor = trainingSet.GetRowCursor(trainingSet.Schema.ToArray(), Host.Rand))

This comment has been minimized.

Copy link
@codemzs

codemzs Nov 8, 2019

Member

trainingSet.GetRowCursor(trainingSet.Schema.ToArray(), Host.Rand)) [](start = 36, length = 66)

You don't need to pass anything, look at the code for ShuffleRowTransformer:

        if (_forceShuffle || _forceShuffleSource)
            _forceShuffleSeed = options.ForceShuffleSeed ?? Host.Rand.NextSigned(); #Resolved

This comment has been minimized.

Copy link
@bpstark

bpstark Nov 8, 2019

Author Member

While you are correct that if we do not pass anything then it should still be stable, RowShuffling uses host to create a seed to then create a new random, this seems unnecessary, as we would be creating a new object when we can just use an existing random which is more efficient.


In reply to: 344297118 [](ancestors = 344297118)

This comment has been minimized.

Copy link
@codemzs

codemzs Nov 8, 2019

Member

I discussed offline with @bpstark and we agreed that we should not be passing in Host.Rand object (as I suggested earlier) because it is not thread safe. As I explained, the shuffling transformer code will create a seed from Host.Rand object (that it has access internally) to create an local/internal Rand object and this will ensure stability and also remove concerns around thread safety. #Resolved

Brian Stark
@codemzs

This comment has been minimized.

Copy link
Member

codemzs commented Nov 8, 2019

@bpstark Thanks! Can you please confirm samples run without any change in accuracy? Otherwise the change looks great. #Resolved

@bpstark

This comment has been minimized.

Copy link
Member Author

bpstark commented Nov 8, 2019

Actually accuracy did change, it improved, it just happens that this random number is slightly better than the previously used random number, and since the dataset is so small it made an actual impact.


In reply to: 551956033 [](ancestors = 551956033)

Brian Stark
Copy link
Member

codemzs left a comment

:shipit:

@bpstark bpstark merged commit bcdac55 into dotnet:master Nov 11, 2019
17 of 19 checks passed
17 of 19 checks passed
MachineLearning-CodeCoverage Build #20191108.3 failed
Details
MachineLearning-CodeCoverage (Windows_x64 Build_Debug) Windows_x64 Build_Debug was canceled
Details
MachineLearning-CI Build #20191108.3 had test failures
Details
MachineLearning-CI (Centos_x64_NetCoreApp30 Debug_Build) Centos_x64_NetCoreApp30 Debug_Build succeeded
Details
MachineLearning-CI (Centos_x64_NetCoreApp30 Release_Build) Centos_x64_NetCoreApp30 Release_Build succeeded
Details
MachineLearning-CI (MacOS_x64_NetCoreApp21 Debug_Build) MacOS_x64_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (MacOS_x64_NetCoreApp21 Release_Build) MacOS_x64_NetCoreApp21 Release_Build succeeded
Details
MachineLearning-CI (Ubuntu_x64_NetCoreApp21 Debug_Build) Ubuntu_x64_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (Ubuntu_x64_NetCoreApp21 Release_Build) Ubuntu_x64_NetCoreApp21 Release_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp21 Debug_Build) Windows_x64_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp21 Release_Build) Windows_x64_NetCoreApp21 Release_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp30 Debug_Build) Windows_x64_NetCoreApp30 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetCoreApp30 Release_Build) Windows_x64_NetCoreApp30 Release_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetFx461 Debug_Build) Windows_x64_NetFx461 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x64_NetFx461 Release_Build) Windows_x64_NetFx461 Release_Build succeeded
Details
MachineLearning-CI (Windows_x86_NetCoreApp21 Debug_Build) Windows_x86_NetCoreApp21 Debug_Build succeeded
Details
MachineLearning-CI (Windows_x86_NetCoreApp21 Release_Build) Windows_x86_NetCoreApp21 Release_Build succeeded
Details
WIP Ready for review
Details
license/cla All CLA requirements met.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.