Add source generator to generate search space class for all trainers/transformers #6090

Merged

@LittleLittleCloud LittleLittleCloud commented Feb 16, 2022

#5993

This PR adds a source generator that generates SearchSpace classes for all estimators used in AutoML.NET, together with the associated search space configurations, which are in JSON format.

A generated SearchSpace class looks like this (the following is the SearchSpace class for LGBM):

using Microsoft.ML.SearchSpace;
using OptionAttribute = Microsoft.ML.SearchSpace.OptionAttribute;
using ColorsOrder = Microsoft.ML.Transforms.Image.ImagePixelExtractingEstimator.ColorsOrder;
using ColorBits = Microsoft.ML.Transforms.Image.ImagePixelExtractingEstimator.ColorBits;
using ResizingKind = Microsoft.ML.Transforms.Image.ImageResizingEstimator.ResizingKind;
using Anchor = Microsoft.ML.Transforms.Image.ImageResizingEstimator.Anchor;

#nullable enable

namespace Microsoft.ML.AutoML.CodeGen
{
    public class LgbmOption
    {
        [Range((int)4, (int)32768, init: (int)4, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public int NumberOfLeaves {get; set;} = 4;
        [Range((int)20, (int)1024, init: (int)20, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public int MinimumExampleCountPerLeaf {get; set;} = 20;
        [Range((double)2E-10, (double)1, init: (double)1, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public double LearningRate {get; set;} = 1;
        [Range((int)4, (int)32768, init: (int)4, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public int NumberOfTrees {get; set;} = 4;
        [Range((double)2E-10, (double)1, init: (double)1, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public double SubsampleFraction {get; set;} = 1;
        [Range((int)8, (int)1024, init: (int)256, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public int MaximumBinCountPerFeature {get; set;} = 256;
        [Range((double)2E-10, (double)1, init: (double)1, logBase: false)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public double FeatureFraction {get; set;} = 1;
        [Range((double)2E-10, (double)1, init: (double)2E-10, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public double L1Regularization {get; set;} = 2E-10;
        [Range((double)2E-10, (double)1, init: (double)1, logBase: true)]
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public double L2Regularization {get; set;} = 1;
        
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public string LabelColumnName {get; set;} = "Label";
        
        [System.Diagnostics.CodeAnalysis.SuppressMessage("Declaration", "MSML_NoInstanceInitializers:No initializers on instance fields or properties")]
        public string FeatureColumnName {get; set;} = "Feature";
        
        public string? ExampleWeightColumnName {get; set;}

    }
}
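
To illustrate how a class annotated this way could be consumed, here is a minimal sketch that reads range metadata through reflection and maps a random number onto each range. Everything in it is an assumption made for illustration: `SketchRangeAttribute`, `SketchLgbmOption`, and `SearchSpaceSketch` are hypothetical stand-ins and not the `Microsoft.ML.SearchSpace` types, and the reading of `logBase: true` as "sample uniformly in log space" is a guess at the intended semantics, not something this PR states.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the Range attribute shown above.
// This is NOT the Microsoft.ML.SearchSpace implementation.
[AttributeUsage(AttributeTargets.Property)]
public sealed class SketchRangeAttribute : Attribute
{
    public SketchRangeAttribute(double min, double max, double init, bool logBase)
    {
        Min = min; Max = max; Init = init; LogBase = logBase;
    }

    public double Min { get; }
    public double Max { get; }
    public double Init { get; }
    public bool LogBase { get; }
}

// Hypothetical, cut-down option class in the style of the generated LgbmOption.
public class SketchLgbmOption
{
    [SketchRange(4, 32768, init: 4, logBase: true)]
    public int NumberOfLeaves { get; set; } = 4;

    [SketchRange(2E-10, 1, init: 1, logBase: false)]
    public double FeatureFraction { get; set; } = 1;

    // No range attribute: left at its default, not part of the search.
    public string LabelColumnName { get; set; } = "Label";
}

public static class SearchSpaceSketch
{
    // Maps t in [0, 1] onto [min, max]. When logBase is true the mapping is
    // uniform in log space (assumed meaning of the flag); otherwise it is linear.
    public static double Map(double t, double min, double max, bool logBase)
    {
        if (t < 0 || t > 1) throw new ArgumentOutOfRangeException(nameof(t));
        return logBase
            ? Math.Exp(Math.Log(min) + t * (Math.Log(max) - Math.Log(min)))
            : min + t * (max - min);
    }

    // Creates an option object and overwrites every property that carries
    // a SketchRangeAttribute with a randomly sampled value from its range.
    public static T Sample<T>(Random rng) where T : new()
    {
        var option = new T();
        foreach (PropertyInfo prop in typeof(T).GetProperties())
        {
            var range = prop.GetCustomAttribute<SketchRangeAttribute>();
            if (range is null) continue;

            double value = Map(rng.NextDouble(), range.Min, range.Max, range.LogBase);
            object boxed = prop.PropertyType == typeof(int)
                ? (object)(int)Math.Round(value)
                : value;
            prop.SetValue(option, boxed);
        }
        return option;
    }

    public static void Main()
    {
        var option = Sample<SketchLgbmOption>(new Random(0));
        Console.WriteLine($"NumberOfLeaves={option.NumberOfLeaves}, FeatureFraction={option.FeatureFraction}");
    }
}
```

Under this assumed reading, a log-scaled range such as 4 to 32768 spends as many samples between 4 and 362 as between 362 and 32768, which is why it suits parameters that span several orders of magnitude.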

@LittleLittleCloud LittleLittleCloud added the AutoML.NET Automating various steps of the machine learning process label Feb 16, 2022
@LittleLittleCloud LittleLittleCloud changed the title Add source generator to generate search space class for all trainers Add source generator to generate search space class for all trainers/transformers Feb 16, 2022

codecov bot commented Feb 16, 2022

Codecov Report

Merging #6090 (b207cd2) into main (ea8ced0) will decrease coverage by 0.35%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #6090      +/-   ##
==========================================
- Coverage   68.63%   68.28%   -0.36%     
==========================================
  Files        1165     1089      -76     
  Lines      246498   241401    -5097     
  Branches    25703    25145     -558     
==========================================
- Hits       169188   164834    -4354     
+ Misses      70596    70019     -577     
+ Partials     6714     6548     -166     
Flag         Coverage Δ
Debug        68.28% <ø> (-0.36%) ⬇️
production   62.81% <ø> (-0.59%) ⬇️
test         88.74% <ø> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...ML.Transforms/Text/StopWordsRemovingTransformer.cs 86.23% <0.00%> (-0.15%) ⬇️
src/Microsoft.ML.AutoML/API/ExperimentSettings.cs
....AutoML/ColumnInference/ColumnGroupingInference.cs
....AutoML/EstimatorExtensions/EstimatorExtensions.cs
...oft.ML.AutoML/Experiment/Runners/CrossValRunner.cs
...Microsoft.ML.AutoML/Experiment/SuggestedTrainer.cs
....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs
...AutoML/TrainerExtensions/MultiTrainerExtensions.cs
...ML.AutoML/TransformInference/TransformInference.cs
...AutoML/TransformInference/TransformInferenceApi.cs
... and 68 more

@michaelgsharp michaelgsharp left a comment

:shipit:

@LittleLittleCloud LittleLittleCloud merged commit 3055403 into dotnet:main Mar 2, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Apr 2, 2022