Description
System Information (please complete the following information):
- OS & Version: Windows 11
- ML.NET Version: ML.NET v4.0.2
- .NET Version: .NET 9.0
Describe the bug
FastForestRegression and FastForestOva (unlike FastForestBinary) do not apply the NumberOfLeaves hyperparameter, even though it is defined in the associated search space. Consequently, while performance for BinaryClassification is on par with comparable AutoML frameworks (e.g. FLAML using exclusively RandomForest), it falls behind for the other task types.
To Reproduce
Steps to reproduce the behavior:
- Train an AutoML experiment using a SweepablePipeline for either Regression or MulticlassClassification
- Set all non-FastForest trainers to false (such that only FastForest remains)
- Append an experiment monitor and log the TrialResult.TrialSettings.Parameters
- Compare the best model to the reported parameters
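The comparison in the last step can be sketched as a small script. This is a minimal illustration, not part of ML.NET: `diff_hyperparameters` is a hypothetical helper, the logged values are the trainer settings (`e4`) reported for trial 31 below, and the `model` values are placeholders standing in for whatever is read back from the trained model (20 is assumed here as the FastTree default for NumberOfLeaves).

```python
import json
import math


def diff_hyperparameters(logged, actual, rel_tol=1e-6):
    """Return {name: (logged_value, actual_value)} for every mismatch."""
    mismatches = {}
    for name, logged_value in logged.items():
        actual_value = actual.get(name)
        both_numeric = isinstance(logged_value, (int, float)) and isinstance(
            actual_value, (int, float)
        )
        if both_numeric:
            # Compare numbers with a tolerance so float rounding is ignored.
            same = math.isclose(logged_value, actual_value, rel_tol=rel_tol)
        else:
            same = logged_value == actual_value
        if not same:
            mismatches[name] = (logged_value, actual_value)
    return mismatches


# Trainer settings ("e4") as logged by the experiment monitor for trial 31.
logged = json.loads(
    '{"NumberOfTrees":15,"NumberOfLeaves":81,"FeatureFraction":0.8326445}'
)

# HYPOTHETICAL values read back from the trained model; replace these with
# the values inspected on the actual best model.
model = {"NumberOfTrees": 15, "NumberOfLeaves": 20, "FeatureFraction": 0.8326445}

mismatches = diff_hyperparameters(logged, model)
print(mismatches)
```

With these placeholder values only NumberOfLeaves is reported as a mismatch, which is the pattern described in this issue.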
Expected behavior
I would expect the hyperparameters logged by the experiment monitor's final ReportBestTrial() call to match those of the best model obtained from the experiment. However, while the other hyperparameters are in alignment, the model's NumberOfLeaves looks suspiciously like the default FastTree value.
Screenshots, Code, Sample Projects
Trial 31 obtained new best model with (loss: 63689.36950302901, metric: 63689.36950302901)
{"pipeline":{"SCHEMA":"e0 * e2 * e3 * e4","e0":{"OutputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"],"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"]},"e2":{"OutputColumnNames":["ocean_proximity"],"InputColumnNames":["ocean_proximity"]},"e3":{"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income","ocean_proximity"],"OutputColumnName":"Features"},"e4":{"NumberOfTrees":15,"NumberOfLeaves":81,"FeatureFraction":0.8326445,"LabelColumnName":"median_house_value","FeatureColumnName":"Features"}},"SCHEMA":"e0 * e1 * e3 * e4","e0":{"OutputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"],"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"]},"e1":{"OutputColumnNames":["ocean_proximity"],"InputColumnNames":["ocean_proximity"]},"e2":{"OutputColumnNames":["ocean_proximity"],"InputColumnNames":["ocean_proximity"]},"e3":{"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income","ocean_proximity"],"OutputColumnName":"Features"},"e4":{"NumberOfTrees":100,"NumberOfLeaves":100,"FeatureFraction":1,"LabelColumnName":"median_house_value","FeatureColumnName":"Features"}}
