FastForest NumberOfLeaves hyperparameter not updating in Regression/MulticlassClassification SweepablePipeline #7498

@JoshuaSloan

Description

System Information:

  • OS & Version: Windows 11
  • ML.NET Version: ML.NET v4.0.2
  • .NET Version: .NET 9.0

Describe the bug
FastForestRegression and FastForestOva (unlike FastForestBinary) do not modify the NumberOfLeaves hyperparameter, even though it is defined in the associated search space. Consequently, while BinaryClassification performance is on par with comparable AutoML frameworks (e.g. FLAML restricted to RandomForest), Regression and MulticlassClassification performance falls behind.

To Reproduce
Steps to reproduce the behavior:

  1. Train an AutoML experiment using a SweepablePipeline for either Regression or MulticlassClassification
  2. Set all non-FastForest trainers to false (such that only FastForest remains)
  3. Append an experiment monitor and log the TrialResult.TrialSettings.Parameters
  4. Compare the hyperparameters of the best model with the logged parameters

Expected behavior
I would expect the model hyperparameters logged by the experiment monitor's final ReportBestTrial() call to match those of the best model obtained from the experiment. However, while the other hyperparameters are in alignment, the model's NumberOfLeaves looks suspiciously like the default FastTree value.

Screenshots, Code, Sample Projects
Trial 31 obtained new best model with (loss: 63689.36950302901, metric: 63689.36950302901)
{"pipeline":{"SCHEMA":"e0 * e2 * e3 * e4","e0":{"OutputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"],"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"]},"e2":{"OutputColumnNames":["ocean_proximity"],"InputColumnNames":["ocean_proximity"]},"e3":{"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income","ocean_proximity"],"OutputColumnName":"Features"},"e4":{"NumberOfTrees":15,"NumberOfLeaves":81,"FeatureFraction":0.8326445,"LabelColumnName":"median_house_value","FeatureColumnName":"Features"}},"SCHEMA":"e0 * e1 * e3 * e4","e0":{"OutputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"],"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income"]},"e1":{"OutputColumnNames":["ocean_proximity"],"InputColumnNames":["ocean_proximity"]},"e2":{"OutputColumnNames":["ocean_proximity"],"InputColumnNames":["ocean_proximity"]},"e3":{"InputColumnNames":["longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income","ocean_proximity"],"OutputColumnName":"Features"},"e4":{"NumberOfTrees":100,"NumberOfLeaves":100,"FeatureFraction":1,"LabelColumnName":"median_house_value","FeatureColumnName":"Features"}}
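The parameter object logged above carries two e4 (trainer) blocks: one nested under "pipeline" and one at the top level. The log alone does not establish which block the trainer actually consumed. As a minimal sketch for lining the two up, assuming the JSON layout shown in the log, the following Python snippet lists the settings that differ. The helper name diff_trainer_settings is hypothetical, and the JSON is abbreviated to the SCHEMA and e4 entries only, with the values copied from the log.

```python
import json

# Abbreviated copy of the logged trial parameters: only the trainer step "e4"
# is kept; the featurizer steps e0..e3 are omitted for brevity.
logged = """
{"pipeline": {"SCHEMA": "e0 * e2 * e3 * e4",
              "e4": {"NumberOfTrees": 15, "NumberOfLeaves": 81,
                     "FeatureFraction": 0.8326445}},
 "SCHEMA": "e0 * e1 * e3 * e4",
 "e4": {"NumberOfTrees": 100, "NumberOfLeaves": 100, "FeatureFraction": 1}}
"""


def diff_trainer_settings(params, step="e4"):
    """Return {name: (nested_value, top_level_value)} for settings that differ."""
    nested = params["pipeline"][step]  # block nested under "pipeline"
    top = params[step]                 # block at the top level of the object
    return {
        name: (nested.get(name), top.get(name))
        for name in sorted(set(nested) | set(top))
        if nested.get(name) != top.get(name)
    }


differences = diff_trainer_settings(json.loads(logged))
for name, (nested_value, top_value) in differences.items():
    print(f"{name}: pipeline.e4={nested_value}  top-level e4={top_value}")
```

Running the same comparison against the hyperparameters read back from the trained model would show which of the two blocks, if either, the model reflects.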
