Global Field Importance for Clusters

This script determines which input fields are most important for differentiating between clusters. Given a cluster id, it:

  • Uses a batchcentroid to extend the dataset the cluster was built on with a field holding the id of the centroid each instance is assigned to.
  • Builds an ensemble (random forest) with that centroid id field as the objective.
  • Averages the importance of each input field across all models in the ensemble.
  • Returns the list of input fields sorted by importance.
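
The last two steps (averaging and sorting) can be illustrated with a minimal Python sketch. It is not the script's own code: the function name average_importance, the sample field names and the importance values are all hypothetical. It also assumes that a field missing from a model counts as zero importance for that model, and that the result is sorted from most to least important.

```python
from collections import defaultdict


def average_importance(model_importances):
    """Average each field's importance across models and rank the fields.

    model_importances: list of dicts mapping field name -> importance,
    one dict per model in the ensemble (hypothetical input shape).
    Assumption: a field absent from a model contributes 0 for that model.
    """
    totals = defaultdict(float)
    for importances in model_importances:
        for field, value in importances.items():
            totals[field] += value

    n_models = len(model_importances)
    averaged = {field: total / n_models for field, total in totals.items()}

    # Assumption: most important field first.
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical importances from a three-model ensemble.
models = [
    {"petal length": 0.50, "petal width": 0.25, "sepal length": 0.25},
    {"petal length": 0.75, "petal width": 0.25},
    {"petal length": 0.25, "petal width": 0.50, "sepal length": 0.25},
]

ranking = average_importance(models)
for field, importance in ranking:
    print(f"{field}: {importance:.3f}")
```

Dividing by the number of models, rather than by the number of models that used a field, keeps a field that appears in only a few models from being ranked above one that every model relies on.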