# Exploring search results

After a pipeline search finishes, we can inspect its results. First, let's run a search that builds 10 different pipelines to explore.

In [1]:
import evalml

X, y = evalml.demos.load_breast_cancer()

clf = evalml.AutoClassifier(objective="f1",
                            max_pipelines=10)

clf.fit(X, y)

*****************************
* Beginning pipeline search *
*****************************

Optimizing for F1. Greater score is better.

Searching up to 10 pipelines. 
Possible model types: xgboost, random_forest, linear_model

✔ XGBoost Classifier w/ One Hot Encod...     0%|          | Elapsed:00:00
✔ XGBoost Classifier w/ One Hot Encod...    10%|█         | Elapsed:00:00
✔ Random Forest Classifier w/ One Hot...    20%|██        | Elapsed:00:06
✔ XGBoost Classifier w/ One Hot Encod...    30%|███       | Elapsed:00:06
✔ Logistic Regression Classifier w/ O...    40%|████      | Elapsed:00:14
✔ XGBoost Classifier w/ One Hot Encod...    50%|█████     | Elapsed:00:14
✔ Logistic Regression Classifier w/ O...    60%|██████    | Elapsed:00:21
✔ XGBoost Classifier w/ One Hot Encod...    70%|███████   | Elapsed:00:22
✔ Logistic Regression Classifier w/ O...    80%|████████  | Elapsed:00:29
✔ Logistic Regression Classifier w/ O...    90%|█████████ | Elapsed:00:37
✔ Logisti

## View Rankings
A summary of all the pipelines built can be returned as a dataframe, sorted by score. EvalML determines from your objective function whether a higher or lower score is better.

In [2]:
clf.rankings

,id,pipeline_name,score,high_variance_cv,parameters
0,8,LogisticRegressionPipeline,0.980527,False,"{'penalty': 'l2', 'C': 0.5765626434012575, 'im..."
1,6,LogisticRegressionPipeline,0.974853,False,"{'penalty': 'l2', 'C': 6.239401330891865, 'imp..."
2,9,LogisticRegressionPipeline,0.974853,False,"{'penalty': 'l2', 'C': 8.123565600467177, 'imp..."
3,4,LogisticRegressionPipeline,0.973411,False,"{'penalty': 'l2', 'C': 8.444214828324364, 'imp..."
4,1,XGBoostPipeline,0.970626,False,"{'eta': 0.38438170729269994, 'min_child_weight..."
5,2,RFClassificationPipeline,0.966846,False,"{'n_estimators': 569, 'max_depth': 22, 'impute..."
6,5,XGBoostPipeline,0.966592,False,"{'eta': 0.6481718720511973, 'min_child_weight'..."
7,0,XGBoostPipeline,0.965192,False,"{'eta': 0.5928446182250184, 'min_child_weight'..."
8,7,XGBoostPipeline,0.963913,False,"{'eta': 0.9786183422327642, 'min_child_weight'..."
9,3,XGBoostPipeline,0.952237,False,"{'eta': 0.5288949197529046, 'min_child_weight'..."
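
To make the ordering concrete, here is a minimal sketch in plain Python that reproduces it from the `id`, `pipeline_name`, and `score` values shown above. The `rank` helper and its `greater_is_better` flag are hypothetical names used for illustration only, not part of EvalML. Because a greater F1 score is better, the rows are sorted in descending order.

```python
# Rows copied from the rankings output above: (id, pipeline_name, score).
rows = [
    (0, "XGBoostPipeline", 0.965192),
    (1, "XGBoostPipeline", 0.970626),
    (2, "RFClassificationPipeline", 0.966846),
    (3, "XGBoostPipeline", 0.952237),
    (4, "LogisticRegressionPipeline", 0.973411),
    (5, "XGBoostPipeline", 0.966592),
    (6, "LogisticRegressionPipeline", 0.974853),
    (7, "XGBoostPipeline", 0.963913),
    (8, "LogisticRegressionPipeline", 0.980527),
    (9, "LogisticRegressionPipeline", 0.974853),
]


def rank(rows, greater_is_better=True):
    """Hypothetical helper: order rows from best to worst score."""
    return sorted(rows, key=lambda row: row[2], reverse=greater_is_better)


ranked = rank(rows)
best_id, best_name, best_score = ranked[0]

# Best score seen for each pipeline type.
best_by_type = {}
for _, name, score in ranked:
    best_by_type.setdefault(name, score)  # first occurrence is the best one

print(best_id, best_name, best_score)
print(best_by_type)
```

For an objective where a lower value is better, such as Log Loss, the same sketch would be called with `greater_is_better=False`.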


## Describe Pipeline
Each pipeline is given an `id`. We can get more information about any particular pipeline using that `id`.

In [3]:
clf.describe_pipeline(0)

********************************************************************************************
* XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model *
********************************************************************************************

Problem Types: Binary Classification, Multiclass Classification
Model Type: XGBoost Classifier
Objective to Optimize: F1 (greater is better)
Number of features: 18

Pipeline Steps
1. One Hot Encoder
2. Simple Imputer
	 * impute_strategy : most_frequent
3. RF Classifier Select From Model
	 * percent_features : 0.6273280598181127
	 * threshold : -inf
4. XGBoost Classifier
	 * eta : 0.5928446182250184
	 * max_depth : 4
	 * min_child_weight : 8.598391737229157

Training
Training for Binary Classification problems.
Total training time (including CV): 0.2 seconds

Cross Validation
----------------
               F1  Precision  Recall   AUC  Log Loss   MCC # Training # Testing
0           0.950

## Get Pipeline
You can also get the object for any pipeline by passing its `id`.

In [4]:
clf.get_pipeline(0)

<evalml.pipelines.classification.xgboost.XGBoostPipeline at 0x135081990>

### Get best pipeline
If you specifically want the best pipeline, you can access it directly through the `best_pipeline` attribute.

In [5]:
clf.best_pipeline

<evalml.pipelines.classification.logistic_regression.LogisticRegressionPipeline at 0x1372054d0>

## Feature Importances

We can get the feature importances of a pipeline returned by the search.

In [6]:
pipeline = clf.get_pipeline(0)
pipeline.feature_importances

,feature,importance
0,22,0.407441
1,7,0.239457
2,27,0.120609
3,20,0.072031
4,23,0.052818
5,6,0.038344
6,1,0.033962
7,21,0.028949
8,4,0.003987
9,25,0.002403
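
As a rough illustration of how these values might be used, the sketch below picks the highest-ranked features and computes their share of the total importance. It works on plain `(feature, importance)` pairs copied from the ten rows shown above. The `top_features` helper is a hypothetical name, not an EvalML function.

```python
# (feature, importance) pairs copied from the output above.
importances = [
    (22, 0.407441), (7, 0.239457), (27, 0.120609), (20, 0.072031),
    (23, 0.052818), (6, 0.038344), (1, 0.033962), (21, 0.028949),
    (4, 0.003987), (25, 0.002403),
]


def top_features(pairs, k):
    """Hypothetical helper: the k features with the largest importance."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


top3 = top_features(importances, 3)
total = sum(value for _, value in importances)
share = sum(value for _, value in top3) / total

print([feature for feature, _ in top3])
print(round(share, 3))
```

In this output, the three highest-ranked features account for roughly 77% of the importance listed.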


## Access raw results
You can also access all of the underlying data through the `results` attribute:

In [7]:
clf.results

{0: {'id': 0,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.5928446182250184,
   'min_child_weight': 8.598391737229157,
   'max_depth': 4,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.6273280598181127},
  'score': 0.9651923054186028,
  'high_variance_cv': False,
  'scores': [0.9504132231404958, 0.9752066115702479, 0.9699570815450643],
  'all_objective_scores': [OrderedDict([('F1', 0.9504132231404958),
                ('Precision', 0.9349593495934959),
                ('Recall', 0.9504132231404958),
                ('AUC', 0.984731920937389),
                ('Log Loss', 0.1536501646237938),
                ('MCC', 0.8644170412909863),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9752066115702479),
                ('Precision', 0.959349593495935),
                ('Recall', 0.9752066115702479),
                ('AUC', 0.9960350337318026),
                ('Log Loss', 0.10194972519713798),
       
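
The dictionary is keyed by pipeline `id`, so a single entry can be looked up and summarized directly. The sketch below uses a trimmed, hand-copied version of the entry for pipeline 0 rather than a live `clf.results` object. For this entry, the reported `score` matches the mean of the per-fold values in `scores`; that is an observation about this output, not a documented guarantee.

```python
from statistics import mean

# Trimmed copy of the entry for pipeline id 0 shown above.
results = {
    0: {
        "id": 0,
        "pipeline_name": "XGBoostPipeline",
        "score": 0.9651923054186028,
        "scores": [0.9504132231404958, 0.9752066115702479, 0.9699570815450643],
    }
}

entry = results[0]
fold_mean = mean(entry["scores"])                     # average over the folds
spread = max(entry["scores"]) - min(entry["scores"])  # best fold minus worst fold

print(fold_mean, entry["score"], spread)
```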