*How to Use Your Jupyter Notebook:*

- *You can run a cell in the Notebook to the right by placing your cursor in the cell and clicking the `Run` button or pressing `Shift`+`Enter/Return`.*
- *When you are ready to evaluate the code in your Notebook, press the `Save` button at the top of the Notebook or use the `control/command`+`s` keys before clicking the `Test Work` button at the bottom. Be sure to save your solution code in the cell marked `## YOUR SOLUTION HERE ##` or it will not be evaluated.*
- *When you are ready to move on, click Next.*

![Screenshot of the buttons at the top of a Jupyter Notebook. The Run and Save buttons are highlighted](https://static-assets.codecademy.com/Paths/ds-python/jupyter-buttons.png)

### Setup

In [1]:
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression
from scipy.stats import uniform

# Load the data set
cancer = load_breast_cancer()

# Split the data into training and testing sets
X = cancer.data
y = cancer.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

# Create distributions to draw hyperparameters from:
# 'penalty' is picked uniformly from the list, 'C' is drawn uniformly from [0, 100]
distributions = {'penalty': ['l1', 'l2'], 'C': uniform(loc=0, scale=100)}

# The logistic regression model
lr = LogisticRegression(solver = 'liblinear', max_iter = 1000)

# Create a RandomizedSearchCV model that tries 8 sampled hyperparameter combinations
clf = RandomizedSearchCV(lr, distributions, n_iter=8)
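
To get a feel for the candidates a random search evaluates, the same `distributions` dictionary can be passed to scikit-learn's `ParameterSampler`, a utility for drawing parameter settings from such a dictionary. The following is only an illustrative sketch: the `random_state=0` seed and the variable name `candidates` are assumptions added here for repeatability, and the draws will not match the ones the notebook's own search produces.

```python
from scipy.stats import uniform
from sklearn.model_selection import ParameterSampler

# Same search space as in the setup cell
distributions = {'penalty': ['l1', 'l2'], 'C': uniform(loc=0, scale=100)}

# Draw 8 candidate settings; random_state=0 is an assumption for repeatability
candidates = list(ParameterSampler(distributions, n_iter=8, random_state=0))

# Each candidate is a dict with one value per hyperparameter
for params in candidates:
    print(params)
```

Each printed dictionary pairs one penalty with one value of `C`, which is the shape of the entries later found in `clf.cv_results_['params']`.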

### 1. Fit `clf` to training data and get best hyperparameters

In [2]:
## YOUR SOLUTION HERE ##

clf.fit(X_train, y_train)
best_model = clf.best_estimator_
print(best_model)
print(clf.best_params_)

LogisticRegression(C=31.769718491702726, max_iter=1000, penalty='l1',
                   solver='liblinear')
{'C': 31.769718491702726, 'penalty': 'l1'}


### 2. Calculate cross-validation and test scores of the best estimator

In [3]:
## YOUR SOLUTION HERE ##

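# best_score_ is the mean cross-validated score of the best hyperparameters on the training data
best_score = clf.best_score_
# score() evaluates the refit best estimator on the held-out test set
test_score = clf.score(X_test, y_test)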
print(best_score)
print(test_score)

0.969466484268126
0.951048951048951
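
Note that `best_score_` is a mean cross-validated score, which is not the same as the accuracy of the refit model on the full training set. A minimal sketch of how the three numbers could be compared side by side is shown below; it rebuilds the setup so it runs on its own, and the `random_state=0` arguments and the names `cv_score` and `train_score` are assumptions introduced here, so the values will differ from the output above.

```python
from scipy.stats import uniform
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split

cancer = load_breast_cancer()

# random_state=0 is an assumption so the split and the search are repeatable
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0
)

distributions = {'penalty': ['l1', 'l2'], 'C': uniform(loc=0, scale=100)}
lr = LogisticRegression(solver='liblinear', max_iter=1000)
clf = RandomizedSearchCV(lr, distributions, n_iter=8, random_state=0)
clf.fit(X_train, y_train)

# Mean cross-validated accuracy of the best candidate
cv_score = clf.best_score_
# Accuracy of the refit best estimator on all of the training data
train_score = clf.score(X_train, y_train)
# Accuracy of the refit best estimator on the held-out test data
test_score = clf.score(X_test, y_test)

print('cross-validation:', cv_score)
print('training:', train_score)
print('test:', test_score)
```

A training score well above the cross-validation and test scores would hint at overfitting, which is why it can be useful to look at all three.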


### 3. View random search results

In [4]:
## YOUR SOLUTION HERE ##
hyperparameter_values = pd.DataFrame(clf.cv_results_['params'])
randomsearch_scores = pd.DataFrame(clf.cv_results_['mean_test_score'], columns = ['score'])

df = pd.concat([hyperparameter_values, randomsearch_scores], axis = 1)
print(df)

           C penalty     score
0  87.003879      l2  0.955376
1   5.133015      l1  0.957729
2  51.976782      l1  0.967114
3  71.640591      l1  0.967086
4  39.863819      l2  0.955376
5  31.769718      l1  0.969466
6  90.378809      l2  0.960082
7  33.530884      l1  0.969466
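
As an alternative view of the same information, `cv_results_` can be loaded into a single DataFrame and sorted by `rank_test_score`, so the best candidates appear first. The sketch below is one possible way to do this, not the exercise's required solution; it repeats the setup so it is self-contained, and the `random_state=0` arguments and the names `results`, `columns` and `ranked` are assumptions made for illustration.

```python
import pandas as pd
from scipy.stats import uniform
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split

cancer = load_breast_cancer()

# random_state=0 is an assumption so the split and the search are repeatable
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0
)

distributions = {'penalty': ['l1', 'l2'], 'C': uniform(loc=0, scale=100)}
lr = LogisticRegression(solver='liblinear', max_iter=1000)
clf = RandomizedSearchCV(lr, distributions, n_iter=8, random_state=0)
clf.fit(X_train, y_train)

# cv_results_ is a dict of equal-length arrays, one entry per candidate
results = pd.DataFrame(clf.cv_results_)

# Keep the sampled hyperparameters and the summary scores
columns = ['param_C', 'param_penalty', 'mean_test_score',
           'std_test_score', 'rank_test_score']

# Rank 1 is the best mean cross-validated score
ranked = results[columns].sort_values('rank_test_score').reset_index(drop=True)
print(ranked)
```

The first row of `ranked` corresponds to the candidate reported by `clf.best_params_`, and `std_test_score` shows how much the score varied across the cross-validation folds.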
