# Warmup Activity: GridSearchCV
In this activity, you will create a grid search cross-validation instance and use it to tune the hyperparameters of a support vector classifier (SVC) on the wine dataset built into Scikit-learn.

## Instructions

The dataset you'll be working on is the wine dataset built into Scikit-learn. For each sample, there are data on traits such as color intensity, malic acid content, and magnesium content. Also available is the target column, which lists the type of wine. There are three types of wine in the dataset.

* You will perform the following tasks:

  * Split the data into training and testing sets.
  * Scale the data. See [this guide](https://machinelearningmastery.com/data-preparation-without-data-leakage/) for tips on avoiding data leakage.
  * Create a support vector machine model (support vector classifier) to classify a wine type based on its features.
  * Create a grid search cross-validation instance and use it to evaluate the SVC parameters (hyperparameter tuning). Feel free to use your own parameters, or use the ones provided in the notebook.
  * Predict classifications of the testing dataset.
  * Assess the accuracy score of the predicted classifications.

In [1]:
# import dependencies
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

In [2]:
# Only the wine dataset is used in this activity
from sklearn.datasets import load_wine

In [3]:
wine_data = load_wine()

In [4]:
X = wine_data.data
y = wine_data.target

In [5]:
# feature names
wine_data.feature_names

['alcohol',
 'malic_acid',
 'ash',
 'alcalinity_of_ash',
 'magnesium',
 'total_phenols',
 'flavanoids',
 'nonflavanoid_phenols',
 'proanthocyanins',
 'color_intensity',
 'hue',
 'od280/od315_of_diluted_wines',
 'proline']

In [6]:
X[:3]

array([[1.423e+01, 1.710e+00, 2.430e+00, 1.560e+01, 1.270e+02, 2.800e+00,
        3.060e+00, 2.800e-01, 2.290e+00, 5.640e+00, 1.040e+00, 3.920e+00,
        1.065e+03],
       [1.320e+01, 1.780e+00, 2.140e+00, 1.120e+01, 1.000e+02, 2.650e+00,
        2.760e+00, 2.600e-01, 1.280e+00, 4.380e+00, 1.050e+00, 3.400e+00,
        1.050e+03],
       [1.316e+01, 2.360e+00, 2.670e+00, 1.860e+01, 1.010e+02, 2.800e+00,
        3.240e+00, 3.000e-01, 2.810e+00, 5.680e+00, 1.030e+00, 3.170e+00,
        1.185e+03]])

#### Split the data into training and testing sets.

In [7]:
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
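As a quick check on the split above, the following sketch prints the shapes of the two subsets. With no `test_size` argument, `train_test_split` holds out 25% of the rows. The `stratify=y` variant is an optional assumption that the rest of this notebook does not use; it keeps the three wine classes in similar proportions in both subsets.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)

# Same call as in the notebook: default 75% train / 25% test
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
print(X_train.shape, X_test.shape)

# Optional variant (not used elsewhere in this notebook):
# stratify=y preserves the class proportions in both subsets
X_tr_s, X_te_s, y_tr_s, y_te_s = train_test_split(
    X, y, random_state=0, stratify=y
)
print(X_tr_s.shape, X_te_s.shape)
```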

#### Scale the data.

In [24]:
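# Scale the data
scaler = StandardScaler()

In [25]:
# Fit the scaler on the training data only to avoid data leakage
scaler.fit(X_train)

StandardScaler()

In [27]:
# Apply the training-set statistics to both subsets
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)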
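One way to confirm that the scaling step behaved as intended is to inspect the column statistics afterwards. This is a minimal sketch that repeats the split and scaling so it runs on its own. The scaled training columns should have mean close to 0 and standard deviation close to 1, while the test columns will be near those values but not exact, because the test rows never influenced the scaler.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The scaler learns its means and standard deviations from training rows only
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

train_means = X_train_scaled.mean(axis=0)
train_stds = X_train_scaled.std(axis=0)
test_means = X_test_scaled.mean(axis=0)

# Both checks should be True for the training data
print(np.allclose(train_means, 0), np.allclose(train_stds, 1))
# Test-set means are close to zero, but generally not exactly zero
print(np.round(test_means, 2))
```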

#### Create a support vector machine model

In [28]:
# Instantiate an SVC model (SVC was imported with the other dependencies)
model = SVC()

In [29]:
# Create an SVC hyperparameter grid
param_grid = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1, 10, 100]
}
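To get a sense of how much work this grid implies, the sketch below counts the candidate combinations with `ParameterGrid`. The fold count of 5 is an assumption based on the default `cv` of `GridSearchCV` in current scikit-learn releases; the notebook itself does not set it.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1, 10, 100],
}

# Every pairing of a C value with a gamma value is one candidate
n_candidates = len(ParameterGrid(param_grid))

# Assumed default: 5-fold cross-validation, so each candidate is fit 5 times
n_folds = 5
n_fits = n_candidates * n_folds
print(n_candidates, n_fits)
```

The search cost grows multiplicatively with each added hyperparameter, which is worth keeping in mind if you extend the grid with your own values.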

#### Create a grid search cross-validation instance and use it to evaluate the SVC parameters (hyperparameter tuning).

In [30]:
# Use GridSearchCV to tune your hyperparameters
grid_clf = GridSearchCV(model, param_grid)
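The call above relies on defaults. As a sketch, the same search can be written with those defaults spelled out; the variable name `grid_clf_explicit` is hypothetical and exists only for this illustration. For a classifier such as `SVC`, the default scoring is the estimator's own `score` method, which is accuracy, so `scoring="accuracy"` should be equivalent here.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1, 10, 100],
}

# Hypothetical name; same search as grid_clf with the defaults made visible
grid_clf_explicit = GridSearchCV(
    SVC(),
    param_grid,
    cv=5,                # 5-fold cross-validation (stratified for classifiers)
    scoring="accuracy",  # matches SVC's default score method
    refit=True,          # refit the best candidate on all of the training data
)
print(grid_clf_explicit)
```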

#### Predict classifications of the testing dataset.

In [31]:
# Fit the grid search on the scaled training data
grid_clf.fit(X_train_scaled, y_train)

GridSearchCV(estimator=SVC(),
             param_grid={'C': [0.001, 0.01, 0.1, 1, 10, 100],
                         'gamma': [0.001, 0.01, 0.1, 1, 10, 100]})

In [32]:
# The best parameters according to GridSearchCV
grid_clf.best_params_

{'C': 1, 'gamma': 0.01}
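The notebook does not call `predict` explicitly, so here is a minimal sketch of that step. It repeats the earlier cells so it can run on its own. Because `refit` defaults to `True`, the fitted search object delegates `predict` to the best estimator retrained on the full training set.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

param_grid = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1, 10, 100],
}
grid_clf = GridSearchCV(SVC(), param_grid).fit(X_train_scaled, y_train)

# Predict wine types for the held-out test rows
y_pred = grid_clf.predict(X_test_scaled)
print(y_pred[:10])   # first ten predicted classes
print(y_test[:10])   # first ten true classes, for comparison

# Mean cross-validated accuracy of the best candidate on the training folds
print(grid_clf.best_score_)
```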

#### Assess the accuracy score of the predicted classifications.

In [33]:
# Accuracy score on the scaled test set
grid_clf.score(X_test_scaled, y_test)

0.9777777777777777
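In the workflow above, the scaler is fit once on the whole training set before the search, so within each cross-validation round the validation fold has already contributed to the scaling statistics. A common way to avoid this, sketched below as an alternative rather than as the notebook's method, is to put the scaler and the classifier in a `Pipeline` so that scaling is refit on each training fold. The step names `scaler` and `svc` and the variable names are assumptions chosen for this example. The sketch also computes accuracy explicitly with `accuracy_score`, which should match the value returned by `score`.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step names are arbitrary; they become prefixes in the parameter grid
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])
pipe_grid = {
    "svc__C": [0.001, 0.01, 0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1, 10, 100],
}

# The pipeline receives unscaled data; scaling happens inside each fold
pipe_search = GridSearchCV(pipe, pipe_grid)
pipe_search.fit(X_train, y_train)

y_pred = pipe_search.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(pipe_search.best_params_, acc)
```

The selected parameters and test accuracy may differ slightly from the results above, since the cross-validation scores are computed without the validation folds influencing the scaler.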