### Introduction

We used the Linear SVC classification algorithm (a Scikit-learn estimator) to learn how to classify records in the Iris flowers dataset.

In this section, we explore Linear SVC in a little more detail.  We can only cover things at a high level, though, because the detailed workings of these algorithms typically require strong mathematical knowledge.

Most Scikit-learn algorithms have been implemented consistently.  Once you understand how to work with one algorithm, you can apply that knowledge to work with other algorithms.
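
As a rough illustration of this consistency, the sketch below trains two different algorithms through the same `fit` and `predict` calls. `KNeighborsClassifier` is not used elsewhere in this notebook; it is chosen here only as an assumed second example. Depending on your Scikit-learn version, `LinearSVC()` may print a convergence warning, which can be ignored for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

iris = load_iris()
X, y = iris.data, iris.target

# Two different algorithms, used through the same fit/predict interface.
estimators = [LinearSVC(), KNeighborsClassifier()]

predictions = {}
for estimator in estimators:
    name = type(estimator).__name__
    estimator.fit(X, y)                           # learn from the iris data
    predictions[name] = estimator.predict(X[:3])  # classify the first three records
    print(name, predictions[name])
```

Only the line that creates the estimator changes between algorithms; the training and prediction code stays the same.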


### Linear SVC

The Linear SVC algorithm is documented at the following page: https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html

If you navigate to the page, you will see a number of parameters that can be provided to alter the behaviour of the algorithm, such as:

`penalty, loss, dual, tol, ...`

When we created our LinearSVC model, we did not provide any **parameters**, so the default parameter values were used:

`classifier = LinearSVC()`
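
One way to see which default values were used is to ask the estimator itself. The sketch below relies on `get_params()`, a method that Scikit-learn estimators provide; the exact list of parameters and some defaults may differ between Scikit-learn versions.

```python
from sklearn.svm import LinearSVC

classifier = LinearSVC()

# get_params() returns a dictionary of parameter names and their current values.
params = classifier.get_params()
for name, value in sorted(params.items()):
    print(f"{name} = {value}")
```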

Here is an example of specifying some parameters:

`classifier = LinearSVC(penalty='l1', loss='squared_hinge', dual=False)`

To understand which parameter values to use, a data scientist would need to understand the mathematical formulation of the model.  Note that each algorithm has a different mathematical formulation.
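
One practical consequence is that not every combination of parameter values is valid. The sketch below assumes that `LinearSVC` rejects `penalty='l1'` together with `loss='hinge'` when the model is trained, and that `penalty='l1'` with `dual=False` is accepted. The variable names are chosen only for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

# Assumption: this combination is not supported, so fit() raises a ValueError.
try:
    LinearSVC(penalty='l1', loss='hinge').fit(X, y)
    outcome = "accepted"
except ValueError as error:
    outcome = "rejected"
    print("Rejected:", error)

# A combination that is supported: L1 penalty with the primal formulation.
model = LinearSVC(penalty='l1', dual=False).fit(X, y)
print("Trained with penalty =", model.get_params()["penalty"])
```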


#### Linear SVC Mathematical Formulation

**IMPORTANT:**

- We show the content below only to make you aware of the concepts that data scientists typically need to understand when building machine learning models.
- You do NOT need to understand this for this machine learning 101. Our goal is only to make you aware of some of the things that data scientists do.

The following content is copied directly from the Scikit-Learn docs.  

> ---

> The primal problem can be equivalently formulated as
>
> <div class="math">
\begin{equation}
  \min_ {w, b} \frac{1}{2} w^T w + C \sum_{i=1}^{n}\max(0, 1 - y_i (w^T \phi(x_i) + b)),
\end{equation}
> </div>
>
> where we make use of the hinge loss. This is the form that is directly optimized by LinearSVC, but unlike the dual form, this one does not involve inner products between samples, so the famous kernel trick cannot be applied. This is why only the linear kernel is supported by LinearSVC ($\phi$ is the identity function).
>
> Source: https://scikit-learn.org/stable/modules/svm.html#svm-mathematical-formulation

> ---
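
For readers who find code easier to read than formulas, the quoted objective can be sketched in a few lines of NumPy. This is only an illustration on made-up toy values, not how LinearSVC is implemented internally. It assumes two classes labelled -1 and +1 and treats $\phi$ as the identity function; `primal_objective` and the `*_toy` names are hypothetical.

```python
import numpy as np

def primal_objective(w, b, X, y, C=1.0):
    """Compute 0.5 * w^T w + C * sum_i max(0, 1 - y_i * (w^T x_i + b))."""
    margins = y * (X @ w + b)               # y_i * (w^T x_i + b) for every record
    hinge = np.maximum(0.0, 1.0 - margins)  # hinge loss for every record
    return 0.5 * (w @ w) + C * hinge.sum()

# Toy values, made up for this illustration.
X_toy = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]])
y_toy = np.array([1.0, 1.0, -1.0])
w_toy = np.array([1.0, 0.0])
b_toy = 0.0

objective = primal_objective(w_toy, b_toy, X_toy, y_toy, C=1.0)
print(objective)  # 0.5 from the w^T w term plus 0.5 from the second record's hinge loss
```

Only the second record lies inside the margin, so it is the only one that contributes to the hinge loss term.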

### Hyper-parameter tuning

It is important to know about algorithm parameters because they underpin another concept that you should know about: Hyper Parameter Tuning.

<div class="alert-info">

---
_**DEFINITION: Hyper Parameter Tuning**_
    
_Also known as: Hyper Parameter Optimisation or HPO_
    
_Hyper Parameters are parameters that the algorithm does not learn from the data; you set them before training.  In Scikit-learn they are passed in as arguments to constructors, e.g._
    
_`classifier = LinearSVC(penalty='l1', dual=False)`_
    
_Hyper Parameter **Tuning** is the process (usually automated) of finding the best values for the hyper parameters so that the prediction accuracy of the algorithm is optimised._
    
_**Advanced Topic**: for more information on Hyper Parameter Tuning you can refer to the Scikit-learn docs: https://scikit-learn.org/stable/modules/grid_search.html#_
    
---
    
</div>
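
As a minimal sketch of automated tuning, the example below uses Scikit-learn's `GridSearchCV` to try several values of the `C` parameter and keep the one with the best cross-validated accuracy. The candidate values and the choice of 5 folds are assumptions made only for this illustration, and `dual=False` is set to help the solver converge on this small dataset.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

# Candidate hyper parameter values (chosen only for illustration).
param_grid = {"C": [0.01, 0.1, 1, 10]}

# Train and evaluate one model per candidate value, using 5-fold cross-validation.
search = GridSearchCV(LinearSVC(dual=False), param_grid, cv=5)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```

A grid search tries every listed combination, so the cost grows quickly as more parameters and values are added.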



 ### Summary
 
In this notebook we focussed on parameters.  In future notebooks, we will explore how different parameters affect the prediction accuracy of our models.
 
 **Exercise 01:** Try running the code below using different parameter values. 

```python
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

# Load the iris data: X holds the measurements, y holds the species labels
iris = load_iris()
X, y = iris.data, iris.target

classifier = LinearSVC()  # TRY DIFFERENT PARAMETERS

# Train the model (i.e. learn from the iris data)
model = classifier.fit(X, y)
```



### Navigation

[Previous](./06_Sckit-Learn_algorithm_cheetsheet.ipynb) | [Next](./08_test_and_train_split.ipynb) notebook