# Ungraded Lab: Overfitting in Logistic Regression

The lectures describe **overfitting**: the model follows the training data too closely and does not generalize well. In this lab we will explore overfitting in logistic regression and how regularization can improve the situation.


## Goals
In this lab you will:
- use `map_feature` to extend the features of a data set
- explore the resulting overfitting
- utilize regularization to reduce overfitting
- reduce features to match the data and reduce overfitting

# Outline
- [Tools](#tools)
- [Dataset](#dataset)
- [Polynomial Feature Map](#FeatureMap)
- [Fit the Model](#FitModel)
- [Reducing Overfitting](#ReduceOverfitting)

# Overfitting
In this lab, we will explore how overfitting happens and what can be done about it.
- Create a logistic dataset with an irregular boundary
- Create an overfitting problem
    - polynomial Regression and Feature mapping
- Regularization to reduce overfitting
<a name='tools'></a>
## Tools 
- We have not yet developed all the capabilities to do gradient descent with regularization, so we will utilize sklearn's LogisticRegression capabilities explored briefly in a previous lab. 
- Plotting is very useful when exploring decision boundaries. We will utilize matplotlib. Producing these plots is quite involved, so helper routines are provided below.
- We will create a polynomial feature set. `map_feature` is provided to simplify that process.

In [None]:
import numpy as np
from IPython.display import Markdown as md
%matplotlib widget
import matplotlib.pyplot as plt
plt.style.use('./deeplearning.mplstyle')
plt.rcParams['font.size'] = 8
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn import preprocessing
from plt_overfit import map_feature, plot_decision_boundary, plt_overfit
from lab_utils_common import dlc, plot_data, zscore_normalize_features, gradient_descent, predict_logistic, predict_linear

In [None]:
def map_one_feature(X1, degree):
    """
    Map a single feature to polynomial features x, x^2, ..., x^degree.
    Returns the mapped array (one column per power) and a text equation.
    """
    X1 = np.atleast_1d(X1)
    out = []
    eq = ""   # text form of the equation (named eq to avoid shadowing the built-in str)
    k = 0
    for i in range(1, degree+1):
        out.append((X1**i))
        eq = eq + f"w_{{{k}}}{munge('x_0',i)} + "
        k += 1
    eq = eq + ' b' #add b to text equation, not to data
    return np.stack(out, axis=1), eq

def munge(base,exp):
    if exp == 0:
        return ('')
    elif exp == 1:
        return (base)
    else:
        return (base + f'^{{{exp}}}')



In [None]:
a = np.array([1.,2,3])
a_mapped, a_eq = map_one_feature(a,3)
display(md(f"${a_eq}$"))
print(a_eq)
print(a_mapped.shape)
print(a_mapped)

<a name='dataset'></a>
##  Dataset
Below we create a logistic dataset with two features based on a quadratic. Random noise is added to create a scenario where the model can overfit. 

In [None]:
m = 50
n = 2
np.random.seed(2)
X_train = 2*(np.random.rand(m,n)-[0.5,0.5])
y_train = X_train[:,1]+0.5  > X_train[:,0]**2 + 0.5*np.random.rand(m) #quadratic + random
y_train = y_train + 0  #convert from boolean to integer

fig, ax = plt.subplots(1,1,figsize=(4,4))
plot_data(X_train, y_train, ax, s=10, loc='lower right')
ax.set_title("Logistic data set with noise")
plt.show()

In [None]:
from matplotlib.gridspec import GridSpec
from matplotlib.widgets import Button, CheckButtons
from matplotlib.patches import FancyArrowPatch
import math
# for debug
from ipywidgets import Output




In [None]:
output = Output() # sends hidden error messages to display when using widgets
display(output)


In [None]:
class button_manager:
    ''' Handles some missing features of matplotlib check buttons 
    on init: 
        creates button, links to button_click routine, 
        calls call_on_click with active index and firsttime=True
    on click:
        maintains single button on state, calls call_on_click
    '''
    @output.capture()  # debug
    def __init__(self,fig, dim, labels, init, call_on_click):
        ''' 
        dim: (list)     [leftbottom_x,bottom_y,width,height]
        labels: (list)  for example ['1','2','3','4','5','6']
        init: (list)    for example [True, False, False, False, False, False]
        '''
        self.fig = fig
        self.ax = plt.axes(dim)  #lx,by,w,h
        self.init_state = init    
        self.call_on_click = call_on_click
        self.button  = CheckButtons(self.ax,labels,init)
        self.button.on_clicked(self.button_click)
        self.status = self.button.get_status()
        self.call_on_click(self.status.index(True),firsttime=True)
        
    @output.capture()  # debug
    def reinit(self):
        self.status = self.init_state
        self.button.set_active(self.status.index(True))      #turn off old, will trigger update and set to status
  
    @output.capture()  # debug
    def button_click(self, event):
        ''' maintains one-on state. If on-button is clicked, will process correctly '''
        new_status = self.button.get_status()
        new = [self.status[i] ^ new_status[i] for i in range(len(self.status))]
        newidx = new.index(True)
        self.button.eventson = False
        self.button.set_active(self.status.index(True))  #turn off old or reenable if same
        self.button.eventson = True
        self.status = self.button.get_status()
        self.call_on_click(self.status.index(True))
        

In [None]:
class overfit_example():
    def __init__(self, X, y, w_in, b_in, regularize=False):
        self.X = X
        self.y = y
        self.w = w_in
        self.b = b_in
        self.regularize=regularize
        self.lambda_=0
        fig = plt.figure( figsize=(8,6))
        fig.canvas.toolbar_visible = False
        fig.canvas.header_visible = False
        fig.canvas.footer_visible = False
        fig.set_facecolor('#ffffff') #white
        gs  = GridSpec(5, 3, figure=fig)
        ax0 = fig.add_subplot(gs[0:3, :])
        ax1 = fig.add_subplot(gs[-2, :])
        ax2 = fig.add_subplot(gs[-1, :])
        ax1.set_axis_off()
        ax2.set_axis_off()
        self.ax = [ax0,ax1,ax2]
        self.fig = fig

        #pos = ax2.get_position().get_points()  ##[[lb_x,lb_y], [rt_x, rt_y]]
        #print(pos)
        self.axfitdata = plt.axes([0.26,0.124,0.10,0.1 ])  #lx,by,w,h
        self.bfitdata  = Button(self.axfitdata , 'fit data', color=dlc['dlblue'])
        self.bfitdata.label.set_fontsize(12)
        self.bfitdata.on_clicked(self.fitdata_clicked)

        self.cid = fig.canvas.mpl_connect('button_press_event', self.add_data)

        self.typebut = button_manager(fig, [0.4, 0.07,0.15,0.15], ["Regression", "Categorical"],
                                       [False,True], self.toggle_type)

        self.fig.text(0.1, 0.02+0.21, "Degree", fontsize=12)
        self.degrbut = button_manager(fig,[0.1,0.02,0.15,0.2 ], ['1','2','3','4','5','6'], 
                                        [True, False, False, False, False, False], self.update_equation)
        if self.regularize:
            self.fig.text(0.6, 0.02+0.21, r"lambda($\lambda$)", fontsize=12)
            self.lambut = button_manager(fig,[0.6,0.02,0.15,0.2 ], ['0.0','0.2','0.4','0.6','0.8','1'], 
                                        [True, False, False, False, False, False], self.updt_lambda)
   
        #self.regbut =  button_manager(fig, [0.8, 0.08,0.24,0.15], ["Regularize"],
        #                               [False], self.toggle_reg)
        #self.logistic_data()
    
    def updt_lambda(self, idx, firsttime=False):
        self.lambda_ = idx * 0.2
        
    def toggle_type(self, idx, firsttime=False):
        self.logistic = True if idx==1 else False
        self.ax[0].clear()
        if self.logistic:
            self.logistic_data()
        else:
            self.linear_data()
        if not firsttime: self.degrbut.reinit()
        
    def logistic_data(self,redraw=False):
        if not redraw:
            m = 50
            n = 2
            np.random.seed(2)
            X_train = 2*(np.random.rand(m,n)-[0.5,0.5])
            y_train = X_train[:,1]+0.5  > X_train[:,0]**2 + 0.5*np.random.rand(m) #quadratic + random
            y_train = y_train + 0  #convert from boolean to integer
            self.X = X_train
            self.y = y_train 

        #plot_data(X_train, y_train, self.ax[0], s=10, loc='lower right')
        plot_data(self.X, self.y, self.ax[0], s=10, loc='lower right')
        self.ax[0].set_title("Logistic data set with noise")
        self.ax[0].text(0.5,0.93, "Click on plot to add data. Hold [Shift] for blue(y=0) data.",
                        fontsize=12, ha='center',transform=self.ax[0].transAxes, color=dlc["dlblue"])
        self.ax[0].set_xlabel(r"$x_0$") 
        self.ax[0].set_ylabel(r"$x_1$")         
    
    def linear_data(self,redraw=False):
        if not redraw:
            m = 30
            n = 2
            c = 0
            x_train = np.arange(0,m,1)
            np.random.seed(1)
            y_ideal = x_train**2 + c
            y_train = y_ideal + 0.7 * y_ideal*(np.random.sample((m,))-0.5)
            self.x_ideal = x_train #for redraw when new data included in X
            self.X = x_train
            self.y = y_train
            self.y_ideal = y_ideal
        else:
            self.ax[0].set_xlim(self.xlim)
            self.ax[0].set_ylim(self.ylim)

        self.ax[0].scatter(self.X,self.y, label="y")
        self.ax[0].plot(self.x_ideal, self.y_ideal, "--", color = "orangered", label="y_ideal", lw=1)
        self.ax[0].set_title("OverFitting Example: Linear Data Set (quadratic with noise)",fontsize = 14)   
        self.ax[0].set_xlabel("x"); self.ax[0].set_ylabel("y")
        self.ax0ledgend = self.ax[0].legend(loc='lower right')
        self.ax[0].text(0.5,0.93, "Click on plot to add data",
                        fontsize=12, ha='center',transform=self.ax[0].transAxes, color=dlc["dlblue"])
        if not redraw:
            self.xlim = self.ax[0].get_xlim()
            self.ylim = self.ax[0].get_ylim()


    @output.capture()  # debug
    def add_data(self, event):
        if self.logistic:
            self.add_data_logistic(event)
        else:
            self.add_data_linear(event)

    @output.capture()  # debug
    def add_data_logistic(self, event):
        if event.inaxes == self.ax[0]:
            x0_coord = event.xdata
            x1_coord = event.ydata
            
            if event.key is None:
                self.ax[0].scatter(x0_coord, x1_coord, marker='x', s=10, c = 'red', label="y=1")
                self.y = np.append(self.y,1)
            else:
                self.ax[0].scatter(x0_coord, x1_coord, marker='o', s=10, label="y=0", facecolors='none',
                                   edgecolors=dlc['dlblue'],lw=3)
                self.y = np.append(self.y,0)
            self.X = np.append(self.X,np.array([[x0_coord, x1_coord]]),axis=0)
        self.fig.canvas.draw()
        
    def add_data_linear(self, event):
        if event.inaxes == self.ax[0]:
            x_coord = event.xdata
            y_coord = event.ydata
            
            self.ax[0].scatter(x_coord, y_coord, marker='o', s=10, facecolors='none',
                                   edgecolors=dlc['dlblue'],lw=3)
            self.y = np.append(self.y,y_coord)
            self.X = np.append(self.X,x_coord)
            self.fig.canvas.draw()

    @output.capture()  # debug
    def fitdata_clicked(self,event):
        if self.logistic == True:
            self.logistic_regression()
        else:
            self.linear_regression()
        
    def linear_regression(self):
        self.ax[0].clear()
        self.fig.canvas.draw()

        # create and fit the model using our mapped_X feature set.
        self.X_mapped, _ =  map_one_feature(self.X, self.degree)
        self.X_mapped_scaled, self.X_mu, self.X_sigma  = zscore_normalize_features(self.X_mapped)
        
        #linear_model = LinearRegression()
        linear_model = Ridge(alpha=self.lambda_, max_iter=10000)  # features are already z-score scaled above; recent scikit-learn releases no longer accept normalize
        linear_model.fit(self.X_mapped_scaled, self.y ) 
        self.w = linear_model.coef_.reshape(-1,)
        self.b = linear_model.intercept_
        x = np.linspace(*self.xlim,30)  #plot line idependent of data which gets disordered
        xm, _ =  map_one_feature(x, self.degree)
        xms = (xm - self.X_mu)/ self.X_sigma
        y_pred = linear_model.predict(xms)
        
        #self.fig.canvas.draw()
        self.linear_data(redraw=True)
        self.ax0yfit = self.ax[0].plot(x, y_pred, color = "blue", label="y_fit")
        self.ax0ledgend = self.ax[0].legend(loc='lower right')
        self.fig.canvas.draw()

    def logistic_regression(self):
        self.ax[0].clear()
        self.fig.canvas.draw()

        # create and fit the model using our mapped_X feature set.
        self.X_mapped, _ =  map_feature(self.X[:, 0], self.X[:, 1], self.degree)
        self.X_mapped_scaled, self.X_mu, self.X_sigma  = zscore_normalize_features(self.X_mapped)
        if self.regularize == False or self.lambda_ == 0:
            lr = LogisticRegression(penalty='none', max_iter=10000)
        else:
            C = 1/self.lambda_
            lr = LogisticRegression(C=C, max_iter=10000)

        lr.fit(self.X_mapped_scaled,self.y)
        #print(lr.score(self.X_mapped_scaled, self.y))
        self.w = lr.coef_.reshape(-1,)
        self.b = lr.intercept_
        #print(self.w, self.b)
        self.logistic_data(redraw=True)
        self.contour = plot_decision_boundary(self.ax[0],[-1,1],[-1,1], self.y, predict_logistic, self.w, self.b, 
                       scaler=True, mu=self.X_mu, sigma=self.X_sigma, degree=self.degree )
        self.fig.canvas.draw()

    @output.capture()  # debug
    def update_equation(self, idx, firsttime=False):
        #print(f"Update equation, index = {idx}, firsttime={firsttime}")
        self.degree = idx+1
        if firsttime:
            self.eqtext = []
        else:
            for artist in self.eqtext:
                #print(artist)
                artist.remove()
            self.eqtext = []
        if self.logistic:
            _, equation =  map_feature(self.X[:, 0], self.X[:, 1], self.degree)
            str = 'f_{wb} = sigmoid(' 
        else:
            _, equation =  map_one_feature(self.X, self.degree)
            str = 'f_{wb} = ('
        bz = 10
        seq = equation.split('+')
        blks = math.ceil(len(seq)/bz)
        for i in range(blks):
            if i == 0:
                str = str +  '+'.join(seq[bz*i:bz*i+bz])
            else:
                str = '+'.join(seq[bz*i:bz*i+bz])
            str = str + ')' if i == blks-1 else str + '+'
            ei = self.ax[1].text(0.01,(0.75-i*0.25), f"${str}$",fontsize=9, transform = self.ax[1].transAxes, ma='left', va='top' )
            self.eqtext.append(ei)
        self.fig.canvas.draw()

plt.close("all")
w_in = np.zeros_like(y_train)
b_in = 0.
#ofit = overfit_example(X_train, y_train, w_in, b_in,True)
ofit = overfit_example(X_train, y_train, w_in, b_in,False)

plt.show()

In [None]:
np.linspace(*ofit.xlim,5)

In [None]:
class equation_manager:
    
    @output.capture()  # debug
    def __init__(self,ax, logistic=False):
        self.ax = ax
        self.fig = ax.figure
        self.logistic = logistic
        self.init_state = [True, False, False, False, False, False]
        self.axdegree = plt.axes([0.1,0.02,0.15,0.2 ])  #lx,by,w,h
        self.button  = CheckButtons(self.axdegree, ['1','2','3','4','5','6'], self.init_state)
        self.button.on_clicked(self.button_clicked)
        self.degreetxt = self.fig.text(0.1, 0.02+0.21, "Degree", fontsize=12)
        self.status = self.button.get_status()
        #self.update_equation(self.status.index(True)+1, firsttime=True)
        self.button_clicked(None, firsttime=True)
        
    def reinit(self, logistic):
        self.logistic = logistic
        self.button.eventson = False
        self.button.set_active(self.status.index(True))  #turn off old
        self.button.eventson = True
        self.button.set_active(self.init_state.index(True))  #turn on init, trigger update            
         
    @output.capture()  # debug
    def button_clicked(self, event, firsttime=False):
        ''' firsttime is from __init__, not button push '''
        new_status = self.button.get_status()
        new = [self.status[i] ^ new_status[i] for i in range(len(self.status))]
        newidx = new.index(True)
        self.button.eventson = False
        self.button.set_active(self.status.index(True))  #turn off old or reenable if same
        self.button.eventson = True
        self.status = self.button.get_status()
        self.update_equation(newidx+1,firsttime)
        self.degree = self.status.index(True)+1
   
    @output.capture()  # debug
    def update_equation(self, degree, firsttime=False):
        if firsttime:
            self.eqtext = []
        else:
            for artist in self.eqtext:
                #print(artist)
                artist.remove()
            self.eqtext = []

        self.X_mapped, equation =  map_feature(X_train[:, 0], X_train[:, 1], degree)
        bz = 10
        seq = equation.split('+')
        blks = math.ceil(len(seq)/bz)
        for i in range(blks):
            if i == 0:
                str = 'f_{wb} = sigmoid('  + '+'.join(seq[bz*i:bz*i+bz])
            else:
                str = '+'.join(seq[bz*i:bz*i+bz])
            str = str + ')' if i == blks-1 else str + '+'
            ei = self.ax.text(0.01,(0.75-i*0.25), f"${str}$",fontsize=9, transform = self.ax.transAxes, ma='left', va='top' )
            self.eqtext.append(ei)
        self.fig.canvas.draw()

    

<a name='FeatureMap'></a>
##  Create Overfitting...Polynomial Feature Mapping
In real data sets, the boundary between the "True" and "False" examples is rarely a straight line. To create a non-linear decision boundary, our model needs non-linear features. Concretely, if we have two features $x_1$ and $x_2$ in our feature set, we can build a model of degree 2:
$$f_{\mathbf{w},b} = w_0x_1 + w_1x_2 + w_2x_1^2 + w_3x_1x_2 + w_4x_2^2 + b \tag{1} $$
To do this, we must convert our two feature data set into a feature set with all combinations of our features. The routine `map_feature` was provided above to do exactly this.

In [None]:
X_tmp = np.array([[2,0],[0,3],[2,3]] )  # values selected to illustrate equation (1)
print("Shape before feature mapping:", X_tmp.shape)
print(X_tmp, "\n")

mapped_X, descrip =  map_feature(X_tmp[:, 0], X_tmp[:, 1],degree = 2)

print("Shape after feature mapping:", mapped_X.shape)
print(mapped_X)

Compare the results with equation (1) above.
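
As a cross-check, equation (1) can also be written out by hand. The sketch below assumes the column order of equation (1). `map_degree2` is a hypothetical helper used only for illustration; it is not the lab's `map_feature`, whose column order may differ.

```python
import numpy as np

def map_degree2(x1, x2):
    """Hypothetical helper: columns follow equation (1): x1, x2, x1^2, x1*x2, x2^2."""
    x1 = np.atleast_1d(x1).astype(float)
    x2 = np.atleast_1d(x2).astype(float)
    return np.stack([x1, x2, x1**2, x1 * x2, x2**2], axis=1)

# same illustrative values as X_tmp above
X_demo = np.array([[2, 0], [0, 3], [2, 3]])
mapped_demo = map_degree2(X_demo[:, 0], X_demo[:, 1])
print(mapped_demo.shape)   # (3, 5): three examples, five mapped features
print(mapped_demo[2])      # row for x1=2, x2=3
```

Each row holds the values that multiply $w_0 \dots w_4$ in equation (1); the bias $b$ is not part of the mapped data.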

Of course, we don't have to stop at two. The `degree` argument to `map_feature` determines the degree of the polynomial that is created. Choose it based on the complexity of the curve you are trying to follow. Increasing the degree allows the model to follow more irregular boundaries, but can also allow overfitting. The number of features/parameters grows quickly as all of the cross terms are included: with two input features, a degree $d$ mapping produces $d(d+3)/2$ features. Sklearn [`PolynomialFeatures`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html) can also be used to create feature maps.
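
To get a feel for how quickly the feature count grows, the sketch below counts the polynomial terms directly. `count_poly_terms` is a hypothetical helper, not part of the lab utilities; it counts every monomial of total degree 1 through `degree` and leaves out the bias term.

```python
from itertools import combinations_with_replacement

def count_poly_terms(n_features, degree):
    """Count monomials of total degree 1..degree (no bias term)."""
    total = 0
    for d in range(1, degree + 1):
        # each multiset of d feature indices corresponds to one monomial of degree d
        total += sum(1 for _ in combinations_with_replacement(range(n_features), d))
    return total

counts = {d: count_poly_terms(2, d) for d in range(1, 7)}
print(counts)
```

For two input features this gives 5 terms at degree 2, matching equation (1), and 27 terms at degree 6.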

Let's convert our dataset above to support degree 6.

In [None]:
# X_train is the 50-example dataset created in the Dataset section (do not overwrite it here)
print("Original shape of data:", X_train.shape)
degree = 6
X_mapped, equation =  map_feature(X_train[:, 0], X_train[:, 1], degree)
print(equation)
print("Shape after feature mapping:", X_mapped.shape)
display(md(f"equation: ${equation}$"))  # render the mapped equation

Note, with a degree 6 polynomial, we now have 27 features!
<a name='FitModel'></a>
## Fit the model

We are going to use the `LogisticRegression` class from scikit-learn that was introduced in a previous lab. Note that this routine has regularization built in. We will enable and disable it to highlight aspects of overfitting. To disable it, the keyword argument `penalty` is set to `'none'` (scikit-learn 1.2 and later accept `penalty=None` instead). When enabled, the keyword argument `C` controls how much regularization is used. 

The first step is to scale the data. With the higher-degree polynomial terms, the model does not fit well without regularization (which we aren't using in this first experiment) unless the features are scaled, so we will scale the data. This is similar to the feature scaling/mean normalization introduced in the first week.
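
The sketch below shows what z-score scaling does to polynomial columns. `zscore_sketch` is a hypothetical stand-in written for illustration; the lab's `zscore_normalize_features` comes from `lab_utils_common` and its implementation is assumed, not shown, to be similar.

```python
import numpy as np

def zscore_sketch(X):
    """Hypothetical stand-in: scale each column to zero mean and unit standard deviation."""
    mu = X.mean(axis=0)        # per-column mean
    sigma = X.std(axis=0)      # per-column standard deviation
    return (X - mu) / sigma, mu, sigma

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(50, 1))
X_poly = np.hstack([x, x**2, x**6])          # columns with very different spreads
X_scaled, mu, sigma = zscore_sketch(X_poly)
print(X_poly.std(axis=0))    # spreads differ before scaling
print(X_scaled.std(axis=0))  # all close to 1 after scaling
```

New points must be scaled with the same `mu` and `sigma` that were computed on the training data, which is why the scaling call returns them.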

In [None]:
X_mapped_scaled, X_mu, X_sigma  = zscore_normalize_features(X_mapped)

In [None]:
w_in  = np.zeros_like(X_mapped_scaled[0])
b_in  = 0.
alpha = 0.01
num_iters = 1000000

w_out, b_out, _ = gradient_descent(X_mapped_scaled, y_train, w_in, b_in, alpha, num_iters, logistic=True) 
print(f"\nupdated parameters: w:{w_out}, b:{b_out}")

In [None]:
#w_out_5M = w_out
#b_out_5M = b_out
from lab_utils_common import compute_cost_matrix, sigmoid
compute_cost_matrix(X_mapped_scaled, y_train, w_out.reshape(-1,1), b_out, logistic=True, lambda_=0)

Now that we have a trained model, let's plot the original data (not the predictions) along with the decision boundary derived from the model. Examine `plot_decision_boundary`, imported above from `plt_overfit`, to see the details of how this is accomplished.
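
One common way to draw such a boundary, sketched below, is to evaluate the model on a grid and take the contour where the predicted probability equals 0.5. This is an illustration only and not necessarily how `plot_decision_boundary` is implemented; the weights are hypothetical values chosen so that the boundary is the parabola $x_1 = x_0^2 - 0.25$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights for the features [x0, x1, x0^2]
w = np.array([0.0, 4.0, -4.0])
b = 1.0

# Evaluate the model on a grid covering the plotting range [-1, 1] x [-1, 1]
x0, x1 = np.meshgrid(np.linspace(-1, 1, 101), np.linspace(-1, 1, 101))
features = np.stack([x0.ravel(), x1.ravel(), x0.ravel() ** 2], axis=1)
prob = sigmoid(features @ w + b).reshape(x0.shape)

# The decision boundary is the 0.5 level set,
# e.g. ax.contour(x0, x1, prob, levels=[0.5]) on a matplotlib axis
pred = prob >= 0.5
print(pred.mean())  # fraction of the grid predicted as y=1
```

With a mapped and scaled feature set, each grid point has to go through the same feature mapping and the same `mu`/`sigma` scaling as the training data before the model is evaluated.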

In [None]:
#plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict, scaler=scaler )
fig,ax = plt.subplots(1,1, figsize=(4,4))
plot_decision_boundary(ax,[-1,1],[-1,1], y_train, predict_logistic, w_out, 
                       b_out, scaler=True, mu=X_mu, sigma=X_sigma, degree=degree )
plot_data(X_train,y_train,ax,s=10)
ax.set_title(f"Example of overfitting, \ndegree {degree}, no regularization")
plt.show()

<details>
<summary>
    <b>Expected Output</b>
</summary>

<center> <img  src="./images/C1_W3_Lab07_overfitting.PNG" width="440" height="440"/> </center>

</details>

In [None]:
# create and fit the model using our mapped_X feature set.
lr = LogisticRegression(penalty='none', max_iter=10000)
lr.fit(X_mapped_scaled,y_train)


In [None]:
print(lr.score(X_mapped_scaled, y_train))
w_lr = lr.coef_.reshape(-1,)
b_lr = lr.intercept_
print(w_lr,b_lr)
#plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict, scaler=scaler )
fig,ax = plt.subplots(1,1, figsize=(4,4))
plot_decision_boundary(ax,[-1,1],[-1,1], y_train, predict_logistic, w_lr, 
                       b_lr, scaler=True, mu=X_mu, sigma=X_sigma, degree=degree )
plot_data(X_train,y_train,ax,s=10)
ax.set_title(f"Example of overfitting, \ndegree {degree}, no regularization")
plt.show()

In [None]:
from lab_utils_common import compute_cost_matrix, sigmoid
compute_cost_matrix(X_mapped_scaled, y_train, w_lr.reshape(-1,1), b_lr, logistic=True, lambda_=0)

In [None]:
# per-example model output (logit) and logistic loss for the sklearn fit
f_wb = X_mapped_scaled @ w_lr + b_lr
print(f_wb.shape)
Blr = -(y_train * f_wb) + np.log(1+np.exp(f_wb))
for i in range(5,5+6):
    print(f_wb[i*6:i*6+6])
    print(y_train[i*6:i*6+6])
    print(Blr[i*6:i*6+6])
    print()

Wow, the model has done an amazing job of separating the data! However, that is probably not what is desired. 
We can take two approaches to reducing overfitting:
- regularization 
- reduce the degree of the polynomial.

<a name='ReduceOverfitting'></a>
## Reducing Overfitting using regularization
The next labs will cover regularization in more detail, so we will just explore this briefly.
Let's fit the model again, but this time include regularization. 

In [None]:
# create and fit the model using the scaled degree-6 feature set.
lr = LogisticRegression(max_iter=1000, C=1)
lr.fit(X_mapped_scaled,y_train)

# print an evaluation of the fit, 1 is best.
print("fitting score:",lr.score(X_mapped_scaled, y_train))

In [None]:
w_lr, b_lr = lr.coef_.reshape(-1,), lr.intercept_
fig,ax = plt.subplots(1,1, figsize=(4,4))
plot_decision_boundary(ax,[-1,1],[-1,1], y_train, predict_logistic, w_lr, 
                       b_lr, scaler=True, mu=X_mu, sigma=X_sigma, degree=degree )
plot_data(X_train,y_train,ax,s=10)
ax.set_title(f"Degree {degree}, with regularization, C=1")
plt.show()

The decision boundary is much more reasonable with some regularization.
Change the value of `C` above to try more or less regularization. `C` must be strictly positive. It is the inverse of the regularization strength: smaller values give stronger regularization, while larger values give weaker regularization.
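
The sketch below illustrates how a penalty term changes the cost and how a regularization strength $\lambda$ maps to `C`. The penalty form $\frac{\lambda}{2m}\sum_j w_j^2$ is an assumption that follows a common convention; scikit-learn scales its terms differently, so treat the numbers as illustrative. `logistic_cost` and the demo arrays are hypothetical.

```python
import numpy as np

def logistic_cost(X, y, w, b, lambda_=0.0):
    """Mean logistic loss plus an L2 penalty (lambda_/(2m)) * sum(w**2); b is not penalized."""
    m = X.shape[0]
    z = X @ w + b
    loss = -(y * z) + np.log(1 + np.exp(z))      # per-example logistic loss
    return loss.mean() + (lambda_ / (2 * m)) * np.sum(w ** 2)

X_demo = np.array([[0.5, 1.0], [-1.0, 0.2], [1.5, -0.7], [-0.3, -1.2]])
y_demo = np.array([1, 0, 1, 0])
w_demo = np.array([3.0, 1.0])

for lambda_ in [0.2, 1.0]:
    C = 1 / lambda_   # same mapping as the lab's widget code; lambda_ = 0 means no penalty
    cost = logistic_cost(X_demo, y_demo, w_demo, 0.0, lambda_)
    print(f"lambda={lambda_}, C={C:.1f}, cost={cost:.4f}")
```

Larger $\lambda$ (smaller `C`) makes large weights more expensive, which pushes the fit toward smoother decision boundaries.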

##### Reduce the degree of the polynomial
A degree 6 polynomial may be more than is required! We can reduce the degree to limit the flexibility of the model.
To do this, we will need to regenerate our mapped data and refit the model.

In [None]:
print("Original shape of data:", X_train.shape)
degree = 2
X_mapped, _ =  map_feature(X_train[:, 0], X_train[:, 1], degree)
X_mapped_scaled, X_mu, X_sigma  = zscore_normalize_features(X_mapped)

print("Shape after feature mapping:", X_mapped.shape)

In [None]:
# create and fit the model using the scaled degree-2 feature set.
lr = LogisticRegression(penalty='none', max_iter=1000)
lr.fit(X_mapped_scaled,y_train)

# print an evaluation of the fit, 1 is best.
print("fit score:", lr.score(X_mapped_scaled, y_train))

In [None]:
w_lr, b_lr = lr.coef_.reshape(-1,), lr.intercept_
fig,ax = plt.subplots(1,1, figsize=(4,4))
plot_decision_boundary(ax,[-1,1],[-1,1], y_train, predict_logistic, w_lr, 
                       b_lr, scaler=True, mu=X_mu, sigma=X_sigma, degree=degree )
plot_data(X_train,y_train,ax,s=10)
ax.set_title(f"Degree {degree}, no regularization")
plt.show()

Not bad! Of course, in this case, we knew ahead of time the data was quadratic and that a degree two polynomial would be a good choice. Try varying `degree` above to see the impact of polynomial degree on overfitting.