## Building Stochastic Gradient Descent from Scratch! 

Use the following starter code to implement stochastic gradient descent (SGD) for ordinary least squares (OLS) from scratch. Once that works, use your new class to fit a linear regression on randomly generated data.

We'll give you a function that generates fake data so you can test your SGD implementation.

In [2]:
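# create fake data for later testing
import numpy as np

def gen_data(rows=200, gen_coefs=(2, 4), gen_inter=0):
    # features: uniform draws on [0, 1), one column per coefficient
    X = np.random.rand(rows, len(gen_coefs))
    # noiseless target: linear combination of the features
    y = X @ np.array(gen_coefs)
    # add Gaussian noise (standard deviation 0.5), then the intercept
    y = y + np.random.normal(0, 0.5, size=X.shape[0])
    y = y + gen_inter
    return X, y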
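
As a quick sanity check on the generated data, the sketch below fits a closed-form least-squares solution and compares it with the coefficients used to generate the data. It is only an illustration: the seed, the row count, the intercept value of 1, and the names `X_design` and `ols_coef` are assumptions made for this example, and `gen_data` is repeated in compact form so the cell runs on its own.

```python
import numpy as np

def gen_data(rows=200, gen_coefs=(2, 4), gen_inter=0):
    # Same logic as the notebook's gen_data, repeated so this cell runs alone.
    X = np.random.rand(rows, len(gen_coefs))
    y = X @ np.array(gen_coefs) + np.random.normal(0, 0.5, size=rows) + gen_inter
    return X, y

np.random.seed(0)  # fixed seed, chosen arbitrarily for reproducibility
X, y = gen_data(rows=2000, gen_coefs=[2, 4], gen_inter=1)

# Closed-form least squares with an explicit intercept column.
X_design = np.column_stack([np.ones(len(X)), X])
ols_coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print(X.shape, y.shape)
print(ols_coef)  # roughly [1, 2, 4], up to noise
```

A closed-form fit like this gives a reference answer to compare your SGD coefficients against later.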

In [3]:
class SGD_for_OLS:
    
    def __init__(self, n_iter=100, alpha=0.001):
        """
        Inputs:
        n_iter: Number of epochs (full passes over the data) to run when fitting.
        The total number of update steps will be n_iter * X.shape[0].
        
        alpha: The learning rate. Moderates the step size during the gradient descent algorithm.
        """
        self.coef_ = None
        self.trained = False
        self.n_iter = n_iter
        self.alpha_ = alpha
        
    def shuffle_data(self, X, y):
        """
        Given X and y, shuffle them to get a new_X and new_y that maintain feature-target relationship. 
        Do this between epochs. 
        """
        assert len(X) == len(y)
        permute = np.random.permutation(len(y))
        return X[permute], y[permute]
    
    def update(self, data_point, error):
        """
        Update the coefficients using one step of SGD. 
        Remember the formula for updating betas: B_i = B_i - alpha * dJ/dB_i --> use the gradient/derivative!
        ---
        Inputs:
        
        data_point: One observation from the data (this is stochastic gradient descent)
        
        error: The residual for the current data point given the current coefficients,
        computed as prediction - true value.
        """
        self.coef_ = self.coef_ - self.alpha_*error*data_point
        # You'll want to change this to take b_0 into account 
        
    def fit(self, X, y):
        pass
    
    def init_coef(self):
        pass
    
    def predict(self, X):
        pass
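
One way the pieces above could fit together is sketched below as a standalone function rather than as the class itself. It is not the intended solution, just one possible approach: the helper name `sgd_ols`, the choice to handle b_0 by prepending a column of ones, the zero initialization, and the hyperparameters are all assumptions made for illustration. The per-sample update follows the rule in the `update` docstring, where the gradient of the squared error for one observation is proportional to `error * data_point`.

```python
import numpy as np

def sgd_ols(X, y, n_iter=100, alpha=0.01, seed=0):
    """Hypothetical helper: fit OLS coefficients with per-sample SGD."""
    rng = np.random.default_rng(seed)
    # Prepend a column of ones so coef[0] plays the role of the intercept b_0.
    Xb = np.column_stack([np.ones(len(X)), X])
    coef = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        # Visit the rows in a fresh random order each epoch,
        # which keeps each row of X paired with its target in y.
        for i in rng.permutation(len(y)):
            error = Xb[i] @ coef - y[i]          # prediction - true
            coef = coef - alpha * error * Xb[i]  # one SGD step
    return coef

# Synthetic data with known coefficients [2, 4] and intercept 1.
rng = np.random.default_rng(1)
X = rng.random((500, 2))
y = 1.0 + X @ np.array([2.0, 4.0]) + rng.normal(0, 0.5, size=500)

coef = sgd_ols(X, y, n_iter=200, alpha=0.01)
print(coef)  # expected to land near [1, 2, 4], up to noise
```

With a constant learning rate the coefficients keep jittering around the least-squares solution instead of settling exactly on it, so compare against a closed-form fit with a tolerance rather than expecting an exact match.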