# Regression Week 2: Multiple Regression (gradient descent)

In the first notebook we explored multiple regression using graphlab create. Now we will use pandas along with numpy to solve for the regression weights with gradient descent.

In this notebook we will cover estimating multiple regression weights via gradient descent. You will:
* Add a constant column of 1's to a pandas DataFrame to account for the intercept
* Convert a DataFrame into a Numpy array
* Write a predict_output() function using Numpy
* Write a numpy function to compute the derivative of the regression cost with respect to the weight of a single feature
* Write a gradient descent function to compute the regression weights given an initial weight vector, step size and tolerance.
* Use the gradient descent function to estimate regression weights for multiple features

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt 
%matplotlib inline

# Load in house sales data

The dataset is from house sales in King County, the region where the city of Seattle, WA is located.

In [128]:
sales = pd.read_csv('kc_house_data.csv')
sales

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,20141013T000000,221900.0,3,1.00,1180,5650,1.0,0,0,...,7,1180,0,1955,0,98178,47.5112,-122.257,1340,5650
1,6414100192,20141209T000000,538000.0,3,2.25,2570,7242,2.0,0,0,...,7,2170,400,1951,1991,98125,47.7210,-122.319,1690,7639
2,5631500400,20150225T000000,180000.0,2,1.00,770,10000,1.0,0,0,...,6,770,0,1933,0,98028,47.7379,-122.233,2720,8062
3,2487200875,20141209T000000,604000.0,4,3.00,1960,5000,1.0,0,0,...,7,1050,910,1965,0,98136,47.5208,-122.393,1360,5000
4,1954400510,20150218T000000,510000.0,3,2.00,1680,8080,1.0,0,0,...,8,1680,0,1987,0,98074,47.6168,-122.045,1800,7503
5,7237550310,20140512T000000,1225000.0,4,4.50,5420,101930,1.0,0,0,...,11,3890,1530,2001,0,98053,47.6561,-122.005,4760,101930
6,1321400060,20140627T000000,257500.0,3,2.25,1715,6819,2.0,0,0,...,7,1715,0,1995,0,98003,47.3097,-122.327,2238,6819
7,2008000270,20150115T000000,291850.0,3,1.50,1060,9711,1.0,0,0,...,7,1060,0,1963,0,98198,47.4095,-122.315,1650,9711
8,2414600126,20150415T000000,229500.0,3,1.00,1780,7470,1.0,0,0,...,7,1050,730,1960,0,98146,47.5123,-122.337,1780,8113
9,3793500160,20150312T000000,323000.0,3,2.50,1890,6560,2.0,0,0,...,7,1890,0,2003,0,98038,47.3684,-122.031,2390,7570


If we want to do any "feature engineering" like creating new features or adjusting existing ones we should do this directly using pandas, as seen in the other Week 2 notebook. For this notebook, however, we will work with the existing features.

# Convert to Numpy Array

Although pandas offers a number of benefits to users (especially when working with large datasets and built-in sklearn functions), understanding the details of how the algorithms are implemented requires a library that allows direct (and optimized) matrix operations. Numpy is a Python library for working with matrices (or any multi-dimensional "array").

Recall that the predicted value given the weights and the features is just the dot product between the feature and weight vector. Similarly, if we put all of the features row-by-row in a matrix then the predicted value for *all* the observations can be computed by right multiplying the "feature matrix" by the "weight vector". 

First we need to take the pandas DataFrame of our data and convert it into a 2D numpy array (also called a matrix). The data is already a DataFrame because we loaded it with pd.read_csv(), so we can use the DataFrame's .to_numpy() method to convert it into a numpy matrix (older pandas versions provided .as_matrix() for this, which has since been removed).
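As a minimal sketch of that conversion, assuming a pandas version that provides `DataFrame.to_numpy()`: the small `demo` frame below is a hypothetical stand-in built from two rows of the table above, not the full house data.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame standing in for the sales data
demo = pd.DataFrame({'sqft_living': [1180, 2570], 'bedrooms': [3, 3]})

# to_numpy() returns the values as a 2D numpy array, one row per observation
demo_matrix = demo.to_numpy()
print(type(demo_matrix))
print(demo_matrix.shape)  # (2, 2)
```

The result is a plain `numpy.ndarray`, so every matrix operation used later in this notebook applies to it directly.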

In [129]:
# test with both for understanding
#x = sales.drop(['id' , 'date','price'] , axis=1).set_index('sqft_living')
x = sales.drop(['id' , 'date','price'] , axis=1)
y = sales['price']
print(x.shape ,y.shape)

(21613, 18) (21613,)


In [130]:
print(x,y)

       bedrooms  bathrooms  sqft_living  sqft_lot  floors  waterfront  view  \
0             3       1.00         1180      5650     1.0           0     0   
1             3       2.25         2570      7242     2.0           0     0   
2             2       1.00          770     10000     1.0           0     0   
3             4       3.00         1960      5000     1.0           0     0   
4             3       2.00         1680      8080     1.0           0     0   
5             4       4.50         5420    101930     1.0           0     0   
6             3       2.25         1715      6819     2.0           0     0   
7             3       1.50         1060      9711     1.0           0     0   
8             3       1.00         1780      7470     1.0           0     0   
9             3       2.50         1890      6560     2.0           0     0   
10            3       2.50         3560      9796     1.0           0     0   
11            2       1.00         1160      6000   

Now we will write a function that will accept a pandas DataFrame, a list of feature names (e.g. ['sqft_living', 'bedrooms']) and a target feature (e.g. 'price') and will return two things:
* A numpy matrix whose columns are the desired features plus a constant column (this is how we create an 'intercept')
* A numpy array containing the values of the output

With this in mind, complete the following function (where there's an empty line you should write a line of code that does what the comment above indicates)


In [131]:
def get_numpy_data(DataFrame, features, output):
    # add a constant column of 1's; note that this modifies the DataFrame that was passed in
    DataFrame['constant'] = 1
    # add the column 'constant' to the front of the features list so that we can extract it along with the others:
    features = ['constant'] + features # this is how you combine two lists
    # select the columns of DataFrame given by the features list (now including constant):
    featuresframe=DataFrame[features]

    # the features are kept as a pandas DataFrame here; np.dot() accepts it directly
    # output is already the column of values to predict, so it is passed through unchanged
    outputarray=output

    return(featuresframe,outputarray)
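The function above returns pandas objects and adds the 'constant' column to the DataFrame it receives. For comparison, here is a hedged sketch of a variant that leaves its input untouched and returns plain numpy arrays. The name `get_numpy_data_sketch` and the choice to pass the output as a column name are assumptions made for illustration; this is not the notebook's own function.

```python
import numpy as np
import pandas as pd

def get_numpy_data_sketch(data_frame, features, output_name):
    # Hypothetical variant: work on a copy so the caller's DataFrame is not modified
    frame = data_frame.copy()
    frame['constant'] = 1  # column of 1's that plays the role of the intercept
    columns = ['constant'] + list(features)
    feature_matrix = frame[columns].to_numpy(dtype=float)
    output_array = frame[output_name].to_numpy(dtype=float)
    return feature_matrix, output_array

# Two rows copied from the table above, for illustration only
demo = pd.DataFrame({'sqft_living': [1180, 2570], 'price': [221900.0, 538000.0]})
demo_features, demo_output = get_numpy_data_sketch(demo, ['sqft_living'], 'price')
print(demo_features)
print(demo_output)
print(list(demo.columns))  # the input frame still has only its original columns
```

Working on a copy avoids side effects such as the extra 'constant' column showing up in the caller's DataFrame later on.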


In [132]:
all_features = x.columns
all_features

Index(['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors',
       'waterfront', 'view', 'condition', 'grade', 'sqft_above',
       'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat', 'long',
       'sqft_living15', 'sqft_lot15'],
      dtype='object')

For testing let's use the 'sqft_living' feature and a constant as our features and price as our output:

In [133]:
(example_features, example_output) = get_numpy_data(sales, ['sqft_living'], sales['price']) # the [] around 'sqft_living' makes it a list
print (example_features) # the feature columns: 'constant' and 'sqft_living'
print (example_output) # and the corresponding output


       constant  sqft_living
0             1         1180
1             1         2570
2             1          770
3             1         1960
4             1         1680
5             1         5420
6             1         1715
7             1         1060
8             1         1780
9             1         1890
10            1         3560
11            1         1160
12            1         1430
13            1         1370
14            1         1810
15            1         2950
16            1         1890
17            1         1600
18            1         1200
19            1         1250
20            1         1620
21            1         3050
22            1         2270
23            1         1070
24            1         2450
25            1         1710
26            1         2450
27            1         1400
28            1         1520
29            1         2570
...         ...          ...
21583         1          710
21584         1         1260
21585         

In [134]:
all_feat = list(all_features)


(all_features, all_output) = get_numpy_data(sales, all_feat, sales['price']) # all_feat is already a list of every feature name
print (all_features.columns,all_features) # the column names, then the full feature table
print (all_output)

Index(['constant', 'bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot',
       'floors', 'waterfront', 'view', 'condition', 'grade', 'sqft_above',
       'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat', 'long',
       'sqft_living15', 'sqft_lot15'],
      dtype='object')        constant  bedrooms  bathrooms  sqft_living  sqft_lot  floors  \
0             1         3       1.00         1180      5650     1.0   
1             1         3       2.25         2570      7242     2.0   
2             1         2       1.00          770     10000     1.0   
3             1         4       3.00         1960      5000     1.0   
4             1         3       2.00         1680      8080     1.0   
5             1         4       4.50         5420    101930     1.0   
6             1         3       2.25         1715      6819     2.0   
7             1         3       1.50         1060      9711     1.0   
8             1         3       1.00         1780      7470     1.0   
9    

In [135]:
#all_features = ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'waterfront', 'view', 'condition', 'grade',
#                'sqft_above', 'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat', 'long', 'sqft_living15', 'sqft_lot15',]
(ex_features, ex_output) = get_numpy_data(x, all_feat, y)
print(ex_features.columns ,ex_features)
print(ex_output)

Index(['constant', 'bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot',
       'floors', 'waterfront', 'view', 'condition', 'grade', 'sqft_above',
       'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat', 'long',
       'sqft_living15', 'sqft_lot15'],
      dtype='object')        constant  bedrooms  bathrooms  sqft_living  sqft_lot  floors  \
0             1         3       1.00         1180      5650     1.0   
1             1         3       2.25         2570      7242     2.0   
2             1         2       1.00          770     10000     1.0   
3             1         4       3.00         1960      5000     1.0   
4             1         3       2.00         1680      8080     1.0   
5             1         4       4.50         5420    101930     1.0   
6             1         3       2.25         1715      6819     2.0   
7             1         3       1.50         1060      9711     1.0   
8             1         3       1.00         1780      7470     1.0   
9    

# Predicting output given regression weights

Suppose we had the weights [1.0, 1.0] and the features [1.0, 1180.0] and we wanted to compute the predicted output 1.0\*1.0 + 1.0\*1180.0 = 1181.0. This is the dot product between these two arrays. If they're numpy arrays we can use np.dot() to compute this:

In [137]:
features_len = len(all_feat)+1 # the extra 1 is for the constant column
print(features_len)
all_weights = np.array(len(x.columns)*[1.]) # x already contains the 'constant' column added by get_numpy_data
print(len(all_weights),all_weights)

19
19 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]


In [138]:
my_weights = np.array([1. ,1.]) # the example weights
my_features = example_features # all data points ('constant' and 'sqft_living' columns)
predicted_value = np.dot(my_features, my_weights)
print (predicted_value)


[1181. 2571.  771. ... 1021. 1601. 1021.]


In [139]:
# now use all features 
all_fet = ex_features # every feature column, for all data points
all_pred = np.dot(all_fet , all_weights)

print (all_pred)

[115074.2542 123721.652  122222.5049 ... 106511.0454 108260.9655
 105586.0451]


np.dot() also works when dealing with a matrix and a vector. Recall that the predictions for all the observations are just the RIGHT (as in weights on the right) dot product between the features *matrix* and the weights *vector*. With this in mind, finish the following predict_output function to compute the predictions for an entire matrix of features given the matrix and the weights:

In [140]:
def predict_output(feature_matrix, weights):
    # assume feature_matrix is a numpy matrix containing the features as columns and weights is a corresponding numpy array
    # create the predictions vector by using np.dot()
    
    predictions=np.dot(feature_matrix,weights)
    #predictions = np.dot(weights ,feature_matrix)
    
    return(predictions)
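To make the single-row and matrix cases concrete, here is a small sketch using plain numpy arrays. The two rows are the first two 'sqft_living' values shown earlier with a constant prepended; the variable names are illustrative only.

```python
import numpy as np

feature_matrix = np.array([[1.0, 1180.0],
                           [1.0, 2570.0]])
weights = np.array([1.0, 1.0])

# One observation: dot product of a feature row with the weights
single = np.dot(feature_matrix[0], weights)

# All observations at once: matrix on the left, weights on the right
all_rows = np.dot(feature_matrix, weights)

print(single)    # 1181.0
print(all_rows)  # [1181. 2571.]
```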

If you want to test your code run the following cell:

In [141]:
test_predictions = predict_output(example_features, my_weights)
print (test_predictions[0])# should be 1181.0
print (test_predictions[1]) # should be 2571.0

1181.0
2571.0


In [142]:
# the same prediction using every column of x (including 'constant') with all weights set to 1

my_weights1 = np.array([1]*len(x.columns))
print(my_weights1.shape,  my_weights1)
test_pred1 = predict_output(x , my_weights1)
print(test_pred1)

(19,) [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
[115074.2542 123721.652  122222.5049 ... 106511.0454 108260.9655
 105586.0451]


In [143]:
x.columns

Index(['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors',
       'waterfront', 'view', 'condition', 'grade', 'sqft_above',
       'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat', 'long',
       'sqft_living15', 'sqft_lot15', 'constant'],
      dtype='object')

# Computing the Derivative

We are now going to move to computing the derivative of the regression cost function. Recall that the cost function is the sum over the data points of the squared difference between an observed output and a predicted output.

Since the derivative of a sum is the sum of the derivatives we can compute the derivative for a single data point and then sum over data points. We can write the squared difference between the observed output and predicted output for a single point as follows:

(w[0]\*[CONSTANT] + w[1]\*[feature_1] + ... + w[i] \*[feature_i] + ... +  w[k]\*[feature_k] - output)^2

Where we have k features and a constant. So the derivative with respect to weight w[i] by the chain rule is:

2\*(w[0]\*[CONSTANT] + w[1]\*[feature_1] + ... + w[i] \*[feature_i] + ... +  w[k]\*[feature_k] - output)\* [feature_i]

The term inside the parentheses is just the error (difference between prediction and output). So we can re-write this as:

2\*error\*[feature_i]

That is, the derivative for the weight for feature i is the sum (over data points) of 2 times the product of the error and the feature itself. In the case of the constant, this is just twice the sum of the errors!

Recall that twice the sum of the product of two vectors is just twice the dot product of the two vectors. Therefore the derivative for the weight for feature_i is just two times the dot product between the values of feature_i and the current errors. 
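Written out in symbols, the argument above reads as follows. The notation is an assumption introduced here for illustration: $x_{ji}$ is the value of feature $i$ for data point $j$ (with $x_{j0} = 1$ for the constant), $y_j$ is the observed output, $N$ is the number of data points, and $\mathrm{error}_j$ is prediction minus output for data point $j$.

```latex
\mathrm{RSS}(w) = \sum_{j=1}^{N} \Big( \sum_{m=0}^{k} w_m x_{jm} - y_j \Big)^2

\frac{\partial\, \mathrm{RSS}}{\partial w_i}
  = \sum_{j=1}^{N} 2 \Big( \sum_{m=0}^{k} w_m x_{jm} - y_j \Big) x_{ji}
  = 2 \sum_{j=1}^{N} \mathrm{error}_j \, x_{ji}
  = 2 \, (\mathrm{error} \cdot x_{\cdot i})
```

For the constant, $x_{j0} = 1$ for every $j$, so the last expression reduces to $2 \sum_j \mathrm{error}_j$.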

With this in mind, complete the following derivative function, which computes the derivative of the cost with respect to a weight given the values of its feature (over all data points) and the errors (over all data points).

In [144]:
def feature_derivative(errors, feature):
    # Assume that errors and feature are both numpy arrays of the same length (number of data points)
    # compute twice the dot product of these vectors as 'derivative' and return the value
    derivative = 2*np.dot(errors, feature)
    #print(derivative)

    return(derivative)
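One way to gain confidence in the formula is to compare it with a numerical (finite-difference) derivative of the cost. The sketch below does this on a tiny made-up dataset; `rss`, `X`, `y`, `w` and `eps` are illustrative names, not part of the assignment.

```python
import numpy as np

def feature_derivative(errors, feature):
    # twice the dot product of the errors and the feature values
    return 2 * np.dot(errors, feature)

def rss(feature_matrix, weights, output):
    # sum of squared differences between predictions and observed output
    residuals = np.dot(feature_matrix, weights) - output
    return np.dot(residuals, residuals)

# Made-up data: a constant column plus one feature
X = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]])
y = np.array([4.0, 7.0, 10.0])
w = np.array([0.5, 1.5])

errors = np.dot(X, w) - y
analytic = feature_derivative(errors, X[:, 1])  # derivative for the second weight

# Central finite difference on the same weight
eps = 1e-6
w_plus = w.copy()
w_plus[1] += eps
w_minus = w.copy()
w_minus[1] -= eps
numeric = (rss(X, w_plus, y) - rss(X, w_minus, y)) / (2 * eps)

print(analytic, numeric)  # the two values should agree closely
```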

To test your feature derivative run the following:

In [146]:
(example_features, example_output) = get_numpy_data(sales, ['sqft_living'], sales['price']) 
my_weights = np.array([0., 0.]) # this makes all the predictions 0
print(example_features.columns, example_features, example_output)

Index(['constant', 'sqft_living'], dtype='object')        constant  sqft_living
0             1         1180
1             1         2570
2             1          770
3             1         1960
4             1         1680
5             1         5420
6             1         1715
7             1         1060
8             1         1780
9             1         1890
10            1         3560
11            1         1160
12            1         1430
13            1         1370
14            1         1810
15            1         2950
16            1         1890
17            1         1600
18            1         1200
19            1         1250
20            1         1620
21            1         3050
22            1         2270
23            1         1070
24            1         2450
25            1         1710
26            1         2450
27            1         1400
28            1         1520
29            1         2570
...         ...          ...
21583         1      

In [147]:
test_predictions = predict_output(example_features, my_weights) 
# numpy arrays and pandas Series can be subtracted elementwise with '-':
errors = (test_predictions - example_output) # with all weights at 0 the errors are just -example_output

print('error values after predicted',np.array(errors))

error values after predicted [-221900. -538000. -180000. ... -402101. -400000. -325000.]


In [148]:
# select only the first feature column ('constant') so the derivative is taken with respect to that weight

feature = example_features.iloc[:,:1]
print(feature)

       constant
0             1
1             1
2             1
3             1
4             1
5             1
6             1
7             1
8             1
9             1
10            1
11            1
12            1
13            1
14            1
15            1
16            1
17            1
18            1
19            1
20            1
21            1
22            1
23            1
24            1
25            1
26            1
27            1
28            1
29            1
...         ...
21583         1
21584         1
21585         1
21586         1
21587         1
21588         1
21589         1
21590         1
21591         1
21592         1
21593         1
21594         1
21595         1
21596         1
21597         1
21598         1
21599         1
21600         1
21601         1
21602         1
21603         1
21604         1
21605         1
21606         1
21607         1
21608         1
21609         1
21610         1
21611         1
21612         1

[21613 

In [149]:
# let's compute the derivative with respect to 'constant' (the single column selected above)
derivative = feature_derivative(errors, feature)
print (derivative)
print (-np.sum(example_output)*2) # should be the same as derivative

[-2.334585e+10]
-23345850016.0


In [152]:
der = feature_derivative(errors,example_features) # passing the whole feature matrix returns one derivative per weight
der

array([-2.33458500e+10, -5.87888151e+13])

# Test the derivative with all features
This is done in the all-features section further below.

# Gradient Descent

Now we will write a function that performs a gradient descent. The basic premise is simple. Given a starting point we update the current weights by moving in the negative gradient direction. Recall that the gradient is the direction of *increase* and therefore the negative gradient is the direction of *decrease* and we're trying to *minimize* a cost function. 

The amount by which we move in the negative gradient *direction* is called the 'step size'. We stop when we are 'sufficiently close' to the optimum. We define this by requiring that the magnitude (length) of the gradient vector be smaller than a fixed 'tolerance'.

With this in mind, complete the following gradient descent function using your derivative function above. For each step in the gradient descent we update the weight for each feature before computing our stopping criterion.

In [150]:
from math import sqrt 
# recall that the magnitude/length of a vector [g[0], g[1], g[2]] is sqrt(g[0]^2 + g[1]^2 + g[2]^2)
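As a quick illustration of that formula, the sketch below computes the length of a made-up vector by hand and with `np.linalg.norm`, which computes the same quantity for a 1D array.

```python
import numpy as np
from math import sqrt

g = np.array([3.0, 4.0, 12.0])  # made-up gradient vector

by_hand = sqrt(np.sum(g ** 2))  # sqrt(9 + 16 + 144)
by_norm = np.linalg.norm(g)     # same quantity via numpy

print(by_hand, by_norm)  # 13.0 13.0
```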

In [171]:
def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
    converged = False 
    weights = np.array(initial_weights)# make sure it's a numpy array
    
    while not converged:
        # compute the predictions based on feature_matrix and weights using your predict_output() function
        predictions = predict_output(feature_matrix, weights)
        #print("pred ",predictions)

        # compute the errors as predictions - output
        errors = (predictions - output)

        gradient_sum_squares = 0 # initialize the gradient sum of squares
        # while we haven't reached the tolerance yet, update each feature's weight
        
        for i in range(len(weights)): # loop over each weight
            # feature_derivative() is given the whole feature matrix here, so it returns
            # one derivative per weight; entry i is the derivative for weights[i]
            print("weights ::", weights[i])

            derivative = feature_derivative(errors, feature_matrix)
            # NOTE: casting to np.int64 truncates the value, and squaring such a large
            # integer below can overflow, which shows up as nan in gradient_magnitude
            derivative = np.int64(derivative[i])
            print("derivative ::", derivative)
        
            # add the squared value of the derivative to the gradient sum of squares (for assessing convergence)
            gradient_sum_squares = gradient_sum_squares + (derivative * derivative)
            #print(gradient_sum_squares)

            # subtract the step size times the derivative from the current weight
            #weights[i] = np.int64( (weights[i] - (step_size * derivative) ) )
            weights[i] = (weights[i] - (step_size * derivative))

        # compute the square-root of the gradient sum of squares to get the gradient magnitude:
        gradient_magnitude =np.sqrt(gradient_sum_squares)
        #print(type(gradient_magnitude))
        print('gradient_magnitude ::', gradient_magnitude)
        if gradient_magnitude < tolerance:
            converged = True
    
    return(weights)
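The `nan` values printed for `gradient_magnitude` in the recorded output further down are consistent with integer overflow: the derivative is cast to `np.int64`, and squaring a value on the order of 1e13 exceeds the int64 range, so the sum of squares can wrap to a negative number. Integer initial weights (such as `np.array([1]*n)`) are also truncated on every update. The following is a hedged sketch of a float-only, vectorized variant, not a replacement for the assignment's function. The name `gradient_descent_sketch` and the `max_iterations` safeguard are assumptions added here, and the convergence check runs before the update rather than after it.

```python
import numpy as np

def gradient_descent_sketch(feature_matrix, output, initial_weights,
                            step_size, tolerance, max_iterations=100000):
    # Convert everything to float arrays so nothing is truncated to integers
    X = np.asarray(feature_matrix, dtype=float)
    y = np.asarray(output, dtype=float)
    weights = np.array(initial_weights, dtype=float)

    for _ in range(max_iterations):
        errors = X @ weights - y              # predictions minus output
        gradient = 2 * (X.T @ errors)         # one derivative per weight
        gradient_magnitude = np.sqrt(np.sum(gradient ** 2))
        if gradient_magnitude < tolerance:    # sufficiently close to the optimum
            break
        weights -= step_size * gradient       # move against the gradient
    return weights

# Made-up data generated from y = 1 + 2*x, so the fitted weights should approach [1, 2]
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
fitted = gradient_descent_sketch(X, y, [0.0, 0.0], step_size=0.01, tolerance=1e-8)
print(fitted)
```

Because the errors are computed once per iteration, updating all weights in one vector operation gives the same step as the per-weight loop above. The step size and tolerance used here suit the tiny example only; the house data needs far smaller step sizes, as discussed in this notebook.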

In [168]:
# these values are only for a test run; small values should be used for step_size and tolerance
step_size = 0.000001
tolerance = 0.0001
feature_matrix = ex_features
output = ex_output
initial_weights = np.array(len(x.columns)*[0])
gd1 = regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance)
print(gd1)

weights :: 0
[-2.33458500e+10 -8.32460570e+10 -5.57887714e+10 -5.87888151e+13
 -4.11618179e+14 -3.70863609e+10 -5.41771584e+08 -1.03012875e+10
 -7.99715182e+10 -1.91205870e+11 -4.97090985e+13 -9.07971655e+12
 -4.60399660e+13 -2.77634786e+12 -2.28966770e+15 -1.11100490e+12
  2.85313897e+12 -5.27444973e+13 -3.33813318e+14]
derivative :: -23345850016
weights :: 0
[-2.33458500e+10 -8.32460570e+10 -5.57887714e+10 -5.87888151e+13
 -4.11618179e+14 -3.70863609e+10 -5.41771584e+08 -1.03012875e+10
 -7.99715182e+10 -1.91205870e+11 -4.97090985e+13 -9.07971655e+12
 -4.60399660e+13 -2.77634786e+12 -2.28966770e+15 -1.11100490e+12
  2.85313897e+12 -5.27444973e+13 -3.33813318e+14]
derivative :: -83246056970
weights :: 0
[-2.33458500e+10 -8.32460570e+10 -5.57887714e+10 -5.87888151e+13
 -4.11618179e+14 -3.70863609e+10 -5.41771584e+08 -1.03012875e+10
 -7.99715182e+10 -1.91205870e+11 -4.97090985e+13 -9.07971655e+12
 -4.60399660e+13 -2.77634786e+12 -2.28966770e+15 -1.11100490e+12
  2.85313897e+12 -5.2744497



OverflowError: Python int too large to convert to C long

A few things to note before we run the gradient descent. Since the gradient is a sum over all the data points and involves a product of an error and a feature, the gradient itself will be very large because the features are large (square feet) and the output is large (prices). So while you might expect "tolerance" to be small, small is only relative to the size of the features.

For similar reasons the step size will be much smaller than you might expect, but this is because the gradient has such large values.

# Running the Gradient Descent as Simple Regression

First let's split the data into training and test data.

In [169]:
from sklearn.model_selection import train_test_split
x_train, x_test ,y_train,y_test = train_test_split(x , y , test_size=.2,random_state=0)
print(x_train.shape, y_train.shape)
print(x_test.shape ,y_test.shape)

(17290, 19) (17290,)
(4323, 19) (4323,)


Although the gradient descent is designed for multiple regression, since the constant is now a feature we can use the gradient descent function to estimate the parameters in the simple regression on square feet. The following cell sets up the feature_matrix, output, initial weights and step size for the first model:

In [170]:
# let's test out the gradient descent
simple_features = ['sqft_living']
my_output = 'price'
(simple_feature_matrix, output) = get_numpy_data(x_train, simple_features, y_train)
initial_weights = np.array([-47000., 1.])
print(initial_weights)
step_size = 7e-12
tolerance = 2.5e7

[-4.7e+04  1.0e+00]


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  


Next run your gradient descent with the above parameters.

In [172]:
simple_weights = regression_gradient_descent(simple_feature_matrix, output, initial_weights, step_size, tolerance)
print ("final_simple_weights ::", simple_weights)
print ("rounded weights of position 1 ::", round(simple_weights[1], 1))


weights :: -47000.0
derivative :: -20323303000
weights :: 1.0
derivative :: -50641368745168
gradient_magnitude :: 2795541556.5639195
weights :: -46999.857736879
derivative :: 5191227778
weights :: 355.489581216176
derivative :: 12913091363166
gradient_magnitude :: nan
weights :: -46999.89407547344
derivative :: -1314746986
weights :: 265.097941674014
derivative :: -3292721601214
gradient_magnitude :: nan
weights :: -46999.88487224454
derivative :: 344217781
weights :: 288.146992882512
derivative :: 839614247264
gradient_magnitude :: nan
weights :: -46999.887281769006
derivative :: -78803242
weights :: 282.269693151664
derivative :: -214094048773
gradient_magnitude :: nan
weights :: -46999.88673014631
derivative :: 29063295
weights :: 283.768351493075
derivative :: 54592043995
gradient_magnitude :: nan
weights :: -46999.886933589376
derivative :: 1558302
weights :: 283.38620718511
derivative :: -13920481452
gradient_magnitude :: nan
weights :: -46999.88694449749
derivative :: 8571826
we



How do your weights compare to those achieved in week 1 (don't expect them to be exactly the same)? 

What is the value of the weight for sqft_living -- the second element of 'simple_weights' (rounded to 1 decimal place)?

# Running the Gradient Descent with All Features

In [174]:
simple_features1 = list(x.columns)[:-1]
print(simple_features1 )

['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'waterfront', 'view', 'condition', 'grade', 'sqft_above', 'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat', 'long', 'sqft_living15', 'sqft_lot15']


In [175]:
my_output = 'price'
weight_len = len(simple_features1)+1
(simple_feature_matrix1, output1) = get_numpy_data(x_train, simple_features1, y_train)
initial_weights1 = np.array( [1]* weight_len)
print(initial_weights1, len(initial_weights1))

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1] 19


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  


In [176]:
step_size = 7e-14
tolerance = 2.5e7
simple_feature_matrix1.shape

(17290, 19)

With a small step_size, a high tolerance lets the gradient descent stop after few iterations, while a low tolerance makes it run much longer before it converges.

In [185]:
simple_weights1 = regression_gradient_descent(simple_feature_matrix1, output1, initial_weights1, step_size, tolerance = 2e8)
print (simple_weights1)
print (round(simple_weights1[1], 1))


weights :: 1
derivative :: -14125662206
weights :: 1
derivative :: -51256138320
weights :: 1
derivative :: -34933809136
weights :: 1
derivative :: -37299952365125
weights :: 1
derivative :: -172068014132290
weights :: 1
derivative :: -22939838637
weights :: 1
derivative :: -437701520
weights :: 1
derivative :: -7207741605
weights :: 1
derivative :: -48457781738
weights :: 1
derivative :: -117831032743
weights :: 1
derivative :: -31327283946909
weights :: 1
derivative :: -5972668418216
weights :: 1
derivative :: -27855167326016
weights :: 1
derivative :: -1915582061873
weights :: 1
derivative :: -1385400823712845
weights :: 1
derivative :: -672392068250
weights :: 1
derivative :: 1726407612447
weights :: 1
derivative :: -32916751528824
weights :: 1
derivative :: -153308351381931
gradient_magnitude :: nan
weights :: 1
derivative :: 322702091369
weights :: 1
derivative :: 1083996709266
weights :: 1
derivative :: 680450159404
weights :: 3
derivative :: 668298428125359
weights :: 13
derivat



 2374299367
weights :: 1
derivative :: 74675338537
weights :: 1
derivative :: 1098944984855
weights :: 1
derivative :: 2466143261668
weights :: 3
derivative :: 575376783409071
weights :: 1
derivative :: 92921644716288
weights :: 2
derivative :: 636066627127279
weights :: 1
derivative :: 27640964732096
weights :: 97
derivative :: 31649778449420448
weights :: 1
derivative :: 15346959746206
weights :: 0
derivative :: -39437810988537
weights :: 3
derivative :: 639525204549759
weights :: 11
derivative :: 4795660234124981
gradient_magnitude :: nan
weights :: 0
derivative :: -7571475074505
weights :: 0
derivative :: -25531693897970
weights :: 0
derivative :: -16105890294863
weights :: -43
derivative :: -15917132897137876
weights :: -407
derivative :: -150300246426371136
weights :: 0
derivative :: -11340034737203
weights :: 0
derivative :: -64218821559
weights :: 0
derivative :: -1862670697352
weights :: 0
derivative :: -25791900820320
weights :: 0
derivative :: -58135155062168
weights :: -37
[... output truncated ...]
gradient_magnitude :: 2154051580.6821127
weights :: -5600396
derivative :: 694095229929702656
weights :: -18999299
derivative :: 2354837456155855872
weights :: -12114342
derivative :: 1501614292991217664
weights :: 148165210
derivative :: -9223372036854775808
[... output truncated ...]
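
The value -9223372036854775808 that appears in the log above is the smallest signed 64-bit integer, and several gradient_magnitude entries are nan. Both symptoms are consistent with the arithmetic running in int64 rather than floating point, although the log alone does not prove the cause. The following is a minimal sketch, assuming integer-typed inputs are the culprit and that the per-feature derivative is 2 * dot(errors, feature). feature_derivative_float is a hypothetical helper name, not the notebook's own function.

```python
import numpy as np

# The sentinel seen in the log is exactly the int64 minimum.
INT64_MIN = np.iinfo(np.int64).min  # -9223372036854775808


def feature_derivative_float(errors, feature):
    """Derivative of RSS with respect to one weight, computed in float64.

    Hypothetical helper: both inputs are cast to float first, so large
    products cannot wrap around the int64 range.
    """
    errors = np.asarray(errors, dtype=np.float64)
    feature = np.asarray(feature, dtype=np.float64)
    return 2.0 * np.dot(errors, feature)


# Integer inputs whose dot product exceeds the int64 maximum (about 9.22e18).
errors = np.array([4_000_000_000, 4_000_000_000], dtype=np.int64)
feature = np.array([3_000_000_000, 3_000_000_000], dtype=np.int64)

derivative = feature_derivative_float(errors, feature)
print(derivative)  # about 4.8e19, representable as a float but not as int64
```

Creating the initial weights as floats (for example with np.zeros, which defaults to float64) and casting the feature matrix once up front should have the same effect as casting inside the helper.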

Use your newly estimated weights and your predict_output() function to compute the predictions on all the TEST data (you will need to create a numpy array of the test feature_matrix and test output first):

In [186]:
(test_simple_feature_matrix, test_output) = get_numpy_data(x_test, simple_features, y_test)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
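
This message is the body of pandas' SettingWithCopyWarning. It usually appears when a column is assigned on a DataFrame that was itself produced by slicing another DataFrame, which is plausibly what happens when get_numpy_data adds the constant column to x_test. The notebook's implementation is not shown in this part, so the following is only a sketch of one way to write such a helper without triggering the warning. get_numpy_data_copy is a hypothetical name, and the cast to float is an added assumption.

```python
import numpy as np
import pandas as pd


def get_numpy_data_copy(data, features, output):
    """Return (feature_matrix, output_array) with a leading column of ones.

    Hypothetical variant of get_numpy_data: it works on an explicit copy,
    so the caller's DataFrame (possibly a slice) is never modified.
    """
    frame = data[features].copy()        # explicit copy, not a view
    frame.insert(0, 'constant', 1.0)     # intercept column goes first
    feature_matrix = frame.to_numpy(dtype=np.float64)
    output_array = np.asarray(output, dtype=np.float64)
    return feature_matrix, output_array


# Tiny stand-in for x_test / y_test, using the first rows of the sales table.
x_demo = pd.DataFrame({'sqft_living': [1180, 2570, 770]})
y_demo = pd.Series([221900.0, 538000.0, 180000.0])

matrix, target = get_numpy_data_copy(x_demo, ['sqft_living'], y_demo)
print(matrix.shape)  # (3, 2)
```

Because the constant column is added to the copy, x_demo keeps its original columns after the call.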
  


Now compute your predictions using test_simple_feature_matrix and your weights from above.

In [187]:
simple_predictions = predict_output(test_simple_feature_matrix, simple_weights)
print(simple_predictions)


[ 358353.39059137 1276776.20181686  361188.02889762 ...  338510.92244761
  222290.7518913   417880.79502265]
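
For reference, predict_output was written earlier in the notebook. A minimal sketch of what such a function typically looks like, assuming the first column of the feature matrix is the constant 1 and weights is a 1-D array, is shown below. predict_output_sketch is a hypothetical name and the weights are illustrative, not the fitted ones.

```python
import numpy as np


def predict_output_sketch(feature_matrix, weights):
    """Predicted outputs: one dot product per row of feature_matrix."""
    return np.dot(feature_matrix, weights)


# Two houses: a constant column of ones, then sqft_living.
features = np.array([[1.0, 1180.0],
                     [1.0, 2570.0]])
weights = np.array([1.0, 2.0])  # illustrative weights only

print(predict_output_sketch(features, weights))  # [2361. 5141.]
```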


**Quiz Question: What is the predicted price for the 1st house in the TEST data set for model 1 (round to nearest dollar)?**

In [188]:
print (round(simple_predictions[0]))


358353.0


In [189]:
test_errors = simple_predictions - test_output
RSS = sum(test_errors **2)
print(RSS)

267729995270519.06


Now that you have the predictions on test data, compute the RSS on the test data set. Save this value for comparison later. Recall that RSS is the sum of the squared errors (difference between prediction and output).
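
The RSS definition above can be wrapped in a small helper so that every model is scored the same way. This is a sketch with a hypothetical function name, residual_sum_of_squares, which assumes predictions and actual values are aligned position by position.

```python
import numpy as np


def residual_sum_of_squares(predictions, actual):
    """Sum of squared differences between predictions and actual outputs."""
    errors = (np.asarray(predictions, dtype=np.float64)
              - np.asarray(actual, dtype=np.float64))
    return float(np.dot(errors, errors))


# Errors are 2 and 3, so the RSS is 4 + 9.
print(residual_sum_of_squares([3.0, 5.0], [1.0, 2.0]))  # 13.0
```

Passing the predictions and the outputs explicitly makes it harder to square the errors of the wrong model by accident.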

# Testing the all-feature weights on the test data

In [190]:
(test_matrix1, test_output1) = get_numpy_data(x_test, simple_features1, y_test)
test_matrix1.shape



(4323, 19)

In [192]:
simple_predictions1 = predict_output(test_matrix1, simple_weights1)
print(simple_predictions1[0],simple_predictions1 )


16180268041757.9 [1.61802680e+13 3.09404732e+13 1.68705986e+13 ... 1.67147845e+13
 1.69029933e+13 1.93411735e+13]


In [193]:
test_errors1 = simple_predictions1 - test_output1
# Square this model's own errors (test_errors1), not test_errors from model 1
RSS1 = sum(test_errors1 * test_errors1)
print(RSS1)


# Running a multiple regression

Now we will use more than one actual feature. Use the following code to produce the weights for a second model with the following parameters:

In [194]:
model_features = ['sqft_living', 'sqft_living15']  # sqft_living15 is the average square footage of the 15 nearest neighbors
my_output = 'price'
(feature_matrix, output) = get_numpy_data(x_train, model_features, y_train)
initial_weights = np.array([-100000., 1., 1.])
step_size = 4e-12
tolerance = 1e9



Use the above parameters to estimate the model weights. Record these values for your quiz.

In [195]:
multiple_weights = regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance)
print (multiple_weights)

weights :: -100000.0
derivative :: -22087316112
weights :: 1.0
derivative :: -54296468061066
weights :: 1.0
derivative :: -49022103728294
gradient_magnitude :: 2574122726.3653183
weights :: -99999.91165073555
derivative :: 7021271330
weights :: 218.185872244264
derivative :: 15936571964641
weights :: 197.088414913176
derivative :: 15617455482706
gradient_magnitude :: 2877559531.8818035
weights :: -99999.93973582087
derivative :: -1860250025
weights :: 154.43958438570002
derivative :: -5462100488627
weights :: 134.618592982352
derivative :: -4106443214624
gradient_magnitude :: 2229250836.614814
weights :: -99999.93229482077
derivative :: 841189734
weights :: 176.28798634020802
derivative :: 1076471020219
weights :: 151.044365840848
derivative :: 1891611429802
gradient_magnitude :: 2839393077.2066383
weights :: -99999.9356595797
derivative :: 11253664
weights :: 171.982102259332
derivative :: -903084191803
weights :: 143.47792012164
derivative :: 47669530010
gradient_magnitude :: nan
wei
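
The call above relies on the regression_gradient_descent function written earlier in the notebook. As a rough illustration of the technique, here is a sketch of batch gradient descent on RSS that keeps all arithmetic in float64 and stops once the gradient magnitude falls below the tolerance. The function name, the max_iterations safeguard, and the toy data are assumptions made for this example, not the notebook's code.

```python
import numpy as np


def regression_gradient_descent_sketch(feature_matrix, output, initial_weights,
                                       step_size, tolerance,
                                       max_iterations=10000):
    """Batch gradient descent on RSS, all arithmetic in float64.

    Hypothetical re-implementation; max_iterations is an added safeguard
    that the notebook's call signature does not have.
    """
    X = np.asarray(feature_matrix, dtype=np.float64)
    y = np.asarray(output, dtype=np.float64)
    weights = np.array(initial_weights, dtype=np.float64)

    for _ in range(max_iterations):
        errors = X @ weights - y              # predictions minus outputs
        gradient = 2.0 * (X.T @ errors)       # one derivative per weight
        gradient_magnitude = np.sqrt(np.sum(gradient ** 2))
        if gradient_magnitude < tolerance:    # converged
            break
        weights -= step_size * gradient       # step against the gradient
    return weights


# Noise-free toy data generated from y = 1 + 2 * x.
X_demo = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])

weights = regression_gradient_descent_sketch(
    X_demo, y_demo, initial_weights=[0.0, 0.0], step_size=0.01, tolerance=1e-6)
print(weights)  # close to [1. 2.]
```

The step size and tolerance in the toy call are scaled to the toy data. The house data needs far smaller steps, such as the 4e-12 used above, because the squared-footage features are large.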



Use your newly estimated weights and the predict_output function to compute the predictions on the TEST data. Don't forget to create a numpy array for these features from the test set first!

In [196]:
(test_multiple_feature_matrix, test_output) = get_numpy_data(x_test, model_features, y_test)
multiple_predictions = predict_output(test_multiple_feature_matrix, multiple_weights)




**Quiz Question: What is the predicted price for the 1st house in the TEST data set for model 2 (round to nearest dollar)?**

In [197]:
print (round(multiple_predictions[0]))
multiple_predictions


352911.0


array([ 352911.44049332, 1323572.31188305,  356078.65296591, ...,
        296674.33539719,  217287.68741076,  416899.44924817])

What is the actual price for the 1st house in the test data set?

In [198]:
y_test


17384     297000.0
722      1578000.0
2680      562100.0
18754     631500.0
14554     780000.0
16227     485000.0
6631      340000.0
19813     335606.0
3367      425000.0
21372     490000.0
3268      732000.0
20961     389700.0
21456     450000.0
3880      357000.0
17472     960000.0
7618      257000.0
1091      448000.0
1560      610000.0
8945      230950.0
8439      377500.0
13058     375000.0
12080     410000.0
7417      459000.0
3101      190000.0
18769     585000.0
7332      280000.0
19186     500000.0
11875     465000.0
14100     802000.0
10641     440000.0
           ...    
20973    1242000.0
10966     411500.0
380       270000.0
16870     725000.0
5617     2400000.0
17271     423000.0
5078      250000.0
10545     850000.0
10321     925000.0
6289      490000.0
3639      200000.0
3338     1675000.0
20476     375000.0
11879     339950.0
13129     516000.0
14660     802500.0
18064     210000.0
7340      565000.0
14360     420000.0
20380     500000.0
19125     390000.0
10033     22

**Quiz Question: Which estimate was closer to the true price for the 1st house on the TEST data set, model 1 or model 2?**
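
One way to check this is to read the first test price by position and compare absolute errors. The sketch below rebuilds only the first three entries of y_test as printed above, and reuses the first-house predictions printed for each model. The variable names are illustrative.

```python
import pandas as pd

# First three entries of y_test as printed above. The index labels are the
# original row numbers, so label 0 is not the first test house.
y_demo = pd.Series([297000.0, 1578000.0, 562100.0], index=[17384, 722, 2680])

actual_first = y_demo.iloc[0]  # positional access to the first test house

# First-house predictions printed earlier for each model.
prediction_model_1 = 358353.39059137
prediction_model_2 = 352911.44049332

error_model_1 = abs(prediction_model_1 - actual_first)
error_model_2 = abs(prediction_model_2 - actual_first)

closer = 'model 1' if error_model_1 < error_model_2 else 'model 2'
print(closer)  # model 2
```

Using .iloc[0] matters here because the train/test split shuffled the rows, so y_test[0] would look up the label 0 rather than the first position.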

Now use your predictions and the output to compute the RSS for model 2 on TEST data.

In [201]:
multiple_test_errors = multiple_predictions - y_test
RSSm = (multiple_test_errors **2).sum()
print (RSSm)

263772457096127.34


**Quiz Question: Which model (1 or 2) has the lowest RSS on all of the TEST data?**

In [200]:
RSS < RSSm  # True would mean the single-feature model has the lower test RSS; False means the multiple-feature model reduced the squared (prediction - true) error


False