# Hands-on exercise

In this exercise you will perform a simple linear regression with SciPy. The `linregress` function is described in the documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html

### Assignment

1) Load the provided .csv file with the used car data

2) Use linear regression to estimate the car prices from the year, kilometers, or engine power. You can run a simple 1D regression on each feature independently (as an optional task, you can also try a 2D or 3D regression combining several features)

3) First perform the estimation with the scipy linregress function (alternatively, you can use the sklearn.linear_model.LinearRegression class).
NB: check the documentation of both methods! In particular, note how many values each one returns (use "_" to discard an output you do not need).

4) Look at the correlation coefficient to see which of the 3 features works best (compare absolute values: a strongly negative coefficient also indicates a good linear fit)

5) Then implement the least squares algorithm yourself: you should get the same solution as linregress (up to floating-point rounding)!

6) Plot the data together with the lines produced by linregress and by your least squares implementation
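To illustrate the note in step 3 about return values, here is a minimal sketch on made-up data (the arrays `x` and `y` are illustrative and are not taken from the car dataset). It assumes that the result of `linregress` can be read by field name or unpacked as five values, with `_` discarding the ones you do not need.

```python
import numpy as np
from scipy import stats

# Made-up data lying close to a straight line (illustrative only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Option 1: keep the result object and read its fields by name
result = stats.linregress(x, y)
print(result.slope, result.intercept, result.rvalue)

# Option 2: unpack the five values, discarding p-value and standard error with "_"
slope, intercept, rvalue, _, _ = stats.linregress(x, y)
print(slope, intercept, rvalue)
```

Reading the fields by name is usually the safer choice, since it does not depend on the order or number of returned values.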


In [20]:
import matplotlib.pyplot as plt
import csv
from scipy import stats
import numpy as np
import sklearn as sl
from sklearn import linear_model

In [17]:
# Load the provided data file with the used car data (you can also have a look at it with any text editor)

filename = "data/km_year_power_price.csv"
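# Read all rows, closing the file as soon as reading is done
with open(filename, newline='') as f:
    dataset = list(csv.reader(f, delimiter=','))

del dataset[0]  # drop the header row

# Convert every field from string to float
for i in range(len(dataset)):
    dataset[i] = [float(x) for x in dataset[i]]

dataset = np.array(dataset)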

print(dataset)

# Columns are km, year, power, price: pair each feature (column 0) with the price (column 1)
data_km = dataset[:, [0, 3]]
data_year = dataset[:, [1, 3]]
data_power = dataset[:, [2, 3]]


[[1.250000e+05 2.001000e+03 4.000000e+01 1.371110e+03]
 [1.500000e+05 2.001000e+03 4.000000e+01 1.298700e+03]
 [5.000000e+03 2.001000e+03 6.000000e+01 1.232430e+03]
 ...
 [2.000000e+04 2.015000e+03 2.600000e+02 4.949238e+04]
 [1.000000e+04 2.015000e+03 3.000000e+02 3.854269e+04]
 [2.000000e+04 2.015000e+03 3.000000e+02 3.968332e+04]]
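As an alternative to the `csv` loop, `np.loadtxt` can parse a purely numeric CSV in one call. The sketch below reads from an in-memory string so that it runs without the data file. The three data rows are copied from the printed output above, while the header names are an assumption (the header of the real file is not shown here).

```python
import io
import numpy as np

# In-memory stand-in for data/km_year_power_price.csv.
# The header names are hypothetical; the three rows come from the printed dataset.
csv_text = """km,year,power,price
125000,2001,40,1371.11
150000,2001,40,1298.70
5000,2001,60,1232.43
"""

# skiprows=1 drops the header line; every remaining field is parsed as a float
dataset = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

print(dataset.shape)   # (rows, columns)
print(dataset[:, 3])   # price column
```

With the real file, passing the filename instead of the `StringIO` object should work the same way, assuming every field after the header is numeric.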


Use linear regression to estimate the car prices from the year, kilometers, or engine power.
You can run a simple 1D regression on each feature independently.




In [30]:
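# linear regression with linregress (estimate price from km, year and power, one feature at a time)
# Each data_* array holds the feature in column 0 and the price in column 1.
# x and y are passed explicitly: recent SciPy releases deprecate inferring both from a single 2-column x.

result_km = stats.linregress(x=data_km[:, 0], y=data_km[:, 1])
result_year = stats.linregress(x=data_year[:, 0], y=data_year[:, 1])
result_power = stats.linregress(x=data_power[:, 0], y=data_power[:, 1])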

# print('Results km-price: ', result_km)
# print('Results year-price: ', result_year)
# print('Results power-price: ', result_power)

names = ['km', 'year', 'power']
correlations = np.array([result_km.rvalue, result_year.rvalue, result_power.rvalue])

# NB: argmax picks the largest signed r, so a strongly negative r (plausible for km)
# would be overlooked; use np.abs(correlations) to rank by strength instead.
best_corr = np.argmax(correlations)

print('Best correlation: ' + names[best_corr] + ', rvalue = ', correlations[best_corr])

Best correlation: power, rvalue =  0.7085500315263968
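Taking `np.argmax` of the signed r-values ranks a strong negative correlation last, even though it indicates a good linear fit. A minimal sketch of ranking by strength instead, using made-up r-values (these are not results from the car dataset):

```python
import numpy as np

names = ["km", "year", "power"]
# Hypothetical r-values, for illustration only
correlations = np.array([-0.8, 0.5, 0.7])

best_signed = names[int(np.argmax(correlations))]
best_by_strength = names[int(np.argmax(np.abs(correlations)))]

print("argmax of r:  ", best_signed)       # largest signed value
print("argmax of |r|:", best_by_strength)  # strongest linear relationship
```

Whether the two rankings differ on the real data depends on the actual r-value for km, which is not printed above.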


In [None]:
# (Optional) linear regression with linear_model.LinearRegression() (estimate price from year)
# Recall that in NumPy an (m, 1) matrix is different from a 1D array of length m -> reshape the feature, e.g. with .reshape(-1, 1)

# your code.....
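One possible way to approach this cell, shown as a sketch on synthetic data rather than the car dataset (the `year` and `price` arrays below are generated for illustration). It assumes the usual scikit-learn convention that `fit` expects a 2D feature array of shape `(n_samples, n_features)`, which is why the 1D array is reshaped.

```python
import numpy as np
from scipy import stats
from sklearn import linear_model

# Synthetic stand-in for (year, price); the values are illustrative only
rng = np.random.default_rng(0)
year = rng.uniform(2001, 2015, size=50)
price = 1500.0 * (year - 2000) + rng.normal(0, 2000, size=50)

# sklearn expects X with shape (n_samples, n_features): reshape the 1D array
X = year.reshape(-1, 1)

model = linear_model.LinearRegression()
model.fit(X, price)

slope = model.coef_[0]
intercept = model.intercept_
print("sklearn:   ", slope, intercept)

# Cross-check against linregress on the same data
ref = stats.linregress(year, price)
print("linregress:", ref.slope, ref.intercept)
```

Both fits solve the same ordinary least squares problem, so the two printed lines should agree up to rounding.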

In [None]:
# (Optional) perform linear regression with a manually implemented least squares (estimate price from year)
# You should get the same solution as linregress (up to floating-point rounding)!

# your code.....
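A minimal sketch of a manual 1D least squares fit, using the closed-form solution slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). The helper name `least_squares_1d` and the synthetic data are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

def least_squares_1d(x, y):
    """Return (slope, intercept) minimising sum((y - (slope * x + intercept))**2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Synthetic data around the line y = 3x - 2 (illustrative only)
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x - 2.0 + rng.normal(0, 1, size=100)

slope, intercept = least_squares_1d(x, y)
ref = stats.linregress(x, y)

print("manual:    ", slope, intercept)
print("linregress:", ref.slope, ref.intercept)
```

Centering `x` and `y` before taking the sums keeps the computation numerically stable when the feature values are large, as with calendar years.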

In [None]:
# Plot the data and the lines representing the output of the linregress and least squares algorithms

# your code....
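A possible plotting sketch on synthetic data (not the car dataset). Here the least squares line is obtained with `np.linalg.lstsq` on the design matrix `[x, 1]`, which stands in for a hand-written implementation. The axis labels are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Synthetic data (illustrative only)
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=80)
y = 1.5 * x + 4.0 + rng.normal(0, 2, size=80)

# Fit 1: linregress
res = stats.linregress(x, y)

# Fit 2: least squares on the design matrix [x, 1]
A = np.column_stack([x, np.ones_like(x)])
(ls_slope, ls_intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

# Two points are enough to draw a straight line
x_line = np.linspace(x.min(), x.max(), 2)

fig, ax = plt.subplots()
ax.scatter(x, y, s=10, alpha=0.6, label="data")
ax.plot(x_line, res.slope * x_line + res.intercept, label="linregress")
ax.plot(x_line, ls_slope * x_line + ls_intercept, linestyle="--", label="least squares")
ax.set_xlabel("feature")
ax.set_ylabel("price")
ax.legend()
```

The dashed style makes the second line visible even though it lies on top of the first. In a notebook the figure is normally rendered when the cell finishes; in a script, call `plt.show()`.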


In [None]:
# linear regression with linregress (estimate price from power)

# your code.....

In [None]:
# linear regression with linregress (estimate price from km)

# your code...

In [None]:
# Look at the correlation coefficients (in absolute value) to see which of the 3 features works best

# your code......
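As a cross-check on the r-values from `linregress`, `np.corrcoef` can compute all pairwise Pearson correlations in one call. The sketch below uses a synthetic array with the same column order as the dataset (km, year, power, price); the generating coefficients are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Synthetic stand-in with columns [km, year, power, price]; values are illustrative only
rng = np.random.default_rng(3)
n = 200
km = rng.uniform(5_000, 150_000, size=n)
year = rng.uniform(2001, 2015, size=n)
power = rng.uniform(40, 300, size=n)
price = 100.0 * power - 0.05 * km + 500.0 * (year - 2001) + rng.normal(0, 3000, size=n)
dataset = np.column_stack([km, year, power, price])

# rowvar=False treats each column as a variable; the result is a 4x4 matrix
corr = np.corrcoef(dataset, rowvar=False)

# The last column holds the correlation of each feature with the price
r_with_price = corr[:3, 3]
for name, r in zip(["km", "year", "power"], r_with_price):
    print(f"{name:>5}: r = {r:+.3f}, r^2 = {r**2:.3f}")
```

Printing r squared alongside r gives the fraction of price variance explained by each 1D fit, which makes features with negative r directly comparable to the others.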

In [None]:
# (Optional) 2D linear regression with linear_model.LinearRegression() (estimate price from year and power)


# your code......
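A sketch of a two-feature regression with `LinearRegression`, again on synthetic data with known coefficients so the fit can be sanity-checked (the generating values 800, 120 and 1000 are arbitrary assumptions, not properties of the car dataset).

```python
import numpy as np
from sklearn import linear_model

# Synthetic data: price generated from known coefficients plus noise
rng = np.random.default_rng(4)
n = 200
year = rng.uniform(2001, 2015, size=n)
power = rng.uniform(40, 300, size=n)
price = 800.0 * (year - 2001) + 120.0 * power + 1000.0 + rng.normal(0, 500, size=n)

# Two features -> X already has shape (n_samples, 2), so no reshape is needed
X = np.column_stack([year, power])

model = linear_model.LinearRegression().fit(X, price)

print("coefficients (year, power):", model.coef_)
print("intercept:", model.intercept_)
print("R^2 on the training data:", model.score(X, price))
```

With the real data, the same `X` could be built from the year and power columns of `dataset`, assuming the column order km, year, power, price.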
