# Jewellery estimation

Use linear regression to estimate the price of jewellery.

## Dataset definition

A jeweller prices stones on the basis of quality and color:
- quality values range from 0 to 8, with 8 being flawless and 0 containing numerous imperfections
- color values range from 1 to 10, with 10 being pure white and 1 being yellow
 
Based on the price per carat (in hundreds of euros) of the following 11 diamonds  
weighing between 1.0 and 1.5 carats, determine the relationship between quality, color and price.

|Color|Quality|Price/cr|
|:-----:|:--------:|:-------:|
|    7|       5|     65|
|    3|       7|     38|
|    5|       8|     51|
|    8|       1|     38|
|    9|       3|     55|
|    5|       4|     43|
|    4|       0|     25|
|    2|       6|     33|
|    8|       7|     71|
|    6|       4|     51|
|    9|       2|     49|


## Estimation

Estimate the price of the following stones based on Color and Quality:

|Color|Quality|Price/cr|
|:-----:|:--------:|:-------:|
|    8|       5|   xx |
|    1|       0|   xx |
|    1|       8|   xx |
|   10|       0|   xx |
|   10|       8|   xx |

Also report the regression coefficients and the intercept of the fitted model.
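
For reference, the model being fitted can be written as follows. This is a sketch that assumes price depends linearly and additively on the two features; the symbols $\beta_0$, $\beta_1$ and $\beta_2$ are notation introduced here, not part of the original task statement.

$$
\widehat{\text{Price/cr}} = \beta_0 + \beta_1 \cdot \text{Color} + \beta_2 \cdot \text{Quality}
$$

Here $\beta_0$ is the intercept, and $\beta_1$ and $\beta_2$ are the coefficients for Color and Quality, chosen to minimise the sum of squared differences between the predicted and observed prices of the 11 training stones.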

In [4]:
import numpy as np
import pandas as pd

%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns

data = {
    'Color': [7, 3, 5, 8, 9, 5, 4, 2, 8, 6, 9],
    'Quality': [5, 7, 8, 1, 3, 4, 0, 6, 7, 4, 2],
    'Price/cr': [65, 38, 51, 38, 55, 43, 25, 33, 71, 51, 49]
}
estimate = {
    'Color': [8, 1, 1, 10, 10],
    'Quality': [5, 0, 8, 0, 8]
}
df = pd.DataFrame(data)
df_est = pd.DataFrame(estimate)

In [70]:
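# Features (Color, Quality) and target (Price/cr).
# The double brackets keep y_train as a one-column DataFrame,
# so predict(), coef_ and intercept_ below come back as 2-D / 1-element arrays.
X_train = df[['Color','Quality']]
y_train = df[['Price/cr']]
X_test = df_est[['Color','Quality']]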

In [58]:
# Linear regression model
from sklearn.linear_model import LinearRegression
mlr = LinearRegression()
mlr.fit(X_train, y_train)
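
As a cross-check on the fit, the same ordinary least-squares problem can be solved directly with NumPy. This is a minimal sketch, not the notebook's own method: the variable names (`color`, `quality`, `price`, `A`, `beta`) are chosen here for illustration, and the data is copied from the training table.

```python
import numpy as np

# Training data copied from the dataset table
color = np.array([7, 3, 5, 8, 9, 5, 4, 2, 8, 6, 9], dtype=float)
quality = np.array([5, 7, 8, 1, 3, 4, 0, 6, 7, 4, 2], dtype=float)
price = np.array([65, 38, 51, 38, 55, 43, 25, 33, 71, 51, 49], dtype=float)

# Design matrix with a leading column of ones for the intercept
A = np.column_stack([np.ones_like(color), color, quality])

# Ordinary least squares: minimise ||A @ beta - price||^2
beta, *_ = np.linalg.lstsq(A, price, rcond=None)
intercept, coef_color, coef_quality = beta

print(f"intercept={intercept:.4f}, color={coef_color:.4f}, quality={coef_quality:.4f}")
```

The three values should agree with `mlr.intercept_` and `mlr.coef_` up to floating-point rounding, since `LinearRegression` fits the same least-squares objective.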

In [72]:
# Predict prices
y_pred_mlr = mlr.predict(X_test)
print(y_pred_mlr)

[[59.70578799]
 [ 6.64669202]
 [36.71401589]
 [50.7042873 ]
 [80.77161117]]
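
Before relying on these numbers, it may be worth checking which query stones lie inside the range of the training data. The sketch below uses a simple per-feature min/max box as a rough proxy; that criterion and the names `train`, `query` and `inside` are assumptions made for illustration.

```python
import numpy as np

# (Color, Quality) pairs of the 11 training stones
train = np.array([[7, 5], [3, 7], [5, 8], [8, 1], [9, 3], [5, 4],
                  [4, 0], [2, 6], [8, 7], [6, 4], [9, 2]])
# (Color, Quality) pairs of the stones to be priced
query = np.array([[8, 5], [1, 0], [1, 8], [10, 0], [10, 8]])

# Per-feature range covered by the training data
lo, hi = train.min(axis=0), train.max(axis=0)

# A query stone is "inside" only if every feature is within the training range
inside = np.all((query >= lo) & (query <= hi), axis=1)

for (c, q), ok in zip(query, inside):
    label = "within training range" if ok else "extrapolation"
    print(f"Color={c}, Quality={q}: {label}")
```

Only the first stone (Color 8, Quality 5) falls inside the training range. The other four have Color 1 or 10, values the model never saw, so their estimates are extrapolations and should be treated with more caution.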


In [74]:
# Append the predictions as a new column; ravel() flattens the (5, 1) array to 1-D
df_est['Price/cr'] = y_pred_mlr.ravel()

In [86]:
print("Training set: \n", df)
print("\nPrice estimates: \n", df_est)
print("\nCoef: ", mlr.coef_)
print("\nIntercept: ", mlr.intercept_)

Training set: 
     Color  Quality  Price/cr
0       7        5        65
1       3        7        38
2       5        8        51
3       8        1        38
4       9        3        55
5       5        4        43
6       4        0        25
7       2        6        33
8       8        7        71
9       6        4        51
10      9        2        49

Price estimates: 
    Color  Quality   Price/cr
0      8        5  59.705788
1      1        0   6.646692
2      1        8  36.714016
3     10        0  50.704287
4     10        8  80.771611

Coef:  [[4.89528836 3.75841548]]

Intercept:  [1.75140366]
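
Rounded to four decimals, the printed coefficients and intercept give the fitted regression equation below (the first coefficient belongs to Color and the second to Quality, matching the column order of `X_train`):

$$
\widehat{\text{Price/cr}} \approx 1.7514 + 4.8953 \cdot \text{Color} + 3.7584 \cdot \text{Quality}
$$

As a worked example, the first stone to estimate (Color 8, Quality 5) gives:

$$
1.7514 + 4.8953 \cdot 8 + 3.7584 \cdot 5 = 1.7514 + 39.1624 + 18.7920 \approx 59.71
$$

This matches the first row of the price estimates. Since prices are in hundreds of euros per carat, each additional color step adds roughly 490 euros per carat and each additional quality step roughly 376 euros per carat, under the assumption that the linear model holds.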
