# Regression

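- Regression models are evaluated with error ("loss") metrics and a goodness-of-fit score:
- MSE (mean_squared_error)  : mean squared error (lower is better)
- MAE (mean_absolute_error) : mean absolute error (lower is better)
- R2 Score                  : coefficient of determination (closer to 1 is better)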
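The three metrics above can be computed by hand, which makes their definitions concrete. This is a minimal sketch on hypothetical toy values (`y_true` and `y_hat` are made up for illustration and are not from the housing data). Note that MSE and MAE are symmetric in their two arguments, while R2 is not, so the order `(y_true, y_pred)` matters for `r2_score`.

```python
import numpy as np

# Hypothetical toy values, chosen only for illustration
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_hat = np.array([2.0, 5.0, 4.0, 6.0])

residuals = y_true - y_hat

# MSE: mean of squared residuals
mse = np.mean(residuals ** 2)

# MAE: mean of absolute residuals
mae = np.mean(np.abs(residuals))

# R2: 1 - (residual sum of squares) / (total sum of squares around the mean of y_true)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(mse, mae, r2)
```
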

In [1]:
from sklearn.datasets import fetch_california_housing
import pandas as pd

In [2]:
housing = fetch_california_housing()
df = pd.DataFrame(housing.data, columns=housing.feature_names)
df.head()

,MedInc,HouseAge,AveRooms,AveBedrms,Population,AveOccup,Latitude,Longitude
0,8.3252,41.0,6.984127,1.02381,322.0,2.555556,37.88,-122.23
1,8.3014,21.0,6.238137,0.97188,2401.0,2.109842,37.86,-122.22
2,7.2574,52.0,8.288136,1.073446,496.0,2.80226,37.85,-122.24
3,5.6431,52.0,5.817352,1.073059,558.0,2.547945,37.85,-122.25
4,3.8462,52.0,6.281853,1.081081,565.0,2.181467,37.85,-122.25


In [None]:
# Target (ground-truth) values: median house value per district, in units of $100,000
housing.target

array([4.526, 3.585, 3.521, ..., 0.923, 0.847, 0.894], shape=(20640,))

In [4]:
from sklearn.model_selection import train_test_split

X = housing.data
y = housing.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


## Linear Regression

- The most basic regression model: ordinary least squares with no regularization

In [5]:
from sklearn.linear_model import LinearRegression

In [6]:
model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_pred

array([0.71912284, 1.76401657, 2.70965883, ..., 4.46877017, 1.18751119,
       2.00940251], shape=(4128,))

In [8]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

In [17]:
print(mean_squared_error(y_test, y_pred))   # argument order is (y_true, y_pred)
print(mean_absolute_error(y_test, y_pred))
print(r2_score(y_test, y_pred))

0.5558034669932196
0.5332039182571576
0.5758549611440138


## Ridge Regression

- Linear regression + L2 regularization (keeps the magnitude of the weights under control)
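To see what "controlling the weight magnitude" means in practice, the sketch below fits `Ridge` with increasing `alpha` and reports the sum of squared weights each time. It uses synthetic data from `make_regression` rather than the housing data so that it runs on its own; the names `X_demo`, `y_demo`, and the chosen `alphas` are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Synthetic data (an assumption for this sketch, not the housing dataset)
X_demo, y_demo = make_regression(
    n_samples=200, n_features=8, noise=10.0, random_state=0
)

alphas = [0.01, 1.0, 100.0, 10000.0]
coef_norms = []
for alpha in alphas:
    ridge = Ridge(alpha=alpha).fit(X_demo, y_demo)
    # The L2 penalty term is alpha * (sum of squared weights)
    coef_norms.append(float(np.sum(ridge.coef_ ** 2)))

for alpha, norm in zip(alphas, coef_norms):
    print(f"alpha={alpha:>8}: sum of squared weights = {norm:.2f}")
```

A larger `alpha` penalizes large weights more heavily, so the sum of squared weights shrinks as `alpha` grows.
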

In [11]:
from sklearn.linear_model import Ridge  # Ridge = L2 regularization (L2: sum of squared weights / L1: sum of absolute weights)

In [16]:
model = Ridge(alpha=1.0)        # alpha is a hyperparameter; finding the best Ridge model comes down to tuning it well
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print(mean_squared_error(y_test, y_pred))   # argument order is (y_true, y_pred)
print(mean_absolute_error(y_test, y_pred))
print(r2_score(y_test, y_pred))

0.5558034669932196
0.5332039182571576
0.5758549611440138
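Since `alpha` has to be tuned, one common option is to let cross-validation pick it. The sketch below uses `RidgeCV`, which by default scores each candidate with an efficient form of leave-one-out cross-validation on the training data. The synthetic data, the `_demo` variable names, and the candidate grid are assumptions for illustration; a sensible grid for the housing data may differ.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for this sketch, not the housing dataset)
X_demo, y_demo = make_regression(
    n_samples=300, n_features=8, noise=15.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Hypothetical candidate grid; alpha is selected using the training split only
candidate_alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
ridge_cv = RidgeCV(alphas=candidate_alphas)
ridge_cv.fit(X_tr, y_tr)

best_alpha = float(ridge_cv.alpha_)
test_r2 = r2_score(y_te, ridge_cv.predict(X_te))

print("selected alpha:", best_alpha)
print("test R2:", test_r2)
```

Selecting `alpha` on the training split and reporting the score on the held-out split keeps the test set out of the tuning loop.
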


## Lasso Regression

In [13]:
from sklearn.linear_model import Lasso

In [15]:
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print(mean_squared_error(y_test, y_pred))   # argument order is (y_true, y_pred)
print(mean_absolute_error(y_test, y_pred))
print(r2_score(y_test, y_pred))

0.6135115198058131
0.5816074623949871
0.5318167610318159
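Lasso applies an L1 penalty (sum of absolute weights), which, unlike the L2 penalty in Ridge, can drive some weights to exactly zero. The sketch below compares how many coefficients are exactly zero under `LinearRegression` and `Lasso`. It uses synthetic data in which only 3 of 10 features carry signal; the data, the `_demo` names, and `alpha=1.0` are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: only 3 of the 10 features actually influence the target
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

ols = LinearRegression().fit(X_demo, y_demo)
lasso = Lasso(alpha=1.0).fit(X_demo, y_demo)

# Count coefficients that are exactly zero in each model
n_zero_ols = int(np.sum(ols.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))

print("zero coefficients (LinearRegression):", n_zero_ols)
print("zero coefficients (Lasso):", n_zero_lasso)
```

Because weights for uninformative features can be removed entirely, Lasso also acts as a rough form of feature selection.
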
