### Lasso Regression
The goal is to find the coefficients that minimize the residual sum of squares while applying a penalty to those coefficients. (Lasso: L1 penalty, Ridge: L2 penalty)
* Lasso was proposed to overcome a drawback of Ridge regression: Ridge keeps every variable in the model, relevant or not.
* Like Ridge, Lasso shrinks the coefficients toward zero.
* However, when lambda is large enough, the L1 norm sets some coefficients exactly to zero, so Lasso also performs variable selection.
* Choosing lambda correctly is very important; cross-validation (CV) is used here as well.
* Neither Ridge nor Lasso is superior to the other in general.
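
The two penalties mentioned above can be written out explicitly. The notation here (n observations, p predictors, intercept β₀) is an assumption for illustration and does not come from the notebook:

```latex
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\; \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2} + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert

\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\; \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2} + \lambda \sum_{j=1}^{p} \beta_j^{2}
```

scikit-learn's `Lasso` calls the penalty weight `alpha` and scales the squared-error term by 1/(2n), so its `alpha` values are not numerically identical to λ in the form above.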


#### Model

In [None]:
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import mean_squared_error,r2_score
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn import model_selection
import matplotlib.pyplot as plt
from sklearn.linear_model import RidgeCV, LassoCV

In [None]:
df = pd.read_csv('../input/hitters-baseball-data/Hitters.csv')
df = df.dropna()
dms = pd.get_dummies(df[['League','Division','NewLeague']])
y = df['Salary']
X_ = df.drop(['Salary','League','Division','NewLeague'],axis = 1).astype('float64')
X = pd.concat([X_, dms[['League_N','Division_W','NewLeague_N']]],axis=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 42)

In [None]:
df.head()

In [None]:
lasso_model = Lasso().fit(X_train,y_train)
lasso_model

In [None]:
lasso_model.intercept_

In [None]:
lasso_model.coef_

In [None]:
# Coefficients for different lambda values

In [None]:
lasso = Lasso()
coefs = []
alphas = 10 ** np.linspace(10,-2,100) * 0.5
for i in alphas :
    lasso.set_params(alpha = i)
    lasso.fit(X_train,y_train)
    coefs.append(lasso.coef_)

In [None]:
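ax = plt.gca()
ax.plot(alphas,coefs)
ax.set_xscale('log')
# Label the axes so the coefficient paths are readable
ax.set_xlabel('alpha')
ax.set_ylabel('coefficients')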

In Ridge, the coefficients approach zero but never become exactly 0. In Lasso, they become exactly 0 beyond a certain penalty value.
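
One way to see why this happens is the simplified case of a single coefficient with orthonormal predictors. This is an illustrative assumption, not the Hitters setup. Minimizing ½(b − β)² + λ|β| gives soft-thresholding of the least-squares estimate b, while minimizing ½(b − β)² + (λ/2)β² gives b / (1 + λ). The sketch below uses plain Python and a hypothetical estimate `ols_coef = 2.0`:

```python
def soft_threshold(b, lam):
    """Lasso solution for one coefficient under an orthonormal design."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # |b| <= lam: the coefficient is set exactly to zero


def ridge_shrink(b, lam):
    """Ridge solution for one coefficient under an orthonormal design."""
    return b / (1.0 + lam)  # shrinks proportionally, never reaches zero for b != 0


ols_coef = 2.0  # hypothetical least-squares estimate
lams = [0.0, 0.5, 1.0, 2.0, 4.0]

lasso_path = [soft_threshold(ols_coef, lam) for lam in lams]
ridge_path = [ridge_shrink(ols_coef, lam) for lam in lams]

for lam, l_coef, r_coef in zip(lams, lasso_path, ridge_path):
    print(f"lambda={lam:>4}: lasso={l_coef:.3f}  ridge={r_coef:.3f}")
```

In this toy setting the Lasso value reaches 0 once lambda is at least as large as the estimate, whereas the Ridge value only keeps getting smaller.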

#### Prediction

In [None]:
lasso_model.predict(X_train)[0:5]

In [None]:
y_pred = lasso_model.predict(X_test)
np.sqrt(mean_squared_error(y_test,y_pred))

In [None]:
# R^2: the proportion of the variation in the dependent variable that is explained by the independent variables
r2_score(y_test,y_pred)
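
To make the definition in the comment concrete, here is a minimal sketch of the same quantity computed by hand as 1 − SS_res / SS_tot. The function name `r2_manual` and the demo values are hypothetical, and plain Python is used so the snippet runs without any dependencies:

```python
def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error left over after the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of y_true
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only
y_true_demo = [3.0, 5.0, 7.0, 9.0]
y_pred_demo = [2.5, 5.5, 6.5, 9.5]

r2_demo = r2_manual(y_true_demo, y_pred_demo)
print(r2_demo)
```

A model that always predicts the mean of the observed values scores 0 under this definition, and on held-out data the score can be negative when the predictions are worse than that baseline.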

#### Model Tuning

In [None]:
lasso_cv_model = LassoCV(cv = 10, max_iter = 100000).fit(X_train,y_train)

In [None]:
lasso_cv_model.alpha_

In [None]:
lasso_tuned = Lasso(alpha = lasso_cv_model.alpha_).fit(X_train,y_train)

In [None]:
y_pred = lasso_tuned.predict(X_test)
np.sqrt(mean_squared_error(y_test,y_pred))

In [None]:
alphas = 10**np.linspace(10,-2,100)*0.5
lasso_cv_model = LassoCV(alphas = alphas, cv = 10, max_iter = 100000).fit(X_train,y_train)
print('alpha: ',lasso_cv_model.alpha_)
lasso_tuned = Lasso(alpha = lasso_cv_model.alpha_).fit(X_train,y_train)
y_pred = lasso_tuned.predict(X_test)
np.sqrt(mean_squared_error(y_test,y_pred))

In [None]:
# Variables with a coefficient of 0 have been dropped from the model by Lasso
pd.Series(lasso_tuned.coef_, index = X_train.columns)
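
Reading the selection off the coefficients can be sketched as a simple filter. The feature names and values below are hypothetical placeholders, not results from the tuned model:

```python
# Hypothetical coefficients, for illustration only (not actual model output)
coef_by_feature = {
    "Hits": 1.8,
    "Walks": 3.1,
    "Years": 0.0,
    "HmRun": 0.0,
    "PutOuts": 0.25,
}

# Features Lasso kept (non-zero coefficient) and features it dropped
selected = [name for name, coef in coef_by_feature.items() if coef != 0]
dropped = [name for name, coef in coef_by_feature.items() if coef == 0]

print("kept:   ", selected)
print("dropped:", dropped)
```

With the pandas Series built in the cell above, the same filter could be written with boolean indexing, for example `s[s != 0]` for a Series `s`.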