When two features are highly correlated and equally important, L1 regularization may keep only one of them and shrink the other's coefficient to exactly zero. This is undesirable when both features matter for interpretation, because the zero coefficient wrongly suggests that the dropped feature is irrelevant.

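When two features are highly correlated but differ in importance, the model may give the larger weight to the less important one, so the selected subset no longer reflects the features that matter most. The reason is that, from the data alone, the model cannot separate each feature's own relationship with the target. With an L1 penalty, the choice between them can then be driven by feature scale, since a feature with larger values needs a smaller coefficient and is penalized less.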

When two features are highly correlated but both have low importance, dropping both loses essentially no information.

Overall, it is important to understand both the importance of each feature and the correlations among features before applying a machine learning model. Regularization techniques such as L1 should be used with caution, since they can discard useful information or select the wrong subset of features.

In [1]:
import numpy as np

# First Case

In [2]:
# X2 is X1 shifted by 0.1, so the two features are perfectly correlated
X1 = [1, 2, 3, 4, 5]
X2 = [1.1, 2.1, 3.1, 4.1, 5.1]
y = [4, 8, 12, 16, 20]  # y = 4 * X1

In [3]:
from sklearn.linear_model import LinearRegression

X = np.array([X1]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[4.]


In [4]:
from sklearn.linear_model import LinearRegression

X = np.array([X2]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[4.]


In [5]:
from sklearn.linear_model import LinearRegression

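# Fit on both features at once; the columns are collinear after centering,
# so the coefficients are not uniquely determined
X = np.array([X1, X2]).T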
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)

[2. 2.]
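The even split is what a minimum-norm least-squares solver would be expected to return here: after centering, X1 and X2 are identical columns, so any pair with w1 + w2 = 4 fits perfectly, and [2, 2] is the pair with the smallest Euclidean norm. The sketch below checks this with `np.linalg.lstsq`. That `LinearRegression` behaves like a minimum-norm solver on centered data is an assumption inferred from the output above, not something the notebook states.

```python
import numpy as np

X1 = np.array([1, 2, 3, 4, 5], dtype=float)
X2 = X1 + 0.1
y = 4 * X1

# Center features and target, which is what fitting an intercept amounts to.
Xc = np.column_stack([X1 - X1.mean(), X2 - X2.mean()])
yc = y - y.mean()

# lstsq returns the minimum-norm solution when columns are linearly dependent.
w_min_norm, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
print("minimum-norm weights:", np.round(w_min_norm, 6))

# Other weight pairs with w1 + w2 = 4 fit equally well but have a larger norm.
norms = {}
for w in ([4.0, 0.0], [0.0, 4.0], [2.0, 2.0]):
    w_arr = np.array(w)
    residual = float(np.linalg.norm(yc - Xc @ w_arr))
    norms[tuple(w)] = float(np.linalg.norm(w_arr))
    print(w, "residual:", round(residual, 8), "norm:", round(norms[tuple(w)], 4))
```

All three candidates leave a residual of essentially zero, so the data cannot tell them apart; only the tie-breaking rule of the solver does.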


In [6]:
from sklearn.linear_model import Lasso

In [7]:
# Lasso with the default penalty strength, written out explicitly
model = Lasso(alpha=1.0)
model.fit(X, y)
model.coef_

array([3.5, 0. ])
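The value 3.5 can be reproduced by hand. Assuming the Lasso objective is (1/(2n))·||y − Xw||² + alpha·||w||₁, a single coordinate update on centered data is a soft-threshold: for X1 the covariance with y is 8 and the variance is 2, so the update gives (8 − 1) / 2 = 3.5. The remaining residual then has covariance 1 with X2, which the penalty of 1 shrinks to zero. The sketch below is a bare cyclic coordinate descent written for illustration; `soft_threshold` and `lasso_cd` are hypothetical helper names, and scikit-learn's actual solver is more elaborate.

```python
import numpy as np

def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; return 0 when |rho| <= alpha."""
    return np.sign(rho) * max(abs(rho) - alpha, 0.0)

def lasso_cd(X, y, alpha=1.0, n_sweeps=100):
    """Plain cyclic coordinate descent on centered data (illustrative only)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            r = yc - Xc @ w + Xc[:, j] * w[j]
            rho = Xc[:, j] @ r / n          # covariance of feature j with r
            var_j = Xc[:, j] @ Xc[:, j] / n  # variance of feature j
            w[j] = soft_threshold(rho, alpha) / var_j
    return w

X = np.array([[1, 2, 3, 4, 5], [1.1, 2.1, 3.1, 4.1, 5.1]]).T
y = np.array([4, 8, 12, 16, 20], dtype=float)

w = lasso_cd(X, y)
print(np.round(w, 6))
```

Because the columns are visited in order, X1 absorbs the signal first and X2 never gets a chance to enter, which is why the result depends on column order rather than on any real difference between the features.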

# Second Case 

In [8]:
# X2 is X1 doubled: still perfectly correlated, but on a larger scale
X1 = [1, 2, 3, 4, 5]
X2 = [2, 4, 6, 8, 10]
y = [4, 8, 12, 16, 20]  # y = 4 * X1 = 2 * X2


In [9]:
from sklearn.linear_model import LinearRegression

X = np.array([X1]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[4.]


In [10]:
from sklearn.linear_model import LinearRegression

X = np.array([X2]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[2.]


In [11]:
from sklearn.linear_model import LinearRegression

# Fit on both features at once; X2 is an exact multiple of X1
X = np.array([X1, X2]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)

[0.8 1.6]


In [12]:
# Lasso with the default penalty strength, written out explicitly
model = Lasso(alpha=1.0)

In [13]:
model.fit(X, y)

print(model.coef_)


[0.    1.875]
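Here Lasso keeps X2 and drops X1 even though both describe the target equally well. One way to see why: several weight vectors reproduce y exactly, and the L1 penalty favors the one with the smallest sum of absolute weights. Because X2 has twice the scale of X1, it needs half the coefficient. The sketch below compares three exact fits and then checks that the two columns become identical once standardized; the candidate weight vectors are chosen for illustration.

```python
import numpy as np

X1 = np.array([1, 2, 3, 4, 5], dtype=float)
X2 = 2 * X1
y = 4 * X1
X = np.column_stack([X1, X2])

# Three weight vectors that all reproduce y exactly.
candidates = {
    "only X1": np.array([4.0, 0.0]),
    "only X2": np.array([0.0, 2.0]),
    "OLS split": np.array([0.8, 1.6]),
}

l1_norms = {}
for name, w in candidates.items():
    max_error = float(np.abs(y - X @ w).max())
    l1_norms[name] = float(np.abs(w).sum())
    print(f"{name:10s} max error: {max_error:.2e}  L1 norm: {l1_norms[name]:.2f}")

# After standardizing, the two columns carry exactly the same information.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
same_after_scaling = bool(np.allclose(Xs[:, 0], Xs[:, 1]))
print("identical after standardizing:", same_after_scaling)
```

The preference for X2 is therefore a consequence of its units, not of any greater importance, which is one reason features are commonly put on a comparable scale before an L1 penalty is applied.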


# Third Case

In [14]:
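# X1 and X2 are perfectly correlated with each other but barely related to y;
# X3 alone determines the target
X1 = [15, 2, 4, 11, 9]
X2 = [15.1, 2.1, 4.1, 11.1, 9.1]
X3 = [2, 4, 6, 8, 10]
y = [4, 8, 12, 16, 20]  # y = 2 * X3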

In [15]:
from sklearn.linear_model import LinearRegression

X = np.array([X1]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[-0.10830325]


In [16]:
from sklearn.linear_model import LinearRegression

X = np.array([X2]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[-0.10830325]


In [17]:
from sklearn.linear_model import LinearRegression

X = np.array([X3]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[2.]


In [18]:
from sklearn.linear_model import LinearRegression

# Fit on all three features; only X3 should receive a meaningful coefficient
X = np.array([X1, X2, X3]).T
y = np.array(y)

model = LinearRegression()
model.fit(X, y)

print(model.coef_)


[ 2.60910315e-17 -1.24900090e-16  2.00000000e+00]


In [19]:
# Lasso with the default penalty strength, written out explicitly
model = Lasso(alpha=1.0)

In [20]:
model.fit(X, y)

print(model.coef_)


[-0.    -0.     1.875]
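In this case dropping X1 and X2 costs nothing, and a correlation matrix makes that visible before any model is fitted. The sketch below uses `np.corrcoef` on the same data; the row and column order (X1, X2, X3, y) is a choice made here for illustration.

```python
import numpy as np

X1 = np.array([15, 2, 4, 11, 9], dtype=float)
X2 = X1 + 0.1
X3 = np.array([2, 4, 6, 8, 10], dtype=float)
y = np.array([4, 8, 12, 16, 20], dtype=float)

# Each row passed to corrcoef is treated as one variable.
names = ["X1", "X2", "X3", "y"]
corr = np.corrcoef(np.vstack([X1, X2, X3, y]))

for name, row in zip(names, corr):
    print(f"{name:3s}", np.round(row, 3))

corr_x1_x2 = corr[0, 1]  # correlation between the two redundant features
corr_x1_y = corr[0, 3]   # correlation of X1 with the target
corr_x3_y = corr[2, 3]   # correlation of X3 with the target
```

X1 and X2 are perfectly correlated with each other but only weakly correlated with y, while X3 is perfectly correlated with y, which matches the coefficients Lasso returns.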
