<a href="https://colab.research.google.com/github/yachika-yashu/Machine-learning/blob/main/Multiple_linear_regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import pandas as pd
import numpy as np

import plotly.express as px
import plotly.graph_objects as go

from sklearn.datasets import make_regression

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error

make_regression generates synthetic datasets for regression problems. The arguments used below are:

n_samples=1000: 1000 data points (rows).

n_features=2: Each sample has 2 features (columns in X).

n_informative=2: Both features are used to generate the target y.

n_targets=1: A single target variable y.

noise=50: Standard deviation of the Gaussian noise added to the output y, simulating real-world variability.
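To see what these arguments do, here is a small sketch that also asks make_regression for the coefficients it used (coef=True). The random_state value and the names ending in _demo are assumptions added for illustration; the notebook itself does not fix a seed, so its numbers will differ.

```python
import numpy as np
from sklearn.datasets import make_regression

# Assumption: random_state=0 is added only so this sketch is reproducible.
common = dict(n_samples=1000, n_features=2, n_informative=2,
              n_targets=1, coef=True, random_state=0)

# With noise=0 the target is an exact linear combination of the features.
X_clean, y_clean, coef_clean = make_regression(noise=0.0, **common)
max_residual = np.abs(y_clean - X_clean @ coef_clean).max()

# With noise=50 the residual around the true plane has a spread close to 50.
X_demo, y_demo, coef_demo = make_regression(noise=50, **common)
noise_std = np.std(y_demo - X_demo @ coef_demo)

print("X shape:", X_demo.shape, "y shape:", y_demo.shape)
print("true coefficients:", coef_demo)
print("max residual without noise:", max_residual)
print("residual std with noise=50:", noise_std)
```

The residual spread in the noisy case is an estimate from 1000 samples, so it lands near 50 rather than exactly on it.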

In [2]:
X, y = make_regression(n_samples=1000, n_features=2, n_informative=2, n_targets=1, noise=50)
X

array([[ 1.78540633,  0.36232205],
       [-0.72215409, -2.55738654],
       [-0.05771018,  0.53923004],
       ...,
       [ 0.27809948, -0.57821425],
       [-0.1432113 ,  0.82806844],
       [-0.43394264,  0.40788954]])

In [3]:
df=pd.DataFrame({'feature1':X[:,0],'feature2':X[:,1],'target':y})
df.head()

| | feature1 | feature2 | target |
|---|---|---|---|
| 0 | 1.785406 | 0.362322 | 194.377853 |
| 1 | -0.722154 | -2.557387 | -99.893369 |
| 2 | -0.05771 | 0.53923 | 21.615778 |
| 3 | -1.188319 | 0.885558 | -172.996633 |
| 4 | -1.988289 | -1.34492 | -143.060614 |


In [4]:
fig=px.scatter_3d(df,x='feature1',y='feature2',z='target',template='plotly_dark')
fig.show()

# **Regression starts here**

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
lr=LinearRegression()
lr.fit(X_train,y_train)

In [6]:
y_pred = lr.predict(X_test)

In [7]:
print("MAE",mean_absolute_error(y_test,y_pred))
print("MSE",mean_squared_error(y_test,y_pred))
print("R2 score",r2_score(y_test,y_pred))

MAE 38.93995457319874
MSE 2282.280309276649
R2 score 0.7628640248665736
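The three metrics can be reproduced by hand, which makes the definitions concrete. The sketch below uses a tiny made-up pair of arrays (y_true and y_hat are hypothetical, not the notebook's data) and plain NumPy. It also takes the square root of the MSE printed above: the result is about 47.8, which is in the same range as the noise=50 used to generate the data.

```python
import numpy as np

# Hypothetical toy values, chosen only to keep the arithmetic easy to follow.
y_true = np.array([3.0, -1.0, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

err = y_true - y_hat

# MAE: mean of absolute errors.
mae = np.mean(np.abs(err))

# MSE: mean of squared errors.
mse = np.mean(err ** 2)

# R2: 1 minus (residual sum of squares / total sum of squares around the mean).
ss_res = np.sum(err ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# RMSE for the MSE value printed by the notebook, in the units of the target.
rmse_notebook = np.sqrt(2282.280309276649)

print("MAE", mae)
print("MSE", mse)
print("R2 score", r2)
print("RMSE of the notebook run", rmse_notebook)
```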


Next, build a prediction grid with numpy.meshgrid and evaluate the trained regression model lr at every point of that 2D grid, so the fitted plane can be drawn as a surface.


In [8]:
# 10 equally spaced points between -5 and 5 along each feature axis.
# Note: y here overwrites the target array y created in In [2].
x = np.linspace(-5, 5, 10)
y = np.linspace(-5, 5, 10)

# 2D coordinate matrices: xGrid and yGrid are both 10x10 arrays holding the
# first and second coordinate of every grid point.
xGrid, yGrid = np.meshgrid(y, x)

# Flatten each grid to length 100, stack to shape (2, 100), then transpose to
# (100, 2): one row per grid point, e.g. [-5.0, -5.0], [-3.889, -5.0], ...
final = np.vstack((xGrid.ravel().reshape(1,100),yGrid.ravel().reshape(1,100))).T

# Predict at every grid point and reshape back to 10x10 to match
# xGrid/yGrid, the layout that surface and contour plots expect.
z_final = lr.predict(final).reshape(10,10)

z = z_final
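The meshgrid, ravel, and reshape steps are easier to follow on a grid small enough to print. This is a minimal sketch with a 3x2 grid and a stand-in linear function; the weights w and bias are hypothetical and take the place of the notebook's fitted lr.

```python
import numpy as np

a = np.linspace(0, 2, 3)     # 3 values for the first coordinate: 0, 1, 2
b = np.linspace(10, 11, 2)   # 2 values for the second coordinate: 10, 11

# Both grids have shape (2, 3): rows follow b, columns follow a.
aGrid, bGrid = np.meshgrid(a, b)

# One row per grid point, shape (6, 2); equivalent to vstack(...).T.
points = np.column_stack((aGrid.ravel(), bGrid.ravel()))

# Stand-in for lr.predict: a linear function with made-up weights.
w = np.array([2.0, -1.0])
bias = 0.5
z = (points @ w + bias).reshape(aGrid.shape)

print(aGrid)
print(bGrid)
print(points)
print(z)
```

Because the rows of z follow the second coordinate and the columns follow the first, z[i, j] is the prediction at (a[j], b[i]). Plotly's go.Surface uses the same convention when it is given 1D x and y with a 2D z.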


In [9]:
fig = px.scatter_3d(df, x='feature1', y='feature2', z='target')

fig.add_trace(go.Surface(x = x, y = y, z =z ))

fig.show()

In [10]:
lr.coef_       # not displayed: a notebook cell echoes only its last expression
lr.intercept_

np.float64(1.9809615366351498)
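The coefficients and the intercept together define the fitted plane: prediction = intercept + coef[0] * feature1 + coef[1] * feature2. The sketch below checks this on separately generated toy data (the true weights 3, -2 and offset 1 are assumptions chosen for illustration) and compares the result with an ordinary least-squares solve from NumPy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: a known plane plus a little Gaussian noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = (3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 1.0
          + rng.normal(scale=0.1, size=200))

model = LinearRegression().fit(X_demo, y_demo)

# predict() is the plane equation applied row by row.
manual = X_demo @ model.coef_ + model.intercept_
same_as_predict = np.allclose(manual, model.predict(X_demo))

# The same fit via least squares on a design matrix with a column of ones.
A = np.column_stack((np.ones(len(X_demo)), X_demo))
beta, *_ = np.linalg.lstsq(A, y_demo, rcond=None)

print("coef_:", model.coef_, "intercept_:", model.intercept_)
print("lstsq [intercept, coef...]:", beta)
print("manual plane equals predict():", same_as_predict)
```

Printing both attributes, as done here, also avoids the situation in the cell above where only the intercept is shown.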