# **Multiple Regression**

Multiple regression is a statistical method for modeling the relationship between multiple independent variables and a dependent variable. In multiple regression, the goal is to predict the value of the dependent variable based on the values of the independent variables.

The basic equation for multiple regression is given by:

`y = b0 + b1 * x1 + b2 * x2 + ... + bn * xn`

where y is the dependent variable, b0 is the intercept, x1, x2, ..., xn are the independent variables, and b1, b2, ..., bn are the coefficients that represent the effect of each independent variable on the dependent variable.

The coefficients are usually estimated by ordinary least squares (OLS): choose the coefficients that minimize the sum of the squared differences between the predicted and actual values of the dependent variable.
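As a minimal sketch of the least squares idea, the snippet below fits a two-variable model with NumPy alone. The toy data is hypothetical (it is not taken from any dataset used here) and is generated without noise from `y = 1 + 2 * x1 + 3 * x2`, so the recovered coefficients should come out very close to 1, 2 and 3.

```python
import numpy as np

# Hypothetical toy data: 5 observations of 2 independent variables.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])

# Dependent variable built from known coefficients (assumption for illustration).
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the first fitted coefficient is the intercept b0.
X_design = np.column_stack([np.ones(len(X)), X])

# Find the coefficients that minimize the sum of squared residuals.
coef, _, _, _ = np.linalg.lstsq(X_design, y, rcond=None)

# Sum of squared differences between predicted and actual values.
sse = np.sum((y - X_design @ coef) ** 2)

print("b0, b1, b2 =", coef)
print("sum of squared errors =", sse)
```

With real, noisy data the sum of squared errors would not be close to zero; the fit only guarantees that no other choice of coefficients makes it smaller.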

Once the coefficients are estimated, the multiple regression model can be used to make predictions for new data by plugging in values for the independent variables. The predictions can be used for a variety of purposes, such as understanding the relationship between the independent variables and the dependent variable, making decisions based on the independent variables, and testing hypotheses about the relationships between the variables.

In [2]:
from sklearn import linear_model
import pandas as pd

In [3]:
data_path = '/content/drive/MyDrive/Machine_Learning/vgsales.csv'

In [4]:
# Load the video game sales dataset into a DataFrame
data = pd.read_csv(data_path)

In [6]:
data.columns

Index(['Rank', 'Name', 'Platform', 'Year', 'Genre', 'Publisher', 'NA_Sales',
       'EU_Sales', 'JP_Sales', 'Other_Sales', 'Global_Sales'],
      dtype='object')

In [7]:
# Select the independent variables (x) and the dependent variable (y)

x = data[['NA_Sales','EU_Sales', 'JP_Sales', 'Other_Sales']]
y = data['Global_Sales']

From sklearn's `linear_model` module we use the `LinearRegression` class to create a linear regression object.

The object's `fit()` method takes the independent and dependent values as arguments and stores the estimated coefficients and intercept on the object:

In [11]:
regr = linear_model.LinearRegression()

regr.fit(x, y)

LinearRegression()
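To make it clearer what `fit()` stores, here is a small self-contained sketch on made-up data. The DataFrame `toy`, its column names `a`, `b` and `total`, and the variable `toy_model` are all hypothetical and exist only for this illustration. Because `total` is constructed as `1 + 2 * a + 3 * b`, the fitted parameters can be compared against known values.

```python
import pandas as pd
from sklearn import linear_model

# Hypothetical toy data (not from vgsales.csv).
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 1.0, 4.0, 3.0, 5.0],
})
# Dependent variable built from known coefficients (assumption for illustration).
toy["total"] = 1.0 + 2.0 * toy["a"] + 3.0 * toy["b"]

toy_model = linear_model.LinearRegression()
toy_model.fit(toy[["a", "b"]], toy["total"])

# After fit(), the learned parameters are available as attributes.
print(toy_model.coef_)       # one coefficient per column, in column order
print(toy_model.intercept_)  # the constant term b0

# score() returns R^2; here it is computed on the training data itself.
r_squared = toy_model.score(toy[["a", "b"]], toy["total"])
print(r_squared)
```

The coefficients in `coef_` follow the column order passed to `fit()`, which is worth keeping in mind when reading them back.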

In [9]:
# Get the coefficients of the equation

print(regr.coef_)

[0.99994058 0.99998758 0.99988382 0.99958749]


The intercept:

In [15]:
regr.intercept_

0.0003229496920061914

In [16]:
# get the intercept and coefficients
b0 = regr.intercept_
b = regr.coef_

The equation:


In [26]:
# Print the fitted regression equation (one term per independent variable)
print("y = {:.8f} + {:.8f} * x1 + {:.8f} * x2 + {:.8f} * x3 + {:.8f} * x4".format(b0, b[0], b[1], b[2], b[3]))

y = 0.00032295 + 0.99994058 * x1 + 0.99998758 * x2 + 0.99988382 * x3 + 0.99958749 * x4


Predict global sales when NA, EU, JP and Other sales are all 50:

In [13]:
predicted_global_sales = regr.predict([[50, 50, 50, 50]])

print(predicted_global_sales)

[199.97029652]




In [14]:
predicted_global_sales

array([199.97029652])
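As a sanity check, the same prediction can be reproduced by hand from the regression equation. The sketch below copies the intercept and the coefficients from the printed outputs above; since the printed coefficients are rounded to 8 decimal places, the result may differ from `regr.predict` in the last few digits.

```python
import numpy as np

# Values copied from the printed outputs (coefficients are rounded as displayed).
b0 = 0.0003229496920061914
b = np.array([0.99994058, 0.99998758, 0.99988382, 0.99958749])

# NA, EU, JP and Other sales all set to 50.
x_new = np.array([50, 50, 50, 50])

# y = b0 + b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4
manual_prediction = b0 + b @ x_new
print(manual_prediction)
```

The result is close to 200, i.e. 4 × 50. With every coefficient near 1 and an intercept near 0, this suggests the model has learned that `Global_Sales` is roughly the sum of the four regional columns.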