Multiple Linear Regression is one of the important regression algorithms which models the linear relationship between a single dependent continuous variable and more than one independent variable.
o For MLR, the dependent or target variable(Y) must be the continuous/real, but the predictor or independent variable may be of continuous or categorical form.
o Each feature variable must model the linear relationship with the dependent variable.
o MLR tries to fit a regression line through a multidimensional space of data-points.
o A linear relationship should exist between the Target and predictor variables.
o The regression residuals must be normally distributed.
o MLR assumes little or no multicollinearity (correlation between the independent variable) in data.
This particular dataset holds data from 50 startups in New York, California, and Florida. The features in this dataset are R&D spending, Administration Spending, Marketing Spending, and location features, while the target variable is: Profit.
- R&D spending: The amount which startups are spending on Research and development.
- Administration spending: The amount which startups are spending on the Admin panel.
- Marketing spending: The amount which startups are spending on marketing strategies. f
- State: To which state that particular startup belongs.
- Profit: How much profit that particular startup is making.
Numpy
Pandas
Seaborn
Matplotlib
Scikit learn