In this notebook, we learn how to use scikit-learn to implement simple linear regression. We download a dataset on the fuel consumption and carbon dioxide emissions of cars. Then we split the data into training and test sets, create a model using the training set, evaluate the model using the test set, and finally use the model to predict an unknown value.
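As a rough preview of that workflow, the sketch below runs the four steps (split, fit, evaluate, predict) on synthetic data. The numbers are generated, not taken from the fuel consumption dataset. The slope of 39, the intercept of 125, the noise level, and all variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real dataset): one feature, one target.
rng = np.random.default_rng(0)
engine_size = rng.uniform(1.0, 8.0, size=200).reshape(-1, 1)  # shape (200, 1)
co2 = 125.0 + 39.0 * engine_size.ravel() + rng.normal(0.0, 10.0, size=200)

# 1. Split the data into training (80%) and test (20%) sets.
X_train, X_test, y_train, y_test = train_test_split(
    engine_size, co2, test_size=0.2, random_state=42
)

# 2. Create the model using the training set only.
model = LinearRegression()
model.fit(X_train, y_train)

# 3. Evaluate the model on the held-out test set.
y_pred = model.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_mae = np.mean(np.abs(y_test - y_pred))

# 4. Use the model to predict the target for an unseen feature value.
predicted_co2 = model.predict(np.array([[3.5]]))[0]

print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"test R^2={test_r2:.3f}, test MAE={test_mae:.2f}")
print(f"prediction for 3.5: {predicted_co2:.1f}")
```

Note that scikit-learn expects the feature array to be two-dimensional, which is why the single feature is reshaped to a column before splitting.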
Table of contents:
* Understanding the Data
* Reading the data in
* Data Exploration
* Simple Regression Model
We have downloaded a fuel consumption dataset, FuelConsumption.csv, which contains model-specific fuel consumption ratings and estimated carbon dioxide emissions for new light-duty vehicles for retail sale in Canada.
1. MODELYEAR e.g. 2014
2. MAKE e.g. Acura
3. MODEL e.g. ILX
4. VEHICLE CLASS e.g. SUV
5. ENGINE SIZE e.g. 4.7
6. CYLINDERS e.g. 6
7. TRANSMISSION e.g. A6
8. FUEL CONSUMPTION in CITY (L/100 km) e.g. 9.9
9. FUEL CONSUMPTION in HWY (L/100 km) e.g. 8.9
10. FUEL CONSUMPTION COMB (L/100 km) e.g. 9.2
11. CO2 EMISSIONS (g/km) e.g. 182 (low values approach 0)
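To check that a local copy of the file contains the fields listed above, a loader along the following lines could be used. This is a sketch only: the helper name `load_fuel_data` is hypothetical, and the default column names `ENGINESIZE` and `CO2EMISSIONS` are assumptions about how the CSV header spells those fields, so adjust them to match the actual file.

```python
import pandas as pd


def load_fuel_data(path, feature="ENGINESIZE", target="CO2EMISSIONS"):
    """Read the CSV at `path` and return only the feature and target columns.

    The default column names are assumptions; compare them against the
    header of the actual file before relying on them.
    """
    df = pd.read_csv(path)

    # Fail early with a clear message if the expected columns are absent.
    missing = [col for col in (feature, target) if col not in df.columns]
    if missing:
        raise KeyError(f"columns not found in {path}: {missing}")

    # Keep the two columns of interest and drop rows with missing values.
    return df[[feature, target]].dropna()
```

With the file in the working directory, `cdf = load_fuel_data("FuelConsumption.csv")` followed by `cdf.head()` and `cdf.describe()` gives a quick look at the two columns before any modelling.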