This code is a simple example of using Linear Regression to predict car prices based on various features. It takes advantage of Python's powerful libraries like pandas, matplotlib, and scikit-learn. In this README, we will explain the code step by step and provide a clear understanding of what it does.
Before running the code, ensure that you have the required libraries installed. You can install them using pip:
pip install pandas scikit-learn matplotlib
The code reads car data from a CSV file named "car_data.csv" using the pandas library. It then drops the "Owner" column, as it is not used for prediction. The target variable (dependent variable) is the "Selling_Price," and the features (independent variables) are all other columns in the dataset.
The code performs some data preprocessing steps:
-
Encoding Categorical Variables:
The code uses LabelEncoder from scikit-learn to convert categorical variables like "Car_Name," "Transmission," "Selling_type," and "Fuel_Type" into numerical values. This is necessary as machine learning models require numerical input.
-
Splitting Data:
The dataset is split into training and testing sets using train_test_split from scikit-learn. It allocates 70% of the data for training and 30% for testing, ensuring that the data is randomly shuffled for better model generalization.
The code uses a simple linear regression model to predict car prices. The steps for building the model are as follows:
- Import the LinearRegression class from scikit-learn.
- Create a LinearRegression model (regressor).
- Fit the model to the training data using the fit() method.
The code uses the trained model to make predictions on the test data. It uses the predict() method on the model and stores the predictions in the "Y_predict" variable.
The code calculates the accuracy of the model using the score() method. This method calculates the coefficient of determination (R-squared) for the model's predictions. The R-squared value measures how well the model fits the data, with higher values indicating a better fit.
This code showcases a basic example of using linear regression to predict car prices. It provides a foundation for more complex machine learning projects and can be extended to include additional features or explore different regression techniques.