Since our task is to classify the price range of mobile phones rather than to predict their actual prices, I am going to train a classification model that assigns each phone to one of four price ranges:

0 (low cost)
1 (medium cost)
2 (high cost)
3 (very high cost)
I will start the task of mobile price classification with machine learning by importing the necessary Python libraries and the dataset:

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
sns.set()

data = pd.read_csv("mobile_prices.csv")
print(data.head())
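
Before going further, it can help to confirm the basic facts about the dataset: its shape, whether any values are missing, and whether every column is numeric. The snippet below is a minimal sketch of such a check. The `summarize` helper and the three-row `demo` frame (including its column names and values) are hypothetical stand-ins so the example runs on its own; in the notebook you would pass the loaded `data` instead.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect the shape, total missing values and non-numeric columns of a frame."""
    return {
        "shape": df.shape,
        "missing": int(df.isnull().sum().sum()),
        "non_numeric": list(df.select_dtypes(exclude="number").columns),
    }


# Hypothetical stand-in for the real dataset; replace `demo` with `data`.
demo = pd.DataFrame({
    "battery_power": [842, 1021, 563],
    "ram": [2549, 2631, 2603],
    "price_range": [1, 2, 2],
})

summary = summarize(demo)
print(summary)
```

An empty `non_numeric` list together with a `missing` count of zero would mean the data can go to the model without imputation or encoding.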

The dataset contains 21 columns and has no missing values, so we can move on to training the model. Before that, let’s take a look at the correlation between the features in the dataset:

In [None]:
# Draw a heatmap of the pairwise correlations between all columns
plt.figure(figsize=(12, 10))
sns.heatmap(data.corr(), annot=True, cmap="coolwarm", linecolor='white', linewidths=1)
plt.show()
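
A heatmap with this many columns can be hard to read, so it may be easier to list how strongly each feature correlates with the target. The sketch below assumes, as the rest of this article does, that the target is the last column. The `rank_by_target_correlation` helper and the synthetic `demo` frame are hypothetical and only serve to make the example self-contained; in the notebook you would call the helper on `data`.

```python
import numpy as np
import pandas as pd


def rank_by_target_correlation(df: pd.DataFrame) -> pd.Series:
    """Correlation of every feature with the last column, strongest first."""
    target = df.columns[-1]
    corr = df.corr()[target].drop(target)
    # Order by absolute value so strong negative correlations are not buried
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


# Synthetic stand-in: feature_a follows the target closely, feature_b is noise.
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
demo = pd.DataFrame({
    "feature_a": signal + rng.normal(scale=0.1, size=n),
    "feature_b": rng.normal(size=n),
    "target": signal,
})

ranking = rank_by_target_correlation(demo)
print(ranking)
```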

Data Preparation
All the features in this dataset are numeric, so there are no categorical features to encode. Before training a model, though, it is important to standardize the features, because Logistic Regression tends to converge more reliably when the inputs are on a similar scale, and to split the data into training and testing sets.

So let’s standardize the dataset and divide the data into 80% training and 20% testing:

In [None]:
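# Features are every column except the last; the last column is the target
x = data.iloc[:, :-1].values
y = data.iloc[:, -1].values
# Standardize each feature to zero mean and unit variance
x = StandardScaler().fit_transform(x)
# Hold out 20% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.20, random_state=0)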
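
One thing to note about the code above is that the scaler is fitted on the whole dataset before the split, so the test rows contribute to the scaling statistics. A common variation is to split first and let a scikit-learn `Pipeline` fit the scaler on the training rows only. The sketch below shows that variation on synthetic data from `make_classification`, which is only a stand-in for the mobile dataset; `stratify=y` and `max_iter=1000` are my own choices here and are not part of the original setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic four-class stand-in for the mobile price data (assumption)
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, n_classes=4, random_state=0
)

# Split first, keeping the class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0, stratify=y
)

# The pipeline fits the scaler on the training rows only,
# then reuses those statistics when scoring the test rows
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy on the synthetic data: {test_accuracy:.3f}")
```

The accuracy printed here says nothing about the mobile dataset; the point is only the order of operations.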

Now let’s train the mobile price classification model using Python. As this is a problem of classification, I will be using the Logistic Regression algorithm provided by Scikit-learn:

In [None]:
# LogisticRegression was already imported at the top of the notebook
lreg = LogisticRegression()
# Fit on the training split, then predict the price range of the test phones
lreg.fit(x_train, y_train)
y_pred = lreg.predict(x_test)

Now let’s have a look at the accuracy of the model:

In [None]:
# Share of test phones whose price range was predicted correctly, in percent
accuracy = accuracy_score(y_test, y_pred) * 100
print(f"Accuracy of the Logistic Regression Model: {accuracy:.2f}%")

The model reaches an accuracy of about 95.5% on the test set, which is a good result. Now let’s have a look at the predictions made by the model:

In [None]:
print(y_pred)

The above output shows the price range predicted by the model for each phone in the test set. Let’s have a look at the number of mobile phones classified into each price range:

In [None]:
(unique, counts) = np.unique(y_pred, return_counts=True)
price_range = np.asarray((unique, counts)).T
print(price_range)
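
The counts above only describe the predictions. To see which price ranges get confused with each other, the predictions can be compared with the true labels in a confusion matrix. Here is a minimal sketch using scikit-learn's `confusion_matrix`. The two small arrays are made-up labels so the example runs on its own; in the notebook you would pass `y_test` and `y_pred` instead.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration; replace with y_test and y_pred
y_true_demo = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred_demo = np.array([0, 0, 1, 2, 2, 2, 3, 2])

# Rows are the true price ranges, columns are the predicted ones
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2, 3])
print(cm)

# Share of each true price range that was predicted correctly
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(per_class_recall)
```

Off-diagonal entries that sit next to the diagonal mean the model confuses neighbouring price ranges, which is the kind of mistake you would expect to see most often in this task.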