Hariharanm95/Telecommunication-visualisation


Telecommunication-visualisation

INTRODUCTION:

Customer churn means a customer switching from one service provider to a competitor in the market. ... Telecom service providers work very hard to survive this competition. Because acquiring new customers has proved to be much costlier than keeping existing ones, providers focus on retaining the customers they already have.

UNDERSTAND MORE ABOUT THE DATA:

Breakdown of Our Features:

• State: name of the state (51 unique values).
• Account Length: length of the account.
• Area Code: area code number; each area code covers several states.
• International Plan: Yes indicates the customer has an international plan; No indicates no subscription.
• Voice Mail Plan: Yes indicates the customer has a voice mail plan; No indicates no subscription.
• Number vmail messages: number of voice mail messages, ranging from 0 to 50.
• Total day calls: total number of calls made during the day.
• Total day charge: total charge to the customer for day calls.
• Total eve calls: total number of calls made in the evening.
• Total eve charge: total charge to the customer for evening calls.
• Total night calls: total number of calls made at night.
• Total night charge: total charge to the customer for night calls.
• Customer service calls: number of customer service calls made by the customer.
• Churn: customer churn; True means a churned customer, False means a retained customer.

CHECKING FOR MISSING AND DUPLICATE VALUES

image

As of now, there are 3333 rows and 17 columns in the above dataset, with the following data types:
• 1 Boolean data type, i.e. churn
• 8 float data type
• 8 integer data type
• 3 object data type, i.e. categorical values
• There are some missing values present.
• There are no duplicate values present.
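The checks above can be sketched with pandas. This is a minimal illustration on a made-up four-row frame, not the project's actual notebook; the column names follow the feature list, and the values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the real dataset has 3333 rows.
df = pd.DataFrame({
    "State": ["CA", "NJ", "TX", "CA"],
    "International plan": ["No", "Yes", "No", "No"],
    "Customer service calls": [1, 4, None, 1],
    "Churn": [False, True, False, False],
})

print(df.shape)   # (rows, columns)
print(df.dtypes)  # data type of each column

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Number of rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())

print(missing_per_column)
print(duplicate_rows)
```

In this toy frame the last row repeats the first one, so one duplicate is reported, and one value is missing in "Customer service calls".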

EXPLORATORY DATA ANALYSIS OF THE DATA SET

ANALYSING THE DEPENDENT VARIABLE, I.E. 'CHURN'

image image image

After analysing the churn column, we can say that almost 15% of customers have churned. Let's see what the other features tell us and how they relate to churn.

ANALYSING STATE COLUMN

image image image

There are 51 unique states, each with a different churn rate.

From the above analysis, CA, NJ, TX, MD, SC, and MI have the highest churn rates, at more than 21%.

The high churn rate in these states may be due to low cellular network coverage.
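A per-state churn rate like the one described above can be computed with a group-by. The sketch below uses a small hypothetical sample (the values are made up and do not reproduce the real per-state figures); the 21% cut-off is the one quoted in the text.

```python
import pandas as pd

# Hypothetical sample; True means the customer churned.
df = pd.DataFrame({
    "State": ["CA", "CA", "NJ", "NJ", "NJ", "OH", "OH", "OH", "OH", "OH"],
    "Churn": [True, False, True, True, False, False, False, False, False, True],
})

# The mean of a Boolean column is the share of True values, i.e. the churn rate.
churn_rate_pct = (
    df.groupby("State")["Churn"].mean().mul(100).sort_values(ascending=False)
)

# States whose churn rate exceeds 21%.
high_churn_states = churn_rate_pct[churn_rate_pct > 21].index.tolist()

print(churn_rate_pct.round(1))
print(high_churn_states)
```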

ANALYSING "INTERNATIONAL PLAN" COLUMN

image image

From the above data we get:
• There are 3010 customers who don't have an international plan.
• There are 323 customers who have an international plan.
• Among those who have an international plan, 42.4% churn.
• Among those who don't have an international plan, only 11.4% churn.

So the customers who bought an international plan are churning at a much higher rate, probably because of connectivity issues or high call charges.
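As a quick consistency check, the two group churn rates quoted above can be combined into an overall churn rate. The snippet uses only the figures from the text; the variable names are illustrative.

```python
# Figures quoted in the text.
customers_without_plan = 3010
customers_with_plan = 323
churn_rate_without_plan = 0.114
churn_rate_with_plan = 0.424

total_customers = customers_without_plan + customers_with_plan

# Expected number of churned customers, weighting each group by its size.
expected_churned = (
    customers_without_plan * churn_rate_without_plan
    + customers_with_plan * churn_rate_with_plan
)

overall_churn_rate = expected_churned / total_customers

print(total_customers)              # 3333
print(f"{overall_churn_rate:.1%}")  # 14.4%
```

The result, about 14.4%, agrees with the "almost 15%" overall churn reported for the churn column.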

ANALYSING "CUSTOMER SERVICE CALLS" COLUMN

image

The above analysis suggests that people tend to leave the operator mostly because of bad customer service. Customers who called the service centre 5 or more times have a churn rate higher than 60%. Customers who called only once also have a high churn rate, which indicates that their issue was not solved at the first attempt. The operator should therefore work to improve its customer service calls.

ANALYSING ALL CALLS, ALL CALLS CHARGE TOGETHER

These columns are numerical, so to analyse them against 'churn', which is categorical, we use the mean, the median, and box plots.

image image image image image image image image image

After analysing the above data, we noticed that the total day, evening, and night calls and charges do not appear to be a cause of churn. International call charges are high compared to the others, which is expected, but that may be a reason for international plan customers to churn.
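Comparing a numerical column across the two churn groups by mean and median can be sketched as follows. The (charge, churned) pairs are made up for illustration and say nothing about the real dataset; `summarize_by_churn` is a hypothetical helper name.

```python
from statistics import mean, median

# Hypothetical (charge, churned) pairs.
rows = [(30.1, False), (45.0, True), (28.4, False), (51.2, True), (33.0, False)]


def summarize_by_churn(rows):
    """Group values by churn flag and return the mean and median of each group."""
    groups = {}
    for value, churned in rows:
        groups.setdefault(churned, []).append(value)
    return {
        churned: {"mean": mean(values), "median": median(values)}
        for churned, values in groups.items()
    }


summary = summarize_by_churn(rows)
for churned, stats in summary.items():
    print(churned, stats)
```

When the mean and median of the two groups are close, the column is unlikely to separate churned from retained customers on its own; a box plot shows the same comparison visually.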

GRAPHICAL ANALYSIS

UNIVARIATE ANALYSIS

In univariate analysis we analyse one numerical column at a time. For this we use three types of plot: box plot, strip plot, and distribution plot.

image image image image image image image image image image image image

image image image image image image image image image image image image

image image image image image image

BIVARIATE ANALYSIS

In bivariate analysis we analyse two columns of the dataset together. Here we only take numerical columns, and for the visualization we use box plots and scatter plots.

image image image image

image

MULTIVARIATE ANALYSIS

In multivariate analysis we analyse more than two columns of the dataset together. For this we use a correlation plot, a correlation matrix, a correlation heatmap, and a pair plot.

image image image image
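A correlation matrix like the one used in the multivariate analysis holds the Pearson correlation of every pair of numerical columns. The sketch below computes it by hand on made-up columns; the column names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical columns; "charge" is an exact multiple of "minutes".
columns = {
    "minutes": [100.0, 150.0, 200.0, 250.0],
    "charge": [17.0, 25.5, 34.0, 42.5],
    "service_calls": [3, 1, 2, 0],
}

# Nested dict: corr_matrix[a][b] is the correlation between columns a and b.
corr_matrix = {
    a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns
}

for name, row in corr_matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

A heatmap is a colour-coded rendering of this matrix; values near 1 or -1 indicate strongly related columns, and values near 0 indicate little linear relationship.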

PROPOSED METHODS

LINEAR REGRESSION

Linear regression analysis is used to predict the value of a variable based on the value of another variable.

Formula: y = a + bx

Mean absolute error = 0.2415396741634622
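To make the formula y = a + bx concrete, here is a minimal least-squares fit in plain Python together with the mean absolute error of its predictions. The data points are made up, and this is an illustrative sketch rather than the code that produced the score above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates of intercept a and slope b in y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return a, b


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data points.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 10]

a, b = fit_line(xs, ys)
predictions = [a + b * x for x in xs]
mae = mean_absolute_error(ys, predictions)

print(round(a, 3), round(b, 3), round(mae, 3))
```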

LOGISTIC REGRESSION

Logistic regression is used to understand the relationship between the dependent variable and one or more independent variables by estimating probabilities using a logistic regression equation.

Accuracy = 0.8530734632683659
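The "logistic regression equation" passes a linear score through the logistic (sigmoid) function to obtain a probability. The sketch below shows that step for a single feature; the coefficients `a` and `b` are hypothetical values chosen for illustration, not fitted to the dataset.

```python
from math import exp


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


def churn_probability(x, a, b):
    """Estimated probability of churn for feature value x with linear score a + b*x."""
    return sigmoid(a + b * x)


def predict_churn(x, a, b, threshold=0.5):
    """Classify as churn when the estimated probability reaches the threshold."""
    return churn_probability(x, a, b) >= threshold


# Hypothetical coefficients; x could be the number of customer service calls.
a, b = -3.0, 0.8

for calls in (0, 2, 5):
    print(calls, round(churn_probability(calls, a, b), 3), predict_churn(calls, a, b))
```

Accuracy is then the share of customers whose predicted label matches the actual churn value.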

DECISION TREE

A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.

Mean absolute error = 0.2302893989957953

Accuracy = 0.8440779610194903

SUPPORT VECTOR MACHINE

“Support Vector Machine” (SVM) is a supervised machine learning algorithm that can be used for both classification and regression challenges. ... Support vectors are simply the coordinates of individual observations.

Mean absolute error = 0.4490695205336077

Accuracy = 0.8530734632683659

![image](https://user-images.githubusercontent.com/100566501/177751012-c31098ea-6889-49a0-ab5b-e501b9376f08.png)

NAIVE BAYES

The Naive Bayes classification algorithm is a probabilistic classifier. It is based on probability models that incorporate strong independence assumptions.

P(H|X) = P(X|H) P(H) / P(X)

Accuracy = 0.8380809595202399
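Bayes' theorem, which Naive Bayes builds on, can be illustrated with the international plan figures reported earlier in this README. Here H is "the customer has an international plan" and X is "the customer churned"; this choice of H and X is ours, and the snippet applies the theorem once rather than implementing a full Naive Bayes classifier.

```python
# Figures reported in the international plan analysis.
customers_without_plan = 3010
customers_with_plan = 323
total_customers = customers_without_plan + customers_with_plan

p_h = customers_with_plan / total_customers  # P(H): has an international plan
p_x_given_h = 0.424                          # P(X|H): churn rate with the plan
p_x_given_not_h = 0.114                      # P(X|not H): churn rate without the plan

# P(X) by the law of total probability.
p_x = p_x_given_h * p_h + p_x_given_not_h * (1 - p_h)

# Bayes' theorem: P(H|X) = P(X|H) P(H) / P(X).
p_h_given_x = p_x_given_h * p_h / p_x

print(f"P(international plan | churned) = {p_h_given_x:.3f}")
```

So although under 10% of customers have an international plan, they make up roughly 28.5% of the churned customers.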

RANDOM FOREST

Random forest builds decision trees on different samples and takes their majority vote for classification and their average in the case of regression.

Mean absolute error = 0.1399455059099384

Accuracy = 0.862134632683659
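The aggregation step of a random forest can be sketched in a few lines. The per-tree predictions below are hypothetical; a real forest would first train each tree on a different sample of the data.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Classification: return the label predicted by the most trees.

    On a tie, Counter.most_common keeps the label that was seen first.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


def average_prediction(tree_predictions):
    """Regression: return the mean of the trees' numeric predictions."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical outputs of five trees for one customer.
class_votes = [True, False, True, True, False]
numeric_outputs = [0.2, 0.4, 0.6, 0.3, 0.5]

print(majority_vote(class_votes))
print(round(average_prediction(numeric_outputs), 2))
```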

MEAN ABSOLUTE ERROR

image

In terms of mean absolute error, Random Forest gives the best (lowest) result.

ACCURACY

image

In terms of accuracy, Random Forest also gives the best (highest) result.
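The comparison can be reproduced from the scores reported in the method sections. The dictionaries below simply restate those numbers; the dictionary and variable names are illustrative.

```python
# Accuracy scores as reported for each classifier (higher is better).
accuracy = {
    "Logistic Regression": 0.8530734632683659,
    "Decision Tree": 0.8440779610194903,
    "Support Vector Machine": 0.8530734632683659,
    "Naive Bayes": 0.8380809595202399,
    "Random Forest": 0.862134632683659,
}

# Mean absolute error as reported (lower is better).
mean_absolute_error = {
    "Linear Regression": 0.2415396741634622,
    "Decision Tree": 0.2302893989957953,
    "Support Vector Machine": 0.4490695205336077,
    "Random Forest": 0.1399455059099384,
}

best_by_accuracy = max(accuracy, key=accuracy.get)
best_by_mae = min(mean_absolute_error, key=mean_absolute_error.get)

print(best_by_accuracy)  # Random Forest
print(best_by_mae)       # Random Forest
```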

RECOMMENDATIONS:

• Improve network coverage in the states with high churn.
• Offer discounted plans to customers on the international plan.
• Improve the voice mail quality, or take feedback from the customers.
• Improve the call centre service, take feedback from customers frequently about their issues, and try to solve them as soon as possible.

CONCLUSION:

After performing exploratory data analysis on the data set, this is what we have inferred from the data:

• Some states have a higher churn rate than others, which may be due to low network coverage.
• Area code and account length do not play any role in the churn rate, so they are redundant columns.
• Customers who have the international plan churn more, and the international calling charges are also high, so customers with the plan are likely unsatisfied because of network issues and high call charges.
• In the voice mail data, customers with more than 20 voice-mail messages churn, which suggests that the quality of voice mail is not good.
• Total day minutes, total day calls, total day charge, total eve minutes, total eve calls, total eve charge, total night minutes, total night calls, and total night charge did not play any role in the churn rate.
• The international calls data shows that the churn rate is high among customers who take the international plan, which suggests that international call charges are high and that there are call drops or network issues.
• The customer service calls data shows that the churn rate is high whenever an unsatisfied customer called the service centre, which suggests that the service centre did not resolve the customer's issue.

image
