# App User Segmentation
App user segmentation means grouping users based on how they engage with an application.

To solve this problem, we need data that describes how each user engages with the app.

In [21]:
#Import Libraries
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
import pandas as pd
pio.templates.default = "plotly_white"

In [22]:
user_behaviour_data = pd.read_csv('/kaggle/input/user-behaviour/user behaviour/userbehaviour.csv')

In [23]:
print(user_behaviour_data)

     userid  Average Screen Time  Average Spent on App (INR)  Left Review  \
0      1001                 17.0                       634.0            1   
1      1002                  0.0                        54.0            0   
2      1003                 37.0                       207.0            0   
3      1004                 32.0                       445.0            1   
4      1005                 45.0                       427.0            1   
..      ...                  ...                         ...          ...   
994    1995                 38.0                       938.0            0   
995    1996                 43.0                        61.0            0   
996    1997                 47.0                       761.0            0   
997    1998                  6.0                        73.0            1   
998    1999                  9.0                        54.0            0   

     Ratings  New Password Request  Last Visited Minutes       Status  
0  

In [24]:
user_behaviour_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 999 entries, 0 to 998
Data columns (total 8 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   userid                      999 non-null    int64  
 1   Average Screen Time         999 non-null    float64
 2   Average Spent on App (INR)  999 non-null    float64
 3   Left Review                 999 non-null    int64  
 4   Ratings                     999 non-null    int64  
 5   New Password Request        999 non-null    int64  
 6   Last Visited Minutes        999 non-null    int64  
 7   Status                      999 non-null    object 
dtypes: float64(2), int64(5), object(1)
memory usage: 62.6+ KB


Let’s start by looking at the highest, lowest, and average screen time of all the users:

In [25]:
print(f'Average Screen time {user_behaviour_data["Average Screen Time"].mean()}')
print(f'Highest Screen time {user_behaviour_data["Average Screen Time"].max()}')
print(f'Lowest Screen time {user_behaviour_data["Average Screen Time"].min()}')

Average Screen time 24.39039039039039
Highest Screen time 50.0
Lowest Screen time 0.0


Next, let's check the highest, lowest, and average amount spent by all the users:

In [26]:
print(f"Average amount spent on the App {user_behaviour_data['Average Spent on App (INR)'].mean()}")
print(f"Highest amount spent on the App {user_behaviour_data['Average Spent on App (INR)'].max()}")
print(f"Lowest amount spent on the App {user_behaviour_data['Average Spent on App (INR)'].min()}")

Average amount spent on the App 424.4154154154154
Highest amount spent on the App 998.0
Lowest amount spent on the App 0.0
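
The same three statistics can be computed for both columns in a single call. A minimal sketch using `DataFrame.agg`, run on only the first five rows printed earlier (so the numbers differ from the full-dataset values above); the `sample` and `summary` names are illustrative:

```python
import pandas as pd

# First five rows of the two columns, as shown in the printed dataset
sample = pd.DataFrame({
    "Average Screen Time": [17.0, 0.0, 37.0, 32.0, 45.0],
    "Average Spent on App (INR)": [634.0, 54.0, 207.0, 445.0, 427.0],
})

# One row per statistic, one column per feature
summary = sample.agg(["mean", "max", "min"])
print(summary)
```

On the full dataset, the same call would be applied to `user_behaviour_data[[...]]` with those two column names.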


Checking the relationship between the spending capacity and screen time of the active users and the users who have uninstalled the app:

In [27]:
fig = px.scatter(data_frame = user_behaviour_data, x='Average Screen Time',
                y='Average Spent on App (INR)',
                size = 'Average Spent on App (INR)', color = 'Status',
                title = 'Relationship Between Spending Capacity and Screentime', trendline = 'ols')
fig.show()

Users who uninstalled the app had an average screen time of less than 5 minutes a day,
and their average spend was less than 100 INR. We can also see a linear relationship between
the average screen time and the average spending of the users still using the app.
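
A reading like this can be cross-checked numerically by grouping on `Status`. The sketch below uses made-up rows, not the real dataset, and assumes the second status value is spelled `"Uninstalled"` (only `"Installed"` is visible in the output above):

```python
import pandas as pd

# Made-up rows for illustration; "Uninstalled" as a Status value is an assumption
sample = pd.DataFrame({
    "Average Screen Time": [17.0, 37.0, 45.0, 0.0, 3.0, 4.0],
    "Average Spent on App (INR)": [634.0, 207.0, 427.0, 54.0, 31.0, 73.0],
    "Status": ["Installed"] * 3 + ["Uninstalled"] * 3,
})

# Mean and maximum of both features for each status group
cols = ["Average Screen Time", "Average Spent on App (INR)"]
by_status = sample.groupby("Status")[cols].agg(["mean", "max"])
print(by_status)
```

If the observation holds on the real data, the uninstalled group's maximum values should stay below the thresholds quoted above.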
    

Now let’s have a look at the relationship between the ratings given by users and the average screen time:

In [28]:
fig = px.scatter(user_behaviour_data, x='Average Screen Time', y='Ratings', size='Ratings',
                 color = 'Status',
                 title='Relationship Between Ratings and Screentime', trendline = 'ols')
fig.show()

We can see that users who uninstalled the app gave it a rating of 5 at most.
Their screen time is also very low compared to users who gave higher ratings. This suggests that users who spend little time in the app tend to rate it low and uninstall it at some point.

# App User Segmentation to Find Retained and Lost Users

Now let's move on to segmenting the users to find those the app has retained and those it has lost.
I will be using the K-means clustering algorithm for this task:

In [29]:
clustering_data = user_behaviour_data[['Average Screen Time','Left Review','Ratings',
                                       'Last Visited Minutes','Average Spent on App (INR)','New Password Request']]

In [30]:
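from sklearn.preprocessing import MinMaxScaler
# Note: constructing MinMaxScaler does not scale anything. Without a call to
# fit_transform the data stays unscaled, so the clustering below (and the
# outputs shown) run on the raw feature values.
for i in clustering_data.columns:
    MinMaxScaler(i)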
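
The loop above only constructs scaler objects; it never fits them or transforms the data. A minimal sketch of how min-max scaling is usually applied with scikit-learn, shown on the first five rows of two columns from the printed dataset (the `features` and `scaled` names are illustrative). Applying this to the real `clustering_data` would likely change the cluster assignments shown below, since those were produced from unscaled values:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# First five rows of two feature columns, as printed earlier
features = pd.DataFrame({
    "Average Screen Time": [17.0, 0.0, 37.0, 32.0, 45.0],
    "Average Spent on App (INR)": [634.0, 54.0, 207.0, 445.0, 427.0],
})

# fit_transform learns each column's min and max, then rescales to [0, 1]
scaler = MinMaxScaler()
scaled = pd.DataFrame(
    scaler.fit_transform(features),
    columns=features.columns,
    index=features.index,
)
print(scaled)
```

Scaling matters for K-means because the algorithm relies on Euclidean distance, so a column with a large numeric range can otherwise dominate the result.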

In [31]:
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3)  # no random_state set, so cluster ids may differ between runs
clusters = kmeans.fit_predict(clustering_data)
user_behaviour_data['Segments'] = clusters
print(user_behaviour_data.head(10))

   userid  Average Screen Time  Average Spent on App (INR)  Left Review  \
0    1001                 17.0                       634.0            1   
1    1002                  0.0                        54.0            0   
2    1003                 37.0                       207.0            0   
3    1004                 32.0                       445.0            1   
4    1005                 45.0                       427.0            1   
5    1006                 28.0                       599.0            0   
6    1007                 49.0                       887.0            1   
7    1008                  8.0                        31.0            0   
8    1009                 28.0                       741.0            1   
9    1010                 28.0                       524.0            1   

   Ratings  New Password Request  Last Visited Minutes       Status  Segments  
0        9                     7                  2990    Installed         0  
1        4    





In [32]:
user_behaviour_data['Segments'].value_counts()

Segments
0    910
1     45
2     44
Name: count, dtype: int64
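
The counts show that the segments are very uneven in size. A small sketch that turns the counts printed above into percentage shares (the `counts` and `shares` names are illustrative):

```python
import pandas as pd

# Segment sizes copied from the value_counts() output above
counts = pd.Series({0: 910, 1: 45, 2: 44}, name="count")

# Share of all users in each segment, in percent
shares = (counts / counts.sum() * 100).round(1)
print(shares)
```

On the DataFrame itself, `value_counts(normalize=True)` returns the same proportions as fractions.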

Renaming the segments for a better understanding:

In [33]:
user_behaviour_data['Segments'] = user_behaviour_data['Segments'].map({0:"Retained",1:"Churn",2:"Needs Attention"})
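
K-means assigns cluster ids arbitrarily, so a fixed mapping such as `{0: "Retained", ...}` can attach the wrong name when the notebook is rerun. One possible alternative, sketched here on synthetic data, is to order the clusters by a feature and name them in that order. The `demo` frame, the generated values, and the assumption that a longer time since the last visit means a more churned segment are all illustrative and not taken from the dataset:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic data: three well-separated groups of "Last Visited Minutes"
rng = np.random.default_rng(0)
minutes = np.concatenate([
    rng.normal(500, 50, 30),
    rng.normal(10000, 500, 10),
    rng.normal(30000, 1000, 10),
])
demo = pd.DataFrame({"Last Visited Minutes": minutes})

# Fixed random_state makes the run repeatable
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
demo["Segments"] = kmeans.fit_predict(demo[["Last Visited Minutes"]])

# Sort cluster ids by mean minutes since last visit, lowest first
order = demo.groupby("Segments")["Last Visited Minutes"].mean().sort_values().index
names = ["Retained", "Needs Attention", "Churn"]  # assumed ordering, see lead-in
label_map = dict(zip(order, names))
demo["Segments"] = demo["Segments"].map(label_map)
print(demo["Segments"].value_counts())
```

With this approach the names follow the cluster characteristics rather than whichever integer K-means happened to assign.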

Visualizing the segments:

In [37]:
fig = go.Figure()
# One scatter trace per segment so each gets its own color and legend entry
for i in list(user_behaviour_data['Segments'].unique()):
    segment_rows = user_behaviour_data[user_behaviour_data['Segments'] == i]
    fig.add_trace(go.Scatter(x=segment_rows['Last Visited Minutes'],
                             y=segment_rows['Average Spent on App (INR)'],
                             mode='markers', marker_size=6, marker_line_width=1, name=str(i)))
fig.update_traces(hovertemplate='Last Visited Minutes: %{x} <br> Average Spent on App (INR): %{y}')

# 'scene' only applies to 3D plots, so the 2D axis titles are set directly
fig.update_layout(width=800, height=800, autosize=True, showlegend=True,
                  xaxis_title='Last Visited Minutes',
                  yaxis_title='Average Spent on App (INR)')
fig.show()

The blue segment shows the users the app has retained over time.

The red segment shows the users who have just uninstalled the app or are about to uninstall it. The green segment shows the users the application has lost.

# Summary

So this is how you can segment users based on how they engage with an application.
App user segmentation helps businesses find retained users, pick a user segment for a marketing campaign, and solve many other business problems that require finding users with similar characteristics. I hope you liked this article on App User Segmentation with Machine Learning using Python. Feel free to ask questions in the comments section below.