<a href="https://colab.research.google.com/github/Muhammadridho100902/google_collab/blob/main/App_User_Segmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **App User Segmentation**

---


App user segmentation means grouping users by how they engage with the app. To do this, we need data describing each user's engagement, such as screen time, spending, ratings, and how recently they last visited.

In [None]:
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
import pandas as pd
pio.templates.default = "plotly_white"

data = pd.read_csv("userbehaviour.csv")
print(data.head())

   userid  Average Screen Time  Average Spent on App (INR)  Left Review  \
0    1001                 17.0                       634.0            1   
1    1002                  0.0                        54.0            0   
2    1003                 37.0                       207.0            0   
3    1004                 32.0                       445.0            1   
4    1005                 45.0                       427.0            1   

   Ratings  New Password Request  Last Visited Minutes       Status  
0        9                     7                  2990    Installed  
1        4                     8                 24008  Uninstalled  
2        8                     5                   971    Installed  
3        6                     2                   799    Installed  
4        5                     6                  3668    Installed  


Start by looking at the average, highest, and lowest screen time.

In [None]:
print(f'Average Screen Time = {data["Average Screen Time"].mean()}')
print(f'Highest Screen Time = {data["Average Screen Time"].max()}')
print(f'Lowest Screen Time = {data["Average Screen Time"].min()}')

Average Screen Time = 24.39039039039039
Highest Screen Time = 50.0
Lowest Screen Time = 0.0


Next, look at the average, highest, and lowest amount spent by users.

In [None]:
print(f'Average Spend of the Users = {data["Average Spent on App (INR)"].mean()}')
print(f'Highest Spend of the Users = {data["Average Spent on App (INR)"].max()}')
print(f'Lowest Spend of the Users = {data["Average Spent on App (INR)"].min()}')

Average Spend of the Users = 424.4154154154154
Highest Spend of the Users = 998.0
Lowest Spend of the Users = 0.0
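
The same three statistics can be computed for both columns in a single call. This is a minimal sketch that assumes the column names shown above; the small `sample` frame only holds the first five rows from the `data.head()` output so that the snippet runs on its own.

```python
import pandas as pd

# Stand-in for `data`: the first five rows printed by data.head() above
sample = pd.DataFrame({
    "Average Screen Time": [17.0, 0.0, 37.0, 32.0, 45.0],
    "Average Spent on App (INR)": [634.0, 54.0, 207.0, 445.0, 427.0],
})

# One row per statistic, one column per feature
summary = sample.agg(["mean", "max", "min"])
print(summary)
```

On the full dataset, replacing `sample` with `data[[...]]` for the two columns should give the same figures as the individual `print` calls.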


# Clustering the Data

In [None]:
clustering_data = data[["Average Screen Time", "Left Review",
                        "Ratings", "Last Visited Minutes",
                        "Average Spent on App (INR)",
                        "New Password Request"]]

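from sklearn.preprocessing import MinMaxScaler

# Rescale every feature to the [0, 1] range so that large-valued columns
# such as "Last Visited Minutes" do not dominate the distance computation
scaler = MinMaxScaler()
clustering_data = pd.DataFrame(scaler.fit_transform(clustering_data),
                               columns=clustering_data.columns,
                               index=clustering_data.index)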
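
Min-max scaling maps each column to the range [0, 1] using `(x - min) / (max - min)`. The sketch below illustrates this on a small frame built from the first five rows shown earlier; the names `toy`, `scaled`, and `manual` are illustrative and not part of the original notebook.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Two columns with very different ranges, taken from the first five rows above
toy = pd.DataFrame({
    "Average Screen Time": [17.0, 0.0, 37.0, 32.0, 45.0],
    "Last Visited Minutes": [2990, 24008, 971, 799, 3668],
})

# fit_transform learns each column's min and max, then rescales the values
scaler = MinMaxScaler()
scaled = pd.DataFrame(scaler.fit_transform(toy),
                      columns=toy.columns, index=toy.index)

# The same transformation written out by hand, column by column
manual = (toy - toy.min()) / (toy.max() - toy.min())

print(scaled.round(3))
```

Note that constructing a `MinMaxScaler` does not change any data by itself; the rescaled values only exist once `fit_transform` (or `fit` followed by `transform`) has been called and its result stored.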

In [None]:
from sklearn.cluster import KMeans
from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_clusters': [2, 3, 4, 5],  # Number of clusters
    'init': ['k-means++', 'random'],  # Initialization method
    'max_iter': [100, 200, 300],  # Maximum number of iterations
}

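# GridSearchCV ranks each parameter combination with KMeans.score
# (the negative inertia), evaluated with 5-fold cross-validation
kmeans = KMeans()
grid = GridSearchCV(kmeans, param_grid, cv=5)
grid.fit(clustering_data)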

print(grid.best_params_)



{'init': 'k-means++', 'max_iter': 300, 'n_clusters': 5}
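
`GridSearchCV` ranks the candidates with `KMeans.score`, which is the negative inertia. Inertia generally keeps shrinking as `n_clusters` grows, so the largest value in the grid tends to win. One common alternative is to compare silhouette scores instead. The sketch below uses synthetic data rather than the app dataset: the blob centers, `random_state`, and candidate range are all assumptions chosen for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with three clearly separated groups (not the app dataset)
X, _ = make_blobs(n_samples=300,
                  centers=[[0, 0], [10, 10], [-10, 10]],
                  cluster_std=1.0, random_state=0)

# Silhouette score for each candidate number of clusters
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Higher silhouette means tighter, better separated clusters
best_k = max(scores, key=scores.get)
print(scores)
print("Best k by silhouette:", best_k)
```

Unlike inertia, the silhouette score penalizes splitting a compact group into several clusters, so it does not automatically prefer the largest `k`.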


In [None]:
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3, init='k-means++', max_iter=100)
clusters = kmeans.fit_predict(clustering_data)
data['Segment'] = clusters

data.head()



   userid  Average Screen Time  Average Spent on App (INR)  Left Review  \
0    1001                 17.0                       634.0            1   
1    1002                  0.0                        54.0            0   
2    1003                 37.0                       207.0            0   
3    1004                 32.0                       445.0            1   
4    1005                 45.0                       427.0            1   

   Ratings  New Password Request  Last Visited Minutes       Status  Segment  
0        9                     7                  2990    Installed        0  
1        4                     8                 24008  Uninstalled        2  
2        8                     5                   971    Installed        0  
3        6                     2                   799    Installed        0  
4        5                     6                  3668    Installed        0  


In [None]:
data.Segment.value_counts()

0    910
1     45
2     44
Name: Segment, dtype: int64
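
The counts show how large each segment is, but not what distinguishes it. One way to explore that is to compare per-segment averages. This is a minimal sketch: the `sample` frame holds only the five rows and `Segment` labels shown in the `data.head()` output above, so the numbers are not representative of the full dataset.

```python
import pandas as pd

# Stand-in for `data`: the five rows (and their segments) shown above
sample = pd.DataFrame({
    "Last Visited Minutes": [2990, 24008, 971, 799, 3668],
    "Average Spent on App (INR)": [634.0, 54.0, 207.0, 445.0, 427.0],
    "Segment": [0, 2, 0, 0, 0],
})

# Mean of every feature within each segment
profile = sample.groupby("Segment").mean()
print(profile)
```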

In [None]:
# Add one scatter trace per segment so each gets its own color and legend entry
PLOT = go.Figure()
for i in data["Segment"].unique():
    segment = data[data["Segment"] == i]
    PLOT.add_trace(go.Scatter(x=segment['Last Visited Minutes'],
                              y=segment['Average Spent on App (INR)'],
                              mode='markers', marker_size=6, marker_line_width=1,
                              name=str(i)))
PLOT.update_traces(hovertemplate='Last Visited Minutes: %{x} <br>Average Spent on App (INR): %{y}')

# Set the figure size, legend, and axis titles of the 2D scatter plot
PLOT.update_layout(width=800, height=800, autosize=True, showlegend=True,
                   xaxis_title='Last Visited Minutes',
                   yaxis_title='Average Spent on App (INR)')

The blue segment shows the users the app has retained over time. The red segment shows the users who have just uninstalled the app or are likely to uninstall it soon. The green segment shows the users the app has lost.
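
The numeric labels that `KMeans` assigns are arbitrary and can change between runs, so it can help to give the segments readable names based on their data. The sketch below ranks segments by their mean `Last Visited Minutes` and names them in that order. It assumes that a lower value means a more recent visit; the names `Retained`, `Needs Attention`, and `Churn` and the `sample` values are hypothetical, not taken from the dataset.

```python
import pandas as pd

# Hypothetical mini-dataset: made-up values, two users per segment id
sample = pd.DataFrame({
    "Last Visited Minutes": [800, 1200, 9000, 11000, 24000, 30000],
    "Segment": [0, 0, 2, 2, 1, 1],
})

# Order the segment ids from most recent to least recent average visit
order = (sample.groupby("Segment")["Last Visited Minutes"]
               .mean()
               .sort_values()
               .index)

# Hypothetical names, assigned in that order
names = ["Retained", "Needs Attention", "Churn"]
mapping = dict(zip(order, names))

sample["Segment Name"] = sample["Segment"].map(mapping)
print(sample)
```

Passing the name column as the trace `name` would then make the legend independent of which number a cluster happened to receive.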

# **Summary**

---


This is how you can segment users based on how they engage with an app. App user segmentation helps businesses identify retained users, choose a target segment for a marketing campaign, and solve other business problems that require finding users with similar characteristics.