# Introduction

This notebook analyses the Android app market by comparing roughly 10,000 apps listed in the Google Play Store.

# About the Dataset of Google Play Store Apps & Reviews

**Data Source:** <br>
App and review data was scraped from the Google Play Store by Lavanya Gupta in 2018. The original files are listed [here](https://www.kaggle.com/lava18/google-play-store-apps).

# Import Statements

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


# Notebook Presentation

In [2]:
# Show numeric output in decimal format e.g., 2.15
pd.options.display.float_format = '{:,.2f}'.format
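
As a quick illustration of what this format string does, the sketch below applies it to a few standalone numbers. The sample values are made up for illustration and are not taken from the dataset.

```python
# Same format spec as above: thousands separator and two decimal places
float_format = '{:,.2f}'.format

# Illustrative sample values only
samples = [2.146, 1234567.891, 0.5]
formatted = [float_format(value) for value in samples]
print(formatted)
```

The setting only changes how floats are displayed; the underlying values keep their full precision.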

# Read the Dataset

In [3]:
df_apps = pd.read_csv('./apps.csv')

# Data Cleaning

In [4]:
df_apps.head()

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres,Last_Updated,Android_Ver
0,Ak Parti Yardım Toplama,SOCIAL,,0,8.7,0,Paid,$13.99,Teen,Social,"July 28, 2017",4.1 and up
1,Ain Arabic Kids Alif Ba ta,FAMILY,,0,33.0,0,Paid,$2.99,Everyone,Education,"April 15, 2016",3.0 and up
2,Popsicle Launcher for Android P 9.0 launcher,PERSONALIZATION,,0,5.5,0,Paid,$1.49,Everyone,Personalization,"July 11, 2018",4.2 and up
3,Command & Conquer: Rivals,FAMILY,,0,19.0,0,,0,Everyone 10+,Strategy,"June 28, 2018",Varies with device
4,CX Network,BUSINESS,,0,10.0,0,Free,0,Everyone,Business,"August 6, 2018",4.1 and up


In [5]:
df_apps.shape

(10841, 12)

In [6]:
df_apps.sample(5)

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres,Last_Updated,Android_Ver
6736,Stock Trainer: Virtual Trading (Stock Markets),FINANCE,4.4,42809,23.0,1000000,Free,0,Everyone,Finance,"July 27, 2018",4.1 and up
179,Bh Public School,FAMILY,5.0,2,8.7,10,Free,0,Everyone,Education,"March 21, 2018",4.0.3 and up
119,USMLE CK Clinical Knowledge Flashcards 2018 Ed,FAMILY,,0,3.6,5,Free,0,Everyone,Education,"June 21, 2018",4.0.3 and up
9979,Ninja Turtles: Legends,FAMILY,4.3,344283,95.0,10000000,Free,0,Teen,Role Playing,"December 18, 2017",4.1 and up
2851,M-Sight Pro,VIDEO_PLAYERS,3.3,30,15.0,5000,Free,0,Everyone,Video Players & Editors,"July 25, 2018",4.0 and up


### Drop Unused Columns

In [7]:
df_apps.drop(columns=['Android_Ver', 'Last_Updated'], inplace=True)
df_apps.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10841 entries, 0 to 10840
Data columns (total 10 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   App             10841 non-null  object 
 1   Category        10841 non-null  object 
 2   Rating          9367 non-null   float64
 3   Reviews         10841 non-null  int64  
 4   Size_MBs        10841 non-null  float64
 5   Installs        10841 non-null  object 
 6   Type            10840 non-null  object 
 7   Price           10841 non-null  object 
 8   Content_Rating  10841 non-null  object 
 9   Genres          10841 non-null  object 
dtypes: float64(2), int64(1), object(7)
memory usage: 847.1+ KB


In [8]:
df_apps.isna().sum()

App                  0
Category             0
Rating            1474
Reviews              0
Size_MBs             0
Installs             0
Type                 1
Price                0
Content_Rating       0
Genres               0
dtype: int64

In [9]:
df_apps_clean = df_apps.dropna()
df_apps_clean.info()
df_apps_clean.isna().sum()

<class 'pandas.core.frame.DataFrame'>
Index: 9367 entries, 21 to 10840
Data columns (total 10 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   App             9367 non-null   object 
 1   Category        9367 non-null   object 
 2   Rating          9367 non-null   float64
 3   Reviews         9367 non-null   int64  
 4   Size_MBs        9367 non-null   float64
 5   Installs        9367 non-null   object 
 6   Type            9367 non-null   object 
 7   Price           9367 non-null   object 
 8   Content_Rating  9367 non-null   object 
 9   Genres          9367 non-null   object 
dtypes: float64(2), int64(1), object(7)
memory usage: 805.0+ KB


App               0
Category          0
Rating            0
Reviews           0
Size_MBs          0
Installs          0
Type              0
Price             0
Content_Rating    0
Genres            0
dtype: int64

### Find and Remove Duplicates



In [10]:
df_apps_clean.duplicated().sum()

np.int64(476)

In [11]:
df_apps_clean = df_apps_clean.drop_duplicates()
df_apps_clean.duplicated().sum()

np.int64(0)

In [12]:
df_apps_clean.head()

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres
21,KBA-EZ Health Guide,MEDICAL,5.0,4,25.0,1,Free,0,Everyone,Medical
28,Ra Ga Ba,GAME,5.0,2,20.0,1,Paid,$1.49,Everyone,Arcade
47,Mu.F.O.,GAME,5.0,2,16.0,1,Paid,$0.99,Everyone,Arcade
82,Brick Breaker BR,GAME,5.0,7,19.0,5,Free,0,Everyone,Arcade
99,Anatomy & Physiology Vocabulary Exam Review App,MEDICAL,5.0,1,4.6,5,Free,0,Everyone,Medical


In [13]:
df_apps_clean.reset_index(drop=True, inplace=True)
df_apps_clean.head()

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres
0,KBA-EZ Health Guide,MEDICAL,5.0,4,25.0,1,Free,0,Everyone,Medical
1,Ra Ga Ba,GAME,5.0,2,20.0,1,Paid,$1.49,Everyone,Arcade
2,Mu.F.O.,GAME,5.0,2,16.0,1,Paid,$0.99,Everyone,Arcade
3,Brick Breaker BR,GAME,5.0,7,19.0,5,Free,0,Everyone,Arcade
4,Anatomy & Physiology Vocabulary Exam Review App,MEDICAL,5.0,1,4.6,5,Free,0,Everyone,Medical


# Find Highest Rated Apps


In [14]:
df_apps_clean.sort_values(by='Rating', ascending=False).head()

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres
0,KBA-EZ Health Guide,MEDICAL,5.0,4,25.0,1,Free,0,Everyone,Medical
501,FHR 5-Tier 2.0,MEDICAL,5.0,2,1.2,500,Paid,$2.99,Everyone,Medical
267,BG Guide,TRAVEL_AND_LOCAL,5.0,3,2.4,100,Free,0,Everyone,Travel & Local
266,Morse Player,FAMILY,5.0,12,2.4,100,Paid,$1.99,Everyone,Education
264,DG TV,NEWS_AND_MAGAZINES,5.0,3,5.7,100,Free,0,Everyone,News & Magazines


# Find 5 Largest Apps in terms of Size (MBs)

 

In [15]:
df_apps_clean.sort_values(by='Size_MBs', ascending=False).head()

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres
6248,Post Bank,FINANCE,4.5,60449,100.0,1000000,Free,0,Everyone,Finance
8074,Gangster Town: Vice District,FAMILY,4.3,65146,100.0,10000000,Free,0,Mature 17+,Simulation
8072,Talking Babsy Baby: Baby Games,LIFESTYLE,4.0,140995,100.0,10000000,Free,0,Everyone,Lifestyle;Pretend Play
8075,Ultimate Tennis,SPORTS,4.3,183004,100.0,10000000,Free,0,Everyone,Sports
8758,Hungry Shark Evolution,GAME,4.5,6074334,100.0,100000000,Free,0,Teen,Arcade


# Find the 5 Apps with the Most Reviews



In [16]:
df_apps_clean.sort_values(by='Reviews', ascending=False).head(5)

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres
8859,Facebook,SOCIAL,4.1,78158306,5.3,1000000000,Free,0,Teen,Social
8864,Facebook,SOCIAL,4.1,78128208,5.3,1000000000,Free,0,Teen,Social
8844,WhatsApp Messenger,COMMUNICATION,4.4,69119316,3.5,1000000000,Free,0,Everyone,Communication
8854,WhatsApp Messenger,COMMUNICATION,4.4,69109672,3.5,1000000000,Free,0,Everyone,Communication
8862,Instagram,SOCIAL,4.5,66577446,5.3,1000000000,Free,0,Teen,Social
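
Facebook and WhatsApp Messenger each appear twice in this output because their rows differ only in the review count, so the full-row `drop_duplicates()` used earlier keeps both. One possible workaround, sketched here on a toy frame, is to deduplicate on a subset of columns. The choice of `App`, `Type` and `Price` as the key is an assumption for illustration; the notebook itself does not apply it.

```python
import pandas as pd

# Toy rows mirroring the near-duplicate entries in the output above
apps = pd.DataFrame({
    'App': ['Facebook', 'Facebook', 'Instagram'],
    'Type': ['Free', 'Free', 'Free'],
    'Price': ['0', '0', '0'],
    'Reviews': [78158306, 78128208, 66577446],
})

# Full-row comparison: the two Facebook rows differ in Reviews, so both survive
full_row = apps.drop_duplicates()

# Subset comparison (assumed key): only the first row per App/Type/Price is kept
by_key = apps.drop_duplicates(subset=['App', 'Type', 'Price'])

print(len(full_row), len(by_key))
```

Whether this is appropriate depends on whether two listings that share a name, type and price should really count as the same app.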


# Plotly Pie and Donut Charts - Visualise Categorical Data: Content Ratings

In [17]:
import plotly.express as px

In [18]:
ratings = df_apps_clean.Content_Rating.value_counts()
print(ratings)

Content_Rating
Everyone           7094
Teen               1022
Mature 17+          411
Everyone 10+        360
Adults only 18+       3
Unrated               1
Name: count, dtype: int64


In [19]:
fig = px.pie(values=ratings.values,
             names=ratings.index,
             title='Distribution of Content Ratings in Google Play Store Apps',
             hole=0.4)

fig.update_traces(textposition='outside', textinfo='percent+label')
fig.show()

# Numeric Type Conversion: Examine the Number of Installs


In [20]:
df_apps_clean.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8891 entries, 0 to 8890
Data columns (total 10 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   App             8891 non-null   object 
 1   Category        8891 non-null   object 
 2   Rating          8891 non-null   float64
 3   Reviews         8891 non-null   int64  
 4   Size_MBs        8891 non-null   float64
 5   Installs        8891 non-null   object 
 6   Type            8891 non-null   object 
 7   Price           8891 non-null   object 
 8   Content_Rating  8891 non-null   object 
 9   Genres          8891 non-null   object 
dtypes: float64(2), int64(1), object(7)
memory usage: 694.7+ KB


In [21]:
df_apps_clean[['App','Installs']].groupby('Installs').count().sort_values(by='Installs', ascending=False).head()

Unnamed: 0_level_0,App
Installs,Unnamed: 1_level_1
500000000,61
500000,516
500,199
50000000,272
50000,462


In [22]:
df_apps_clean.Installs = df_apps_clean.Installs.astype(str).str.replace(',', "")
df_apps_clean.Installs = pd.to_numeric(df_apps_clean.Installs)
df_apps_clean[['App', 'Installs']].groupby('Installs').count()

Unnamed: 0_level_0,App
Installs,Unnamed: 1_level_1
1,3
5,9
10,69
50,56
100,303
500,199
1000,699
5000,426
10000,989
50000,462
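
The odd ordering in the first `groupby` output comes from `Installs` being stored as text: strings sort character by character, not by magnitude. The sketch below reproduces the effect on a few made-up values and then applies the same comma-stripping conversion as the cell above. The sample strings are assumptions for illustration.

```python
import pandas as pd

# Illustrative install counts stored as text with thousands separators
installs = pd.Series(['1,000', '500', '50,000,000', '10'])

# Sorting the strings compares characters, so '500' ranks above '50,000,000'
text_sorted = installs.sort_values(ascending=False).tolist()

# Strip the commas as literal characters, then convert to integers
numeric = pd.to_numeric(installs.str.replace(',', '', regex=False))
numeric_sorted = numeric.sort_values(ascending=False).tolist()

print(text_sorted)
print(numeric_sorted)
```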


# Find the Most Expensive Apps, Filter out the Junk, and Calculate a (ballpark) Sales Revenue Estimate


In [23]:
df_apps_clean.Price.describe()

count     8891
unique      73
top          0
freq      8278
Name: Price, dtype: object

In [24]:
# regex=False makes '$' a literal character rather than an end-of-string anchor
df_apps_clean.Price = df_apps_clean.Price.astype(str).str.replace('$', "", regex=False)
df_apps_clean.Price = pd.to_numeric(df_apps_clean.Price)
 
df_apps_clean.sort_values('Price', ascending=False).head(20)

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres
2469,I'm Rich - Trump Edition,LIFESTYLE,3.6,275,7.3,10000,Paid,400.0,Everyone,Lifestyle
4205,I am rich,LIFESTYLE,3.8,3547,1.8,100000,Paid,399.99,Everyone,Lifestyle
1397,I Am Rich Pro,FAMILY,4.4,201,2.7,5000,Paid,399.99,Everyone,Entertainment
370,most expensive app (H),FAMILY,4.3,6,1.5,100,Paid,399.99,Everyone,Entertainment
2104,💎 I'm rich,LIFESTYLE,3.8,718,26.0,10000,Paid,399.99,Everyone,Lifestyle
1721,I am rich(premium),FINANCE,3.5,472,0.94,5000,Paid,399.99,Everyone,Finance
1795,I am Rich Plus,FAMILY,4.0,856,8.7,10000,Paid,399.99,Everyone,Entertainment
3100,I Am Rich Premium,FINANCE,4.1,1867,4.7,50000,Paid,399.99,Everyone,Finance
1690,I am Rich,FINANCE,4.3,180,3.8,5000,Paid,399.99,Everyone,Finance
1094,I am Rich!,FINANCE,3.8,93,22.0,1000,Paid,399.99,Everyone,Finance
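
One detail worth keeping in mind when stripping the currency symbol: `$` is a special character in regular expressions (it anchors the end of a string), and how `Series.str.replace` treats a bare `'$'` has varied across pandas versions. A minimal sketch of two unambiguous ways to remove it, using made-up price strings:

```python
import pandas as pd

# Illustrative price strings in the same style as the Price column
prices = pd.Series(['$1.49', '0', '$399.99'])

# Option 1: treat the pattern as a plain literal
literal = pd.to_numeric(prices.str.replace('$', '', regex=False))

# Option 2: use a regex, but escape the dollar sign so it is matched literally
escaped = pd.to_numeric(prices.str.replace(r'\$', '', regex=True))

print(literal.tolist())
print(escaped.tolist())
```

Both options produce the same numeric column; passing `regex` explicitly avoids relying on the library default.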


### The most expensive apps sub $250

In [25]:
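# .copy() gives an independent DataFrame, so adding columns later does not trigger a SettingWithCopyWarning
df_apps_clean = df_apps_clean[df_apps_clean['Price'] < 250].copy()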
df_apps_clean.sort_values('Price', ascending=False).head(5)

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres
1012,Vargo Anesthesia Mega App,MEDICAL,4.6,92,32.0,1000,Paid,79.99,Everyone,Medical
408,LTC AS Legal,MEDICAL,4.0,6,1.3,100,Paid,39.99,Everyone,Medical
1268,I am Rich Person,LIFESTYLE,4.2,134,1.8,1000,Paid,37.99,Everyone,Lifestyle
1158,A Manual of Acupuncture,MEDICAL,3.5,214,68.0,1000,Paid,33.99,Everyone,Medical
2773,Golfshot Plus: Golf GPS,SPORTS,4.1,3387,25.0,50000,Paid,29.99,Everyone,Sports


### Highest Grossing Paid Apps (ballpark estimate)

In [26]:
df_apps_clean['Revenue_Estimate'] = df_apps_clean.Installs.mul(df_apps_clean.Price)
df_apps_clean.sort_values('Revenue_Estimate', ascending=False)[:10]

Unnamed: 0,App,Category,Rating,Reviews,Size_MBs,Installs,Type,Price,Content_Rating,Genres,Revenue_Estimate
7408,Minecraft,FAMILY,4.5,2375336,19.0,10000000,Paid,6.99,Everyone 10+,Arcade;Action & Adventure,69900000.0
7404,Minecraft,FAMILY,4.5,2376564,19.0,10000000,Paid,6.99,Everyone 10+,Arcade;Action & Adventure,69900000.0
7063,Hitman Sniper,GAME,4.6,408292,29.0,10000000,Paid,0.99,Mature 17+,Action,9900000.0
5504,Grand Theft Auto: San Andreas,GAME,4.4,348962,26.0,1000000,Paid,6.99,Mature 17+,Action,6990000.0
5814,Facetune - For Free,PHOTOGRAPHY,4.4,49553,48.0,1000000,Paid,5.99,Everyone,Photography,5990000.0
6297,Sleep as Android Unlock,LIFESTYLE,4.5,23966,0.85,1000000,Paid,5.99,Everyone,Lifestyle,5990000.0
4983,DraStic DS Emulator,GAME,4.6,87766,12.0,1000000,Paid,4.99,Everyone,Action,4990000.0
4500,Weather Live,WEATHER,4.5,76593,4.75,500000,Paid,5.99,Everyone,Weather,2995000.0
5228,Threema,COMMUNICATION,4.5,51110,3.5,1000000,Paid,2.99,Everyone,Communication,2990000.0
5403,Tasker,TOOLS,4.6,43045,3.4,1000000,Paid,2.99,Everyone,Tools,2990000.0
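
The estimate is simply installs multiplied by price. Install counts in this dataset are bucket thresholds (1, 5, 10, 50, ...) rather than exact sales, and the calculation assumes every install was bought at the listed price, so the result is only a rough ballpark. A minimal sketch using three rows copied from the output above:

```python
import pandas as pd

# Three paid apps with installs and prices taken from the output above
paid = pd.DataFrame({
    'App': ['Minecraft', 'Hitman Sniper', 'Tasker'],
    'Installs': [10_000_000, 10_000_000, 1_000_000],
    'Price': [6.99, 0.99, 2.99],
})

# Element-wise product of the two columns
paid['Revenue_Estimate'] = paid['Installs'].mul(paid['Price'])

top = paid.sort_values('Revenue_Estimate', ascending=False)
print(top)
```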


# Plotly Bar Charts & Scatter Plots: Analysing App Categories

In [27]:
Top_10_category = df_apps_clean.Category.value_counts().head(10)
print(Top_10_category)

Category
FAMILY             1714
GAME               1074
TOOLS               733
PRODUCTIVITY        334
FINANCE             311
PERSONALIZATION     310
COMMUNICATION       307
PHOTOGRAPHY         304
MEDICAL             302
LIFESTYLE           301
Name: count, dtype: int64


In [28]:
bar = px.bar(x = Top_10_category.index, 
             y = Top_10_category.values,title='Top 10 Categories of Apps in Google Play Store',
             labels={'x':'Category','y':'Number of Apps'}) 
             
bar.show()

### Vertical Bar Chart - Highest Competition (Number of Apps)

In [29]:
category_installs = df_apps_clean.groupby('Category').agg({'Installs': pd.Series.sum})
category_installs.sort_values('Installs', ascending=False, inplace=True)
print(category_installs)

                        Installs
Category                        
GAME                 31543862717
COMMUNICATION        24152241530
SOCIAL               12513841475
PRODUCTIVITY         12463070180
TOOLS                11440724500
FAMILY               10041105490
PHOTOGRAPHY           9721243130
TRAVEL_AND_LOCAL      6361859300
VIDEO_PLAYERS         6221897200
NEWS_AND_MAGAZINES    5393110650
SHOPPING              2563331540
ENTERTAINMENT         2455660000
PERSONALIZATION       2074352930
BOOKS_AND_REFERENCE   1916291655
SPORTS                1528531465
HEALTH_AND_FITNESS    1361006220
BUSINESS               863518120
FINANCE                770249400
MAPS_AND_NAVIGATION    724267560
LIFESTYLE              534611120
EDUCATION              533852000
WEATHER                426096500
FOOD_AND_DRINK         257777750
DATING                 206522410
HOUSE_AND_HOME         125082000
ART_AND_DESIGN         124233100
LIBRARIES_AND_DEMO      62083000
COMICS                  56036100
AUTO_AND_V

### Horizontal Bar Chart - Most Popular Categories (Highest Downloads)

In [30]:
h_bar = px.bar(x = category_installs.Installs,
               y = category_installs.index,
               orientation='h',
               title='Category Popularity')
 
h_bar.update_layout(xaxis_title='Number of Downloads', yaxis_title='Category')
h_bar.show()

### Category Concentration - Downloads vs. Competition

In [31]:
cat_number = df_apps_clean.groupby('Category').agg({'App': pd.Series.count})

In [32]:
cat_merged_df = pd.merge(cat_number, category_installs, on='Category', how="inner")
print(f'The dimensions of the DataFrame are: {cat_merged_df.shape}')
cat_merged_df.sort_values('Installs', ascending=False).head()

The dimensions of the DataFrame are: (33, 2)


Unnamed: 0_level_0,App,Installs
Category,Unnamed: 1_level_1,Unnamed: 2_level_1
GAME,1074,31543862717
COMMUNICATION,307,24152241530
SOCIAL,244,12513841475
PRODUCTIVITY,334,12463070180
TOOLS,733,11440724500
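
One way to put a number on concentration is installs per app in each category. The sketch below rebuilds a small frame from figures printed in the outputs above (the FAMILY row is taken from the category counts and the installs listing); the `Installs_per_App` column name is a hypothetical helper, not part of the notebook.

```python
import pandas as pd

# App counts and total installs copied from the outputs above
cat = pd.DataFrame(
    {
        'App': [1074, 307, 244, 334, 733, 1714],
        'Installs': [31543862717, 24152241530, 12513841475,
                     12463070180, 11440724500, 10041105490],
    },
    index=pd.Index(['GAME', 'COMMUNICATION', 'SOCIAL',
                    'PRODUCTIVITY', 'TOOLS', 'FAMILY'], name='Category'),
)

# Average installs per app: higher means fewer apps share more downloads
cat['Installs_per_App'] = cat['Installs'] / cat['App']

ranked = cat.sort_values('Installs_per_App', ascending=False)
print(ranked)
```

Among these six categories, COMMUNICATION ranks first and FAMILY last on this measure.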


In [33]:
scatter = px.scatter(cat_merged_df,
                    x='App',
                    y='Installs',
                    size='App',
                    title='Category Popularity vs Number of Apps',
                    hover_name=cat_merged_df.index,
                    color='Installs',
                    labels={'App':'Number of Apps','Installs':'Number of Downloads'},)
scatter.update_layout(xaxis_title="Number of Apps (Lower=More Concentrated)",
                      yaxis_title="Installs",
                      yaxis=dict(type='log'))
scatter.show()

# Extracting Nested Data from a Column

**Challenge**: How many different genres are there? Can an app belong to more than one genre? Check what happens when you use `.value_counts()` on a column with nested values. See if you can work around the problem by splitting the strings with `.str.split()` and then applying the DataFrame's [.stack() method](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.stack.html). 
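
To see the split-and-stack idea in isolation, here is a minimal sketch on a toy Series. The genre strings are illustrative, and the explicit `dropna()` is an added safeguard so the result does not depend on how a given pandas version handles the empty cells.

```python
import pandas as pd

# Toy genre strings in the same semicolon-separated style as the Genres column
genres = pd.Series(['Tools', 'Arcade;Action & Adventure',
                    'Lifestyle;Pretend Play', 'Tools;Education'])

# Counting the raw strings treats each combination as its own genre
raw_counts = genres.value_counts()

# Split into one column per genre, then stack the columns into a single Series.
# dropna() removes the empty cells left behind for single-genre apps.
stacked = genres.str.split(';', expand=True).stack().dropna()
genre_counts = stacked.value_counts()

print(f'{len(raw_counts)} raw combinations, {len(genre_counts)} individual genres')
```

`Series.explode()` applied to the result of `.str.split(';')` is an alternative that yields the same counts.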


In [34]:
len(df_apps_clean.Genres.unique())

115

In [35]:
df_apps_clean.groupby('Genres').agg({'App': pd.Series.count}).sort_values(by='App', ascending=False)

Unnamed: 0_level_0,App
Genres,Unnamed: 1_level_1
Tools,732
Entertainment,494
Education,446
Action,349
Productivity,334
...,...
Lifestyle;Education,1
Parenting;Brain Games,1
Lifestyle;Pretend Play,1
Card;Brain Games,1


In [36]:
# Split the strings on the semi-colon and then .stack them.
stack = df_apps_clean.Genres.str.split(';', expand=True).stack()
print(f'We now have a single column with shape: {stack.shape}')
num_genres = stack.value_counts()
print(f'Number of genres: {len(num_genres)}')
num_genres.head(10)

We now have a single column with shape: (9323,)


Number of genres: 53


Tools              733
Education          626
Entertainment      534
Action             364
Productivity       334
Finance            311
Personalization    310
Communication      308
Photography        304
Sports             303
Name: count, dtype: int64

# Colour Scales in Plotly Charts - Competition in Genres

In [37]:
bar = px.bar(x = num_genres.index[:15], # index = category name
             y = num_genres.values[:15], # count
             title='Top 15 Genres',
             hover_name=num_genres.index[:15],
             color=num_genres.values[:15],
             color_continuous_scale='Plasma')
bar.update_layout(xaxis_title='Genre',
                  yaxis_title='Number of Apps',
                  coloraxis_showscale=False)
 
bar.show()

# Grouped Bar Charts: Free vs. Paid Apps per Category

In [38]:
df_apps_clean.Type.value_counts()

Type
Free    8278
Paid     598
Name: count, dtype: int64
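
For a sense of scale, the two counts above can be turned into a ratio and a share. This is plain arithmetic on the printed values:

```python
# Counts copied from the value_counts() output above
free_apps = 8278
paid_apps = 598

# Free apps per paid app, and the fraction of all apps that are paid
ratio = free_apps / paid_apps
paid_share = paid_apps / (free_apps + paid_apps)

print(f'{ratio:.1f} free apps per paid app; {paid_share:.1%} of apps are paid')
```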

In [39]:
df_free_vs_paid = df_apps_clean.groupby(["Category", "Type"], as_index=False).agg({'App': pd.Series.count})
df_free_vs_paid.head()

Unnamed: 0,Category,Type,App
0,ART_AND_DESIGN,Free,59
1,ART_AND_DESIGN,Paid,3
2,AUTO_AND_VEHICLES,Free,72
3,AUTO_AND_VEHICLES,Paid,1
4,BEAUTY,Free,42


In [40]:
import plotly.graph_objects as go

In [41]:
g_bar = px.bar(df_free_vs_paid,
               x='Category',
               y='App',
               title='Free vs Paid Apps by Category',
               color='Type',
               barmode='group')
 
g_bar.update_layout(xaxis_title='Category',
                    yaxis_title='Number of Apps',
                    xaxis={'categoryorder':'total descending'},
                    yaxis=dict(type='log'))
 
g_bar.show()

# Plotly Box Plots: Lost Downloads for Paid Apps


In [42]:
box = px.box(df_apps_clean,
             y='Installs',
             x='Type',
             color='Type',
             notched=True,
             points='all',
             title='How Many Downloads are Paid Apps Giving Up?')
 
box.update_layout(yaxis=dict(type='log'),)
 
box.show()

# Plotly Box Plots: Revenue by App Category


In [43]:
df_paid_apps = df_apps_clean[df_apps_clean['Type'] == 'Paid']
box = px.box(df_paid_apps, 
             x='Category', 
             y='Revenue_Estimate',
             title='How Much Can Paid Apps Earn?')
 
box.update_layout(xaxis_title='Category',
                  yaxis_title='Paid App Ballpark Revenue',
                  xaxis={'categoryorder':'min ascending'},
                  yaxis=dict(type='log'))
 
 
box.show()

# How Much Can You Charge? Examine Paid App Pricing Strategies by Category


In [47]:
box = px.box(df_paid_apps,
             x='Category',
             y="Price",
             title='Price per Category')
 
box.update_layout(xaxis_title='Category',
                  yaxis_title='Paid App Price',
                  xaxis={'categoryorder':'max descending'},
                  yaxis=dict(type='log'))
 
box.show()

# Android Market Analytics: Strategic Conclusions
1. **The "Paid" Model is a Barrier to Scale.** The ecosystem resists upfront costs: in the cleaned data, free apps outnumber paid apps roughly 14 to 1 (8,278 vs. 598). Unless you are solving a critical professional problem, pay-to-download adds heavy friction, so a freemium or ad-supported model is the safer default.

2. **Volume Displaces Margin.** High sticker prices do not equate to high revenue. The novelty apps priced around $400 reached at most 100,000 installs and were filtered out as junk, while a volume driver like Minecraft ($6.99) tops the ballpark revenue estimate at roughly $70M. In this dataset, the largest estimates come from low prices combined with broad reach.

3. **Category Saturation vs. Leverage.**

   **Oversaturated:** Family and Tools are among the most crowded categories (1,714 and 733 apps) yet rank only sixth and fifth by total installs. Avoid these unless you have external distribution dominance.

   **High Leverage:** Communication and Social apps show the highest efficiency: relatively few apps (307 and 244) command massive install bases, a winner-takes-all pattern.

   **Niche Premium:** Medical holds three of the five highest prices under $250 (up to $79.99), suggesting that professional utility is the main exception to the race to the bottom.

4. **The Directive.** To maximize market penetration, build for Games or Communication with a free-to-play model. To maximize unit margins with lower overhead, target Medical or specialized Business verticals.