# Profitable App profiles analysis for the App Store and Google Play markets
This project analyzes app data from the Google Play Store (Android) and the Apple App Store (iOS). Only free apps are considered, since their main source of revenue is advertising. The income of such apps depends primarily on the number of users who use the app and engage with the ads.

The goal of this project is to analyze the data so that developers can use it to find categories or types of apps that are likely to be profitable for them.

In [2]:
#Imports
from csv import reader

In [3]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [4]:
with open('AppleStore.csv', encoding='utf-8') as f:
    apple_data = list(reader(f))  # the file is closed once the block ends
with open('googleplaystore.csv', encoding='utf-8') as f:
    google_data = list(reader(f))

In [5]:
explore_data(apple_data,1,3,True)
explore_data(google_data,1,3,True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


Number of rows: 7198
Number of columns: 16
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


Number of rows: 10842
Number of columns: 13


In [6]:
apple_data[0]

['id',
 'track_name',
 'size_bytes',
 'currency',
 'price',
 'rating_count_tot',
 'rating_count_ver',
 'user_rating',
 'user_rating_ver',
 'ver',
 'cont_rating',
 'prime_genre',
 'sup_devices.num',
 'ipadSc_urls.num',
 'lang.num',
 'vpp_lic']

The following columns can be helpful in the analysis:
- [Apple App Store](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps)
    - track_name (App name)
    - rating_count_tot (Total rating count)
    - rating_count_ver (Rating count for current version)
    - user_rating (Average user rating for all versions)
    - cont_rating (Content Rating)
    - prime_genre (Primary Genre)
    - sup_devices.num (Number of supported devices)
    
    
- [Google Play Store](https://www.kaggle.com/lava18/google-play-store-apps)
    - app
    - category
    - rating (overall user rating)
    - reviews (number of user reviews)
    - installs (number of installs)
    - content rating
    - genres
    - android version

# Data Cleaning 


We opened the two data sets and performed a brief exploration of the data. Before beginning our analysis, we need to make sure the data we analyze is accurate, otherwise the results of our analysis will be wrong. This means that we need to:

- Detect inaccurate data, and correct or remove it.
- Detect duplicate data, and remove the duplicates.

At our company, we only build apps that are free to download and install, and that are directed toward an English-speaking audience. This means that we'll need to:

- Remove non-English apps like 爱奇艺PPS -《欢乐颂2》电视剧热播.
- Remove apps that aren't free.


In [7]:
print(google_data[10473])
print(google_data[10473][8])

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']



Row 10473 (counting the header) has only 12 values instead of 13. The `Category` value is missing, so every later value is shifted one column to the left (the `Rating` position, for example, holds `19`).
> To fix this, we remove the row.
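
A shifted row like this can also be found without knowing its index in advance, by comparing each row's length with the header's. The sketch below is illustrative only: `find_malformed_rows` and the small `sample` table are hypothetical stand-ins and are not part of the notebook.

```python
def find_malformed_rows(dataset):
    """Return (index, row) pairs whose column count differs from the header's."""
    expected = len(dataset[0])  # dataset[0] is assumed to be the header row
    return [(i, row) for i, row in enumerate(dataset) if len(row) != expected]


# Hypothetical three-column table standing in for the real data
sample = [
    ['App', 'Category', 'Rating'],
    ['Good App', 'TOOLS', '4.2'],
    ['Shifted App', '4.9'],  # one value is missing, so the rest shift left
]

malformed = find_malformed_rows(sample)
for index, row in malformed:
    print(index, row)
```

Running the same check on `google_data` would list every row whose length differs from the header's, which is a quick way to confirm that no other row has this problem.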


In [8]:
del google_data[10473]

## Removing Duplicate Apps
Some apps appear more than once in the Google Play data. Instagram, for example, has 4 entries. For a smooth and accurate analysis we need only one entry per app.

In [9]:
for app in google_data:
    name = app[0]
    if name == 'Instagram':
        print(app)

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']


This shows that Instagram has 4 entries.

### Counting number of duplicates

In [10]:
def play_store_duplicate_finder(dataset):
    duplicate_apps = []
    unique_apps = []
    for app in dataset:
        name = app[0]
        if name in unique_apps:
            duplicate_apps.append(name)
        else:
            unique_apps.append(name)
    return duplicate_apps,unique_apps

def apple_store_duplicate_finder(dataset):
    duplicate_apps = []
    unique_apps = []
    for app in dataset:
        name = app[1]
        if name in unique_apps:
            duplicate_apps.append(name)
        else:
            unique_apps.append(name)
    return duplicate_apps,unique_apps

In [11]:
google_dup, google_unique = play_store_duplicate_finder(google_data[1:])
apple_dup, apple_unique = apple_store_duplicate_finder(apple_data[1:])

In [12]:
print("Number of duplicate apps (Play Store): {}".format(len(google_dup)))
print("Number of duplicate apps (App Store) : {} ".format(len(apple_dup)))

Number of duplicate apps (Play Store): 1181
Number of duplicate apps (App Store) : 2 


The Play Store data has 1181 duplicate entries, which must be removed before the analysis. The App Store data shows only 2 repeated names.

### Removing the duplicates
Duplicates should not be removed at random; we need a clear criterion. Some things to keep in mind:
- The duplicate entries are entries of the same app collected at different times, so the most recent one should be kept.
- If we keep an old entry instead, the analysis will rely on outdated information and the results may be faulty.
- We use the number of reviews (index 3) as a proxy for recency: the higher the review count, the more recent the entry is likely to be.

In [13]:
google_data[1][3]

'159'

In [14]:
google_reviews_max = {}
for app in google_data[1:]:
    name = app[0]
    n_reviews = float(app[3])
    if (name in google_reviews_max):
        if ((n_reviews > google_reviews_max[name])):
            google_reviews_max[name] = n_reviews
    elif (name not in google_reviews_max):
        google_reviews_max[name] = n_reviews

In [15]:
# Number of unique apps
len(google_reviews_max)

9659

Creating a new dataset with only unique entries having the most up to date data

In [16]:
android_clean = []
already_added = []

for app in google_data[1:]:
    name = app[0]
    n_reviews = float(app[3])
    if (n_reviews == google_reviews_max[name] and (name not in already_added)):
        android_clean.append(app)
        already_added.append(name)

In [17]:
print(android_clean[0])
print(len(android_clean))

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']
9659


Hence, `android_clean` is now the dataset of unique entries with up-to-date values. It has 9659 entries, which matches the number of unique apps we found earlier.
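
The two passes above can also be folded into one by storing the best row per app name in a dictionary. This is a minimal sketch under stated assumptions, not the notebook's method: `keep_most_reviewed` and the `sample` rows are hypothetical, and the column positions default to the Play Store layout (name at index 0, reviews at index 3).

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep, for each app name, the first row with the highest review count."""
    best = {}  # app name -> row with the most reviews seen so far
    for row in rows:
        name = row[name_index]
        reviews = float(row[reviews_index])
        # Strict '>' means an equal count never replaces the earlier row
        if name not in best or reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Hypothetical rows trimmed to four columns for illustration
sample = [
    ['App A', 'SOCIAL', '4.5', '300'],
    ['App A', 'SOCIAL', '4.5', '450'],
    ['App B', 'TOOLS', '4.0', '120'],
    ['App A', 'SOCIAL', '4.5', '200'],
]

deduped = keep_most_reviewed(sample)
print(len(deduped))
print(deduped[0])
```

Dictionary lookups avoid the repeated `name not in already_added` list scans. One difference to be aware of: rows come out in order of each name's first appearance, which may not match the row order of `android_clean`.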

## Removing non-English apps
We use English for the apps we develop at our company, and we'd like to analyze only the apps that are directed toward an English-speaking audience. However, if we explore the data long enough, we'll find that both data sets have apps with names that suggest they are not directed toward an English-speaking audience.
We're not interested in keeping these apps, so we'll remove them. One way to go about this is to remove each app with a name containing a symbol that is not commonly used in English text. English text usually includes letters from the English alphabet, numbers composed of digits from 0 to 9, punctuation marks (., !, ?, ;), and other symbols (+, *, /).

Behind the scenes, each character we use in a string has a corresponding number associated with it. For instance, the corresponding number for character `a` is 97, character `A` is 65, and character `爱` is 29,233. We can get the corresponding number of each character using the `ord()` built-in function.
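
As a quick check of the numbers quoted above, the snippet below calls `ord()` on the three characters and uses the built-in `chr()` to go back from a number to its character. The `code_points` name is only illustrative.

```python
# Map each character to its code point with the built-in ord()
code_points = {character: ord(character) for character in ['a', 'A', '爱']}

for character, number in code_points.items():
    print(repr(character), '->', number)

# chr() is the inverse of ord()
print(chr(97))
```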

The numbers corresponding to the characters we commonly use in an English text are all in the range 0 to 127, according to the ASCII (American Standard Code for Information Interchange) system. Based on this number range, we can build a function that detects whether a character belongs to the set of common English characters or not. If the number is equal to or less than 127, then the character belongs to the set of common English characters.

If an app name contains a character that is greater than 127, then it probably means that the app has a non-English name. Our app names, however, are stored as strings, so how could we take each individual character of a string and check its corresponding number?

In Python, strings are indexable and iterable, which means we can use indexing to select an individual character, and we can also iterate on the string using a for loop.
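
A short illustration of both operations, using one of the app names from the data set. The `non_ascii` list is a hypothetical helper variable that collects the characters whose number is above 127.

```python
name = 'Docs To Go™ Free Office Suite'

# Indexing selects a single character
print(name[0])

# Iterating visits every character in turn
non_ascii = [character for character in name if ord(character) > 127]
print(non_ascii)
```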

### Starting with the process

In [18]:
# A function to check whether a string is in English or not
def english_word(string):
    for letter in string:
        if ord(letter) > 127:
            return False
    return True

In [19]:
#Testing on a few examples
print(english_word('Instagram'))
print(english_word('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(english_word('Docs To Go™ Free Office Suite'))
print(english_word('Instachat 😜'))

True
False
False
False


The `english_word` function detects non-English app names, but it also rejects some English app names, such as 'Docs To Go™ Free Office Suite' and 'Instachat 😜'. This is because emojis and characters like ™ fall outside the ASCII range and have corresponding numbers over 127.

If we use the function as written, we'll lose useful data, since many English apps will be incorrectly labeled as non-English. To minimize the data loss, we'll only remove an app if its name has three or more characters whose numbers fall outside the ASCII range. This means all English apps with up to two emoji or other special characters will still be labeled as English. The filter is still not perfect, but it should be fairly effective.
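
How a name with exactly three non-ASCII characters is treated depends on how the cutoff comparison is written, so it can help to make the cutoff an explicit argument. The sketch below is one possible way to do that; `count_non_ascii`, `is_probably_english` and `max_non_ascii` are hypothetical names that do not appear in the notebook.

```python
def count_non_ascii(name):
    """Count the characters whose code point lies outside the ASCII range."""
    return sum(1 for character in name if ord(character) > 127)


def is_probably_english(name, max_non_ascii):
    """Accept a name with at most max_non_ascii non-ASCII characters."""
    return count_non_ascii(name) <= max_non_ascii


boundary_name = 'Fun 😜😜😜'  # exactly three non-ASCII characters
print(count_non_ascii(boundary_name))
print(is_probably_english(boundary_name, max_non_ascii=3))
print(is_probably_english(boundary_name, max_non_ascii=2))
```

With the cutoff spelled out this way, the boundary case is decided by the argument rather than by a choice between `>` and `>=` buried in the function body.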

In [20]:
# Rewriting the function according to the new requirement
def english_word(string):
    count = 0
    for letter in string:
        if ord(letter) > 127:
            count+=1
    if count >= 3:
        return False
    return True
    

In [21]:
# Testing again on a few examples
print(english_word('Instagram'))
print(english_word('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(english_word('Docs To Go™ Free Office Suite'))
print(english_word('Instachat 😜'))

True
False
True
True


### Extracting English Apps
Creating a filter that uses the `english_word` function to extract the English apps from either data set. The `index` argument is the position of the app name column.

In [22]:
def english_filter(dataset,index):
    output = []
    for app in dataset:
        name = app[index]
        if (english_word(name)):
            output.append(app)
    return output

In [23]:
android_english = english_filter(android_clean,0) 
ios_english = english_filter(apple_data[1:],1)

In [24]:
# Before removing non-English apps
print(len(android_clean))
print(len(apple_data[1:]))

9659
7197


In [25]:
# After removing non-English apps
print(len(android_english))
print(len(ios_english))

9597
6155


## Extracting free apps
So far in the data cleaning process, we:

- Removed inaccurate data
- Removed duplicate app entries
- Removed non-English apps

As we mentioned earlier, we only build apps that are free to download and install, and our main source of revenue consists of in-app ads. Our data sets contain both free and non-free apps; we'll need to isolate only the free apps for our analysis.

### Finding required column
Checking which column will tell us whether the app is free or not

In [26]:
print(google_data[0])
print(android_english[0][6])

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
Free


In [27]:
print(apple_data[0])
print(ios_english[0][4])

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']
0.0


As we can see, index 6 in the Play Store data holds `Type`, which says whether the app is free or paid, and index 4 in the App Store data holds `price`, which is 0.0 for free apps. We can use these columns to isolate the free apps.

### Filtering the free apps
Writing one function per store to extract the free apps from a data set

In [28]:
def android_free_extractor(dataset):
    output = []
    for app in dataset:
        app_type = app[6]
        if (app_type == 'Free'):
            output.append(app)
    return output

def apple_free_extractor(dataset):
    output= []
    for app in dataset:
        app_price = float(app[4])
        if (app_price == 0.0):
            output.append(app)
    return output

In [29]:
android_free = android_free_extractor(android_english)
ios_free = apple_free_extractor(ios_english)

In [30]:
print(len(android_free))
print(len(ios_free))

8847
3203


## Finding most common genres
So far, we spent a good amount of time on cleaning data, and:

- Removed inaccurate data
- Removed duplicate app entries
- Removed non-English apps
- Isolated the free apps

As we mentioned earlier, our aim is to determine the kinds of apps that are likely to attract more users because our revenue is highly influenced by the number of people using our apps.

To minimize risks and overhead, our validation strategy for an app idea is comprised of three steps:

- Build a minimal Android version of the app, and add it to Google Play.
- If the app has a good response from users, we develop it further.
- If the app is profitable after six months, we build an iOS version of the app and add it to the App Store.

Because our end goal is to add the app on both Google Play and the App Store, we need to find app profiles that are successful on both markets. For instance, a profile that works well for both markets might be a productivity app that makes use of gamification.

Let's begin the analysis by getting a sense of what are the most common genres for each market. For this, we'll need to build frequency tables for a few columns in our data sets.

### Identifying useful columns
Checking both datasets to find which columns can help us in finding the most popular genres in each market

In [31]:
print(google_data[0])
print(android_free[0][1])

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
ART_AND_DESIGN


In [32]:
print(apple_data[0])
print(ios_free[1][11])

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']
Photo & Video


Thus, the `Category` column in the Play Store data and the `prime_genre` column in the App Store data can help us.

### Building Genre Frequency Tables
Writing function to build frequency tables for the genres so that further analysis can take place

In [33]:
def freq_table(dataset,index):
    table = {}  # local name no longer shadows the function name
    for app in dataset:
        genre = app[index]
        if (genre in table):
            table[genre] += 1
        else:
            table[genre] = 1
    # Converting counts to proportions
    total = len(dataset)
    for entry in table:
        table[entry] = float(table[entry]) / total
    return table

In [34]:
def display_table(dataset,index):
    table = freq_table(dataset,index)
    table_display = []
    for key in table:
        table_display.append((table[key],key))
    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1],':',entry[0])
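
For comparison, the same proportions can be computed with `collections.Counter` from the standard library. This is a sketch of an alternative and not the notebook's implementation: `freq_table_counter` and the `sample` rows are hypothetical.

```python
from collections import Counter


def freq_table_counter(dataset, index):
    """Return {value: proportion}, ordered from most to least common."""
    counts = Counter(row[index] for row in dataset)
    total = len(dataset)
    # most_common() yields (value, count) pairs sorted by descending count
    return {value: count / total for value, count in counts.most_common()}


# Hypothetical two-column rows: app name, genre
sample = [['A', 'GAME'], ['B', 'GAME'], ['C', 'TOOLS'], ['D', 'GAME']]

proportions = freq_table_counter(sample, 1)
for genre, share in proportions.items():
    print(genre, ':', share)
```

Because the result is already ordered by frequency, no separate sorting step is needed before printing.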
    

In [35]:
display_table(android_free,1)

FAMILY : 0.18932971628800724
GAME : 0.09698202780603594
TOOLS : 0.0845484344975698
BUSINESS : 0.046004295241324746
PRODUCTIVITY : 0.038996269922007464
LIFESTYLE : 0.03888323725556686
FINANCE : 0.03707471459251724
MEDICAL : 0.03537922459590822
SPORTS : 0.0339097999321804
PERSONALIZATION : 0.033231603933536795
COMMUNICATION : 0.032327342602011984
HEALTH_AND_FITNESS : 0.030857917938284164
PHOTOGRAPHY : 0.029501525940996948
NEWS_AND_MAGAZINES : 0.02803210127726913
SOCIAL : 0.026675709279981915
TRAVEL_AND_LOCAL : 0.023397761953204477
SHOPPING : 0.022493500621679666
BOOKS_AND_REFERENCE : 0.021363173957273652
DATING : 0.01865038996269922
VIDEO_PLAYERS : 0.01797219396405561
MAPS_AND_NAVIGATION : 0.013903017972193964
FOOD_AND_DRINK : 0.012433593308466146
EDUCATION : 0.011642364643381937
ENTERTAINMENT : 0.009607776647451114
LIBRARIES_AND_DEMO : 0.009381711314569911
AUTO_AND_VEHICLES : 0.00926867864812931
HOUSE_AND_HOME : 0.008025319317282694
WEATHER : 0.007912286650842093
EVENTS : 0.007121057985

In [36]:
display_table(ios_free,11)

Games : 0.5825788323446769
Entertainment : 0.07836403371838901
Photo & Video : 0.049953168904152356
Education : 0.036840462066812366
Social Networking : 0.033093974399000935
Shopping : 0.025913206369029034
Utilities : 0.024664377146425227
Sports : 0.021542304089915705
Music : 0.020605682172962846
Health & Fitness : 0.020293474867311895
Productivity : 0.017483609116453323
Lifestyle : 0.015610365282547611
News : 0.013424914142990947
Travel : 0.012488292226038089
Finance : 0.010927255697783328
Weather : 0.008741804558226662
Food & Drink : 0.008117389946924758
Reference : 0.005307524196066188
Business : 0.005307524196066188
Book : 0.003746487667811427
Navigation : 0.0018732438339057135
Medical : 0.0018732438339057135
Catalogs : 0.001248829222603809


### Observation from genre table
In the App Store data, `Games` and `Entertainment` are the most common genres, while `FAMILY` and `GAME` are the most common categories on Google Play. Another pattern is that many of the most common genres are entertainment and fun based, such as games and photo and video apps.

Although the `Games` genre has the highest number of apps, that doesn't necessarily mean it has the highest number of users as well. At this stage we cannot suggest an app profile to our developers; further analysis is required.

## Finding average number of users per genre
The frequency tables we analyzed previously showed us that the App Store is dominated by apps designed for fun, while Google Play shows a more balanced landscape of both practical and fun apps. Now, we'd like to get an idea about the kind of apps with the most users.

One way to find out which genres are the most popular (have the most users) is to calculate the average number of installs for each app genre. For the Google Play data set, we can find this information in the `Installs` column, but this information is missing for the App Store data set. As a workaround, we'll take the total number of user ratings as a proxy, which we can find in the `rating_count_tot` column.
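
The per-genre average can be sketched as a single pass that accumulates a running total and a count for each genre. The helper below is an illustration under stated assumptions: `average_by_group` and the `sample` rows are hypothetical, and the numeric column is assumed to hold plain numbers stored as strings.

```python
from collections import defaultdict


def average_by_group(rows, group_index, value_index):
    """Return {group: mean of the numeric column} in one pass over rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        group = row[group_index]
        totals[group] += float(row[value_index])
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical rows: app name, genre, rating count
sample = [
    ['App A', 'Navigation', '300'],
    ['App B', 'Music', '100'],
    ['App C', 'Music', '300'],
]

averages = average_by_group(sample, 1, 2)
for genre in sorted(averages, key=averages.get, reverse=True):
    print(genre, ':', averages[genre])
```

For the App Store data this would be called with the `prime_genre` index (11) and the `rating_count_tot` index (5).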

In [40]:
apple_genres = freq_table(ios_free,11)

In [41]:
def apple_genre_users(freq_table,dataset):
    avg_user_rating = []
    for genre in freq_table:
        total = 0
        len_genre = 0
        for app in dataset:
            genre_app = app[11]
            n_user_rating = float(app[5]) # user rating count is on index 5
            if (genre == genre_app):
                total += n_user_rating
                len_genre += 1
        avg_user_rating.append((float(total) / len_genre,genre))  
    # Sorting the results    
    avg_user_rating = sorted(avg_user_rating,reverse = True)
    for entry in avg_user_rating:
        print(entry[1],':',entry[0])

In [42]:
apple_genre_users(apple_genres,ios_free)

Navigation : 86090.33333333333
Reference : 79350.4705882353
Social Networking : 71548.34905660378
Music : 57326.530303030304
Weather : 52279.892857142855
Book : 46384.916666666664
Food & Drink : 33333.92307692308
Finance : 32367.02857142857
Photo & Video : 28441.54375
Travel : 28243.8
Shopping : 27230.734939759037
Health & Fitness : 23298.015384615384
Sports : 23008.898550724636
Games : 22886.36709539121
News : 21248.023255813954
Productivity : 21028.410714285714
Utilities : 19156.493670886077
Lifestyle : 16815.48
Entertainment : 14195.358565737051
Business : 7491.117647058823
Education : 7003.983050847458
Catalogs : 4004.0
Medical : 612.0


### Observation
`Navigation` leads the App Store when it comes to the average number of user ratings per genre. There is one problem with this genre, though: it is dominated by a few giants like Google Maps, so trying to enter this market could prove to be a bad and expensive move.

`Social Networking` is similarly dominated by Facebook, WhatsApp, Snapchat, Instagram, etc. `Music` looks like a better opportunity for our developers. After that, `Reference`, `Weather` and `Book` would not be bad choices either.

In [43]:
android_genres = freq_table(android_free,1)

In [44]:
def android_installs(freq_table,dataset):
    avg_installs_category = []
    for category in freq_table:  # use the argument, not the global android_genres
        total = 0
        len_category = 0
        for app in dataset:
            app_category = app[1]
            if (category == app_category):
                installs = app[5].replace('+','')
                installs = float(installs.replace(',',''))
                total += installs
                len_category+=1
        avg_installs_category.append((float(total) / len_category,category))
    # Sorting the results    
    avg_installs_category = sorted(avg_installs_category,reverse = True)
    for entry in avg_installs_category:
        print(entry[1],':',entry[0])
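
The string clean-up inside `android_installs` can be isolated in a small helper, which makes it easy to check on its own. `parse_installs` is a hypothetical name used only for this sketch.

```python
def parse_installs(text):
    """Convert a Play Store installs string such as '10,000+' to a float."""
    return float(text.replace('+', '').replace(',', ''))


parsed = [parse_installs(raw) for raw in ['10,000+', '1,000,000,000+', '0']]
print(parsed)
```

The trailing `+` suggests that each value is the lower edge of an open-ended bucket rather than an exact count, so the averages built from these numbers are best read as rough lower estimates.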

In [45]:
android_installs(android_genres,android_free)

COMMUNICATION : 38590581.08741259
VIDEO_PLAYERS : 24727872.452830188
SOCIAL : 23253652.127118643
PHOTOGRAPHY : 17840110.40229885
PRODUCTIVITY : 16787331.344927534
GAME : 15544014.51048951
TRAVEL_AND_LOCAL : 13984077.710144928
ENTERTAINMENT : 11640705.88235294
TOOLS : 10830251.970588235
NEWS_AND_MAGAZINES : 9549178.467741935
BOOKS_AND_REFERENCE : 8814199.78835979
SHOPPING : 7036877.311557789
PERSONALIZATION : 5201482.6122448975
WEATHER : 5145550.285714285
HEALTH_AND_FITNESS : 4188821.9853479853
MAPS_AND_NAVIGATION : 4049274.6341463416
FAMILY : 3697848.1731343283
SPORTS : 3650602.276666667
ART_AND_DESIGN : 1986335.0877192982
FOOD_AND_DRINK : 1924897.7363636363
EDUCATION : 1833495.145631068
BUSINESS : 1712290.1474201474
LIFESTYLE : 1446158.2238372094
FINANCE : 1387692.475609756
HOUSE_AND_HOME : 1360598.042253521
DATING : 854028.8303030303
COMICS : 832613.8888888889
AUTO_AND_VEHICLES : 647317.8170731707
LIBRARIES_AND_DEMO : 638503.734939759
PARENTING : 542603.6206896552
BEAUTY : 513151.886

In [51]:
def display_apps_android(category,dataset):
    count = 10
    for app in dataset:
        if (app[1] == category):
            print(app[0],':',app[5])
            count -= 1
        if count == 0:
            break

In [52]:
display_apps_android('COMMUNICATION',android_free)

WhatsApp Messenger : 1,000,000,000+
Messenger for SMS : 10,000,000+
My Tele2 : 5,000,000+
imo beta free calls and text : 100,000,000+
Contacts : 50,000,000+
Call Free – Free Call : 5,000,000+
Web Browser & Explorer : 5,000,000+
Browser 4G : 10,000,000+
MegaFon Dashboard : 10,000,000+
ZenUI Dialer & Contacts : 10,000,000+


In [53]:
display_apps_android('VIDEO_PLAYERS',android_free)

YouTube : 1,000,000,000+
All Video Downloader 2018 : 1,000,000+
Video Downloader : 10,000,000+
HD Video Player : 1,000,000+
Iqiyi (for tablet) : 1,000,000+
Video Player All Format : 10,000,000+
Motorola Gallery : 100,000,000+
Free TV series : 100,000+
Video Player All Format for Android : 500,000+
VLC for Android : 100,000,000+


In [57]:
display_apps_android('SOCIAL',android_free)

Facebook : 1,000,000,000+
Facebook Lite : 500,000,000+
Tumblr : 100,000,000+
Social network all in one 2018 : 100,000+
Pinterest : 100,000,000+
TextNow - free text + calls : 10,000,000+
Google+ : 1,000,000,000+
The Messenger App : 1,000,000+
Messenger Pro : 1,000,000+
Free Messages, Video, Chat,Text for Messenger Plus : 1,000,000+


In [59]:
display_apps_android('PHOTOGRAPHY',android_free)

TouchNote: Cards & Gifts : 1,000,000+
FreePrints – Free Photos Delivered : 1,000,000+
Groovebook Photo Books & Gifts : 500,000+
Moony Lab - Print Photos, Books & Magnets ™ : 50,000+
LALALAB prints your photos, photobooks and magnets : 1,000,000+
Snapfish : 1,000,000+
Motorola Camera : 50,000,000+
HD Camera - Best Cam with filters & panorama : 5,000,000+
LightX Photo Editor & Photo Effects : 10,000,000+
Sweet Snap - live filter, Selfie photo edit : 10,000,000+


In [60]:
display_apps_android('PRODUCTIVITY',android_free)

Microsoft Word : 500,000,000+
All-In-One Toolbox: Cleaner, Booster, App Manager : 10,000,000+
AVG Cleaner – Speed, Battery & Memory Booster : 10,000,000+
QR Scanner & Barcode Scanner 2018 : 10,000,000+
Chrome Beta : 10,000,000+
Microsoft Outlook : 100,000,000+
Google PDF Viewer : 10,000,000+
My Claro Peru : 5,000,000+
Power Booster - Junk Cleaner & CPU Cooler & Boost : 1,000,000+
Google Assistant : 10,000,000+
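
Note that `display_apps_android` prints the first ten matching rows in data set order, not the ten most installed apps. To see which apps lead a category, the rows can be sorted by install count first. The sketch below is one way to do that; `top_apps` and the `sample` rows are hypothetical, and the default column positions follow the Play Store layout (name 0, category 1, installs 5).

```python
def top_apps(dataset, category, n=10,
             name_index=0, category_index=1, installs_index=5):
    """Return the n most installed (name, installs) pairs in a category."""

    def installs(app):
        # '1,000+' -> 1000.0, same clean-up as in android_installs
        return float(app[installs_index].replace('+', '').replace(',', ''))

    in_category = [app for app in dataset if app[category_index] == category]
    ranked = sorted(in_category, key=installs, reverse=True)
    return [(app[name_index], app[installs_index]) for app in ranked[:n]]


# Hypothetical rows with placeholder values in the unused columns
sample = [
    ['App A', 'COMMUNICATION', '-', '-', '-', '1,000+'],
    ['App B', 'COMMUNICATION', '-', '-', '-', '1,000,000,000+'],
    ['App C', 'VIDEO_PLAYERS', '-', '-', '-', '500,000+'],
    ['App D', 'COMMUNICATION', '-', '-', '-', '10,000,000+'],
]

leaders = top_apps(sample, 'COMMUNICATION', n=2)
for name, installs_text in leaders:
    print(name, ':', installs_text)
```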


### Analysis
After checking out some applications in the top categories, it is clear that `COMMUNICATION` category is dominated by giants like `Whatsapp Messenger`, category of `VIDEO_PLAYERS` is dominated by giants like `Youtube`, same goes for `SOCIAL` which is dominated by 