# App Store Data

The purpose of this project is to collect data from the Apple App Store and the Google Play Store to gather information on various apps. We will then use it to look for trends and see what insights can be found that would help developers of free apps increase user counts and improve revenue.

## Importing and Exploring Data

We obtained data sets available online that satisfy the requirements of this project. While they don't include every app available on either platform, they provide a large enough sample for our purposes.
- Google Play data can be downloaded [here](https://dq-content.s3.amazonaws.com/350/googleplaystore.csv).
- Apple Store data can be downloaded [here](https://dq-content.s3.amazonaws.com/350/AppleStore.csv).

In [1]:
#Import Data
from csv import reader

# Google Play Store Data
opened_file = open('/Users/Cody/Documents/Datasets/googleplaystore.csv', encoding="utf8")
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]
android = android[1:]

# Apple Store Data
opened_file = open('/Users/Cody/Documents/Datasets/AppleStore.csv', encoding="utf8")
read_file = reader(opened_file)
ios = list(read_file)
ios_header = ios[0]
ios = ios[1:]
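
As a side note, the loading step above leaves both files open. A minimal sketch of a helper that closes the file automatically is shown below. The name `load_csv` and the temporary demo file are illustrative assumptions, not part of the original notebook.

```python
import csv
import os
import tempfile

def load_csv(path, encoding="utf8"):
    """Return (header, rows) for a CSV file; the file is closed on exit."""
    with open(path, encoding=encoding, newline="") as f:
        rows = list(csv.reader(f))
    return rows[0], rows[1:]

# Demonstration on a tiny temporary file (hypothetical data, not the real data set)
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False,
                                 encoding="utf8", newline="") as tmp:
    tmp.write("App,Category\nExample App,TOOLS\n")
    demo_path = tmp.name

demo_header, demo_rows = load_csv(demo_path)
os.remove(demo_path)  # clean up the temporary file

print(demo_header)
print(demo_rows)
```

With the real files, the call would presumably look like `android_header, android = load_csv(path_to_googleplaystore_csv)`, where the path variable is whatever location the file was saved to.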

In [2]:
# Function for exploring data
def explore_data(dataset, start, end, rows_and_columns=False):
    ''' This function takes 4 arguments and displays
    a spaced, sliced version of a list of lists (dataset)
    starting from (start) and finishing on (end).
    The last argument determines whether to print the size'''
    
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [3]:
# Explore each data set
print('Google Play')
print(android_header)
print('\n')
explore_data(android,0,4,True)

print('\n')
print('\n')
print('\n')

print('Apple')
print(ios_header)
print('\n')
explore_data(ios,0,4,True)

Google Play
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 10841
Number of columns: 13






Apple
['id', 'track_name', 'size_bytes', 'currenc

## Cleaning Data

From the online discussion we know that one row in the Google Play data is incorrect. We will verify and delete it, and search the rest of the data for potential errors.

In [4]:
print(android_header) #header
print('\n')
print(android[10472]) #incorrect data

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


In [5]:
print(len(android))
del android[10472] # Delete row: the missing Category shifts the columns, so Rating reads 19
print(len(android))

10841
10840


The discussions for the Apple Store data do not report any errors.
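
The faulty row printed above has 12 values against 13 header columns, which suggests a simple way to search for similar problems without relying on the discussion pages. The sketch below flags every row whose length differs from the header. The function name `find_malformed_rows` and the demo rows are hypothetical.

```python
def find_malformed_rows(header, rows):
    """Return (index, row) pairs whose column count differs from the header."""
    return [(i, row) for i, row in enumerate(rows) if len(row) != len(header)]

# Hypothetical mini data set: the second row is missing its 'Category' value
demo_header = ['App', 'Category', 'Rating']
demo_rows = [
    ['Good App', 'TOOLS', '4.1'],
    ['Shifted App', '1.9'],
]

malformed = find_malformed_rows(demo_header, demo_rows)
print(malformed)  # [(1, ['Shifted App', '1.9'])]
```

On the real data this would be called as `find_malformed_rows(android_header, android)` and `find_malformed_rows(ios_header, ios)`. It only catches rows with the wrong number of columns, not rows with wrong values.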

## Removing Duplicates

A quick loop through the data will reveal whether we have duplicate values.

In [6]:
unique_apps = []
duplicate_apps = []

for app in android:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)

print('Number of duplicates: ',len(duplicate_apps))
print('\n')
print('Examples:')
print(duplicate_apps[:4])

Number of duplicates:  1181


Examples:
['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings']



We can now see that we have a number of duplicate values. We can look at a few more in depth.

In [7]:
print(android_header)
print('\n')
for app in android:
    name = app[0]
    if name == 'Quick PDF Scanner + OCR FREE':
        print(app)
        print('\n')
        
for app in android:
    name = app[0]
    if name == 'ZOOM Cloud Meetings':
        print(app)
        print('\n')

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']


['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']


['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80804', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']


['ZOOM Cloud Meetings', 'BUSINESS', '4.4', '31614', '37M', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 20, 2018', '4.1.28165.0716', '4.0 and up']


['ZOOM Cloud Meetings', 'BUSINESS', '4.4', '31614', '37M', '10,000,000+', 'Free', '0', 'Everyone', 

The duplicates are nearly identical to each other, but vary slightly in the number of reviews. Since these entries were likely collected at slightly different times, we will remove duplicates but keep the entry with the most reviews, since it is the most likely to be up to date.

In [8]:
print('Expected Length: ', len(android)-1181) # Expected length after removal of duplicates

Expected Length:  9659


Next we will loop through the data to create a dictionary which will store the maximum number of reviews for each app. It will be named *max_reviews*. Underneath we will look at the length and check a known example for reassurance.

In [9]:
max_reviews = {}
for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if name not in max_reviews:
        max_reviews[name] = n_reviews
    elif max_reviews[name] < n_reviews:
        max_reviews[name] = n_reviews #assign the higher of the current value or the new value
    # Do nothing if max_reviews is greater than or equal to n_reviews
    
print('Length: ', len(max_reviews))
print('Quick PDF Scanner: ', max_reviews['Quick PDF Scanner + OCR FREE'])
print('Instagram: ', max_reviews['Instagram'])

Length:  9659
Quick PDF Scanner:  80805.0
Instagram:  66577446.0


The length is the same as expected. Spot checking the Quick PDF Scanner from the previous example, the maximum review count is the same as expected as well. Now we will use this dictionary to remove the duplicates.

The code below will create a new list, *android_clean*, which we will fill by looping through the data set and appending only unique apps which match the number of reviews stored in *max_reviews*.

In [10]:
android_clean = []
already_added = []

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if name not in already_added and n_reviews == max_reviews[name]:
        android_clean.append(app)
        already_added.append(name)
        
print('Length: ', len(android_clean))
print('\n')
print(android_header)
print('\n')
explore_data(android_clean, 0, 3)

Length:  9659


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']
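
The two-step approach above checks membership in a list, which gets slower as the list grows. As an alternative sketch (not the method used in this notebook), a single pass can store the best row per app name in a dictionary. The function name `dedupe_by_max_reviews` and the demo rows are hypothetical.

```python
def dedupe_by_max_reviews(rows, name_idx=0, reviews_idx=3):
    """Keep one row per app name: the first row with the highest review count."""
    best = {}
    for row in rows:
        name = row[name_idx]
        n_reviews = float(row[reviews_idx])
        # Replace the stored row only when a strictly higher count appears
        if name not in best or n_reviews > float(best[name][reviews_idx]):
            best[name] = row
    return list(best.values())

# Hypothetical shortened rows: name, category, rating, reviews
demo = [
    ['App A', 'BUSINESS', '4.2', '80805'],
    ['App A', 'BUSINESS', '4.2', '80804'],
    ['App B', 'TOOLS', '4.4', '31614'],
    ['App B', 'TOOLS', '4.4', '31614'],
]

deduped = dedupe_by_max_reviews(demo)
print(len(deduped))
print(deduped)
```

The rows kept should match the list-based version, but their order may differ: here rows are ordered by the first appearance of each app name.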




## Removing Non-English Apps

Since we are primarily concerned with English-speaking audiences for the apps we are looking at, we will remove apps that are targeted at other demographics.

The characters most frequently used in English text have ASCII values from 0 to 127. We will begin by writing a function that searches a string for characters outside of this range. If there are more than 3 of them, it will return *False*; otherwise it will return *True*. Allowing up to 3 keeps English app names that contain an occasional emoji or symbol such as ™.

In [11]:
def english_chars(a_string):
    count = 0
    for char in a_string:
        if ord(char) > 127:
            count += 1
        if count > 3:
            return False
    return True

In [12]:
# Test the new function
print(english_chars('Instagram'))
print(english_chars('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(english_chars('Docs To Go™ Free Office Suite'))
print(english_chars('Instachat 😜'))

True
False
True
True
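
To make the threshold of 3 easier to reason about, the sketch below counts the characters above code point 127 in each test name. The helper `count_non_ascii` is a hypothetical addition used only for illustration.

```python
def count_non_ascii(a_string):
    """Count the characters whose code point falls outside the ASCII range."""
    return sum(1 for char in a_string if ord(char) > 127)

print(count_non_ascii('Instagram'))                      # 0
print(count_non_ascii('Docs To Go™ Free Office Suite'))  # 1
print(count_non_ascii('Instachat 😜'))                   # 1
print(count_non_ascii('爱奇艺PPS -《欢乐颂2》电视剧热播'))
```

The names with a single symbol or emoji stay under the limit, while the last name is well over it.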


We will now use this function to create a new list of apps which is filtered further.

In [13]:
android_english = []
ios_english = []

for app in android_clean:
    name = app[0]
    if english_chars(name):
        android_english.append(app)
        
for app in ios:
    name = app[1]
    if english_chars(name):
        ios_english.append(app)

print('Google Play')
print(android_header)
print('\n')
explore_data(android_english, 0, 3, True)
print('\n')
print('\n')
print('Apple Store')
print(ios_header)
print('\n')
explore_data(ios_english, 0, 3, True)

Google Play
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9614
Number of columns: 13




Apple Store
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num',

## Removing Paid Apps

Since this project is meant to aid the development of free apps, we are going to focus the analysis on other free apps. We will loop through the data and save our final lists to *android_final* and *ios_final*.

In [14]:
android_final = []
ios_final = []

for app in android_english:
    price = app[7]
    if price == '0':
        android_final.append(app)

for app in ios_english:
    price = app[4]
    if price == '0.0':
        ios_final.append(app)

print('Google Play')
print(len(android_final))
print('Apple Store')
print(len(ios_final))

Google Play
8864
Apple Store
3222
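
The filter above compares price strings exactly (`'0'` for Google Play, `'0.0'` for the Apple Store), which works for this data but depends on how each file formats a zero price. A more tolerant sketch converts the price to a number first. The helper name `is_free` is hypothetical, and the assumption that paid prices may carry a leading `$` (for example `'$4.99'`) is not confirmed by the output shown here.

```python
def is_free(price):
    """Return True if a price string such as '0', '0.0' or '$4.99' equals zero."""
    cleaned = price.strip().lstrip('$')  # assumed formatting of paid prices
    try:
        return float(cleaned) == 0.0
    except ValueError:
        return False  # unparseable prices are treated as not free

print(is_free('0'), is_free('0.0'), is_free('$4.99'), is_free('Everyone'))
# True True False False
```

With this helper, the same check `if is_free(price):` could be used in both loops.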


## Most Common Apps by Genre

Our goal is to gain insights on the free app market in order to develop apps that will be more successful. The ad revenue generated by free apps is heavily dependent on the number of users.

We plan on going through a 3-step validation process for the developed app:
1. Build a minimal Android version of the app, and add it to Google Play.
2. If the app has a good response from users, we develop it further.
3. If the app is profitable after six months, we build an iOS version of the app and add it to the App Store.

Since we want to develop an app that can be successful on both markets, our analysis will use both data sets. We are going to start with frequency tables showing the number of apps in each genre.

In [15]:
# Function to generate frequency table
def freq_table(dataset, index):
    table = {}
    # Count number of values in each category
    for row in dataset:
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    
    # Convert counts to percentages
    for entry in table:
        table[entry] = table[entry] / len(dataset) * 100
        
    return table

In [16]:
# Function to display frequency tables in descending order
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
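
For comparison, the same percentage table can be sketched with `collections.Counter` from the standard library. This is an alternative to the two functions above, not a replacement used elsewhere in this notebook. The name `freq_table_counter` and the demo rows are hypothetical.

```python
from collections import Counter

def freq_table_counter(dataset, index):
    """Return {value: percentage of rows} for the column at the given index."""
    counts = Counter(row[index] for row in dataset)
    total = len(dataset)
    return {value: count / total * 100 for value, count in counts.items()}

# Hypothetical rows: name, category
demo = [['a', 'GAME'], ['b', 'GAME'], ['c', 'TOOLS'], ['d', 'FAMILY']]
table = freq_table_counter(demo, 1)

# Display in descending order of percentage
for value, pct in sorted(table.items(), key=lambda item: item[1], reverse=True):
    print(value, ':', pct)
```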

In [17]:
# Looking further into different columns that might be useful
print('Google Play: Category')
display_table(android_final, 1)
print('\n')
print('\n')
print('Google Play: Genre')
display_table(android_final, 9)
print('\n')
print('\n')
print('Apple Store: Prime Genre')
display_table(ios_final, 11)

Google Play: Category
FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.654332129963

A few takeaways from this look:
- The Apple Store--at least the free, English part--seems to be dominated by games. The runner-up is entertainment, which shows that there are a lot more apps designed for fun than for productivity.
- The Google Play store seems to have a much lower percentage of games, but the categorizations are much more granular. Games may fall under other categories as well (such as 'Family'). Still, without knowing the details about these apps, the split between fun apps and productive apps looks much more even, with categories like 'Tools', 'Business', 'Productivity', and 'Finance' all showing strong representation.
- In order to have more general categories and less granularity, we will use the 'Category' column in the Google Play data moving forward and disregard the 'Genres' column.

Having more apps in a category doesn't necessarily mean more users. Now we are going to inspect the number of downloads (Google) and number of ratings (Apple) to show us which categories likely have the most users.


**Apple Store**

In [18]:
ios_genre = freq_table(ios_final, 11)

ios_avg_ratings = {}
for genre in ios_genre:
    total = 0
    len_genre = 0
    for app in ios_final:
        n_ratings = float(app[5])
        genre_app = app[11]
        if genre_app == genre:
            total += n_ratings
            len_genre += 1
    ios_avg_ratings[genre] = total/len_genre
    print('Genre: ', genre)
    print('Total Ratings: ', total)
    print('Number of Apps: ', len_genre)
    print('Average Ratings: ', total/len_genre)
    print('\n')
    

Genre:  Social Networking
Total Ratings:  7584125.0
Number of Apps:  106
Average Ratings:  71548.34905660378


Genre:  Photo & Video
Total Ratings:  4550647.0
Number of Apps:  160
Average Ratings:  28441.54375


Genre:  Games
Total Ratings:  42705967.0
Number of Apps:  1874
Average Ratings:  22788.6696905016


Genre:  Music
Total Ratings:  3783551.0
Number of Apps:  66
Average Ratings:  57326.530303030304


Genre:  Reference
Total Ratings:  1348958.0
Number of Apps:  18
Average Ratings:  74942.11111111111


Genre:  Health & Fitness
Total Ratings:  1514371.0
Number of Apps:  65
Average Ratings:  23298.015384615384


Genre:  Weather
Total Ratings:  1463837.0
Number of Apps:  28
Average Ratings:  52279.892857142855


Genre:  Utilities
Total Ratings:  1513441.0
Number of Apps:  81
Average Ratings:  18684.456790123455


Genre:  Travel
Total Ratings:  1129752.0
Number of Apps:  40
Average Ratings:  28243.8


Genre:  Shopping
Total Ratings:  2261254.0
Number of Apps:  84
Average Ratings:  269

The genres that have the most ratings per app are:
- Navigation (86k)
- Reference (75k)
- Social Networking (72k)
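
A ranking like the one above can be read off by sorting the dictionary of averages. The sketch below uses a few averages copied (rounded) from the output shown earlier. Genres whose values are cut off in that output, including Navigation, are left out, so the result is only a partial ranking. The helper name `top_n` is hypothetical.

```python
# Averages copied from the visible part of the earlier output (rounded).
# Genres not visible there, such as Navigation, are deliberately omitted.
avg_ratings = {
    'Social Networking': 71548.35,
    'Photo & Video': 28441.54,
    'Games': 22788.67,
    'Music': 57326.53,
    'Reference': 74942.11,
    'Weather': 52279.89,
}

def top_n(averages, n=3):
    """Return the n (name, average) pairs with the highest averages."""
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]

for genre, avg in top_n(avg_ratings):
    print(genre, ':', round(avg))
```

In the notebook itself the call would presumably be `top_n(ios_avg_ratings)`, using the dictionary built in the cell above.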

All of these are potentially good candidates for an app. Reference might be an especially good genre to focus on, due to the lengthy development that would be necessary to create a competitive navigation app and the potential difficulty of reaching a wide enough scale to experience positive network effects in the Social Networking category.

In [19]:
print('Navigation')
for app in ios_final:
    if app[11] == 'Navigation':
        print(app[1]," : ", app[5])
print('\n')
print('Reference')
for app in ios_final:
    if app[11] == 'Reference':
        print(app[1]," : ", app[5])
print('\n')
print('Social Networking')
for app in ios_final:
    if app[11] == 'Social Networking':
        print(app[1]," : ", app[5])

Navigation
Waze - GPS Navigation, Maps & Real-time Traffic  :  345046
Google Maps - Navigation & Transit  :  154911
Geocaching®  :  12811
CoPilot GPS – Car Navigation & Offline Maps  :  3582
ImmobilienScout24: Real Estate Search in Germany  :  187
Railway Route Search  :  5


Reference
Bible  :  985920
Dictionary.com Dictionary & Thesaurus  :  200047
Dictionary.com Dictionary & Thesaurus for iPad  :  54175
Google Translate  :  26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran  :  18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition  :  17588
Merriam-Webster Dictionary  :  16849
Night Sky  :  12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE)  :  8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools  :  4693
GUNS MODS for Minecraft PC Edition - Mods Tools  :  1497
Guides for Pokémon GO - Pokemon GO News and Cheats  :  826
WWDC  :  762
Horror Maps for Minecraft PE - Download The Scari

While all of these genres are heavily influenced by the top apps, Navigation and Social Networking especially are dominated by a few apps, while a high percentage of the remaining apps struggle to get users. The Reference category shows some promise in this regard.


**Google Play**

While the Google Play data does provide information about the number of downloads, it is not precise or in the format we desire (see below). 

In [20]:
display_table(android_final, 5) #number of installs

1,000,000+ : 15.726534296028879
100,000+ : 11.552346570397113
10,000,000+ : 10.548285198555957
10,000+ : 10.198555956678701
1,000+ : 8.393501805054152
100+ : 6.915613718411552
5,000,000+ : 6.825361010830325
500,000+ : 5.561823104693141
50,000+ : 4.7721119133574
5,000+ : 4.512635379061372
10+ : 3.5424187725631766
500+ : 3.2490974729241873
50,000,000+ : 2.3014440433213
100,000,000+ : 2.1322202166064983
50+ : 1.917870036101083
5+ : 0.78971119133574
1+ : 0.5076714801444043
500,000,000+ : 0.2707581227436823
1,000,000,000+ : 0.22563176895306858
0+ : 0.04512635379061372
0 : 0.01128158844765343


Since this analysis does not require precise counts, we will remove the '+' and ',' characters and use the resulting rounded numbers for our analysis.
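
As a small illustration of that conversion on individual values, assuming install strings formatted like those in the table above (the helper name `parse_installs` is hypothetical):

```python
def parse_installs(installs):
    """Convert an install string such as '1,000,000+' to a float."""
    return float(installs.replace('+', '').replace(',', ''))

for raw in ['1,000,000+', '500+', '0']:
    print(raw, '->', parse_installs(raw))
# 1,000,000+ -> 1000000.0
# 500+ -> 500.0
# 0 -> 0.0
```

Each result is a lower bound: an app listed as `1,000,000+` has at least that many installs.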

In [21]:
android_category = freq_table(android_final, 1)

android_avg_downloads = {}
for cat in android_category:
    total = 0
    len_cat = 0
    for app in android_final:
        n_downloads = float(app[5].replace('+','').replace(',','')) #remove + and , signs from numbers
        cat_app = app[1]
        if cat_app == cat:
            total += n_downloads
            len_cat += 1
    android_avg_downloads[cat] = total/len_cat
    print('Category: ', cat)
    print('Total Downloads: ', total)
    print('Number of Apps: ', len_cat)
    print('Average Downloads: ', total/len_cat)
    print('\n')

Category:  ART_AND_DESIGN
Total Downloads:  113221100.0
Number of Apps:  57
Average Downloads:  1986335.0877192982


Category:  AUTO_AND_VEHICLES
Total Downloads:  53080061.0
Number of Apps:  82
Average Downloads:  647317.8170731707


Category:  BEAUTY
Total Downloads:  27197050.0
Number of Apps:  53
Average Downloads:  513151.88679245283


Category:  BOOKS_AND_REFERENCE
Total Downloads:  1665884260.0
Number of Apps:  190
Average Downloads:  8767811.894736841


Category:  BUSINESS
Total Downloads:  696902090.0
Number of Apps:  407
Average Downloads:  1712290.1474201474


Category:  COMICS
Total Downloads:  44971150.0
Number of Apps:  55
Average Downloads:  817657.2727272727


Category:  COMMUNICATION
Total Downloads:  11036906201.0
Number of Apps:  287
Average Downloads:  38456119.167247385


Category:  DATING
Total Downloads:  140914757.0
Number of Apps:  165
Average Downloads:  854028.8303030303


Category:  EDUCATION
Total Downloads:  188850000.0
Number of Apps:  103
Average Downloa

The categories that have the most downloads per app are:
- Communication (38.5 M)
- Video Players (24.7 M)
- Social (23.3 M)
- Photography (17.8 M)
- Productivity (16.8 M)
- Game (15.6 M)
- Travel and Local (14.0 M)
- Entertainment (11.6 M)

Let's inspect the top 3 below:

In [22]:
print('Communication')
for app in android_final:
    if app[1] == 'COMMUNICATION':
        print(app[0]," : ", app[5])
print('\n')
print('Video Players')
for app in android_final:
    if app[1] == 'VIDEO_PLAYERS':
        print(app[0]," : ", app[5])
print('\n')
print('Social')
for app in android_final:
    if app[1] == 'SOCIAL':
        print(app[0]," : ", app[5])
print('\n')

Communication
WhatsApp Messenger  :  1,000,000,000+
Messenger for SMS  :  10,000,000+
My Tele2  :  5,000,000+
imo beta free calls and text  :  100,000,000+
Contacts  :  50,000,000+
Call Free – Free Call  :  5,000,000+
Web Browser & Explorer  :  5,000,000+
Browser 4G  :  10,000,000+
MegaFon Dashboard  :  10,000,000+
ZenUI Dialer & Contacts  :  10,000,000+
Cricket Visual Voicemail  :  10,000,000+
TracFone My Account  :  1,000,000+
Xperia Link™  :  10,000,000+
TouchPal Keyboard - Fun Emoji & Android Keyboard  :  10,000,000+
Skype Lite - Free Video Call & Chat  :  5,000,000+
My magenta  :  1,000,000+
Android Messages  :  100,000,000+
Google Duo - High Quality Video Calls  :  500,000,000+
Seznam.cz  :  1,000,000+
Antillean Gold Telegram (original version)  :  100,000+
AT&T Visual Voicemail  :  10,000,000+
GMX Mail  :  10,000,000+
Omlet Chat  :  10,000,000+
My Vodacom SA  :  5,000,000+
Microsoft Edge  :  5,000,000+
Messenger – Text and Video Chat for Free  :  1,000,000,000+
imo free video ca

Omlet Arcade - Stream, Meet, Play  :  10,000,000+
VUE: video editor & camcorder  :  1,000,000+
Magisto Video Editor & Maker  :  10,000,000+
Dubsmash  :  100,000,000+
DU Recorder – Screen Recorder, Video Editor, Live  :  50,000,000+
KineMaster – Pro Video Editor  :  50,000,000+
Mobizen Screen Recorder for SAMSUNG  :  10,000,000+
Mobizen Screen Recorder for LG - Record, Capture  :  1,000,000+
M-Sight Pro  :  5,000+
Sketch 'n' go  :  100,000+
Q-See Plus  :  5,000+
Ustream  :  10,000,000+
VMate  :  50,000,000+
All Video Downloader  :  10,000,000+
VidPlay  :  1,000,000+
HD Video Downloader : 2018 Best video mate  :  50,000,000+
VivaVideo - Video Editor & Photo Movie  :  100,000,000+
VideoShow-Video Editor, Video Maker, Beauty Camera  :  100,000,000+
W Box VMS  :  10,000+
W Box VMS HD  :  5,000+
AB Repeat Player  :  100,000+
A-B repeater  :  5,000+
Ez Screen Recorder (no ad)  :  100,000+
Adobe Premiere Clip  :  5,000,000+
FilmoraGo - Free Video Editor  :  10,000,000+
ActionDirector Video Edi

Digi-TV.ch  :  10,000+
Students.ch  :  1,000+
CJ Gospel Hour  :  100+
Pekalongan CJ  :  0+
Hashtags For Likes.co  :  50,000+
CP Dialer  :  50,000+
C.P. CERVANTES (TOBARRA)  :  5+
Cyprus Police  :  10,000+
Rande.cz  :  50,000+
signály.cz  :  1,000+
DB Event App  :  5,000+
DC Comics Amino  :  1,000+
DF BugMeNot  :  500+
Noticias DF  :  1,000+
Periscope - Live Video  :  10,000,000+
Daddyhunt: Gay Dating  :  500,000+
Free phone calls, free texting SMS on free number  :  10,000,000+
Phone Tracker : Family Locator  :  10,000,000+
HOLLA Live: Meet New People via Random Video Chat  :  5,000,000+
Dating.dk  :  100,000+
DK Murali  :  500+
GirlTalk.dk  :  100+
+Download 4 Instagram Twitter  :  1,000,000+
DM Me - Chat  :  10,000+
DM for IG 😘 - Image & Video Saver for Instagram  :  5,000+
Auto DM for Twitter 🔥  :  1,000+
DM Storage (for twitter)  :  100+
Fake Chat (Direct Message)  :  500,000+
Otto DM  :  10+
DN Blog  :  10+
DP and Status for WhatsApp 2018  :  100,000+
Dp For Whatsapp  :  5,000+
Pr

All three of these categories are dominated by a few giants, and most of the apps do not appear to have very many users. (Perhaps this analysis should be repeated using median values instead of mean values.) Two categories that seem to fare rather well in this regard are 'Travel and Local' and 'Books and Reference'. Let's inspect these below:
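
Before that inspection, the median idea raised above could be sketched roughly as follows. This is not part of the original analysis. The rows are hypothetical and only mimic the column layout (category at index 1, installs at index 5), and the helper name `installs_by_category` is an assumption.

```python
from collections import defaultdict
from statistics import mean, median

def installs_by_category(rows, cat_idx=1, installs_idx=5):
    """Group parsed install counts by category in a single pass."""
    grouped = defaultdict(list)
    for row in rows:
        n = float(row[installs_idx].replace('+', '').replace(',', ''))
        grouped[row[cat_idx]].append(n)
    return grouped

# Hypothetical rows: one giant skews the mean of the first category
demo = [
    ['Giant Messenger', 'COMMUNICATION', '', '', '', '1,000,000,000+'],
    ['Small Chat', 'COMMUNICATION', '', '', '', '10,000+'],
    ['Tiny Chat', 'COMMUNICATION', '', '', '', '1,000+'],
    ['Reader A', 'BOOKS_AND_REFERENCE', '', '', '', '1,000,000+'],
    ['Reader B', 'BOOKS_AND_REFERENCE', '', '', '', '500,000+'],
    ['Reader C', 'BOOKS_AND_REFERENCE', '', '', '', '100,000+'],
]

grouped = installs_by_category(demo)
for cat, values in grouped.items():
    print(cat, '| mean:', mean(values), '| median:', median(values))
```

In this toy data the mean ranks the first category far ahead, while the median ranks it behind the second. That is the kind of reversal a median-based pass over `android_final` might reveal.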

In [23]:
print('Travel and Local')
for app in android_final:
    if app[1] == 'TRAVEL_AND_LOCAL':
        print(app[0]," : ", app[5])
print('\n')
print('Books and Reference')
for app in android_final:
    if app[1] == 'BOOKS_AND_REFERENCE':
        print(app[0]," : ", app[5])
print('\n')

Travel and Local
trivago: Hotels & Travel  :  50,000,000+
Hopper - Watch & Book Flights  :  5,000,000+
TripIt: Travel Organizer  :  1,000,000+
Trip by Skyscanner - City & Travel Guide  :  500,000+
CityMaps2Go Plan Trips Travel Guide Offline Maps  :  1,000,000+
KAYAK Flights, Hotels & Cars  :  10,000,000+
World Travel Guide by Triposo  :  500,000+
Booking.com Travel Deals  :  100,000,000+
Hostelworld: Hostels & Cheap Hotels Travel App  :  1,000,000+
Google Trips - Travel Planner  :  5,000,000+
GPS Map Free  :  5,000,000+
GasBuddy: Find Cheap Gas  :  10,000,000+
Southwest Airlines  :  5,000,000+
AT&T Navigator: Maps, Traffic  :  10,000,000+
VZ Navigator  :  50,000,000+
KakaoMap - Map / Navigation  :  10,000,000+
AirAsia  :  10,000,000+
Expedia Hotels, Flights & Car Rental Travel Deals  :  10,000,000+
Goibibo - Flight Hotel Bus Car IRCTC Booking App  :  10,000,000+
Allegiant  :  1,000,000+
Amtrak  :  1,000,000+
JAL (Domestic and international flights)  :  1,000,000+
Flight & Hotel Booking

Flybook  :  500,000+
All Maths Formulas  :  1,000,000+
Ancestry  :  5,000,000+
HTC Help  :  10,000,000+
English translation from Bengali  :  100,000+
Pdf Book Download - Read Pdf Book  :  100,000+
Free Book Reader  :  100,000+
eBoox new: Reader for fb2 epub zip books  :  50,000+
Only 30 days in English, the guideline is guaranteed  :  500,000+
Moon+ Reader  :  10,000,000+
SH-02J Owner's Manual (Android 8.0)  :  50,000+
English-Myanmar Dictionary  :  1,000,000+
Golden Dictionary (EN-AR)  :  1,000,000+
All Language Translator Free  :  1,000,000+
Azpen eReader  :  500,000+
URBANO V 02 instruction manual  :  100,000+
Bible  :  100,000,000+
C Programs and Reference  :  50,000+
C Offline Tutorial  :  1,000+
C Programs Handbook  :  50,000+
Amazon Kindle  :  100,000,000+
Aab e Hayat Full Novel  :  100,000+
Aldiko Book Reader  :  10,000,000+
Google I/O 2018  :  500,000+
R Language Reference Guide  :  10,000+
Learn R Programming Full  :  5,000+
R Programing Offline Tutorial  :  1,000+
Guide for 