# Analyzing Mobile App Data
For this project, we will be analyzing data for a fictional company that builds Android and iOS mobile apps. The company makes their apps available on Google Play and in the App Store.

They only build apps that are free to download and install, and their main source of revenue consists of in-app ads. This means that the number of users of their apps determines their revenue for any given app — the more users who see and engage with the ads, the better.

The goal for this project is to analyze data to help the company's developers understand what types of apps are likely to attract more users.

### Loading & Exploring the Datasets
We create functions to load and explore the datasets. Then we apply those functions to the datasets.

In [1]:
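def open_data(file):
    from csv import reader

    # Assumption: the CSV files are UTF-8 encoded.
    # The with block closes the file once it has been read.
    with open(file, encoding='utf8') as opened_file:
        read_file = reader(opened_file)
        data = list(read_file)

    return data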

In [2]:
# Load each dataset once, then split the header row from the data rows
ios_data = open_data('AppleStore.csv')
ios_h = ios_data[0]
ios_apps = ios_data[1:]
android_data = open_data('googleplaystore.csv')
android_h = android_data[0]
android_apps = android_data[1:]

In [3]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [4]:
# Explore the datasets
explore_data(ios_apps, 0, 4)
explore_data(android_apps, 0, 4)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESI

Let's examine the headers to see the columns relevant to our analysis.

In [5]:
# View Headers
print(ios_h)
print('\n')
print(android_h)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


### Cleaning the Dataset - Removing Errors & Duplicates
#### Deleting Errors
One row in the Android dataset was saved incorrectly: it is missing its 'Category' value, so every later value is shifted one column to the left. We will delete that row and then confirm that it has been removed.


In [6]:
# Review wrong row - old view
explore_data(android_apps, 10472, 10474)

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


['osmino Wi-Fi: free WiFi', 'TOOLS', '4.2', '134203', '4.1M', '10,000,000+', 'Free', '0', 'Everyone', 'Tools', 'August 7, 2018', '6.06.14', '4.4 and up']




In [7]:
# Delete wrong row and review dataset - new view
del android_apps[10472]
explore_data(android_apps, 10472, 10474)

['osmino Wi-Fi: free WiFi', 'TOOLS', '4.2', '134203', '4.1M', '10,000,000+', 'Free', '0', 'Everyone', 'Tools', 'August 7, 2018', '6.06.14', '4.4 and up']


['Sat-Fi Voice', 'COMMUNICATION', '3.4', '37', '14M', '1,000+', 'Free', '0', 'Everyone', 'Communication', 'November 21, 2014', '2.2.1.5', '2.2 and up']




#### Removing Duplicate Entries
The Android dataset has duplicate entries. We will create a function to iterate through the dataset and separate duplicate apps from unique apps.

<br> Criteria for removing duplicates

We will remove duplicate rows based on the number of reviews in the dataset. We assume that the higher the number of reviews, the more recent the entry. The most recent entry for each app will be retained and older entries will be removed.

<br> The remove_dups() function has 3 parameters: the dataset, the name index and the review index. It operates as follows:

* create a dictionary that maps each app name to its highest review count
* create 2 empty lists: one for the cleaned dataset of unique rows and one to track names already added to the first list
* iterate through the dataset and check whether the row's review count equals the highest count for that app AND the app is not yet in the names list
* if both conditions hold, append the row to the clean dataset and the name to the names list; the function returns both lists
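
The same idea can also be sketched as a single pass that keeps, for each app name, the row with the highest review count. This is only an illustrative alternative, not the function used in this notebook: the name `dedupe_by_max_reviews` and the sample rows are hypothetical, and the order of the returned rows may differ from what remove_dups() produces.

```python
def dedupe_by_max_reviews(dataset, name_index, review_index):
    """Keep one row per app name: the row with the highest review count."""
    best_rows = {}  # app name -> row with the most reviews seen so far
    for row in dataset:
        name = row[name_index]
        n_reviews = float(row[review_index])
        # Strict '>' keeps the first row when review counts tie.
        if name not in best_rows or n_reviews > float(best_rows[name][review_index]):
            best_rows[name] = row
    # Dictionaries preserve insertion order (Python 3.7+), so each app stays
    # at the position where its name first appeared.
    return list(best_rows.values())


# Hypothetical sample rows: [name, review count]
sample = [['App A', '10'], ['App B', '5'], ['App A', '12'], ['App B', '5']]
deduped = dedupe_by_max_reviews(sample, 0, 1)
print(deduped)
```

A dictionary lookup avoids searching a growing list of names for every row, which may matter on larger datasets.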

In [8]:
print(len(android_apps))
print(len(ios_apps))

10840
7197


In [9]:
# remove duplicate apps
def remove_dups(dataset, name_index, review_index):
    reviews_max = dict()
    for row in dataset:
        name = row[name_index]
        n_review = float(row[review_index])
        if name in reviews_max and reviews_max[name] < n_review:
            reviews_max[name] = n_review
        elif name not in reviews_max:
            reviews_max[name] = n_review

    android_clean_ds = list()
    already_added_ls = list()
    for row in dataset:
        name = row[name_index]
        n_review = float(row[review_index])
        if n_review == reviews_max[name] and name not in already_added_ls:
            android_clean_ds.append(row)
            already_added_ls.append(name)

    return android_clean_ds, already_added_ls

In [10]:
android_clean, already_added = remove_dups(android_apps, 0, 3)

In [11]:
print(len(android_clean))
print(android_clean[0:3])

9659
[['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up'], ['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up'], ['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']]


The iOS dataset should have no duplicates, as all id numbers are unique. We check this below by confirming that the row count is unchanged after running remove_dups().

In [12]:
print(len(ios_apps))
ios_clean, already_added = remove_dups(ios_apps, 0, 5)
print(len(ios_clean))

7197
7197


### Removing Non-English Apps

First we create a function that tests the Unicode code point of each character. Characters with code points above 127 fall outside the ASCII range and are not commonly used in English text.

In [13]:
def ord_check(a_string):
    for i in a_string:
        if ord(i) > 127:
            return False
    return True

In [14]:
print(ord_check('Instagram'))
print(ord_check('Docs To Go™ Free Office Suite'))
print(ord_check('Instachat 😜'))
print(ord_check('爱奇艺PPS -《欢乐颂2》电视剧热播'))

True
False
False
False


We amend this function so that a string is rejected only if it contains more than 3 non-ASCII characters. Names with up to 3 such characters (for example an emoji or a ™ symbol) are kept.

In [15]:
def ord_check(a_string):
    non_english = 0
    for i in a_string:
        if ord(i) > 127:
            non_english += 1
    if non_english > 3:
        return False
    else:
        return True

In [16]:
print(ord_check('Docs To Go™ Free Office Suite'))
print(ord_check('Instachat 😜'))
print(ord_check('爱奇艺PPS -《欢乐颂2》电视剧热播'))

True
True
False
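
As a compact sketch of the same check, the count of non-ASCII characters can be computed with `sum()` and compared against a configurable threshold. The name `is_english` and the `max_non_ascii` parameter are hypothetical and not part of the notebook; the default of 3 mirrors the amended ord_check() above.

```python
def is_english(name, max_non_ascii=3):
    """Return True if `name` has at most `max_non_ascii` characters outside ASCII."""
    non_ascii = sum(1 for char in name if ord(char) > 127)
    return non_ascii <= max_non_ascii


print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('Instachat 😜'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
```

Setting `max_non_ascii=0` would reproduce the stricter first version of the check.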


We apply this function to our datasets and separate the English apps from the non-English apps.

In [17]:
android_eng = list()
ios_eng = list()

for row in android_clean:
    name = row[0]
    if ord_check(name):
        android_eng.append(row)

for row in ios_clean:
    name = row[1]
    if ord_check(name):
        ios_eng.append(row)

Let's review the new datasets. The non-English filter removed 45 rows from the Android dataset and 1014 rows from the iOS dataset.

In [18]:
explore_data(android_eng, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9614
Number of columns: 13


In [19]:
explore_data(ios_eng, 0, 3, True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 6183
Number of columns: 16


### Isolating Free Apps
Since our goal is to focus on free apps, we will separate the free apps from each dataset using a function.

In [20]:
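def free_apps(dataset, price_index):
    # Prices are stored as strings: '0' in Google Play, '0.0' in the App Store
    free_prices = ('0', '0.0')
    free = list()  # named 'free' so it does not shadow the function name
    for row in dataset:
        price = row[price_index]
        if price in free_prices:
            free.append(row)

    return free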

We apply the function to our cleaned datasets.

In [21]:
ios_free = free_apps(ios_eng, 4)
android_free = free_apps(android_eng, 7)
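
Matching on exact strings works for the values seen in these datasets, but it would miss a free price written differently (for example '0.00'). A more tolerant check, sketched below, converts the price to a number first. The helper `is_free` is hypothetical, and the assumption that paid prices may carry a leading '$' is not confirmed by the output shown in this notebook.

```python
def is_free(price):
    """Return True if a price string such as '0', '0.0' or '0.00' is zero."""
    # Assumption: a paid price may be written with a leading '$', e.g. '$4.99'.
    return float(price.lstrip('$')) == 0.0


for price in ['0', '0.0', '0.00', '$4.99']:
    print(price, is_free(price))
```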

In [22]:
explore_data(ios_free, 0, 5, rows_and_columns=True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['284035177', 'Pandora - Music & Radio', '130242560', 'USD', '0.0', '1126879', '3594', '4.0', '4.5', '8.4.1', '12+', 'Music', '37', '4', '1', '1']


Number of rows: 3222
Number of columns: 16


In [23]:
explore_data(android_free, 0, 5, rows_and_columns=True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up']


Number of rows: 8864
Number of columns: 13


### Finding the Most Common Apps by Genre

Because our end goal is to add the most profitable apps on both Google Play and the App Store, we need to find app profiles that are successful in both markets.

To minimize risks and overhead, our validation strategy for an app idea has three steps:

1. Build a minimal Android version of the app, and add it to Google Play.
2. If the app has a good response from users, we develop it further.
3. If the app is profitable after six months, we build an iOS version of the app and add it to the App Store.

We'll begin the analysis by determining the most common genres for each market by building frequency tables for a few columns in our datasets.

In [24]:
# View Headers
print(ios_h)
print('\n')
print(android_h)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


In [25]:
def freq_table(data_set, index):
    frequency_table = dict()
    total = 0

    for row in data_set:
        total += 1
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1

    table_percentages = dict()
    for key in frequency_table:
        percentage = (frequency_table[key] / total) * 100
        table_percentages[key] = percentage

    return table_percentages

Then we convert the frequency table dictionary into a list of tuples and sort the list by frequency in descending order.

In [26]:
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = list()
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
    return table_sorted
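
For comparison, the standard library's `collections.Counter` can build and sort a frequency table in a few lines. This is a sketch of an alternative, not a replacement for the functions above; the name `freq_table_sorted` and the sample rows are hypothetical.

```python
from collections import Counter


def freq_table_sorted(dataset, index):
    """Return (value, percentage) pairs sorted by percentage, highest first."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    # most_common() yields (value, count) pairs from most to least frequent.
    return [(value, count / total * 100) for value, count in counts.most_common()]


# Hypothetical sample rows: [name, genre]
sample = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
for genre, pct in freq_table_sorted(sample, 1):
    print(genre, ':', pct)
```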

In [27]:
# prime genre ios
ios_gen = display_table(ios_free, 11)

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


In [28]:
# genre android
android_gen = display_table(android_free, 9)

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Finance : 3.7003610108303246
Medical : 3.531137184115524
Sports : 3.463447653429603
Personalization : 3.3167870036101084
Communication : 3.2378158844765346
Action : 3.1024368231046933
Health & Fitness : 3.0798736462093865
Photography : 2.944494584837545
News & Magazines : 2.7978339350180503
Social : 2.6624548736462095
Travel & Local : 2.3240072202166067
Shopping : 2.2450361010830324
Books & Reference : 2.1435018050541514
Simulation : 2.0419675090252705
Dating : 1.861462093862816
Arcade : 1.8501805054151623
Video Players & Editors : 1.7712093862815883
Casual : 1.7599277978339352
Maps & Navigation : 1.3989169675090252
Food & Drink : 1.2409747292418771
Puzzle : 1.128158844765343
Racing : 0.9927797833935018
Role Playing : 0.9363718411552346
Libraries & Demo : 0.9363718411552346
Auto & Vehicles : 0.9250902527075

In [29]:
# category android
android_cat = display_table(android_free, 1)

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

### Analysis of Most Common Apps by Genre
In this section, we will use the list of tuples generated above to answer the following questions:

<br>Analyze the frequency table you generated for the prime_genre column of the App Store dataset.

1. What is the most common genre? What is the next most common?
    a. Games, Entertainment
2. What other patterns do you see?
    a. Games make up a far larger share of the free apps than any other genre (about 58%).
    b. Entertainment-related genres (games, entertainment, photo & video) are more common among the free apps than genres such as education, lifestyle and productivity.
3. What is the general impression — are most of the apps designed for practical purposes (education, shopping, utilities, productivity, lifestyle) or more for entertainment (games, photo and video, social networking, sports, music)?
    a. More apps are designed for entertainment purposes than for practical purposes.
4. Can you recommend an app profile for the App Store market based on this frequency table alone? If there's a large number of apps for a particular genre, does that also imply that apps of that genre generally have a large number of users?
    a. An app profile for this market would emphasize entertainment-based apps, especially games, entertainment, photo & video and social networking apps.
    b. Not necessarily, since correlation does not equal causation. At first glance we might surmise that genres with many apps also have many users, but the frequency table alone cannot confirm this. Further analysis is required to reach a conclusion.

<br>Analyze the frequency table you generated for the Category and Genres column of the Google Play dataset.

1. What are the most common genres?
    a. Genres column: Tools, Entertainment
    b. Category column: Family, Game, Tools
2. What other patterns do you see?
    a. There is some overlap between the genre and category fields.
3. Compare the patterns you see for the Google Play market with those you saw for the App Store market.
    a. Entertainment is still important in this market but only secondarily. A more important genre is tools.
4. Can you recommend an app profile based on what you found so far? Do the frequency tables you generated reveal the most frequent app genres or what genres have the most users?
    a. An app profile with joint emphasis on tools and entertainment will work for this market.
    b. The frequency tables reveal the most frequent app genres. They do not reveal which genres have the most users, although we can hypothesize that the number of apps in a genre follows user activity.

### Most Popular Apps by Genre on the App Store
In order to find out which genres are the most popular on the App Store, we will:

* Isolate the apps of each genre
* Add up the number of user ratings (the rating_count_tot column) for the apps of that genre
* Divide the sum by the number of apps belonging to that genre (not by the total number of apps)

To calculate the average number of user ratings for each genre, we'll use a for loop inside another for loop, i.e. a nested loop.

In [30]:
# Generate frequency of the prime genre using the freq_table() function
iosgen_freq = freq_table(ios_free, 11)

In [31]:
for genre in iosgen_freq:
    total = 0
    len_genre = 0

    for row in ios_free:
        genre_app = row[11]
        if genre_app == genre:
            total += float(row[5]) # rating count total
            len_genre += 1

    avg_rating = total / len_genre
    print(genre, avg_rating)

Social Networking 71548.34905660378
Photo & Video 28441.54375
Games 22788.6696905016
Music 57326.530303030304
Reference 74942.11111111111
Health & Fitness 23298.015384615384
Weather 52279.892857142855
Utilities 18684.456790123455
Travel 28243.8
Shopping 26919.690476190477
News 21248.023255813954
Navigation 86090.33333333333
Lifestyle 16485.764705882353
Entertainment 14029.830708661417
Food & Drink 33333.92307692308
Sports 23008.898550724636
Book 39758.5
Finance 31467.944444444445
Education 7003.983050847458
Productivity 21028.410714285714
Business 7491.117647058823
Catalogs 4004.0
Medical 612.0
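
The nested loop scans the whole dataset once per genre. The same averages can be accumulated in a single pass, as in the sketch below. The function name `average_by_group` and the sample rows are hypothetical; the real App Store rows are 16 columns wide, so the indexes would be 11 and 5 rather than 1 and 2.

```python
from collections import defaultdict


def average_by_group(dataset, group_index, value_index):
    """Average the numeric column `value_index` for each value of `group_index`."""
    totals = defaultdict(float)  # group -> running sum of values
    counts = defaultdict(int)    # group -> number of rows seen
    for row in dataset:
        group = row[group_index]
        totals[group] += float(row[value_index])
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical sample rows: [name, genre, rating count]
sample = [['a', 'Games', '100'], ['b', 'Games', '300'], ['c', 'Music', '50']]
averages = average_by_group(sample, 1, 2)
print(averages)
```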


#### Analysis of the Most Popular Apps on the App Store
The 3 genres with the highest average number of user ratings are Navigation, Reference and Social Networking. The next 5 are Music, Weather, Book, Food & Drink and Finance.

Because a few very popular apps may dominate a genre, we will list the apps in each of these genres to check whether the average fairly represents the spread of ratings.

In [32]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Navigation':
        print(row[1], row[5])

Waze - GPS Navigation, Maps & Real-time Traffic 345046
Google Maps - Navigation & Transit 154911
Geocaching® 12811
CoPilot GPS – Car Navigation & Offline Maps 3582
ImmobilienScout24: Real Estate Search in Germany 187
Railway Route Search 5


Waze and Google Maps account for the bulk of the ratings in the Navigation genre, and there are not many apps in this genre. This may not be a good genre to include in our app profile, because users will favour the 2 major apps over others.
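
One way to put a number on this kind of concentration is the share of all ratings held by the top few apps in a genre. The helper `top_share` below is a hypothetical sketch and is not part of the original analysis; the rating counts are copied from the Navigation output above.

```python
def top_share(rating_counts, top_n=2):
    """Fraction of all ratings held by the `top_n` most-rated apps."""
    ordered = sorted(rating_counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0  # avoid dividing by zero when no app has ratings
    return sum(ordered[:top_n]) / total


# Rating counts for the Navigation genre, copied from the output above.
navigation = [345046, 154911, 12811, 3582, 187, 5]
share = top_share(navigation)
print(round(share, 3))
```

A share close to 1 suggests that the genre average mostly reflects its top apps rather than a typical app.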

In [33]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Reference':
        print(row[1], row[5])

Bible 985920
Dictionary.com Dictionary & Thesaurus 200047
Dictionary.com Dictionary & Thesaurus for iPad 54175
Google Translate 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition 17588
Merriam-Webster Dictionary 16849
Night Sky 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools 4693
GUNS MODS for Minecraft PC Edition - Mods Tools 1497
Guides for Pokémon GO - Pokemon GO News and Cheats 826
WWDC 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free 718
VPN Express 14
Real Bike Traffic Rider Virtual Reality Glasses 8
教えて!goo 0
Jishokun-Japanese English Dictionary & Translator 0


For the Reference genre, there are 2 major apps: Bible and Dictionary.com. However, other apps in the genre have significant user engagement (ratings in the tens of thousands). This might be a favourable genre for an app profile.

In [34]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Social Networking':
        print(row[1], row[5])

Facebook 2974676
Pinterest 1061624
Skype for iPhone 373519
Messenger 351466
Tumblr 334293
WhatsApp Messenger 287589
Kik 260965
ooVoo – Free Video Call, Text and Voice 177501
TextNow - Unlimited Text + Calls 164963
Viber Messenger – Text & Call 164249
Followers - Social Analytics For Instagram 112778
MeetMe - Chat and Meet New People 97072
We Heart It - Fashion, wallpapers, quotes, tattoos 90414
InsTrack for Instagram - Analytics Plus More 85535
Tango - Free Video Call, Voice and Chat 75412
LinkedIn 71856
Match™ - #1 Dating App. 60659
Skype for iPad 60163
POF - Best Dating App for Conversations 52642
Timehop 49510
Find My Family, Friends & iPhone - Life360 Locator 43877
Whisper - Share, Express, Meet 39819
Hangouts 36404
LINE PLAY - Your Avatar World 34677
WeChat 34584
Badoo - Meet New People, Chat, Socialize. 34428
Followers + for Instagram - Follower Analytics 28633
GroupMe 28260
Marco Polo Video Walkie Talkie 27662
Miitomo 23965
SimSimi 23530
Grindr - Gay and same sex guys chat, meet

We see that Social Networking is dominated by Facebook and Pinterest, which together have over 4 million ratings. However, smaller apps also have significant traction. This will be a good genre for an app profile.

In [35]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Music':
        print(row[1], row[5])

Pandora - Music & Radio 1126879
Spotify Music 878563
Shazam - Discover music, artists, videos & lyrics 402925
iHeartRadio – Free Music & Radio Stations 293228
SoundCloud - Music & Audio 135744
Magic Piano by Smule 131695
Smule Sing! 119316
TuneIn Radio - MLB NBA Audiobooks Podcasts Music 110420
Amazon Music 106235
SoundHound Song Search & Music Player 82602
Sonos Controller 48905
Bandsintown Concerts 30845
Karaoke - Sing Karaoke, Unlimited Songs! 28606
My Mixtapez Music 26286
Sing Karaoke Songs Unlimited with StarMaker 26227
Ringtones for iPhone & Ringtone Maker 25403
Musi - Unlimited Music For YouTube 25193
AutoRap by Smule 18202
Spinrilla - Mixtapes For Free 15053
Napster - Top Music & Radio 14268
edjing Mix:DJ turntable to remix and scratch music 13580
Free Music - MP3 Streamer & Playlist Manager Pro 13443
Free Piano app by Yokee 13016
Google Play Music 10118
Certified Mixtapes - Hip Hop Albums & Mixtapes 9975
TIDAL 7398
YouTube Music 7109
Nicki Minaj: The Empire 5196
Sounds app - M

The Music genre is also a good option, with a number of apps in the mid range of rating counts. However, there is significant competition among big-name brands in the Music genre.

In [36]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Weather':
        print(row[1], row[5])

The Weather Channel: Forecast, Radar & Alerts 495626
The Weather Channel App for iPad – best local forecast, radar map, and storm tracking 208648
WeatherBug - Local Weather, Radar, Maps, Alerts 188583
MyRadar NOAA Weather Radar Forecast 150158
AccuWeather - Weather for Life 144214
Yahoo Weather 112603
Weather Underground: Custom Forecast & Local Radar 49192
NOAA Weather Radar - Weather Forecast & HD Radar 45696
Weather Live Free - Weather Forecast & Alerts 35702
Storm Radar 22792
QuakeFeed Earthquake Map, Alerts, and News 6081
Moji Weather - Free Weather Forecast 2333
Hurricane by American Red Cross 1158
Forecast Bar 375
Hurricane Tracker WESH 2 Orlando, Central Florida 203
FEMA 128
iWeather - World weather forecast 80
Weather - Radar - Storm with Morecast App 78
Yurekuru Call 53
Weather & Radar 37
WRAL Weather Alert 25
Météo-France 24
JaxReady 22
Freddy the Frogcaster's Weather Station 14
Almanac Long-Range Weather Forecast 12
TodayAir 0
wetter.com 0
WarnWetter 0


The Weather genre may also be useful in the app profile because the rating count is spread fairly widely among the top 10 apps in the genre.

In [37]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Book':
        print(row[1], row[5])

Kindle – Read eBooks, Magazines & Textbooks 252076
Audible – audio books, original series & podcasts 105274
Color Therapy Adult Coloring Book for Adults 84062
OverDrive – Library eBooks and Audiobooks 65450
HOOKED - Chat Stories 47829
BookShout: Read eBooks & Track Your Reading Goals 879
Dr. Seuss Treasury — 50 best kids books 451
Green Riding Hood 392
Weirdwood Manor 197
MangaZERO - comic reader 9
ikouhoushi 0
MangaTiara - love comic reader 0
謎解き 0
謎解き2016 0


The Book genre may not be a good addition to the app profile, as a significant portion of the ratings belong to only 5 apps.

In [38]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Food & Drink':
        print(row[1], row[5])

Starbucks 303856
Domino's Pizza USA 258624
OpenTable - Restaurant Reservations 113936
Allrecipes Dinner Spinner 109349
DoorDash - Food Delivery 25947
UberEATS: Uber for Food Delivery 17865
Postmates - Food Delivery, Faster 9519
Dunkin' Donuts - Get Offers, Coupons & Rewards 9068
Chick-fil-A 5665
McDonald's 4050
Deliveroo: Restaurant Delivery - Order Food Nearby 1702
SONIC Drive-In 1645
Nowait Guest 1625
7-Eleven, Inc. 1356
Outback 805
Bon Appetit 750
Starbucks Keyboard 457
Whataburger 197
Delish Eatmoji Keyboard 154
Lieferheld - Delicious food delivery service 29
Lieferando.de 29
McDo France 22
Chefkoch - Rezepte, Kochen, Backen & Kochbuch 20
Youmiam 9
Marmiton Twist 2
Open Food Facts 1


The Food & Drink genre may not be a good genre, as the most-rated apps belong to big-name brands.

In [39]:
for row in ios_free:
    genre_app = row[11]
    if genre_app == 'Finance':
        print(row[1], row[5])

Chase Mobile℠ 233270
Mint: Personal Finance, Budget, Bills & Money 232940
Bank of America - Mobile Banking 119773
PayPal - Send and request money safely 119487
Credit Karma: Free Credit Scores, Reports & Alerts 101679
Capital One Mobile 56110
Citi Mobile® 48822
Wells Fargo Mobile 43064
Chase Mobile 34322
Square Cash - Send Money for Free 23775
Capital One for iPad 21858
Venmo 21090
USAA Mobile 19946
TaxCaster – Free tax refund calculator 17516
Amex Mobile 11421
TurboTax Tax Return App - File 2016 income taxes 9635
Bank of America - Mobile Banking for iPad 7569
Wells Fargo for iPad 2207
Stash Invest: Investing & Financial Education 1655
Digit: Save Money Without Thinking About It 1506
IRS2Go 1329
Capital One CreditWise - Credit score and report 1019
U by BB&T 790
Paribus - Rebates When Prices Drop 768
KeyBank Mobile 623
VyStar Mobile Banking for iPhone 434
Sparkasse - Your mobile branch 77
VyStar Mobile Banking for iPad 57
Zaim 44
Ma Banque 17
Lloyds Bank Mobile Banking 17
Suica 10
Hali

The Finance genre may not be a good genre either, as the most-rated apps belong to big-name brands.

#### Conclusion
An app profile for the App Store should include apps from the Reference, Social Networking, Music and Weather genres.

### Most Popular Apps by Genre on Google Play
We will be working with the Category column for the Android apps. Install numbers are stored as strings that contain the '+' and ',' characters (for example '10,000+'). In order to calculate average installs, we will remove these characters using the string replace() method and then convert the numbers to floats.

Then we will use the same nested loop structure from the previous section to calculate averages and perform analysis.
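
The cleaning step described above can be sketched as a small helper. The name `parse_installs` is hypothetical; the notebook performs the same two replace() calls inline.

```python
def parse_installs(installs):
    """Convert an install string such as '10,000+' to a float."""
    return float(installs.replace('+', '').replace(',', ''))


print(parse_installs('10,000+'))
print(parse_installs('50,000,000+'))
```

Note that an install value such as '10,000+' is a lower bound, so averages built from these values are approximate.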

In [42]:
# Generate frequency of the category field using the freq_table() function
andcat_freq = freq_table(android_free, 1)

In [45]:
for category in andcat_freq:
    total = 0
    len_genre = 0

    for row in android_free:
        category_app = row[1]
        if category_app == category:
            installs = row[5].replace('+', "").replace(',', '') # remove '+' and ','
            total += float(installs) # installs total
            len_genre += 1

    avg_rating = total / len_genre
    print(category, avg_rating)

ART_AND_DESIGN 1986335.0877192982
AUTO_AND_VEHICLES 647317.8170731707
BEAUTY 513151.88679245283
BOOKS_AND_REFERENCE 8767811.894736841
BUSINESS 1712290.1474201474
COMICS 817657.2727272727
COMMUNICATION 38456119.167247385
DATING 854028.8303030303
EDUCATION 1833495.145631068
ENTERTAINMENT 11640705.88235294
EVENTS 253542.22222222222
FINANCE 1387692.475609756
FOOD_AND_DRINK 1924897.7363636363
HEALTH_AND_FITNESS 4188821.9853479853
HOUSE_AND_HOME 1331540.5616438356
LIBRARIES_AND_DEMO 638503.734939759
LIFESTYLE 1437816.2687861272
GAME 15588015.603248259
FAMILY 3695641.8198090694
MEDICAL 120550.61980830671
SOCIAL 23253652.127118643
SHOPPING 7036877.311557789
PHOTOGRAPHY 17840110.40229885
SPORTS 3638640.1428571427
TRAVEL_AND_LOCAL 13984077.710144928
TOOLS 10801391.298666667
PERSONALIZATION 5201482.6122448975
PRODUCTIVITY 16787331.344927534
PARENTING 542603.6206896552
WEATHER 5074486.197183099
VIDEO_PLAYERS 24727872.452830188
NEWS_AND_MAGAZINES 9549178.467741935
MAPS_AND_NAVIGATION 4056941.774193

#### Analysis of the Most Popular Apps on the Google Play Store
The highest average review apps are from the following categories: Communication, Video Players, Social, Photography, Productivity, Game, Travel and Local, Entertainment, Tools.

Next, we review how the installs are spread across the apps in each of these categories.

In [47]:
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'COMMUNICATION':
        print(row[0], installs)

WhatsApp Messenger 1000000000.0
Messenger for SMS 10000000.0
My Tele2 5000000.0
imo beta free calls and text 100000000.0
Contacts 50000000.0
Call Free – Free Call 5000000.0
Web Browser & Explorer 5000000.0
Browser 4G 10000000.0
MegaFon Dashboard 10000000.0
ZenUI Dialer & Contacts 10000000.0
Cricket Visual Voicemail 10000000.0
TracFone My Account 1000000.0
Xperia Link™ 10000000.0
TouchPal Keyboard - Fun Emoji & Android Keyboard 10000000.0
Skype Lite - Free Video Call & Chat 5000000.0
My magenta 1000000.0
Android Messages 100000000.0
Google Duo - High Quality Video Calls 500000000.0
Seznam.cz 1000000.0
Antillean Gold Telegram (original version) 100000.0
AT&T Visual Voicemail 10000000.0
GMX Mail 10000000.0
Omlet Chat 10000000.0
My Vodacom SA 5000000.0
Microsoft Edge 5000000.0
Messenger – Text and Video Chat for Free 1000000000.0
imo free video calls and chat 500000000.0
Calls & Text by Mo+ 5000000.0
free video calls and chat 50000000.0
Skype - free IM & video calls 1000000000.0
Who 100000

From this initial look, a few apps have extremely high install counts, which may be skewing the average installs for the category. We will exclude apps with 100m or more installs and recalculate.

In [51]:
# calculate average installs for the less than 100m+ communication apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'COMMUNICATION' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
avg_installs

3603485.3884615386

If we remove the high-install apps, the average installs drop by roughly a factor of 10, from about 38.5m to 3.6m.
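The size of the drop can be checked directly from the two averages printed earlier. This is only a worked calculation; the variable names are introduced here for illustration.

```python
avg_all = 38456119.167247385          # COMMUNICATION average over all free apps
avg_under_100m = 3603485.3884615386   # average after excluding apps at 100m+ installs

# Ratio of the full average to the capped average.
drop_factor = avg_all / avg_under_100m
print(round(drop_factor, 1))
```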

In [57]:
# List of apps and installs for communication category where installs less than 100m

for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'COMMUNICATION' and installs < 100000000:
        print(row[0], installs)

Messenger for SMS 10000000.0
My Tele2 5000000.0
Contacts 50000000.0
Call Free – Free Call 5000000.0
Web Browser & Explorer 5000000.0
Browser 4G 10000000.0
MegaFon Dashboard 10000000.0
ZenUI Dialer & Contacts 10000000.0
Cricket Visual Voicemail 10000000.0
TracFone My Account 1000000.0
Xperia Link™ 10000000.0
TouchPal Keyboard - Fun Emoji & Android Keyboard 10000000.0
Skype Lite - Free Video Call & Chat 5000000.0
My magenta 1000000.0
Seznam.cz 1000000.0
Antillean Gold Telegram (original version) 100000.0
AT&T Visual Voicemail 10000000.0
GMX Mail 10000000.0
Omlet Chat 10000000.0
My Vodacom SA 5000000.0
Microsoft Edge 5000000.0
Calls & Text by Mo+ 5000000.0
free video calls and chat 50000000.0
Messaging+ SMS, MMS Free 1000000.0
chomp SMS 10000000.0
Glide - Video Chat Messenger 10000000.0
Text SMS 10000000.0
Talkray - Free Calls & Texts 10000000.0
GroupMe 10000000.0
mysms SMS Text Messaging Sync 1000000.0
2ndLine - Second Phone Number 1000000.0
Ninesky Browser 1000000.0
Dolphin Browser - Fa

However, many of the remaining apps still have significant install numbers in the 1m-10m range. This will be a good category for the app profile.
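To see how the apps are spread across install ranges, one option is to count the apps in each install bracket. The sketch below uses a few rows copied from the listing above, so the counts describe only this sample and not the whole category; `sample` and `bracket_counts` are illustrative names.

```python
from collections import Counter

# A handful of (app, installs) pairs copied from the listing above.
sample = [
    ('Messenger for SMS', 10000000.0),
    ('My Tele2', 5000000.0),
    ('Contacts', 50000000.0),
    ('TracFone My Account', 1000000.0),
    ('Antillean Gold Telegram (original version)', 100000.0),
    ('GMX Mail', 10000000.0),
    ('Seznam.cz', 1000000.0),
]

# Count how many apps report each install bracket.
bracket_counts = Counter(installs for _, installs in sample)

for installs, count in sorted(bracket_counts.items(), reverse=True):
    print(f'{installs:>12,.0f}+  {count} app(s)')
```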

In [54]:
for row in android_free:
    category = row[1]
    if category == 'VIDEO_PLAYERS':
        print(row[0], row[5])

YouTube 1,000,000,000+
All Video Downloader 2018 1,000,000+
Video Downloader 10,000,000+
HD Video Player 1,000,000+
Iqiyi (for tablet) 1,000,000+
Video Player All Format 10,000,000+
Motorola Gallery 100,000,000+
Free TV series 100,000+
Video Player All Format for Android 500,000+
VLC for Android 100,000,000+
Code 10,000,000+
Vote for 50,000,000+
XX HD Video downloader-Free Video Downloader 1,000,000+
OBJECTIVE 1,000,000+
Music - Mp3 Player 10,000,000+
HD Movie Video Player 1,000,000+
YouCut - Video Editor & Video Maker, No Watermark 5,000,000+
Video Editor,Crop Video,Movie Video,Music,Effects 1,000,000+
YouTube Studio 10,000,000+
video player for android 10,000,000+
Vigo Video 50,000,000+
Google Play Movies & TV 1,000,000,000+
HTC Service － DLNA 10,000,000+
VPlayer 1,000,000+
MiniMovie - Free Video and Slideshow Editor 50,000,000+
Samsung Video Library 50,000,000+
OnePlus Gallery 1,000,000+
LIKE – Magic Video Maker & Community 50,000,000+
HTC Service—Video Player 5,000,000+
Play Tube 1

We will apply the same analysis to the rest of the categories to see how much the high-install apps skew each average and to determine whether each category is a good addition to the app profile.
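Because the same filter-and-average step is repeated for every category, it could be wrapped in a function. The following is a minimal sketch, not the notebook's own code: `summarize_category`, its parameters, and `demo_rows` are hypothetical, and the demo rows fill only the columns the function reads (name at index 0, category at index 1, installs at index 5).

```python
def summarize_category(dataset, category, cap=100_000_000,
                       category_idx=1, installs_idx=5):
    """Return (average installs, matching rows) for apps in `category` below `cap`."""
    matches = []
    for row in dataset:
        installs = float(row[installs_idx].replace('+', '').replace(',', ''))
        if row[category_idx] == category and installs < cap:
            matches.append((row[0], installs))
    if not matches:
        # Avoid dividing by zero when no app passes the filter.
        return None, matches
    average = sum(installs for _, installs in matches) / len(matches)
    return average, matches


# Tiny stand-in for android_free: only columns 0, 1 and 5 carry real values.
demo_rows = [
    ['YouTube', 'VIDEO_PLAYERS', '', '', '', '1,000,000,000+'],
    ['Video Downloader', 'VIDEO_PLAYERS', '', '', '', '10,000,000+'],
    ['HD Video Player', 'VIDEO_PLAYERS', '', '', '', '1,000,000+'],
    ['Facebook', 'SOCIAL', '', '', '', '1,000,000,000+'],
]

average, apps = summarize_category(demo_rows, 'VIDEO_PLAYERS')
print(average, apps)
```

With the real data, the call would presumably look like `summarize_category(android_free, 'SOCIAL')`, which would replace each of the per-category cells with a single line.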

In [58]:
# calculate average installs for the less than 100m+ video player apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'VIDEO_PLAYERS' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for video player category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'VIDEO_PLAYERS' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  5544878.133333334
All Video Downloader 2018 1000000.0
Video Downloader 10000000.0
HD Video Player 1000000.0
Iqiyi (for tablet) 1000000.0
Video Player All Format 10000000.0
Free TV series 100000.0
Video Player All Format for Android 500000.0
Code 10000000.0
Vote for 50000000.0
XX HD Video downloader-Free Video Downloader 1000000.0
OBJECTIVE 1000000.0
Music - Mp3 Player 10000000.0
HD Movie Video Player 1000000.0
YouCut - Video Editor & Video Maker, No Watermark 5000000.0
Video Editor,Crop Video,Movie Video,Music,Effects 1000000.0
YouTube Studio 10000000.0
video player for android 10000000.0
Vigo Video 50000000.0
HTC Service － DLNA 10000000.0
VPlayer 1000000.0
MiniMovie - Free Video and Slideshow Editor 50000000.0
Samsung Video Library 50000000.0
OnePlus Gallery 1000000.0
LIKE – Magic Video Maker & Community 50000000.0
HTC Service—Video Player 5000000.0
Play Tube 1000000.0
Droid Zap by Motorola 5000000.0
video player 1000000.0
G Guide Program Guide (SOFTB

After excluding the high-install apps, the average installs drop from 24.7m to 5.5m, roughly a 4.5x reduction. However, nearly all the remaining apps have significant download numbers. This is definitely a category for the app profile.

In [59]:
for row in android_free:
    category = row[1]
    if category == 'SOCIAL':
        print(row[0], row[5])

Facebook 1,000,000,000+
Facebook Lite 500,000,000+
Tumblr 100,000,000+
Social network all in one 2018 100,000+
Pinterest 100,000,000+
TextNow - free text + calls 10,000,000+
Google+ 1,000,000,000+
The Messenger App 1,000,000+
Messenger Pro 1,000,000+
Free Messages, Video, Chat,Text for Messenger Plus 1,000,000+
Telegram X 5,000,000+
The Video Messenger App 100,000+
Jodel - The Hyperlocal App 1,000,000+
Hide Something - Photo, Video 5,000,000+
Love Sticker 1,000,000+
Web Browser & Fast Explorer 5,000,000+
LiveMe - Video chat, new friends, and make money 10,000,000+
VidStatus app - Status Videos & Status Downloader 5,000,000+
Love Images 1,000,000+
Web Browser ( Fast & Secure Web Explorer) 500,000+
SPARK - Live random video chat & meet new people 5,000,000+
Golden telegram 50,000+
Facebook Local 1,000,000+
Meet – Talk to Strangers Using Random Video Chat 5,000,000+
MobilePatrol Public Safety App 1,000,000+
💘 WhatsLov: Smileys of love, stickers and GIF 1,000,000+
HTC Social Plugin - Faceb

In [60]:
# calculate average installs for the less than 100m+ social apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'SOCIAL' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for social category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'SOCIAL' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  3084582.5201793723
Social network all in one 2018 100000.0
TextNow - free text + calls 10000000.0
The Messenger App 1000000.0
Messenger Pro 1000000.0
Free Messages, Video, Chat,Text for Messenger Plus 1000000.0
Telegram X 5000000.0
The Video Messenger App 100000.0
Jodel - The Hyperlocal App 1000000.0
Hide Something - Photo, Video 5000000.0
Love Sticker 1000000.0
Web Browser & Fast Explorer 5000000.0
LiveMe - Video chat, new friends, and make money 10000000.0
VidStatus app - Status Videos & Status Downloader 5000000.0
Love Images 1000000.0
Web Browser ( Fast & Secure Web Explorer) 500000.0
SPARK - Live random video chat & meet new people 5000000.0
Golden telegram 50000.0
Facebook Local 1000000.0
Meet – Talk to Strangers Using Random Video Chat 5000000.0
MobilePatrol Public Safety App 1000000.0
💘 WhatsLov: Smileys of love, stickers and GIF 1000000.0
HTC Social Plugin - Facebook 10000000.0
Quora 10000000.0
Kate Mobile for VK 10000000.0
Family GPS tracker 

The average installs for social apps under 100m is 3.1m, down from 23.3m. Social is nevertheless a good category, since the remaining apps still have significant install numbers.

In [61]:
for row in android_free:
    category = row[1]
    if category == 'PHOTOGRAPHY':
        print(row[0], row[5])

TouchNote: Cards & Gifts 1,000,000+
FreePrints – Free Photos Delivered 1,000,000+
Groovebook Photo Books & Gifts 500,000+
Moony Lab - Print Photos, Books & Magnets ™ 50,000+
LALALAB prints your photos, photobooks and magnets 1,000,000+
Snapfish 1,000,000+
Motorola Camera 50,000,000+
HD Camera - Best Cam with filters & panorama 5,000,000+
LightX Photo Editor & Photo Effects 10,000,000+
Sweet Snap - live filter, Selfie photo edit 10,000,000+
HD Camera - Quick Snap Photo & Video 1,000,000+
B612 - Beauty & Filter Camera 100,000,000+
Waterfall Photo Frames 1,000,000+
Photo frame 100,000+
Huji Cam 5,000,000+
Unicorn Photo 1,000,000+
HD Camera 5,000,000+
Makeup Editor -Beauty Photo Editor & Selfie Camera 1,000,000+
Makeup Photo Editor: Makeup Camera & Makeup Editor 1,000,000+
Moto Photo Editor 5,000,000+
InstaBeauty -Makeup Selfie Cam 50,000,000+
Garden Photo Frames - Garden Photo Editor 500,000+
Photo Frame 10,000,000+
Selfie Camera - Photo Editor & Filter & Sticker 50,000,000+
Sweet Snap Li

In [62]:
# calculate average installs for the less than 100m+ photography apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'PHOTOGRAPHY' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for photography category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'PHOTOGRAPHY' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  7670532.29338843
TouchNote: Cards & Gifts 1000000.0
FreePrints – Free Photos Delivered 1000000.0
Groovebook Photo Books & Gifts 500000.0
Moony Lab - Print Photos, Books & Magnets ™ 50000.0
LALALAB prints your photos, photobooks and magnets 1000000.0
Snapfish 1000000.0
Motorola Camera 50000000.0
HD Camera - Best Cam with filters & panorama 5000000.0
LightX Photo Editor & Photo Effects 10000000.0
Sweet Snap - live filter, Selfie photo edit 10000000.0
HD Camera - Quick Snap Photo & Video 1000000.0
Waterfall Photo Frames 1000000.0
Photo frame 100000.0
Huji Cam 5000000.0
Unicorn Photo 1000000.0
HD Camera 5000000.0
Makeup Editor -Beauty Photo Editor & Selfie Camera 1000000.0
Makeup Photo Editor: Makeup Camera & Makeup Editor 1000000.0
Moto Photo Editor 5000000.0
InstaBeauty -Makeup Selfie Cam 50000000.0
Garden Photo Frames - Garden Photo Editor 500000.0
Photo Frame 10000000.0
Selfie Camera - Photo Editor & Filter & Sticker 50000000.0
Sweet Snap Lite - live f

The average installs for photography apps under 100m is 7.7m, down from 17.8m. This is the smallest drop so far, meaning that high-install apps do not skew this category as much as the others. Photography is definitely a good category for the app profile.

In [63]:
for row in android_free:
    category = row[1]
    if category == 'PRODUCTIVITY':
        print(row[0], row[5])

Microsoft Word 500,000,000+
All-In-One Toolbox: Cleaner, Booster, App Manager 10,000,000+
AVG Cleaner – Speed, Battery & Memory Booster 10,000,000+
QR Scanner & Barcode Scanner 2018 10,000,000+
Chrome Beta 10,000,000+
Microsoft Outlook 100,000,000+
Google PDF Viewer 10,000,000+
My Claro Peru 5,000,000+
Power Booster - Junk Cleaner & CPU Cooler & Boost 1,000,000+
Google Assistant 10,000,000+
Microsoft OneDrive 100,000,000+
Calculator - unit converter 50,000,000+
Microsoft OneNote 100,000,000+
Metro name iD 10,000,000+
Google Keep 100,000,000+
Archos File Manager 5,000,000+
ES File Explorer File Manager 100,000,000+
ASUS SuperNote 10,000,000+
HTC File Manager 10,000,000+
MyMTN 1,000,000+
Dropbox 500,000,000+
ASUS Quick Memo 10,000,000+
HTC Calendar 10,000,000+
Google Docs 100,000,000+
ASUS Calling Screen 10,000,000+
lifebox 5,000,000+
Yandex.Disk 5,000,000+
Content Transfer 5,000,000+
HTC Mail 10,000,000+
Advanced Task Killer 50,000,000+
MyVodafone (India) - Online Recharge & Pay Bills 1

In [64]:
# calculate average installs for the less than 100m+ productivity apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'PRODUCTIVITY' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for productivity category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'PRODUCTIVITY' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  3379657.318885449
All-In-One Toolbox: Cleaner, Booster, App Manager 10000000.0
AVG Cleaner – Speed, Battery & Memory Booster 10000000.0
QR Scanner & Barcode Scanner 2018 10000000.0
Chrome Beta 10000000.0
Google PDF Viewer 10000000.0
My Claro Peru 5000000.0
Power Booster - Junk Cleaner & CPU Cooler & Boost 1000000.0
Google Assistant 10000000.0
Calculator - unit converter 50000000.0
Metro name iD 10000000.0
Archos File Manager 5000000.0
ASUS SuperNote 10000000.0
HTC File Manager 10000000.0
MyMTN 1000000.0
ASUS Quick Memo 10000000.0
HTC Calendar 10000000.0
ASUS Calling Screen 10000000.0
lifebox 5000000.0
Yandex.Disk 5000000.0
Content Transfer 5000000.0
HTC Mail 10000000.0
Advanced Task Killer 50000000.0
MyVodafone (India) - Online Recharge & Pay Bills 10000000.0
Microsoft Translator 5000000.0
My Airtel-Online Recharge, Pay Bill, Wallet, UPI 50000000.0
Do It Later: Tasks & To-Dos 50000000.0
Verizon Cloud 50000000.0
myAT&T 50000000.0
Hacker's Keyboard 10000

The average installs for productivity apps under 100m dropped to 3.4m from 16.8m. There is a significant number of very low-install apps; however, this category has a lot of app activity, which may explain this spread.

In [65]:
for row in android_free:
    category = row[1]
    if category == 'GAME':
        print(row[0], row[5])

Solitaire 10,000,000+
Sonic Dash 100,000,000+
PAC-MAN 100,000,000+
Bubble Witch 3 Saga 50,000,000+
Race the Traffic Moto 10,000,000+
Marble - Temple Quest 10,000,000+
Shooting King 10,000,000+
Geometry Dash World 10,000,000+
Jungle Marble Blast 5,000,000+
Roll the Ball® - slide puzzle 100,000,000+
Block Craft 3D: Building Simulator Games For Free 50,000,000+
Farm Fruit Pop: Party Time 1,000,000+
Love Balls 50,000,000+
Piano Tiles 2™ 100,000,000+
Pokémon GO 100,000,000+
Paint Hit 10,000,000+
Snake VS Block 50,000,000+
Rolly Vortex 10,000,000+
Woody Puzzle 1,000,000+
Stack Jump 10,000,000+
The Cube 5,000,000+
Extreme Car Driving Simulator 100,000,000+
Bricks n Balls 1,000,000+
The Fish Master! 1,000,000+
Color Road 10,000,000+
Draw In 10,000,000+
PLANK! 500,000+
Looper! 1,000,000+
Trivia Crack 100,000,000+
Will it Crush? 5,000,000+
Tomb of the Mask 5,000,000+
Baseball Boy! 10,000,000+
Hello Stars 10,000,000+
Tank Stars 10,000,000+
Hole.io 10,000,000+
Mini Golf King - Multiplayer Game 5,0

In [66]:
# calculate average installs for the less than 100m+ game apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'GAME' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for game category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'GAME' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  6272564.694894147
Solitaire 10000000.0
Bubble Witch 3 Saga 50000000.0
Race the Traffic Moto 10000000.0
Marble - Temple Quest 10000000.0
Shooting King 10000000.0
Geometry Dash World 10000000.0
Jungle Marble Blast 5000000.0
Block Craft 3D: Building Simulator Games For Free 50000000.0
Farm Fruit Pop: Party Time 1000000.0
Love Balls 50000000.0
Paint Hit 10000000.0
Snake VS Block 50000000.0
Rolly Vortex 10000000.0
Woody Puzzle 1000000.0
Stack Jump 10000000.0
The Cube 5000000.0
Bricks n Balls 1000000.0
The Fish Master! 1000000.0
Color Road 10000000.0
Draw In 10000000.0
PLANK! 500000.0
Looper! 1000000.0
Will it Crush? 5000000.0
Tomb of the Mask 5000000.0
Baseball Boy! 10000000.0
Hello Stars 10000000.0
Tank Stars 10000000.0
Hole.io 10000000.0
Mini Golf King - Multiplayer Game 5000000.0
Flip the Gun - Simulator Game 10000000.0
Mad Skills BMX 2 1000000.0
MMX Hill Dash 2 – Offroad Truck, Car & Bike Racing 1000000.0
Word Link 10000000.0
Last Day on Earth: Survival

The average installs for game apps under 100m is 6.3m, down from 15.6m. This is one of the smallest drops so far, meaning that high-install apps do not skew this category as much as some others. Game is definitely a good category for the app profile.

In [67]:
for row in android_free:
    category = row[1]
    if category == 'TRAVEL_AND_LOCAL':
        print(row[0], row[5])

trivago: Hotels & Travel 50,000,000+
Hopper - Watch & Book Flights 5,000,000+
TripIt: Travel Organizer 1,000,000+
Trip by Skyscanner - City & Travel Guide 500,000+
CityMaps2Go Plan Trips Travel Guide Offline Maps 1,000,000+
KAYAK Flights, Hotels & Cars 10,000,000+
World Travel Guide by Triposo 500,000+
Booking.com Travel Deals 100,000,000+
Hostelworld: Hostels & Cheap Hotels Travel App 1,000,000+
Google Trips - Travel Planner 5,000,000+
GPS Map Free 5,000,000+
GasBuddy: Find Cheap Gas 10,000,000+
Southwest Airlines 5,000,000+
AT&T Navigator: Maps, Traffic 10,000,000+
VZ Navigator 50,000,000+
KakaoMap - Map / Navigation 10,000,000+
AirAsia 10,000,000+
Expedia Hotels, Flights & Car Rental Travel Deals 10,000,000+
Goibibo - Flight Hotel Bus Car IRCTC Booking App 10,000,000+
Allegiant 1,000,000+
Amtrak 1,000,000+
JAL (Domestic and international flights) 1,000,000+
Flight & Hotel Booking App - ixigo 5,000,000+
VZ Navigator for Tablets 500,000+
TripAdvisor Hotels Flights Restaurants Attracti

In [68]:
# calculate average installs for the less than 100m+ travel & local apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'TRAVEL_AND_LOCAL' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for travel & local category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'TRAVEL_AND_LOCAL' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  2944079.6336633665
trivago: Hotels & Travel 50000000.0
Hopper - Watch & Book Flights 5000000.0
TripIt: Travel Organizer 1000000.0
Trip by Skyscanner - City & Travel Guide 500000.0
CityMaps2Go Plan Trips Travel Guide Offline Maps 1000000.0
KAYAK Flights, Hotels & Cars 10000000.0
World Travel Guide by Triposo 500000.0
Hostelworld: Hostels & Cheap Hotels Travel App 1000000.0
Google Trips - Travel Planner 5000000.0
GPS Map Free 5000000.0
GasBuddy: Find Cheap Gas 10000000.0
Southwest Airlines 5000000.0
AT&T Navigator: Maps, Traffic 10000000.0
VZ Navigator 50000000.0
KakaoMap - Map / Navigation 10000000.0
AirAsia 10000000.0
Expedia Hotels, Flights & Car Rental Travel Deals 10000000.0
Goibibo - Flight Hotel Bus Car IRCTC Booking App 10000000.0
Allegiant 1000000.0
Amtrak 1000000.0
JAL (Domestic and international flights) 1000000.0
Flight & Hotel Booking App - ixigo 5000000.0
VZ Navigator for Tablets 500000.0
HSL - Tickets, route planner and information 100000.

The average installs for travel and local apps under 100m dropped to 2.9m from 14.0m. However, a significant number of the apps below 100m installs have small install numbers. This may not be a good addition to the app profile.

In [69]:
for row in android_free:
    category = row[1]
    if category == 'ENTERTAINMENT':
        print(row[0], row[5])

Complete Spanish Movies 1,000,000+
Pluto TV - It’s Free TV 1,000,000+
Mobile TV 10,000,000+
TV+ 5,000,000+
Digital TV 5,000,000+
Motorola Spotlight Player™ 10,000,000+
Vigo Lite 5,000,000+
Hotstar 100,000,000+
Peers.TV: broadcast TV channels First, Match TV, TNT ... 5,000,000+
The green alien dance 1,000,000+
Spectrum TV 5,000,000+
H TV 5,000,000+
StarTimes - Live International Champions Cup 1,000,000+
Cinematic Cinematic 1,000,000+
MEGOGO - Cinema and TV 10,000,000+
Talking Angela 100,000,000+
DStv Now 5,000,000+
ivi - movies and TV shows in HD 10,000,000+
Radio Javan 1,000,000+
Talking Ginger 2 50,000,000+
Girly Lock Screen Wallpaper with Quotes 5,000,000+
🔥 Football Wallpapers 4K | Full HD Backgrounds 😍 1,000,000+
Movies by Flixster, with Rotten Tomatoes 10,000,000+
Low Poly – Puzzle art game 1,000,000+
BBC Media Player 10,000,000+
Amazon Prime Video 50,000,000+
Adult Glitter Color by Number Book - Sandbox Pages 1,000,000+
IMDb Movies & TV 100,000,000+
Twitch: Livestream Multiplayer

In [70]:
# calculate average installs for the less than 100m+ entertainment apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'ENTERTAINMENT' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for entertainment category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'ENTERTAINMENT' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  6118250.0
Complete Spanish Movies 1000000.0
Pluto TV - It’s Free TV 1000000.0
Mobile TV 10000000.0
TV+ 5000000.0
Digital TV 5000000.0
Motorola Spotlight Player™ 10000000.0
Vigo Lite 5000000.0
Peers.TV: broadcast TV channels First, Match TV, TNT ... 5000000.0
The green alien dance 1000000.0
Spectrum TV 5000000.0
H TV 5000000.0
StarTimes - Live International Champions Cup 1000000.0
Cinematic Cinematic 1000000.0
MEGOGO - Cinema and TV 10000000.0
DStv Now 5000000.0
ivi - movies and TV shows in HD 10000000.0
Radio Javan 1000000.0
Talking Ginger 2 50000000.0
Girly Lock Screen Wallpaper with Quotes 5000000.0
🔥 Football Wallpapers 4K | Full HD Backgrounds 😍 1000000.0
Movies by Flixster, with Rotten Tomatoes 10000000.0
Low Poly – Puzzle art game 1000000.0
BBC Media Player 10000000.0
Amazon Prime Video 50000000.0
Adult Glitter Color by Number Book - Sandbox Pages 1000000.0
Twitch: Livestream Multiplayer Games & Esports 50000000.0
Ziggo GO 1000000.0
YouTube Gamin

The average installs for entertainment apps under 100m is 6.1m, down from 11.6m. This is one of the smallest drops so far, meaning that high-install apps do not skew this category as much as some others. Entertainment is definitely a good category for the app profile.

In [71]:
for row in android_free:
    category = row[1]
    if category == 'TOOLS':
        print(row[0], row[5])

Google 1,000,000,000+
Google Translate 500,000,000+
Moto Display 10,000,000+
Motorola Alert 50,000,000+
Motorola Assist 50,000,000+
Moto Suggestions ™ 1,000,000+
Moto Voice 10,000,000+
Calculator 100,000,000+
Device Help 100,000,000+
Account Manager 100,000,000+
myMetro 10,000,000+
File Manager 50,000,000+
My Telcel 50,000,000+
Calculator - free calculator, multi calculator app 10,000,000+
ASUS Sound Recorder 10,000,000+
iWnn IME for Nexus 5,000,000+
Samsung Max - Data Savings & Privacy Protection 10,000,000+
Android TV Remote Service 1,000,000+
ZenUI Help 10,000,000+
Calculator - free calculator ,multi calculator app 100,000+
SHAREit - Transfer & Share 500,000,000+
ZenUI Keyboard – Emoji, Theme 10,000,000+
Files Go by Google: Free up space on your phone 10,000,000+
SD card backup 1,000,000+
Nokia mobile support 5,000,000+
File Manager -- Take Command of Your Files Easily 10,000,000+
Samsung Calculator 100,000,000+
Clear 10,000,000+
Phone 10,000,000+
HTC Lock Screen 10,000,000+
Gboard 

In [72]:
# calculate average installs for the less than 100m+ tools apps
apps_under100m = list()

for row in android_free:
    category_app = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))  # remove + sign
    if category_app == 'TOOLS' and installs < 100000000:  # limit installs to those below 100m
        apps_under100m.append(installs)

avg_installs = sum(apps_under100m) / len(apps_under100m)
print('Average installs for under 100m: ', avg_installs)

# List of apps and installs for tools category where installs less than 100m
for row in android_free:
    category = row[1]
    installs = float(row[5].replace('+', "").replace(',', ''))
    if category == 'TOOLS' and installs < 100000000:
        print(row[0], installs)

Average installs for under 100m:  3191461.128987517
Moto Display 10000000.0
Motorola Alert 50000000.0
Motorola Assist 50000000.0
Moto Suggestions ™ 1000000.0
Moto Voice 10000000.0
myMetro 10000000.0
File Manager 50000000.0
My Telcel 50000000.0
Calculator - free calculator, multi calculator app 10000000.0
ASUS Sound Recorder 10000000.0
iWnn IME for Nexus 5000000.0
Samsung Max - Data Savings & Privacy Protection 10000000.0
Android TV Remote Service 1000000.0
ZenUI Help 10000000.0
Calculator - free calculator ,multi calculator app 100000.0
ZenUI Keyboard – Emoji, Theme 10000000.0
Files Go by Google: Free up space on your phone 10000000.0
SD card backup 1000000.0
Nokia mobile support 5000000.0
File Manager -- Take Command of Your Files Easily 10000000.0
Clear 10000000.0
Phone 10000000.0
HTC Lock Screen 10000000.0
AT&T Smart Wi-Fi 10000000.0
Google app for Android TV 10000000.0
Sound Recorder: Recorder & Voice Changer Free 10000000.0
Remote Link (PC Remote) 10000000.0
HTC Sense Input 100000

The average installs for tools apps under 100m is 3.2m, down from 10.8m. The remaining apps with high install numbers are dominated by a number of brand-name apps. This may not be a good addition to the app profile.
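The before-and-after averages reported in this section can be collected in one place to compare how strongly each category depends on its largest apps. The sketch below copies the averages printed earlier; the `averages` and `drop` names are introduced here for illustration only.

```python
averages = {
    # category: (average over all free apps, average for apps under 100m installs)
    'COMMUNICATION': (38456119.167247385, 3603485.3884615386),
    'VIDEO_PLAYERS': (24727872.452830188, 5544878.133333334),
    'SOCIAL': (23253652.127118643, 3084582.5201793723),
    'PHOTOGRAPHY': (17840110.40229885, 7670532.29338843),
    'PRODUCTIVITY': (16787331.344927534, 3379657.318885449),
    'GAME': (15588015.603248259, 6272564.694894147),
    'TRAVEL_AND_LOCAL': (13984077.710144928, 2944079.6336633665),
    'ENTERTAINMENT': (11640705.88235294, 6118250.0),
    'TOOLS': (10801391.298666667, 3191461.128987517),
}

# Ratio of the full average to the capped average for each category.
drop = {cat: full / capped for cat, (full, capped) in averages.items()}

# Print categories from the smallest drop to the largest.
for cat in sorted(drop, key=drop.get):
    full, capped = averages[cat]
    print(f'{cat:<17} {full / 1e6:6.1f}m -> {capped / 1e6:4.1f}m  ({drop[cat]:.1f}x)')
```

A smaller ratio suggests that the category average relies less on a handful of very large apps.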

#### Conclusion
An app profile for the Google Play Store should include apps from the following categories: Communication, Video Players, Social, Photography, Game, and Entertainment. Special emphasis should be given to the Game, Entertainment, and Photography categories.

An app profile that works across both stores will emphasize social networking and entertainment apps. These are popular in both markets, both by genre and by installs.