# Analysing Mobile App Data 

## Description of the Project

We are part of a company that builds Android and iOS mobile apps. We make them available on Google Play and the App Store. 

We only build apps that are free to download and install, and our main source of revenue consists of in-app ads. This means that the number of users of our apps determines our revenue for any given app — the more users who see and engage with the ads, the better. Our goal for this project is to analyze data to help our developers understand what type of apps are likely to attract more users.

As of Q3 2022, there were approximately 1.6 million iOS apps available on the App Store, and 3.5 million Android apps on Google Play.

## Project Goal

The goal is to collect and analyse data about mobile apps to help developers understand what type of apps are likely to attract more users.

## Opening and Exploring the Data

Collecting data for over 5 million apps requires a significant amount of time and money, so we'll analyze a sample of the data instead. To avoid spending resources on collecting new data ourselves, we should first see if we can find any relevant existing data at no cost. Luckily, there are two data sets that seem suitable for our goals:

- [A dataset](https://www.kaggle.com/lava18/google-play-store-apps) containing data about approximately 10,000 Android apps from Google Play; the data was collected in August 2018.
- [A dataset](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps) containing data about approximately 7,000 iOS apps from the App Store; the data was collected in July 2017.

Importing the packages needed to do the analysis:

In [2]:
import pandas as pd
import re
import matplotlib.pyplot as plt
from IPython.display import display, HTML

Importing the csv files to lists:

In [3]:
from csv import reader

### The Google Play data set ###
# the with-statement closes the file once it has been read
with open('googleplaystore.csv', encoding='UTF8') as opened_file:
    android = list(reader(opened_file))
android_header = android[0]
android = android[1:]

### The App Store data set ###
with open('AppleStore.csv', encoding='UTF8') as opened_file:
    ios = list(reader(opened_file))
ios_header = ios[0]
ios = ios[1:]

Defining a function to explore the data sets:

In [4]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

Testing the functionality:

In [5]:
explore_data(android,0,2)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']




Looking at the structure of the headers:

In [6]:
print(android_header)
print('\n')
print(ios_header)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


## Cleaning the Data

Before the analysis, we correct or remove inaccurate data, remove duplicate entries, remove non-English apps and remove apps that aren't free.

### Correcting and/or Removing Inaccurate Data

A row with missing data has fewer elements than the header, so we compare the length of each row with the length of the header.

For the Android data:

In [7]:
## For each row in the android data set, if the length of the row is not equal to the length of the header, print the row and its index position.

android_header_length = len(android_header)
print(android_header_length)

# the header was already removed from the list, so we iterate over every row
for row in android:
    if len(row) != android_header_length:
        print(row)
        print("\n")
        print("Index position is:", android.index(row))

13
['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


Index position is: 10472


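This indicates that the row for the app "Life Made WI-Fi Touchscreen Photo Frame" is missing a value (its Category), so every later value sits one column too far to the left. Left in place, this row would most likely cause errors in later analyses: for example, its Reviews position holds '3.0M', which cannot be converted to a number.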
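
As a minimal sketch of the same check, the snippet below runs the length comparison on a tiny made-up list and uses `enumerate` to report positions. The helper name `find_bad_rows` and the three-column toy data are assumptions for illustration only. One reason to prefer `enumerate` is that `list.index` returns the position of the first row equal to the one searched for, which may not be the row being examined if the data contains identical rows.

```python
# Toy data: the second row lacks its Category value, so it is one element short
header = ['App', 'Category', 'Rating']
rows = [
    ['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5'],
    ['Life Made WI-Fi Touchscreen Photo Frame', '1.9'],
]


def find_bad_rows(rows, expected_length):
    """Return (index, row) pairs whose length differs from expected_length."""
    return [(i, row) for i, row in enumerate(rows) if len(row) != expected_length]


bad_rows = find_bad_rows(rows, len(header))
for index, row in bad_rows:
    print("Index position is:", index, row)
```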

For the iOS data:

In [8]:
## For each row in the ios data set, if the length of the row is not equal to the length of the header, print the row and its index position.

ios_header_length = len(ios_header)
print(ios_header_length)

# the header was already removed from the list, so we iterate over every row
for row in ios:
    if len(row) != ios_header_length:
        print(row)
        print("\n")
        print("Index position is:", ios.index(row))

16


Removing the incorrect row from the android data:

In [9]:
print(len(android))
del android[10472]
print(len(android))

10841
10840
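
Note that running the deletion cell a second time would delete whichever row now sits at index 10472. A minimal sketch of a guard against that is shown below; the helper name `drop_row_if_malformed` and the toy rows are hypothetical and only illustrate the idea of checking the row before deleting it.

```python
def drop_row_if_malformed(rows, index, expected_length):
    """Delete rows[index] only if its length is wrong; return True if deleted."""
    if index < len(rows) and len(rows[index]) != expected_length:
        del rows[index]
        return True
    return False


# Toy rows: the middle one is missing a value
rows = [['a', 'X', '1'], ['b', '2'], ['c', 'Y', '3']]

first_run = drop_row_if_malformed(rows, 1, 3)   # removes the short row
second_run = drop_row_if_malformed(rows, 1, 3)  # a re-run leaves the data alone
print(first_run, second_run, len(rows))
```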


### Removing Duplicate Entries

Some apps may appear more than once in each data set. To determine if this is true, we will use the app name as reference and add any duplicates to a separate list.

Determining the number of duplicate entries:

In [10]:
duplicate_apps = []
unique_apps = []

for app in android:
    
    ## the variable "name" is whatever is in the first element position
    name = app[0]
    
    ## if the "name" is already in the "unique_apps" list, then add it to the "duplicate_apps" list
    if name in unique_apps:
        duplicate_apps.append(name)
        
    ## else put the "name" into the "unique_apps" list
    else:
        unique_apps.append(name)

# print the length of each list
print(len(duplicate_apps))
print(len(unique_apps))        

1181
9659


Looking at example duplicates:

In [11]:
print(duplicate_apps[:4])

['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings']


Querying how many duplicates of one of these exist:

In [12]:
for app in android:
    if app[0] == 'Google My Business':
        print(app)

['Google My Business', 'BUSINESS', '4.4', '70991', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 24, 2018', '2.19.0.204537701', '4.4 and up']
['Google My Business', 'BUSINESS', '4.4', '70991', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 24, 2018', '2.19.0.204537701', '4.4 and up']
['Google My Business', 'BUSINESS', '4.4', '70991', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 24, 2018', '2.19.0.204537701', '4.4 and up']


For "Google My Business", there are three entries.

Discussion on Kaggle indicates that Instagram has multiple entries, but the number of reviews differs between them:

In [13]:
for app in android:
    if app[0] == 'Instagram':
        print(app)

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']


These duplicates need to be removed. For each app, we keep the entry with the highest number of reviews.

First, we build a dictionary that maps each unique app name to the highest review count among its entries:

In [101]:
reviews_max = {}
for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if name in reviews_max:
        if n_reviews > reviews_max[name]:
            reviews_max[name] = n_reviews        
    else:
        reviews_max[name] = n_reviews
        
print(len(reviews_max))

9659


Next, we use this dictionary to keep one entry per Android app (the one with the most reviews) and set the remaining duplicates aside:

In [102]:
android_clean = []
already_added = []
android_duplicates = []

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if (reviews_max[name] == n_reviews) and (name not in already_added):
        android_clean.append(app)
        already_added.append(name)
    else:
        android_duplicates.append(app)
        
print(len(android_clean))
print(len(already_added))
print(len(android_duplicates))

9659
9659
1181


Exploring the data set:

In [103]:
explore_data(android_clean, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9659
Number of columns: 13


Checking that the 'Instagram' duplicates are gone and that the remaining entry has the most reviews:

In [104]:
for app in android_clean:
    if app[0] == 'Instagram':
        print(app)

['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
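
The two-step approach above (first find the maximum review count per name, then filter) can also be sketched as a single pass that stores the best row seen so far for each name. This is an alternative sketch, not the method used in this notebook: the function name `keep_most_reviewed` is hypothetical, the toy rows are trimmed to four columns, and the values in the 'Box' row are made up. Unlike the cells above, the result is ordered by each name's first appearance.

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep one row per app name: the row with the highest review count."""
    best = {}  # app name -> best row seen so far
    for row in rows:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        # Strictly greater, so ties keep the first row encountered
        if name not in best or n_reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Toy rows (App, Category, Rating, Reviews)
sample = [
    ['Instagram', 'SOCIAL', '4.5', '66577313'],
    ['Instagram', 'SOCIAL', '4.5', '66577446'],
    ['Box', 'BUSINESS', '4.2', '159872'],
    ['Instagram', 'SOCIAL', '4.5', '66509917'],
]

deduplicated = keep_most_reviewed(sample)
for row in deduplicated:
    print(row)
```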


Determining the number of duplicate entries in the iOS data:

In [105]:
duplicate_ios_apps = []
unique_ios_apps = []

for app in ios:
    
    ## the first element of each iOS row is the app's 'id', which we use as the unique identifier
    app_id = app[0]
    
    ## if the id is already in the "unique_ios_apps" list, then add it to the "duplicate_ios_apps" list
    if app_id in unique_ios_apps:
        duplicate_ios_apps.append(app_id)
        
    ## else put the id into the "unique_ios_apps" list
    else:
        unique_ios_apps.append(app_id)

# print the length of each list
print(len(duplicate_ios_apps))
print(len(unique_ios_apps))        

0
7197


No 'id' value appears more than once, so there are no duplicate entries in the iOS data.

### Removing Non-English Apps

English text usually includes letters from the English alphabet, numbers composed of digits from 0 to 9, punctuation marks (., !, ?, ;), and other symbols (+, *, /).

The numbers corresponding to the characters we commonly use in English text are all in the range 0 to 127, according to the ASCII (American Standard Code for Information Interchange) system. A character with a number above this range is therefore a hint that the text may not be English.
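
To make the range concrete, Python's built-in `ord` returns the number (code point) of a single character. The short sketch below prints it for a few sample characters; the choice of characters is ours and purely illustrative.

```python
# Letters, digits and common symbols fall within 0-127; the rest do not
sample_characters = ['a', 'Z', '9', '+', '™', '爱', '😜']

code_points = {character: ord(character) for character in sample_characters}

for character, number in code_points.items():
    print(repr(character), number, 'ASCII' if number <= 127 else 'non-ASCII')
```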

In [19]:
def is_english(string):
    for character in string:
        if ord(character) > 127:
            return False
    return True
        
print(is_english('hello'))
print(is_english('爱奇艺PPS 欢乐颂2》电视剧热播'))

True
False


The problem with this function is that it also classifies names containing symbols such as a trademark sign or an emoji as non-English.

In [20]:
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('Instachat 😜'))

False
False


We need a way to separate these titles from genuinely non-English titles as best we can. As a criterion, we'll treat a name as non-English only if it contains more than three characters above the 127 range.

In [21]:
def is_english(string):
    non_ascii = 0
    
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
            
    # up to three non-ASCII characters are tolerated
    return non_ascii <= 3

print(is_english('Instachat 😜'))
print(is_english('爱奇艺PPS 欢乐颂2》电视剧热播'))

True
False
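
Because this rule is a heuristic, a quick look at its edge cases may help set expectations. The sketch below is a compact restatement of the function above with a hypothetical `max_non_ascii` parameter; the names '爱奇艺' and 'Ünïvërsïtät' are made-up test strings, not apps from the data sets. It suggests the rule can err in both directions: a short non-English name passes, while a Latin-script name with many accented letters is rejected.

```python
def is_english(string, max_non_ascii=3):
    """Return True if the string has at most max_non_ascii non-ASCII characters."""
    non_ascii = sum(1 for character in string if ord(character) > 127)
    return non_ascii <= max_non_ascii


examples = [
    'Docs To Go™ Free Office Suite',   # one non-ASCII symbol
    'Instachat 😜',                     # one emoji
    '爱奇艺',                            # exactly three non-ASCII characters
    'Ünïvërsïtät',                     # five accented letters
    '爱奇艺PPS 欢乐颂2》电视剧热播',        # many non-ASCII characters
]

results = {name: is_english(name) for name in examples}
for name, verdict in results.items():
    print(verdict, name)
```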


Using this function to filter the app lists down to English apps:

In [22]:
android_english = []
ios_english = []

for app in android_clean:
    name = app[0]   # 'App' column
    if is_english(name):
        android_english.append(app)

for app in ios:
    name = app[1]   # 'track_name' column
    if is_english(name):
        ios_english.append(app)
        
print(len(android_english))
print(len(ios_english))

9614
6183


### Isolating Free Apps

Filtering the app lists to only include those that are free:

In [34]:
android_free = []
ios_free = []

for app in android_english:
    price = app[7]
    if price == "0":
        android_free.append(app)

for app in ios_english:
    price = app[4]
    if price == "0.0":
        ios_free.append(app)

print(len(android_free))
print(len(ios_free))

8864
3222
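
Comparing prices as strings works for the values seen here ('0' on Google Play, '0.0' on the App Store) but depends on the exact text. A slightly more defensive sketch converts the price to a number first. The helper name `is_free` is hypothetical, and the `'$4.99'` format for paid apps is an assumption about the Google Play file rather than something shown above.

```python
def is_free(price_text):
    """Return True if a price string such as '0', '0.0' or '$4.99' equals zero."""
    cleaned = price_text.strip().lstrip('$')  # drop whitespace and a leading dollar sign
    return float(cleaned) == 0.0


price_checks = {price: is_free(price) for price in ['0', '0.0', '0.00', '$4.99', '1.99']}
print(price_checks)
```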


## Most Common Apps by Genre

Our aim is to determine the kinds of apps that are likely to attract more users. We start by finding the most common categories in each market.

Defining a function that builds a frequency table as a dictionary (key: category; value: percentage of apps):

In [111]:
def freq_table(data_set, index):
    frequency_table = {}
    total = 0
    
    for row in data_set:
        value = row[index]
        total += 1
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    
    table_percentages = {}
    for key in frequency_table:
        percentage = (frequency_table[key] / total) * 100
        table_percentages[key] = percentage
            
    return table_percentages

Defining a function that builds the frequency table for a given column and prints it in descending order of percentage:

In [69]:
def display_table(data_set, index):
    table = freq_table(data_set, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
    
    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1],':',entry[0])
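
For comparison, the same percentages can be sketched with `collections.Counter` from the standard library, whose `most_common` method already returns entries in descending order of count. This is an alternative sketch rather than the approach used in this notebook; the function name `freq_table_counter` and the toy rows are assumptions.

```python
from collections import Counter


def freq_table_counter(data_set, index):
    """Return {value: percentage} for one column, most frequent value first."""
    counts = Counter(row[index] for row in data_set)
    total = sum(counts.values())
    return {value: count / total * 100 for value, count in counts.most_common()}


# Toy rows (App, Category)
toy_rows = [
    ['App A', 'GAME'],
    ['App B', 'TOOLS'],
    ['App C', 'GAME'],
    ['App D', 'FAMILY'],
]

toy_table = freq_table_counter(toy_rows, 1)
for category, percentage in toy_table.items():
    print(category, ':', percentage)
```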

Looking at the categories on Google Play with the largest percentage of apps:

In [90]:
display_table(android_free, 1)

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

Looking at the categories on the App Store with the largest percentage of apps:

In [89]:
display_table(ios_free, 11)

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


To determine the average number of installs for apps in each category on the Google Play store:

In [116]:
categories_android = freq_table(android_free, 1)


## for each category in the category list
    ## go through the master list
        ## if the category of the app in the master list is the same as the category in the categories list
            ## grab its number of installs, make it a float, add the number of installs to the total variable, increase the count for the category by one
            ## once the end of the list is reached, it calculates the average installs, prints this number, then restarts the process for the next category

for category in categories_android:
    total = 0 ## total installs
    len_category = 0
    for app in android_free:
        category_app = app[1]
        if category_app == category:
            n_installs = app[5]
            n_installs = n_installs.replace(',', '')
            n_installs = n_installs.replace('+', '')
            total += float(n_installs)
            len_category += 1
    avg_n_installs = round(total / len_category)
    print(category, ':', avg_n_installs)

ART_AND_DESIGN : 1986335
AUTO_AND_VEHICLES : 647318
BEAUTY : 513152
BOOKS_AND_REFERENCE : 8767812
BUSINESS : 1712290
COMICS : 817657
COMMUNICATION : 38456119
DATING : 854029
EDUCATION : 1833495
ENTERTAINMENT : 11640706
EVENTS : 253542
FINANCE : 1387692
FOOD_AND_DRINK : 1924898
HEALTH_AND_FITNESS : 4188822
HOUSE_AND_HOME : 1331541
LIBRARIES_AND_DEMO : 638504
LIFESTYLE : 1437816
GAME : 15588016
FAMILY : 3695642
MEDICAL : 120551
SOCIAL : 23253652
SHOPPING : 7036877
PHOTOGRAPHY : 17840110
SPORTS : 3638640
TRAVEL_AND_LOCAL : 13984078
TOOLS : 10801391
PERSONALIZATION : 5201483
PRODUCTIVITY : 16787331
PARENTING : 542604
WEATHER : 5074486
VIDEO_PLAYERS : 24727872
NEWS_AND_MAGAZINES : 9549178
MAPS_AND_NAVIGATION : 4056942


Categories with the highest average number of installs (over 5,000,000) are:
- BOOKS_AND_REFERENCE : 8,767,812
- COMMUNICATION : 38,456,119
- ENTERTAINMENT : 11,640,706
- GAME : 15,588,016
- SOCIAL : 23,253,652
- SHOPPING : 7,036,877
- PHOTOGRAPHY : 17,840,110
- TRAVEL_AND_LOCAL : 13,984,078
- TOOLS : 10,801,391
- PERSONALIZATION : 5,201,483
- PRODUCTIVITY : 16,787,331
- WEATHER : 5,074,486
- VIDEO_PLAYERS : 24,727,872
- NEWS_AND_MAGAZINES : 9,549,178
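
The averages above are computed by looping over all apps once per category. As a sketch of an alternative, the version below parses the install strings in a small helper and accumulates totals and counts for every category in a single pass. The names `parse_installs` and `average_installs_by_category` are hypothetical, and the toy rows are trimmed to three columns. Since values such as '10,000+' are lower bounds, the resulting averages should be read as rough estimates.

```python
from collections import defaultdict


def parse_installs(text):
    """Convert a string such as '1,000,000+' to the integer 1000000."""
    return int(text.replace(',', '').replace('+', ''))


def average_installs_by_category(rows, category_index=1, installs_index=5):
    """Return {category: rounded average installs} using one pass over rows."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        category = row[category_index]
        totals[category] += parse_installs(row[installs_index])
        counts[category] += 1
    return {category: round(totals[category] / counts[category]) for category in totals}


# Toy rows (App, Category, Installs)
toy_rows = [
    ['WhatsApp Messenger', 'COMMUNICATION', '1,000,000,000+'],
    ['imo free video calls and chat', 'COMMUNICATION', '500,000,000+'],
    ['Coloring book moana', 'ART_AND_DESIGN', '500,000+'],
]

toy_averages = average_installs_by_category(toy_rows, category_index=1, installs_index=2)
print(toy_averages)
```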

Listing the COMMUNICATION apps with more than 100,000,000 installs:

In [129]:
for app in android_free:
    category = app[1]
    if category == 'COMMUNICATION':
        n_installs = app[5]
        n_installs = n_installs.replace(',', '')
        n_installs = n_installs.replace('+', '')
        if float(n_installs) > 100000000:
            print(app)

['WhatsApp Messenger', 'COMMUNICATION', '4.4', '69119316', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 3, 2018', 'Varies with device', 'Varies with device']
['Google Duo - High Quality Video Calls', 'COMMUNICATION', '4.6', '2083237', 'Varies with device', '500,000,000+', 'Free', '0', 'Everyone', 'Communication', 'July 31, 2018', '37.1.206017801.DR37_RC14', '4.4 and up']
['Messenger – Text and Video Chat for Free', 'COMMUNICATION', '4.0', '56646578', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 1, 2018', 'Varies with device', 'Varies with device']
['imo free video calls and chat', 'COMMUNICATION', '4.3', '4785988', '11M', '500,000,000+', 'Free', '0', 'Everyone', 'Communication', 'June 8, 2018', '9.8.000000010501', '4.0 and up']
['Skype - free IM & video calls', 'COMMUNICATION', '4.1', '10484169', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 3, 2018', 'V

To determine the average number of ratings for apps in each category on the App Store:

In [117]:
categories_ios = freq_table(ios_free, 11)


## for each category in the category list
    ## go through the master list
        ## if the category of the app in the master list is the same as the category in the categories list
            ## grab its total number of ratings (rating_count_tot), make it a float, add it to the total variable, increase the count for the category by one
            ## once the end of the list is reached, it calculates the average number of ratings, prints this number, then restarts the process for the next category

for category in categories_ios:
    total = 0 ## total ratings
    len_category = 0
    for app in ios_free:
        category_app = app[11]
        if category_app == category:
            n_ratings = float(app[5])
            total += n_ratings
            len_category += 1
    avg_n_ratings = round(total / len_category)
    print(category, ':', avg_n_ratings)

Social Networking : 71548
Photo & Video : 28442
Games : 22789
Music : 57327
Reference : 74942
Health & Fitness : 23298
Weather : 52280
Utilities : 18684
Travel : 28244
Shopping : 26920
News : 21248
Navigation : 86090
Lifestyle : 16486
Entertainment : 14030
Food & Drink : 33334
Sports : 23009
Book : 39758
Finance : 31468
Education : 7004
Productivity : 21028
Business : 7491
Catalogs : 4004
Medical : 612


Categories with the highest average number of reviews (over 30,000) are:
- Social Networking : 71,548
- Music : 57,327
- Reference : 74,942
- Weather : 52,280
- Navigation : 86,090
- Food & Drink : 33,334
- Book : 39,758
- Finance : 31,468

Looking at these two lists in greater detail:

A few very popular apps, such as YouTube, Facebook, WhatsApp and Skype, appear to skew the averages in some categories.

Google Play / Average Installs
- BOOKS_AND_REFERENCE : 8,767,812
    - Skewed due to the Bible, Google Play Books and Amazon Kindle.
- COMMUNICATION : 38,456,119
    - Dominated by messaging apps such as WhatsApp, Messenger and Skype.
- ENTERTAINMENT : 11,640,706
    - A few major players, like Twitch and Netflix.
- GAME : 15,588,016
    - A highly saturated market.
- SOCIAL : 23,253,652
    - Dominated by social media apps.
- SHOPPING : 7,036,877
    - A few major players, like eBay and Amazon.
- PHOTOGRAPHY : 17,840,110
    - Google Photos has over 100,000,000 installs.
- TRAVEL_AND_LOCAL : 13,984,078
    - Google Maps skews average.
- TOOLS : 10,801,391
    - Google and Google Translate skew average.
- PERSONALIZATION : 5,201,483
    - There are no major players in this category, so it may be suitable for exploration.
- PRODUCTIVITY : 16,787,331
    - Skewed due to Microsoft Suite, Google Calendar, Dropbox and Google Drive.
- WEATHER : 5,074,486
    - There are no major players in this category, so it may be suitable for exploration.
- VIDEO_PLAYERS : 24,727,872
    - Skewed due to YouTube, Google Play Movies & TV, and MX Player.
- NEWS_AND_MAGAZINES : 9,549,178
    - Skewed due to Twitter, Google News and Flipboard.
    
App Store / Average Reviews
- Social Networking : 71,548
    - Skewed due to Facebook, Pinterest, Skype, Messenger and WhatsApp.
- Music : 57,327
    - Skewed due to Pandora, Spotify, Shazam and iHeartRadio.
- Reference : 74,942
    - Skewed due to the Bible.
- Weather : 52,280
    - No major players.
- Navigation : 86,090
    - Skewed due to Waze and Google Maps.
- Food & Drink : 33,334
    - The biggest apps based on reviews are Starbucks, Domino's Pizza, OpenTable and Allrecipes.
- Book : 39,758
    - Skewed due to Kindle and Audible.
- Finance : 31,468
    - The biggest apps based on reviews are Mint, PayPal, Bank of America and Chase Mobile.
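
Since a handful of very large apps pull many of these averages upward, a median can give a picture that is less sensitive to outliers. The sketch below is not part of the analysis above; it only illustrates the difference between the two measures on invented install counts, using `mean` and `median` from the standard `statistics` module.

```python
from statistics import mean, median

# Invented install counts for one category: a single giant app and four small ones
toy_installs = [1_000_000_000, 500_000, 100_000, 50_000, 10_000]

toy_mean = mean(toy_installs)      # pulled up by the giant app
toy_median = median(toy_installs)  # the middle value of the sorted counts

print('mean  :', toy_mean)
print('median:', toy_median)
```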

Listing the NEWS_AND_MAGAZINES apps with more than 100,000,000 installs:

In [159]:
for app in android_free:
    if app[1] == "NEWS_AND_MAGAZINES":
        n_installs = app[5]
        n_installs = n_installs.replace(',', '')
        n_installs = n_installs.replace('+', '')
        if float(n_installs) > 100000000:
            print(app)

['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11667403', 'Varies with device', '500,000,000+', 'Free', '0', 'Mature 17+', 'News & Magazines', 'August 6, 2018', 'Varies with device', 'Varies with device']
['Flipboard: News For Our Time', 'NEWS_AND_MAGAZINES', '4.4', '1284018', 'Varies with device', '500,000,000+', 'Free', '0', 'Everyone 10+', 'News & Magazines', 'August 3, 2018', 'Varies with device', 'Varies with device']
['Google News', 'NEWS_AND_MAGAZINES', '3.9', '878065', '13M', '1,000,000,000+', 'Free', '0', 'Teen', 'News & Magazines', 'August 1, 2018', '5.2.0', '4.4 and up']


Listing the App Store Catalogs apps with more than 1,000 reviews:

In [211]:
for app in ios_free:
    if app[11] == "Catalogs":
        n_reviews = app[5]
        n_reviews = n_reviews.replace(',', '')
        n_reviews = n_reviews.replace('+', '')
        if float(n_reviews) > 1000:
            print(app)

['955286870', 'CPlus for Craigslist app - mobile classifieds', '120219648', 'USD', '0.0', '13345', '2788', '5.0', '5.0', '3.0.0', '17+', 'Catalogs', '37', '5', '1', '1']
['1132217067', 'DRAGONS MODS FREE for Minecraft PC Game Edition', '86984704', 'USD', '0.0', '2027', '160', '4.0', '3.0', '1.1', '4+', 'Catalogs', '37', '4', '1', '1']


The following Personalisation apps have the greatest number of installs:

In [201]:
for app in android_free:
    if app[1] == "PERSONALIZATION":
        n_installs = app[5]
        n_installs = n_installs.replace(',', '')
        n_installs = n_installs.replace('+', '')
        if float(n_installs) > 90000000:
            print(app)

['ZEDGE™ Ringtones & Wallpapers', 'PERSONALIZATION', '4.6', '6466641', 'Varies with device', '100,000,000+', 'Free', '0', 'Teen', 'Personalization', 'July 19, 2018', 'Varies with device', 'Varies with device']
['CM Launcher 3D - Theme, Wallpapers, Efficient', 'PERSONALIZATION', '4.6', '6702776', '17M', '100,000,000+', 'Free', '0', 'Teen', 'Personalization', 'August 3, 2018', '5.41.0', '4.0.3 and up']
['APUS Launcher - Theme, Wallpaper, Hide Apps', 'PERSONALIZATION', '4.5', '5783441', '14M', '100,000,000+', 'Free', '0', 'Everyone', 'Personalization', 'August 6, 2018', '3.9.7', '4.0.3 and up']
['Hola Launcher- Theme,Wallpaper', 'PERSONALIZATION', '4.5', '3277209', '7.6M', '100,000,000+', 'Free', '0', 'Everyone', 'Personalization', 'May 9, 2018', '3.2.5', '4.0 and up']
['Backgrounds HD (Wallpapers)', 'PERSONALIZATION', '4.6', '2390185', 'Varies with device', '100,000,000+', 'Free', '0', 'Teen', 'Personalization', 'August 4, 2018', 'Varies with device', 'Varies with device']
['GO Keyboard 

There are no major brands in this list and, interestingly, the App Store doesn't have a large presence of personalisation apps. While these apps have only been shown to be popular on Google Play, there may be a market on the App Store for users who want to personalise their phones further.