# Analysing Mobile App Data

In this project we will analyse data to help our developers understand what type of apps are likely to attract more users.

Since the company only develop apps which are free to download and install, the main source of income is revenue of in-app ads, which means higher engagement with the ads equals higher revenue.

# Opening and Exploring the Data 

First we will define a function which we can reuse to explore the data. The function will print the defined rows and columns for the provided dataset. If the last parameter is passed it will also print the number of rows, columns and data points.

In [1]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))
        print('Number of data points:', len(dataset) * len(dataset[0]))

Let's get an idea of what our data looks like and it's size.

- Apple store data has less number of rows but more columns than the google play data.
- Google play data has almost 26k more data points. This is assuming all data points have a value.

In [28]:
from csv import reader

# open apple store data
opened_file = open('AppleStore.csv')
read_file = reader(opened_file)
apple_store_data = list(read_file)
opened_file.close()


# open google play data
opened_file = open('googleplaystore.csv')
read_file = reader(opened_file)
google_play_data = list(read_file)
opened_file.close()

# explore both data sets
print("Apple Data")
explore_data(apple_store_data[1:],0,3, True)
print()
print("Google Data")
explore_data(google_play_data[1:],0,3, True)




Apple Data
['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 7197
Number of columns: 16
Number of data points: 115152

Google Data
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+',

In [26]:
# print column names for each dataset
print('Apple data headers:')
print(apple_store_data[0])
print()
print('Google data headers:')
print(google_play_data[0])

Apple data headers:
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']

Google data headers:
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


Columns of interest:
- price / Price: to remove paid apps
- ipadSc_urls.num: do more screenshots result in more installs?
- prime_genre / Genres: to understand which genres are most popular
- size_bytes / Size: To understand if there is a correlation between app size and downloads
- cont_rating / Rating: Does higher content rating result in less or more installs?
- rating_count_tot: which apps have higher ratings, use this against other columns
- user_rating: 


# Deleting Wrong Data
The [discussions forums](https://www.kaggle.com/datasets/lava18/google-play-store-apps/discussion) suggest some missing data.

It's been suggested that there is at least one row with missing data causing value to shift over by one.

Let's loop through the google dataset to see if the length of any rows do not match the length of the header row. If they don't we will delete that row.

In [29]:
# Get number of columns in header row
header_len = len(google_play_data[0])
print(header_len)

# Iterate through data and print the index and row if row not same length as the number of cols in the header
# Delete the any rows with missing data
for index, row in enumerate(google_play_data):
    if len(row) != header_len:
        print('índex:'+str(index))
        print(row)
        print()
        print(str(index)+ ' row deleted and replaced with the below row\n')
        del google_play_data[index] # delete the record with missing data
        print(google_play_data[index]) # print the same row to check that it's gone.        

13
índex:10473
['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']

10473 row deleted and replaced with the below row

['osmino Wi-Fi: free WiFi', 'TOOLS', '4.2', '134203', '4.1M', '10,000,000+', 'Free', '0', 'Everyone', 'Tools', 'August 7, 2018', '6.06.14', '4.4 and up']


# Removing Duplicate Entries: Part One

First let's check for duplicate entries and print some of the apps identified as having duplicates

In [40]:
unique_apps = []
duplicate_apps = []

for row in google_play_data[1:]:
    app_name = row[0]
    if app_name in unique_apps:
        duplicate_apps.append(app_name)
    else:
        unique_apps.append(app_name)
        
print(len(unique_apps))
print(len(duplicate_apps))
 
print("Duplicate App Examples")
print(duplicate_apps[:10])

9659
1181
Duplicate App Examples
['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack']


There are 1181 duplicate entries.

Let's confirm they are duplicates by printing rows for a couple of the duplicate apps identified. We need to identify what columns we can use to identify the most recent record, if there are any differences.

In [39]:
print(google_play_data[0])
for row in google_play_data[1:]:
    name = row[0]
    if name == 'Quick PDF Scanner + OCR FREE':
        print(row)
    

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']
['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']
['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80804', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']


In [43]:
print(google_play_data[0])
for row in google_play_data[1:]:
    name = row[0]
    if name == 'Google Ads':
        print(row)
    

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
['Google Ads', 'BUSINESS', '4.3', '29313', '20M', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 30, 2018', '1.12.0', '4.0.3 and up']
['Google Ads', 'BUSINESS', '4.3', '29313', '20M', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 30, 2018', '1.12.0', '4.0.3 and up']
['Google Ads', 'BUSINESS', '4.3', '29331', '20M', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 30, 2018', '1.12.0', '4.0.3 and up']


The only column which appears to differ is 'Reveiws'. Let's use this to remove duplicates by keeping the duplicate where the number of reviews is highest.

# Removing Duplicate Entries: Part Two

First let's create dictionary of the max number of reviews for each app

In [47]:
reviews_max = {} # store the name of the app and the highest number of reviews

for row in google_play_data[1:]:
    name, n_reviews = row[0], float(row[3]) # store app name and number of reviews
    
    # Update reviews_max if this app's reviews are higher or if it doesn't exist yet
    if (name in reviews_max and reviews_max[name] < n_reviews) or name not in reviews_max:
        reviews_max[name] = n_reviews

print(len(reviews_max))

9659


Use the above dictionary to remove rows where the app name and reviews match, leaving us with unique records

In [53]:
# Add row to new clean list
android_clean = []

# make a note of what apps have been added to the cleaned data so that we don't 
# add them again if the reviews of two rows for the same app are equal
already_added = []

for row in google_play_data[1:]:
    name, n_reviews = row[0], float(row[3])
    
    if n_reviews == reviews_max[name] and name not in already_added:
        android_clean.append(row)
        already_added.append(name)
        
print(len(android_clean))
print(android_clean[1:6])

9659
[['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up'], ['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up'], ['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up'], ['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up'], ['Smoke Effect Photo Maker - Smoke Editor', 'ART_AND_DESIGN', '3.8', '178', '19M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'April 26, 2018', '1.1', '4.0.3 and up']]


Let's double check one of the app which had duplicate records again

In [55]:
print(google_play_data[0])
for row in android_clean:
    name = row[0]
    if name == 'Quick PDF Scanner + OCR FREE':
        print(row)
    

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']


We can see there is now only one record for the 'Quick PDF Scanner + OCR FREE' app instead of three 🥳

# Removing Non-English Apps: Part One
Our company uses English for the app we develop so we don't need to include non-english apps in our analysis. Let's try to remove non-english apps.

First we write a function which checks if string has non-english characters, if it has less than 3 non english characters return true identifying it as an english app which we will keep in our dataset.

We need to keep apps with some non-english alphabet characters to prevent discluding english apps which use certain symbols or emojis that fall outside of the english alphabet character window of 127.

In [57]:
def isEnglish(string):
    chars_not_eng = 0
    for char in string:
        char_num = ord(char)
        if char_num > 127:
            chars_not_eng += 1

    if chars_not_eng > 3:
        return False
    else:
        return True

Check that our function works correctly

In [61]:
print(isEnglish('火车票Pro for 12306')) # should be true
print(isEnglish('Docs To Go™ Free Office Suite')) # should be true
print(isEnglish('Instachat 😜')) # should be true
print(isEnglish('爱奇艺PPS -《欢乐颂2》电视剧热播')) # should be false

True
True
True
False


# Removing Non-English Apps: Part Two
Use the above function to remove non-english apps from the google play dataset

In [10]:
english_apple_apps = []
for row in apple_store_data:
    if isEnglish(row[1]):
        english_apple_apps.append(row)
    
print(len(english_apple_apps))
print(english_apple_apps[:5])

english_google_apps = []
for row in android_clean:
    if isEnglish(row[0]):
        english_google_apps.append(row)
    
    
print(len(english_google_apps))
print(english_google_apps[:5])



6184
[['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic'], ['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1'], ['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1'], ['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1'], ['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']]
9614
[['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up'], ['U Launcher 

After removing non-english apps from both datasets we now have 6184 rows in Apple Store data and 9614 in Google Play data

### Isolating the Free Apps
Remove non-free apps and check how many records we have left

In [11]:
free_eng_apple_apps = [] #4
free_eng_google_apps = [] #7

for row in english_apple_apps[1:]:
    price = float(row[4])
    if price > 0:
        free_eng_apple_apps.append(row)
        
for row in english_google_apps[0:]:
    type = row[6]
    if type == 'Free':
        free_eng_google_apps.append(row)
        
print(len(free_eng_apple_apps))
print(len(free_eng_google_apps))


2961
8863


### Most Common Apps by Genre: Part One

Since we want to publish the app on both the Apple and Google markets we need to find app profiles that are successful in both stores.

Our validation strategy involves a few steps:
1. Build a minimal android version of the app and add it to one of the market places, e.g. Google Play
2. If the app has a good repsonse from users, develop it further taking into account any user feedback.
3. If the app is poriftable after six months, build an iOS version of the app and publish to the App Store

#### Fields to analyse

Apple
- "prime_genre": Primary Genre
- "user_rating" : Average User Rating value (for all version)
- "rating_count_tot": User Rating counts (for all version)

Google
- Genres
- Rating
- Reviews - count of reviews - does this help?
- Installs

# Most Common Apps by Genre: Part Two

## Define frequency and display table functions

In [12]:
# frequency table
def freq_table(dataset, index):
    freq_table = {}
    for row in dataset:
        val = row[index]
        if val in freq_table:
            freq_table[val] += 1
        else:
            freq_table[val] = 1
    
    perc_table = {}
    for row in freq_table:
        perc = (freq_table[row] / len(dataset)) * 100
        perc_table[row] = perc

    return perc_table

# converts frequency table ot tuple and sorts in desc order
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
        

# Most Common Apps by Genre: Part Three

## Top genres on Apple Store

1. The top 3 most common genres for free apps on Apple Store are Games, Education and Entertainment
2. Games are significantly more common at over 50% indicating the focus is largely on func apps.
3. Generally it seems games, education and entertainment apps are more common than other apps. News, Medical, Shopping and Catalogs being the least popular.
4. Games being the most common type of app may suggest there is a high demand for these types of apps, but more analysis could be done to confirm this. 

In [13]:
# apple apps genre frequency table > prime_genre
display_table(free_eng_apple_apps, -5)

Games : 51.266464032421474
Education : 9.861533265788584
Entertainment : 6.5856129685916915
Photo & Video : 6.112799729821006
Utilities : 4.4579533941236065
Productivity : 3.7825059101654848
Health & Fitness : 3.3772374197906116
Music : 2.397838568051334
Lifestyle : 1.6210739614994936
Weather : 1.3846673421141507
Book : 1.3846673421141507
Business : 1.21580547112462
Sports : 1.1820330969267139
Reference : 1.1820330969267139
Navigation : 0.7429922323539345
Travel : 0.6754474839581223
Social Networking : 0.6754474839581223
Food & Drink : 0.60790273556231
Medical : 0.5065856129685917
News : 0.4728132387706856
Finance : 0.4390408645727794
Shopping : 0.033772374197906116
Catalogs : 0.033772374197906116


## Top Categories on Google Play

Genres in the google play data seem to be a more granular detail, since we are only looking at the bigger picture we will use the Category field as it's defintiive and more telling of the bigger picture. 

1. The top 3 most common apps on Google Play are Family, Games and Tools.
2. Games appear to be much less common on Google Play Store and overall a more balanced landscape of genres.
3. Education and photography are also less common on Google Play compared to Apple Store

In [14]:
# Category frequency table
display_table(free_eng_google_apps, 1)

FAMILY : 18.898792733837304
GAME : 9.725826469592688
TOOLS : 8.462146000225657
BUSINESS : 4.592124562789123
LIFESTYLE : 3.9038700214374367
PRODUCTIVITY : 3.8925871601038025
FINANCE : 3.7007785174320205
MEDICAL : 3.5315355974275078
SPORTS : 3.396141261423897
PERSONALIZATION : 3.317161232088458
COMMUNICATION : 3.2381812027530184
HEALTH_AND_FITNESS : 3.0802211440821394
PHOTOGRAPHY : 2.944826808078529
NEWS_AND_MAGAZINES : 2.798149610741284
SOCIAL : 2.6627552747376737
TRAVEL_AND_LOCAL : 2.335552296062281
SHOPPING : 2.245289405393208
BOOKS_AND_REFERENCE : 2.1437436533904997
DATING : 1.8616721200496444
VIDEO_PLAYERS : 1.7939749520478394
MAPS_AND_NAVIGATION : 1.399074805370642
FOOD_AND_DRINK : 1.241114746699763
EDUCATION : 1.1621347173643235
ENTERTAINMENT : 0.9590432133589079
LIBRARIES_AND_DEMO : 0.9364774906916393
AUTO_AND_VEHICLES : 0.9251946293580051
HOUSE_AND_HOME : 0.8236488773552973
WEATHER : 0.8010831546880289
EVENTS : 0.7108202640189552
PARENTING : 0.6544059573507841
ART_AND_DESIGN : 0

In [15]:
display_table(free_eng_google_apps, 9) # genres

Tools : 8.450863138892023
Entertainment : 6.070179397495204
Education : 5.348076272142616
Business : 4.592124562789123
Productivity : 3.8925871601038025
Lifestyle : 3.8925871601038025
Finance : 3.7007785174320205
Medical : 3.5315355974275078
Sports : 3.463838429425702
Personalization : 3.317161232088458
Communication : 3.2381812027530184
Action : 3.102786866749408
Health & Fitness : 3.0802211440821394
Photography : 2.944826808078529
News & Magazines : 2.798149610741284
Social : 2.6627552747376737
Travel & Local : 2.324269434728647
Shopping : 2.245289405393208
Books & Reference : 2.1437436533904997
Simulation : 2.042197901387792
Dating : 1.8616721200496444
Arcade : 1.8503892587160102
Video Players & Editors : 1.771409229380571
Casual : 1.7601263680469368
Maps & Navigation : 1.399074805370642
Food & Drink : 1.241114746699763
Puzzle : 1.128286133363421
Racing : 0.9928917973598104
Role Playing : 0.9364774906916393
Libraries & Demo : 0.9364774906916393
Auto & Vehicles : 0.9251946293580051
S

# Most Popular Apps by Genre on the App Store

Again, Games comes up top here with the highest number of user reviews, with Entertainment in second place. This further suggests developing an app in the games or entertainment genre is recommended.

In [16]:
prime_genres_ft = freq_table(free_eng_apple_apps, -5)

for genre in prime_genres_ft:
    total = 0 # number of ratings
    len_genre = 0 # number of apps specific to each genre
    
    for row in free_eng_apple_apps:
        genre_app = row[-5]
        if genre_app == genre:
            total += float(row[5])
            len_genre += 1

    avg_user_rating = total / len_genre
    print(genre,':', str(avg_user_rating))

Games : 6695.863636363636
Entertainment : 2131.5128205128203
Music : 2759.1971830985917
Photo & Video : 2531.519337016575
Health & Fitness : 2679.85
Business : 4043.472222222222
Weather : 3248.4146341463415
Utilities : 1326.6818181818182
News : 3872.3571428571427
Education : 640.972602739726
Reference : 2400.3714285714286
Productivity : 2247.9285714285716
Navigation : 1174.590909090909
Lifestyle : 902.7708333333334
Book : 320.4146341463415
Finance : 882.8461538461538
Sports : 253.74285714285713
Medical : 663.7333333333333
Travel : 602.95
Shopping : 2722.0
Food & Drink : 579.5
Social Networking : 393.0
Catalogs : 1309.0


A closer look at what games have the most ratings shows games which are colourful and simple to pickup and play, and have some multiplayer or social elements.

In [17]:
for app in free_eng_apple_apps:
    if app[-5] == 'Games':
        print(app[1], ':', app[5]) # print name and number of ratings

Fruit Ninja Classic : 698516
Clear Vision (17+) : 541693
Minecraft: Pocket Edition : 522012
Plants vs. Zombies : 426463
Doodle Jump : 395261
Draw Something : 360974
Infinity Blade : 326482
Geometry Dash : 266440
Tiny Wings : 219418
Traffic Rush : 213092
Plants vs. Zombies HD : 163598
Infinity Blade II : 153588
The Room : 143908
Plague Inc. : 143285
Where's My Water? : 131656
Call of Duty: Black Ops Zombies : 116601
Sunday Lawn : 107727
SCRABBLE Premium : 105776
Flick Home Run ! : 98851
Terraria : 98036
Scribblenauts Remix : 86127
Zombieville USA 2 : 79683
Trivia Crack (No Ads) : 77370
Solitaire by MobilityWare : 76720
Call of Duty: Zombies : 63943
Cut the Rope: Experiments : 63272
Phase 10 Pro - Play Your Friends! : 59155
The Sims 3 : 54408
HB2 PLUS : 54073
StickWars : 53821
BATTLE BEARS: Zombies! : 50710
Slayin : 45574
Ski Safari : 45121
Lane Splitter : 44567
Bloons TD 5 : 42078
Fieldrunners : 41633
Super Stickman Golf : 41446
Cat Physics : 40552
SCRABBLE Premium for iPad : 39678
Infi

In [18]:
for app in free_eng_apple_apps:
    if app[-5] == 'Business':
        print(app[1], ':', app[5]) # print name and number of ratings

Scanner Pro - PDF document scanner app with OCR : 31912
Splashtop Personal - Remote Desktop : 29376
TurboScan™ Pro - document & receipt scanner: scan multiple pages and photos to PDF : 28388
iScanner - PDF Document Scanner App : 8675
PDF Reader Pro Edition - Annotate,edit & sign PDFs : 7033
InstaLogo Logo Creator - Graphic design maker : 6263
Tiny Scanner+ - PDF scanner to scan document, receipt & fax : 5611
HotSchedules : 3292
ScanBizCards Business Card Reader : 3166
TapeACall Pro - Call Recorder For Phone Calls : 2951
CamCard - Business card scanner & reader : 2923
File Manager Pro App : 2154
Scanner For Me - PDF Scan with OCR for Documents : 1976
AudioNote - Notepad and Voice Recorder : 1756
SuperCam_Pro : 1399
PDF Converter - Convert Documents, Photos to PDF : 1294
Voice Translate Pro : 1158
Fax from iPhone - send fax app : 1039
Genius Scan+ - PDF Scanner : 916
OfficeSuite Pro (Mobile Office) : 896
Consultant Aide : 881
SamCard Pro-card reader&business card scanner&ocr : 875
ALON D

Business app ratings come in second, which are largely due to the scanner and remote desktop app

### App Profile Recommendation
The above analysis suggests developing a game which is easy to pick up and play withouth the need to learn the mechanics



# Most Popular Apps by Genre on Google Play

Unlike Apple play store, google play store's most popular apps are 

In [19]:
category_ft = freq_table(free_eng_google_apps, 1)

for cat in category_ft:
    total = 0 # store sum of each category
    len_category = 0 # store number of apps specific to this category
    
    for row in free_eng_google_apps:
        category_app = row[1]
        installs = row[5].replace(',','').replace('+','') # remove non-numeric chars
        installs = float(installs) # convert to float
        
        if cat == category_app:
            total += installs
            len_category += 1
        
    avg_installs = total / len_category
    print(cat,':',avg_installs)

ART_AND_DESIGN : 1986335.0877192982
AUTO_AND_VEHICLES : 647317.8170731707
BEAUTY : 513151.88679245283
BOOKS_AND_REFERENCE : 8767811.894736841
BUSINESS : 1712290.1474201474
COMICS : 817657.2727272727
COMMUNICATION : 38456119.167247385
DATING : 854028.8303030303
EDUCATION : 1833495.145631068
ENTERTAINMENT : 11640705.88235294
EVENTS : 253542.22222222222
FINANCE : 1387692.475609756
FOOD_AND_DRINK : 1924897.7363636363
HEALTH_AND_FITNESS : 4188821.9853479853
HOUSE_AND_HOME : 1331540.5616438356
LIBRARIES_AND_DEMO : 638503.734939759
LIFESTYLE : 1437816.2687861272
GAME : 15588015.603248259
FAMILY : 3697848.1731343283
MEDICAL : 120550.61980830671
SOCIAL : 23253652.127118643
SHOPPING : 7036877.311557789
PHOTOGRAPHY : 17840110.40229885
SPORTS : 3638640.1428571427
TRAVEL_AND_LOCAL : 13984077.710144928
TOOLS : 10801391.298666667
PERSONALIZATION : 5201482.6122448975
PRODUCTIVITY : 16787331.344927534
PARENTING : 542603.6206896552
WEATHER : 5074486.197183099
VIDEO_PLAYERS : 24727872.452830188
NEWS_AND_

Communication apps generally has the highest number of installs. 

In [20]:
for app in free_eng_google_apps:
    if app[1] == 'COMMUNICATION' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

WhatsApp Messenger : 1,000,000,000+
imo beta free calls and text : 100,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Skype - free IM & video calls : 1,000,000,000+
Who : 100,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+
LINE: Free Calls & Messages : 500,000,000+
Google Chrome: Fast & Secure : 1,000,000,000+
Firefox Browser fast & private : 100,000,000+
UC Browser - Fast Download Private & Secure : 500,000,000+
Gmail : 1,000,000,000+
Hangouts : 1,000,000,000+
Messenger Lite: Free Calls & Messages : 100,000,000+
Kik : 100,000,000+
KakaoTalk: Free Calls & Text : 100,000,000+
Opera Mini - fast web browser : 100,000,000+
Opera Browser: Fast and Secure : 100,000,000+
Telegram : 100,000,000+
Truecaller: Caller ID, SMS spam blocking & Dialer : 100,000,000+
UC Browser Mini -Tiny Fast Private & Secure : 100,000,000+
Viber Mess

# Conclusion

The games genre clearly dominates in the apple store. The most popular games have simple mechanics or are simpler to pick up and play than their desktop counterpart.

The Google play store is more evened out, but has a higher demand for social and communication apps. Let's consider this in our conclusion.

Based on the mixed trends between both stores I suggest we develop a multiplayer game which is simple to pick up and play, with features for communicating with other players as well as social connectors such as sharing high scores on social media platforms. This approach takes into account the demand for gaming entertainment seen on the Apple store as well as the need to connecting with others as seen on the Google Play store.