# APPLE & GOOGLE PLAYSTORE PROJECT

This project takes the point of view of a company that builds Android and iOS mobile apps. The apps are free to download and install from the App Store and Google Play, so the main source of revenue is in-app ads.

Revenue for any given app therefore depends mostly on the number of people who use it, i.e., the more users who see and engage with the ads, the better.

As a data analyst, my goal in this project is to analyze data that helps developers understand which types of apps are likely to attract more users.

## Opening Dataset

In [1]:
# App Store data set
from csv import reader

open_file_apple = open('AppleStore.csv', encoding='utf8')
read_file_apple = reader(open_file_apple)
apple_data = list(read_file_apple)
open_file_apple.close()
header_row_apple = apple_data[0]  # column names
apple_data = apple_data[1:]       # data rows only

# Google Play data set
open_file_google = open('googleplaystore.csv', encoding='utf8')
read_file_google = reader(open_file_google)
google_data = list(read_file_google)
open_file_google.close()
header_row_google = google_data[0]  # column names
google_data = google_data[1:]       # data rows only
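
The two loading blocks above repeat the same steps, so they could be folded into one function. The sketch below is only an illustration: `load_csv` is a hypothetical helper name that does not appear elsewhere in this notebook, and it assumes both files are UTF-8 encoded.

```python
import csv


def load_csv(path, has_header=True):
    """Return (header, rows) for the CSV file at `path`.

    `load_csv` is a hypothetical helper, not part of the original notebook.
    """
    # The context manager closes the file even if reading fails.
    # newline='' is the setting the csv module documentation recommends.
    with open(path, encoding='utf8', newline='') as f:
        rows = list(csv.reader(f))
    if has_header:
        return rows[0], rows[1:]
    return None, rows
```

With this helper, the App Store data could be loaded with `header_row_apple, apple_data = load_csv('AppleStore.csv')`, and the Google Play data in the same way.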


In [2]:
def explore_data(dataset, start, end, rows_and_columns = False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n')
    
    if rows_and_columns:
        print('number of rows:', len(dataset))
        print('number of columns:', len(dataset[0]))


In [3]:
print(header_row_apple)
print('\n')
explore_data(apple_data, 0, 8, True)



['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['284035177', 'Pandora - Music & Radio', '130242560', 'USD', '0.0', '1126879', '3594', '4.0', '4.5', '8.4.1', '12+', 'Music', '37', '4', '1', '1']


['429047995', 'Pinterest', '74778624', 'USD', '0.0', '1061

The data set above contains approximately 7,000 iOS apps from the App Store, collected in July 2017. The following columns provide the most useful information for this analysis:

* 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre'.

For a fuller description of the columns listed above, see [AppleStore.csv](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps)

In [4]:
print(header_row_google)
print('\n')
explore_data(google_data, 0, 8, True)


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Eve

The data set above contains approximately 10,000 Android apps from Google Play, collected in August 2018. The following columns provide the most useful information for this analysis:

* 'App', 'Category', 'Rating', 'Reviews', 'Size', 'Type', 'Price', 'Content Rating', 'Genres'.

For a fuller description of the columns listed above, see [googleplaystore.csv](https://www.kaggle.com/lava18/google-play-store-apps)

## Deleting Wrong Data

In [5]:
print(google_data[10472])  # incorrect row
print('\n')
print(header_row_google)  # header
print('\n')
print(google_data[0])  # correct row

print(len(google_data))
del google_data[10472]
print(len(google_data))
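
The cell above deletes the faulty row by a hard-coded index. As a rough sketch of how such a row might be located programmatically, the code below flags rows whose number of columns differs from the header's. This rests on the assumption that the faulty row is one with a missing or extra value; `find_malformed_rows` is a hypothetical name, and the rows are toy data rather than values from the real file.

```python
def find_malformed_rows(header, rows):
    """Return the indexes of rows whose length differs from the header's."""
    expected = len(header)
    return [i for i, row in enumerate(rows) if len(row) != expected]


# Toy data (not taken from the real file): the second row lacks one value.
toy_header = ['App', 'Category', 'Rating']
toy_rows = [
    ['App A', 'TOOLS', '4.1'],
    ['App B', '1.9'],
    ['App C', 'GAME', '3.8'],
]

bad = find_malformed_rows(toy_header, toy_rows)
print(bad)  # [1]

# Delete from the end so that the earlier indexes stay valid.
for i in reversed(bad):
    del toy_rows[i]
print(len(toy_rows))  # 2
```

A deletion cell like the one above should also be run only once, because running `del` again with the same index removes a valid row.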

In [2]:
# The Google Play data set contains duplicate entries for some apps.
# For example, Facebook appears more than once:

for app in google_data:
    name = app[0]
    if name == 'Facebook':
        print(app)


## Deleting duplicate data entries

In [8]:
duplicate_apps = []
unique_apps = []

for app in google_data:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)
print('number of duplicate apps:', len(duplicate_apps))




number of duplicate apps: 1181


We don't want to count any app more than once when we analyze the data, so we need to remove the duplicate entries and keep only one entry per app. Removing duplicate rows at random would discard information, so it is worth finding a better criterion.

Examining the rows printed for the Facebook app shows that the main difference is in the fourth position of each row, which corresponds to the number of reviews. The different numbers show that the data was collected at different times.

We can use this information to build a criterion for removing the duplicates. The higher the number of reviews, the more recent the data should be. Rather than removing duplicates randomly, we'll keep only the row with the highest number of reviews and remove the other entries for any given app.



In [9]:
print('Expected Length', len(google_data) - 1181)

Expected Length 9659


In [10]:
reviews_max = {}
for app in google_data:
    name = app[0]
    n_reviews = float(app[3])

    # Keep the highest review count seen so far for each app name
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews
            

In [11]:
print(len(reviews_max))

9659


To remove the duplicates, I:
* Created a dictionary in which each key is a unique app name and the corresponding value is the highest number of reviews for that app.
* Used the dictionary to create a new data set with only one entry per app. For each app, the entry with the highest number of reviews is selected.


In [12]:
android_clean = []
already_clean = []

for apps in google_data:
    name = apps[0]
    n_reviews = float(apps[3])
    
    if (reviews_max[name] == n_reviews) and (name not in already_clean):
        android_clean.append(apps)
        already_clean.append(name)
        
explore_data(android_clean, 0, 3, True )


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


number of rows: 9659
number of columns: 13


Using the dictionary created above to remove duplicates, I:
* Created two empty lists, "android_clean" and "already_clean".
* Looped through google_data, assigning the name of each app to a variable named "name".
* Converted the number of reviews for each app to a "float" and assigned it to a variable named "n_reviews".
* Appended the current row to the android_clean list, and the app name (name) to the already_clean list, if both of the following hold:
    * The number of reviews of the app matches the number of reviews recorded for that app in the reviews_max dictionary; and
    * The name of the app is not already in the already_clean list. This check keeps track of the apps already added, and it also handles the case where more than one entry for a duplicate app shares the highest number of reviews.
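
The steps above can also be condensed into a single pass. The following is a sketch on toy rows, not the notebook's actual method: `keep_most_reviewed` is a hypothetical name, and it assumes the app name is at index 0 and the review count at index 3, as in the Google Play data set.

```python
def keep_most_reviewed(rows, name_idx=0, reviews_idx=3):
    """Return one row per app name: the row with the most reviews.

    Ties keep the first row seen. The output follows the order in which
    each app name first appears in `rows`.
    """
    best = {}  # app name -> row with the highest review count so far
    for row in rows:
        name = row[name_idx]
        n_reviews = float(row[reviews_idx])
        if name not in best or n_reviews > float(best[name][reviews_idx]):
            best[name] = row
    return list(best.values())


# Toy rows (invented for illustration): name, category, rating, reviews.
toy = [
    ['Alpha', 'TOOLS', '4.0', '100'],
    ['Beta', 'GAME', '4.5', '50'],
    ['Alpha', 'TOOLS', '4.0', '120'],
    ['Beta', 'GAME', '4.5', '50'],
]

for row in keep_most_reviewed(toy):
    print(row)
# ['Alpha', 'TOOLS', '4.0', '120']
# ['Beta', 'GAME', '4.5', '50']
```

Because the dictionary lookup replaces the `name not in already_clean` list search, this version avoids scanning a growing list for every row. Its row order can differ slightly from the two-step version, which keeps each app at the position of its most-reviewed entry.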

In [12]:
print(android_clean[4412][0])

中国語 AQリスニング


## Removing Non-English Apps

In [1]:
def english_characters(string):
    ascii_range = 0
    for character in string:
        if ord(character) > 127:
            ascii_range += 1
        
    if ascii_range > 3:
        return False
    else:
        return True
        
app_name1, app_name2, app_name3, app_name4 = 'Instagram', '爱奇艺PPS -《欢乐颂2》电视剧热播', 'Docs To Go™ Free Office Suite', 'Instachat 😜' 
print(english_characters(app_name1))
print(english_characters(app_name2))
print(english_characters(app_name3))
print(english_characters(app_name4))

True
False
True
True
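
The function above tolerates up to three characters outside the ASCII range (code points above 127), which appears intended to keep English names that contain an emoji or a symbol such as ™. A more compact variant with a configurable threshold might look like the sketch below; `is_probably_english` and `max_non_ascii` are hypothetical names used only for illustration.

```python
def is_probably_english(name, max_non_ascii=3):
    """True if `name` has at most `max_non_ascii` characters outside ASCII."""
    non_ascii = sum(1 for ch in name if ord(ch) > 127)
    return non_ascii <= max_non_ascii


samples = [
    'Instagram',
    '爱奇艺PPS -《欢乐颂2》电视剧热播',
    'Docs To Go™ Free Office Suite',
    'Instachat 😜',
]
for s in samples:
    print(is_probably_english(s))
# True
# False
# True
# True
```

This is a heuristic rather than real language detection: a non-English name written entirely in ASCII passes, and an English name with more than three special characters is rejected.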


In [3]:
english_apple_data = []
english_google_data = []

for app in apple_data:
    name = app[1]
    if english_characters(name):
        english_apple_data.append(app)
    
for app in android_clean:
    name = app[0]
    if english_characters(name):
        english_google_data.append(app)

# explore_data prints its own output and returns None, so it is not wrapped in print()
explore_data(english_apple_data, 0, 3, True)
print('\n')
explore_data(english_google_data, 0, 3, True)
 


## Isolating Free Applications

In [15]:
free_apple_data = []
free_google_data = []

for app in english_apple_data:
    name = app[1]
    price = app[4]
    if price == '0.0':
        free_apple_data.append(app)
    
for app in english_google_data:
    name = app[0]
    price = app[7]
    if price == '0':
        free_google_data.append(app)
        
print(len(free_apple_data))
print(len(free_google_data))


3222
8864
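
The cell above compares price strings directly, which works because free apps appear as '0.0' in the App Store file and '0' in the Google Play file. A slightly more defensive check could compare numeric values instead. The sketch below assumes that paid prices may carry a leading dollar sign; `is_free` is a hypothetical helper name.

```python
def is_free(price):
    """True if a price string such as '0', '0.0' or '$0.00' represents zero."""
    cleaned = price.strip().lstrip('$')
    try:
        return float(cleaned) == 0.0
    except ValueError:
        # Unparseable values (for example, an empty string) are not treated as free.
        return False


print(is_free('0'))      # True
print(is_free('0.0'))    # True
print(is_free('$4.99'))  # False
```

One function then serves both data sets, for example `[app for app in english_apple_data if is_free(app[4])]`.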


## Most Common Apps by Genre

As explained in the introduction, the aim of this project is to determine which kinds of apps are likely to attract more users, since the number of people using an app largely determines the revenue it generates. In order to achieve this, the apps will be added to both Google Play and the App Store, so the analysis needs to identify app profiles that are successful in both markets.

The validation strategy for an app idea consists of three steps:
* Build a minimal Android version of the app, and add it to Google Play.
* If the app has a good response from users, it is developed further.
* If the app is profitable after six months, an iOS version of the app is built and added to the App Store.

To get a sense of the most common genres for each market, a frequency table for the prime_genre column of the App Store data set, and the Genres and Category columns of the Google Play data set will be created.

For the App Store data set, some columns may be used to generate frequency tables for the most common genres:
* 'track_name', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre'.

For the Google Play data set, some columns may be used to generate frequency tables for the most common genres. They include:
* 'App', 'Category', 'Rating', 'Reviews', 'Type', 'Price', 'Content Rating', 'Genres'.

In [16]:
def freq_table(dataset, index):
    tables = {}
    total = 0
    for row in dataset:
        total += 1
        value = row[index]
        if value in tables:
            tables[value] += 1
        else:
            tables[value] = 1
            
    table_percentages = {}
    for key in tables:
        percentages = (tables[key] / total) * 100
        table_percentages[key] = percentages
    
    return table_percentages

def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
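
The same frequency table can be built with `collections.Counter` from the standard library. This is an alternative sketch, not a replacement for the functions above: `freq_table_pct` and `display_table_sorted` are hypothetical names chosen so that they do not shadow `freq_table` and `display_table`, and the rows are toy data.

```python
from collections import Counter


def freq_table_pct(dataset, index):
    """Map each value in column `index` to its share of rows, in percent."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    return {value: count / total * 100 for value, count in counts.items()}


def display_table_sorted(dataset, index):
    """Print the percentages from largest to smallest."""
    table = freq_table_pct(dataset, index)
    for value, pct in sorted(table.items(), key=lambda item: item[1], reverse=True):
        print(f'{value} : {pct:.2f}')


# Toy rows (invented for illustration): app name, genre.
toy_rows = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
display_table_sorted(toy_rows, 1)
# Games : 75.00
# Music : 25.00
```

Sorting with a `key` function avoids building the intermediate list of `(percentage, value)` tuples, and rounding to two decimals makes the output easier to scan.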
        
        

In [17]:
display_table(free_apple_data, -5)

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


In [18]:
display_table(free_google_data, 1) # Category

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

In [19]:
display_table(free_google_data, -4) # Genres

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Finance : 3.7003610108303246
Medical : 3.531137184115524
Sports : 3.463447653429603
Personalization : 3.3167870036101084
Communication : 3.2378158844765346
Action : 3.1024368231046933
Health & Fitness : 3.0798736462093865
Photography : 2.944494584837545
News & Magazines : 2.7978339350180503
Social : 2.6624548736462095
Travel & Local : 2.3240072202166067
Shopping : 2.2450361010830324
Books & Reference : 2.1435018050541514
Simulation : 2.0419675090252705
Dating : 1.861462093862816
Arcade : 1.8501805054151623
Video Players & Editors : 1.7712093862815883
Casual : 1.7599277978339352
Maps & Navigation : 1.3989169675090252
Food & Drink : 1.2409747292418771
Puzzle : 1.128158844765343
Racing : 0.9927797833935018
Role Playing : 0.9363718411552346
Libraries & Demo : 0.9363718411552346
Auto & Vehicles : 0.9250902527075

Among the free English apps in the App Store data set, more than 58% are games. Entertainment apps make up about 8%, followed by photo and video apps at about 5%. Only 3.66% of the apps are designed for education, followed by social networking apps, which account for 3.29% of the apps in the data set.

The App Store data set (at least the part containing free English apps) is dominated by apps designed for fun (games, entertainment, photo and video, social networking, sports, music, etc.), while apps with practical purposes (education, shopping, utilities, productivity, lifestyle, etc.) are rarer. However, the fact that fun apps are the most numerous does not imply that they also have the greatest number of users: the demand might not be the same as the offer.

The Category and Genres columns of the Google Play data set show a noticeably different picture: most apps are not designed for fun but for practical purposes (family, tools, business, lifestyle, productivity, etc.). On closer inspection, however, the family category (which accounts for almost 19% of the apps) consists mostly of games for kids.

The difference between the Genres and Category columns is not entirely clear, but the Genres column is more granular and contains more distinct values.

From this, we can deduce that the App Store is dominated by apps designed for fun, while Google Play has a more balanced mix of practical and fun apps.

## Most Popular Apps by Genre on Apple Store

In [20]:
genres_apple = freq_table(free_apple_data, -5)
for genre in genres_apple:
    total = 0
    len_genre = 0
    for app in free_apple_data:
        genre_app = app[-5]
        if genre_app == genre:
            n_userratings = float(app[5])
            total += n_userratings
            len_genre += 1
    avg_userrating = total / len_genre
    print(genre,' : ',avg_userrating )
    

Productivity  :  21028.410714285714
Travel  :  28243.8
Education  :  7003.983050847458
Social Networking  :  71548.34905660378
Photo & Video  :  28441.54375
Weather  :  52279.892857142855
Reference  :  74942.11111111111
Lifestyle  :  16485.764705882353
Health & Fitness  :  23298.015384615384
Food & Drink  :  33333.92307692308
Finance  :  31467.944444444445
Music  :  57326.530303030304
Sports  :  23008.898550724636
Business  :  7491.117647058823
Navigation  :  86090.33333333333
Entertainment  :  14029.830708661417
Utilities  :  18684.456790123455
Book  :  39758.5
Shopping  :  26919.690476190477
Medical  :  612.0
News  :  21248.023255813954
Catalogs  :  4004.0
Games  :  22788.6696905016


In [21]:
for app in free_apple_data:
    if app[-5] == 'Social Networking':
        print(app[1],':', app[5])

Facebook : 2974676
Pinterest : 1061624
Skype for iPhone : 373519
Messenger : 351466
Tumblr : 334293
WhatsApp Messenger : 287589
Kik : 260965
ooVoo – Free Video Call, Text and Voice : 177501
TextNow - Unlimited Text + Calls : 164963
Viber Messenger – Text & Call : 164249
Followers - Social Analytics For Instagram : 112778
MeetMe - Chat and Meet New People : 97072
We Heart It - Fashion, wallpapers, quotes, tattoos : 90414
InsTrack for Instagram - Analytics Plus More : 85535
Tango - Free Video Call, Voice and Chat : 75412
LinkedIn : 71856
Match™ - #1 Dating App. : 60659
Skype for iPad : 60163
POF - Best Dating App for Conversations : 52642
Timehop : 49510
Find My Family, Friends & iPhone - Life360 Locator : 43877
Whisper - Share, Express, Meet : 39819
Hangouts : 36404
LINE PLAY - Your Avatar World : 34677
WeChat : 34584
Badoo - Meet New People, Chat, Socialize. : 34428
Followers + for Instagram - Follower Analytics : 28633
GroupMe : 28260
Marco Polo Video Walkie Talkie : 27662
Miitomo : 2

In [22]:
for app in free_apple_data:
    if app[-5] == 'Navigation':
        print(app[1],':', app[5])

Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5


In [32]:
for app in free_apple_data:
    if app[-5] == 'Reference':
        print(app[1],':', app[5])

Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
GUNS MODS for Minecraft PC Edition - Mods Tools : 1497
Guides for Pokémon GO - Pokemon GO News and Cheats : 826
WWDC : 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718
VPN Express : 14
Real Bike Traffic Rider Virtual Reality Glasses : 8
教えて!goo : 0
Jishokun-Japanese English Dictionary & Translator : 0


Ranking the genres in the App Store data set by their average number of user ratings shows that 'Navigation', 'Reference' and 'Social Networking' have the highest averages. Looking at the individual apps in these genres by their total number of user ratings shows that a few apps are far more widely used than the rest. The apps with the most user ratings are 'Facebook', 'Bible' and 'Waze' for the 'Social Networking', 'Reference' and 'Navigation' genres respectively.
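
The per-genre averages above are computed with a nested loop that scans the whole data set once per genre. As a sketch of an alternative, the function below makes a single pass and returns the averages already sorted, which makes the top genres easier to read off. `average_by_group` is a hypothetical name, and the rows are toy data.

```python
from collections import defaultdict


def average_by_group(rows, group_idx, value_idx):
    """Return (group, mean of the numeric column) pairs, largest mean first."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        group = row[group_idx]
        totals[group] += float(row[value_idx])
        counts[group] += 1
    averages = {group: totals[group] / counts[group] for group in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Toy rows (invented for illustration): app name, number of ratings, genre.
toy_rows = [
    ['A', '100', 'Games'],
    ['B', '300', 'Games'],
    ['C', '50', 'Music'],
]
for genre, avg in average_by_group(toy_rows, 2, 1):
    print(genre, ':', avg)
# Games : 200.0
# Music : 50.0
```

For the App Store data in this notebook, the equivalent call would be `average_by_group(free_apple_data, -5, 5)`.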

## Most Popular Apps by Genre on Google Play

In [42]:
genres_google = freq_table(free_google_data, 1)
for category in genres_google:
    total = 0
    len_category = 0
    for app in free_google_data:
        category_app = app[1]
        if category_app == category:
            n_installs = app[5]
            n_installs = n_installs.replace('+', '')
            n_installs = n_installs.replace(',', '')
            total += float(n_installs)
            len_category += 1
    avg_n_installs = total / len_category
    print(category, ' : ',avg_n_installs)
        
    
            

VIDEO_PLAYERS  :  24727872.452830188
COMICS  :  817657.2727272727
MEDICAL  :  120550.61980830671
NEWS_AND_MAGAZINES  :  9549178.467741935
SOCIAL  :  23253652.127118643
WEATHER  :  5074486.197183099
BOOKS_AND_REFERENCE  :  8767811.894736841
LIBRARIES_AND_DEMO  :  638503.734939759
PHOTOGRAPHY  :  17840110.40229885
GAME  :  15588015.603248259
FAMILY  :  3695641.8198090694
FOOD_AND_DRINK  :  1924897.7363636363
BUSINESS  :  1712290.1474201474
TOOLS  :  10801391.298666667
PERSONALIZATION  :  5201482.6122448975
DATING  :  854028.8303030303
ENTERTAINMENT  :  11640705.88235294
TRAVEL_AND_LOCAL  :  13984077.710144928
HOUSE_AND_HOME  :  1331540.5616438356
EVENTS  :  253542.22222222222
SPORTS  :  3638640.1428571427
BEAUTY  :  513151.88679245283
HEALTH_AND_FITNESS  :  4188821.9853479853
COMMUNICATION  :  38456119.167247385
ART_AND_DESIGN  :  1986335.0877192982
MAPS_AND_NAVIGATION  :  4056941.7741935486
FINANCE  :  1387692.475609756
LIFESTYLE  :  1437816.2687861272
PARENTING  :  542603.6206896552
PR

In [43]:
for app in free_google_data:
    if app[1] == 'VIDEO_PLAYERS' and (app[5] == '1,000,000+' or app[5] == '500,000,000+' or app[5] == '100,000,000+') :
        print(app[0],':', app[5])

All Video Downloader 2018 : 1,000,000+
HD Video Player : 1,000,000+
Iqiyi (for tablet) : 1,000,000+
Motorola Gallery : 100,000,000+
VLC for Android : 100,000,000+
XX HD Video downloader-Free Video Downloader : 1,000,000+
OBJECTIVE : 1,000,000+
HD Movie Video Player : 1,000,000+
Video Editor,Crop Video,Movie Video,Music,Effects : 1,000,000+
VPlayer : 1,000,000+
OnePlus Gallery : 1,000,000+
Play Tube : 1,000,000+
video player : 1,000,000+
G Guide Program Guide (SOFTBANK EMOBILE WILLCOM version) : 1,000,000+
Video.Guru - Video Maker : 1,000,000+
Video Status : 1,000,000+
SVT Play : 1,000,000+
BluTV : 1,000,000+
Tencent Video - Supporting the whole network : 1,000,000+
MX Player : 500,000,000+
VUE: video editor & camcorder : 1,000,000+
Dubsmash : 100,000,000+
Mobizen Screen Recorder for LG - Record, Capture : 1,000,000+
VidPlay : 1,000,000+
VivaVideo - Video Editor & Photo Movie : 100,000,000+
VideoShow-Video Editor, Video Maker, Beauty Camera : 100,000,000+
Multiple Videos at Same Time : 

In [46]:
for app in free_google_data:
    if app[1] == 'COMMUNICATION' and (app[5] == '1,000,000+' or app[5] == '500,000,000+' or app[5] == '100,000,000+') :
        print(app[0],':', app[5])

imo beta free calls and text : 100,000,000+
TracFone My Account : 1,000,000+
My magenta : 1,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Seznam.cz : 1,000,000+
imo free video calls and chat : 500,000,000+
Who : 100,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+
Messaging+ SMS, MMS Free : 1,000,000+
LINE: Free Calls & Messages : 500,000,000+
mysms SMS Text Messaging Sync : 1,000,000+
2ndLine - Second Phone Number : 1,000,000+
Firefox Browser fast & private : 100,000,000+
Ninesky Browser : 1,000,000+
UC Browser - Fast Download Private & Secure : 500,000,000+
Ghostery Privacy Browser : 1,000,000+
InBrowser - Incognito Browsing : 1,000,000+
PHONE for Google Voice & GTalk : 1,000,000+
Safest Call Blocker : 1,000,000+
Should I Answer? : 1,000,000+
RocketDial Dialer & Contacts : 1,000,000+
True Contact - Real Caller ID : 1,000,000+
Video Caller Id : 1,000,000+
Burner - Free Phone Number : 1,000,000+
Caller ID + : 1,000,000+


In [48]:
for app in free_google_data:
    if app[1] == 'SOCIAL' and (app[5] == '1,000,000+' or app[5] == '500,000,000+' or app[5] == '100,000,000+') :
        print(app[0],':', app[5])

Facebook Lite : 500,000,000+
Tumblr : 100,000,000+
Pinterest : 100,000,000+
The Messenger App : 1,000,000+
Messenger Pro : 1,000,000+
Free Messages, Video, Chat,Text for Messenger Plus : 1,000,000+
Jodel - The Hyperlocal App : 1,000,000+
Love Sticker : 1,000,000+
Love Images : 1,000,000+
Facebook Local : 1,000,000+
MobilePatrol Public Safety App : 1,000,000+
💘 WhatsLov: Smileys of love, stickers and GIF : 1,000,000+
Family GPS tracker KidControl + GPS by SMS Locator : 1,000,000+
Moment : 1,000,000+
Badoo - Free Chat & Dating App : 100,000,000+
Tango - Live Video Broadcast : 100,000,000+
TwitCasting Live : 1,000,000+
Snapchat : 500,000,000+
Banjo : 1,000,000+
Frontback - Social Photos : 1,000,000+
LinkedIn : 100,000,000+
Couple - Relationship App : 1,000,000+
Tik Tok - including musical.ly : 100,000,000+
B-Messenger Video Chat : 1,000,000+
BIGO LIVE - Live Stream : 100,000,000+
FollowMeter for Instagram : 1,000,000+
pixiv : 1,000,000+
U LIVE – Video Chat & Stream : 1,000,000+
VMate Lite

In [49]:
for app in free_google_data:
    if app[1] == 'BOOKS_AND_REFERENCE' and (app[5] == '1,000,000+' or app[5] == '500,000,000+' or app[5] == '100,000,000+') :
        print(app[0],':', app[5])

Book store : 1,000,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
ReadEra – free ebook reader : 1,000,000+
eBoox: book reader fb2 epub zip : 1,000,000+
All Maths Formulas : 1,000,000+
English-Myanmar Dictionary : 1,000,000+
Golden Dictionary (EN-AR) : 1,000,000+
All Language Translator Free : 1,000,000+
Bible : 100,000,000+
Amazon Kindle : 100,000,000+
Wattpad 📖 Free Books : 100,000,000+
Al Quran Al karim : 1,000,000+
Koran Read &MP3 30 Juz Offline : 1,000,000+
Hafizi Quran 15 lines per page : 1,000,000+
Satellite AR : 1,000,000+
Audiobooks from Audible : 100,000,000+
Oxford A-Z of English Usage : 1,000,000+
Brilliant Quotes: Life, Love, Family & Motivation : 1,000,000+
Stats Royale for Clash Royale : 1,000,000+
wikiHow: how to do anything : 1,000,000+
EGW Writings : 1,000,000+
My Little Pony AR Guide : 1,000,000+


Ranking the categories in the Google Play data set by their average number of installs shows that 'COMMUNICATION', 'VIDEO_PLAYERS' and 'SOCIAL' are among the categories with the highest averages. The lists printed above are restricted to apps in the 1,000,000+, 100,000,000+ and 500,000,000+ install brackets, and they show that each of these categories contains several apps with 100 million installs or more, which pull the category average up.
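
Install counts in the Google Play data set are stored as brackets such as '1,000,000+', so they have to be converted to numbers before averaging. The sketch below shows that conversion and, on toy values only, how a single very large app can dominate a category's mean. `parse_installs` is a hypothetical helper name, and comparing the mean with the median is my own suggestion rather than a step taken in this notebook.

```python
from statistics import mean, median


def parse_installs(text):
    """Convert an install bracket such as '1,000,000+' to an int."""
    return int(text.replace(',', '').replace('+', ''))


# Toy values (invented for illustration): three mid-sized apps and one giant.
toy_installs = ['1,000,000+', '1,000,000+', '1,000,000+', '500,000,000+']
values = [parse_installs(text) for text in toy_installs]

print('mean:', mean(values))
print('median:', median(values))
```

Here the mean is about 125.75 million installs while the median is 1 million, so reporting both gives a fairer picture of a typical app in a category.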

## Conclusion
Comparing the genres with the highest average number of user ratings in the App Store data set with the categories with the highest average number of installs in the Google Play data set, it can be concluded that apps in the `Social` and `Books and reference` genres show potential for being profitable.