# App types population analysis for App Store and Google Play 


This project is to analyse different genres of apps in the App Store and on Google Play. The goal of this project is to find out what type of apps are likely to attract more users and are more profitable, and therefore help our team to make data-driven decisions with respect to what genre of apps to build.

Our team only aims to those apps which are free to download, and our revenue is from the in-app ads, so the more users in our apps, the better. 

## Reading in and exploring the data

As of September 2018, there were approximately 2 million iOS apps available on the App Store, and 2.1 million Android apps on Google Play. Collecting data for over 4 million apps requires a significant amount of time and money, so we'll try to analyse a sample of the data instead.

In [12]:
from csv import reader
opened_file1=open('C:/Users/Irene Lin/Desktop/AppleStore.csv',encoding='utf8')
read_file1=reader(opened_file1)
apple_data=list(read_file1)

opened_file2=open('C:/Users/Irene Lin/Desktop/googleplaystore.csv', encoding='utf8')
read_file2=reader(opened_file2)
google_data=list(read_file2)

To make it easier to explore the two data sets, we'll first write a function named `explore_data()` that we can use repeatedly to explore rows in a more readable way. We'll also add an option for our function to show the number of rows and columns for any data set.

In [13]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset)-1)
        print('Number of columns:', len(dataset[0]))
        
explore_data(apple_data,0,5,True)
explore_data(google_data,0,5,True)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


Number of rows: 7197
Number of columns: 16
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Ed

We can see that App Store data set has 7197 apps and 16 columns.At the very initial glance, the columns that might be useful for the purpose of our analysis is 'track_name', 'price', 'rating_count_tot', 'prime_genre'.

There are 10840 apps in our Google Play data set, and the columns that seem interesting are: 'category', 'installs', 'type', 'price', 'genres'.

In [14]:
print(apple_data[0])
print('\n')
print(google_data[0])
print('\n')

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']




If you want to explore more about the datasets, the apple dataset can be downloaded [here](AppleStore.csv), and the google dataset [here](googleplaystore.csv).

## Deleting wrong data

There is a dedicated [discussion section](https://www.kaggle.com/datasets/lava18/google-play-store-apps/discussion) in Kaggle, where we know that an error was outlined.Let's print this row and then delete it.

In [15]:
print(google_data[10473])

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


In [16]:
del google_data[10473]

## Removing duplicate entries

There are some duplicate entries in the Google Play dataset, and we can confirm that with code below.

In [17]:
for app in google_data:
    name=app[0]
    if name == 'Instagram':
        print(app)

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']


It shows that as an example of duplicate entries, 'Instagram' occurs four times in our data set.

In [18]:
duplicate_apps=[]
unique_apps=[]
for app in google_data[1:]:
    name=app[0]
    
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)
 
print('Number of duplicate apps:', len(duplicate_apps))
print('Examples of duplicate apps:', duplicate_apps[:10])

Number of duplicate apps: 1181
Examples of duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack']


There are 1181 dupolicate apps in our data set. We need to remove all the duplicate entries, keeping the row with the highest number of reviews, which means it is the latest data.

In [19]:
reviews_max={}
for app in google_data[1:]:
    name=app[0]
    n_reviews=float(app[3])
    if (name in reviews_max and reviews_max[name]<n_reviews) or (name not in reviews_max):
        reviews_max[name]=n_reviews
    
print(len(reviews_max))

9659


Then we will determine every app is unique.

In [20]:
android_clean=[]
already_added=[]

for app in google_data[1:]:
    name=app[0]
    n_reviews=float(app[3])
    if n_reviews==reviews_max[name] and name not in already_added:
        android_clean.append(app)
        already_added.append(name)

In [21]:
print(len(android_clean))

9659


So there are 9659 unique apps.

## Removing non-English apps

Our product is aiming at the English-speaking users, so now we check if the app is for English users.

In [34]:
def check_english(a_string):
    times=0
    for character in a_string:
        if ord(character) >127:
            times+=1
        if times>3:
            return False
    return True

google_English_apps=[]
for app in android_clean:
    if check_english(app[0]):
        google_English_apps.append(app)
        
apple_English_apps=[]
for app in apple_data[1:]:
    if check_english(app[1]):
        apple_English_apps.append(app)
        
print(google_English_apps[:10])
print(apple_English_apps[:10])

[['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up'], ['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up'], ['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up'], ['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up'], ['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up'], ['Smoke Effect Photo Maker - Smoke Editor', 'ART_AND_DESIGN', '3.8', '178', '19M', '50,000+', '

## Isolating the Free Apps

In [23]:
for app in google_English_apps:
    if app[7] !=0:
        del app;
        
for app in apple_English_apps:
    if app[4] !=0:
        del app;

In [35]:
print(len(google_English_apps))
print(google_English_apps[0])
print(len(apple_English_apps))

9614
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']
6183


## Most Common Apps by Genre

Now we show the frequency tables for the columns 'prime_genre' and 'Category'.

In [25]:
def freq_table(dataset, index):
    count={}
    for app in dataset:
        genre=app[index]
        if genre in count:
            count[genre]+=1
        else:
            count[genre]=1
    
    count_sum=len(dataset)
    for genre in count:
        count[genre]=count[genre]/count_sum*100
    
    return count

In [26]:
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])



In [36]:
prime_freq=display_table(apple_English_apps,11)

Games : 54.860100274947435
Entertainment : 7.261846999838266
Education : 6.6310852337053205
Photo & Video : 5.515122109008572
Utilities : 3.4449296458030085
Productivity : 2.7171276079573023
Health & Fitness : 2.6686074721009216
Music : 2.215752870774705
Social Networking : 2.037845705967977
Sports : 1.6820313763545207
Lifestyle : 1.6011644832605532
Shopping : 1.3747371825974446
Weather : 1.1159631246967492
Travel : 0.9704027171276078
News : 0.9218825812712276
Book : 0.8895358240336406
Reference : 0.8571890667960537
Business : 0.8571890667960537
Finance : 0.7924955523208799
Food & Drink : 0.7116286592269124
Navigation : 0.452854601326217
Medical : 0.3396409509946628
Catalogs : 0.08086689309396733


We can conclude that among the free English apps in the App Store, the most common genre is __Games__(54.86%),which accounts for more than half of all apps, dominating the App Store. The next most common genre is __Entertainment__, which is about 7.26%, followed by __Education__(6.63%) and __Photo&Video__(5.52%).

Top 5 popular genres of apps are __Games, Entertainment, Education, Photo&Video, Utilities__, and they approximately take 75% of all apps.

It shows that the larger proportion of apps in App Store is designed for people's entertainment(Games, Entertainment, Photo&Video, etc.), while apps with practical purposes(Education, Utilities, Health&Fitness) are less.

However, the large proportion of apps does not imply that they have a large number of users. After all, users have too many choices.

Let's see the situation on Google Play.

In [28]:
genre_freq=display_table(google_English_apps,1)

FAMILY : 19.325982941543582
GAME : 9.819013938007073
TOOLS : 8.61244019138756
BUSINESS : 4.358227584772207
MEDICAL : 4.108591637195756
PERSONALIZATION : 3.900561680882047
PRODUCTIVITY : 3.879758685250676
LIFESTYLE : 3.786145204909507
FINANCE : 3.588516746411483
SPORTS : 3.3804867900977738
COMMUNICATION : 3.2660703141252343
HEALTH_AND_FITNESS : 2.995631370917412
PHOTOGRAPHY : 2.9124193883919283
NEWS_AND_MAGAZINES : 2.600374453921365
SOCIAL : 2.485957977948825
TRAVEL_AND_LOCAL : 2.2779280216351157
BOOKS_AND_REFERENCE : 2.26752652381943
SHOPPING : 2.090701060952777
DATING : 1.768254628666528
VIDEO_PLAYERS : 1.6954441439567296
MAPS_AND_NAVIGATION : 1.3417932182234242
FOOD_AND_DRINK : 1.1649677553567712
EDUCATION : 1.1025587684626585
ENTERTAINMENT : 0.9049303099646349
LIBRARIES_AND_DEMO : 0.8737258165175785
AUTO_AND_VEHICLES : 0.8737258165175785
WEATHER : 0.8217183274391513
HOUSE_AND_HOME : 0.7593093405450385
EVENTS : 0.6656958602038693
PARENTING : 0.6240898689411275
ART_AND_DESIGN : 0.6240

Google Play shows a more balanced distribution for different kinds of apps. The most common genre on Google Play is __Family__, which accounts for 19.3% of all apps.

__Games__, the most common genre in App Store, is second common genre on Google Play,which accounts for 9.8% of all apps.

It seems to show a different landscape with App Store, but if we explore more 'Family' genre on the Google Play, we will find that they are mainly games for children under this genre.

Also, it does not reveal what genres have the most users. So next, we will find out the most popular genres by calculating installs numbers.

## Most popular genres of apps

In [37]:
apple_freq=freq_table(apple_English_apps,11)
for genre in apple_freq:
    total=0
    len_genre=0
    for app in apple_English_apps:
        genre_app=app[11]
        if genre_app==genre:
            total+=float(app[5])
            len_genre+=1
            
    avg_number=total/len_genre
    print(genre,':',avg_number)

Social Networking : 60253.84920634921
Photo & Video : 14688.715542521993
Games : 15586.759433962265
Music : 29047.109489051094
Reference : 27037.188679245282
Health & Fitness : 10802.157575757576
Weather : 23145.246376811596
Utilities : 7927.525821596244
Travel : 19030.183333333334
Shopping : 26635.011764705883
News : 16980.315789473683
Navigation : 19370.821428571428
Lifestyle : 8930.373737373737
Entertainment : 8862.409799554565
Food & Drink : 19934.386363636364
Sports : 15350.913461538461
Book : 10359.2
Finance : 23353.530612244896
Education : 2472.278048780488
Productivity : 8508.089285714286
Business : 5149.320754716981
Catalogs : 3465.0
Medical : 648.952380952381


In this part of analysis, we take numbers of user ratings as an indicator of numbers of active users. From the result above, we can conclude that __social networking__ is the most popular genre in the App Store, which number of rating is approximately two times of the next popular genre. 

In [38]:
for app in apple_English_apps:
    if app[-5]=='Social Networking':
        print(app[1], ':', app[5])

Facebook : 2974676
Pinterest : 1061624
Skype for iPhone : 373519
Messenger : 351466
Tumblr : 334293
WhatsApp Messenger : 287589
Kik : 260965
ooVoo – Free Video Call, Text and Voice : 177501
TextNow - Unlimited Text + Calls : 164963
Viber Messenger – Text & Call : 164249
Followers - Social Analytics For Instagram : 112778
MeetMe - Chat and Meet New People : 97072
We Heart It - Fashion, wallpapers, quotes, tattoos : 90414
InsTrack for Instagram - Analytics Plus More : 85535
Tango - Free Video Call, Voice and Chat : 75412
LinkedIn : 71856
Match™ - #1 Dating App. : 60659
Skype for iPad : 60163
POF - Best Dating App for Conversations : 52642
Timehop : 49510
Find My Family, Friends & iPhone - Life360 Locator : 43877
Whisper - Share, Express, Meet : 39819
Hangouts : 36404
LINE PLAY - Your Avatar World : 34677
WeChat : 34584
Badoo - Meet New People, Chat, Socialize. : 34428
Followers + for Instagram - Follower Analytics : 28633
GroupMe : 28260
Marco Polo Video Walkie Talkie : 27662
Miitomo : 2

The distribution is imbalanced, and the average number of ratings is heavily influenced by giants like Facebook, Pinterest, Skype, etc.

In [40]:
for app in apple_English_apps:
    if app[-5] =='Music':
        print(app[1],':',app[5])

Pandora - Music & Radio : 1126879
Spotify Music : 878563
Shazam - Discover music, artists, videos & lyrics : 402925
iHeartRadio – Free Music & Radio Stations : 293228
SoundCloud - Music & Audio : 135744
Magic Piano by Smule : 131695
Smule Sing! : 119316
TuneIn Radio - MLB NBA Audiobooks Podcasts Music : 110420
Amazon Music : 106235
SoundHound Song Search & Music Player : 82602
TuneIn Radio Pro - MLB Audiobooks Podcasts Music : 71609
Sonos Controller : 48905
Tabs & Chords by Ultimate Guitar - learn and play : 35045
I Am T-Pain 2.0 : 32650
Bandsintown Concerts : 30845
Karaoke - Sing Karaoke, Unlimited Songs! : 28606
My Mixtapez Music : 26286
Sing Karaoke Songs Unlimited with StarMaker : 26227
Ringtones for iPhone & Ringtone Maker : 25403
Musi - Unlimited Music For YouTube : 25193
AutoRap by Smule : 18202
Spinrilla - Mixtapes For Free : 15053
Napster - Top Music & Radio : 14268
edjing Mix:DJ turntable to remix and scratch music : 13580
Free Music - MP3 Streamer & Playlist Manager Pro : 13

Same pattern applies to Music that Pandora, Spotify, Shazam, etc. are dominating the music market, every of which has more than 400k ratings, while most apps in this genre struggle to get past 100k threshold.

The third popular genre is 'Reference'. Let's explore it.

In [41]:
for app in apple_English_apps:
    if app[-5]=='Reference':
        print(app[1],':',app[5])

Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Sky Guide: View Stars Night or Day : 22100
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
Dictionary.com Dictionary & Thesaurus Premium : 11530
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
WolframAlpha : 7410
e-Sword HD: Bible Study Made Easy : 7309
iHandy Translator Pro : 5163
Dictionary.com Premium Dictionary & Thesaurus for iPad : 4922
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
Speak & Translate － Live Voice and Text Translator : 4344
National Geographic World Atlas : 4255
Knots 3D : 3196
iQuran : 2929
Merriam-Webster Dictionary & Thesaurus : 2843
e-Sword LT: Bible Study on the Go : 2152
GUNS MODS for Minecraf

It seems to be a niche. Only Bible and Dictionary.com skew the statistics, and other apps in this genre seem to have the potential. Therefore, I would recommend __Reference__ to be a choice of The App Store for our team.

Then, let's explore Google Play.

In [43]:
cate_freq=freq_table(google_English_apps,1)
for cate in cate_freq:
    total=0
    len_category=0
    for app in google_English_apps:
        category_app=app[1]
        if category_app==cate:
            num_installs=app[5]
            num_installs=num_installs.replace('+','')
            num_installs=num_installs.replace(',','')
            total+=float(num_installs)
            len_category+=1
            
    avg_num=total/len_category
    print(cate,':',avg_num)

ART_AND_DESIGN : 1887285.0
AUTO_AND_VEHICLES : 632501.3214285715
BEAUTY : 513151.88679245283
BOOKS_AND_REFERENCE : 7641777.871559633
BUSINESS : 1663758.627684964
COMICS : 817657.2727272727
COMMUNICATION : 35153714.17515924
DATING : 828971.2176470588
EDUCATION : 1782566.0377358492
ENTERTAINMENT : 11375402.298850575
EVENTS : 249580.640625
FINANCE : 1319851.4028985507
FOOD_AND_DRINK : 1891060.2767857143
HEALTH_AND_FITNESS : 3972300.388888889
HOUSE_AND_HOME : 1331540.5616438356
LIBRARIES_AND_DEMO : 630903.6904761905
LIFESTYLE : 1369954.7774725275
GAME : 14256217.600635594
FAMILY : 3345018.516684607
MEDICAL : 96944.49873417722
SOCIAL : 22961790.384937238
SHOPPING : 6966908.880597015
PHOTOGRAPHY : 16636241.267857144
SPORTS : 3373767.6861538463
TRAVEL_AND_LOCAL : 13218662.767123288
TOOLS : 9785955.211352658
PERSONALIZATION : 4086652.4853333333
PRODUCTIVITY : 15530942.008042896
PARENTING : 525351.8333333334
WEATHER : 4570892.658227848
VIDEO_PLAYERS : 24121489.079754602
NEWS_AND_MAGAZINES : 947

The most popular genre is __COMMUNICATION__,with 35153714 times of average installs. Let's explore apps in this genre.

In [45]:
for app in google_English_apps:
    if app[1]=='COMMUNICATION':
        print(app[0],':',app[5])

WhatsApp Messenger : 1,000,000,000+
Messenger for SMS : 10,000,000+
My Tele2 : 5,000,000+
imo beta free calls and text : 100,000,000+
Contacts : 50,000,000+
Call Free – Free Call : 5,000,000+
Web Browser & Explorer : 5,000,000+
Browser 4G : 10,000,000+
MegaFon Dashboard : 10,000,000+
ZenUI Dialer & Contacts : 10,000,000+
Cricket Visual Voicemail : 10,000,000+
TracFone My Account : 1,000,000+
Xperia Link™ : 10,000,000+
TouchPal Keyboard - Fun Emoji & Android Keyboard : 10,000,000+
Skype Lite - Free Video Call & Chat : 5,000,000+
My magenta : 1,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Seznam.cz : 1,000,000+
Antillean Gold Telegram (original version) : 100,000+
AT&T Visual Voicemail : 10,000,000+
GMX Mail : 10,000,000+
Omlet Chat : 10,000,000+
My Vodacom SA : 5,000,000+
Microsoft Edge : 5,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Calls & Text by Mo+ : 5,000,000+
free 

The pattern is similar to what we saw in the App Store that giants like WhatsApp and Messager are dominating the market. The pattern is repeated for Video games, Social apps, Photography apps, etc.

The books and reference genre seems to be popular as well, with an average number of installs of 8,767,811.Since it's the recommended genre for the App Store, and our aim is to recommend an app genre for both the App Store and Google Play. Let's explore this genre a little bit.

In [46]:
for app in google_English_apps:
    if app[1] == 'BOOKS_AND_REFERENCE':
        print(app[0], ':', app[5])

E-Book Read - Read Book for free : 50,000+
Download free book with green book : 100,000+
Wikipedia : 10,000,000+
Cool Reader : 10,000,000+
Free Panda Radio Music : 100,000+
Book store : 1,000,000+
FBReader: Favorite Book Reader : 10,000,000+
English Grammar Complete Handbook : 500,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
Google Play Books : 1,000,000,000+
AlReader -any text book reader : 5,000,000+
Offline English Dictionary : 100,000+
Offline: English to Tagalog Dictionary : 500,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
Recipes of Prophetic Medicine for free : 500,000+
ReadEra – free ebook reader : 1,000,000+
Anonymous caller detection : 10,000+
Ebook Reader : 5,000,000+
Litnet - E-books : 100,000+
Read books online : 5,000,000+
English to Urdu Dictionary : 500,000+
eBoox: book reader fb2 epub zip : 1,000,000+
English Persian Dictionary : 500,000+
Flybook : 500,000+
All Maths Formulas : 1,000,000+
Ancestry : 5,000,000+
HTC Help : 10,000,000+
E

In [47]:
for app in google_English_apps:
    if app[1] == 'BOOKS_AND_REFERENCE' and (app[5] == '1,000,000,000+'
                                            or app[5] == '500,000,000+'
                                            or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

Google Play Books : 1,000,000,000+
Bible : 100,000,000+
Amazon Kindle : 100,000,000+
Wattpad 📖 Free Books : 100,000,000+
Audiobooks from Audible : 100,000,000+


It looks like only a few apps are extremely popular, so this niche still shows potential. Therefore, I would recommend __Books and reference__ to be within our consideration.

## Conclusions

In this project, we analyse datasets about apps in the App Store and Google Play, aiming to find out the most potential app genre to help out company to make data-driven decisions.

For both App Store and Google Play, my recommendation of genre is __Books and reference__.

