# Title

This project analyzes data on engagement with in-app advertisements in our Android and iOS mobile apps on Google Play and the App Store.

Our goal is to help our developers understand which types of apps are likely to attract more users.

In [1]:
from csv import reader

def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

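# Read each CSV file into a list of rows; index 0 holds the header row.
# The with-blocks close the files once they have been read.
with open('AppleStore.csv', encoding='utf8') as opened_file_apple:
    app_data_apple = list(reader(opened_file_apple))

with open('googleplaystore.csv', encoding='utf8') as opened_file_android:
    app_data_android = list(reader(opened_file_android))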

explore_data(app_data_apple, 0, 5, True )
print('\n')
explore_data(app_data_android, 0, 5, True)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


Number of rows: 7198
Number of columns: 16


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo 

# Potential Columns

* For the App Store applications (iOS), the columns that could aid our analysis are:

 1. price
 2. rating_count_tot
 3. rating_count_ver
 4. user_rating
 5. user_rating_ver
 6. sup_devices.num
 7. prime_genre
 ---
* For the Google Play Store applications (Android), the potentially useful columns are:

 1. Rating
 2. Reviews
 3. Installs
 4. Price
 5. Type
 6. Content Rating
 7. Genres

# Data Cleaning

Below we will clean up the data before starting our analysis. We will be performing the following tasks:

* Detect and correct/remove inaccurate data
* Detect and remove duplicate data
* Remove non-English and paid applications, as our company only builds free apps for an English-speaking audience.

In [2]:
# First we detect and remove inaccurate data.

# One row is missing its Category value, so every later entry in that
# row is shifted one column to the left, which makes the data inaccurate.
# Let's remove this row (index 10473, counting the header row as index 0).

In [3]:
print([app_data_android[10473]])

[['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']]


In [4]:
# Let's remove this row with an error from the data set using
# the del command.

In [5]:
del app_data_android[10473]

In [6]:
# Now let's recheck that row to see if the previous one was
# deleted.

In [7]:
print(app_data_android[10473])

['osmino Wi-Fi: free WiFi', 'TOOLS', '4.2', '134203', '4.1M', '10,000,000+', 'Free', '0', 'Everyone', 'Tools', 'August 7, 2018', '6.06.14', '4.4 and up']


# Duplicate Entries

We need to be careful not to include duplicate entries when we perform our analysis, so let's first check whether the dataset contains any.


We do so as follows:

* We create two empty lists: **duplicates**, which will hold the name of any application that appears more than once, and **unique**, which will hold each app name the first time it shows up in the loop.  
<br>

* We then loop through the dataset. If the name of the app is already in the unique list, we append it to the duplicates list; if not, we append it to the unique list, as this is the first time it appears.  
<br>

* Finally, we print the number of duplicate apps in the dataset along with the first 10 examples.

In [8]:
duplicates = []
unique = []

for app in app_data_android:
    name = app[0]
    if name in unique:
        duplicates.append(name)
    else:
        unique.append(name)
        
print("Number of duplicate apps:", len(duplicates))
print("\n")
print("Examples of duplicate apps:", duplicates[:10])

Number of duplicate apps: 1181


Examples of duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack']
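
The `name in unique` test scans a list, so the check gets slower as the list grows. As a minimal sketch (not part of the original notebook), the same logic can use a set for membership tests. The `find_duplicates` helper and the `rows` sample below are hypothetical names introduced only for illustration.

```python
def find_duplicates(rows, name_index=0):
    """Return the names that repeat, one entry per repeated occurrence."""
    seen = set()        # names encountered so far (fast membership test)
    duplicates = []     # filled each time a name shows up again
    for row in rows:
        name = row[name_index]
        if name in seen:
            duplicates.append(name)
        else:
            seen.add(name)
    return duplicates


# Hypothetical sample rows shaped like [app name, number of reviews]
rows = [['Slack', '51507'], ['Box', '159'], ['Slack', '51507'], ['Slack', '51510']]
duplicates = find_duplicates(rows)
print("Number of duplicate apps:", len(duplicates))
```

On the full dataset this should report the same duplicates as the list-based loop, since only the lookup structure changes.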


# Duplicate Removal Criterion 

Now that we know there are duplicates in our dataset, let's check whether the duplicate entries differ from one another.

In [9]:
for app in app_data_android:
    name = app[0]
    if name == 'Slack':
        print(app)
        

['Slack', 'BUSINESS', '4.4', '51507', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'August 2, 2018', 'Varies with device', 'Varies with device']
['Slack', 'BUSINESS', '4.4', '51507', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'August 2, 2018', 'Varies with device', 'Varies with device']
['Slack', 'BUSINESS', '4.4', '51510', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'August 2, 2018', 'Varies with device', 'Varies with device']


As we can see, the 'Slack' application appears 3 times in the Android dataset. The first 2 entries are identical, but in the third the number of reviews has increased by 3, from 51507 to 51510.



A higher number of reviews suggests a more recent entry, so for every duplicated app we will keep the entry with the highest number of reviews and remove the others.

# Removal

We start the removal process by creating an empty dictionary called **reviews_max**, which will hold the app name as the key and the highest number of reviews for that app as its value.

Next, we go through the dataset one app at a time.

If the app is appearing for the first time, we add it to the **reviews_max** dictionary with its number of reviews as the value.

If the app name already appears in **reviews_max** as a key, but the number of reviews for the current entry is higher than the one stored, we update the stored value to what is now the highest number of reviews for that app.

We then print the length of **reviews_max** to make sure the number of apps equals the size of the dataset minus the number of duplicates.

Since we removed an inaccurate row and exclude the header row, which contains the column names, our Android dataset has 10840 apps. We found earlier that it also contains 1181 duplicate entries. Subtracting (10840 - 1181), we should end up with a **reviews_max** length of 9659, which is exactly what we get.


In [10]:
reviews_max = {}

for app in app_data_android[1:]:
    name = app[0]
    n_reviews = float(app[3])
    if name not in reviews_max:
        # First time we see this app
        reviews_max[name] = n_reviews
    elif reviews_max[name] < n_reviews:
        # A later entry has more reviews, so keep the larger count
        reviews_max[name] = n_reviews

print(len(reviews_max))

9659


Now that we have the proper applications and their highest reviews, we can update the dataset. Remember that the **reviews_max** dictionary only contained the name of the app and its number of reviews. 

To get the complete cleaned android data, we first create two lists called **android_clean** and **already_added**.

The **android_clean** list will store our new cleaned dataset.

The **already_added** list will store the app names to keep track of which apps we have already added. We need this list because some apps have multiple entries with the highest number of reviews, and we want to avoid adding them more than once.

* We first loop through the complete Android dataset (excluding the header) and store the app name and number of reviews in corresponding variables.<br>


* We check to see if the number of reviews is equal to the maximum for that app, which we saved in our reviews_max dictionary. If it is, then we check to see if we have already added the name of the app in our **android_clean** dataset.<br>


* If we have not added the name in our **android_clean** dataset, we append the app's entire row of information to the **android_clean** dataset, and append the name of the app to the **already_added** to ensure we don't add it again.


Once we're finished, we will again check to see the length and make sure it is 9659 and then print out the first 10 apps in our newly updated **android_clean** dataset.

 

In [11]:
android_clean = []
already_added = []

for app in app_data_android[1:]:
    name = app[0]
    n_reviews = float(app[3])
    
    if (n_reviews == reviews_max[name]) and (name not in already_added):
        android_clean.append(app)
        already_added.append(name)
        
print(len(android_clean))
android_clean[:10]

9659


[['Photo Editor & Candy Camera & Grid & ScrapBook',
  'ART_AND_DESIGN',
  '4.1',
  '159',
  '19M',
  '10,000+',
  'Free',
  '0',
  'Everyone',
  'Art & Design',
  'January 7, 2018',
  '1.0.0',
  '4.0.3 and up'],
 ['U Launcher Lite – FREE Live Cool Themes, Hide Apps',
  'ART_AND_DESIGN',
  '4.7',
  '87510',
  '8.7M',
  '5,000,000+',
  'Free',
  '0',
  'Everyone',
  'Art & Design',
  'August 1, 2018',
  '1.2.4',
  '4.0.3 and up'],
 ['Sketch - Draw & Paint',
  'ART_AND_DESIGN',
  '4.5',
  '215644',
  '25M',
  '50,000,000+',
  'Free',
  '0',
  'Teen',
  'Art & Design',
  'June 8, 2018',
  'Varies with device',
  '4.2 and up'],
 ['Pixel Draw - Number Art Coloring Book',
  'ART_AND_DESIGN',
  '4.3',
  '967',
  '2.8M',
  '100,000+',
  'Free',
  '0',
  'Everyone',
  'Art & Design;Creativity',
  'June 20, 2018',
  '1.1',
  '4.4 and up'],
 ['Paper flowers instructions',
  'ART_AND_DESIGN',
  '4.4',
  '167',
  '5.6M',
  '50,000+',
  'Free',
  '0',
  'Everyone',
  'Art & Design',
  'March 26, 2017

Now let's do the same thing for the App Store (iOS) applications.


In [12]:
duplicates_apple = []
unique_apple = []

for app in app_data_apple[1:]:
    name = app[0]  # column 0 is the unique 'id', not the app name
    if name in unique_apple:
        duplicates_apple.append(name)
    else:
        unique_apple.append(name)
        
        
print("Number of duplicate apps:", len(duplicates_apple))



Number of duplicate apps: 0


From the code above, we see that the App Store dataset has no duplicate entries (the check uses the unique **id** column), so we'll skip the removal process and move on to the next step in our cleaning.

# Checking for Non-English Apps

Now that we have removed all duplicate apps from our dataset, let's move on to removing the apps that are not in English.

We begin by writing a function that tells us whether an app name is detected as English or non-English.

In [13]:
def is_english(application):
    for char in application:
        if ord(char) > 127:
            return False
    
    return True


print(is_english('Instagram'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
    

True
False


From the test above, we can see that the **is_english** function works as intended. It checks each character of its argument using the integer code point that **ord** returns for that character. Characters commonly used in English text fall in the ASCII range (0 to 127), so if **ord(char)** returns an integer greater than 127, the function treats the character as non-English and returns False. If we loop through the entire string without finding such a character, we treat the app name as English.
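
To make the threshold concrete, here is a small sketch that prints the code points of a few characters taken from the two names tested above. The `non_ascii_chars` helper is a hypothetical addition for illustration, not something the notebook defines.

```python
# Print each character, its code point, and whether it falls outside ASCII.
for char in 'Im爱《':
    print(repr(char), ord(char), ord(char) > 127)


def non_ascii_chars(text):
    """Return the characters of text whose code point is greater than 127."""
    return [char for char in text if ord(char) > 127]


print(non_ascii_chars('Instagram'))
print(non_ascii_chars('爱奇艺PPS'))
```

Listing the offending characters, rather than only returning True or False, can make it easier to see why a particular name was rejected.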

Although this is a good start, we know that certain applications might still be in English but have some character(s) with integer values greater than 127. Below are a couple of examples.

In [14]:
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('Instachat 😜'))


False
False


Because of this, it would be more effective if we consider a new condition - to accept up to 3 characters with integer value greater than 127. Doing this would allow up to 3 emoji or other special characters. If we only allow 3 characters, the majority of the application name will still be English and so we should detect it as such.

In [15]:
def updated_is_english(application):
    count = 0
    for char in application:
        if ord(char) > 127:
            count += 1
        if count > 3:
            return False
    
    return True

print(updated_is_english('Docs To Go™ Free Office Suite'))
print(updated_is_english('Instachat 😜'))
print(updated_is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))

True
True
False


Our **updated_is_english** function now appears to accept the 2 names that were rejected in our previous **is_english** function.


Now, let's use the updated function on our iOS and Android datasets and see how many apps (rows) we end up with.

In [16]:
new_ios = []
new_android = []

for app in app_data_apple[1:]:
    name = app[1]
    if updated_is_english(name):
        new_ios.append(app)

    
print("After removal of any duplicates and non-English apps, \nthe number of IOS apps is now:", len(new_ios))

for app in android_clean:
    name_android = app[0]
    if updated_is_english(name_android):
        new_android.append(app)
        
print('\n')
print("After removal of any duplicates and non-English apps, \nthe number of Android apps is now:", len(new_android))
    

After removal of any duplicates and non-English apps, 
the number of IOS apps is now: 6183


After removal of any duplicates and non-English apps, 
the number of Android apps is now: 9614


# Isolating Free Applications

Now that we have removed all duplicates as well as non-English applications, our last area of focus is isolating the free apps, since our hypothetical company only builds apps that are free to download and install.

In [17]:
free_ios = []
free_android = []

for app in new_ios:
    price = app[4]
    if price == '0.0':
        free_ios.append(app)
        
print("Number of free IOS apps:",len(free_ios))

for app in new_android:
    price = app[7]
    if price == '0':
        free_android.append(app)

print("Number of free Android apps:",len(free_android))

Number of free IOS apps: 3222
Number of free Android apps: 8864


After isolating the free apps in the datasets, our data of interest now contains 3222 iOS apps and 8864 Android apps.



# Strategy

Our strategy for app ideas comprises the following three steps.

1. Build a basic Android version of the app and add it to the Google Play store. <br>
2. If the app receives good ratings/response from users, we will develop it further. <br>
3. If the app becomes profitable after a 6-month period, we proceed with an iOS version and add it to the App Store.


Since our goal is to make apps that are available on both Android and iOS, we need to sift through our datasets and find profiles of apps that are successful in both markets. This should give us the best idea of the type of apps that could find success.

After inspecting both datasets, we think the best approach is to generate frequency tables for the **prime_genre** column of the iOS dataset and the **Category** and **Genres** columns of the Android dataset.

In [22]:
def freq_table(dataset, index):
    dictionary = {}
    dictionary_as_a_percent = {}
    for app in dataset:
        value = app[index]
        if value in dictionary:
            dictionary[value] += 1
        else: 
            dictionary[value] = 1
    
    for key in dictionary:
        percent = (dictionary[key]/len(dataset)) * 100
        dictionary_as_a_percent[key] = percent
    return dictionary_as_a_percent


        
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
        

print("Below is the frequency table for the prime_genre\ncolumn.\n")
display_table(free_ios, 11)   # The index number for prime_genre column is 11 in IOS dataset.

print("\n")

print("Below is the frequency table for the Genre column \n")
display_table(free_android, 9) # The index number for Genre column is 9 in Android dataset

print("\n")

print("Below is the frequency table for the Category column \n")
display_table(free_android, 1) # The index number for the Category column is 1 in Android dataset.

    

Below is the frequency table for the prime_genre
column.

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


Below is the frequency table for the Genre column 

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Fi

Examining the results of the frequency tables:

* The most common genre in the App Store dataset is **Games** by a landslide, with **Entertainment** as the runner-up. <br>


* The pattern seems to be that genres relating to entertainment **(Games, Entertainment, Photo & Video, Social Networking, etc.)** are the most common, while practical apps **(Education, Lifestyle, Finance, Shopping, Productivity, etc.)** are less common. <br>


* The most common genre in the Google Play store dataset is **Tools**, with **Entertainment** coming in second. However, the gap between the percentages is much smaller than it was for the iOS genres. <br>


* From the **Category** table, it appears that **Game** is still common, similar to the App Store data. Hence it would be reasonable to say that making a game has high potential. <br>


* However, note that the most common genres also mean higher competition, as users have more alternatives to choose from. On the other hand, the least common genres, although they may suggest a lack of users, could also offer the opportunity to capture a high market share because few apps exist in those genres. <br>







# Average Number of Ratings Per Genre in App Store

We now move ahead and find out the average number of ratings per genre. To do this, we add up the number of ratings for each app in a specific genre and then divide that sum by the number of apps in that genre.

As a hypothetical example, say we have a genre called **Gummy Bears** which has a total of 4 apps: **Blue**, **Green**, **Red** and **Yellow**. Furthermore, let's say the numbers of ratings for these apps are 25, 19, 59, and 7 respectively.

Then, the average number of ratings would be (25+19+59+7)/4 = 27.5
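
The same arithmetic as a short sketch, using the made-up **Gummy Bears** numbers from the example above (the `ratings` dictionary is purely illustrative):

```python
# Hypothetical rating counts for the four apps in the made-up genre.
ratings = {'Blue': 25, 'Green': 19, 'Red': 59, 'Yellow': 7}

total = sum(ratings.values())        # 25 + 19 + 59 + 7
len_genre = len(ratings)             # number of apps in the genre
average_number_ratings = total / len_genre

print(average_number_ratings)        # 27.5
```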


For our project, we first use the **freq_table** function to obtain all the genres and save it to a variable called **apple_genre**. 

Then, for each genre in **apple_genre**, we loop through the entire **free_ios** dataset, and every time the genre of an app in **free_ios** matches the current genre, we add the **rating_count_tot** value of that app to a variable called **total**. We also increment **len_genre** by 1. At the end of the loop, we print the average number of ratings for the genre, which is **total/len_genre**, where **total** contains the total number of ratings in the genre and **len_genre** contains the number of apps in the genre.

We then repeat with the second genre in apple_genre until all genres are looped through and the average number of ratings for each genre is determined.  

In [23]:
apple_genre = freq_table(free_ios, 11)

for genre in apple_genre:
    total = 0
    len_genre = 0
    for app in free_ios:
        genre_app = app[11]
        if genre_app == genre:
            number_ratings = float(app[5])
            total = total + number_ratings
            len_genre += 1
    
    average_number_ratings = total / len_genre
    print(genre, ':', average_number_ratings)
    

Lifestyle : 16485.764705882353
Productivity : 21028.410714285714
Medical : 612.0
News : 21248.023255813954
Shopping : 26919.690476190477
Business : 7491.117647058823
Education : 7003.983050847458
Photo & Video : 28441.54375
Book : 39758.5
Entertainment : 14029.830708661417
Travel : 28243.8
Finance : 31467.944444444445
Music : 57326.530303030304
Food & Drink : 33333.92307692308
Reference : 74942.11111111111
Utilities : 18684.456790123455
Health & Fitness : 23298.015384615384
Sports : 23008.898550724636
Games : 22788.6696905016
Social Networking : 71548.34905660378
Catalogs : 4004.0
Weather : 52279.892857142855
Navigation : 86090.33333333333
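
The nested loop above re-scans the whole dataset once per genre. As an alternative sketch, the totals and counts can be accumulated in a single pass. The `average_by_group` helper and the `sample` rows are hypothetical and only illustrate the idea.

```python
def average_by_group(rows, group_index, value_index):
    """Average the numeric column value_index for each value of group_index."""
    totals = {}   # group -> running sum of values
    counts = {}   # group -> number of rows seen
    for row in rows:
        group = row[group_index]
        totals[group] = totals.get(group, 0.0) + float(row[value_index])
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical rows shaped like [genre, rating count]
sample = [['Games', '10'], ['Games', '30'], ['Weather', '5']]
averages = average_by_group(sample, 0, 1)
print(averages)
```

With the notebook's data, `average_by_group(free_ios, 11, 5)` should give the same per-genre averages as the cell above, assuming `free_ios` is built as shown earlier.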


We can see that Social Networking has an average of about 71548 ratings. A closer look shows that this figure is dominated by social media giants like Facebook and Pinterest, each with more than 1,000,000 total ratings.

In [27]:
for app in free_ios:
    if app[11] == 'Social Networking':
        print(app[1], ':', app[5])

Facebook : 2974676
Pinterest : 1061624
Skype for iPhone : 373519
Messenger : 351466
Tumblr : 334293
WhatsApp Messenger : 287589
Kik : 260965
ooVoo – Free Video Call, Text and Voice : 177501
TextNow - Unlimited Text + Calls : 164963
Viber Messenger – Text & Call : 164249
Followers - Social Analytics For Instagram : 112778
MeetMe - Chat and Meet New People : 97072
We Heart It - Fashion, wallpapers, quotes, tattoos : 90414
InsTrack for Instagram - Analytics Plus More : 85535
Tango - Free Video Call, Voice and Chat : 75412
LinkedIn : 71856
Match™ - #1 Dating App. : 60659
Skype for iPad : 60163
POF - Best Dating App for Conversations : 52642
Timehop : 49510
Find My Family, Friends & iPhone - Life360 Locator : 43877
Whisper - Share, Express, Meet : 39819
Hangouts : 36404
LINE PLAY - Your Avatar World : 34677
WeChat : 34584
Badoo - Meet New People, Chat, Socialize. : 34428
Followers + for Instagram - Follower Analytics : 28633
GroupMe : 28260
Marco Polo Video Walkie Talkie : 27662
Miitomo : 2

# Average Number of Installs per Category in Google Play Store

For Android applications, we do not have a rating count column, but we do have a column called **Installs**, which should be very helpful in determining which categories are most in demand. Therefore, we use the same approach as with the average rating count for iOS genres to determine the average number of installs per Google Play Store category.

One key difference in the code is that we cannot get the precise number of installs. The values in the Google Play dataset are open-ended ranges such as **100,000+**, **1,000,000+**, **100,000,000+** and so on.

Therefore, in order to calculate the average number of installs, we need to replace the **'+'** and **','** characters with empty strings so that we can convert the string into a float. Only then can we perform the average calculation.

Furthermore, since 1,000,000+ does not mean exactly 1,000,000, our average number of installs won't be exact, but it still gives us a reasonable value that we can compare across categories.
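
The string cleanup can be tried on single values first. This is a minimal sketch; `parse_installs` is a hypothetical helper name, and the result is only the lower bound of each range.

```python
def parse_installs(text):
    """Convert a string such as '1,000,000+' to a float (the lower bound)."""
    return float(text.replace('+', '').replace(',', ''))


for raw in ['100,000+', '1,000,000+', '100,000,000+']:
    print(raw, '->', parse_installs(raw))
```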

In [29]:
android_category = freq_table(free_android, 1)

for category in android_category:
    total = 0
    len_category = 0
    for app in free_android:
        category_app = app[1]
        if category_app == category:
            number_installs = app[5]
            number_installs = number_installs.replace('+', '')
            number_installs = number_installs.replace(',', '')
            total += float(number_installs)
            len_category += 1
        
    average_installs = total / len_category
    print(category,':',average_installs)
    

SPORTS : 3638640.1428571427
PRODUCTIVITY : 16787331.344927534
PHOTOGRAPHY : 17840110.40229885
AUTO_AND_VEHICLES : 647317.8170731707
COMMUNICATION : 38456119.167247385
MAPS_AND_NAVIGATION : 4056941.7741935486
SOCIAL : 23253652.127118643
EDUCATION : 1833495.145631068
BUSINESS : 1712290.1474201474
HEALTH_AND_FITNESS : 4188821.9853479853
HOUSE_AND_HOME : 1331540.5616438356
LIFESTYLE : 1437816.2687861272
DATING : 854028.8303030303
LIBRARIES_AND_DEMO : 638503.734939759
PARENTING : 542603.6206896552
WEATHER : 5074486.197183099
TOOLS : 10801391.298666667
TRAVEL_AND_LOCAL : 13984077.710144928
ART_AND_DESIGN : 1986335.0877192982
SHOPPING : 7036877.311557789
NEWS_AND_MAGAZINES : 9549178.467741935
MEDICAL : 120550.61980830671
VIDEO_PLAYERS : 24727872.452830188
FAMILY : 3695641.8198090694
PERSONALIZATION : 5201482.6122448975
EVENTS : 253542.22222222222
BOOKS_AND_REFERENCE : 8767811.894736841
COMICS : 817657.2727272727
FINANCE : 1387692.475609756
FOOD_AND_DRINK : 1924897.7363636363
GAME : 15588015.6

As we did for the **Social Networking** genre in the **App Store**, we now list the apps in the **Social** category of the **Google Play store** along with their number of installs.



In [33]:
for app in free_android:
    if app[1] == 'SOCIAL':
        print(app[0], ':', app[5])

Facebook : 1,000,000,000+
Facebook Lite : 500,000,000+
Tumblr : 100,000,000+
Social network all in one 2018 : 100,000+
Pinterest : 100,000,000+
TextNow - free text + calls : 10,000,000+
Google+ : 1,000,000,000+
The Messenger App : 1,000,000+
Messenger Pro : 1,000,000+
Free Messages, Video, Chat,Text for Messenger Plus : 1,000,000+
Telegram X : 5,000,000+
The Video Messenger App : 100,000+
Jodel - The Hyperlocal App : 1,000,000+
Hide Something - Photo, Video : 5,000,000+
Love Sticker : 1,000,000+
Web Browser & Fast Explorer : 5,000,000+
LiveMe - Video chat, new friends, and make money : 10,000,000+
VidStatus app - Status Videos & Status Downloader : 5,000,000+
Love Images : 1,000,000+
Web Browser ( Fast & Secure Web Explorer) : 500,000+
SPARK - Live random video chat & meet new people : 5,000,000+
Golden telegram : 50,000+
Facebook Local : 1,000,000+
Meet – Talk to Strangers Using Random Video Chat : 5,000,000+
MobilePatrol Public Safety App : 1,000,000+
💘 WhatsLov: Smileys of love, sti

As you can see, Facebook and Facebook Lite combined have over 1.5 billion installs, while Tumblr, third in the list, has over 100,000,000. As with the Social Networking genre in the App Store, Facebook skews the average number of installs for the Google Play Store **'SOCIAL'** category as well.
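
One way to see the skew is to compare the mean with the median, which is less sensitive to a few very large values. The sketch below uses only the install counts of the first five apps in the listing above, so it is an illustration and not a result for the whole category; the median comparison is an addition here, not part of the original analysis.

```python
from statistics import mean, median

# Install counts of the first five SOCIAL apps listed above ('+' and ',' removed).
installs = [1_000_000_000, 500_000_000, 100_000_000, 100_000, 100_000_000]

print('mean  :', mean(installs))
print('median:', median(installs))
```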

# Final Decision

After taking a closer look at both the App Store average number of ratings and the Google Play store average number of installs, I've come up with two suggestions for our company. Since we're hoping for success on both platforms, I wanted to determine the genres/categories that have both a high number of ratings in the App Store and a high number of installs in the Google Play Store.


If we examine the average number of ratings carefully, we notice that the top 3 genres are **Navigation**, **Reference**, and **Social Networking**.

**Social** is also a category with a large average number of installs in the Google Play store. However, giants like Facebook hold a large market share, which could make things extremely difficult for newcomers entering the niche. This is why I would advise against making apps that fit into the social / social networking genre.

With that being said, both the **Navigation** and **Books and Reference** categories have a decent average number of installs, and it would be more realistic to capture an audience in one of these categories, as there aren't any obvious giants among the alternatives.

I would advise our company to make an app geared towards Books and Reference, as there's only so much variability you can achieve with Navigation apps.

I would say the best course of action in this regard is to make an app that is based on a book, comic or manga, and which includes a couple of features, for example:

* characters and their personality/description
* map of the setting
* jigsaw puzzles
* book quizzes
* book slang 
* book discussion forum
* chapter summaries
* option to allow people to write their own chapter and share with others.

If the app becomes successful and the book popular, we can potentially move into the gaming sector and do something like **"Pokemon Go"**: a free-to-play game based on the book we're referencing that also offers in-app purchases, which boosts the potential for profitability.