# Google and Apple app store analysis to find which free app profiles are profitable

Our main objective in this project is to identify the kinds of free apps that are likely to be profitable on both stores. The data sets used in this project come from kaggle.com: the [Google Play data set](https://www.kaggle.com/lava18/google-play-store-apps) and the [App Store data set](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/version/7#_=_).

Together, the data sets describe more than 15,000 apps, with details such as app name, rating, version and number of downloads.

## Opening the data sets in Python as lists
In the following code block, we open the locally stored data sets using Python's built-in ```csv``` module.
To read each file into a list, we use:
- the built-in ```open``` function
- the ```reader``` function from ```csv```
- and finally, the ```list``` function
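
The same three steps can also be wrapped in a ```with``` block, which closes the file once the rows have been read. This is only a minimal sketch: the helper name ```load_csv``` and the tiny sample file are assumptions made for illustration and are not part of the original notebook.

```python
import csv
import os
import tempfile

def load_csv(path):
    """Return (header, rows) for a CSV file. load_csv is a hypothetical helper."""
    with open(path, encoding="utf8", newline="") as f:
        rows = list(csv.reader(f))
    return rows[0], rows[1:]

# Write a two-row sample file so the sketch runs without the Kaggle data.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(sample_path, "w", encoding="utf8", newline="") as f:
    csv.writer(f).writerows([["App", "Price"], ["Demo App", "0"]])

header, rows = load_csv(sample_path)
print(header)
print(rows)
```

With the real files, the call would take the path of the downloaded CSV instead of ```sample_path```.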

In [179]:
from csv import reader

### The Google Play data set ###
opened_file = open(r'F:\Google Drive\DataQuest\Programming in Python\google-play-store-apps\googleplaystore.csv', encoding="utf8")  # raw string keeps the backslashes literal
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]
android = android[1:]

### The App Store data set ###
opened_file = open(r'F:\Google Drive\DataQuest\Programming in Python\storeappledataset10kapps\AppleStore.csv', encoding="utf8")  # raw string keeps the backslashes literal
read_file = reader(opened_file)
ios = list(read_file)
ios_header = ios[0]
ios = ios[1:]

## Exploring the data set

Now that we have opened the data sets, we need a quick way to look at a few rows and to check the number of rows and columns. We will need this throughout the project, so we create a function for it named ```explore_data```.

```explore_data``` takes **four** parameters:
- dataset (the data set, as a list of rows)
- start (row from which you want to explore)
- end (row at which to stop; this row is not included)
- rows_and_columns (set to True if you also want to print the number of rows and columns)

In [180]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line between rows
        
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

print(ios_header)
print('\n')
explore_data(ios, 0, 3, True)

['', 'id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['1', '281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99', '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5', '10', '1']


['2', '281796108', 'Evernote - stay organized', '158578688', 'USD', '0', '161065', '26', '4', '3.5', '8.2.2', '4+', 'Productivity', '37', '5', '23', '1']


['3', '281940292', 'WeatherBug - Local Weather, Radar, Maps, Alerts', '100524032', 'USD', '0', '188583', '2822', '3.5', '4.5', '5.0.0', '4+', 'Weather', '37', '5', '3', '1']


Number of rows: 7197
Number of columns: 17


While cleaning the data, I came across an error. The discussion page of the data set on kaggle.com reports an [error in one of the rows](https://www.kaggle.com/lava18/google-play-store-apps/discussion/66015) of the original file. To correct it, we delete the affected row.

In [181]:

del android[10472]


In our data sets, many apps appear more than once. For example, Instagram has more than one entry. We have to remove the repeated entries and keep only one per app before performing the analysis; otherwise, the duplicates would distort the results.

To isolate the unique apps:
- we create two empty lists for each store
    - duplicate_apps and unique_apps (suffixed with ```_and``` and ```_ios``` in the code)
- we scan through all apps in the data set: if a name is already in the unique_apps list, we add it to the duplicate_apps list; otherwise, we add it to the unique_apps list.
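
The scan described above can be sketched on a handful of names. This variant keeps the unique names in a ```set``` rather than a list, which makes the membership test faster on large data sets; the function name ```split_duplicates``` and the sample names are assumptions made for illustration.

```python
def split_duplicates(names):
    """Return (set of unique names, list of repeated occurrences)."""
    seen = set()
    duplicates = []
    for name in names:
        if name in seen:
            duplicates.append(name)   # every repeat after the first sighting
        else:
            seen.add(name)
    return seen, duplicates

# Hypothetical sample: one app listed three times, one listed once.
sample_names = ["Instagram", "Evernote", "Instagram", "Instagram"]
unique, dupes = split_duplicates(sample_names)
print(len(unique), len(dupes))
```

As in the notebook's version, an app that appears three times contributes two entries to the list of duplicates.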




In [182]:
#for android
duplicate_apps_and = []
unique_apps_and = []

for app in android:
    name = app[0]
    if name in unique_apps_and:
        duplicate_apps_and.append(name)
    else:
        unique_apps_and.append(name)
    
print('Number of duplicate Android apps:', len(duplicate_apps_and))

#for ios
duplicate_apps_ios = []
unique_apps_ios = []

for app in ios:
    name = app[2]
    if name in unique_apps_ios:
        duplicate_apps_ios.append(name)
    else:
        unique_apps_ios.append(name)
    
print('Number of duplicate iOS apps:', len(duplicate_apps_ios))
print (duplicate_apps_ios)


Number of duplicate Android apps: 1180
Number of duplicate iOS apps: 2
['VR Roller Coaster', 'Mannequin Challenge']


Now that we know how many apps are repeated, we have to keep one copy of each and discard the duplicates.

To do that, we first need a criterion for choosing which entry to keep.
In the data set, the duplicate entries of an app differ in their number of reviews.

For example:

Instagram appears with 66577313, 66577446, 66577313 and 66509917 reviews. We keep the entry with the highest number of reviews and discard the rest, since a higher count suggests that the entry was collected more recently than the others.

## Removing duplicate entries
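- We first create an empty dictionary named ```reviews_max```.
- For each entry, we check whether the app name is already in the dictionary and whether the number of reviews stored for it is lower than that of the current entry.
- If the name is missing or the stored value is lower, we store the current number of reviews. Otherwise, we move on to the next entry.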
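
As a small illustration of these steps, the sketch below applies the same logic to the four Instagram review counts quoted earlier. The two-column rows and the name ```reviews_max_demo``` are simplifications assumed for this example; the real rows have more columns.

```python
# Simplified rows: [app name, number of reviews as a string].
sample_rows = [
    ["Instagram", "66577313"],
    ["Instagram", "66577446"],
    ["Instagram", "66577313"],
    ["Instagram", "66509917"],
]

reviews_max_demo = {}
for name, n_reviews in sample_rows:
    n = float(n_reviews)
    # Store the count if the app is new or the stored count is lower.
    if name not in reviews_max_demo or reviews_max_demo[name] < n:
        reviews_max_demo[name] = n

print(reviews_max_demo)
```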

In [183]:
#for android
reviews_max = {}

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
        
    elif name not in reviews_max:
        reviews_max[name] = n_reviews
        
#for ios
reviews_max_ios = {}

for app in ios:
    name = app[2]
    n_reviews = float(app[6])
    
    if name in reviews_max_ios and reviews_max_ios[name] < n_reviews:
        reviews_max_ios[name] = n_reviews
        
    elif name not in reviews_max_ios:
        reviews_max_ios[name] = n_reviews

In the above code block, we created an empty dictionary and then populated it with the highest number of reviews for each app.

In the code block below, we use that dictionary as a guide to keep or discard each entry.
- We start by creating two lists, ```android_clean``` and ```already_added``` (and their iOS counterparts).
- We go through all the apps in the data set.
- If the number of reviews of an entry equals the value stored for that app in the dictionary, it is the entry we want to keep.
- We also check whether the name is already in the ```already_added``` list. If it is, we skip the entry; this handles duplicates that share the same maximum number of reviews.
- At the end, we are left with unique apps, each with its highest number of reviews.
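
For comparison, the two passes can be folded into a single pass that remembers the best row per app name. This is an alternative sketch and not the notebook's method: the function name ```keep_most_reviewed``` and the sample rows are assumptions, and the result is ordered by the first appearance of each name rather than by the position of the kept row.

```python
def keep_most_reviewed(rows, name_index, reviews_index):
    """Keep, for each app name, the first row with the highest review count."""
    best = {}
    for row in rows:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        # Strict '>' means that ties keep the earlier row.
        if name not in best or n_reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())

# Hypothetical rows: [app name, number of reviews].
sample_rows = [
    ["App A", "10"],
    ["App B", "5"],
    ["App A", "12"],
    ["App A", "12"],
]
print(keep_most_reviewed(sample_rows, 0, 1))
```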

In [184]:
#for android

android_clean = []
already_added = []

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    
    if (reviews_max[name] == n_reviews) and (name not in already_added):
        android_clean.append(app)
        already_added.append(name)
        
#for ios
ios_clean = []
ios_already_added = []

for app in ios:
    name = app[2]
    n_reviews = float(app[6])
    
    if (reviews_max_ios[name] == n_reviews) and (name not in ios_already_added):
        ios_clean.append(app)
        ios_already_added.append(name)

In [185]:
print('Expected length:', len(android) - 1180)
print('Expected length:', len(ios) - 2)
print('Actual length:', len(android_clean))
print('Actual length:', len(ios_clean))

Expected length: 9659
Expected length: 7195
Actual length: 9659
Actual length: 7195


In [186]:
explore_data(android_clean, 0, 3, True)
print (len(android_clean))

explore_data(ios_clean, 0, 3, True)
print (len(ios_clean))


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9659
Number of columns: 13
9659
['1', '281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99', '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5', '10', '1']


['2', '281796108', 'Evernote - stay organized', '158578688', 'USD', '0', '161065', '26', '4', '3.5', '8.2.2', '4+', 'Productivity', '37', '5', '23', '1']


['3', '281940292', 'WeatherBug - Local Weather, Radar, Maps, Alerts', '100524032', 'USD', '0', '

## Removing non-English apps

We are only interested in free apps aimed at an English-speaking audience. Both data sets contain apps whose names are not in English, so we have to remove those too.

To identify non-English apps:
- we will use the built-in ```ord``` function of Python
- ```ord``` returns the Unicode code point of a single character
- the characters commonly used in English text have code points between 0 and 127 (the ASCII range)

The function ```is_english``` takes a string as a parameter. It counts the characters whose code point is above 127 and returns False if there are more than 3 of them; otherwise it returns True, i.e. it identifies the app as English.

**Note:** With this method, some non-English apps are still identified as English, for example apps whose names contain three or fewer non-ASCII characters. Allowing up to three such characters keeps English names that include emoji or symbols such as ™.

We use this check anyway because it classifies the majority of the apps correctly.

In [187]:
def is_english(string):
    non_ascii = 0
    
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
    
    if non_ascii > 3:
        return False
    else:
        return True

In the following code block, we check each app's name and split the apps into two lists: one for English apps and another for non-English apps.

In [188]:
android_english = []
android_not = []
ios_english = []
ios_not = []

for app in android_clean:
    name = app[0]
    if is_english(name):
        android_english.append(app)
    else:
        android_not.append(app)
        
for app in ios:  # note: this loops over ios, not ios_clean, so the two duplicate iOS entries are still included
    name = app[2]
    if is_english(name):
        ios_english.append(app)
    else:
        ios_not.append(app)
        
print(android_header)
print('\n')
print(ios_header)
print('\n')
explore_data(android_english, 0, 3, True)
print('\n')
explore_data(ios_english, 0, 3, True)


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['', 'id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9614
Number of columns: 13


['1', '2

### Now we have data which,
- has errors removed
- has all unique apps
- contains only english apps*

Our next step is to isolate only free apps from the data set.

**Note: If you want to perform analysis on paid apps as well, do NOT run this block.**

In [189]:
def remove_paid(dataset, index):
    # Keep only the rows whose price column (at position index) is the string '0'.
    free_app = []
    for app in dataset:
        price = app[index]
        if price == '0':
            free_app.append(app)

    return free_app

free_and = remove_paid(android_english, 7)
print ('Number of unique, only english* and free android apps: ' , len(free_and))

free_ios = remove_paid(ios_english, 5)
print ('Number of unique, only english* and free ios apps: ' , len(free_ios))


Number of unique, only english* and free android apps:  8864
Number of unique, only english* and free ios apps:  3222


# The data cleaning process has ended here.

The final data set which we have consists of:
- Unique apps
- English apps
- Free apps

Now, we can perform data analysis.

---------------------------------

# Data Analysis starts from here.

As stated at the start, the objective of this project is to identify free apps which are profitable on both stores.

To meet this objective, we need to find out which app genres are the most common and the most popular on both stores.

After going through the data sets, we decided to look at the relation between the number of reviews and the app genre.

If a genre has a high number of reviews, its apps probably have many users. That is a good indicator for us, since our main source of income is in-app ads: more users means more income.

First, let us create frequency tables for the Genres and prime_genre columns of the Google Play and App Store data respectively.

Since we are going to use it more than once, we will create a function for generating frequency tables.

In [190]:
def freq_table(dataset, index):
    freq = {}
    for genre in dataset:
        name = genre[index]
        if name in freq:
            freq[name] += 1
        else:
            freq[name] = 1
    return freq
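
For reference, the standard library's ```collections.Counter``` builds the same kind of table. The snippet below is a sketch on made-up rows (the row contents are assumptions for illustration), not a replacement for the function above.

```python
from collections import Counter

# Hypothetical rows: [app name, category].
sample_rows = [["a", "GAME"], ["b", "TOOLS"], ["c", "GAME"]]

# Count how often each value of column 1 occurs.
freq = Counter(row[1] for row in sample_rows)
print(dict(freq))
```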


Now that we can build a frequency table for any column in the data set, we want to display it in ascending or descending order, because a Python dictionary is not sorted by its values. To sort it, we first convert each key-value pair into a tuple and then use the built-in ```sorted()``` function to sort the tuples in descending order.

## Sorting frequency table in descending order

The following code block contains a function that builds the frequency table for a column and prints it in descending order of frequency.

We first convert each key-value pair into a ```(count, key)``` tuple and then sort the tuples using ```sorted()``` with ```reverse = True```.
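
The tuple-and-sort idea can be tried on a small dictionary first. The three counts below are taken from the Category table shown later in this notebook; the variable names are assumptions for this sketch, which also shows a variant that sorts the items directly with a ```key``` function.

```python
sample_freq = {"TOOLS": 750, "FAMILY": 1676, "GAME": 862}

# Approach used in this notebook: (count, name) tuples sorted in reverse.
as_tuples = [(count, name) for name, count in sample_freq.items()]
for count, name in sorted(as_tuples, reverse=True):
    print(name, ":", count)

# Variant: sort the (name, count) items by their count.
by_count = sorted(sample_freq.items(), key=lambda item: item[1], reverse=True)
print(by_count)
```

Both approaches give the same order for these values. They can differ when two counts are equal, because the tuple version then falls back to comparing the names.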

In [191]:
def freq_sort(dataset, index):
    ftable = freq_table(dataset, index)
    stable = []
    for key in ftable:
        to_tuple = (ftable[key] , key)
        stable.append(to_tuple)
        
    sort_table = sorted(stable, reverse = True)
    
    for entry in sort_table:
        print (entry[1], ':', entry[0])

Using the above function, we can display frequency tables sorted in descending order for any column in the data sets. Let us print the frequency table for the Category column of the Google Play data; the calls for the Genres and prime_genre columns are left commented out.

In [192]:
'''print("Frequency table for Genres in Google Play Store data:")
print('\n')
freq_sort(free_and, 9) #Google play store genres column
print('\n')'''


print("Frequency table for Category in Google Play Store data:")
print('\n')
freq_sort(free_and, 1) #Google play store category column
print('\n')


'''print("Frequency table for prime_genres in Apple Store data:")
print('\n')
freq_sort(free_ios, 12) #Apple App store prime_genre column'''

Frequency table for Category in Google Play Store data:


FAMILY : 1676
GAME : 862
TOOLS : 750
BUSINESS : 407
LIFESTYLE : 346
PRODUCTIVITY : 345
FINANCE : 328
MEDICAL : 313
SPORTS : 301
PERSONALIZATION : 294
COMMUNICATION : 287
HEALTH_AND_FITNESS : 273
PHOTOGRAPHY : 261
NEWS_AND_MAGAZINES : 248
SOCIAL : 236
TRAVEL_AND_LOCAL : 207
SHOPPING : 199
BOOKS_AND_REFERENCE : 190
DATING : 165
VIDEO_PLAYERS : 159
MAPS_AND_NAVIGATION : 124
FOOD_AND_DRINK : 110
EDUCATION : 103
ENTERTAINMENT : 85
LIBRARIES_AND_DEMO : 83
AUTO_AND_VEHICLES : 82
HOUSE_AND_HOME : 73
WEATHER : 71
EVENTS : 63
PARENTING : 58
ART_AND_DESIGN : 57
COMICS : 55
BEAUTY : 53





The code block below computes the average number of user ratings per app for every genre in the Apple App Store.
- Using our ```freq_table``` function, we generate the frequency table of genres from the ```free_ios``` data set.
- We store the new frequency table in ```ios_genres```.

To calculate the average for each genre:
- we create variables named ```total``` and ```genre_len``` with 0 as initial value
- using a nested loop, we go through ```free_ios``` and, for every app of the current genre, add its number of ratings (```rating_count_tot```) to ```total```
- the nested loop is required so that we add up the values of only one genre at a time
- ```genre_len``` is incremented by 1 every time the if condition is True, so it counts the apps in the genre
- once we have these two values, we divide ```total``` by ```genre_len``` to get the average
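
The same averages can also be computed in a single pass by keeping a running total and a count per genre. This is a sketch under assumed names (```average_by_group``` and the sample rows are hypothetical); the notebook itself uses the nested loop described above.

```python
def average_by_group(rows, group_index, value_index):
    """Return {group: mean of the value column} using one pass over rows."""
    totals = {}
    counts = {}
    for row in rows:
        group = row[group_index]
        totals[group] = totals.get(group, 0.0) + float(row[value_index])
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}

# Hypothetical rows: [genre, number of ratings as a string].
sample_rows = [["Games", "100"], ["Games", "300"], ["Weather", "50"]]
print(average_by_group(sample_rows, 0, 1))
```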

In [193]:
ios_genres = freq_table(free_ios, 12)

for genre in ios_genres:
    total = 0       # sum of rating counts for this genre
    genre_len = 0   # number of apps in this genre
    for app in free_ios:
        if app[12] == genre:
            n_ratings = float(app[6])  # rating_count_tot column
            total += n_ratings
            genre_len += 1
    avg_n_ratings = total / genre_len
    print(genre, ":", avg_n_ratings)


Productivity : 21028.410714285714
Weather : 52279.892857142855
Shopping : 26919.690476190477
Reference : 74942.11111111111
Finance : 31467.944444444445
Music : 57326.530303030304
Utilities : 18684.456790123455
Travel : 28243.8
Social Networking : 71548.34905660378
Sports : 23008.898550724636
Health & Fitness : 23298.015384615384
Games : 22788.6696905016
Food & Drink : 33333.92307692308
News : 21248.023255813954
Book : 39758.5
Photo & Video : 28441.54375
Entertainment : 14029.830708661417
Business : 7491.117647058823
Lifestyle : 16485.764705882353
Education : 7003.983050847458
Navigation : 86090.33333333333
Medical : 612.0
Catalogs : 4004.0


The output above shows the average number of user ratings for every genre in the Apple App Store. From these averages, we can infer a few things:
- Navigation, Reference, Social Networking and Weather are among the genres with the highest average number of user ratings
- Medical, Business, Education and Catalogs have the lowest, so they are not of interest to us.

Let us analyze the top genres.

Among them, Navigation is heavily influenced by Google Maps and Waze.

In [194]:
for app in free_ios:
    if app[12] == 'Navigation':
        ratings = app[6]
        name = app[2]
        #print (name, ":", ratings) #Remove comment to print

print ('\n')        
        
for app in free_ios:
    if app[12] == 'Social Networking':
        ratings = app[6]
        name = app[2]
        #print (name, ":", ratings) #Remove comment to print





In the Social Networking genre, Facebook, Instagram, LinkedIn and other big apps influence the outcome.

Apps like these also require a lot of investment, so they are out of our scope.


Apps in the Reference genre look promising. Many of them are built around popular books, for example the Bible and the Quran. We could create a similar app with features that the existing apps lack, so that our app stands out from the others.

Let us check the popular app categories in the Google Play store. This data set gives us the number of installs, so we do not have to estimate popularity from the number of ratings as we did with the Apple App Store data. However, the installs are stored as strings such as ```1,000,000+```, so we first have to convert them to floats before we can analyze them.

In [195]:
def convert_float(string, char1, char2):
    new_string = string.replace(char1, '')
    new_string = new_string.replace(char2, '')
    flt = float(new_string)
    return flt
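
As a quick check, the conversion can be tried on one of the install strings that appear in the data. The function is restated here in compact form so that the snippet runs on its own.

```python
def convert_float(string, char1, char2):
    # Remove both characters, then convert what is left to a float.
    new_string = string.replace(char1, '').replace(char2, '')
    return float(new_string)

print(convert_float('1,000,000+', ',', '+'))   # prints 1000000.0
```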

In [196]:
play_category = freq_table(free_and, 1)

for category in play_category:
    total = 0       # sum of installs for this category
    genre_len = 0   # number of apps in this category
    for app in free_and:
        if app[1] == category:
            n_install = convert_float(app[5], ',', '+')  # e.g. '1,000+' -> 1000.0
            total += n_install
            genre_len += 1
    avg_play = total / genre_len
    print(category, ':', avg_play)
            
            

ART_AND_DESIGN : 1986335.0877192982
AUTO_AND_VEHICLES : 647317.8170731707
BEAUTY : 513151.88679245283
BOOKS_AND_REFERENCE : 8767811.894736841
BUSINESS : 1712290.1474201474
COMICS : 817657.2727272727
COMMUNICATION : 38456119.167247385
DATING : 854028.8303030303
EDUCATION : 1833495.145631068
ENTERTAINMENT : 11640705.88235294
EVENTS : 253542.22222222222
FINANCE : 1387692.475609756
FOOD_AND_DRINK : 1924897.7363636363
HEALTH_AND_FITNESS : 4188821.9853479853
HOUSE_AND_HOME : 1331540.5616438356
LIBRARIES_AND_DEMO : 638503.734939759
LIFESTYLE : 1437816.2687861272
GAME : 15588015.603248259
FAMILY : 3695641.8198090694
MEDICAL : 120550.61980830671
SOCIAL : 23253652.127118643
SHOPPING : 7036877.311557789
PHOTOGRAPHY : 17840110.40229885
SPORTS : 3638640.1428571427
TRAVEL_AND_LOCAL : 13984077.710144928
TOOLS : 10801391.298666667
PERSONALIZATION : 5201482.6122448975
PRODUCTIVITY : 16787331.344927534
PARENTING : 542603.6206896552
WEATHER : 5074486.197183099
VIDEO_PLAYERS : 24727872.452830188
NEWS_AND_

As can be seen from the above data,
communication, video players, social, photography and productivity are the categories with the highest average number of installs in the Google Play store. Let us analyze them.

### Communication apps

In [197]:
def print_apps (dataset, category, category_index, name_index, install_index):
    for app in dataset:
        if (app[category_index] == category):
            print (app[name_index], ':', app[install_index])
    print ('\n')

In [198]:
print_apps (free_and, 'COMMUNICATION', 1, 0, 5)

WhatsApp Messenger : 1,000,000,000+
Messenger for SMS : 10,000,000+
My Tele2 : 5,000,000+
imo beta free calls and text : 100,000,000+
Contacts : 50,000,000+
Call Free – Free Call : 5,000,000+
Web Browser & Explorer : 5,000,000+
Browser 4G : 10,000,000+
MegaFon Dashboard : 10,000,000+
ZenUI Dialer & Contacts : 10,000,000+
Cricket Visual Voicemail : 10,000,000+
TracFone My Account : 1,000,000+
Xperia Link™ : 10,000,000+
TouchPal Keyboard - Fun Emoji & Android Keyboard : 10,000,000+
Skype Lite - Free Video Call & Chat : 5,000,000+
My magenta : 1,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Seznam.cz : 1,000,000+
Antillean Gold Telegram (original version) : 100,000+
AT&T Visual Voicemail : 10,000,000+
GMX Mail : 10,000,000+
Omlet Chat : 10,000,000+
My Vodacom SA : 5,000,000+
Microsoft Edge : 5,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Calls & Text by Mo+ : 5,000,000+
free 

FO STELIA Méaulte : 100+
FO AIRBUS Nantes : 100+
Firefox Focus: The privacy browser : 1,000,000+
FP Connect : 100+
FreedomPop Messaging Phone/SIM : 500,000+
FP Live : 10+
HipChat - beta version : 50,000+




In [199]:
bn = '1,000,000,000+'
mn100 = '100,000,000+'
mn10 = '10,000,000+'
mn = '1,000,000+'
tn100 = '100,000+'
tn10 = '10,000+'
tn = '1,000+'

In [200]:
def app_count (dataset, cat_i, cat_name, ins_i, inst, name_i):
    for app in dataset:
        if app[cat_i] == cat_name and app[ins_i] == inst:
            print (app[name_i], ",", app[ins_i])
            
app_count (free_and, 1, 'COMMUNICATION', 5, bn, 0)

WhatsApp Messenger , 1,000,000,000+
Messenger – Text and Video Chat for Free , 1,000,000,000+
Skype - free IM & video calls , 1,000,000,000+
Google Chrome: Fast & Secure , 1,000,000,000+
Gmail , 1,000,000,000+
Hangouts , 1,000,000,000+


As you can see, this category is dominated by apps like WhatsApp Messenger, Facebook Messenger and Skype, each with over a billion installs. They skew the average for the category, so we will not consider it.
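
To see how strongly a single giant app can pull an average, compare the mean and the median of a small list of install counts. The five values below are hypothetical, chosen from the install brackets that appear in the listing above.

```python
from statistics import mean, median

# Hypothetical category: one billion-install app and four smaller ones.
installs = [1_000_000_000, 10_000_000, 5_000_000, 1_000_000, 100_000]

print("mean:  ", mean(installs))
print("median:", median(installs))
```

The mean lands far above what a typical app in the list achieves, while the median stays close to it. This is why a high category average on its own does not tell us much.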

Let us see the next top category now.

### Video Player Apps

In [201]:
print_apps (free_and, 'VIDEO_PLAYERS', 1, 0, 5)

YouTube : 1,000,000,000+
All Video Downloader 2018 : 1,000,000+
Video Downloader : 10,000,000+
HD Video Player : 1,000,000+
Iqiyi (for tablet) : 1,000,000+
Video Player All Format : 10,000,000+
Motorola Gallery : 100,000,000+
Free TV series : 100,000+
Video Player All Format for Android : 500,000+
VLC for Android : 100,000,000+
Code : 10,000,000+
Vote for : 50,000,000+
XX HD Video downloader-Free Video Downloader : 1,000,000+
OBJECTIVE : 1,000,000+
Music - Mp3 Player : 10,000,000+
HD Movie Video Player : 1,000,000+
YouCut - Video Editor & Video Maker, No Watermark : 5,000,000+
Video Editor,Crop Video,Movie Video,Music,Effects : 1,000,000+
YouTube Studio : 10,000,000+
video player for android : 10,000,000+
Vigo Video : 50,000,000+
Google Play Movies & TV : 1,000,000,000+
HTC Service － DLNA : 10,000,000+
VPlayer : 1,000,000+
MiniMovie - Free Video and Slideshow Editor : 50,000,000+
Samsung Video Library : 50,000,000+
OnePlus Gallery : 1,000,000+
LIKE – Magic Video Maker & Community : 50,

In [202]:
app_count (free_and, 1, 'VIDEO_PLAYERS', 5, bn, 0)
app_count (free_and, 1, 'COMMUNICATION', 5, mn100, 0)



YouTube , 1,000,000,000+
Google Play Movies & TV , 1,000,000,000+
imo beta free calls and text , 100,000,000+
Android Messages , 100,000,000+
Who , 100,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji , 100,000,000+
Firefox Browser fast & private , 100,000,000+
Messenger Lite: Free Calls & Messages , 100,000,000+
Kik , 100,000,000+
KakaoTalk: Free Calls & Text , 100,000,000+
Opera Mini - fast web browser , 100,000,000+
Opera Browser: Fast and Secure , 100,000,000+
Telegram , 100,000,000+
Truecaller: Caller ID, SMS spam blocking & Dialer , 100,000,000+
UC Browser Mini -Tiny Fast Private & Secure , 100,000,000+
WeChat , 100,000,000+
Yahoo Mail – Stay Organized , 100,000,000+
BBM - Free Calls & Messages , 100,000,000+


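This category is also skewed by YouTube and Google Play Movies & TV, which have over a billion installs each; the full listing above also shows VLC for Android and Motorola Gallery at 100 million plus. Note that the second call in the cell above lists the 100 million bracket of the COMMUNICATION category, which confirms that communication has many apps at that level as well.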

Let us check the photography category.

### Photography

In [203]:
print_apps (free_and, 'PHOTOGRAPHY', 1, 0, 5)

TouchNote: Cards & Gifts : 1,000,000+
FreePrints – Free Photos Delivered : 1,000,000+
Groovebook Photo Books & Gifts : 500,000+
Moony Lab - Print Photos, Books & Magnets ™ : 50,000+
LALALAB prints your photos, photobooks and magnets : 1,000,000+
Snapfish : 1,000,000+
Motorola Camera : 50,000,000+
HD Camera - Best Cam with filters & panorama : 5,000,000+
LightX Photo Editor & Photo Effects : 10,000,000+
Sweet Snap - live filter, Selfie photo edit : 10,000,000+
HD Camera - Quick Snap Photo & Video : 1,000,000+
B612 - Beauty & Filter Camera : 100,000,000+
Waterfall Photo Frames : 1,000,000+
Photo frame : 100,000+
Huji Cam : 5,000,000+
Unicorn Photo : 1,000,000+
HD Camera : 5,000,000+
Makeup Editor -Beauty Photo Editor & Selfie Camera : 1,000,000+
Makeup Photo Editor: Makeup Camera & Makeup Editor : 1,000,000+
Moto Photo Editor : 5,000,000+
InstaBeauty -Makeup Selfie Cam : 50,000,000+
Garden Photo Frames - Garden Photo Editor : 500,000+
Photo Frame : 10,000,000+
Selfie Camera - Photo Edito

In [204]:
app_count (free_and, 1, 'PHOTOGRAPHY', 5, bn, 0)
app_count (free_and, 1, 'PHOTOGRAPHY', 5, mn100, 0)

Google Photos , 1,000,000,000+
B612 - Beauty & Filter Camera , 100,000,000+
YouCam Makeup - Magic Selfie Makeovers , 100,000,000+
Sweet Selfie - selfie camera, beauty cam, photo edit , 100,000,000+
Retrica , 100,000,000+
Photo Editor Pro , 100,000,000+
BeautyPlus - Easy Photo Editor & Selfie Camera , 100,000,000+
PicsArt Photo Studio: Collage Maker & Pic Editor , 100,000,000+
Photo Collage Editor , 100,000,000+
Z Camera - Photo Editor, Beauty Selfie, Collage , 100,000,000+
PhotoGrid: Video & Pic Collage Maker, Photo Editor , 100,000,000+
Candy Camera - selfie, beauty camera, photo editor , 100,000,000+
YouCam Perfect - Selfie Photo Editor , 100,000,000+
Camera360: Selfie Photo Editor with Funny Sticker , 100,000,000+
S Photo Editor - Collage Maker , Photo Collage , 100,000,000+
AR effect , 100,000,000+
Cymera Camera- Photo Editor, Filter,Collage,Layout , 100,000,000+
LINE Camera - Photo editor , 100,000,000+
Photo Editor Collage Maker Pro , 100,000,000+


### Books category

As you may remember, the Reference genre looked promising in the analysis of the App Store. Let us check the corresponding category, ```BOOKS_AND_REFERENCE```, in the Play Store data.

In [205]:
print_apps (free_and, 'BOOKS_AND_REFERENCE', 1, 0, 5)

E-Book Read - Read Book for free : 50,000+
Download free book with green book : 100,000+
Wikipedia : 10,000,000+
Cool Reader : 10,000,000+
Free Panda Radio Music : 100,000+
Book store : 1,000,000+
FBReader: Favorite Book Reader : 10,000,000+
English Grammar Complete Handbook : 500,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
Google Play Books : 1,000,000,000+
AlReader -any text book reader : 5,000,000+
Offline English Dictionary : 100,000+
Offline: English to Tagalog Dictionary : 500,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
Recipes of Prophetic Medicine for free : 500,000+
ReadEra – free ebook reader : 1,000,000+
Anonymous caller detection : 10,000+
Ebook Reader : 5,000,000+
Litnet - E-books : 100,000+
Read books online : 5,000,000+
English to Urdu Dictionary : 500,000+
eBoox: book reader fb2 epub zip : 1,000,000+
English Persian Dictionary : 500,000+
Flybook : 500,000+
All Maths Formulas : 1,000,000+
Ancestry : 5,000,000+
HTC Help : 10,000,000+
E

In [206]:
app_count (free_and, 1, 'BOOKS_AND_REFERENCE', 5, bn, 0)
app_count (free_and, 1, 'BOOKS_AND_REFERENCE', 5, mn100, 0)
app_count (free_and, 1, 'BOOKS_AND_REFERENCE', 5, mn10, 0)
app_count (free_and, 1, 'BOOKS_AND_REFERENCE', 5, tn, 0)

Google Play Books , 1,000,000,000+
Bible , 100,000,000+
Amazon Kindle , 100,000,000+
Wattpad 📖 Free Books , 100,000,000+
Audiobooks from Audible , 100,000,000+
Wikipedia , 10,000,000+
Cool Reader , 10,000,000+
FBReader: Favorite Book Reader , 10,000,000+
HTC Help , 10,000,000+
Moon+ Reader , 10,000,000+
Aldiko Book Reader , 10,000,000+
Al-Quran (Free) , 10,000,000+
Al Quran Indonesia , 10,000,000+
Al'Quran Bahasa Indonesia , 10,000,000+
Quran for Android , 10,000,000+
Dictionary.com: Find Definitions for English Words , 10,000,000+
English Dictionary - Offline , 10,000,000+
NOOK: Read eBooks & Magazines , 10,000,000+
Dictionary , 10,000,000+
Spanish English Translator , 10,000,000+
Dictionary - Merriam-Webster , 10,000,000+
JW Library , 10,000,000+
Oxford Dictionary of English : Free , 10,000,000+
English Hindi Dictionary , 10,000,000+
C Offline Tutorial , 1,000+
R Programing Offline Tutorial , 1,000+
R Quick Reference Big Data , 1,000+
AE Bulletins , 1,000+
Ag PhD Planting Population 

As you can see, Google Play Books is the only app in this category with more than a billion installs, and only a few apps fall in the 100 million plus bracket.

# Conclusion

We analyzed more than 15,000 apps from both app stores in order to identify a free app profile whose main source of income would be in-app ads. Looking at the data from both stores, an app in the books and reference category would be a good choice for us. Unlike the communication and video player categories, this category is not dominated entirely by big companies. In fact, the 100 million and 10 million install brackets include several apps based on holy books, such as the Bible and the Quran. Since the Reference genre also looked promising in the Apple App Store data, we should go ahead with an app in this category, but with better features than the existing apps.