# Data Science Project: Research on Profitable Types of Apps in App Store & Google Play Markets

- My aim: find which app profiles are profitable on the Apple App Store and Google's Play Store.
- My job: data analyst for an Android and IOS app maker.
- My mission: help establish a data-driven strategy.
- What my clients do: create free-to-download apps that generate revenue through in-app ads.
- My goal: find which app characteristics bring in the most users.




### Data Source

- Data I am using:

 a. Web-scraped Google Play data featuring about 10 000 apps, gathered in 2019
 
 b. Apple Store API-collected data, with about 7 000 apps, gathered in 2018
 

 
 
- Link to data:

    a. [Android](https://www.kaggle.com/lava18/google-play-store-apps)
    
    b. [Apple](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps)
    



- Data relevance: roughly 4 million apps exist, so it is more practical to study a representative sample than the full population.




 --- 
 ---
 ---

## Data Gathering
Opening the two data sets:

In [1]:
from csv import reader

#Google Play data import
opened_file = open('googleplaystore.csv', encoding ='utf8')
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]
android = android[1:]

#App Store data import
opened_file = open('AppleStore.csv', encoding ='utf8')
read_file = reader(opened_file)
ios = list(read_file)
ios_header = ios[0]
ios = ios[1:]
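
The import above leaves both file handles open. A minimal sketch of the same import wrapped in a helper that closes the file automatically is given below. `load_csv` is a hypothetical name that is not part of the original notebook, and the commented usage assumes both CSV files sit next to the notebook.

```python
from csv import reader

def load_csv(path, encoding='utf8'):
    """Return (header, rows) for a CSV file, closing the file when done."""
    # newline='' is the setting recommended by the csv module documentation
    with open(path, encoding=encoding, newline='') as opened_file:
        rows = list(reader(opened_file))
    return rows[0], rows[1:]

# Usage (assumption: the files are in the working directory):
# android_header, android = load_csv('googleplaystore.csv')
# ios_header, ios = load_csv('AppleStore.csv')
```

The helper returns the header and the data rows separately, which mirrors the `android_header` / `android` split used in this notebook.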


Creating a function to quickly look at a slice of rows in a readable format, with the option of also printing the total number of rows and columns if we wish:

In [2]:
def explore_data(dataset, start, end, rows_and_columns=False):
    
    dataset_slice = dataset[start:end] 
    
    for row in dataset_slice:
        print(row)
        print('\n') 

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

A quick example to showcase the function, and the successful data import:

In [3]:
print('Example of Android data rows')
print('\n')
explore_data(android,0,3, True)  #number of rows does not take into account the header!

print('\n')

print('Example of Apple data rows')
print('\n')
explore_data(ios,0,3,True)



Example of Android data rows


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


Example of Apple data rows


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 

What the data headers look like:

In [4]:
print(android_header) #the headers were stored separately at import, so android[0] and ios[0] are now data rows
print('\n')
print(ios_header)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


The IOS header may look harder to read, and its columns differ from the Android ones in both content and order.
***In particular***, the app name on Apple's data is in column 2 (index 1) instead of column 1, which will slightly impact our code down the road.

---
---
---

## Data Cleaning

### Duplicate Processing


On the discussion boards of the data source, a few errors have been raised. For example [row 10472 on the Android data set has a misplaced column](https://www.kaggle.com/lava18/google-play-store-apps/discussion/66015), rendering its analysis flawed.

In [5]:
explore_data(android, 10471, 10473)

['Xposed Wi-Fi-Pwd', 'PERSONALIZATION', '3.5', '1042', '404k', '100,000+', 'Free', '0', 'Everyone', 'Personalization', 'August 5, 2014', '3.0.0', '4.0.3 and up']


['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']




Indeed, we can see that the second entry is missing its `Category` tag: its second element is the rating `1.9`, so every following value is shifted one column to the left. We can also notice that its 9th element is simply a blank string `''`, where the `Genres` tag would normally be.

Hence we simply delete that data entry:

In [6]:
del android[10472]

The two discussion boards of these data harvests do not indicate any other column-shift errors. The data set is small enough and popular enough for us to trust this conclusion.
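
Rather than relying on the discussion boards alone, a quick sanity check is possible. The sketch below flags every row whose length differs from the header's. `find_malformed_rows` is a hypothetical helper, and the rows shown are toy data, not taken from the datasets.

```python
def find_malformed_rows(dataset, header):
    """Return (index, row) pairs whose length differs from the header's."""
    expected = len(header)
    return [(index, row) for index, row in enumerate(dataset)
            if len(row) != expected]

# Toy data: the second row is missing its Category value.
toy_header = ['App', 'Category', 'Rating']
toy_rows = [['Good App', 'TOOLS', '4.1'],
            ['Shifted App', '1.9']]
print(find_malformed_rows(toy_rows, toy_header))   # [(1, ['Shifted App', '1.9'])]
```

This only catches rows with a missing or extra value. Values swapped between two columns would go unnoticed, so it complements the discussion boards rather than replacing them.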

What about duplicate data? With over 17 000 apps listed, duplicates surely exist, so we can check for repeating app names. To cite [another discussion board](https://www.kaggle.com/lava18/google-play-store-apps/discussion/82616), a few apps, Instagram among them, have several entries:

In [7]:
for i in android:   #`i` refers to a row
    
    if i[0]=='Instagram':
        print([i]) #print row whose index 0 (first column, the one featuring app names) is exactly 'Instagram'
        print('\n') #empty line between generated outputs
        
    
    

[['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]


[['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]


[['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]


[['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]




There are indeed 4 entries for Instagram on Android, and that is merely one example. However, are these entries **identical?**
A sharp eye will notice that the 4th column of those lists, the number of reviews, differs slightly between entries. More on that specific point soon.

Now we need to formally identify all the other duplicates for both datasets.

In [8]:
android_dupes=[] #Generating empty lists that will be basis of sorting
android_uniques=[]

for i in android: #go through all the apps (in this case the rows of the android data set) once again this specific
                  #... list does not include the header
    
    name = i[0]
    
    if name in android_uniques:   #put specific rows in duplicate basket if their exact name already exists
        android_dupes.append(name)
        
    else:         # triggers if it's the first time the name shows up, hence a non-duplicate entry, add it to unique
                  #... pile
        android_uniques.append(name)

print('There are '+str(len(android_dupes))+' duplicates in the Android dataset')

There are 1181 duplicates in the Android dataset


Now doing the same for the IOS dataset, where index 0 holds the app `id` rather than its name:

In [9]:
ios_dupes=[] #Generating empty lists that will be basis of sorting
ios_uniques=[]

for i in ios: #go through all the apps (in this case the rows of the IOS data set)
    
    app_id = i[0]   #index 0 is the `id` column on the IOS data, a unique identifier for each app
    
    if app_id in ios_uniques:   #put specific rows in duplicate basket if their exact id already exists
        ios_dupes.append(app_id)
   
    else:         # triggers if it's the first time the id shows up, hence a non-duplicate entry
        ios_uniques.append(app_id)

print('There are '+str(len(ios_dupes))+' duplicates in the Apple dataset')

There are 0 duplicates in the Apple dataset


The duplicates have been identified, and are stored in a list, which is needed to filter them out later.
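
The membership test on a list (`name in android_uniques`) rescans the list for every row. As a hedged alternative, the sketch below counts names in a single pass with `collections.Counter`. `duplicate_counts` is a hypothetical helper and the sample rows are toy data.

```python
from collections import Counter

def duplicate_counts(dataset, name_index):
    """Map each repeated name to the number of times it appears."""
    counts = Counter(row[name_index] for row in dataset)
    return {name: n for name, n in counts.items() if n > 1}

# Toy rows: [name, reviews]
sample = [['Toy app A', '10'], ['Toy app A', '12'], ['Toy app B', '5']]
repeated = duplicate_counts(sample, 0)
print(repeated)                                # {'Toy app A': 2}
print(sum(n - 1 for n in repeated.values()))   # 1 surplus row
```

The second figure, the number of surplus rows, is the quantity the loops above report as the number of duplicates.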


As we can see from Instagram's example above, what differs is the number of reviews. While the data was being scraped, popular apps such as Instagram had their number of reviews change, which is why each scrape was recorded as a separate entry. For the sake of the analysis, it is more relevant to choose the most recent data point. This is why it is logical to keep the most recent entry, which we assume to be the one with the most reviews. This is what the program below does.

*Note: the entries with fewer reviews are always discarded, as it can reasonably be assumed that they were recorded before the one with the most reviews. An app's number of reviews may drop during the scraping, for example when reviews are moderated as fake, but we'll consider this a rare scenario.*

In [10]:
good_dupe = {}   #dictionary mapping each app name to the highest review count found among its entries

for i in android:   #cycling through the Android dataset (minus its header)
    
    app_name = i[0]      #identifying the app name and review count data points to consider
    rev_count = float(i[3])
    
    if app_name in good_dupe and good_dupe[app_name] < rev_count:   #if the name was already seen with a lower review count,
                                                                    #...overwrite the lower review count with the higher one.
        good_dupe[app_name] = rev_count
    
    elif app_name not in good_dupe:   #first time the name shows up: record its review count
        good_dupe[app_name] = rev_count
        
print('Adjusting for duplicates, there are ' + str(len(good_dupe)) + ' apps in the Android dataset')

Adjusting for duplicates, there are 9659 apps in the Android dataset


We previously found 1181 duplicates in the Android dataset. The dataset originally held 10841 rows (header excluded), and one malformed row was deleted, which leaves 10840 apps.

10840 - 1181 = 9659 apps

Thus we successfully identified all of the Android duplicates using the aforementioned discrimination logic.

We saw earlier that Apple's dataset did not have any duplicates.

From now on, we need to use the dictionary generated above (`good_dupe`) to build our Android 'masterlist':

In [11]:
android_clean=[]        # the new masterlist
whats_in_it=[]          # tracker of what apps are already in masterlist

for i in android:      #cycling through all the apps
    
    app_name = i[0]      #identifying the app name and review count data points to consider
    rev_count = float(i[3])  
    
    if (good_dupe[app_name] == rev_count) and app_name not in whats_in_it:  #if we face the duplicate with higher review count AND 
                                                                            #...it is the first of the dupes of that app encountered
        android_clean.append(i)                #add it to the master list
        whats_in_it.append(app_name)           #add it to the masterlist tracker

    #rows that do not carry the highest review count of their app are simply skipped, and that's by design!
    #the `whats_in_it` check keeps a single row when several duplicates share that highest review count.
 
print('The new masterlist features ' + str(len(android_clean)) + ' apps.' +'\n' + 'If it features, after program execution, '
      +'9659 apps, then the cleaning is successful.')

The new masterlist features 9659 apps.
If it features, after program execution, 9659 apps, then the cleaning is successful.


A quick check of the state of the masterlist, `android_clean`, header included, is due.
We can re-use the previously made `explore_data()` function:

In [12]:
print(android_header)
print('\n')
explore_data(android_clean,0,4,True)    #number of rows once again does not take into account the header!

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


Number of rows: 9659
Number of columns: 13


We also have the same number of columns as the original dataset!

Once again no adjustments of this kind needed for the Apple dataset.
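
The two passes used above (first the highest review count per name, then the matching row) can also be folded into one. The sketch below is a hedged alternative, not the method used in this notebook: `keep_most_reviewed` is a hypothetical helper that stores the best row seen so far for each name.

```python
def keep_most_reviewed(dataset, name_index=0, reviews_index=3):
    """Keep, for each app name, the row with the highest review count."""
    best = {}   # app name -> best row seen so far
    for row in dataset:
        name = row[name_index]
        reviews = float(row[reviews_index])
        # Strict comparison: on a tie, the first row encountered is kept.
        if name not in best or reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())

# Toy rows: [name, category, rating, reviews]
toy = [['Toy app A', 'x', '4', '10'],
       ['Toy app A', 'x', '4', '12'],
       ['Toy app B', 'y', '3', '5'],
       ['Toy app A', 'x', '4', '12']]
print(len(keep_most_reviewed(toy)))   # 2
```

One difference worth noting: the rows come back in the order in which each name first appears, which may differ from the order of `android_clean`. The number of rows is the same.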

### Non-English Apps Processing

We need to filter out apps that are not aimed at English speakers: our client does not build such apps **and** does not target that audience, so they are irrelevant to our analysis.

Upon brief inspection of the datasets, here are examples of apps we wish to filter out:

In [13]:
print('''On Apple's dataset:''')
print('\n')
print(ios[813][1])
print(ios[2372][1])
print(ios[2563][1])

print('\n')

print('''On Android's dataset:''')
print('\n')
print(android_clean[4412][0])
print(android_clean[7940][0])

On Apple's dataset:


爱奇艺PPS -《欢乐颂2》电视剧热播
酷我音乐HD-无损在线播放
高德地图（精准专业的手机地图）


On Android's dataset:


中国語 AQリスニング
لعبة تقدر تربح DZ


Note: this sample is only for illustrative purposes. Most of the non-English apps on those datasets are Chinese! 

We now need a function to recognize if a given string (text) input is English or not. 

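In Python, every character of a string corresponds to a Unicode code point, a number that the built-in `ord()` function returns. The characters commonly used in English text all belong to the ASCII range, which covers the numbers 0 to 127. If that number is above 127, then the character is outside that range. For example the letter `é` has a code point of 233.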


In [14]:
def true_means_english_old(character) :   #input a string

    for i in str(character):    #check each character (letter or number or sign) in a string (text)
        if ord(i) > 127:        #ord returns the code point of the character; ASCII covers 0 to 127
            return False        #a single non-ASCII character sets the text to the `False` value
    return True                 #reached only if every character is within the ASCII range

print(true_means_english_old('Instagram'))
print(true_means_english_old('爱奇艺PPS -《欢乐颂2》电视剧热播'))

print('\n')

list_ascii=[]
                                #quick illustration of what characters we classify as English values
for i in range(32,127):         #ASCII below 32 are not relevant in this case, the first value, aka 32 is a space
    list_ascii.append(chr(i))   #add it to the list 
    
print('The relevant characters are: ')
print('\n')
print(list_ascii)



True
False


The relevant characters are: 


[' ', '!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '<', '=', '>', '?', '@', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', '\\', ']', '^', '_', '`', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '{', '|', '}', '~']


We can see that the relevant characters above do not include emojis or symbols such as `😜` or `™`, so an English app name containing one of them would be rejected. This is why a slight modification needs to be made to the code, so that names with a few such *exotic* characters are still accepted as valid.

A workaround is to keep the logic of the function above, but to require at least 4 characters outside the ASCII range before an app is considered non-English.

In [15]:
def true_means_english_new(character):   
    
    strike=0

    for i in character:
        
        if ord(i) > 127:
            strike += 1
            
    if strike > 3:         #Sets a 4 strike system before considering an app not English!
        return False
    else: 
        return True
    
            
print(true_means_english_new('NBA JAM by EA SPORTS™'))
print(true_means_english_new('Vkids Español: Spanish for kids'))

True
True


We might lose a few relevant apps, or keep a few irrelevant ones, because of this arbitrary threshold of 4, but it's a fair compromise for our analysis.
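
To get a feel for how sensitive the filter is to that threshold, here is a minimal sketch with the limit exposed as a parameter. `count_non_ascii`, `is_probably_english` and `max_non_ascii` are hypothetical names introduced for illustration only.

```python
def count_non_ascii(text):
    """Number of characters whose code point lies above the ASCII range."""
    return sum(1 for ch in text if ord(ch) > 127)

def is_probably_english(text, max_non_ascii=3):
    """True when the text holds at most `max_non_ascii` non-ASCII characters."""
    return count_non_ascii(text) <= max_non_ascii

names = ['NBA JAM by EA SPORTS™',
         'Vkids Español: Spanish for kids',
         '爱奇艺PPS -《欢乐颂2》电视剧热播']

for name in names:
    # columns: non-ASCII count, verdict with the default limit, verdict with a strict limit of 0
    print(count_non_ascii(name),
          is_probably_english(name),
          is_probably_english(name, max_non_ascii=0))
```

With the default limit, the first two names pass, as they do with `true_means_english_new`. With a limit of 0 they would both be rejected because of a single `™` or `ñ`, which is the behaviour the workaround was meant to avoid.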

Now that the function is generated, we need to use it on our dataset to generate a list of non-English named apps.

In [16]:
english_android=[]
not_english_android=[]

for i in android_clean:
    
    name = i[0]
    
    if true_means_english_new(name):
        english_android.append(i)
        
    else:
        not_english_android.append(i)
 
def Extract(lst):
    return [item[0] for item in lst]

print('List of non-English apps found: ')
print('\n')
print(Extract(not_english_android))

print('\n')
print('A snippet of the corrected Android database: ')
print('\n')
explore_data(english_android,0,3,True)
        
        

List of non-English apps found: 


['Flame - درب عقلك يوميا', 'သိင်္ Astrology - Min Thein Kha BayDin', 'РИА Новости', 'صور حرف H', 'L.POINT - 엘포인트 [ 포인트, 멤버십, 적립, 사용, 모바일 카드, 쿠폰, 롯데]', 'RMEduS - 음성인식을 활용한 R 프로그래밍 실습 시스템', 'AJ렌터카 법인 카셰어링', 'Al Quran Free - القرآن (Islam)', '中国語 AQリスニング', '日本AV历史', 'Ay Yıldız Duvar Kağıtları', 'বাংলা টিভি প্রো BD Bangla TV', 'Cъновник BG', 'CSCS BG (в български)', '뽕티비 - 개인방송, 인터넷방송, BJ방송', 'BL 女性向け恋愛ゲーム◆俺プリクロス', 'SecondSecret ‐「恋を読む」BLノベルゲーム‐', 'BL 女性向け恋愛ゲーム◆ごくメン', 'あなカレ【BL】無料ゲーム', '감성학원 BL 첫사랑', 'BQ-መጽሐፍ ቅዱሳዊ ጥያቄዎች', 'BS Calendar / Patro / पात्रो', 'Vip视频免费看-BT磁力搜索', 'Билеты ПДД CD 2019 PRO', 'Offline Jízdní řády CG Transit', 'Bonjour 2017 Abidjan CI ❤❤❤❤❤', 'CK 初一 十五', 'الفاتحون Conquerors', 'DG ग्राम / Digital Gram Panchayat', 'DM הפקות', 'DW فارسی By dw-arab.com', 'لعبة تقدر تربح DZ', 'বাংলাflix', 'RPG ブレイジング ソウルズ アクセレイト', '英漢字典 EC Dictionary', 'ECナビ×シュフー', 'أحداث وحقائق | خبر عاجل في اخبار العالم', 'EG SIM CARD (EGSIMCARD, 이지심카드)', 'パーリーゲイツ公式通販｜EJ

The `Extract` function returns only the names of the apps (the first element of each list in a list of lists) rather than the whole rows. It was written by [Geeksforgeeks.com](https://www.geeksforgeeks.org/python-get-first-element-of-each-sublist/).

We now do the same for our IOS dataset:

In [17]:
english_ios=[]
not_english_ios=[]

for i in ios:
    
    name = i[1]    # ***APP NAME IS ON COLUMN 2 ON IOS DATA***
    
    if true_means_english_new(name):
        english_ios.append(i)
        
    else:
        not_english_ios.append(i)
 
def Extract_ios(lst):
    return [item[1] for item in lst]   #modified extract function because once again the app name is the second
                                       #... element of the list, so at index 1.

print('List of non-English apps found, SCROLL DOWN FOR OTHER DATA: ')
print('\n')
print(Extract_ios(not_english_ios))

print('\n')
print('A snippet of the corrected ios database: ')
print('\n')
explore_data(english_ios,0,3,True)
        
        


List of non-English apps found, SCROLL DOWN FOR OTHER DATA: 


['爱奇艺PPS -《欢乐颂2》电视剧热播', '聚力视频HD-人民的名义,跨界歌王全网热播', '优酷视频', '网易新闻 - 精选好内容，算出你的兴趣', '淘宝 - 随时随地，想淘就淘', '搜狐视频HD-欢乐颂2 全网首播', '阴阳师-全区互通现世集结', '百度贴吧-全球最大兴趣交友社区', '百度网盘', '爱奇艺HD -《欢乐颂2》电视剧热播', '乐视视频HD-白鹿原,欢乐颂,奔跑吧全网热播', '万年历-值得信赖的日历黄历查询工具', '新浪新闻-阅读最新时事热门头条资讯视频', '喜马拉雅FM（听书社区）电台有声小说相声英语', '央视影音-海量央视内容高清直播', '腾讯视频HD-楚乔传,明日之子6月全网首播', '手机百度 - 百度一下你就得到', '百度视频HD-高清电视剧、电影在线观看神器', 'MOMO陌陌-开启视频社交,用直播分享生活', 'QQ 浏览器-搜新闻、选小说漫画、看视频', '同花顺-炒股、股票', '聚力视频-蓝光电视剧电影在线热播', '快看漫画', '乐视视频-白鹿原,欢乐颂,奔跑吧全网热播', '酷我音乐HD-无损在线播放', '随手记（专业版）-好用的记账理财工具', 'Dictionary ( قاموس عربي / انجليزي + ودجيت الترجمة)', '滴滴出行', '高德地图（精准专业的手机地图）', '百度HD-极速安全浏览器', '美丽说-潮流穿搭快人一步', '百度地图-智能的手机导航，公交地铁出行必备', 'Majiang Mahjong（单机+川麻+二人+武汉+国标）', '土豆视频HD—高清影视综艺视频播放器', '360手机卫士-超安全的来电防骚扰助手', 'QQ浏览器HD-极速搜索浏览器', '搜狗输入法-Sogou Keyboard', '百度网盘 HD', '大众点评-发现品质生活', '讯飞输入法-智能语音输入和表情斗图神器', '美柚 - 女生助手', '爱奇艺 - 电视剧电影综艺娱乐视频播放器', '搜狐视频-欢乐颂2 全网首播', '百度地图HD', 'QQ同步助手-新机一键换机必备工具', 'QQ音乐-来这里“发现・音乐”', '腾

We are now left with 9614 Android apps and 6183 Apple apps. The previous numbers were 9659 and 7197.

The Apple dataset clearly contained more non-English apps to filter out.

### Paid Apps Processing

A gentle reminder that we need to study trends of free apps only! Our client only builds free apps funded by in-app ads, so paid apps are of little use to this analysis. Thus we need to identify which apps are paid, and filter them out.

In [18]:
list_android_free=[]
def free_android(dataset):
    for i in dataset:
        app_type = i[6]          #the `Type` column; paid apps in the `Price` column use the $X.XX format, which
        if app_type == 'Free':   #is harder to compare than the `Free` indicator on the `Type` column
            list_android_free.append(i)
            
free_android(english_android)

print('List of free apps, our new Android masterlist: ')
print('\n')

explore_data(list_android_free,0,3, True)

List of free apps, our new Android masterlist: 


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 8863
Number of columns: 13


Before commenting on it, we need to do the same with our `ios` dataset.

In [19]:
list_ios_free=[]
def free_ios(dataset):
    for i in dataset:
        app_price = (i[4])       
        if app_price == '0.0' :      #free apps have the string '0.0' in the `price` column
            list_ios_free.append(i)
            
free_ios(english_ios)

print('List of free apps, our new IOS masterlist: ')
print('\n')

explore_data(list_ios_free,0,3, True)



List of free apps, our new IOS masterlist: 


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 3222
Number of columns: 16


We now have 8863 free Android apps and 3222 free Apple apps, from our previous dataset of 9614 Android apps and 6183 Apple apps.
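
The two filters above differ only in the column they read and the value they compare against. As a hedged sketch, a single price-based filter could serve both datasets. `is_free` and `keep_free` are hypothetical helpers, and the sketch assumes every price turns into a number once the `$` sign is removed.

```python
def is_free(price_text):
    """True when a price string such as '0', '0.0' or '$0.00' means free."""
    # Assumption: the text is numeric once the dollar sign is stripped.
    return float(price_text.replace('$', '')) == 0.0

def keep_free(dataset, price_index):
    """Return a new list holding only the rows whose price is zero."""
    return [row for row in dataset if is_free(row[price_index])]

# Toy rows, not taken from the datasets: [name, price]
toy_rows = [['Toy free app', '0'],
            ['Toy paid app', '$4.99'],
            ['Toy free app 2', '0.0']]
print(keep_free(toy_rows, 1))
```

On the real data this would be called as `keep_free(english_android, 7)` and `keep_free(english_ios, 4)`. The counts are not guaranteed to match those above if the `Type` and `Price` columns ever disagree for an Android app. Returning a new list, rather than appending to a global one, also makes the function safe to call twice.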



As we are done with data cleaning, to avoid confusion, let's rename our new masterlists:

In [20]:
android_master = list_android_free

ios_master = list_ios_free

## Data Analysis



The company we are doing the analysis for has a set app development strategy.

They wish to build early-stage apps and, if these show promising download numbers, develop them further.
They wish to do this on the Android Google Play Store first. If an app is profitable after 6 months, they will release an IOS version of it.

The goal, however, is to use the two datasets to find types of apps that are profitable on BOTH platforms.

Let's have a look at the headers of our two datasets, and get an idea of what we wish to analyse to achieve these goals.


In [21]:
print(android_header)

print('\n')

print(ios_header)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


The `Genres` Android column and the `prime_genre` IOS column seem like good starting points. They even indicate the same thing: what genre is the app we're looking at? 
It is also relevant to study the `Category` column for the Android data, as it fits the common theme.

We need two functions:

 1. One that calculates the overall percentages of all the possible values (adds up to 100)
 2. Another that sorts the percentages from highest to lowest
 
Inputs that make these functions as reusable as possible:

 1. The dataset we want to analyse
 2. The index of the column we want to focus on

In [22]:
    
def frequency_table(dataset,index):   #function for generating a table of percentages of given values
    table = {}
    total = 0
    
    for i in dataset:         # keep count of how many times a column element shows up, starting from a blank count
        total += 1            # keeping track of how many elements we track in total
        variable = i[index]
        
        if variable in table:
            table[variable] += 1
        else:
            table[variable] = 1
    
    table_percents = {}
    
    for i in table:           #converting those from frequencies to percentages
        percent = (100*table[i])/total
        table_percents[i] = percent
        
    return table_percents

def disp_freq_table(dataset, index):         #sort and display the percentages
    table = frequency_table(dataset,index)
    table_disp = []
    for i in table:                          #dictionaries are awkward to sort by value, so we convert the table to
        values_to_tuple = (table[i],i)       #... (percentage, value) tuples, which sort on the percentage first
        table_disp.append(values_to_tuple)
        
    table_in_order = sorted(table_disp, reverse = True)  #sort highest to lowest percentage
    for val in table_in_order:
        print(val[1], ':', val[0])


print('''The IOS app store dataset, features this category distribution (prime genre data): ''')
print('\n')
disp_freq_table(ios_master, -5)
        

The IOS app store dataset, features this category distribution (prime genre data): 


Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.6623215394165114
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.017380509000621
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665
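
For comparison, the same kind of frequency table can be sketched with `collections.Counter`, which counts and sorts in a couple of lines. `frequency_percentages` is a hypothetical name, and the rows below are toy data.

```python
from collections import Counter

def frequency_percentages(dataset, index):
    """Return (value, percentage) pairs, most frequent value first."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    # most_common() orders the values from highest to lowest count
    return [(value, 100 * count / total) for value, count in counts.most_common()]

# Toy rows: [name, genre]
toy_rows = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
for value, percent in frequency_percentages(toy_rows, 1):
    print(value, ':', percent)
```

Called as `frequency_percentages(ios_master, 11)`, it should give the same percentages as `disp_freq_table(ios_master, -5)`, although genres with equal counts may be listed in a different order.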


### Prime Genre Analysis (IOS)

More than half (58%) of the free English IOS apps in our sample are `Games`. That genre holds far more apps than its runner-up, the `Entertainment` genre (7.88%).

A trend that can be noticed is that application genres already covered by the default apps tend to hold fewer apps.

For example, no games are installed by default on your Apple device when you buy it. But there is a default camera app, which makes most `Photo & Video` apps close to redundant. This is surprising in a way, since Apple handheld devices, mainly iPhones, are probably among the most popular photo-taking devices in the world.

Another noticeable trend is that there exist two types of apps: those that we only need to install once, and those that we use for a while, then delete to install another similar app.

For example Facebook, part of the `Social Networking` genre, only needs to be installed once and has few reasons to be uninstalled (and re-downloaded). On the other hand, a racing game might be uninstalled when it is finished, and the user will replace it with another racing game. An app that shows only one movie will be cycled (installed and uninstalled) rather quickly compared to an app that shows many movies.

This is most probably what causes the overwhelming share of `Games` apps. **Instead of 1 gaming app that gives access to 20 games, we get 20 single gaming apps. This sort of meta is hard to replicate for other types of apps.**

Either way, leisure-based apps (`Games`, `Entertainment`, `Social Networking`, `Sports`, `Music`, `Lifestyle`) represent roughly 75% of all the IOS apps in our sample! If the company's strategy is to start with apps of low initial quality, it can focus on these genres, which are biased towards single-use apps.
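
The 75% figure can be checked by adding up the six shares printed in the frequency table above. The dictionary below simply copies those values, and `ios_shares` is a name introduced here for illustration.

```python
# Shares copied from the IOS prime genre output above (in percent)
ios_shares = {
    'Games': 58.16263190564867,
    'Entertainment': 7.883302296710118,
    'Social Networking': 3.2898820608317814,
    'Sports': 2.1415270018621975,
    'Music': 2.0484171322160147,
    'Lifestyle': 1.5828677839851024,
}

leisure_total = sum(ios_shares.values())
print(round(leisure_total, 1))   # 75.1
```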



### Category and Genres Analysis (Android)

Let's look at the `Category` and the `Genres` data from the Android data:

In [23]:
print('Category tab for Google Play Store apps')
print('\n')
disp_freq_table(android_master, 1)
print('\n')

Category tab for Google Play Store apps


FAMILY : 18.8987927338373
GAME : 9.725826469592688
TOOLS : 8.462146000225657
BUSINESS : 4.592124562789123
LIFESTYLE : 3.9038700214374367
PRODUCTIVITY : 3.8925871601038025
FINANCE : 3.700778517432021
MEDICAL : 3.5315355974275078
SPORTS : 3.3961412614238973
PERSONALIZATION : 3.317161232088458
COMMUNICATION : 3.2381812027530184
HEALTH_AND_FITNESS : 3.0802211440821394
PHOTOGRAPHY : 2.944826808078529
NEWS_AND_MAGAZINES : 2.798149610741284
SOCIAL : 2.6627552747376733
TRAVEL_AND_LOCAL : 2.335552296062281
SHOPPING : 2.2452894053932075
BOOKS_AND_REFERENCE : 2.1437436533904997
DATING : 1.8616721200496447
VIDEO_PLAYERS : 1.7939749520478394
MAPS_AND_NAVIGATION : 1.399074805370642
FOOD_AND_DRINK : 1.241114746699763
EDUCATION : 1.1621347173643235
ENTERTAINMENT : 0.9590432133589079
LIBRARIES_AND_DEMO : 0.9364774906916394
AUTO_AND_VEHICLES : 0.9251946293580052
HOUSE_AND_HOME : 0.8236488773552973
WEATHER : 0.8010831546880289
EVENTS : 0.7108202640189553
PARENTIN

In [24]:
print('Genre tab for Google Play Store apps')
print('\n')
disp_freq_table(android_master, 9)

Genre tab for Google Play Store apps


Tools : 8.450863138892023
Entertainment : 6.070179397495205
Education : 5.348076272142615
Business : 4.592124562789123
Productivity : 3.8925871601038025
Lifestyle : 3.8925871601038025
Finance : 3.700778517432021
Medical : 3.5315355974275078
Sports : 3.4638384294257025
Personalization : 3.317161232088458
Communication : 3.2381812027530184
Action : 3.102786866749408
Health & Fitness : 3.0802211440821394
Photography : 2.944826808078529
News & Magazines : 2.798149610741284
Social : 2.6627552747376733
Travel & Local : 2.324269434728647
Shopping : 2.2452894053932075
Books & Reference : 2.1437436533904997
Simulation : 2.042197901387792
Dating : 1.8616721200496447
Arcade : 1.8503892587160105
Video Players & Editors : 1.771409229380571
Casual : 1.7601263680469368
Maps & Navigation : 1.399074805370642
Food & Drink : 1.241114746699763
Puzzle : 1.1282861333634209
Racing : 0.9928917973598105
Role Playing : 0.9364774906916394
Libraries & Demo : 0.93647749069163

It can be seen that `Games` make up a much smaller share of the apps on Android than on IOS. This is likely due to 2 main reasons:
 1. Android features a much more customizable experience, which plausibly drives the number of non-game apps much higher compared to Apple's ecosystem. Consumers can express their individuality more easily than on IOS.
 2. Google Play simply categorizes its apps differently. For example `Family` apps are sometimes clearly games. Examples include `Red Embrace (BL/Yaoi Game)`, `Beauty Rental Shop`, `City Builder 2016: County Mall`... Thus this categorization leads to an apparent underrepresentation of games.
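
One way to probe the second point is to list which `Genres` labels appear inside a given `Category`. The sketch below is a hypothetical helper, `genres_in_category`, whose default indexes follow the Android header. It also splits values such as `Art & Design;Pretend Play`, which hold several labels separated by `;`. The rows shown are toy data with their own indexes.

```python
from collections import Counter

def genres_in_category(dataset, category, category_index=1, genres_index=9):
    """Count the genre labels found among the rows of one category."""
    counts = Counter()
    for row in dataset:
        if row[category_index] == category:
            # A `Genres` value can hold several labels separated by ';'
            for genre in row[genres_index].split(';'):
                counts[genre] += 1
    return counts.most_common()

# Toy rows: [name, category, genres]
toy_rows = [['Toy app 1', 'FAMILY', 'Puzzle;Education'],
            ['Toy app 2', 'FAMILY', 'Puzzle'],
            ['Toy app 3', 'TOOLS', 'Tools']]
print(genres_in_category(toy_rows, 'FAMILY', genres_index=2))
# [('Puzzle', 2), ('Education', 1)]
```

On the real data, `genres_in_category(android_master, 'FAMILY')` would show how many `FAMILY` apps carry game-like genre labels.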
 
 
 
 ### General Notes
 1. The number of apps in a category does not always correlate with its number of users.
 2. Not all app installs/reviews are equal. One on Facebook is obviously worth more than one on an app you use for 5 minutes.
 3. Lastly, the main metric we are interested in is how much time individuals spend on each type of app.
 
 
 ### Most Popular Apps IOS
 
 At the end of the day, app popularity is the main driving metric. This is why it is useful to look at the top apps for each of the categories above. 
 
For the `android_master` list this is easy, as it holds the `Number of installs` at `index 5`.
For the `ios_master` list there is unfortunately no install count, so we'll use `Rating_count` as a proxy. It also sits at index 5. 


In [25]:
genres_ios = frequency_table(ios_master, 11)

for i in genres_ios:
    total = 0
    len_genre = 0
    for val in ios_master:
        genre_app = val[-5]
        if genre_app == i:            
            n_ratings = float(val[5])
            total += n_ratings
            len_genre += 1
    avg_n_ratings = total / len_genre
    print(i, ':', avg_n_ratings)



Social Networking : 71548.34905660378
Photo & Video : 28441.54375
Games : 22788.6696905016
Music : 57326.530303030304
Reference : 74942.11111111111
Health & Fitness : 23298.015384615384
Weather : 52279.892857142855
Utilities : 18684.456790123455
Travel : 28243.8
Shopping : 26919.690476190477
News : 21248.023255813954
Navigation : 86090.33333333333
Lifestyle : 16485.764705882353
Entertainment : 14029.830708661417
Food & Drink : 33333.92307692308
Sports : 23008.898550724636
Book : 39758.5
Finance : 31467.944444444445
Education : 7003.983050847458
Productivity : 21028.410714285714
Business : 7491.117647058823
Catalogs : 4004.0
Medical : 612.0


The categories with the highest averages are `Navigation`, `Reference` and `Social Networking`. For `Navigation` and `Social Networking`, the averages are likely inflated by juggernauts such as Google Maps and Facebook, so the typical app in these categories can be assumed to have far fewer ratings. 
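
One way to check how much a few giant apps inflate these averages is to compare the mean with the median for each genre. The sketch below is a minimal illustration on made-up rows, not part of the original analysis: `genre_mean_and_median` and `sample_rows` are hypothetical names, and the only assumption is that each row carries a genre and a rating count at known indexes.

```python
from statistics import mean, median

def genre_mean_and_median(rows, genre_index, count_index):
    """Return {genre: (mean, median)} of the rating counts found in `rows`."""
    counts_by_genre = {}
    for row in rows:
        # Group the numeric rating counts under their genre
        counts_by_genre.setdefault(row[genre_index], []).append(float(row[count_index]))
    return {
        genre: (mean(counts), median(counts))
        for genre, counts in counts_by_genre.items()
    }

# Made-up rows (name, rating count, genre), for illustration only
sample_rows = [
    ['Giant Maps', '1000000', 'Navigation'],
    ['Small Maps A', '100', 'Navigation'],
    ['Small Maps B', '400', 'Navigation'],
    ['Quiz A', '3000', 'Education'],
    ['Quiz B', '5000', 'Education'],
]

summary = genre_mean_and_median(sample_rows, genre_index=2, count_index=1)
for genre, (avg, mid) in summary.items():
    print(genre, ': mean', avg, '| median', mid)
```

A large gap between mean and median, as in the made-up `Navigation` rows, would point to a genre dominated by a handful of apps.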

The `Weather` genre is quite popular, but one can assume that people spend very little time in these apps, which makes them a poor fit for our ad-funded app maker.

`Food & Drink` seems like a promising segment, due to its limitless range of subjects, its prolonged user utility, and its ease of development thanks to strong online resources on the subject.

Listing the apps in this segment with their rating counts, we get:

In [26]:
for app in ios_master:
    if app[-5] == 'Food & Drink':
        print(app[1], ':', app[5]) 


Starbucks : 303856
Domino's Pizza USA : 258624
OpenTable - Restaurant Reservations : 113936
Allrecipes Dinner Spinner : 109349
DoorDash - Food Delivery : 25947
UberEATS: Uber for Food Delivery : 17865
Postmates - Food Delivery, Faster : 9519
Dunkin' Donuts - Get Offers, Coupons & Rewards : 9068
Chick-fil-A : 5665
McDonald's : 4050
Deliveroo: Restaurant Delivery - Order Food Nearby : 1702
SONIC Drive-In : 1645
Nowait Guest : 1625
7-Eleven, Inc. : 1356
Outback : 805
Bon Appetit : 750
Starbucks Keyboard : 457
Whataburger : 197
Delish Eatmoji Keyboard : 154
Lieferheld - Delicious food delivery service : 29
Lieferando.de : 29
McDo France : 22
Chefkoch - Rezepte, Kochen, Backen & Kochbuch : 20
Youmiam : 9
Marmiton Twist : 2
Open Food Facts : 1


Thus the high average rating count comes mostly from A-list brands and food delivery services, both outside the scope of our company. This segment is hence of little practical use to us.

The `Education` genre is another promising one, due to its immense scalability and low barrier to entry. Let us check whether this segment is heavily skewed towards a few huge apps:
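
To look only at the leaders of a segment instead of the whole list, the rows can be sorted by rating count and cut off after the first few. This is a minimal sketch, not the notebook's own code: `top_apps` is a hypothetical helper, and the sample rows reuse four names and counts from the output above in a simplified three-column layout.

```python
def top_apps(rows, genre, n, name_index, count_index, genre_index):
    """Return the n (name, rating_count) pairs with the most ratings in `genre`."""
    in_genre = [
        (row[name_index], int(row[count_index]))
        for row in rows
        if row[genre_index] == genre
    ]
    # Sort by rating count, largest first, and keep the first n entries
    return sorted(in_genre, key=lambda pair: pair[1], reverse=True)[:n]

# Simplified rows (name, rating count, genre) taken from the output above
sample_rows = [
    ['Open Food Facts', '1', 'Food & Drink'],
    ['OpenTable - Restaurant Reservations', '113936', 'Food & Drink'],
    ['Starbucks', '303856', 'Food & Drink'],
    ["Domino's Pizza USA", '258624', 'Food & Drink'],
]

top_two = top_apps(sample_rows, 'Food & Drink', n=2,
                   name_index=0, count_index=1, genre_index=2)
for name, n_ratings in top_two:
    print(name, ':', n_ratings)
```

With the real `ios_master` rows, the indexes used elsewhere in this notebook would be `name_index=1`, `count_index=5` and `genre_index=-5`.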



In [27]:
for app in ios_master:
    if app[-5] == 'Education':
        print(app[1], ':', app[5]) 


Duolingo - Learn Spanish, French and more : 162701
Guess My Age  Math Magic : 123190
Lumosity - Brain Training : 96534
Elevate - Brain Training and Games : 58092
Fit Brains Trainer : 46363
ClassDojo : 35440
Memrise: learn languages : 20383
Peak - Brain Training : 20322
Canvas by Instructure : 19981
ABCmouse.com - Early Learning Academy : 18749
Quizlet: Study Flashcards, Languages & Vocabulary : 16683
Photomath - Camera Calculator : 16523
iTunes U : 15801
Blackboard Mobile Learn™ : 13567
Star Chart : 13482
Remind: Fast, Efficient School Messaging : 9796
PBS KIDS Video : 8651
Toca Kitchen Monsters : 8062
Toca Hair Salon - Christmas Gift : 8049
Edmodo : 7197
Prodigy Math Game : 6683
Epic! - Unlimited Books for Kids : 6676
ChineseSkill -Learn Mandarin Chinese Language Free : 6077
Google Classroom : 5942
TED : 5782
Khan Academy: you can learn anything : 5459
Got It - Homework Help Math, Chem, Physics Solver : 4903
PowerSchool Mobile : 4547
SkyView® Free - Explore the Universe : 4188
Hopsco

It can be seen that, bar the few top apps, many small apps have quite a strong number of reviews. 

The subjects range from languages (`Duolingo`, `HelloTalk`, `Mondly`) and brain-branded games (`Lumosity`, `Elevate`, `Memorado`) to kid-oriented apps (`Kids A-Z`, `ABCmouse.com`, `Toca Kitchen Monsters`) and others. 

Many of these build on already existing ideas and some strong 'learning' marketing. Because it is hard to gauge one's own learning ability, many pseudo-scientific learning apps become popular. IQ-based apps are a classic example.

This segment shows some potential for our company, and it should be looked into more through the Android dataset.

### Most Popular Apps Google Play

The Android popularity gauge is slightly misleading: installs are reported in open-ended buckets rather than as exact counts, as shown below:


In [28]:
disp_freq_table(android_master, 5)

1,000,000+ : 15.728308699086089
100,000+ : 11.553650005641432
10,000,000+ : 10.549475346947986
10,000+ : 10.199706645605325
1,000+ : 8.394448832223851
100+ : 6.916393997517771
5,000,000+ : 6.826131106848697
500,000+ : 5.562450637481666
50,000+ : 4.77265034412727
5,000+ : 4.5131445334536835
10+ : 3.542818458761142
500+ : 3.2494640640866526
50,000,000+ : 2.3017037120613786
100,000,000+ : 2.1324607920568655
50+ : 1.9180864267178157
5+ : 0.7898002933543947
1+ : 0.5077287600135394
500,000,000+ : 0.270788672007221
1,000,000,000+ : 0.2256572266726842
0+ : 0.04513144533453684


The data is not precise, and it would be hard to make accurate statements with it. But our goal is to understand key trends, which can still be analysed using this rough data. To treat these values as numbers, we need to delete the commas and the `+` signs, which is done below.

The code below also computes, like our iOS code above, the average number of installs per category.

In [29]:
pop_android = frequency_table(android_master, 1)

for i in pop_android:
    total = 0
    len_category = 0
    for app in android_master:
        category_app = app[1]
        if category_app == i:
            n_installs = app[5]
            n_installs = n_installs.replace(',', '')
            n_installs = n_installs.replace('+', '')
            total += float(n_installs)
            len_category += 1
    average_installs = total / len_category
    print(i, ':', average_installs)

ART_AND_DESIGN : 1986335.0877192982
AUTO_AND_VEHICLES : 647317.8170731707
BEAUTY : 513151.88679245283
BOOKS_AND_REFERENCE : 8767811.894736841
BUSINESS : 1712290.1474201474
COMICS : 817657.2727272727
COMMUNICATION : 38456119.167247385
DATING : 854028.8303030303
EDUCATION : 1833495.145631068
ENTERTAINMENT : 11640705.88235294
EVENTS : 253542.22222222222
FINANCE : 1387692.475609756
FOOD_AND_DRINK : 1924897.7363636363
HEALTH_AND_FITNESS : 4188821.9853479853
HOUSE_AND_HOME : 1331540.5616438356
LIBRARIES_AND_DEMO : 638503.734939759
LIFESTYLE : 1437816.2687861272
GAME : 15588015.603248259
FAMILY : 3697848.1731343283
MEDICAL : 120550.61980830671
SOCIAL : 23253652.127118643
SHOPPING : 7036877.311557789
PHOTOGRAPHY : 17840110.40229885
SPORTS : 3638640.1428571427
TRAVEL_AND_LOCAL : 13984077.710144928
TOOLS : 10801391.298666667
PERSONALIZATION : 5201482.6122448975
PRODUCTIVITY : 16787331.344927534
PARENTING : 542603.6206896552
WEATHER : 5074486.197183099
VIDEO_PLAYERS : 24727872.452830188
NEWS_AND_

The `COMMUNICATION` category is hugely popular, but once again, is this due to apps such as WhatsApp and Facebook Messenger? 
Let's check:

In [30]:
for app in android_master:
    if app[1] == 'COMMUNICATION' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

WhatsApp Messenger : 1,000,000,000+
imo beta free calls and text : 100,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Skype - free IM & video calls : 1,000,000,000+
Who : 100,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+
LINE: Free Calls & Messages : 500,000,000+
Google Chrome: Fast & Secure : 1,000,000,000+
Firefox Browser fast & private : 100,000,000+
UC Browser - Fast Download Private & Secure : 500,000,000+
Gmail : 1,000,000,000+
Hangouts : 1,000,000,000+
Messenger Lite: Free Calls & Messages : 100,000,000+
Kik : 100,000,000+
KakaoTalk: Free Calls & Text : 100,000,000+
Opera Mini - fast web browser : 100,000,000+
Opera Browser: Fast and Secure : 100,000,000+
Telegram : 100,000,000+
Truecaller: Caller ID, SMS spam blocking & Dialer : 100,000,000+
UC Browser Mini -Tiny Fast Private & Secure : 100,000,000+
Viber Mess

Indeed, these apps and other juggernauts, which are not relevant to our analysis, heavily skew the average number of installs for this category. The cell below recomputes the average without them.

In [31]:
under_100_m = []

for app in android_master:
    n_installs = app[5]
    n_installs = n_installs.replace(',', '')
    n_installs = n_installs.replace('+', '')
    if (app[1] == 'COMMUNICATION') and (float(n_installs) < 100000000):
        under_100_m.append(float(n_installs))
        
sum(under_100_m) / len(under_100_m)

3603485.3884615386

Without the apps that have 100 million or more installs, the average number of installs for this category is an order of magnitude lower!

Similar statements can be made for categories such as `MAPS_AND_NAVIGATION` or `SOCIAL`.
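
The same check can be run for every category at once. The sketch below is an assumption-laden illustration rather than the notebook's own code: `parse_installs` and `capped_average_installs` are hypothetical helpers, the rows are made up, and the layout only mimics `android_master` in having the category at index 1 and the install bucket at index 5.

```python
def parse_installs(text):
    """Turn a Play Store bucket such as '1,000,000+' into the float 1000000.0."""
    return float(text.replace(',', '').replace('+', ''))

def capped_average_installs(rows, cap, category_index=1, installs_index=5):
    """Average installs per category, ignoring apps at or above `cap`."""
    kept = {}
    for row in rows:
        installs = parse_installs(row[installs_index])
        if installs < cap:
            kept.setdefault(row[category_index], []).append(installs)
    # Categories whose apps all reach the cap are left out of the result
    return {category: sum(values) / len(values) for category, values in kept.items()}

# Made-up rows: name at index 0, category at index 1, installs at index 5
sample_rows = [
    ['Huge Chat', 'COMMUNICATION', '', '', '', '1,000,000,000+'],
    ['Small Chat', 'COMMUNICATION', '', '', '', '10,000+'],
    ['Tiny Chat', 'COMMUNICATION', '', '', '', '1,000+'],
    ['Big Network', 'SOCIAL', '', '', '', '500,000,000+'],
    ['Local Network', 'SOCIAL', '', '', '', '50,000+'],
]

averages = capped_average_installs(sample_rows, cap=100_000_000)
for category, average in averages.items():
    print(category, ':', average)
```

The cap of 100 million mirrors the threshold used in the cell above; any other cut-off could be passed instead.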

Going back to our observation that `Education` related games might be what we are looking for, let's look more closely at the `Education` category.

In [32]:
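for app in android_master:
    # Parse this app's own install bucket, e.g. '1,000+' -> 1000.0
    n_installs = float(app[5].replace(',', '').replace('+', ''))
    if app[1] == 'EDUCATION' and n_installs > 1000:
        print(app[0], ':', app[5])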

English Communication - Learn English for Chinese (Learn English for Chinese) : 100,000+
Khan Academy : 5,000,000+
Ai La Trieu Phu - ALTP Free : 100,000+
Learn Spanish - Español : 1,000,000+
Speed Reading : 500,000+
English for beginners : 1,000,000+
Mermaids : 5,000,000+
Learn Japanese, Korean, Chinese Offline & Free : 1,000,000+
Kids Mode : 500,000+
Dinosaurs Coloring Pages : 500,000+
Cars Coloring Pages : 1,000,000+
Math Tricks : 10,000,000+
Learn English Words Free : 5,000,000+
Japanese / English one-shop search dictionary - Free Japanese - English - Japanese dictionary application : 50,000+
English speaking texts : 1,000,000+
Thai Handwriting : 1,000,000+
THAI DICT 2018 : 1,000,000+
Kanji test · Han search Kanji training (free version) : 1,000,000+
Flippy Campus - Buy & sell on campus at a discount : 500,000+
Free intellectual training game application | : 1,000,000+
ABC Preschool Free : 5,000,000+
PINKFONG Baby Shark : 1,000,000+
English words application mikan : 500,000+
Learn E

Here we are looking at all the `Education` apps with more than 1000 downloads. 

Roughly the same observations apply as for the similar category on iOS. 

Brain games or IQ training type of games are notably absent, most probably due to them not being in the `Education` section. The Education section has a more serious connotation to it on Android, [as seen here](https://play.google.com/store/apps/category/EDUCATION?hl=en&gl=US).

If one [looks at the puzzle subcategory](https://play.google.com/store/apps/category/GAME_PUZZLE?hl=en&gl=US) they'll notice all the brain training and IQ games there!







In [33]:
for app in android_master:
    if app[1] == 'GAME' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'
                                      or app[5] == '10,000,000+'
                                      or app[5] == '1,000,000+'):
        print(app[0], ':', app[5])

Solitaire : 10,000,000+
Sonic Dash : 100,000,000+
PAC-MAN : 100,000,000+
Race the Traffic Moto : 10,000,000+
Marble - Temple Quest : 10,000,000+
Shooting King : 10,000,000+
Geometry Dash World : 10,000,000+
Roll the Ball® - slide puzzle : 100,000,000+
Farm Fruit Pop: Party Time : 1,000,000+
Piano Tiles 2™ : 100,000,000+
Pokémon GO : 100,000,000+
Paint Hit : 10,000,000+
Rolly Vortex : 10,000,000+
Woody Puzzle : 1,000,000+
Stack Jump : 10,000,000+
Extreme Car Driving Simulator : 100,000,000+
Bricks n Balls : 1,000,000+
The Fish Master! : 1,000,000+
Color Road : 10,000,000+
Draw In : 10,000,000+
Looper! : 1,000,000+
Trivia Crack : 100,000,000+
Baseball Boy! : 10,000,000+
Hello Stars : 10,000,000+
Tank Stars : 10,000,000+
Hole.io : 10,000,000+
Flip the Gun - Simulator Game : 10,000,000+
Mad Skills BMX 2 : 1,000,000+
MMX Hill Dash 2 – Offroad Truck, Car & Bike Racing : 1,000,000+
Word Link : 10,000,000+
Last Day on Earth: Survival : 10,000,000+
Partymasters - Fun Idle Game : 10,000,000+
Har

Heart of Vegas™ Slots – Free Slot Casino Games : 10,000,000+
Garena Free Fire : 100,000,000+
BROTHER IN WARS: GUNNER CITY WARLORDS : 1,000,000+
Police Car Driver : 10,000,000+
Moto Fighter 3D : 10,000,000+
Fields of Battle : 1,000,000+
Family Guy The Quest for Stuff : 10,000,000+
SnowMobile Parking Adventure : 1,000,000+
Fast Motorcycle Driver 2016 : 1,000,000+
Rope Hero: Vice Town : 10,000,000+
Motocross Mayhem : 1,000,000+
Block Gun 3D: Haunted Hollow : 1,000,000+
Navy Gunner Shoot War 3D : 10,000,000+
Drift Legends : 1,000,000+
FRONTLINE COMMANDO : 10,000,000+
Gun Builder ELITE : 1,000,000+
Magnum 3.0 Gun Custom SImulator : 1,000,000+
Gun Club Armory : 1,000,000+
4x4 Jeep Racer : 1,000,000+
Frontline Terrorist Battle Shoot: Free FPS Shooter : 1,000,000+
Modern Strike Online : 10,000,000+
Big Hunter : 10,000,000+
Soccer Clubs Logo Quiz : 1,000,000+
Fatal Raid - No.1 Mobile FPS : 1,000,000+


The number of games with over a million downloads on the Google Play Store is impressive, and suggests that this market is full of potential, even for non-AAA games.
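
Comparing install buckets as numbers avoids having to spell out every bucket string. The chained equality checks in the cell above, for instance, skip buckets such as `5,000,000+` and `50,000,000+`, which do appear in the frequency table earlier. The following is a minimal sketch on made-up rows: `parse_installs` and `apps_with_min_installs` are hypothetical helpers, and the row layout only mimics `android_master` (name at index 0, category at index 1, installs at index 5).

```python
def parse_installs(text):
    """Turn a Play Store bucket such as '5,000,000+' into the int 5000000."""
    return int(text.replace(',', '').replace('+', ''))

def apps_with_min_installs(rows, category, min_installs,
                           name_index=0, category_index=1, installs_index=5):
    """Return (name, bucket) pairs for apps in `category` with at least `min_installs`."""
    return [
        (row[name_index], row[installs_index])
        for row in rows
        if row[category_index] == category
        and parse_installs(row[installs_index]) >= min_installs
    ]

# Made-up rows, for illustration only
sample_rows = [
    ['Puzzle A', 'GAME', '', '', '', '5,000,000+'],
    ['Puzzle B', 'GAME', '', '', '', '1,000,000+'],
    ['Puzzle C', 'GAME', '', '', '', '500,000+'],
    ['Chat A', 'COMMUNICATION', '', '', '', '50,000,000+'],
]

popular_games = apps_with_min_installs(sample_rows, 'GAME', 1_000_000)
for name, bucket in popular_games:
    print(name, ':', bucket)
```

Since each bucket is only a lower bound on the true install count, `>=` on the parsed value selects every app known to have reached the threshold.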

## Conclusion

Our suggested niche is "Getting Smarter" or puzzle-style games, often with IQ-related branding. 
These games often succeed at gathering 100,000+ downloads, even if they have a cheap feel; examples [here](https://play.google.com/store/apps/details?id=com.pixign.smart.puzzles), [here](https://play.google.com/store/apps/details?id=com.easybrain.brain.test.easy.game&hl=en&gl=US), [here](https://play.google.com/store/apps/details?id=com.mind.quiz.brain.out), [and here](https://play.google.com/store/apps/details?id=net.rention.mind.skillz). These apps were all found on the front page of the `Top Rated Puzzle Games` section! 

The more profitable and successful apps could be swiftly ported over to iOS, as this niche proved successful on that platform as well.

Users will use these apps for much longer than weather apps, for example. Furthermore, most ads for these games have a cheap look, which even seems to add to their appeal. 






