## Profitable App Profiles for the App Store and Google Play Markets

This analysis is for a company that builds Android and iOS mobile apps. The apps are free, and the main source of revenue is in-app ads. In this project, we analyze data to help developers understand what types of apps are likely to attract more users on Google Play and the App Store.

There are over four million apps across the two stores: approximately 2 million iOS apps on the App Store and 2.1 million Android apps on Google Play. In this project, however, we analyze only a sample of the data. We chose two data sets that seem suitable for our goals.

- A data set containing data about approximately 10,000 Android apps from Google Play; the data was collected in August 2018. You can download the data set directly from this link.
- A data set containing data about approximately 7,000 iOS apps from the App Store; the data was collected in July 2017. You can download the data set directly from this link.

Google Play Market Dataset
1. Application name
2. Category: Category the app belongs to
3. Rating: Overall user rating of the app (as of when scraped)
4. Reviews: Number of user reviews for the app (as of when scraped)
5. Size: Size of the app
6. Installs: Number of user downloads/installs for the app
7. Type: Paid or Free
8. Price: Price of the app
9. Content Rating: Age group the app is targeted at - Children / Mature 21+ / Adult
10. Genres: An app can belong to multiple genres (apart from its main category). For example, a musical family game will belong to the Music, Game, and Family genres.

App Store Dataset
1. "id" : App ID
2. "track_name": App Name
3. "size_bytes": Size (in Bytes)
4. "currency": Currency Type
5. "price": Price amount
6. "rating_count_tot": User Rating counts (for all versions)
7. "rating_count_ver": User Rating counts (for current version)
8. "user_rating": Average User Rating value (for all versions)
9. "user_rating_ver": Average User Rating value (for current version)
10. "ver" : Latest version code
11. "cont_rating": Content Rating
12. "prime_genre": Primary Genre
13. "sup_devices.num": Number of supporting devices
14. "ipadSc_urls.num": Number of screenshots shown for display
15. "lang.num": Number of supported languages
16. "vpp_lic": Vpp Device Based Licensing Enabled

In [1]:
from csv import reader

def open_csvfile(filename):
    # Read the whole CSV file and split it into a header row and data rows
    with open(filename, encoding='utf8') as opened_file:
        lst = list(reader(opened_file))

    return lst[0], lst[1:]

In [2]:
ios_header, ios_apps = open_csvfile('AppleStore.csv')
android_header, android_apps = open_csvfile('googleplaystore.csv')

To explore the dataset, we use the function explore_data that allows us to slice the dataset by position. We also use the optional parameter rows_and_columns to print out the information of the dataset (number of rows and number of columns).

In [3]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [4]:
explore_data(ios_apps, 0, 3, True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 7197
Number of columns: 16


In [5]:
explore_data(android_apps, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


## Data preparation
### 1. Remove the inaccurate data from Google Play dataset
The Google Play data set has a dedicated discussion section, and one of the discussions describes an error in a certain row. Let's print that row and compare it with the record that follows it.

In [6]:
android_apps[10472:10474]

[['Life Made WI-Fi Touchscreen Photo Frame',
  '1.9',
  '19',
  '3.0M',
  '1,000+',
  'Free',
  '0',
  'Everyone',
  '',
  'February 11, 2018',
  '1.0.19',
  '4.0 and up'],
 ['osmino Wi-Fi: free WiFi',
  'TOOLS',
  '4.2',
  '134203',
  '4.1M',
  '10,000,000+',
  'Free',
  '0',
  'Everyone',
  'Tools',
  'August 7, 2018',
  '6.06.14',
  '4.4 and up']]

Row 10472 is wrong: the Category value is missing, so all remaining columns (except the app name) are shifted one position to the left. We therefore delete the row.
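
Rows like this one can also be found without relying on the discussion section, because a missing value leaves the row shorter than the header (row 10472 has 12 fields rather than 13). The following is a minimal sketch of that check; the helper name `find_malformed_rows` and the toy data are hypothetical and not part of the original notebook.

```python
def find_malformed_rows(header, rows):
    """Return (index, row_length) pairs for rows whose length differs from the header's."""
    expected = len(header)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


# Toy data (hypothetical), not taken from the real files
sample_header = ['App', 'Category', 'Rating']
sample_rows = [
    ['App A', 'TOOLS', '4.2'],
    ['App B', '1.9'],            # the Category value is missing
    ['App C', 'GAME', '3.8'],
]

print(find_malformed_rows(sample_header, sample_rows))
```

With the real data, a call such as `find_malformed_rows(android_header, android_apps)` should flag any row whose field count does not match the header.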

In [7]:
print(len(android_apps))
# remove the row using the del statement. 
del android_apps[10472]
print(len(android_apps))

10841
10840


### 2. Removing Duplicate Entries

Some apps in the dataset occur more than once. In this step, we remove the duplicate entries and keep only one entry per app. We start with the Android apps.

In [8]:
unique_apps = []
duplicate_apps = []

for row in android_apps:
    name = row[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)

In [10]:
print(len(unique_apps))
print(len(duplicate_apps))

9659
1181


Sample of duplicate Android apps:

In [12]:
duplicate_apps[:10]

['Quick PDF Scanner + OCR FREE',
 'Box',
 'Google My Business',
 'ZOOM Cloud Meetings',
 'join.me - Simple Meetings',
 'Box',
 'Zenefits',
 'Google Ads',
 'Google My Business',
 'Slack']

Why are these entries duplicated, and which fields stay the same? Let's look at a few examples.

In [13]:
# return the list of rows whose app name matches app_name
def find_app(app_name, ds):
    app_lst = []
    for row in ds:
        if app_name == row[0]:
            app_lst.append(row)
            
    return app_lst

In [14]:
lst = find_app('Quick PDF Scanner + OCR FREE', android_apps)
print(lst)

[['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up'], ['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up'], ['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80804', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']]


In [15]:
lst = find_app('Instagram', android_apps)
print(lst)

[['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]


The main difference is the fourth column: the number of reviews. In the next step, we will remove duplicates. Rather than removing duplicates randomly, we'll only keep the row with the highest number of reviews and remove the other entries for any given app. 
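
The observation that only the review count differs can be checked programmatically rather than by eye. The following is a small sketch; the helper name `differing_columns` and the shortened toy rows are hypothetical.

```python
def differing_columns(rows):
    """Return the indices of columns whose values are not identical across all rows."""
    return [i for i, values in enumerate(zip(*rows)) if len(set(values)) > 1]


# Toy duplicate rows (hypothetical, shortened to four columns)
sample_duplicates = [
    ['Some App', 'SOCIAL', '4.5', '100'],
    ['Some App', 'SOCIAL', '4.5', '120'],
    ['Some App', 'SOCIAL', '4.5', '100'],
]

print(differing_columns(sample_duplicates))
```

Applied to the output of `find_app('Instagram', android_apps)`, the helper should report index 3, the Reviews column.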

To remove the duplicates, we will:
- Create a dictionary, where each dictionary key is a unique app name and the corresponding dictionary value is the highest number of reviews of that app.
- Use the information stored in the dictionary and create a new data set, which will have only one entry per app (and for each app, we'll only select the entry with the highest number of reviews).

In [16]:
reviews_max = {}
for row in android_apps:
    name = row[0]
    n_reviews = float(row[3])
    
    # duplicate records
    if name in reviews_max:
        # compare two number of reviews and choose the highest one
        if n_reviews > reviews_max[name]:
            reviews_max[name] = n_reviews
    else:
        # add this app to the dict
        reviews_max[name] = n_reviews

We will inspect the dictionary to check whether the result is correct.

In [17]:
lst = find_app('Quick PDF Scanner + OCR FREE', android_apps)
print(lst)

[['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up'], ['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up'], ['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80804', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']]


In [18]:
print(reviews_max['Quick PDF Scanner + OCR FREE'])

80805.0


In [19]:
len(reviews_max)

9659

In the next step, we will use the dictionary created to remove the duplicate rows.
- Start by creating two empty lists: android_clean (which will store our new cleaned data set) and already_added (which will store only app names).
- Loop through the Google Play data set (android_apps does not include the header row), and for each iteration:
    + Assign the app name to a variable named name.
    + Convert the number of reviews to float, and assign it to a variable named n_reviews.
- If n_reviews equals the maximum number of reviews for that app (stored in the reviews_max dictionary) and name is not already in already_added:
    + Append the entire row to the android_clean list (which will eventually be a list of lists and store our cleaned data set).
    + Append the app name to the already_added list, which keeps track of the apps we have already added.
- The already_added check is needed because several duplicate rows can share the same maximum number of reviews (two of the 'Quick PDF Scanner + OCR FREE' rows above both have 80805 reviews); without it, both rows would be kept.

In [20]:
android_clean = []  # stores our new cleaned data set
already_added = []  # stores only app names

for row in android_apps:
    name = row[0]
    n_reviews = float(row[3])
    
    if n_reviews == reviews_max[name] and name not in already_added:
        android_clean.append(row)
        already_added.append(name)

In [21]:
print(len(android_clean))

9659
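
The two-step approach above (first build reviews_max, then filter) can also be collapsed into a single pass that stores the best row per app directly. This is only a sketch of an alternative, not the method used in this notebook; the function name `keep_most_reviewed` and the toy rows are hypothetical. It assumes Python 3.7+, where dictionaries preserve insertion order.

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep one row per app name: the first row seen with the highest review count."""
    best = {}  # app name -> row with the most reviews seen so far
    for row in rows:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        # Strict '>' means ties keep the earlier row, so no extra bookkeeping is needed
        if name not in best or n_reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Toy rows (hypothetical): name, category, rating, reviews
sample_rows = [
    ['A', 'X', '4.0', '10'],
    ['B', 'X', '4.0', '5'],
    ['A', 'X', '4.0', '12'],
    ['A', 'X', '4.0', '12'],
]

for row in keep_most_reviewed(sample_rows):
    print(row)
```

One difference to be aware of: the rows come back in the order in which each app name first appears, which may not match the row order of android_clean.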


### 3. Removing Non English Apps

In [22]:
def is_english_name(app_name):
    for character in app_name:
        if ord(character) > 127:
            return False
    return True

We use this function to check whether these app names are detected as English or non-English:

In [23]:
print(is_english_name('Instagram'))
print(is_english_name('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_english_name('Docs To Go™ Free Office Suite'))
print(is_english_name('Instachat 😜'))

True
False
False
False


- The function works as written, but some English app names use emojis or other symbols (™, — (em dash), – (en dash), etc.) that fall outside the ASCII range. In its current form, the function would label many English apps as non-English, and we would lose useful data.
- To minimize this data loss, we'll only remove an app if its name has more than three characters that fall outside the ASCII range. This means all English apps with up to three emoji or other special characters will still be labeled as English. The filter is still not perfect, but it should be fairly effective.
- Let's update the function accordingly, and then use it to filter out the non-English apps.
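
To see why these symbols trip the check, it helps to look at their code points directly. The sketch below prints the value that `ord()` returns for a few characters and counts the non-ASCII characters in two of the test names; the helper name `count_non_ascii` is hypothetical.

```python
# Code points of a few characters; ASCII covers the range 0 to 127
for character in ['a', 'Z', '™', '😜', '爱']:
    code_point = ord(character)
    print(repr(character), code_point, 'ASCII' if code_point <= 127 else 'non-ASCII')


def count_non_ascii(text):
    """Count the characters in text whose code point is above 127."""
    return sum(1 for character in text if ord(character) > 127)


print(count_non_ascii('Docs To Go™ Free Office Suite'))
print(count_non_ascii('Instachat 😜'))
```

Both names contain a single non-ASCII character, which is why a threshold of three keeps them while still rejecting names written mostly in other scripts.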

In [24]:
def is_english_name(app_name):
    strange_chars = 0
    for character in app_name:
        if ord(character) > 127:
            strange_chars += 1
            
    if strange_chars > 3:
        return False
    else:
        return True

In [25]:
print(is_english_name('Instagram'))
print(is_english_name('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_english_name('Docs To Go™ Free Office Suite'))
print(is_english_name('Instachat 😜'))

True
False
True
True


In [26]:
android_english = []
ios_english = []
android_non_english = []
ios_non_english = []

for row in android_clean:
    name = row[0]
    if is_english_name(name):
        android_english.append(row)
    else:
        android_non_english.append(name)
    
for row in ios_apps:
    name = row[1]
    if is_english_name(name):
        ios_english.append(row)
    else:
        ios_non_english.append(name)  

In [27]:
print(android_english[0][0])
print(android_english[1][0])
print(android_english[2][0])

Photo Editor & Candy Camera & Grid & ScrapBook
U Launcher Lite – FREE Live Cool Themes, Hide Apps
Sketch - Draw & Paint


In [28]:
print(ios_english[0][1])
print(ios_english[1][1])
print(ios_english[2][1])

Facebook
Instagram
Clash of Clans


In [29]:
android_non_english[:10]

['Flame - درب عقلك يوميا',
 'သိင်္ Astrology - Min Thein Kha BayDin',
 'РИА Новости',
 'صور حرف H',
 'L.POINT - 엘포인트 [ 포인트, 멤버십, 적립, 사용, 모바일 카드, 쿠폰, 롯데]',
 'RMEduS - 음성인식을 활용한 R 프로그래밍 실습 시스템',
 'AJ렌터카 법인 카셰어링',
 'Al Quran Free - القرآن (Islam)',
 '中国語 AQリスニング',
 '日本AV历史']

In [30]:
ios_non_english[:10]

['爱奇艺PPS -《欢乐颂2》电视剧热播',
 '聚力视频HD-人民的名义,跨界歌王全网热播',
 '优酷视频',
 '网易新闻 - 精选好内容，算出你的兴趣',
 '淘宝 - 随时随地，想淘就淘',
 '搜狐视频HD-欢乐颂2 全网首播',
 '阴阳师-全区互通现世集结',
 '百度贴吧-全球最大兴趣交友社区',
 '百度网盘',
 '爱奇艺HD -《欢乐颂2》电视剧热播']

In [31]:
print(len(android_english))
print(len(ios_english))

9614
6183


### 4. Removing Paid Apps

In [32]:
android_final = []

for row in android_english:
    price = row[7]
    if price == '0':
        android_final.append(row)        

In [33]:
ios_final = []

for row in ios_english:
    price = float(row[4])
    if price == 0:
        ios_final.append(row)   

In [34]:
print(len(android_final))
print(len(ios_final))

8864
3222


## Most Common Apps by Genre

We are now done cleaning the data. As mentioned before, our goal is to find the kinds of apps that are likely to attract more users, because our revenue is highly influenced by the number of people using our apps. Because the end goal is to add the app on both Google Play and the App Store, we need to find app profiles that are successful on both markets. For instance, a profile that works well for both markets might be a productivity app that makes use of gamification.

Let's begin the analysis by getting a sense of what are the most common genres for each market. For this, we'll need to build frequency tables for a few columns in our data sets.

We choose the prime_genre column of the App Store data set, and the Genres and Category columns of the Google Play data set.

We'll build two functions we can use to analyze the frequency tables:

+ One function to generate frequency tables that show percentages
+ Another function we can use to display the percentages in a descending order

In [35]:
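def freq_table(dataset, index):
    # Build a frequency table (as percentages) for the column at position index
    table = {}  # named table to avoid shadowing the built-in dict
    total = len(dataset)
    
    for row in dataset:
        val = row[index]
        if val in table:
            table[val] += 1
        else:
            table[val] = 1
            
    # Convert the raw counts to percentages
    for key in table:
        table[key] = round(table[key]*100/total, 2)
            
    return table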

In [36]:
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted[:10]:
        print(entry[1], ':', entry[0])
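
For comparison, the same two steps (count, then show the top entries as percentages) can be sketched with `collections.Counter` from the standard library. This is an alternative formulation, not the code used in the rest of this notebook; the names `freq_table_counter` and `top_entries` and the toy data are hypothetical.

```python
from collections import Counter


def freq_table_counter(dataset, index):
    """Return {value: percentage} for one column, rounded to two decimals."""
    counts = Counter(row[index] for row in dataset)
    total = len(dataset)
    return {value: round(count * 100 / total, 2) for value, count in counts.items()}


def top_entries(dataset, index, n=10):
    """Return the n most frequent (value, percentage) pairs, highest first."""
    table = freq_table_counter(dataset, index)
    return sorted(table.items(), key=lambda item: item[1], reverse=True)[:n]


# Toy data (hypothetical): name, genre
sample = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
for value, percentage in top_entries(sample, 1):
    print(value, ':', percentage)
```

Entries with equal percentages may appear in a different order than in display_table, which breaks ties by sorting on the key as well.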

In [37]:
# We display only the most popular genre (top 10) of applications
display_table(ios_final, 11)

Games : 58.16
Entertainment : 7.88
Photo & Video : 4.97
Education : 3.66
Social Networking : 3.29
Shopping : 2.61
Utilities : 2.51
Sports : 2.14
Music : 2.05
Health & Fitness : 2.02


We can see that among the free English apps, more than half (58.16%) are games. Entertainment apps are close to 8%, followed by photo and video apps, which are close to 5%. Only 3.66% of the apps are designed for education, followed by social networking apps, which account for 3.29% of the apps in our data set.

The general impression is that the App Store (at least the part containing free English apps) is dominated by apps that are designed for fun (games, entertainment, photo and video, social networking, sports, music, etc.), while apps with practical purposes (education, shopping, utilities, productivity, lifestyle, etc.) are rarer. However, the fact that fun apps are the most numerous doesn't imply that they also have the greatest number of users; the demand might not be the same as the offer.

Let's continue by examining the Genres and Category columns of the Google Play data set (two columns which seem to be related).

In [38]:
# We display only the most popular category (top 10) of applications
display_table(android_final, 1)

FAMILY : 18.91
GAME : 9.72
TOOLS : 8.46
BUSINESS : 4.59
LIFESTYLE : 3.9
PRODUCTIVITY : 3.89
FINANCE : 3.7
MEDICAL : 3.53
SPORTS : 3.4
PERSONALIZATION : 3.32


The landscape seems significantly different on Google Play: there are not that many apps designed for fun, and it seems that a good number of apps are designed for practical purposes (family, tools, business, lifestyle, productivity, etc.). However, if we investigate this further, we can see that the family category (which accounts for almost 19% of the apps) means mostly games for kids.

In [39]:
# We display only the most popular genre (top 10) of applications
display_table(android_final, 9)

Tools : 8.45
Entertainment : 6.07
Education : 5.35
Business : 4.59
Productivity : 3.89
Lifestyle : 3.89
Finance : 3.7
Medical : 3.53
Sports : 3.46
Personalization : 3.32


### Most Popular Apps by Genre on the App Store

In [40]:
genres_ios = freq_table(ios_final, 11)

In [46]:
lst = []
for genre in genres_ios:
    total = 0
    len_genre = 0
    for row in ios_final:
        rating_count = float(row[5]) # rating count tot
        genre_app = row[11]       # prime genre
        if genre == genre_app:
            total += rating_count
            len_genre += 1
            
    avg = round(total/len_genre, 2)
    tuple_e = (avg, genre)
    lst.append(tuple_e)

lst_sorted = sorted(lst, reverse=True)

for element in lst_sorted:
    print(element[1], ": ", element[0])

Navigation :  86090.33
Reference :  74942.11
Social Networking :  71548.35
Music :  57326.53
Weather :  52279.89
Book :  39758.5
Food & Drink :  33333.92
Finance :  31467.94
Photo & Video :  28441.54
Travel :  28243.8
Shopping :  26919.69
Health & Fitness :  23298.02
Sports :  23008.9
Games :  22788.67
News :  21248.02
Productivity :  21028.41
Utilities :  18684.46
Lifestyle :  16485.76
Entertainment :  14029.83
Business :  7491.12
Education :  7003.98
Catalogs :  4004.0
Medical :  612.0


In [47]:
for app in ios_final:
    if app[-5] == 'Navigation':
        print(app[1], ':', app[5]) # print name and number of ratings

Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5


On average, navigation apps have the highest number of user ratings, but this figure is heavily influenced by Waze and Google Maps, which together have close to half a million user ratings.
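
One way to make the effect of these two apps visible is to compare the mean with the median, which is much less sensitive to a few very large values. The sketch below uses the six rating counts printed above and the standard-library `statistics` module; the variable name is hypothetical.

```python
from statistics import mean, median

# Rating counts of the six Navigation apps printed above
navigation_ratings = [345046, 154911, 12811, 3582, 187, 5]

print(round(mean(navigation_ratings), 2))   # matches the genre average reported earlier
print(median(navigation_ratings))           # midpoint of the two middle values
```

The mean reproduces the 86090.33 shown in the genre table, while the median is 8196.5, roughly a tenth of that, which suggests that a typical navigation app has far fewer ratings than the average implies.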

In [48]:
for app in ios_final:
    if app[-5] == 'Reference':
        print(app[1], ':', app[5]) # print name and number of ratings

Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
GUNS MODS for Minecraft PC Edition - Mods Tools : 1497
Guides for Pokémon GO - Pokemon GO News and Cheats : 826
WWDC : 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718
VPN Express : 14
Real Bike Traffic Rider Virtual Reality Glasses : 8
教えて!goo : 0
Jishokun-Japanese English Dictionary & Translator : 0


The Reference genre seems to have potential, although its average is dominated by the Bible and Dictionary.com apps.

### Most Popular Apps by Genre on Google Play

In the previous section, we came up with an app profile recommendation for the App Store based on the number of user ratings. For the Google Play market we have data about the number of installs, so we should be able to get a clearer picture of genre popularity. However, the install numbers don't seem precise enough: most values are open-ended (100+, 1,000+, 5,000+, etc.):

In [49]:
display_table(android_final, 5) # the Installs column

1,000,000+ : 15.73
100,000+ : 11.55
10,000,000+ : 10.55
10,000+ : 10.2
1,000+ : 8.39
100+ : 6.92
5,000,000+ : 6.83
500,000+ : 5.56
50,000+ : 4.77
5,000+ : 4.51


One problem with this data is that it is not precise. For instance, we don't know whether an app with 100,000+ installs has 100,000 installs, 200,000, or 350,000. However, we don't need very precise data for our purposes: we only want to find out which app genres attract the most users, and we don't need perfect precision with respect to the number of users.

We're going to leave the numbers as they are, which means that we'll consider that an app with 100,000+ installs has 100,000 installs, and an app with 1,000,000+ installs has 1,000,000 installs, and so on. To perform computations, however, we'll need to convert each install number from string to float. This means we need to remove the commas and the plus characters, otherwise the conversion will fail and raise an error.

To remove characters from strings, we can use the str.replace(old, new) method. Like list.append() or list.copy(), str.replace() is a method, that is, a function attached to an object. It takes two parameters, old and new, and returns a new string in which all occurrences of old are replaced with new:
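
As a small illustration of the conversion on its own, the sketch below chains two str.replace() calls and then converts the result to float. The helper name `installs_to_float` is hypothetical; the labels are values that appear in the Installs column.

```python
def installs_to_float(installs):
    """Convert an install label such as '1,000,000+' to a float."""
    # Strings are immutable, so each replace() call returns a new string
    return float(installs.replace(',', '').replace('+', ''))


for label in ['100+', '10,000+', '1,000,000,000+']:
    print(label, '->', installs_to_float(label))
```

Calling float() on the raw label would raise a ValueError, which is why both characters have to be removed first.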

In [58]:
android_app_categories = freq_table(android_final, 1) # second column
category_lst = []

for category in android_app_categories:
    total = 0
    len_category = 0
    for app in android_final:
        category_app = app[1]
        if category_app == category:            
            number_of_installs = app[5] # number of install
            number_of_installs = number_of_installs.replace(',', '')
            number_of_installs = number_of_installs.replace('+', '')
            number_of_installs = float(number_of_installs)
            total += number_of_installs
            len_category += 1
            
    avg_n_installs = total / len_category
    
    tuple_e = (avg_n_installs, category)
    category_lst.append(tuple_e)
    
category_lst = sorted(category_lst, reverse=True)

for element in category_lst[:10]:
    print(element[1], ": ", element[0])

COMMUNICATION :  38456119.167247385
VIDEO_PLAYERS :  24727872.452830188
SOCIAL :  23253652.127118643
PHOTOGRAPHY :  17840110.40229885
PRODUCTIVITY :  16787331.344927534
GAME :  15588015.603248259
TRAVEL_AND_LOCAL :  13984077.710144928
ENTERTAINMENT :  11640705.88235294
TOOLS :  10801391.298666667
NEWS_AND_MAGAZINES :  9549178.467741935
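
The nested loops above scan the whole data set once per category. As a sketch of an alternative, the same averages can be accumulated in a single pass with two dictionaries; the function name `average_by_group` and the toy rows are hypothetical, and the install labels are assumed to use the same comma and plus formatting as the Installs column.

```python
from collections import defaultdict


def average_by_group(rows, group_index, value_index):
    """Average a numeric column per group in a single pass over the rows."""
    totals = defaultdict(float)  # group -> running sum
    counts = defaultdict(int)    # group -> number of rows
    for row in rows:
        value = float(row[value_index].replace(',', '').replace('+', ''))
        totals[row[group_index]] += value
        counts[row[group_index]] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Toy rows (hypothetical): name, category, installs
sample_apps = [
    ['App A', 'TOOLS', '1,000+'],
    ['App B', 'TOOLS', '3,000+'],
    ['App C', 'GAME', '500+'],
]

print(average_by_group(sample_apps, 1, 2))
```

With the real data, `average_by_group(android_final, 1, 5)` should give the same per-category averages as the cell above, up to floating-point rounding.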


In [66]:
for app in android_final:
    if app[1] == "COMMUNICATION" and (app[5] == "1,000,000,000+"
                                      or app[5] == "500,000,000+"
                                      or app[5] == "100,000,000+"):
        print(app[0], ':', app[5])

WhatsApp Messenger : 1,000,000,000+
imo beta free calls and text : 100,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Skype - free IM & video calls : 1,000,000,000+
Who : 100,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+
LINE: Free Calls & Messages : 500,000,000+
Google Chrome: Fast & Secure : 1,000,000,000+
Firefox Browser fast & private : 100,000,000+
UC Browser - Fast Download Private & Secure : 500,000,000+
Gmail : 1,000,000,000+
Hangouts : 1,000,000,000+
Messenger Lite: Free Calls & Messages : 100,000,000+
Kik : 100,000,000+
KakaoTalk: Free Calls & Text : 100,000,000+
Opera Mini - fast web browser : 100,000,000+
Opera Browser: Fast and Secure : 100,000,000+
Telegram : 100,000,000+
Truecaller: Caller ID, SMS spam blocking & Dialer : 100,000,000+
UC Browser Mini -Tiny Fast Private & Secure : 100,000,000+
Viber Mess

In [68]:
for app in android_final:
    if app[1] == "VIDEO_PLAYERS" and (app[5] == "1,000,000,000+"
                                      or app[5] == "500,000,000+"
                                      or app[5] == "100,000,000+"):
        print(app[0], ':', app[5])

YouTube : 1,000,000,000+
Motorola Gallery : 100,000,000+
VLC for Android : 100,000,000+
Google Play Movies & TV : 1,000,000,000+
MX Player : 500,000,000+
Dubsmash : 100,000,000+
VivaVideo - Video Editor & Photo Movie : 100,000,000+
VideoShow-Video Editor, Video Maker, Beauty Camera : 100,000,000+
Motorola FM Radio : 100,000,000+
