# Profitable App Profiles for the App Store and Google Play Markets
As data scientists at a company that builds mobile apps, our goal in this project is to explore the App Store and Google Play datasets and identify the app profiles most likely to be profitable.

## Opening and Exploring the data
In this section we will import and explore the following `.csv` datasets:

- [Mobile App Store (7200 apps) dataset](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/home).
- [Google Play Store Apps dataset](https://www.kaggle.com/lava18/google-play-store-apps/home)  
  
To do that, we will create functions to read the data and headers from a `.csv` file and to explore the resulting datasets.

In [1]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [2]:
from csv import reader

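def open_data(dataset_path, include_headers=True):
    # The context manager closes the file once it has been read.
    # encoding='utf8' assumes the Kaggle files are UTF-8 encoded.
    with open(dataset_path, encoding='utf8', newline='') as opened_file:
        all_data = list(reader(opened_file))

    if include_headers:
        # Return the data rows and the header row separately.
        return all_data[1:], all_data[0]
    else:
        return all_data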

### Exploring the Apple Store dataset
With these functions defined, we can load and explore the Apple Store dataset.

In [3]:
apples_data, apples_headers = open_data('AppleStore.csv')

In [4]:
print(apples_headers)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


In [5]:
explore_data(apples_data, 0, 2, True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


Number of rows: 7197
Number of columns: 16


We see that the Apple Store dataset has 7197 data rows with 16 columns. At first glance, the most useful attributes for our analysis are the price (`"price"`), the rating counts (`"rating_count_tot"`, `"rating_count_ver"`), the content rating (`"cont_rating"`) and the primary genre (`"prime_genre"`).

| Column Name | Description |
|-------------|-------------|
| "id" | App ID |
| "track_name" | App Name |
| "size_bytes" | Size (in Bytes) |
| "currency" | Currency Type |
| "price" | Price amount |
| "rating_count_tot" | User Rating counts (for all versions) |
| "rating_count_ver" | User Rating counts (for current version) |
| "user_rating" | Average User Rating value (for all versions) |
| "user_rating_ver" | Average User Rating value (for current version) |
| "ver" | Latest version code |
| "cont_rating" | Content Rating |
| "prime_genre" | Primary Genre |
| "sup_devices.num" | Number of supported devices |
| "ipadSc_urls.num" | Number of screenshots shown for display |
| "lang.num" | Number of supported languages |
| "vpp_lic" | VPP Device Based Licensing Enabled |

### Exploring Google Play Store dataset
Now we will do the same thing for the Google Play Store dataset.

In [6]:
googles_data, googles_headers = open_data('googleplaystore.csv')

In [7]:
print(googles_headers)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


In [8]:
explore_data(googles_data, 0, 2, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


This dataset contains 10841 rows with 13 columns. The attributes we will use for analysis are `"Category"`, `"Rating"`, `"Reviews"`, `"Size"`, `"Installs"`, `"Type"`, `"Price"`, `"Content Rating"`, `"Genres"`.

| Column Name | Description |
|-------------|-------------|
| "App" | Application name |
| "Category" | Category the app belongs to |
| "Rating" | Overall user rating of the app (as when scraped) |
| "Reviews" | Number of user reviews for the app (as when scraped) |
| "Size" | Size of the app (as when scraped) |
| "Installs" | Number of user downloads/installs for the app (as when scraped) |
| "Type" | Paid or Free |
| "Price" | Price of the app (as when scraped) |
| "Content Rating" | Age group the app is targeted at - Children / Mature 21+ / Adult |
| "Genres" | An app can belong to multiple genres (apart from its main category). For example, a musical family game will belong to the Music, Game and Family genres. |
| "Last Updated" | Date when the app was last updated on Play Store (as when scraped) |
| "Current Ver" | Current version of the app available on Play Store (as when scraped) |
| "Android Ver" | Min required Android version (as when scraped) |

## Deleting Wrong Data
Now, after performing a brief exploration of our datasets, we are ready to start the next step. In this section we will detect and fix inaccurate or duplicate data.

### Removing Error Entries

As mentioned in [one of the Kaggle discussions](https://www.kaggle.com/lava18/google-play-store-apps/discussion/66015), the Google Play Market dataset contains an incorrect row at index 10472 (not counting the header) in which the "Category" value is missing, so every later value is shifted one column to the left. One of the ways to deal with error rows is simply to delete them. `¯\_(ツ)_/¯`

In [9]:
for i in range(len(googles_headers)):
    if (len(googles_data[10472]) > i):
        print(googles_headers[i] + ': ' + googles_data[10472][i])
    else:
        print(googles_headers[i] + ' not found!')

App: Life Made WI-Fi Touchscreen Photo Frame
Category: 1.9
Rating: 19
Reviews: 3.0M
Size: 1,000+
Installs: Free
Type: 0
Price: Everyone
Content Rating: 
Genres: February 11, 2018
Last Updated: 1.0.19
Current Ver: 4.0 and up
Android Ver not found!


In [10]:
print(len(googles_data))
del googles_data[10472]
print(len(googles_data))

10841
10840
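The broken row above was known in advance from the Kaggle discussion. As a minimal sketch of how such rows could be found without knowing their index, the following compares each row's length with the header's length. The function name `find_malformed_rows` and the miniature sample data are hypothetical and are not part of the original notebook.

```python
def find_malformed_rows(rows, header):
    """Return (index, row_length) for every row whose length differs from the header."""
    expected = len(header)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


# Hypothetical miniature dataset: the second row is missing one value.
sample_header = ['App', 'Category', 'Rating']
sample_rows = [
    ['App A', 'TOOLS', '4.1'],
    ['App B', '1.9'],
    ['App C', 'GAME', '3.9'],
]

bad = find_malformed_rows(sample_rows, sample_header)
print(bad)

# Delete from the end so that the earlier indices stay valid.
for index, _ in reversed(bad):
    del sample_rows[index]

print(len(sample_rows))
```

Unlike a bare `del googles_data[10472]`, a check like this does not remove a different row if the cell is accidentally run twice.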


### Removing Duplicate Entries: Part One

According to [some](https://www.kaggle.com/lava18/google-play-store-apps/discussion/82616) of [the](https://www.kaggle.com/lava18/google-play-store-apps/discussion/67894) Kaggle discussions, there are duplicates in the dataset. We will find them below and keep, for each app, only the record with the highest number of reviews.

In [11]:
unique_apps = []
duplicate_apps = []
for app in googles_data:
    app_name = app[0]
    
    if app_name in unique_apps:
        duplicate_apps.append(app_name)
    else:
        unique_apps.append(app_name)

In [12]:
print('Number of duplicates: ' + str(len(duplicate_apps)))
print('Expected number of apps: ' + str(len(googles_data) - len(duplicate_apps)))

Number of duplicates: 1181
Expected number of apps: 9659
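The loop above checks membership in a list, which gets slower as the list grows. A possible alternative, sketched here under the assumption that only the counts are needed, tracks the seen names in a set. The function name `count_duplicates` and the sample rows are hypothetical.

```python
def count_duplicates(rows, name_index=0):
    """Return (number of duplicate rows, number of unique names)."""
    seen = set()
    duplicates = 0
    for row in rows:
        name = row[name_index]
        if name in seen:
            duplicates += 1
        else:
            seen.add(name)
    return duplicates, len(seen)


# Hypothetical sample: 'App A' appears three times.
sample_rows = [['App A'], ['App B'], ['App A'], ['App A']]
n_duplicates, n_unique = count_duplicates(sample_rows)
print('Number of duplicates:', n_duplicates)
print('Expected number of apps:', n_unique)
```

For the notebook's data the call would be `count_duplicates(googles_data)`, since the app name is in column 0.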


We do not want to count duplicate rows during our analysis. The dataset should have 9659 apps after the cleaning process.

In [13]:
for app in googles_data:
    app_name = app[0]
    
    if app_name == 'Quick PDF Scanner + OCR FREE':
        print(app)

['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']
['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']
['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80804', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up']


There are 3 rows for the `"Quick PDF Scanner + OCR FREE"` app, and the only difference between them is the number of reviews. We can use this number to determine the newest row: the higher the review count, the more recent the record should be.

### Removing Duplicate Entries: Part Two

To remove the duplicates, we will:
- Create a dictionary that maps each app name to its highest number of reviews
- Create a new dataset from the obtained dictionary

In [14]:
reviews_max = {}

for app in googles_data:
    name = app[0]
    n_reviews = int(app[3])
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews

In [15]:
len(reviews_max)

9659

Now we will remove the duplicate rows and keep, for each app, only the first row whose number of reviews matches the value stored in the `reviews_max` dictionary.

In [16]:
googles_data_clean = []
already_added = []

for app in googles_data:
    name = app[0]
    n_reviews = int(app[3])
    
    if name not in already_added and n_reviews == reviews_max[name]:
        googles_data_clean.append(app)
        already_added.append(name)
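The two steps above can also be combined into one pass. This is a sketch of an alternative, not the notebook's method: it stores the whole row with the highest review count in a dictionary keyed by app name. The function name `keep_most_reviewed` and the sample rows are hypothetical.

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep, for each app name, the first row with the highest review count."""
    best = {}
    for row in rows:
        name = row[name_index]
        n_reviews = int(row[reviews_index])
        # Strict '>' means that on a tie the earlier row is kept.
        if name not in best or n_reviews > int(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Hypothetical sample rows: name, category, rating, reviews.
sample_rows = [
    ['App A', 'BUSINESS', '4.2', '80804'],
    ['App A', 'BUSINESS', '4.2', '80805'],
    ['App B', 'TOOLS', '4.0', '10'],
    ['App A', 'BUSINESS', '4.2', '80805'],
]

for row in keep_most_reviewed(sample_rows):
    print(row)
```

One difference to be aware of: the result is ordered by the first appearance of each app name, so the row order may differ from `googles_data_clean` even though the same rows are kept.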

Let's explore the new dataset, and confirm that the number of rows is 9659.

In [17]:
explore_data(googles_data_clean, 0, 2, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 9659
Number of columns: 13


### Removing Non-English Apps

We can find that both datasets have apps with names that suggest they are not directed toward an English-speaking audience.

In [18]:
print(apples_data[813][1])
print(apples_data[6731][1])
print(googles_data_clean[4412][0])
print(googles_data_clean[7940][0])

爱奇艺PPS -《欢乐颂2》电视剧热播
【脱出ゲーム】絶対に最後までプレイしないで 〜謎解き＆ブロックパズル〜
中国語 AQリスニング
لعبة تقدر تربح DZ


As we are not interested in keeping these apps, we will remove them.

In [19]:
def is_english(word, treshold=3):
    count = 0
    
    for char in word:
        if ord(char) > 127:
            count += 1
    
    if count > treshold:
        return False
    else:
        return True

The function above helps us determine whether a name contains non-English characters. If the number of characters with a code point greater than 127 (that is, outside the ASCII range) exceeds the threshold (the `treshold` parameter, default 3), the function returns `False`. Otherwise, it returns `True`.

In [20]:
print(is_english('Instagram'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('Instachat 😜'))

True
False
True
True


In the test cases we can see that names with a few special symbols or emojis still pass as English. That's exactly what we wanted! Let's create the new filtered datasets.
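To make the threshold rule more concrete, here is a small sketch that prints how many non-ASCII characters each test name contains. The helper `count_non_ascii` is hypothetical and only mirrors the counting part of `is_english`.

```python
def count_non_ascii(text):
    """Count the characters whose code point lies outside the ASCII range (0-127)."""
    return sum(1 for char in text if ord(char) > 127)


names = [
    'Instagram',
    'Docs To Go™ Free Office Suite',
    'Instachat 😜',
    '爱奇艺PPS -《欢乐颂2》电视剧热播',
]

for name in names:
    # A name is treated as English when this count is at most 3.
    print(name, ':', count_non_ascii(name))
```

The trademark sign and the emoji each count as a single character, so those names stay under the limit of 3, while the Chinese title goes well over it.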

In [21]:
apples_en = []

for app in apples_data:
    name = app[1]
    if is_english(name):
        apples_en.append(app)

In [22]:
googles_en = []

for app in googles_data_clean:
    name = app[0]
    if is_english(name):
        googles_en.append(app)

### Isolating the Free Apps

So far in the data cleaning process we removed inaccurate, duplicate and non-English data. As the last step of preparing data for analysis we will focus on keeping free apps.

In [23]:
apples_free = []

for app in apples_en:
    price = float(app[4])
    if price == 0:
        apples_free.append(app)

In [24]:
googles_free = []

for app in googles_en:
    app_type = app[6]
    if app_type == 'Free':
        googles_free.append(app)

In [25]:
print('Apple Store Apps remaining: ' + str(len(apples_free)))
print('Google Play Apps remaining: ' + str(len(googles_free)))

Apple Store Apps remaining: 3222
Google Play Apps remaining: 8863


At the end of the data cleaning and preparation process we have 3222 Apple Store apps and 8863 Google Play apps remaining, which should be enough for our analysis.

## Most Common Apps by Genre

Finally, it is time to analyze the data we gathered and prepared! Our company's revenue is highly influenced by the number of people using our apps. Let's see the most common genres for both the Apple and Google markets.

In [26]:
def freq_table(dataset, index):
    table = {}
    count = 0
    
    for row in dataset:
        attribute = row[index]
        count += 1
        
        if attribute in table:
            table[attribute] += 1
        else:
            table[attribute] = 1

    for row in table:
        table[row] = (table[row] / count) * 100
            
    return table

In [27]:
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
        
    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
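The same percentage table can be built with the standard library's `collections.Counter`. This is a sketch of an alternative to `freq_table` and `display_table`, assuming the same list-of-lists layout; the name `freq_table_percent` and the sample rows are hypothetical.

```python
from collections import Counter


def freq_table_percent(dataset, index):
    """Map each value in the given column to its share of rows, in percent."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    # most_common() yields values from the most to the least frequent.
    return {value: count / total * 100 for value, count in counts.most_common()}


# Hypothetical sample rows: app name, genre.
sample_rows = [
    ['App A', 'Games'],
    ['App B', 'Games'],
    ['App C', 'Music'],
    ['App D', 'Games'],
]

for genre, share in freq_table_percent(sample_rows, 1).items():
    print(genre, ':', share)
```

Because dictionaries keep insertion order, the result is already sorted by frequency and needs no separate sorting step before printing.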

### Genre of Apple Store Apps

Our analysis starts by exploring the genres of free English Apple Store applications. We can see that more than half (58.16%) are games, followed by entertainment apps (close to 8%). Photo & Video applications take almost 5%, 3.66% of the apps are designed for education and 3.29% for social networking.

The general impression is that the App Store (at least its free English segment) is dominated by games and entertainment applications, while productivity apps with more practical purposes are rarer.

However, this does not mean that fun apps have the greatest number of users - the demand might not be the same as the offer.

In [28]:
display_table(apples_free, 11)

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


### Category of Google Play Apps

We can see a significant difference between the Apple Store and Google Play markets. The Family category (which has almost 19% of the market) could contain educational games for kids. However, unlike the App Store, Google Play has a good number of apps designed for productivity.

In [29]:
display_table(googles_free, 1)

FAMILY : 18.898792733837304
GAME : 9.725826469592688
TOOLS : 8.462146000225657
BUSINESS : 4.592124562789123
LIFESTYLE : 3.9038700214374367
PRODUCTIVITY : 3.8925871601038025
FINANCE : 3.7007785174320205
MEDICAL : 3.5315355974275078
SPORTS : 3.396141261423897
PERSONALIZATION : 3.317161232088458
COMMUNICATION : 3.2381812027530184
HEALTH_AND_FITNESS : 3.0802211440821394
PHOTOGRAPHY : 2.944826808078529
NEWS_AND_MAGAZINES : 2.798149610741284
SOCIAL : 2.6627552747376737
TRAVEL_AND_LOCAL : 2.335552296062281
SHOPPING : 2.245289405393208
BOOKS_AND_REFERENCE : 2.1437436533904997
DATING : 1.8616721200496444
VIDEO_PLAYERS : 1.7939749520478394
MAPS_AND_NAVIGATION : 1.399074805370642
FOOD_AND_DRINK : 1.241114746699763
EDUCATION : 1.1621347173643235
ENTERTAINMENT : 0.9590432133589079
LIBRARIES_AND_DEMO : 0.9364774906916393
AUTO_AND_VEHICLES : 0.9251946293580051
HOUSE_AND_HOME : 0.8236488773552973
WEATHER : 0.8010831546880289
EVENTS : 0.7108202640189552
PARENTING : 0.6544059573507841
ART_AND_DESIGN : 0

### Genres of Google Play Apps

Although the difference between the `Genres` and `Category` columns of the Google Play dataset is not crystal clear, the `Genres` column gives us more detail, including genre combinations such as `Art & Design;Pretend Play`.

In [30]:
display_table(googles_free, 9)

Tools : 8.450863138892023
Entertainment : 6.070179397495204
Education : 5.348076272142616
Business : 4.592124562789123
Productivity : 3.8925871601038025
Lifestyle : 3.8925871601038025
Finance : 3.7007785174320205
Medical : 3.5315355974275078
Sports : 3.463838429425702
Personalization : 3.317161232088458
Communication : 3.2381812027530184
Action : 3.102786866749408
Health & Fitness : 3.0802211440821394
Photography : 2.944826808078529
News & Magazines : 2.798149610741284
Social : 2.6627552747376737
Travel & Local : 2.324269434728647
Shopping : 2.245289405393208
Books & Reference : 2.1437436533904997
Simulation : 2.042197901387792
Dating : 1.8616721200496444
Arcade : 1.8503892587160102
Video Players & Editors : 1.771409229380571
Casual : 1.7601263680469368
Maps & Navigation : 1.399074805370642
Food & Drink : 1.241114746699763
Puzzle : 1.128286133363421
Racing : 0.9928917973598104
Role Playing : 0.9364774906916393
Libraries & Demo : 0.9364774906916393
Auto & Vehicles : 0.9251946293580051
S

## Most Popular Apps by Genre on the App Store

To find out which genres are the most popular, we will calculate the average number of user ratings (the `rating_count_tot` column) for each genre.

In [31]:
apples_genres = freq_table(apples_free, 11)
apples_ratings = []

for genre in apples_genres:
    total = 0
    len_genre = 0
    
    for app in apples_free:
        genre_app = app[11]
        if genre == genre_app:
            n_ratings = int(app[5])
            total += n_ratings
            len_genre += 1
            
    avg_rating = total / len_genre
    avg_tuple = (avg_rating, genre)
    apples_ratings.append(avg_tuple)

apples_ratings = sorted(apples_ratings, reverse = True)
for rating in apples_ratings:
    print(rating[1], ':', rating[0])

Navigation : 86090.33333333333
Reference : 74942.11111111111
Social Networking : 71548.34905660378
Music : 57326.530303030304
Weather : 52279.892857142855
Book : 39758.5
Food & Drink : 33333.92307692308
Finance : 31467.944444444445
Photo & Video : 28441.54375
Travel : 28243.8
Shopping : 26919.690476190477
Health & Fitness : 23298.015384615384
Sports : 23008.898550724636
Games : 22788.6696905016
News : 21248.023255813954
Productivity : 21028.410714285714
Utilities : 18684.456790123455
Lifestyle : 16485.764705882353
Entertainment : 14029.830708661417
Business : 7491.117647058823
Education : 7003.983050847458
Catalogs : 4004.0
Medical : 612.0
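The cell above loops over the whole dataset once per genre. A single-pass version is sketched below as an alternative, assuming the same row layout; the name `average_by_group` and the sample rows are hypothetical.

```python
from collections import defaultdict


def average_by_group(dataset, group_index, value_index):
    """Return (average, group) tuples sorted from the highest to the lowest average."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in dataset:
        group = row[group_index]
        totals[group] += int(row[value_index])
        counts[group] += 1
    averages = [(totals[group] / counts[group], group) for group in totals]
    return sorted(averages, reverse=True)


# Hypothetical sample rows: app name, rating count, genre.
sample_rows = [
    ['App A', '100', 'Games'],
    ['App B', '300', 'Games'],
    ['App C', '50', 'Music'],
]

for average, genre in average_by_group(sample_rows, 2, 1):
    print(genre, ':', average)
```

With the notebook's column indices, the equivalent call would be `average_by_group(apples_free, 11, 5)`.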


In [32]:
for app in apples_free:  
    if app[11] == 'Navigation':
        print(app[1], ':', app[5])

Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5


In [33]:
for app in apples_free:  
    if app[11] == 'Reference':
        print(app[1], ':', app[5])

Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
GUNS MODS for Minecraft PC Edition - Mods Tools : 1497
Guides for Pokémon GO - Pokemon GO News and Cheats : 826
WWDC : 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718
VPN Express : 14
Real Bike Traffic Rider Virtual Reality Glasses : 8
教えて!goo : 0
Jishokun-Japanese English Dictionary & Translator : 0


In [34]:
for app in apples_free:  
    if app[11] == 'Book':
        print(app[1], ':', app[5])

Kindle – Read eBooks, Magazines & Textbooks : 252076
Audible – audio books, original series & podcasts : 105274
Color Therapy Adult Coloring Book for Adults : 84062
OverDrive – Library eBooks and Audiobooks : 65450
HOOKED - Chat Stories : 47829
BookShout: Read eBooks & Track Your Reading Goals : 879
Dr. Seuss Treasury — 50 best kids books : 451
Green Riding Hood : 392
Weirdwood Manor : 197
MangaZERO - comic reader : 9
ikouhoushi : 0
MangaTiara - love comic reader : 0
謎解き : 0
謎解き2016 : 0


Navigation, Reference and Social Networking apps have the highest average number of user ratings. However, these averages are skewed by a few market giants: Waze and Google Maps account for almost all of the Navigation ratings, and Bible dominates Reference, while the rest of the apps in these genres trail far behind.
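The effect of these giants can be illustrated by comparing the mean with the median, which is less sensitive to outliers. The sketch below uses the six Navigation rating counts printed above; using the median here is a suggestion for illustration, not something the original analysis does.

```python
from statistics import mean, median

# Rating counts of the six free English Navigation apps listed above.
navigation_ratings = [345046, 154911, 12811, 3582, 187, 5]

print('Mean:', mean(navigation_ratings))
print('Median:', median(navigation_ratings))
```

The mean reproduces the Navigation value from the genre table (about 86090), while the median is 8196.5, roughly ten times lower.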

Good results can be achieved with Book, Multimedia (Photo & Video), Weather or Finance apps. However, it could be hard to develop an independent finance application without specialized experience.

## Most Popular Apps by Genre on Google Play

The Google Play dataset stores the number of installs as categorical strings such as `10,000+`. We will convert them to numerical values, compute the average per category and see whether the situation differs from the App Store market.
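As a small worked illustration of that conversion, the helper below strips the plus sign and the thousands separators before calling `int`. The name `parse_installs` is hypothetical.

```python
def parse_installs(installs):
    """Convert an install bucket such as '10,000+' into the integer 10000."""
    return int(installs.replace('+', '').replace(',', ''))


for bucket in ['10,000+', '500,000+', '1,000,000,000+']:
    print(bucket, '->', parse_installs(bucket))
```

Note that each bucket is only a lower bound: an app listed as `10,000+` has at least 10,000 installs, so averages built from these numbers are approximate.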

In [35]:
googles_categories = freq_table(googles_free, 1)
googles_ratings = []

for category in googles_categories:
    total = 0
    len_category = 0
    
    for app in googles_free:
        category_app = app[1]
        if category == category_app:
            n_installs = app[5].replace('+','').replace(',', '')
            total += int(n_installs)
            len_category += 1
            
    avg_rating = total / len_category
    avg_tuple = (avg_rating, category)
    googles_ratings.append(avg_tuple)

googles_ratings = sorted(googles_ratings, reverse = True)
for rating in googles_ratings:
    print(rating[1], ':', rating[0])

COMMUNICATION : 38456119.167247385
VIDEO_PLAYERS : 24727872.452830188
SOCIAL : 23253652.127118643
PHOTOGRAPHY : 17840110.40229885
PRODUCTIVITY : 16787331.344927534
GAME : 15588015.603248259
TRAVEL_AND_LOCAL : 13984077.710144928
ENTERTAINMENT : 11640705.88235294
TOOLS : 10801391.298666667
NEWS_AND_MAGAZINES : 9549178.467741935
BOOKS_AND_REFERENCE : 8767811.894736841
SHOPPING : 7036877.311557789
PERSONALIZATION : 5201482.6122448975
WEATHER : 5074486.197183099
HEALTH_AND_FITNESS : 4188821.9853479853
MAPS_AND_NAVIGATION : 4056941.7741935486
FAMILY : 3697848.1731343283
SPORTS : 3638640.1428571427
ART_AND_DESIGN : 1986335.0877192982
FOOD_AND_DRINK : 1924897.7363636363
EDUCATION : 1833495.145631068
BUSINESS : 1712290.1474201474
LIFESTYLE : 1437816.2687861272
FINANCE : 1387692.475609756
HOUSE_AND_HOME : 1331540.5616438356
DATING : 854028.8303030303
COMICS : 817657.2727272727
AUTO_AND_VEHICLES : 647317.8170731707
LIBRARIES_AND_DEMO : 638503.734939759
PARENTING : 542603.6206896552
BEAUTY : 51315

In [45]:
for app in googles_free:  
    if app[1] == 'COMMUNICATION' and (
        app[5] == '1,000,000,000+' or
        app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

WhatsApp Messenger : 1,000,000,000+
imo beta free calls and text : 100,000,000+
Android Messages : 100,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
Skype - free IM & video calls : 1,000,000,000+
Who : 100,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+
Google Chrome: Fast & Secure : 1,000,000,000+
Firefox Browser fast & private : 100,000,000+
Gmail : 1,000,000,000+
Hangouts : 1,000,000,000+
Messenger Lite: Free Calls & Messages : 100,000,000+
Kik : 100,000,000+
KakaoTalk: Free Calls & Text : 100,000,000+
Opera Mini - fast web browser : 100,000,000+
Opera Browser: Fast and Secure : 100,000,000+
Telegram : 100,000,000+
Truecaller: Caller ID, SMS spam blocking & Dialer : 100,000,000+
UC Browser Mini -Tiny Fast Private & Secure : 100,000,000+
WeChat : 100,000,000+
Yahoo Mail – Stay Organized : 100,000,000+
BBM - Free Calls & Messages : 100,000,000+


In [44]:
for app in googles_free:  
    if app[1] == 'VIDEO_PLAYERS' and (
        app[5] == '1,000,000,000+' or
        app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

YouTube : 1,000,000,000+
Motorola Gallery : 100,000,000+
VLC for Android : 100,000,000+
Google Play Movies & TV : 1,000,000,000+
Dubsmash : 100,000,000+
VivaVideo - Video Editor & Photo Movie : 100,000,000+
VideoShow-Video Editor, Video Maker, Beauty Camera : 100,000,000+
Motorola FM Radio : 100,000,000+


In [46]:
for app in googles_free:  
    if app[1] == 'BOOKS_AND_REFERENCE' and (
        app[5] == '1,000,000,000+' or
        app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

Google Play Books : 1,000,000,000+
Bible : 100,000,000+
Amazon Kindle : 100,000,000+
Wattpad 📖 Free Books : 100,000,000+
Audiobooks from Audible : 100,000,000+


In [47]:
for app in googles_free:  
    if app[1] == 'BOOKS_AND_REFERENCE':
        print(app[0], ':', app[5])

E-Book Read - Read Book for free : 50,000+
Download free book with green book : 100,000+
Wikipedia : 10,000,000+
Cool Reader : 10,000,000+
Free Panda Radio Music : 100,000+
Book store : 1,000,000+
FBReader: Favorite Book Reader : 10,000,000+
English Grammar Complete Handbook : 500,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
Google Play Books : 1,000,000,000+
AlReader -any text book reader : 5,000,000+
Offline English Dictionary : 100,000+
Offline: English to Tagalog Dictionary : 500,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
Recipes of Prophetic Medicine for free : 500,000+
ReadEra – free ebook reader : 1,000,000+
Anonymous caller detection : 10,000+
Ebook Reader : 5,000,000+
Litnet - E-books : 100,000+
Read books online : 5,000,000+
English to Urdu Dictionary : 500,000+
eBoox: book reader fb2 epub zip : 1,000,000+
English Persian Dictionary : 500,000+
Flybook : 500,000+
All Maths Formulas : 1,000,000+
Ancestry : 5,000,000+
HTC Help : 10,000,000+
E

As we can see, the average number of installs for Communication and Video Players apps is heavily distorted by a few giant apps. By contrast, the Books and Reference category is much less dominated by huge applications.
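One way to gauge this distortion is to recompute a category average while leaving out the apps above some install cap. The sketch below is an assumption-laden illustration, not part of the original analysis: the function name `average_installs`, the `cap` parameter and the sample rows are all hypothetical.

```python
def average_installs(rows, category, cap=None, category_index=1, installs_index=5):
    """Average installs for one category, optionally ignoring apps at or above `cap`."""
    values = []
    for row in rows:
        if row[category_index] != category:
            continue
        installs = int(row[installs_index].replace('+', '').replace(',', ''))
        if cap is None or installs < cap:
            values.append(installs)
    return sum(values) / len(values) if values else 0.0


# Hypothetical sample rows: app name, category, installs.
sample_rows = [
    ['Giant app', 'COMMUNICATION', '1,000,000,000+'],
    ['Mid-size app', 'COMMUNICATION', '1,000,000+'],
    ['Small app', 'COMMUNICATION', '10,000+'],
    ['Other app', 'TOOLS', '500+'],
]

with_giants = average_installs(sample_rows, 'COMMUNICATION', installs_index=2)
without_giants = average_installs(sample_rows, 'COMMUNICATION', cap=100_000_000, installs_index=2)
print('All apps:', with_giants)
print('Under 100M installs:', without_giants)
```

The default indices (1 and 5) match the notebook's Google Play columns, so a call such as `average_installs(googles_free, 'COMMUNICATION', cap=100_000_000)` would apply the same idea to the real data.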

## Conclusion

In this project we analyzed profitable app profiles for the mobile markets by exploring the most popular genres and categories of mobile applications.

We found that the most promising direction for our company, which focuses on free Android and iOS apps, is to develop a Book or Media application. To reach the top of the market lists, the application should offer some advantages or special features compared to similar applications.