# Profitable App Profiles for the App Store and Google Play Markets


## Table of contents <a name="begin"></a>
### 1. [Introduction](#introduction)
* [About The Project](#about)
* [Project Goal](#goal)

### 2. [Opening Data](#open)


### 3. [Exploring Data](#explore)


### 4. [Data Cleaning](#clean)
* [Checking and deleting rows that do not have complete entries for both datasets](#delete)
* [Removing Non-English Apps](#remove)
* [Isolating the Free Apps on English App Dataset](#isolate)


### 5. [Data Analysis](#analysis)
* [Determining the most common genres in each market](#common)
* [Most Popular Apps by Genre on the Apple Store](#apple)
* [Most Popular Apps by Genre on the Google Play Store](#google)


### 6. [Conclusions](#conclusion)


### 7. [Limitations](#limitation)

***
***
## Introduction <a name="introduction"></a> 

#### About The Project  <a name="about"></a>
This project is for a company that builds free Android and iOS mobile apps and makes them available on Google Play and in the App Store. Our main source of revenue is in-app ads.
This means that the revenue for any given app is mostly determined by its number of users: the more users who see and engage with the ads, the better.

#### Project Goal <a name="goal"></a>
The goal of this project is to analyze data to help our developers
understand what types of apps are likely to attract more users.

***
***
## Opening Data <a name="open"></a>

In [3]:
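# Quick first look: reading AppleStore.csv into a list of lists
from csv import reader

open_file = open("AppleStore.csv", encoding="utf8")
read_file = reader(open_file)
apple_data = list(read_file)
open_file.close()

print(apple_data[:3])  # header plus the first two rows; the full list is too long to print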



In [47]:
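# Opening AppleStore.csv and googleplaystore.csv
from csv import reader

# encoding="utf8" is set explicitly because many app names contain non-ASCII characters
open_file1 = open("AppleStore.csv", encoding="utf8")
open_file2 = open("googleplaystore.csv", encoding="utf8")
read_file1 = reader(open_file1)
read_file2 = reader(open_file2)

# Converting each file to a list of lists (the first row is the header)
app_store_data = list(read_file1)
google_store_data = list(read_file2)

open_file1.close()
open_file2.close()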
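
The loading step can also be wrapped in a small helper that closes the file automatically. The following is only a sketch: `load_csv` is a hypothetical name that does not appear in this notebook, and the demo writes a tiny made-up CSV to a temporary folder instead of touching the real data files.

```python
import csv
import os
import tempfile


def load_csv(path, encoding="utf8"):
    """Read a CSV file and return (header, rows) as lists of strings."""
    # newline="" is what the csv module documentation recommends for file objects
    with open(path, encoding=encoding, newline="") as f:
        data = list(csv.reader(f))
    return data[0], data[1:]


# Self-contained demo on a temporary file (made-up rows, not the real datasets)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", encoding="utf8", newline="") as f:
        csv.writer(f).writerows([["App", "Reviews"], ["Demo A", "10"], ["Demo B", "25"]])
    header, rows = load_csv(demo_path)

print(header)
print(len(rows))
```

Returning the header separately from the rows is a design choice that avoids having to remember which lists still carry a header row later in the analysis.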



***
***
## Exploring the Data <a name="explore"></a>


In [48]:
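# Creating an explore_data() function

def explore_data(dataset, start, end, rows_and_columns=False):
    """Print the rows dataset[start:end]; optionally print the dataset's dimensions."""
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n')  # adds blank lines between rows for readability
    if rows_and_columns:
        # The row count assumes dataset[0] is a header row and excludes it.
        # For a list without a header, the reported number is one less than len(dataset).
        print('Number of rows: ', len(dataset[1:]))
        print('Number of columns: ', len(dataset[0]))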

In [49]:
# Exploring the App Store data
# (explore_data only prints and returns None, so its result is not assigned)

explore_data(app_store_data, 0, 5, rows_and_columns=True)


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


Number of rows:  7197
Number of columns:  16


In [50]:
# Exploring the Google Play data

explore_data(google_store_data, 0, 5, rows_and_columns=True)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows:  10841
Number of columns:  13


## Data Cleaning <a name="clean"></a>

### Checking and deleting rows that do not have complete entries for both datasets <a name="delete"></a>

In [8]:
# Printing only the column names of both datasets, to identify the columns that will help our analysis

explore_data(app_store_data, 0, 1)
explore_data(google_store_data, 0, 1)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']




In [51]:
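# Checking whether every row has the same number of columns as the header row.


def num_of_columns(dataset):
    """Print each row (and its index) whose length differs from the header's."""
    header_length = len(dataset[0])
    for index, row in enumerate(dataset):
        if len(row) != header_length:
            print(row)
            print("\n")
            print("Index position is:", index)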
        

In [52]:
# using number of column checker function on Apple Store data
print(num_of_columns(app_store_data))

None


In [11]:
# using number of column checker function on Google store data
print(num_of_columns(google_store_data))

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


Index position is: 10473
None


In [64]:
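# Deleting the malformed row at index 10473 of the Google Play data
# (it has 12 values instead of 13: the 'Category' value is missing).
# Run this cell only once; running it again would delete a valid row.
del google_store_data[10473]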

In [65]:
# Re-running the earlier column checker function to Check if there is still an incomplete-column row.
# Using number of column checker function on Google store data
print(num_of_columns(google_store_data))

None


##### Checking and deleting duplicated rows for both datasets

In [68]:
# Creating a duplicate-checker function with three parameters:
# the dataset, the column index of the app name, and the app name to look for.

def dupli_checker(dataset, column_index_of_app_name, name):
    for app in dataset[1:]:  # skip the header row
        app_name = app[column_index_of_app_name]
        # The list is re-created on every iteration, so each matching row
        # is printed on its own as a one-item list.
        list_of_duplicates = []
        if app_name == name:
            list_of_duplicates.append(app)
            print(list_of_duplicates)
            print('\n')
            

In [69]:
# Checking for duplicates in the Google Play data with the dupli_checker function.
# The app name is at index 0 ('App').

dupli_checker(google_store_data,0,'Twitter')

[['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11667403', 'Varies with device', '500,000,000+', 'Free', '0', 'Mature 17+', 'News & Magazines', 'August 6, 2018', 'Varies with device', 'Varies with device']]


[['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11667403', 'Varies with device', '500,000,000+', 'Free', '0', 'Mature 17+', 'News & Magazines', 'August 6, 2018', 'Varies with device', 'Varies with device']]


[['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11657972', 'Varies with device', '500,000,000+', 'Free', '0', 'Mature 17+', 'News & Magazines', 'July 30, 2018', 'Varies with device', 'Varies with device']]




In [117]:
# Checking for duplicates in the App Store data with the dupli_checker function.
# The app name is at index 1 ('track_name').

dupli_checker(app_store_data,1,'Twitter')

[['333903271', 'Twitter', '210569216', 'USD', '0.0', '354058', '452', '3.5', '4.0', '6.79.1', '17+', 'News', '37', '2', '33', '1']]




***

From the output above, we can see that Twitter appears three times in the Google Play data but only once in the App Store data.

***

##### Finding the number of duplicated apps and displaying samples of them for each dataset

In [99]:
# Creating a function that shows the number of duplicated rows and the first 15 examples from the dataset.
def number_of_dupli(dataset,column_index_of_app_name):
    duplicate_apps=[]
    unique_apps=[]

    for app in dataset[1:]:
        app_name=app[column_index_of_app_name]
        if app_name in unique_apps:
            duplicate_apps.append(app_name)
        else:
            unique_apps.append(app_name)

    print('Number of duplicated apps on : ', len(duplicate_apps))
    print('\n')
    print('Examples of duplicated apps on : ', duplicate_apps[:15])
    expect_length = len(dataset[1:]) - len(duplicate_apps)
    print('\n')
    print('Expected lenght of unique data is ', expect_length)
    print(len(unique_apps))

In [100]:
# checking number of duplicated apps on Google store data, using the number_of_dupli function 
# app name falls on index 0 ('App')

number_of_dupli(google_store_data,0)


Number of duplicated apps on :  1181


Examples of duplicated apps on :  ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack', 'FreshBooks Classic', 'Insightly CRM', 'QuickBooks Accounting: Invoicing & Expenses', 'HipChat - Chat Built for Teams', 'Xero Accounting Software']


Expected lenght of unique data is  9659
9659


In [116]:
# Checking the number of duplicated apps in the App Store data with the number_of_dupli function.
# We use the id instead of track_name because every app in the App Store has a unique id number.
# The app id is at index 0 ('id').

number_of_dupli(app_store_data,0)


Number of duplicated apps on :  0


Examples of duplicated apps on :  []


Expected lenght of unique data is  7197
7197


***
From the output, we see that the App Store data has no duplicate rows,
while the Google Play data has 1181 duplicate rows.

Next, we remove the duplicate rows from the Google Play data so that each row represents a unique app.
***
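
As a cross-check on the duplicate count, the same tally can be sketched with `collections.Counter` from the standard library. This is an illustration under stated assumptions rather than the method used in this notebook: `count_duplicates` is a hypothetical helper, it expects rows without a header, and the sample rows are made up.

```python
from collections import Counter


def count_duplicates(rows, name_index):
    """Return (number of surplus rows, names that occur more than once)."""
    counts = Counter(row[name_index] for row in rows)
    repeated = {name: n for name, n in counts.items() if n > 1}
    # An app listed n times contributes n - 1 surplus rows
    surplus = sum(n - 1 for n in repeated.values())
    return surplus, repeated


# Made-up rows for illustration only
sample_rows = [
    ["Box", "100"],
    ["Slack", "50"],
    ["Box", "120"],
    ["Box", "90"],
    ["Zoom", "70"],
]
surplus, repeated = count_duplicates(sample_rows, 0)
print(surplus)   # 2
print(repeated)  # {'Box': 3}
```

The surplus figure counts every extra occurrence, which is the same quantity that number_of_dupli reports.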

#### For Google Play Store Data

##### Create a dictionary in which each key is a unique app name and the corresponding value is the highest number of reviews recorded for that app.
##### Then use the dictionary to build a new dataset with only one entry per app: for each app, we keep the entry with the highest number of reviews.

In [83]:
# creating dictionary for Google store data

reviews_max= {}
    
for app in google_store_data[1:]:
    app_name = app[0]
    n_reviews = float(app[3])
        
    if app_name in reviews_max and reviews_max[app_name]  < n_reviews:
        reviews_max[app_name] = n_reviews
            
    elif app_name not in reviews_max:
        reviews_max[app_name] = n_reviews
    
print('\n')
print('The lenght of unique apps in the review_max dictionary is ',len(reviews_max))
print('\n')
print('\n')
print(reviews_max)
    



The lenght of unique apps in the review_max dictionary is  9659




{'Photo Editor & Candy Camera & Grid & ScrapBook': 159.0, 'Coloring book moana': 974.0, 'U Launcher Lite – FREE Live Cool Themes, Hide Apps': 87510.0, 'Sketch - Draw & Paint': 215644.0, 'Pixel Draw - Number Art Coloring Book': 967.0, 'Paper flowers instructions': 167.0, 'Smoke Effect Photo Maker - Smoke Editor': 178.0, 'Infinite Painter': 36815.0, 'Garden Coloring Book': 13791.0, 'Kids Paint Free - Drawing Fun': 121.0, 'Text on Photo - Fonteee': 13880.0, 'Name Art Photo Editor - Focus n Filters': 8788.0, 'Tattoo Name On My Photo Editor': 44829.0, 'Mandala Coloring Book': 4326.0, '3D Color Pixel by Number - Sandbox Art Coloring': 1518.0, 'Learn To Draw Kawaii Characters': 55.0, 'Photo Designer - Write your name with shapes': 3632.0, '350 Diy Room Decor Ideas': 27.0, 'FlipaClip - Cartoon animation': 194216.0, 'ibis Paint X': 224399.0, 'Logo Maker - Small Business': 450.0, "Boys Photo Editor - Six Pack & Men's Suit": 

***
By observation, the number of unique apps in the reviews_max dictionary (9659) equals the expected length of the new dataset calculated earlier: the length of the Google Play dataset excluding the header row (10840) minus the number of duplicated rows (1181).
***


##### Using the dictionary above to remove the duplicate rows from google_store_data 

In [160]:
google_data_clean = []
already_added = []

for row in google_store_data[1:]:
    app_name = row[0]
    n_reviews = float(row[3])

    # The already_added check handles apps whose highest review count appears
    # in more than one row (e.g. Twitter): only the first such row is kept.
    if (n_reviews == reviews_max[app_name]) and (app_name not in already_added):
        google_data_clean.append(row)
        already_added.append(app_name)

print('The total number of rows in cleaned Google Data is',len(google_data_clean))
print('\n')
explore_data(google_data_clean, 0, 3, True)

        

The total number of rows in cleaned Google Data is 9659


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows:  9658
Number of columns:  13


***
From the output, the cleaned Google Play data has 9659 rows, which matches the expected number of rows once the duplicates are removed. (explore_data reports 9658 because it excludes the first row as a header, and google_data_clean has no header row.)
***
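
The two steps above (a dictionary of maximum review counts, then a filtering pass) can also be combined into a single pass that stores the best row seen so far for each app. This is a sketch only: `keep_most_reviewed` is a hypothetical helper, the rows are reduced to two columns, and the Box review counts are invented for the example.

```python
def keep_most_reviewed(rows, name_index, reviews_index):
    """Keep one row per app name: the first row with the highest review count."""
    best = {}  # app name -> best row seen so far
    for row in rows:
        name = row[name_index]
        reviews = float(row[reviews_index])
        # Strict '>' keeps the earliest row when review counts tie
        if name not in best or reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Rows reduced to (name, reviews); the Box values are made up
sample_rows = [
    ["Twitter", "11667403"],
    ["Box", "159872"],
    ["Twitter", "11667403"],
    ["Twitter", "11657972"],
    ["Box", "159900"],
]
clean = keep_most_reviewed(sample_rows, 0, 1)
print(clean)  # [['Twitter', '11667403'], ['Box', '159900']]
```

One difference from the cell above: the result is ordered by the first appearance of each app name, not by the position of the row that is finally kept.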

### Removing Non-English Apps <a name="remove"></a>

In [140]:
# Creating a function that checks whether a name uses mostly English (ASCII 0-127) characters

def eng_char(string):
    non_ASCII = 0

    for char in string:
        if ord(char) > 127:
            non_ASCII += 1

    # Allow up to three non-ASCII characters so that names containing
    # emoji or special characters (such as ™) are not discarded.
    if non_ASCII > 3:
        return False
    else:
        return True
            

In [162]:
# Using the English-language identifier function above to
# separate the English and non-English apps in both datasets.

google_non_eng_apps=[]
google_eng_apps=[]

# Note: google_data_clean has no header row, so this [1:] slice leaves out its first app.
for row in google_data_clean[1:]:
    app_name = row[0]
    if eng_char(app_name) is False:
        google_non_eng_apps.append(row)
    else:
        google_eng_apps.append(row)
        
        
apple_non_eng_apps=[]
apple_eng_apps=[] 

for row in app_store_data[1:]:
    app_name = row[1]
    if eng_char(app_name) is False:
        apple_non_eng_apps.append(row)
    else:
        apple_eng_apps.append(row)
        

#### Exploring the Non-English Apps for Google and Apple Store Data

In [161]:
print('Samples of non-english app on Google Store')
print('\n')
explore_data(google_non_eng_apps, 0, 3, True)
print('\n')
print('Samples of non-english app on Apple Store')
print('\n')
explore_data(apple_non_eng_apps, 0, 3, True)

Samples of non-english app on Google Store


['Flame - درب عقلك يوميا', 'EDUCATION', '4.6', '56065', '37M', '1,000,000+', 'Free', '0', 'Everyone', 'Education', 'July 26, 2018', '3.3', '4.1 and up']


['သိင်္ Astrology - Min Thein Kha BayDin', 'LIFESTYLE', '4.7', '2225', '15M', '100,000+', 'Free', '0', 'Everyone', 'Lifestyle', 'July 26, 2018', '4.2.1', '4.0.3 and up']


['РИА Новости', 'NEWS_AND_MAGAZINES', '4.5', '44274', '8.0M', '1,000,000+', 'Free', '0', 'Everyone', 'News & Magazines', 'August 6, 2018', '4.0.6', '4.4 and up']


Number of rows:  44
Number of columns:  13


Samples of non-english app on Apple Store


['445375097', 'Áà±Â•áËâ∫PPS -„ÄäÊ¨¢‰πêÈ¢Ç2„ÄãÁîµËßÜÂâßÁÉ≠Êí≠', '224617472', 'USD', '0.0', '14844', '0', '4.0', '0.0', '6.3.3', '17+', 'Entertainment', '38', '5', '3', '1']


['405667771', 'ËÅöÂäõËßÜÈ¢ëHD-‰∫∫Ê∞ëÁöÑÂêç‰πâ,Ë∑®ÁïåÊ≠åÁéãÂÖ®ÁΩëÁÉ≠Êí≠', '90725376', 'USD', '0.0', '7446', '8', '4.0', '4.5', '5.0.8', '12+', 'Entertainment', '24', '4',

#### Exploring the English Apps for Google and Apple Store Data

In [159]:
print('Samples of english app on Google Store')
print('\n')
explore_data(google_eng_apps, 0, 3, True)
print('\n')
print('Samples of english app on Apple Store')
print('\n')
explore_data(apple_eng_apps, 0, 3, True)

Samples of english app on Google Store


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


Number of rows:  9612
Number of columns:  13


Samples of english app on Apple Store


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of C

### Isolating the Free Apps on English App Dataset <a name="isolate"></a>

###### Our datasets contain both free and paid apps. Since we only build apps that are free to download and install, and our main source of revenue is in-app ads, we need to isolate the free apps in the English-language datasets.


In [180]:
# Exploring the heading of the dataset for reference purpose on index.

explore_data(google_store_data,0,1)
print('\n')
explore_data(app_store_data,0,1)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']




['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']




In [183]:
# Isolating the free English apps in both the Google Play and App Store data.
google_free_apps=[]
apple_free_apps=[]

for app in google_eng_apps:
    app_type = app[6]  # index 6 is the 'Type' column, not 'Price' (index 7)
    if app_type == 'Free':
        google_free_apps.append(app)
   
        
for app in apple_eng_apps:
    price = float(app[4])  # index 4 is the 'price' column
    if price == 0.0:
        apple_free_apps.append(app)
         
        
    
print('Samples of free english app on Google Store')
print('\n')
explore_data(google_free_apps, 0, 3, True)
print('\n')
print('Samples of free english app on Apple Store')
print('\n')
explore_data(apple_free_apps, 0, 3, True)   
    

Samples of free english app on Google Store


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


Number of rows:  8861
Number of columns:  13


Samples of free english app on Apple Store


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', '

***
***
## Data Analysis <a name="analysis"></a>

As mentioned in the introduction, our goal is to determine the kinds of apps that are likely to attract more users, because the number of people using our apps affects our revenue.

***To minimize risks and overhead, our validation strategy for an app idea has three steps:***

1. Build a minimal Android version of the app, and add it to Google Play.
2. If the app has a good response from users, we develop it further.
3. If the app is profitable after six months, we build an iOS version of the app and add it to the App Store.


**Because our end goal is to add the app on both Google Play and the App Store, we need to find app profiles that are successful in both markets.** 

### Determining the most common genres in each market <a name="common"></a>

In [187]:
# Inspecting the columns of each dataset to identify the ones we could use to
# generate frequency tables and determine the most common genres in each market

print('Google Store')
print('\n')
explore_data(google_store_data,0,2)
print('\n')
print('\n')
print('Apple Store')
print('\n')
explore_data(app_store_data,0,2)

Google Store


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']






Apple Store


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']




#### We begin the analysis by getting a sense of the most common genres in each market. For this, we'll build frequency tables for the prime_genre column of the App Store dataset and for the Genres and Category columns of the Google Play dataset.


**Building a frequency table for the Category column of the Google Play data and the prime_genre column of the App Store data**

In [252]:
# Creating a frequency table function that returns percentages.

def freq_table(dataset,index):
    freq_tab = {}
    total = 0
    
    # dataset[1:] assumes the first row is a header and leaves it out of the count
    for app in dataset[1:]:
        total += 1
        genre = app[index]
        
        if genre in freq_tab:
            freq_tab[genre] += 1
        else:
            freq_tab[genre] = 1
            
    # converting the dictionary value to percentage       
    freq_tab_percentage = {} 
    
    for genre in freq_tab:
        percentage = (freq_tab[genre] / total)*100
        freq_tab_percentage[genre]=percentage
        
    return freq_tab_percentage


# We want to show the frequency table in descending order. A dictionary cannot be sorted
# by value in place, so we create a function that converts it to (percentage, genre) tuples and sorts those.

def display_table(dataset,index):
    table_display = []
    table = freq_table(dataset,index)
    
    for genre in table:
        genre_value_tuple = (table[genre],genre)
        table_display.append(genre_value_tuple)
        
    sorted_table = sorted(table_display,reverse=True) # sorts in descending order of percentage
    for item in sorted_table:
        print(item[1],':',item[0])
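
For comparison, the frequency table and the descending sort can be sketched in a few lines with `collections.Counter`. This is an alternative illustration, not the notebook's implementation: `freq_table_sorted` is a hypothetical name, the sample rows are made up, and the function counts every row it is given (it does not skip a header).

```python
from collections import Counter


def freq_table_sorted(rows, index):
    """Return [(value, percentage), ...] sorted from most to least frequent."""
    counts = Counter(row[index] for row in rows)
    total = sum(counts.values())
    # most_common() yields (value, count) pairs in descending order of count
    return [(value, count / total * 100) for value, count in counts.most_common()]


# Made-up rows reduced to (name, category)
sample_rows = [
    ["App A", "GAME"],
    ["App B", "TOOLS"],
    ["App C", "GAME"],
    ["App D", "FAMILY"],
]
table = freq_table_sorted(sample_rows, 1)
for value, pct in table:
    print(value, ":", pct)
```

Because every row is counted, this variant suits header-less lists such as google_free_apps and apple_free_apps.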
        
        

In [253]:
# Examining the frequency table in percentage for Google Store dataset.
display_table(google_free_apps,1)

FAMILY : 18.90305834555919
GAME : 9.728021667983297
TOOLS : 8.46405597562352
BUSINESS : 4.593161042771697
LIFESTYLE : 3.9047511567543167
PRODUCTIVITY : 3.8934657487868187
FINANCE : 3.7016138133393524
MEDICAL : 3.532332693826882
SPORTS : 3.3969077982169056
PERSONALIZATION : 3.317909942444419
COMMUNICATION : 3.2389120866719328
HEALTH_AND_FITNESS : 3.080916375126961
PHOTOGRAPHY : 2.9454914795169844
NEWS_AND_MAGAZINES : 2.7987811759395105
SOCIAL : 2.663356280329534
TRAVEL_AND_LOCAL : 2.3360794492720913
SHOPPING : 2.245796185532107
BOOKS_AND_REFERENCE : 2.144227513824625
DATING : 1.8620923146371742
VIDEO_PLAYERS : 1.794379866832186
MAPS_AND_NAVIGATION : 1.3993905879697552
FOOD_AND_DRINK : 1.2413948764247829
EDUCATION : 1.1623970206522967
ENTERTAINMENT : 0.9592596772373322
LIBRARIES_AND_DEMO : 0.9366888613023362
AUTO_AND_VEHICLES : 0.9254034533348381
HOUSE_AND_HOME : 0.8238347816273558
WEATHER : 0.8012639656923597
EVENTS : 0.7109807019523756
PARENTING : 0.6545536621148855
COMICS : 0.62069743

***
The Google Play frequency table shows that, among the free English apps, about 19% fall in the Family category. Games account for close to 10%, followed by Tools at about 8.5%. About 4.6% of the apps are designed for Business, followed by Lifestyle at around 3.9%.
***

In [256]:
# Examining the frequency table in percentage for Apple Store dataset.
display_table(apple_free_apps,11)

Games : 58.180689226948154
Entertainment : 7.885749767153058
Photo & Video : 4.967401428127911
Education : 3.6634585532443342
Social Networking : 3.2598571872089415
Shopping : 2.607885749767153
Utilities : 2.5147469729897547
Sports : 2.1421918658801617
Music : 2.049053089102763
Health & Fitness : 2.018006830176964
Productivity : 1.7385904998447685
Lifestyle : 1.5833592052157717
News : 1.334989133809376
Travel : 1.2418503570319777
Finance : 1.11766532132878
Weather : 0.8692952499223843
Food & Drink : 0.8072027320707855
Reference : 0.55883266066439
Business : 0.5277864017385905
Book : 0.43464762496119214
Navigation : 0.18627755355479667
Medical : 0.18627755355479667
Catalogs : 0.12418503570319776


***
The App Store frequency table shows a landscape of free English apps that differs significantly from Google Play. More than half (58.18%) are games. Entertainment apps are close to 8%, followed by Photo & Video apps at close to 5%. About 3.66% of the apps are designed for Education, followed by Social Networking apps at around 3.26%.

This implies that a larger share of apps is designed for fun on the App Store than on Google Play, where a good number of apps are designed for practical purposes (Family, Tools, Business, Lifestyle, etc.).
***

**Up to this point, we have found that the App Store is dominated by apps designed for fun, while Google Play shows a more balanced landscape of both practical and for-fun apps. Now we'd like to get an idea of the kinds of apps that have the most users.**

### Most Popular Apps by Genre on the Apple Store <a name="apple"></a>

In [259]:
prime_genre_table = freq_table(apple_free_apps,11)
print(prime_genre_table)

{'Photo & Video': 4.967401428127911, 'Games': 58.180689226948154, 'Music': 2.049053089102763, 'Social Networking': 3.2598571872089415, 'Reference': 0.55883266066439, 'Health & Fitness': 2.018006830176964, 'Weather': 0.8692952499223843, 'Utilities': 2.5147469729897547, 'Travel': 1.2418503570319777, 'Shopping': 2.607885749767153, 'News': 1.334989133809376, 'Navigation': 0.18627755355479667, 'Lifestyle': 1.5833592052157717, 'Entertainment': 7.885749767153058, 'Food & Drink': 0.8072027320707855, 'Sports': 2.1421918658801617, 'Book': 0.43464762496119214, 'Finance': 1.11766532132878, 'Education': 3.6634585532443342, 'Productivity': 1.7385904998447685, 'Business': 0.5277864017385905, 'Catalogs': 0.12418503570319776, 'Medical': 0.18627755355479667}


In [270]:
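# For each genre, count the apps and compute the average number of user ratings
# (rating_count_tot, index 5).
for genre in prime_genre_table: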
    total = 0
    len_genre = 0
    for app in apple_free_apps:
        genre_app = app[11]
        if genre_app == genre:
            user_rating = float(app[5])
            total += user_rating
            len_genre += 1
            
    average = total / len_genre
            
    print(genre, 'has', len_genre, 'number of Apps, with an average user rating count of',round(average,2))
    print('\n')
    

Photo & Video has 160 number of Apps, with an average user rating count of 28441.54


Games has 1874 number of Apps, with an average user rating count of 22788.67


Music has 66 number of Apps, with an average user rating count of 57326.53


Social Networking has 106 number of Apps, with an average user rating count of 71548.35


Reference has 18 number of Apps, with an average user rating count of 74942.11


Health & Fitness has 65 number of Apps, with an average user rating count of 23298.02


Weather has 28 number of Apps, with an average user rating count of 52279.89


Utilities has 81 number of Apps, with an average user rating count of 18684.46


Travel has 40 number of Apps, with an average user rating count of 28243.8


Shopping has 84 number of Apps, with an average user rating count of 26919.69


News has 43 number of Apps, with an average user rating count of 21248.02


Navigation has 6 number of Apps, with an average user rating count of 86090.33


Lifestyle has 51 number o

***
From the analysis of the App Store data, Navigation (6 apps) has the highest average user rating count, 86090.33, followed by Reference (18 apps) with 74942.11, Social Networking (106 apps) with 71548.35, Music (66 apps) with 57326.53, and so on.
***
Next, we explore the apps and their total rating counts under Navigation, Reference, Social Networking, and Music, as well as other genres with a high average user rating count, to find the apps with the most user ratings.
***
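
The per-genre averages above come from a nested loop that rescans the whole dataset once per genre. A single-pass version can be sketched with `collections.defaultdict`. This is an illustration under stated assumptions: `average_by_genre` is a hypothetical helper, and the sample rows are reduced to three columns (name, rating count, genre) using values that appear in the App Store data.

```python
from collections import defaultdict


def average_by_genre(rows, genre_index, count_index):
    """Return {genre: (number of apps, mean rating count)} in one pass over rows."""
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for row in rows:
        genre = row[genre_index]
        totals[genre] += float(row[count_index])
        sizes[genre] += 1
    return {genre: (sizes[genre], totals[genre] / sizes[genre]) for genre in totals}


# Rows reduced to (name, rating count, genre)
sample_rows = [
    ["Waze - GPS Navigation, Maps & Real-time Traffic", "345046", "Navigation"],
    ["Google Maps - Navigation & Transit", "154911", "Navigation"],
    ["Railway Route Search", "5", "Navigation"],
    ["Bible", "985920", "Reference"],
]
summary = average_by_genre(sample_rows, 2, 1)
for genre, (n_apps, mean) in sorted(summary.items(), key=lambda kv: kv[1][1], reverse=True):
    print(genre, n_apps, round(mean, 2))
```

Even in this tiny sample, one or two very popular apps dominate a genre's mean, which is a reason to look at the individual apps before drawing conclusions from the averages.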

In [330]:
# Creating a genre_display function to explore all apps on Apple Store
# with their respective total rating count in the same category/genre.

def genre_display1(genre):
    for app in apple_free_apps:
        genres = app[11]
        total_ratings = int(app[5])
        app_name = app[1]

        if genres == genre:
            print(app_name,':',total_ratings) 
            

In [331]:
genre_display1('Navigation')

Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5


***
'Waze - GPS Navigation, Maps & Real-time Traffic' has the highest number of user ratings (345,046) in the Navigation category.
***

In [332]:
genre_display1('Reference')

Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
GUNS MODS for Minecraft PC Edition - Mods Tools : 1497
Guides for Pokémon GO - Pokemon GO News and Cheats : 826
WWDC : 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718
VPN Express : 14
Real Bike Traffic Rider Virtual Reality Glasses : 8
教えて!goo : 0
Jishokun-Japanese English Dictionary & Translator : 0


***
'Bible' has the highest number of user ratings (985,920) in the Reference category, followed by 'Dictionary.com Dictionary & Thesaurus' with 200,047 (254,222 when its iPad version is included).
***

In [333]:
genre_display1('Social Networking')

Facebook : 2974676
Pinterest : 1061624
Skype for iPhone : 373519
Messenger : 351466
Tumblr : 334293
WhatsApp Messenger : 287589
Kik : 260965
ooVoo – Free Video Call, Text and Voice : 177501
TextNow - Unlimited Text + Calls : 164963
Viber Messenger – Text & Call : 164249
Followers - Social Analytics For Instagram : 112778
MeetMe - Chat and Meet New People : 97072
We Heart It - Fashion, wallpapers, quotes, tattoos : 90414
InsTrack for Instagram - Analytics Plus More : 85535
Tango - Free Video Call, Voice and Chat : 75412
LinkedIn : 71856
Match™ - #1 Dating App. : 60659
Skype for iPad : 60163
POF - Best Dating App for Conversations : 52642
Timehop : 49510
Find My Family, Friends & iPhone - Life360 Locator : 43877
Whisper - Share, Express, Meet : 39819
Hangouts : 36404
LINE PLAY - Your Avatar World : 34677
WeChat : 34584
Badoo - Meet New People, Chat, Socialize. : 34428
Followers + for Instagram - Follower Analytics : 28633
GroupMe : 28260
Marco Polo Video Walkie Talkie : 27662
Miito

***
'Facebook' has the highest number of user ratings (2,974,676) in the Social Networking category, followed by 'Pinterest' with 1,061,624.
***

In [334]:
genre_display1('Music')

Pandora - Music & Radio : 1126879
Spotify Music : 878563
Shazam - Discover music, artists, videos & lyrics : 402925
iHeartRadio – Free Music & Radio Stations : 293228
SoundCloud - Music & Audio : 135744
Magic Piano by Smule : 131695
Smule Sing! : 119316
TuneIn Radio - MLB NBA Audiobooks Podcasts Music : 110420
Amazon Music : 106235
SoundHound Song Search & Music Player : 82602
Sonos Controller : 48905
Bandsintown Concerts : 30845
Karaoke - Sing Karaoke, Unlimited Songs! : 28606
My Mixtapez Music : 26286
Sing Karaoke Songs Unlimited with StarMaker : 26227
Ringtones for iPhone & Ringtone Maker : 25403
Musi - Unlimited Music For YouTube : 25193
AutoRap by Smule : 18202
Spinrilla - Mixtapes For Free : 15053
Napster - Top Music & Radio : 14268
edjing Mix:DJ turntable to remix and scratch music : 13580
Free Music - MP3 Streamer & Playlist Manager Pro : 13443
Free Piano app by Yokee : 13016
Google Play Music : 10118
Certified Mixtapes - Hip Hop Albums & Mixtapes : 9975
TIDAL : 7398
YouTube 

***
'Pandora - Music & Radio' has the highest number of user ratings (1,126,879) in the Music category, followed by 'Spotify Music' with 878,563.
***

In summary, the following free English apps attracted the most users on the Apple Store:
1. 'Waze - GPS Navigation, Maps & Real-time Traffic'
2. 'Bible'
3. 'Dictionary.com Dictionary & Thesaurus'
4. 'Facebook'
5. 'Pinterest'
6. 'Pandora - Music & Radio'
7. 'Spotify Music'


### Most Popular Apps by Genre on the Google Play Store <a name="google"></a>

In [335]:
category_table = freq_table(google_free_apps,1)
print(category_table)
    

{'ART_AND_DESIGN': 0.6206974382123914, 'AUTO_AND_VEHICLES': 0.9254034533348381, 'BEAUTY': 0.5981266222773953, 'BOOKS_AND_REFERENCE': 2.144227513824625, 'BUSINESS': 4.593161042771697, 'COMICS': 0.6206974382123914, 'COMMUNICATION': 3.2389120866719328, 'DATING': 1.8620923146371742, 'EDUCATION': 1.1623970206522967, 'ENTERTAINMENT': 0.9592596772373322, 'EVENTS': 0.7109807019523756, 'FINANCE': 3.7016138133393524, 'FOOD_AND_DRINK': 1.2413948764247829, 'HEALTH_AND_FITNESS': 3.080916375126961, 'HOUSE_AND_HOME': 0.8238347816273558, 'LIBRARIES_AND_DEMO': 0.9366888613023362, 'LIFESTYLE': 3.9047511567543167, 'GAME': 9.728021667983297, 'FAMILY': 18.90305834555919, 'MEDICAL': 3.532332693826882, 'SOCIAL': 2.663356280329534, 'SHOPPING': 2.245796185532107, 'PHOTOGRAPHY': 2.9454914795169844, 'SPORTS': 3.3969077982169056, 'TRAVEL_AND_LOCAL': 2.3360794492720913, 'TOOLS': 8.46405597562352, 'PERSONALIZATION': 3.317909942444419, 'PRODUCTIVITY': 3.8934657487868187, 'PARENTING': 0.6545536621148855, 'WEATHER': 0

In [336]:
for category_app in category_table:
    total = 0          # running sum of installs for this category
    len_category = 0   # number of apps found in this category

    for app in google_free_apps:
        category = app[1]
        installs = app[5]
        # Strip '+' and ',' so that a value like '1,000,000+' becomes 1000000.0
        installs = installs.replace('+', '')
        installs = float(installs.replace(',', ''))

        if category == category_app:
            total += installs
            len_category += 1

    average = round(total / len_category, 2)

    # Note: the figure printed here is the average number of installs, not a rating count.
    print(category_app, 'has', len_category, 'number of Apps, with an average user rating count of', average)
    print('\n')
    

ART_AND_DESIGN has 56 number of Apps, with an average user rating count of 2021626.79


AUTO_AND_VEHICLES has 82 number of Apps, with an average user rating count of 647317.82


BEAUTY has 53 number of Apps, with an average user rating count of 513151.89


BOOKS_AND_REFERENCE has 190 number of Apps, with an average user rating count of 8767811.89


BUSINESS has 407 number of Apps, with an average user rating count of 1712290.15


COMICS has 55 number of Apps, with an average user rating count of 817657.27


COMMUNICATION has 287 number of Apps, with an average user rating count of 38456119.17


DATING has 165 number of Apps, with an average user rating count of 854028.83


EDUCATION has 103 number of Apps, with an average user rating count of 1833495.15


ENTERTAINMENT has 85 number of Apps, with an average user rating count of 11640705.88


EVENTS has 63 number of Apps, with an average user rating count of 253542.22


FINANCE has 328 number of Apps, with an average user rating count o

***
From the analysis, COMMUNICATION (287 apps) has the highest average number of installs in the Google data set at 38,456,119.17, followed by VIDEO_PLAYERS (159 apps) with an average of 24,727,872 and SOCIAL (236 apps) with an average of 23,253,652.13. The output above labels this figure "average user rating count", but it is computed from the installs column.
***
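Because the loop above prints categories in dictionary order, the ranking has to be read off by eye. A minimal sketch of one way to rank categories by average installs follows. It assumes the same column layout as the code above (category at index 1, installs at index 5); the function names and the sample rows are hypothetical and made up for illustration.

```python
def parse_installs(text):
    """Turn an install string such as '1,000,000+' into an int."""
    return int(text.replace('+', '').replace(',', ''))


def average_installs_by_category(rows, category_index=1, installs_index=5):
    """Return {category: average installs}, reading each row only once."""
    totals = {}
    counts = {}
    for row in rows:
        category = row[category_index]
        installs = parse_installs(row[installs_index])
        totals[category] = totals.get(category, 0) + installs
        counts[category] = counts.get(category, 0) + 1
    return {c: round(totals[c] / counts[c], 2) for c in totals}


# Hypothetical sample rows (not taken from the data set).
sample_rows = [
    ['App A', 'COMMUNICATION', '', '', '', '1,000,000+'],
    ['App B', 'COMMUNICATION', '', '', '', '5,000,000+'],
    ['App C', 'SOCIAL', '', '', '', '100,000+'],
    ['App D', 'EVENTS', '', '', '', '10,000+'],
]

averages = average_installs_by_category(sample_rows)

# Sort categories from the highest to the lowest average.
ranked = sorted(averages.items(), key=lambda pair: pair[1], reverse=True)
for category, average in ranked:
    print(category, ':', average)
```

Accumulating totals and counts in dictionaries avoids re-scanning the whole data set once per category, which is what the nested loop above does.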
Next, we explore the individual apps and their install counts under COMMUNICATION, VIDEO_PLAYERS and SOCIAL, and under the other genres with a high average number of installs, to identify which apps have the most installs.
***

In [350]:
# Creating a genre_display2 function to list every app on Google Play Store
# in a given category/genre together with its install count.

def genre_display2(genre):
    for app in google_free_apps:
        category = app[1]
        app_name = app[0]
        installs = app[5]

        if category == genre:
            print(app_name,':-----------:',installs) 
            

In [352]:
genre_display2('COMMUNICATION')

WhatsApp Messenger :-----------: 1,000,000,000+
Messenger for SMS :-----------: 10,000,000+
My Tele2 :-----------: 5,000,000+
imo beta free calls and text :-----------: 100,000,000+
Contacts :-----------: 50,000,000+
Call Free – Free Call :-----------: 5,000,000+
Web Browser & Explorer :-----------: 5,000,000+
Browser 4G :-----------: 10,000,000+
MegaFon Dashboard :-----------: 10,000,000+
ZenUI Dialer & Contacts :-----------: 10,000,000+
Cricket Visual Voicemail :-----------: 10,000,000+
TracFone My Account :-----------: 1,000,000+
Xperia Link™ :-----------: 10,000,000+
TouchPal Keyboard - Fun Emoji & Android Keyboard :-----------: 10,000,000+
Skype Lite - Free Video Call & Chat :-----------: 5,000,000+
My magenta :-----------: 1,000,000+
Android Messages :-----------: 100,000,000+
Google Duo - High Quality Video Calls :-----------: 500,000,000+
Seznam.cz :-----------: 1,000,000+
Antillean Gold Telegram (original version) :-----------: 100,000+
AT&T Visual Voicemail :-----------: 

'WhatsApp Messenger', Skype, Gmail, Hangouts and Google Chrome each have one billion or more installs on the Google Play Store.
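Rather than scanning the printed list by eye, the apps at or above a given install level can be filtered directly. The sketch below assumes the same column layout as `genre_display2` (name at index 0, category at index 1, installs at index 5); the helper `apps_with_min_installs` is hypothetical, and the sample rows reuse a few names and install values from the outputs in this section, with empty placeholders for the other columns.

```python
def parse_installs(text):
    """Turn an install string such as '1,000,000+' into an int."""
    return int(text.replace('+', '').replace(',', ''))


def apps_with_min_installs(rows, genre, minimum,
                           name_index=0, category_index=1, installs_index=5):
    """Return the names of apps in `genre` with at least `minimum` installs."""
    matches = []
    for row in rows:
        in_genre = row[category_index] == genre
        if in_genre and parse_installs(row[installs_index]) >= minimum:
            matches.append(row[name_index])
    return matches


# Names and install values copied from the outputs in this section;
# the empty strings are placeholders for columns not used here.
sample_rows = [
    ['WhatsApp Messenger', 'COMMUNICATION', '', '', '', '1,000,000,000+'],
    ['Google Duo - High Quality Video Calls', 'COMMUNICATION', '', '', '', '500,000,000+'],
    ['Contacts', 'COMMUNICATION', '', '', '', '50,000,000+'],
    ['YouTube', 'VIDEO_PLAYERS', '', '', '', '1,000,000,000+'],
]

billion_plus = apps_with_min_installs(sample_rows, 'COMMUNICATION', 1_000_000_000)
print(billion_plus)
```

The comparison uses `>=` because a value such as '1,000,000,000+' is a lower bound on the true number of installs.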

In [353]:
genre_display2('VIDEO_PLAYERS')

YouTube :-----------: 1,000,000,000+
All Video Downloader 2018 :-----------: 1,000,000+
Video Downloader :-----------: 10,000,000+
HD Video Player :-----------: 1,000,000+
Iqiyi (for tablet) :-----------: 1,000,000+
Video Player All Format :-----------: 10,000,000+
Motorola Gallery :-----------: 100,000,000+
Free TV series :-----------: 100,000+
Video Player All Format for Android :-----------: 500,000+
VLC for Android :-----------: 100,000,000+
Code :-----------: 10,000,000+
Vote for :-----------: 50,000,000+
XX HD Video downloader-Free Video Downloader :-----------: 1,000,000+
OBJECTIVE :-----------: 1,000,000+
Music - Mp3 Player :-----------: 10,000,000+
HD Movie Video Player :-----------: 1,000,000+
YouCut - Video Editor & Video Maker, No Watermark :-----------: 5,000,000+
Video Editor,Crop Video,Movie Video,Music,Effects :-----------: 1,000,000+
YouTube Studio :-----------: 10,000,000+
video player for android :-----------: 10,000,000+
Vigo Video :-----------: 50,000,000+
Google P

YouTube has one billion or more installs, and MX Player has 500 million or more installs, on the Google Play Store.

In [354]:
genre_display2('SOCIAL')

Facebook :-----------: 1,000,000,000+
Facebook Lite :-----------: 500,000,000+
Tumblr :-----------: 100,000,000+
Social network all in one 2018 :-----------: 100,000+
Pinterest :-----------: 100,000,000+
TextNow - free text + calls :-----------: 10,000,000+
Google+ :-----------: 1,000,000,000+
The Messenger App :-----------: 1,000,000+
Messenger Pro :-----------: 1,000,000+
Free Messages, Video, Chat,Text for Messenger Plus :-----------: 1,000,000+
Telegram X :-----------: 5,000,000+
The Video Messenger App :-----------: 100,000+
Jodel - The Hyperlocal App :-----------: 1,000,000+
Hide Something - Photo, Video :-----------: 5,000,000+
Love Sticker :-----------: 1,000,000+
Web Browser & Fast Explorer :-----------: 5,000,000+
LiveMe - Video chat, new friends, and make money :-----------: 10,000,000+
VidStatus app - Status Videos & Status Downloader :-----------: 5,000,000+
Love Images :-----------: 1,000,000+
Web Browser ( Fast & Secure Web Explorer) :-----------: 500,000+
SPARK - Live r

Facebook and Google+ each have one billion or more installs on the Google Play Store.

In summary, the following free English apps attracted the most users on the Google Play Store:
1. WhatsApp Messenger
2. Skype
3. Gmail
4. Hangouts
5. Google Chrome
6. YouTube
7. MX Player
8. Facebook
9. Google+


***
***
## Conclusions <a name="conclusion"></a>

In this project, data on Apple Store and Google Play Store mobile apps was analysed with the goal of recommending an app profile that can be profitable in both markets.
To reach this goal, we identified the most popular apps by genre on the Apple Store and on the Google Play Store.

On the Apple Store, the Navigation, Reference, Social Networking and Music genres attract the most users. On the Google Play Store, the Communication, Video Players and Social genres attract the most users.

To design an app that attracts users on both platforms, we need to combine the features of the most popular apps by genre on the Apple Store and the Google Play Store.
On that basis, I would recommend that the developers design an eBook reader app with the following features:
1. The ability to play an audio version of any eBook uploaded to the reader.
2. The ability to visualize the concepts, ideas or message of each chapter in comic/illustrated form through a video version of the eBook.
3. A dashboard where readers can track their reading progress and add new friends.
4. A chat forum where readers can meet, discuss and share ideas.

These features draw on the Reference, Music, Video Players and Social genres respectively, which are among the most popular genres across the Apple Store and the Google Play Store.

***
***
## Limitations <a name="limitation"></a>

The data in this project was cleaned and analysed without external libraries such as Pandas and Matplotlib.

***
***
***
[Back to Table of Contents](#begin)