# Attractive App Profiles for the App Store & Google Play Markets

This project aims to identify which apps appeal to users most and why.

We take the role of a data analyst at a company that builds free Android and iOS mobile apps and whose main source of revenue is in-app advertising. The number of users of each app therefore drives the company's revenue: the more users who see and engage with the ads, the better. The goal of this project is to analyse data to help the company's developers understand which types of apps are likely to attract more users.

## Opening & Exploring Datasets
Before importing the datasets, we'll create a reusable function for quick exploration. It prints a chosen slice of rows and, if requested, the number of rows and columns in the dataset.

In [1]:
def explore_data(dataset, start, end, rows_and_columns = False, header = False):
    dataset_slice = dataset[start:end] # slices data using inputted integer values
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row
        
    rows = len(dataset)
    columns = len(dataset[0])
        
    if header:
        rows -= 1 # if dataset contains header, ignore header row
        
    if rows_and_columns:
        print('Number of rows:', rows)
        print('Number of columns:', columns)

We can now import the datasets and save them as 2D-lists. The datasets used in this analysis can be found in the links below:
* [Apple Store Dataset](https://www.kaggle.com/datasets/ramamet4/app-store-apple-data-set-10k-apps)
* [Google Playstore Dataset](https://www.kaggle.com/datasets/lava18/google-play-store-apps)

In [2]:
opened_apple = open("Datasets/AppleStore.csv", encoding = 'utf8')
opened_google = open("Datasets/googleplaystore.csv", encoding = 'utf8')
from csv import reader
read_apple = reader(opened_apple)
read_google = reader(opened_google)
apple_data = list(read_apple)
google_data = list(read_google)
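
The cell above leaves both file handles open. As a minimal sketch, the same import could be wrapped in a context manager so that each file is closed as soon as it has been read. `load_csv` is a hypothetical helper, not part of the original notebook:

```python
from csv import reader


def load_csv(path, encoding='utf8'):
    """Read a CSV file into a list of rows (each row is a list of strings)."""
    # newline='' is what the csv module documentation recommends for file objects
    with open(path, encoding=encoding, newline='') as opened_file:
        return list(reader(opened_file))
```

With this helper, `apple_data = load_csv("Datasets/AppleStore.csv")` would produce the same list of lists as the cell above.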

After importing the datasets, exploration of the data can begin.

We begin by printing the first few rows of each dataset as well as the number of rows and columns each dataset has.

In [3]:
explore_data(apple_data, 0, 4, True)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 7198
Number of columns: 16


In [4]:
explore_data(google_data, 0, 4, True)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10842
Number of columns: 13


## Cleaning of Datasets
### Deletion of Incorrect Data

The discussion section for the Google Play dataset describes an error in row 10,472, where a column shift has occurred. Because our list still includes the header row, that entry sits at index 10473. Let's check whether the error can be identified.

In [5]:
print(google_data[10473])

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


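As described in the discussion, a column shift has occurred: the `Category` value is missing, so the row has 12 values instead of 13 and every later value sits one column to the left (the `Rating` column, for example, holds '19'). We therefore delete this row.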

In [6]:
del google_data[10473]
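
A column shift like this one changes the length of the row, so a more general check is to compare every row's length with the header's. The sketch below assumes the first row is the header; `find_malformed_rows` and the sample rows are illustrative only:

```python
def find_malformed_rows(dataset):
    """Return the indices of rows whose length differs from the header row."""
    expected = len(dataset[0])  # assumes dataset[0] is the header
    return [i for i, row in enumerate(dataset) if len(row) != expected]


# Tiny illustrative sample (not the real dataset): the last row lacks one value
sample = [
    ['App', 'Category', 'Rating'],
    ['Box', 'BUSINESS', '4.2'],
    ['Photo Frame', '1.9'],
]
print(find_malformed_rows(sample))  # [2]
```

Run on the full `google_data` before the deletion, the result would include index 10473, since that row has 12 values while the header has 13.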

This was the only error reported in the discussion sections of the two datasets, so we can move on to identifying duplicate entries.

### Identification of Duplicate Data

In [7]:
def duplicate_check(dataset):
    duplicate_apps = []
    unique_apps = []
    
    for app in dataset:
        name = app[0]
        if name in unique_apps:
            duplicate_apps.append(name)
        else:
            unique_apps.append(name)
    
    print('Number of duplicate apps:', len(duplicate_apps))
    print('\n')
    print('Examples of duplicate apps:', duplicate_apps[:15])

Here, we've created a function that loops through the dataset passed to it and, for each row, checks whether the value in the first column has already been seen. Note that the first column is the app name in the Google Play data but the app `id` in the App Store data. We can now apply this function to both datasets to determine whether any duplicates are present.
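
The test `name in unique_apps` scans a list, so it gets slower as the list grows. As a sketch of one alternative, a set can handle the membership test while lists keep the original order. `count_duplicates` is a hypothetical name, and it returns its results instead of printing them:

```python
def count_duplicates(dataset, name_index=0):
    """Return (duplicates, unique) name lists, using a set for fast lookups."""
    seen = set()
    duplicate_apps = []
    unique_apps = []
    for app in dataset:
        name = app[name_index]
        if name in seen:
            duplicate_apps.append(name)
        else:
            seen.add(name)
            unique_apps.append(name)
    return duplicate_apps, unique_apps


# Illustrative rows only
sample = [['Box'], ['Slack'], ['Box'], ['Box']]
duplicates, unique = count_duplicates(sample)
print(len(duplicates), duplicates)  # 2 ['Box', 'Box']
```

The `name_index` parameter is an assumption added here so that the same function could be pointed at the `track_name` column (index 1) of the App Store data.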

In [8]:
duplicate_check(google_data)

Number of duplicate apps: 1181


Examples of duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack', 'FreshBooks Classic', 'Insightly CRM', 'QuickBooks Accounting: Invoicing & Expenses', 'HipChat - Chat Built for Teams', 'Xero Accounting Software']


In [9]:
duplicate_check(apple_data)

Number of duplicate apps: 0


Examples of duplicate apps: []


From the two outputs, we see that only the Google Play dataset contains duplicates, so these entries need to be removed. The function only checked whether app names were repeated, however; the rest of the data in those rows may differ, so we print a few duplicated apps in full.

In [10]:
def print_duplicates(name, dataset = google_data):
    for app in dataset:
        app_name = app[0]
        if app_name == name:
            print(app)

In [11]:
print_duplicates('Box')

['Box', 'BUSINESS', '4.2', '159872', 'Varies with device', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Box', 'BUSINESS', '4.2', '159872', 'Varies with device', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Box', 'BUSINESS', '4.2', '159872', 'Varies with device', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 31, 2018', 'Varies with device', 'Varies with device']


In [12]:
print_duplicates('Google Ads')

['Google Ads', 'BUSINESS', '4.3', '29313', '20M', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 30, 2018', '1.12.0', '4.0.3 and up']
['Google Ads', 'BUSINESS', '4.3', '29313', '20M', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 30, 2018', '1.12.0', '4.0.3 and up']
['Google Ads', 'BUSINESS', '4.3', '29331', '20M', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 30, 2018', '1.12.0', '4.0.3 and up']


In [13]:
print_duplicates('Insightly CRM')

['Insightly CRM', 'BUSINESS', '3.8', '1383', '51M', '100,000+', 'Free', '0', 'Everyone', 'Business', 'July 12, 2018', '3.24.1', '5.0 and up']
['Insightly CRM', 'BUSINESS', '3.8', '1383', '51M', '100,000+', 'Free', '0', 'Everyone', 'Business', 'July 12, 2018', '3.24.1', '5.0 and up']


In [14]:
print_duplicates('FreshBooks Classic')

['FreshBooks Classic', 'BUSINESS', '4.1', '1802', '26M', '100,000+', 'Free', '0', 'Everyone', 'Business', 'April 18, 2018', '1.7.14', '4.2 and up']
['FreshBooks Classic', 'BUSINESS', '4.1', '1802', '26M', '100,000+', 'Free', '0', 'Everyone', 'Business', 'April 18, 2018', '1.7.14', '4.2 and up']


In [15]:
print_duplicates('Zenefits')

['Zenefits', 'BUSINESS', '4.2', '296', '14M', '50,000+', 'Free', '0', 'Everyone', 'Business', 'June 15, 2018', '3.2.1', '4.1 and up']
['Zenefits', 'BUSINESS', '4.2', '296', '14M', '50,000+', 'Free', '0', 'Everyone', 'Business', 'June 15, 2018', '3.2.1', '4.1 and up']


In most cases above, the repeated rows are exact copies of each other. For the Google Ads app, however, the rows differ in the fourth position, which holds the number of reviews. This difference suggests that the data was collected at different times.

We can use this information to build a criterion for removing the duplicates. The higher the number of reviews, the more recent the data should be. Rather than removing the duplicate entries randomly, we'll only keep the row with the highest number of reviews and remove the other entries for any given app.

### Deletion of Duplicate Data

We will first build a dictionary of the apps and their highest number of reviews. Its length should equal the size of the dataset minus the number of duplicates identified in the previous section. To do this, the following steps will be taken:

* Instantiate a dictionary that will store each app alongside its maximum review count.
* Loop through the dataset (skipping the header) and read the name and review count of each app.
* If the app is already in the dictionary and its review count is greater than the stored one, update the stored value.
* If the app is not yet in the dictionary, add it with its review count.

In [16]:
reviews_max = {}
for app in google_data[1:]:
    name = app[0]
    n_reviews = float(app[3])
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    if name not in reviews_max:
        reviews_max[name] = n_reviews

print(len(reviews_max))

9659


After deleting the faulty row and excluding the header, the dataset holds 10,840 rows, and `10840 - 1181` is equal to 9659. We have therefore correctly identified each unique app in the dataset and stored it in a dictionary alongside its highest number of reviews. We can now begin creating a new dataset without any duplicated data.

In [17]:
google_clean = []
already_added = []

for app in google_data[1:]:
    name = app[0]
    n_reviews = float(app[3])
    
    if n_reviews == reviews_max[name] and name not in already_added:
        google_clean.append(app)
        already_added.append(name)

Above, we have done the following:
* Created two empty lists: `google_clean` for the cleaned rows and `already_added` for the names of apps already kept.
* For each row, checked whether the app's review count equals the maximum stored in the `reviews_max` dictionary.
* Also checked that the app's name is not yet in `already_added`. This is needed because some duplicates share the same maximum review count, and without the check each of them would be added.
* If both conditions are `True`, appended the row to `google_clean` and the name to `already_added`.
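
The two passes above could also be folded into one by storing the best row per app directly in a dictionary. This is only a sketch of that idea: `keep_most_reviewed` is a hypothetical helper, the sample rows are shortened for illustration, and the row order of its result follows the first appearance of each name, so it may differ from the order of `google_clean`.

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep one row per app name: the first row with the highest review count."""
    best = {}  # name -> row; dicts preserve insertion order (Python 3.7+)
    for row in rows:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        if name not in best or n_reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Shortened illustrative rows: name, category, rating, reviews
sample = [
    ['Google Ads', 'BUSINESS', '4.3', '29313'],
    ['Box', 'BUSINESS', '4.2', '159872'],
    ['Google Ads', 'BUSINESS', '4.3', '29331'],
    ['Box', 'BUSINESS', '4.2', '159872'],
]
deduplicated = keep_most_reviewed(sample)
print(deduplicated)
```

Because a row only replaces the stored one when its review count is strictly greater, exact duplicates are kept once without a separate `already_added` list.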

In [18]:
explore_data(google_clean,0,4,True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


Number of rows: 9659
Number of columns: 13


### Removal of Non-English Apps
As the company develops its apps in English, we are only concerned with analysing apps designed for an English-speaking audience. Apps in other languages therefore need to be removed.

First, we will create a function that identifies whether a string contains characters that are not commonly used in English text. The characters commonly used in English text (letters, digits, punctuation) all belong to ASCII, whose codes range from 0 to 127. We can use this to flag app names containing characters outside that range.

In [19]:
def check_english(name):
    for character in name:
        if ord(character) > 127: # ord() returns the character's Unicode code point (same as ASCII for 0-127)
            return False 
    
    return True

In [20]:
check_english('Instagram')

True

In [21]:
check_english('爱奇艺PPS -《欢乐颂2》电视剧热播')

False

In [22]:
check_english('Docs To Go™ Free Office Suite')

False

In [23]:
check_english('Instachat 😜')

False

After inspection, it appears the function is not yet complete, as two apps with English names were flagged as `False`. This is due to the presence of ™ and an emoji: emojis and some special characters fall outside the ASCII range of 0-127.
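
To see why these names are rejected, it can help to print the code points of a few characters. The sample characters below are chosen only for illustration:

```python
samples = ['a', 'Z', '™', '😜', '爱']

# Map each character to its Unicode code point
code_points = {character: ord(character) for character in samples}

for character, value in code_points.items():
    # The last column shows whether the character falls outside ASCII
    print(character, value, value > 127)
```

The letters have code points below 128, while ™ (8482), the emoji (128540) and the Chinese character (29233) all lie far above the ASCII range.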

If we used the function as it stands, we'd lose useful data, since many English apps would be incorrectly labeled as non-English. To minimize data loss, we'll only remove an app if its name has more than three characters with code points outside the ASCII range. This means all English apps with up to three emoji or other special characters will still be labeled as English. Our filter function is still not perfect, but it should be fairly effective.


In [24]:
def check_english(name):
    count = 0
    for character in name:
        if ord(character) > 127:
            count += 1
        
        if count > 3:
            return False
    
    return True

In [25]:
check_english('Docs To Go™ Free Office Suite')

True

In [26]:
check_english('Instachat 😜')

True

In [27]:
check_english('爱奇艺PPS -《欢乐颂2》电视剧热播')

False

The function now behaves as intended, so we can use it to filter out non-English apps from both datasets. Note that `apple_data` still includes its header row, which passes the check and is carried into `apple_eng`.

In [28]:
google_eng = []
apple_eng = []

for app in google_clean:
    app_name = app[0]
    
    if check_english(app_name):
        google_eng.append(app)

for app in apple_data:
    app_name = app[1]
    
    if check_english(app_name):
        apple_eng.append(app)

    

In [29]:
explore_data(google_eng, 0, 5, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up']


Number of rows: 9614
Number of columns: 13


In [30]:
explore_data(apple_eng, 0, 5, True, True)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


Number of rows: 6183
Number of columns: 16


### Isolation of Free Apps
As mentioned earlier, the company only builds apps that are free to download and install, and their main source of revenue consists of in-app ads. The datasets contain both free and non-free apps; we'll need to isolate only the free apps for the analysis.

To do this, we will loop through both datasets, identify apps that are free, and append them to a new list.

We use `ord` to get the code of the first character of the price. As the digits 0-9 have ASCII codes 48-57, a first character outside this range tells us that a currency sign is likely present. The sign must be removed before the string can be converted to a float.

In [31]:
free_google = []
free_apple = []

for app in google_eng:
    if 47 < ord(app[7][0]) < 58: # Checks if price contains currency sign
        price = float(app[7])
    else:
        price = float(app[7][1:]) # Removes currency sign if present
        
    if price == 0:
        free_google.append(app)

for app in apple_eng[1:]:
    price = float(app[4])
    
    if price == 0:
        free_apple.append(app)
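
The price handling above could also be pulled into a small helper. The sketch below assumes that any currency sign is a leading `$`; `parse_price` is a hypothetical name, and the function would still raise `ValueError` on text that is not a number:

```python
def parse_price(text):
    """Convert a price string such as '0', '0.0' or '$4.99' to a float."""
    # Assumption: any currency sign is a leading '$'
    return float(text.strip().lstrip('$'))


for raw in ['0', '0.0', '$4.99']:
    print(raw, '->', parse_price(raw))
```

With such a helper, both loops could use the same test, for example `parse_price(app[7]) == 0` for Google Play and `parse_price(app[4]) == 0` for the App Store.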

In [32]:
explore_data(free_google, 0, 5, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up']


Number of rows: 8864
Number of columns: 13


In [33]:
explore_data(free_apple, 0,5, True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['284035177', 'Pandora - Music & Radio', '130242560', 'USD', '0.0', '1126879', '3594', '4.0', '4.5', '8.4.1', '12+', 'Music', '37', '4', '1', '1']


Number of rows: 3222
Number of columns: 16


## Data Analysis
As mentioned previously, the goal is to determine the kinds of apps that are likely to attract more users because the number of people using the apps affects the company's revenue.

To minimise risks and overhead, the validation strategy for an app idea has three steps:
* Build a minimal Android version of the app, and add it to Google Play.
* If the app has a good response from users, develop it further. 
* If the app is profitable after six months, build an iOS version of the app and add it to the App Store.

Because the end goal is to add the app on both Google Play and the App Store, we need to find app profiles that are successful in both markets.

## Identifying Most Common Genres
The first analysis we will perform will be determining the most common genres in each market. From examining the two datasets, we can conclude that we'll need to build a frequency table for the `prime_genre` column of the App Store data set, and for the `Genres` and `Category` columns of the Google Play data set. 

We'll build two functions that can be used to analyse the frequency tables:
* One function to generate frequency tables that show percentages
* Another function to display the percentages in descending order

In [34]:
def freq_table(dataset, index):
    table = {}
    for row in dataset:
        column = row[index]
        
        if column in table:
            table[column] += 1
        else:
            table[column] = 1
            
            
    total = sum(table.values())
    for key in table:
        percentage = (table[key] / total) * 100
        table[key] = percentage    
            
    return table

In [35]:
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
        
    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])

In [36]:
apple_genre_table = display_table(free_apple, 11)

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


In [37]:
google_genre_table = display_table(free_google, 9)

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Finance : 3.7003610108303246
Medical : 3.531137184115524
Sports : 3.463447653429603
Personalization : 3.3167870036101084
Communication : 3.2378158844765346
Action : 3.1024368231046933
Health & Fitness : 3.0798736462093865
Photography : 2.944494584837545
News & Magazines : 2.7978339350180503
Social : 2.6624548736462095
Travel & Local : 2.3240072202166067
Shopping : 2.2450361010830324
Books & Reference : 2.1435018050541514
Simulation : 2.0419675090252705
Dating : 1.861462093862816
Arcade : 1.8501805054151623
Video Players & Editors : 1.7712093862815883
Casual : 1.7599277978339352
Maps & Navigation : 1.3989169675090252
Food & Drink : 1.2409747292418771
Puzzle : 1.128158844765343
Racing : 0.9927797833935018
Role Playing : 0.9363718411552346
Libraries & Demo : 0.9363718411552346
Auto & Vehicles : 0.9250902527075

In [38]:
google_category_table = display_table(free_google, 1)

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

From the frequency tables produced, we can see that 58% of the free English apps on the App Store are Games. This shows a major disparity in this market, as the second most common genre is Entertainment at only 7.9%. Most notably, 4 of the top 5 genres on the App Store are for entertainment purposes, with the only practical genre being Education at 3.7%. Practical apps appear more often further down the list. Given the gap between Games and every other genre, Games would be a clear favourite for an app profile on the App Store. However, a large number of apps in a genre does not necessarily imply a large number of users, so further analysis is required.

In the frequency tables produced from the Google Play dataset, the most common genres are predominantly practical apps, revealing a more balanced landscape than on the App Store. The number one category is Family at 19%, with Games second at 10%, so Games again sit near the top. It is important to note, however, that Tools is the number one genre among the free English apps on Google Play at 8.4%, as well as the third most common category at 8.5%.

From this analysis, Games stand out in both markets: they are by far the most common genre on the App Store and the second most common category on Google Play.

## Identifying Most Popular Genres
### App Store
One way to find out what genres are the most popular is to calculate the average number of installs for each app genre. For the Google Play data set, this information can be found in the `Installs` column, but this information is missing for the App Store data set. As a workaround, we can take the total number of user ratings as a proxy, which can be found in the `rating_count_tot` column.

We can start by calculating the average number of user ratings per app genre on the App Store. To do that, we'll need to do the following:
* Isolate the apps of each genre
* Add up the user ratings for the apps of that genre
* Divide the sum by the number of apps belonging to that genre

To calculate the average number of user ratings for each genre, we'll use a nested loop.

In [39]:
temp_list = [] # Will store average number of user ratings for each genre

for genre in freq_table(free_apple,11):
    total = 0 # Total number of user ratings for each genre
    len_genre = 0 # Number of apps in each genre
    
    for app in free_apple:
        genre_app = app[11]
        
        if genre_app == genre:
            total += float(app[5])
            len_genre += 1
    
    avg_rat = total / len_genre
    temp_list.append([avg_rat,genre])

apple_genre_avg_user = sorted(temp_list, reverse = True) # Sorts list of genres in descending order of number of user ratings
for entry in apple_genre_avg_user:
    print(entry[1] + ':' + str(entry[0]))

Navigation:86090.33333333333
Reference:74942.11111111111
Social Networking:71548.34905660378
Music:57326.530303030304
Weather:52279.892857142855
Book:39758.5
Food & Drink:33333.92307692308
Finance:31467.944444444445
Photo & Video:28441.54375
Travel:28243.8
Shopping:26919.690476190477
Health & Fitness:23298.015384615384
Sports:23008.898550724636
Games:22788.6696905016
News:21248.023255813954
Productivity:21028.410714285714
Utilities:18684.456790123455
Lifestyle:16485.764705882353
Entertainment:14029.830708661417
Business:7491.117647058823
Education:7003.983050847458
Catalogs:4004.0
Medical:612.0
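
The nested loop used for these averages rescans the whole dataset once per genre. As a sketch of an alternative, the totals and counts can be accumulated in a single pass. `average_by_group` is a hypothetical helper and the sample rows are made up for illustration:

```python
def average_by_group(dataset, group_index, value_index):
    """Return {group: mean of the value column} using one pass over the rows."""
    totals = {}
    counts = {}
    for row in dataset:
        group = row[group_index]
        totals[group] = totals.get(group, 0.0) + float(row[value_index])
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}


# Illustrative rows only: name, genre, rating count
sample = [
    ['App A', 'Navigation', '300'],
    ['App B', 'Navigation', '100'],
    ['App C', 'Reference', '50'],
]
averages = average_by_group(sample, 1, 2)
for genre, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(genre + ':' + str(avg))
```

For the App Store data, the equivalent call would be `average_by_group(free_apple, 11, 5)`.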


The sorted averages show each genre present on the App Store alongside its average number of user ratings. The results differ completely from what we initially saw when viewing the most common genres on the App Store. Practical genres dominate this table, making up 7 of the top 10. Social Networking, Music, and Book are the only entertainment genres in the top 10. Most interestingly, Games sit in 14th position out of the 23 genres, placing them in the bottom 10 of this list, a stark contrast to the earlier analysis. As far as the App Store is concerned, an app profile in the Games genre may therefore not be the best choice.

Furthermore, Social Networking is the only genre present in the top 5 of both the most common genres on the App Store and the genres with the highest average number of user ratings. A Social Networking app might therefore seem the best-suited profile for the App Store. However, a profile that matches the most common genres may not be beneficial, because that market may be too saturated. As a result, the latter table can be deemed of higher analytical importance. Navigation, Reference, and Social Networking all average above 70,000 user ratings, so all three may be considered as potential app profiles, with Music and Weather as alternatives. Understanding the market within each genre is as important as the overall picture, as we need to know whether these markets are penetrable.

In [40]:
for app in free_apple:
    if app[11] == 'Navigation':
        print(app[1], ':', app[5])

Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5


This shows that the Navigation market on the App Store is dominated by Waze and Google Maps, so it may be difficult to penetrate.
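
One way to put a number on this dominance is the share of a genre's ratings held by its top apps. The sketch below uses the six Navigation rating counts printed above; `top_share` is a hypothetical helper and assumes the total is not zero:

```python
def top_share(ratings, k=2):
    """Fraction of all ratings held by the k most-rated apps."""
    ordered = sorted(ratings, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Rating counts copied from the Navigation output above
navigation = [345046, 154911, 12811, 3582, 187, 5]
print(round(top_share(navigation), 3))  # 0.968
```

Waze and Google Maps together hold roughly 97% of the ratings in this genre, which is why the genre's high average says little about a typical Navigation app.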

In [41]:
for app in free_apple:
    if app[11] == 'Reference':
        print(app[1], ':', app[5])

Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
GUNS MODS for Minecraft PC Edition - Mods Tools : 1497
Guides for Pokémon GO - Pokemon GO News and Cheats : 826
WWDC : 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718
VPN Express : 14
Real Bike Traffic Rider Virtual Reality Glasses : 8
教えて!goo : 0
Jishokun-Japanese English Dictionary & Translator : 0


Here, the Reference market is led by the Bible app and dictionary apps. However, this market appears more penetrable, with many other apps receiving over 1,000 user ratings and several receiving over 10,000.

In [42]:
for app in free_apple:
    if app[11] == 'Social Networking':
        print(app[1], ':', app[5])

Facebook : 2974676
Pinterest : 1061624
Skype for iPhone : 373519
Messenger : 351466
Tumblr : 334293
WhatsApp Messenger : 287589
Kik : 260965
ooVoo – Free Video Call, Text and Voice : 177501
TextNow - Unlimited Text + Calls : 164963
Viber Messenger – Text & Call : 164249
Followers - Social Analytics For Instagram : 112778
MeetMe - Chat and Meet New People : 97072
We Heart It - Fashion, wallpapers, quotes, tattoos : 90414
InsTrack for Instagram - Analytics Plus More : 85535
Tango - Free Video Call, Voice and Chat : 75412
LinkedIn : 71856
Match™ - #1 Dating App. : 60659
Skype for iPad : 60163
POF - Best Dating App for Conversations : 52642
Timehop : 49510
Find My Family, Friends & iPhone - Life360 Locator : 43877
Whisper - Share, Express, Meet : 39819
Hangouts : 36404
LINE PLAY - Your Avatar World : 34677
WeChat : 34584
Badoo - Meet New People, Chat, Socialize. : 34428
Followers + for Instagram - Follower Analytics : 28633
GroupMe : 28260
Marco Polo Video Walkie Talkie : 27662
Miitomo : 2

Again, this is a market with dominant players: Facebook, Pinterest, Skype, Messenger and Tumblr all have review counts over 300,000. Like the Reference market, many other apps also achieve high review counts. However, so many apps achieve this that the market looks very saturated. Therefore, this may not be the most penetrable market for the company's app.

In [43]:
for app in free_apple:
    if app[11] == 'Music':
        print(app[1], ':', app[5])

Pandora - Music & Radio : 1126879
Spotify Music : 878563
Shazam - Discover music, artists, videos & lyrics : 402925
iHeartRadio – Free Music & Radio Stations : 293228
SoundCloud - Music & Audio : 135744
Magic Piano by Smule : 131695
Smule Sing! : 119316
TuneIn Radio - MLB NBA Audiobooks Podcasts Music : 110420
Amazon Music : 106235
SoundHound Song Search & Music Player : 82602
Sonos Controller : 48905
Bandsintown Concerts : 30845
Karaoke - Sing Karaoke, Unlimited Songs! : 28606
My Mixtapez Music : 26286
Sing Karaoke Songs Unlimited with StarMaker : 26227
Ringtones for iPhone & Ringtone Maker : 25403
Musi - Unlimited Music For YouTube : 25193
AutoRap by Smule : 18202
Spinrilla - Mixtapes For Free : 15053
Napster - Top Music & Radio : 14268
edjing Mix:DJ turntable to remix and scratch music : 13580
Free Music - MP3 Streamer & Playlist Manager Pro : 13443
Free Piano app by Yokee : 13016
Google Play Music : 10118
Certified Mixtapes - Hip Hop Albums & Mixtapes : 9975
TIDAL : 7398
YouTube Mu

The Music market follows the same pattern as the Social Networking market: it is too saturated to be worth attempting to penetrate.

The fifth and final market to analyse would be Weather. However, people tend not to spend much time in Weather apps, so the chances of making a profit from in-app ads are low. Furthermore, receiving live weather updates would likely require connecting to a non-free API. Therefore, developing a Weather app is not recommended.

Instead, the final market to analyse will be the Book market.

In [44]:
for app in free_apple:
    if app[11] == 'Book':
        print(app[1], ':', app[5])

Kindle – Read eBooks, Magazines & Textbooks : 252076
Audible – audio books, original series & podcasts : 105274
Color Therapy Adult Coloring Book for Adults : 84062
OverDrive – Library eBooks and Audiobooks : 65450
HOOKED - Chat Stories : 47829
BookShout: Read eBooks & Track Your Reading Goals : 879
Dr. Seuss Treasury — 50 best kids books : 451
Green Riding Hood : 392
Weirdwood Manor : 197
MangaZERO - comic reader : 9
ikouhoushi : 0
MangaTiara - love comic reader : 0
謎解き : 0
謎解き2016 : 0


This market appears similar to the Reference market: Kindle and Audible sit at the top, but other apps are still successful. Furthermore, the market is not too saturated, so this is another suitable app profile.

In fact, one possible app profile would merge the Reference and Book genres: an app that serves as a reference to a popular book, or to multiple books by a particular author. As seen in the Reference genre, many book-related apps were present, so the two genres already overlap, and that overlap has proven successful. An interactive Book app may therefore be a suitable profile. However, as the app profile must fit both the Google Play market and the App Store, further analysis is still necessary.

### Google Play Store
Unlike the App Store, the Google Play data includes the number of installs, so we should be able to get a clearer picture of genre popularity. However, the install numbers are not precise - most of the values are open-ended (100+, 1,000+, 5,000+, etc.).

As a result, we don't know whether an app with 100,000+ installs has 100,000 installs, 200,000, or 350,000. However, we don't need very precise data for these purposes - we only want to find out which app genres attract the most users.

We're going to leave the numbers as they are, which means that we'll consider that an app with 100,000+ installs has 100,000 installs, and an app with 1,000,000+ installs has 1,000,000 installs, and so on. To perform computations, however, we'll need to convert each install number from a string to a float. This means we need to remove the commas and the plus characters, or the conversion will fail and cause an error. To do this, we'll use the `str.replace()` method.
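The cleaning step described above can be sketched in isolation as follows. The helper name `parse_installs` is hypothetical and not part of the original notebook; it applies the same two `str.replace()` calls before converting to a float.

```python
def parse_installs(raw):
    """Convert an install string such as '1,000,000+' to a float."""
    cleaned = raw.replace('+', '').replace(',', '')
    return float(cleaned)

# Example install strings in the format used by the Google Play dataset
for raw in ['100+', '5,000+', '1,000,000+']:
    print(raw, '->', parse_installs(raw))
```

Calling `float('1,000,000+')` directly raises a `ValueError`, which is why both characters are stripped first.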

In [45]:
temp_list = [] # Will store [average installs, category] pairs

for category in freq_table(free_google,1):
    total = 0 # Sum of installs for this category
    len_category = 0 # Number of apps in this category
    
    for app in free_google:
        category_app = app[1]
        
        if category_app == category:
            install = app[5]
            install = install.replace('+', '')
            install = install.replace(',', '')
            total += float(install)
            len_category += 1
    
    avg_installs = total / len_category
    temp_list.append([avg_installs,category])

google_category_installs = sorted(temp_list, reverse = True) # Sorts list of categories in descending order of installs
for entry in google_category_installs:
    print(entry[1] + ':' + str(entry[0]))

COMMUNICATION:38456119.167247385
VIDEO_PLAYERS:24727872.452830188
SOCIAL:23253652.127118643
PHOTOGRAPHY:17840110.40229885
PRODUCTIVITY:16787331.344927534
GAME:15588015.603248259
TRAVEL_AND_LOCAL:13984077.710144928
ENTERTAINMENT:11640705.88235294
TOOLS:10801391.298666667
NEWS_AND_MAGAZINES:9549178.467741935
BOOKS_AND_REFERENCE:8767811.894736841
SHOPPING:7036877.311557789
PERSONALIZATION:5201482.6122448975
WEATHER:5074486.197183099
HEALTH_AND_FITNESS:4188821.9853479853
MAPS_AND_NAVIGATION:4056941.7741935486
FAMILY:3695641.8198090694
SPORTS:3638640.1428571427
ART_AND_DESIGN:1986335.0877192982
FOOD_AND_DRINK:1924897.7363636363
EDUCATION:1833495.145631068
BUSINESS:1712290.1474201474
LIFESTYLE:1437816.2687861272
FINANCE:1387692.475609756
HOUSE_AND_HOME:1331540.5616438356
DATING:854028.8303030303
COMICS:817657.2727272727
AUTO_AND_VEHICLES:647317.8170731707
LIBRARIES_AND_DEMO:638503.734939759
PARENTING:542603.6206896552
BEAUTY:513151.88679245283
EVENTS:253542.22222222222
MEDICAL:120550.6198083

As with the App Store, here is an ordered table of the Google Play categories with the highest average number of installs. Communication dominates this table: its average number of installs is more than double that of any category outside the top 5. We will explore the markets of the top 5 categories to determine which are penetrable and whether any coincide with what we found on the App Store.
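The per-category averages could also be accumulated in a single pass over the data rather than looping over the whole dataset once per category. The sketch below is an alternative formulation, not the notebook's method. It assumes rows laid out like `free_google` (name at index 0, category at index 1, installs at index 5); `make_row` and `average_installs_by_category` are hypothetical helpers, and the four sample rows reuse names and install values from the outputs in this section.

```python
def make_row(name, category, installs):
    """Build a placeholder row with the columns this example needs (0, 1 and 5)."""
    return [name, category, None, None, None, installs]

def average_installs_by_category(dataset, category_index=1, installs_index=5):
    """Return {category: average installs}, reading each row once."""
    totals = {}
    counts = {}
    for app in dataset:
        category = app[category_index]
        installs = float(app[installs_index].replace('+', '').replace(',', ''))
        totals[category] = totals.get(category, 0.0) + installs
        counts[category] = counts.get(category, 0) + 1
    return {category: totals[category] / counts[category] for category in totals}

# Illustrative sample only, not the full dataset
sample = [
    make_row('Wikipedia', 'BOOKS_AND_REFERENCE', '10,000,000+'),
    make_row('Ancestry', 'BOOKS_AND_REFERENCE', '5,000,000+'),
    make_row('YouTube', 'VIDEO_PLAYERS', '1,000,000,000+'),
    make_row('VPlayer', 'VIDEO_PLAYERS', '1,000,000+'),
]

averages = average_installs_by_category(sample)
for category, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(category + ':' + str(avg))
```

Because the totals and counts are built up together, the dataset is read only once regardless of how many categories it contains.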

In [46]:
for app in free_google:
    if app[1] == 'COMMUNICATION':
        print(app[0], ':', app[5])

WhatsApp Messenger : 1,000,000,000+
Messenger for SMS : 10,000,000+
My Tele2 : 5,000,000+
imo beta free calls and text : 100,000,000+
Contacts : 50,000,000+
Call Free – Free Call : 5,000,000+
Web Browser & Explorer : 5,000,000+
Browser 4G : 10,000,000+
MegaFon Dashboard : 10,000,000+
ZenUI Dialer & Contacts : 10,000,000+
Cricket Visual Voicemail : 10,000,000+
TracFone My Account : 1,000,000+
Xperia Link™ : 10,000,000+
TouchPal Keyboard - Fun Emoji & Android Keyboard : 10,000,000+
Skype Lite - Free Video Call & Chat : 5,000,000+
My magenta : 1,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Seznam.cz : 1,000,000+
Antillean Gold Telegram (original version) : 100,000+
AT&T Visual Voicemail : 10,000,000+
GMX Mail : 10,000,000+
Omlet Chat : 10,000,000+
My Vodacom SA : 5,000,000+
Microsoft Edge : 5,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Calls & Text by Mo+ : 5,000,000+
free 

The Communication market appears to be highly saturated, so this app profile will not be considered.

In [47]:
for app in free_google:
    if app[1] == 'VIDEO_PLAYERS':
        print(app[0], ':', app[5])

YouTube : 1,000,000,000+
All Video Downloader 2018 : 1,000,000+
Video Downloader : 10,000,000+
HD Video Player : 1,000,000+
Iqiyi (for tablet) : 1,000,000+
Video Player All Format : 10,000,000+
Motorola Gallery : 100,000,000+
Free TV series : 100,000+
Video Player All Format for Android : 500,000+
VLC for Android : 100,000,000+
Code : 10,000,000+
Vote for : 50,000,000+
XX HD Video downloader-Free Video Downloader : 1,000,000+
OBJECTIVE : 1,000,000+
Music - Mp3 Player : 10,000,000+
HD Movie Video Player : 1,000,000+
YouCut - Video Editor & Video Maker, No Watermark : 5,000,000+
Video Editor,Crop Video,Movie Video,Music,Effects : 1,000,000+
YouTube Studio : 10,000,000+
video player for android : 10,000,000+
Vigo Video : 50,000,000+
Google Play Movies & TV : 1,000,000,000+
HTC Service － DLNA : 10,000,000+
VPlayer : 1,000,000+
MiniMovie - Free Video and Slideshow Editor : 50,000,000+
Samsung Video Library : 50,000,000+
OnePlus Gallery : 1,000,000+
LIKE – Magic Video Maker & Community : 50,

The Video Players market also appears to be highly saturated, so this app profile will not be considered either. As we have already ruled out developing a social networking app, the next market we will assess is Photography.

In [48]:
for app in free_google:
    if app[1] == 'PHOTOGRAPHY':
        print(app[0], ':', app[5])

TouchNote: Cards & Gifts : 1,000,000+
FreePrints – Free Photos Delivered : 1,000,000+
Groovebook Photo Books & Gifts : 500,000+
Moony Lab - Print Photos, Books & Magnets ™ : 50,000+
LALALAB prints your photos, photobooks and magnets : 1,000,000+
Snapfish : 1,000,000+
Motorola Camera : 50,000,000+
HD Camera - Best Cam with filters & panorama : 5,000,000+
LightX Photo Editor & Photo Effects : 10,000,000+
Sweet Snap - live filter, Selfie photo edit : 10,000,000+
HD Camera - Quick Snap Photo & Video : 1,000,000+
B612 - Beauty & Filter Camera : 100,000,000+
Waterfall Photo Frames : 1,000,000+
Photo frame : 100,000+
Huji Cam : 5,000,000+
Unicorn Photo : 1,000,000+
HD Camera : 5,000,000+
Makeup Editor -Beauty Photo Editor & Selfie Camera : 1,000,000+
Makeup Photo Editor: Makeup Camera & Makeup Editor : 1,000,000+
Moto Photo Editor : 5,000,000+
InstaBeauty -Makeup Selfie Cam : 50,000,000+
Garden Photo Frames - Garden Photo Editor : 500,000+
Photo Frame : 10,000,000+
Selfie Camera - Photo Edito

Again, this market appears to be highly saturated, so it will not be considered.

In [49]:
for app in free_google:
    if app[1] == 'PRODUCTIVITY':
        print(app[0], ':', app[5])

Microsoft Word : 500,000,000+
All-In-One Toolbox: Cleaner, Booster, App Manager : 10,000,000+
AVG Cleaner – Speed, Battery & Memory Booster : 10,000,000+
QR Scanner & Barcode Scanner 2018 : 10,000,000+
Chrome Beta : 10,000,000+
Microsoft Outlook : 100,000,000+
Google PDF Viewer : 10,000,000+
My Claro Peru : 5,000,000+
Power Booster - Junk Cleaner & CPU Cooler & Boost : 1,000,000+
Google Assistant : 10,000,000+
Microsoft OneDrive : 100,000,000+
Calculator - unit converter : 50,000,000+
Microsoft OneNote : 100,000,000+
Metro name iD : 10,000,000+
Google Keep : 100,000,000+
Archos File Manager : 5,000,000+
ES File Explorer File Manager : 100,000,000+
ASUS SuperNote : 10,000,000+
HTC File Manager : 10,000,000+
MyMTN : 1,000,000+
Dropbox : 500,000,000+
ASUS Quick Memo : 10,000,000+
HTC Calendar : 10,000,000+
Google Docs : 100,000,000+
ASUS Calling Screen : 10,000,000+
lifebox : 5,000,000+
Yandex.Disk : 5,000,000+
Content Transfer : 5,000,000+
HTC Mail : 10,000,000+
Advanced Task Killer : 50

It is clear that many of the categories at the top of the table are there because powerhouse companies greatly skew the calculated averages. Given that we have already identified a Book or Reference app profile as a promising one to develop, we need to assess this market on the Google Play Store.
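To illustrate how a few very large apps can pull an average upwards, the sketch below compares the mean and the median for a hand-picked subset of the Productivity apps printed above. The values are the lower bounds of the install buckets shown in that output. The subset is an assumption made for illustration, so the numbers do not describe the whole category.

```python
from statistics import mean, median

# (app name, install lower bound) pairs taken from the Productivity output above
productivity_sample = [
    ('Microsoft Word', 500_000_000),
    ('Dropbox', 500_000_000),
    ('My Claro Peru', 5_000_000),
    ('lifebox', 5_000_000),
    ('Power Booster - Junk Cleaner & CPU Cooler & Boost', 1_000_000),
    ('MyMTN', 1_000_000),
]

installs = [count for _, count in productivity_sample]
mean_installs = mean(installs)
median_installs = median(installs)

print('Mean installs:  ', round(mean_installs))
print('Median installs:', round(median_installs))
```

In this sample, the two 500,000,000+ apps lift the mean to roughly 169 million installs while the median stays at 5 million, which is the kind of skew described above.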

In [50]:
for app in free_google:
    if app[1] == 'BOOKS_AND_REFERENCE':
        print(app[0], ':', app[5])

E-Book Read - Read Book for free : 50,000+
Download free book with green book : 100,000+
Wikipedia : 10,000,000+
Cool Reader : 10,000,000+
Free Panda Radio Music : 100,000+
Book store : 1,000,000+
FBReader: Favorite Book Reader : 10,000,000+
English Grammar Complete Handbook : 500,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
Google Play Books : 1,000,000,000+
AlReader -any text book reader : 5,000,000+
Offline English Dictionary : 100,000+
Offline: English to Tagalog Dictionary : 500,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
Recipes of Prophetic Medicine for free : 500,000+
ReadEra – free ebook reader : 1,000,000+
Anonymous caller detection : 10,000+
Ebook Reader : 5,000,000+
Litnet - E-books : 100,000+
Read books online : 5,000,000+
English to Urdu Dictionary : 500,000+
eBoox: book reader fb2 epub zip : 1,000,000+
English Persian Dictionary : 500,000+
Flybook : 500,000+
All Maths Formulas : 1,000,000+
Ancestry : 5,000,000+
HTC Help : 10,000,000+
E

As this category combines Books and Reference, we see a large number of apps. However, the install counts are also skewed by a few very popular apps.

In [51]:
for app in free_google:
    if app[1] == 'BOOKS_AND_REFERENCE' and app[5] in ('1,000,000,000+',
                                                      '500,000,000+',
                                                      '100,000,000+'):
        print(app[0], ':', app[5])

Google Play Books : 1,000,000,000+
Bible : 100,000,000+
Amazon Kindle : 100,000,000+
Wattpad 📖 Free Books : 100,000,000+
Audiobooks from Audible : 100,000,000+


Relative to the number of apps in this category, only a few apps are extremely popular. We will therefore look instead at apps with between 1,000,000 and 100,000,000 installs.
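A range like this can also be expressed numerically instead of listing each install bucket by hand. The following is a minimal sketch, assuming rows laid out like `free_google` (name at index 0, category at index 1, installs at index 5). The helpers `make_row`, `to_number` and `installs_between` are hypothetical, and the sample rows reuse names and install values from the outputs above.

```python
def make_row(name, category, installs):
    """Build a placeholder row with the columns this example needs (0, 1 and 5)."""
    return [name, category, None, None, None, installs]

def to_number(installs):
    """Convert an install string such as '5,000,000+' to an int."""
    return int(installs.replace('+', '').replace(',', ''))

def installs_between(dataset, category, low, high):
    """Return (name, installs) pairs in the category with low <= installs < high."""
    selected = []
    for app in dataset:
        if app[1] == category and low <= to_number(app[5]) < high:
            selected.append((app[0], app[5]))
    return selected

# Illustrative sample only, not the full dataset
sample = [
    make_row('Google Play Books', 'BOOKS_AND_REFERENCE', '1,000,000,000+'),
    make_row('Wikipedia', 'BOOKS_AND_REFERENCE', '10,000,000+'),
    make_row('Bible', 'BOOKS_AND_REFERENCE', '100,000,000+'),
    make_row('E-Book Read - Read Book for free', 'BOOKS_AND_REFERENCE', '50,000+'),
    make_row('Ancestry', 'BOOKS_AND_REFERENCE', '5,000,000+'),
]

mid_range = installs_between(sample, 'BOOKS_AND_REFERENCE', 1_000_000, 100_000_000)
for name, installs in mid_range:
    print(name, ':', installs)
```

The upper bound is exclusive, so apps in the 100,000,000+ bucket are left out, matching the buckets selected in the next cell.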

In [52]:
for app in free_google:
    if app[1] == 'BOOKS_AND_REFERENCE' and app[5] in ('1,000,000+',
                                                      '5,000,000+',
                                                      '10,000,000+',
                                                      '50,000,000+'):
        print(app[0], ':', app[5])

Wikipedia : 10,000,000+
Cool Reader : 10,000,000+
Book store : 1,000,000+
FBReader: Favorite Book Reader : 10,000,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
AlReader -any text book reader : 5,000,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
ReadEra – free ebook reader : 1,000,000+
Ebook Reader : 5,000,000+
Read books online : 5,000,000+
eBoox: book reader fb2 epub zip : 1,000,000+
All Maths Formulas : 1,000,000+
Ancestry : 5,000,000+
HTC Help : 10,000,000+
Moon+ Reader : 10,000,000+
English-Myanmar Dictionary : 1,000,000+
Golden Dictionary (EN-AR) : 1,000,000+
All Language Translator Free : 1,000,000+
Aldiko Book Reader : 10,000,000+
Dictionary - WordWeb : 5,000,000+
50000 Free eBooks & Free AudioBooks : 5,000,000+
Al-Quran (Free) : 10,000,000+
Al Quran Indonesia : 10,000,000+
Al'Quran Bahasa Indonesia : 10,000,000+
Al Quran Al karim : 1,000,000+
Al Quran : EAlim - Translations & MP3 Offline : 5,000,000+
Koran Read &MP3 30 Juz Offline : 1,000,000+
H

This now shows many more successful apps; however, the majority appear to be either libraries, dictionaries, or e-book readers/processors. Two other app profiles that appear to be successful are apps for sacred scriptures and apps relating to successful books, shows or games.

Therefore, it may be best for the company to develop an app that stores the sacred scriptures of a religion and lets the user interact with them through daily quotes, quizzes, or an audio or translatable version available within the app. This could allow for built-in dictionaries and translators, reducing the need for the user to leave the app to open a separate library app.

An alternative app profile could be an app that serves as a reference point for a successful game or collection of books, providing quizzes, manuals for the game, or daily quotes and analysis of the books.

## Conclusion
In this project, we analysed data about the App Store and Google Play Store in order to find an app profile that can be profitable for both markets, with the aim of making profits through in-app advertising. 

We concluded that developing an app centred around either a sacred scripture or a successful franchise could be profitable for both the Google Play and the App Store markets. The app would have special features that make it interactive for the user, such as daily quotes, quizzes, audiobooks, in-app dictionaries, or forums.