# Mobile App Analysis

This analysis is being completed for a company that produces free apps on both the Android and iOS mobile platforms. The company's primary revenue stream is in-app ads. As such, the company would like to maximize its revenue by producing free apps that engage the largest possible number of users.

To determine what types of apps attract the most users, Google Play and App Store app data will be used to evaluate the types of apps that users engage with most frequently. The goal of this analysis is to make recommendations for mobile apps that are likely to generate the highest revenue.

### 1. Importing & Exploring the Data

As of 2018, the Google Play Store had 2.1 million Android apps and the App Store had 2 million iOS apps available. In order to conduct this analysis, a sample of the total apps on each platform will be used. The sample contains data from approximately 10,000 Android apps and 7,000 iOS apps.

In [1]:
#Open both datasets and save each as a list of lists, keeping the header row separate.
#The encoding is set explicitly because some app names contain non-ASCII characters.

from csv import reader

opened_file_ios = open('AppleStore.csv', encoding='utf8')
read_file_ios = reader(opened_file_ios)
ios_data = list(read_file_ios)
opened_file_ios.close()
ios_header = ios_data[0]
ios_data = ios_data[1:]

opened_file_android = open('googleplaystore.csv', encoding='utf8')
read_file_android = reader(opened_file_android)
android_data = list(read_file_android)
opened_file_android.close()
android_header = android_data[0]
android_data = android_data[1:]

The code below creates an **explore_data()** function to print rows from the datasets for the purpose of exploring the data.

In [4]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n')

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))
     
#Print the first 3 rows of the Android app dataset
explore_data(android_data, 0, 3, True)

print('\n')

#Print the first 3 rows of the iOS app dataset
explore_data(ios_data, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+

In [5]:
#Print the column names of both datasets
print(android_header)
print('\n')
print(ios_header)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


The categories most likely to help us conduct this analysis include:

| Android Apps | iOS Apps | Content of Category |
| ------------ | -------- | ------------------- |
| App | track_name | Name of the App |
| Category | prime_genre | Type of App |
| Price | price | Price of App |
| Rating | user_rating | Average Rating |
| Reviews | rating_count_tot | Total number of Reviews |
| Content Rating | cont_rating | Audience/Content Rating |

### 2. Data Cleaning

In this section, the data will be cleaned by removing erroneous rows, duplicate entries, non-English apps, and paid apps so that it is ready for analysis.

In [6]:
#Print the row with the suspected error
print(android_data[10472:10473])

[['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']]


This row does have an error. The category, or type of app, is missing, so every subsequent value has shifted one column to the left. This makes the row unusable for this analysis, so we will delete it.
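A shifted row like this one has fewer fields than the header, so one way to locate such rows is to compare each row's length with the header's length. The sketch below is only an illustration: `find_malformed_rows` is a hypothetical helper name, and the sample header and rows are a small stand-in for `android_header` and `android_data`.

```python
def find_malformed_rows(header, dataset):
    """Return (index, row) pairs whose field count differs from the header's."""
    expected = len(header)
    return [(i, row) for i, row in enumerate(dataset) if len(row) != expected]


# Stand-in data: a 3-column header with one complete row and one short row.
sample_header = ['App', 'Category', 'Rating']
sample_data = [
    ['Facebook', 'SOCIAL', '4.1'],
    ['Life Made WI-Fi Touchscreen Photo Frame', '1.9'],  # category missing
]

for index, row in find_malformed_rows(sample_header, sample_data):
    print(index, len(row), row)
```

On the real data, the equivalent call would be `find_malformed_rows(android_header, android_data)`.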

In [7]:
del android_data[10472]

In [8]:
#Print the length of the dataset to ensure the row has been deleted.
print(len(android_data))

10840


Upon review of the Google Play Store app data, it is evident that some apps are included in the dataset multiple times. For example, Facebook is included twice as demonstrated below.

In [7]:
for app in android_data:
    name = app[0]
    if name == 'Facebook':
        print(app)

['Facebook', 'SOCIAL', '4.1', '78158306', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'August 3, 2018', 'Varies with device', 'Varies with device']
['Facebook', 'SOCIAL', '4.1', '78128208', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'August 3, 2018', 'Varies with device', 'Varies with device']


Upon review of the Facebook duplicates, the entries are identical except for the number of reviews. In the next cell, we will count how many duplicates exist in the dataset, and then we will remove them.

In [9]:
duplicate_apps = []
unique_apps = []
for app in android_data:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)
        
print('Number of duplicate app entries:', len(duplicate_apps))

Number of duplicate app entries: 1181


As we see above, there are 1181 duplicate entries in the Google Play Store data, and duplicates can skew any analyses we run on this dataset. We therefore have to remove them, but we should not do so randomly. To remove duplicates systematically, we will use the number of reviews to determine which entry to keep, selecting the entry with the greatest number of reviews. This is a reasonable criterion because the entry with the highest review count is most likely the most recent one and includes all previous reviews.

In [11]:
#Creating a dictionary of app names and the corresponding highest number of reviews.

#Create an empty dictionary:
reviews_max = {}

#Loop through the Android dataset, converting the number of reviews (index 3) to a float
#and placing the app name in the dictionary if it is not already there. If the app name is
#already in the dictionary, the number of reviews associated with the app name is updated
#only if the new value is greater than the number already present.

for app in android_data:
    name = app[0]
    n_reviews = float(app[3])
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    elif name not in reviews_max:
        reviews_max[name] = n_reviews
        
#Examine the length of the created dictionary against what is expected, given the number of 
#duplicate apps in the dataset

print('Expected length:', len(android_data) - 1181)
print('Actual length:', len(reviews_max))

Expected length: 9659
Actual length: 9659


In [12]:
#Use the reviews_max dictionary created above to remove duplicate rows

#Create two empty lists. One for the dataset without duplicates (android_clean) and the 
#other for app names

android_clean = []
already_added = []

#Loop through the Android dataset and add apps to the android_clean list if the app name
#is not already in the already_added list and the number of reviews is equal to the 
#number of reviews in the reviews_max dictionary

for app in android_data:
    name = app[0]
    n_reviews = float(app[3])
    if n_reviews == reviews_max[name] and name not in already_added:
        android_clean.append(app)
        already_added.append(name)

#Review the new dataset
explore_data(android_clean, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9659
Number of columns: 13
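The two-step approach above (build `reviews_max`, then filter) can also be written as a single pass that stores the best row seen so far for each app name. This is a sketch of an alternative, not the method used in this notebook; `dedupe_by_max_reviews` is a hypothetical name, and the sample rows are shortened stand-ins that reuse values from the outputs above.

```python
def dedupe_by_max_reviews(dataset, name_index=0, reviews_index=3):
    """Keep one row per app name: the row with the highest review count."""
    best = {}  # app name -> row with the most reviews seen so far
    for row in dataset:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        # Strict '>' keeps the first row when two entries tie on reviews.
        if name not in best or n_reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Stand-in rows in the (App, Category, Rating, Reviews) layout.
sample = [
    ['Facebook', 'SOCIAL', '4.1', '78128208'],
    ['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644'],
    ['Facebook', 'SOCIAL', '4.1', '78158306'],
]
deduped = dedupe_by_max_reviews(sample)
print(len(deduped))
```

Because the dictionary lookup replaces the `name not in already_added` list search, this version avoids rescanning a growing list for every row. The set of rows kept should match the approach above, although their order may differ.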


In the next few cells, we will create a function that flags app names containing non-English characters so that we can keep only English-language apps in the two datasets.

In [14]:
#Create a function, is_english, to identify non-English characters by using ascii numbers
#and returning false if more than 3 characters in the string have an ascii number greater than 127

def is_english(string):
    non_ascii = 0
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
    
    #True when at most 3 characters fall outside the ASCII range
    return non_ascii <= 3

#Test the function to ensure it works as intended

print(is_english('Instagram'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播')) 
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('Instachat 😜'))

True
False
True
True


In [16]:
#Loop through each dataset to remove non-English language apps using the is_english function created above
english_ios_data = []
english_android_data = []

for app in android_clean:
    name = app[0]
    if is_english(name):
        english_android_data.append(app)
    
for app in ios_data:
    name = app[1]
    if is_english(name):
        english_ios_data.append(app)
    
explore_data(english_android_data, 0, 3, True)
print('\n')
explore_data(english_ios_data, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9614
Number of columns: 13


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 

In [17]:
#Loop through the two datasets to create the final datasets with only free apps included 
ios_final = []
android_final = []

for app in english_android_data:
    price = app[7]
    if price == '0':
        android_final.append(app)
    
for app in english_ios_data:
    price = app[4]
    if price == '0.0':
        ios_final.append(app)
    
explore_data(android_final, 0, 3, True)
print('\n')
explore_data(ios_final, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 8864
Number of columns: 13


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 

### 3. Data Analysis

In this section, we will analyze the data to achieve the end goal of identifying apps likely to attract the most users on both the Android and iOS platforms.

Our company builds free apps that appeal to a wide range of mobile users, and we earn our revenue through in-app ads. We use a stepwise process to build an app, starting with a minimal Android version on Google Play. Based on user response, we either continue to develop the app or drop it. If the app is profitable after 6 months on Google Play, we build an iOS version for the App Store.

We seek to build apps that are successful on both Google Play and the Apple Store. In order to identify future app possibilities, we will be using the Android and iOS datasets to identify genres of apps that are successful on both platforms.

In [19]:
#Print the dataset columns to identify columns to use for generating frequency tables
print(android_header)
print('\n')
print(ios_header)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


#### Part 1: Identifying the genres with the most apps on both iOS & Android

In [20]:
#Create a function to return a frequency table with percentages for any column we select

def freq_table(dataset, index):
    table = {}
    total = 0
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    
    table_perc = {}
    for key in table:
        percentage = (table[key]/total)*100
        table_perc[key] = percentage
    
    return table_perc

#Define the display_table function 
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])

In [21]:
#Return the frequency table for the prime_genre column of the iOS dataset
display_table(ios_final, -5)

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665
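The same kind of percentage table can be built with the standard library's `collections.Counter`, which handles the counting step. This is a minimal sketch assuming the same list-of-lists layout; `freq_table_counter` is a hypothetical name and the rows are stand-ins, not rows from the dataset.

```python
from collections import Counter


def freq_table_counter(dataset, index):
    """Return {value: percentage of rows} for the column at `index`."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    return {value: count / total * 100 for value, count in counts.items()}


# Stand-in rows: (name, genre).
sample = [
    ['Facebook', 'Social Networking'],
    ['Instagram', 'Photo & Video'],
    ['Clash of Clans', 'Games'],
    ['Temple Run', 'Games'],
]
table = freq_table_counter(sample, 1)
# Sort by percentage, largest first, as display_table does.
for genre, pct in sorted(table.items(), key=lambda item: item[1], reverse=True):
    print(genre, ':', pct)
```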


In [22]:
#Return the frequency table for the category column of the Android dataset
display_table(android_final, 1)

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

In [23]:
#Return the frequency table for the genres column of the Android dataset
display_table(android_final, -4)

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Finance : 3.7003610108303246
Medical : 3.531137184115524
Sports : 3.463447653429603
Personalization : 3.3167870036101084
Communication : 3.2378158844765346
Action : 3.1024368231046933
Health & Fitness : 3.0798736462093865
Photography : 2.944494584837545
News & Magazines : 2.7978339350180503
Social : 2.6624548736462095
Travel & Local : 2.3240072202166067
Shopping : 2.2450361010830324
Books & Reference : 2.1435018050541514
Simulation : 2.0419675090252705
Dating : 1.861462093862816
Arcade : 1.8501805054151623
Video Players & Editors : 1.7712093862815883
Casual : 1.7599277978339352
Maps & Navigation : 1.3989169675090252
Food & Drink : 1.2409747292418771
Puzzle : 1.128158844765343
Racing : 0.9927797833935018
Role Playing : 0.9363718411552346
Libraries & Demo : 0.9363718411552346
Auto & Vehicles : 0.9250902527075

##### Analyzing the frequency tables 

***iOS Genre Frequency:***
The games genre comprises 58% of the free English iOS apps, which far exceeds the frequency of any other genre. The next most common genre is entertainment, which comprises 7.9% of all free English apps. Among the top ten genres, six are intended for entertainment purposes and comprise a total of 78.5% of all free English apps. The other 4 genres in the top 10 (education, shopping, utilities, and health & fitness) make up 10.8% of the free English apps.

Based on this data alone, free English entertainment apps are more common than other types of apps, and games are significantly more common than any other type of app. However, this data speaks only to the frequency of these types of apps on the iOS platform and does not indicate the number of users or how popular the apps are. Thus, this data suggests that we should further explore entertainment apps, but we must first understand whether frequency correlates with usage or popularity. It would be unwise to recommend an app profile for development based solely on this data; further analysis is necessary to better understand user preferences.
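The combined shares quoted for the top ten iOS genres can be checked by summing the table entries. The sketch below copies the percentages from the iOS frequency table above, rounded to two decimals; the grouping of genres into "entertainment" follows the text and is a judgment call rather than a property of the data.

```python
# Shares (in percent) copied from the iOS frequency table, rounded to 2 places.
ios_top10 = {
    'Games': 58.16,
    'Entertainment': 7.88,
    'Photo & Video': 4.97,
    'Education': 3.66,
    'Social Networking': 3.29,
    'Shopping': 2.61,
    'Utilities': 2.51,
    'Sports': 2.14,
    'Music': 2.05,
    'Health & Fitness': 2.02,
}

# Assumption: these six genres are the ones treated as entertainment in the text.
entertainment = {'Games', 'Entertainment', 'Photo & Video',
                 'Social Networking', 'Sports', 'Music'}

fun_share = sum(pct for genre, pct in ios_top10.items() if genre in entertainment)
other_share = sum(pct for genre, pct in ios_top10.items() if genre not in entertainment)
print(round(fun_share, 1), round(other_share, 1))
```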

***Android Genre Frequency:***
The family category (18.91%) and the tools genre (8.45%) are the most common among the free English Android apps. While the top 10 categories and genres largely overlap in the free English Android dataset, there is some variability, likely because the two columns classify apps differently. To mirror the iOS discussion, we will consider the Android genres. Unlike the iOS app frequencies, no single genre rises far above the others within the free English Android apps. The top genre is tools, followed by a mix of entertainment and practical genres. Only 2 of the top 10 genres in the Android dataset are entertainment genres (entertainment and sports), and they comprise 9.5% of all free English apps. The remaining 8 genres in the top 10 are oriented towards practical uses such as lifestyle, business, finance, and productivity. These genres make up 36.7% of the free English Android apps.

Like the iOS dataset, these numbers reflect the frequency of the genre within the free English apps and not the popularity or reach of the apps. Thus, we can conclude that the frequency of different types of apps varies from the App Store to Google Play. From the frequency data, free English iOS apps appear to be largely for entertainment purposes, while free English Android apps appear to have more balance between entertainment and practical apps. However, we cannot identify which free English apps are the most used or most popular. Further analysis will be needed to make a recommendation for an app to develop. 

#### Part 2: Identifying the genres that are the most popular on both iOS & Android

The fastest way to identify the apps that are most popular would be to identify the average number of installs by genre. However, install data is only available in the Android data. In order to analyze the iOS dataset, we will utilize number of ratings as a proxy for installs.

In [24]:
#Using the freq_table function created previously, we'll generate a frequency table of the 
#prime_genres in the iOS dataset and then we'll loop through the dataset to determine the 
#average number of user ratings for each genre.

genres_ios = freq_table(ios_final, -5)

for genre in genres_ios:
    total = 0
    len_genre = 0
    for app in ios_final:
        genre_app = app[-5]
        if genre_app == genre:            
            n_ratings = float(app[5])
            total += n_ratings
            len_genre += 1
    avg_n_ratings = total / len_genre
    print(genre, ':', avg_n_ratings)

Music : 57326.530303030304
Health & Fitness : 23298.015384615384
Social Networking : 71548.34905660378
Shopping : 26919.690476190477
Games : 22788.6696905016
Productivity : 21028.410714285714
Catalogs : 4004.0
Utilities : 18684.456790123455
Book : 39758.5
Travel : 28243.8
Education : 7003.983050847458
Sports : 23008.898550724636
Navigation : 86090.33333333333
Finance : 31467.944444444445
Medical : 612.0
Reference : 74942.11111111111
Weather : 52279.892857142855
Business : 7491.117647058823
Lifestyle : 16485.764705882353
Entertainment : 14029.830708661417
Food & Drink : 33333.92307692308
Photo & Video : 28441.54375
News : 21248.023255813954


***Analyzing the iOS apps based on average user ratings:***
Ranking genres in the App Store by average number of user ratings paints a very different picture from the frequency of genre types. The most popular free English genres by average number of user ratings are balanced between entertainment and practical apps. The most popular genre is navigation, followed by reference and then social networking.
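To make a ranking like this easier to read off, the per-genre averages can be computed in a single pass and then sorted. This is a sketch under stated assumptions: `average_by_group` is a hypothetical helper, and the sample rows are shortened stand-ins that reuse rating counts from the outputs above rather than the full `ios_final` rows.

```python
from collections import defaultdict


def average_by_group(dataset, group_index, value_index):
    """Return {group: mean of the numeric column}, computed in one pass."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in dataset:
        group = row[group_index]
        totals[group] += float(row[value_index])
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Stand-in rows: (track_name, rating_count_tot, prime_genre).
sample = [
    ['Facebook', '2974676', 'Social Networking'],
    ['Pinterest', '1061624', 'Social Networking'],
    ['Instagram', '2161558', 'Photo & Video'],
    ['Clash of Clans', '2130805', 'Games'],
]
averages = average_by_group(sample, group_index=2, value_index=1)
# Print genres from the highest average to the lowest.
for genre, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(genre, ':', avg)
```

With the real data, the corresponding call would be `average_by_group(ios_final, -5, 5)`.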

Based on the frequency of app genres in both datasets and the popularity of app genres in the App Store, we would recommend building a free finance app. Free English finance apps are in the top 10 for frequency on Google Play and in the top 10 for popularity on the App Store. However, they are also infrequent (1.1% of free English apps) on the App Store, indicating that there is room in the market for another app.

Now we will move on to analyzing the Android dataset.

In [28]:
#We will replicate the code from the iOS frequency table above. Using the freq_table 
#function created previously, we'll generate a frequency table of the categories in 
#the Android dataset and then we'll loop through the dataset to remove string characters, convert 
#number of installs to floats, and determine the average number of installs for each genre.

categories_android = freq_table(android_final, 1)

for category in categories_android:
    total = 0
    len_category = 0
    for app in android_final:
        category_app = app[1]
        if category_app == category:            
            n_installs = app[5]
            n_installs = n_installs.replace(',', '')
            n_installs = n_installs.replace('+', '')
            total += float(n_installs)
            len_category += 1
    avg_n_installs = total / len_category
    print(category, ':', avg_n_installs)

BEAUTY : 513151.88679245283
SOCIAL : 23253652.127118643
COMICS : 817657.2727272727
PARENTING : 542603.6206896552
HOUSE_AND_HOME : 1331540.5616438356
HEALTH_AND_FITNESS : 4188821.9853479853
PERSONALIZATION : 5201482.6122448975
MAPS_AND_NAVIGATION : 4056941.7741935486
PHOTOGRAPHY : 17840110.40229885
TOOLS : 10801391.298666667
DATING : 854028.8303030303
NEWS_AND_MAGAZINES : 9549178.467741935
SPORTS : 3638640.1428571427
PRODUCTIVITY : 16787331.344927534
AUTO_AND_VEHICLES : 647317.8170731707
VIDEO_PLAYERS : 24727872.452830188
LIFESTYLE : 1437816.2687861272
FAMILY : 3695641.8198090694
TRAVEL_AND_LOCAL : 13984077.710144928
EVENTS : 253542.22222222222
LIBRARIES_AND_DEMO : 638503.734939759
MEDICAL : 120550.61980830671
ART_AND_DESIGN : 1986335.0877192982
EDUCATION : 1833495.145631068
BUSINESS : 1712290.1474201474
ENTERTAINMENT : 11640705.88235294
COMMUNICATION : 38456119.167247385
BOOKS_AND_REFERENCE : 8767811.894736841
WEATHER : 5074486.197183099
GAME : 15588015.603248259
FINANCE : 1387692.4756

***Analyzing Android free English apps by average number of installs:***
Using the average number of installs among free English Android apps, we find a similar pattern to the frequency of genres in the Android dataset. The 10 most-installed categories of free English apps are fairly balanced between entertainment and practical apps, with a slight bias towards entertainment apps.
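The install counts in the Android data are stored as strings such as `'10,000+'`, which is why the loop above strips the commas and plus signs before converting. That cleaning step can be isolated in a small helper; `parse_installs` is a hypothetical name used only for this sketch.

```python
def parse_installs(text):
    """Convert an install bucket such as '1,000,000+' to a float."""
    return float(text.replace(',', '').replace('+', ''))


# Install strings as they appear in the Installs column.
for raw in ['10,000+', '500,000+', '1,000,000,000+']:
    print(raw, '->', parse_installs(raw))
```

Note that each value is the lower bound of a bucket (an app listed as `'10,000+'` has at least 10,000 installs), so averages computed this way are best read as rough lower estimates rather than exact counts.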

Given that the goal here is to identify a genre of apps to build that will be popular on both Google Play and the App Store, we now have to look at the app genres that are popular on both platforms. The genres that appear in both top 10 lists include social/social networking, photo/video, and travel. However, apps in the books, reference, magazines, and finance genres are also popular on both platforms. Before deciding on the genre to pursue, we're going to examine some of these categories in more depth.
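The overlap between the two rankings can be sketched as a set intersection once the two stores' names are mapped onto each other. Everything specific here is an assumption for illustration: `top_n` is a hypothetical helper, the dictionaries hold only a few averages copied (rounded) from the outputs above, and the `android_to_ios` mapping is hand-written because the stores do not share a naming scheme.

```python
def top_n(averages, n):
    """Return the n keys with the largest values, as a set."""
    ranked = sorted(averages, key=averages.get, reverse=True)
    return set(ranked[:n])


# A few averages copied (rounded) from the outputs above; not the full tables.
ios_avg_ratings = {'Navigation': 86090, 'Reference': 74942,
                   'Social Networking': 71548, 'Photo & Video': 28442,
                   'Travel': 28244, 'Medical': 612}
android_avg_installs = {'COMMUNICATION': 38456119, 'SOCIAL': 23253652,
                        'PHOTOGRAPHY': 17840110, 'TRAVEL_AND_LOCAL': 13984078,
                        'MEDICAL': 120551}

# Assumption: a hand-written mapping between the two stores' naming schemes.
android_to_ios = {'SOCIAL': 'Social Networking', 'PHOTOGRAPHY': 'Photo & Video',
                  'TRAVEL_AND_LOCAL': 'Travel', 'MEDICAL': 'Medical'}

ios_top = top_n(ios_avg_ratings, 5)
# Translate Android names to iOS names; unmapped names are kept unchanged.
android_top = {android_to_ios.get(name, name)
               for name in top_n(android_avg_installs, 4)}
print(sorted(ios_top & android_top))
```

With the complete tables, `n` would be 10 for both stores and the mapping would need an entry for every category that has a counterpart on the other platform.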

#### Part 3: Within Genre Analysis

In this section, we're going to look at the apps within some of the most popular genres on both the iOS and Android platforms.

In [29]:
#Examine the apps in the social genres on both platforms

for app in ios_final:
    if app[-5] == 'Social Networking':
        print(app[1], ':', app[5])
        
print('\n')

for app in android_final:
    if app[1] == 'SOCIAL' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

Facebook : 2974676
Pinterest : 1061624
Skype for iPhone : 373519
Messenger : 351466
Tumblr : 334293
WhatsApp Messenger : 287589
Kik : 260965
ooVoo – Free Video Call, Text and Voice : 177501
TextNow - Unlimited Text + Calls : 164963
Viber Messenger – Text & Call : 164249
Followers - Social Analytics For Instagram : 112778
MeetMe - Chat and Meet New People : 97072
We Heart It - Fashion, wallpapers, quotes, tattoos : 90414
InsTrack for Instagram - Analytics Plus More : 85535
Tango - Free Video Call, Voice and Chat : 75412
LinkedIn : 71856
Match™ - #1 Dating App. : 60659
Skype for iPad : 60163
POF - Best Dating App for Conversations : 52642
Timehop : 49510
Find My Family, Friends & iPhone - Life360 Locator : 43877
Whisper - Share, Express, Meet : 39819
Hangouts : 36404
LINE PLAY - Your Avatar World : 34677
WeChat : 34584
Badoo - Meet New People, Chat, Socialize. : 34428
Followers + for Instagram - Follower Analytics : 28633
GroupMe : 28260
Marco Polo Video Walkie Talkie : 27662
Miitomo : 2

Based on the apps in the social genres, it appears that this genre is heavily dominated by a handful of the big social platforms, including Facebook, Pinterest, LinkedIn, and Tumblr. Given that the popularity of this genre is heavily skewed by these platforms, it does not appear to be a viable genre to recommend: its popularity may stem largely from a few heavily used apps, and those apps would be difficult to compete with.
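One way to see how strongly a few large apps pull up a genre's average is to compare the mean with the median, which is much less sensitive to extreme values. The sketch below uses only the seven most-rated Social Networking apps from the output above, so the numbers illustrate the effect and are not the genre-wide figures.

```python
from statistics import mean, median

# Rating counts for the seven most-rated Social Networking apps listed above.
social_ratings = {
    'Facebook': 2974676,
    'Pinterest': 1061624,
    'Skype for iPhone': 373519,
    'Messenger': 351466,
    'Tumblr': 334293,
    'WhatsApp Messenger': 287589,
    'Kik': 260965,
}

counts = list(social_ratings.values())
print('mean  :', mean(counts))
print('median:', median(counts))

# Dropping the single largest app shows how much one giant moves the average.
without_largest = sorted(counts)[:-1]
print('mean without largest:', mean(without_largest))
```

A mean far above the median, as in this subset, is a sign that the average is driven by a few outliers rather than by typical apps in the genre.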

In [30]:
#Examine the apps in the photo and video genres on both platforms

for app in ios_final:
    if app[-5] == 'Photo & Video':
        print(app[1], ':', app[5])
        
print('\n')

for app in android_final:
    if app[1] == 'PHOTOGRAPHY' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

Instagram : 2161558
Snapchat : 323905
YouTube - Watch Videos, Music, and Live Streams : 278166
Pic Collage - Picture Editor & Photo Collage Maker : 123433
Funimate video editor: add cool effects to videos : 123268
musical.ly - your video social network : 105429
Photo Collage Maker & Photo Editor - Live Collage : 93781
Vine Camera : 90355
Google Photos - unlimited photo and video storage : 88742
Flipagram : 79905
Mixgram - Picture Collage Maker - Pic Photo Editor : 54282
Shutterfly: Prints, Photo Books, Cards Made Easy : 51427
Pic Jointer – Photo Collage, Camera Effects Editor : 51330
Color Pop Effects - Photo Editor & Picture Editing : 45320
Photo Grid - photo collage maker & photo editor : 40531
iSwap Faces LITE : 39722
MOLDIV - Photo Editor, Collage & Beauty Camera : 39501
Photo Editor by Aviary : 39501
Photo Lab: Picture Editor, effects & fun face app : 34585
Rookie Cam - Photo Editor & Filter Camera : 33921
FotoRus -Camera & Photo Editor & Pic Collage Maker : 32558
PicsArt Photo St

Within the photo and video genres, Instagram, Snapchat, and YouTube dominate the App Store. After those apps, the remaining apps in the App Store and the photo apps on Google Play are all various picture modification or picture sharing apps. This is a genre that may be worth exploring for our company.

In [31]:
#Examine the apps in the travel genres on both platforms

for app in ios_final:
    if app[-5] == 'Travel':
        print(app[1], ':', app[5])
        
print('\n')

for app in android_final:
    if app[1] == 'TRAVEL_AND_LOCAL' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

Google Earth : 446185
Yelp - Nearby Restaurants, Shopping & Services : 223885
GasBuddy : 145549
TripAdvisor Hotels Flights Restaurants : 56194
Uber : 49466
Lyft : 46922
HotelTonight - Great Deals on Last Minute Hotels : 32341
Hotels & Vacation Rentals by Booking.com : 31261
Southwest Airlines : 30552
Airbnb : 22302
Expedia Hotels, Flights & Vacation Package Deals : 10278
Fly Delta : 8094
Hopper - Predict, Watch & Book Flights : 6944
United Airlines : 5748
Skiplagged — Actually Cheap Flights & Hotels : 1851
Viator Tours & Activities : 1839
iExit Interstate Exit Guide : 1798
Gogo Entertainment : 1482
Google Street View : 1450
Webcams – EarthCam : 912
HISTORY Here : 685
DB Navigator : 512
Mobike - Dockless Bike Share : 494
MiFlight™ – Airport security line wait times at checkpoints for domestic and international travelers : 493
BlaBlaCar - Trusted Carpooling : 397
Six Flags : 353
Google Trips – Travel planner : 329
Voyages-sncf.com : book train and bus tickets : 268
Trainline UK: Live Tra

Like the social apps, the travel genres are dominated by major players, including Google, Uber, Lyft, TripAdvisor, etc. While a unique approach to travel could be a worthwhile app to pursue, it would be challenging to break into this genre.

In [32]:
#Examine the apps in the finance genres on both platforms

for app in ios_final:
    if app[-5] == 'Finance':
        print(app[1], ':', app[5])
        
print('\n')

for app in android_final:
    if app[1] == 'FINANCE' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

Chase Mobile℠ : 233270
Mint: Personal Finance, Budget, Bills & Money : 232940
Bank of America - Mobile Banking : 119773
PayPal - Send and request money safely : 119487
Credit Karma: Free Credit Scores, Reports & Alerts : 101679
Capital One Mobile : 56110
Citi Mobile® : 48822
Wells Fargo Mobile : 43064
Chase Mobile : 34322
Square Cash - Send Money for Free : 23775
Capital One for iPad : 21858
Venmo : 21090
USAA Mobile : 19946
TaxCaster – Free tax refund calculator : 17516
Amex Mobile : 11421
TurboTax Tax Return App - File 2016 income taxes : 9635
Bank of America - Mobile Banking for iPad : 7569
Wells Fargo for iPad : 2207
Stash Invest: Investing & Financial Education : 1655
Digit: Save Money Without Thinking About It : 1506
IRS2Go : 1329
Capital One CreditWise - Credit score and report : 1019
U by BB&T : 790
Paribus - Rebates When Prices Drop : 768
KeyBank Mobile : 623
VyStar Mobile Banking for iPhone : 434
Sparkasse - Your mobile branch : 77
VyStar Mobile Banking for iPad : 57
Zaim : 4

Within the finance genres, Google Pay is the only dominant app on the Android platform. On iOS, there are a multitude of apps, but none is clearly dominant in the way we see in the social genres. Given this, we recommend building an app in the finance genre. These are common apps on both platforms and rank high on popularity. While Google Pay 