# App Popularity and Development Opportunities for iOS and Android
### By Ben Avilez

Our goal for this project is to analyze data to help our developers understand what types of mobile apps are likely to attract more users. We'll determine which app genres are the most popular and which are prone to disruption, based on their user base. Since our revenue is generated by in-app ads (and not app sales), we'll assume that the greater the user base, the more revenue the app generates.

We'll start by taking a few samples of both our supplied iOS and Android datasets. To make them easier to explore, we created a function to print rows in a readable way.

In [1]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [2]:
from csv import reader

# iOS Apple Data Set
# Assumes the CSV files are UTF-8 encoded; an explicit encoding avoids
# platform-dependent decoding errors, and `with` closes the file after reading
with open('AppleStore.csv', encoding='utf8') as ios_data_set:
    ios_list = list(reader(ios_data_set))
ios_header = ios_list[0]
ios_data = ios_list[1:]

# Android Data Set
with open('googleplaystore.csv', encoding='utf8') as android_data_set:
    android_list = list(reader(android_data_set))
android_header = android_list[0]
android_data = android_list[1:]

## Sample Rows

In [3]:
# Sample Rows of iOS Dataset
print("Sample iOS rows: \n")
explore_data(ios_data, 0, 4)

# Sample Rows of Android Dataset
print("Sample Android rows: \n")
explore_data(android_data, 0, 4)

Sample iOS rows: 

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


Sample Android rows: 

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live C

Next, we'll count the rows and columns in each dataset.

## Dataset Counts

In [4]:
# Row and Column Count of iOS Dataset
print("iOS Dataset:")
explore_data(ios_data, 0, 0, True)

# Row and Column Count of Android Dataset
print("\nAndroid Dataset:")
explore_data(android_data, 0, 0, True)

iOS Dataset:
Number of rows: 7197
Number of columns: 16

Android Dataset:
Number of rows: 10841
Number of columns: 13


We'll take a look at the features of both datasets to see if we can find any useful common threads.

## Header Names

In [5]:
# Print iOS Header Row
print("iOS Header Row:")
explore_data(ios_header, 0, len(ios_header))

# Print Android Header Row
print("Android Header Row:")
explore_data(android_header, 0, len(android_header))

iOS Header Row:
id


track_name


size_bytes


currency


price


rating_count_tot


rating_count_ver


user_rating


user_rating_ver


ver


cont_rating


prime_genre


sup_devices.num


ipadSc_urls.num


lang.num


vpp_lic


Android Header Row:
App


Category


Rating


Reviews


Size


Installs


Type


Price


Content Rating


Genres


Last Updated


Current Ver


Android Ver




The key columns we'll look at are the genre/category columns on both platforms, along with the rating and review counts (rating_count_tot on iOS and Reviews on Android). We'll also consider the Installs column for Android.

Before we move any further, we'll validate our datasets and check for duplicates.

## Checking for Duplicates

In [6]:
for app in ios_data:
    name = app[1]
    if name == "Instagram":
        print(app)

['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


In [7]:
for app in android_data:
    name = app[0]
    if name == "Instagram":
        print(app)

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']


The Android dataset contains multiple entries for some apps. To clean the data, we'll keep only one entry per app: the one with the highest number of reviews, which is presumably the most recent and therefore the most accurate.

A review count may be abbreviated with a trailing 'M' for millions, so we'll convert any such value to a full number by multiplying it by 1,000,000.

In [8]:
duplicate_android_apps = []
unique_android_apps = []

for app in android_data:
    name = app[0]
    if name in unique_android_apps:
        duplicate_android_apps.append(name)
    else:
        unique_android_apps.append(name)

print('Number of duplicate apps:', len(duplicate_android_apps))
print('\n')
print('Examples of duplicate apps:', duplicate_android_apps[:15])

Number of duplicate apps: 1181


Examples of duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack', 'FreshBooks Classic', 'Insightly CRM', 'QuickBooks Accounting: Invoicing & Expenses', 'HipChat - Chat Built for Teams', 'Xero Accounting Software']


In [9]:
print('Expected length:', len(android_data) - len(duplicate_android_apps))

Expected length: 9660


In [10]:
reviews_max = {}
for row in android_data:
    name = row[0]
    if 'M' in row[3]:
        n_reviews = float(row[3][:-1]) * 1000000
    else:
        n_reviews = float(row[3])
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
    if name not in reviews_max:
        reviews_max[name] = n_reviews
        
len(reviews_max)
    

9660

In [11]:
android_clean = []
already_added = []

for row in android_data:
    name = row[0]
    # Review counts abbreviated with 'M' are expressed in millions
    if 'M' in row[3]:
        n_reviews = float(row[3][:-1]) * 1000000
    else:
        n_reviews = float(row[3])
    # Keep only the first row that matches the app's maximum review count
    if n_reviews == reviews_max[name] and name not in already_added:
        android_clean.append(row)
        already_added.append(name)
        
len(android_clean)

9660
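
The same de-duplication can also be done in a single pass by remembering, for each app name, the row with the largest review count seen so far. The following is only a minimal sketch on made-up sample rows. The helper names `parse_reviews` and `keep_most_reviewed` are hypothetical, and it assumes an abbreviated count always ends in 'M'.

```python
# Hypothetical sample rows: [name, review count as text]; not taken from the dataset
sample_rows = [
    ['App A', '100'],
    ['App B', '2.5M'],
    ['App A', '150'],
    ['App A', '150'],
    ['App B', '900'],
]

def parse_reviews(text):
    """Convert a review count such as '150' or '2.5M' to a float."""
    if text.endswith('M'):
        return float(text[:-1]) * 1_000_000
    return float(text)

def keep_most_reviewed(rows, name_index=0, reviews_index=1):
    """Return one row per app name: the first row with the highest review count."""
    best = {}
    for row in rows:
        name = row[name_index]
        n_reviews = parse_reviews(row[reviews_index])
        # Strict '>' keeps the earliest row when review counts tie
        if name not in best or n_reviews > best[name][0]:
            best[name] = (n_reviews, row)
    return [row for _, row in best.values()]

deduplicated = keep_most_reviewed(sample_rows)
print(deduplicated)
```

Unlike the two-step approach above, this version returns apps in order of first appearance, so row order may differ even though the set of kept rows is the same.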

## Focusing on Free and English Apps Only

We've created a function here to determine whether an app's name is in English. Since we're only developing for English-speaking users, we'll use it to remove non-English apps from the datasets. Some English app names include special characters or emojis, so we allow up to 2 characters outside the ASCII range (code points above 127) in a title; a third such character marks the app as non-English.

In [49]:
def is_english(string):
    
    count = 0
    
    for character in string:
        if ord(character) > 127:
            count += 1
            if count >= 3:
                return False
    
    return True

print(is_english('Instagram'))
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_english('Instachat 😜'))

True
True
False
True
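
To make the threshold explicit, the check can be split into a counting step and a comparison. This is a sketch rather than the function used in the rest of the analysis. `count_non_ascii` and `looks_english` are hypothetical names, and the default of 2 is chosen to match the `count >= 3` test in `is_english`.

```python
def count_non_ascii(text):
    """Count characters whose code point falls outside the ASCII range (0-127)."""
    return sum(1 for character in text if ord(character) > 127)

def looks_english(text, max_non_ascii=2):
    """Hypothetical variant of is_english with an explicit, adjustable threshold."""
    return count_non_ascii(text) <= max_non_ascii

print(count_non_ascii('Docs To Go™ Free Office Suite'))
print(looks_english('Instachat 😜'))
print(looks_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
```

Raising `max_non_ascii` keeps more titles with decorative symbols, at the cost of letting through more non-English names.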


We further narrow down our dataset to free and English apps.

In [50]:
# Free and English apps only
ios_free_english = []

for row in ios_data:
    if float(row[4]) == 0.0 and is_english(row[1]):
        ios_free_english.append(row)
        
print("Free and English iOS apps: " + str(len(ios_free_english)))

android_free_english = []

for row in android_clean:
    if row[6] == 'Free' and is_english(row[0]):
        android_free_english.append(row)
        
print("Free and English Android apps: " + str(len(android_free_english)))

Free and English iOS apps: 3203
Free and English Android apps: 8847


## Common App Categories in iOS and Android

We want to build our initial app for the Google Play store and then, if it proves successful, bring it to the App Store. To see what the competitive landscape looks like, we'll now look at the most common genres on both platforms.

We'll start by counting the apps per genre in a dictionary.

In [51]:
ios_common = {}

for row in ios_free_english:
    if row[11] not in ios_common:
        ios_common[row[11]] = 1
    else:
        ios_common[row[11]] += 1
        
print("Common iOS apps: " + str(ios_common))

android_common = {}

for row in android_free_english:
    if row[9] not in android_common:
        android_common[row[9]] = 1
    else:
        android_common[row[9]] += 1
        
print("\nCommon Android apps: " + str(android_common))

Common iOS apps: {'Social Networking': 106, 'Photo & Video': 160, 'Games': 1866, 'Music': 66, 'Reference': 17, 'Health & Fitness': 65, 'Weather': 28, 'Utilities': 79, 'Travel': 40, 'Shopping': 83, 'News': 43, 'Navigation': 6, 'Lifestyle': 50, 'Entertainment': 251, 'Food & Drink': 26, 'Sports': 69, 'Book': 12, 'Finance': 35, 'Education': 118, 'Productivity': 56, 'Business': 17, 'Catalogs': 4, 'Medical': 6}

Common Android apps: {'Art & Design': 53, 'Art & Design;Creativity': 6, 'Auto & Vehicles': 82, 'Beauty': 53, 'Books & Reference': 189, 'Business': 407, 'Comics': 53, 'Comics;Creativity': 1, 'Communication': 286, 'Dating': 165, 'Education': 474, 'Education;Creativity': 4, 'Education;Education': 30, 'Education;Pretend Play': 5, 'Education;Brain Games': 3, 'Entertainment': 538, 'Entertainment;Brain Games': 7, 'Entertainment;Creativity': 3, 'Entertainment;Music & Video': 15, 'Events': 63, 'Finance': 328, 'Food & Drink': 110, 'Health & Fitness': 273, 'House & Home': 71, 'Libraries & Demo'

Below, we define two functions: one builds a frequency table that gives each category's share of apps as a percentage, and the other prints that table in descending order.

In [52]:
def freq_table(dataset, index):
    table = {}
    total = 0
    
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1
    
    table_percentages = {}
    for key in table:
        percentage = (table[key] / total) * 100
        table_percentages[key] = percentage 
    
    return table_percentages

def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    count = 1
    for entry in table_sorted:
        print(str(count) + '.', entry[1], ':', entry[0])
        count += 1
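
For comparison, the same percentages can be computed with `collections.Counter` from the standard library. This is a minimal sketch on made-up rows, and `freq_table_counter` is a hypothetical name rather than a function used elsewhere in this analysis.

```python
from collections import Counter

def freq_table_counter(dataset, index):
    """Return {value: percentage of rows} for the column at `index`."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    return {value: count / total * 100 for value, count in counts.items()}

# Hypothetical rows for illustration only: [name, genre]
sample = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
percentages = freq_table_counter(sample, 1)

# Print the genres from most to least common
for genre, pct in sorted(percentages.items(), key=lambda item: item[1], reverse=True):
    print(genre, ':', pct)
```
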

In [53]:
display_table(ios_free_english, -5)

1. Games : 58.25788323446769
2. Entertainment : 7.836403371838902
3. Photo & Video : 4.995316890415236
4. Education : 3.6840462066812365
5. Social Networking : 3.3093974399000934
6. Shopping : 2.5913206369029034
7. Utilities : 2.466437714642523
8. Sports : 2.1542304089915705
9. Music : 2.0605682172962845
10. Health & Fitness : 2.0293474867311896
11. Productivity : 1.7483609116453322
12. Lifestyle : 1.5610365282547611
13. News : 1.3424914142990947
14. Travel : 1.248829222603809
15. Finance : 1.0927255697783327
16. Weather : 0.8741804558226661
17. Food & Drink : 0.8117389946924758
18. Reference : 0.5307524196066188
19. Business : 0.5307524196066188
20. Book : 0.3746487667811427
21. Navigation : 0.18732438339057134
22. Medical : 0.18732438339057134
23. Catalogs : 0.1248829222603809


Games are by far the most common type of iOS app, at about 58%. The runner-up is Entertainment at about 8%, and no other genre accounts for more than 5%. The most common apps seem to be geared toward entertainment rather than practical purposes. That said, Games being the most common type of app doesn't imply that it has the largest user base. It could be that these apps are easier and more gratifying to design. Likewise, the competition could be tougher too.

We'll explore the Android apps next:

In [54]:
display_table(android_free_english, 1)

1. FAMILY : 18.932971628800725
2. GAME : 9.698202780603594
3. TOOLS : 8.45484344975698
4. BUSINESS : 4.600429524132474
5. PRODUCTIVITY : 3.8996269922007465
6. LIFESTYLE : 3.888323725556686
7. FINANCE : 3.7074714592517237
8. MEDICAL : 3.537922459590822
9. SPORTS : 3.39097999321804
10. PERSONALIZATION : 3.3231603933536795
11. COMMUNICATION : 3.2327342602011986
12. HEALTH_AND_FITNESS : 3.0857917938284163
13. PHOTOGRAPHY : 2.950152594099695
14. NEWS_AND_MAGAZINES : 2.803210127726913
15. SOCIAL : 2.6675709279981916
16. TRAVEL_AND_LOCAL : 2.3397761953204474
17. SHOPPING : 2.2493500621679665
18. BOOKS_AND_REFERENCE : 2.136317395727365
19. DATING : 1.8650389962699219
20. VIDEO_PLAYERS : 1.797219396405561
21. MAPS_AND_NAVIGATION : 1.3903017972193965
22. FOOD_AND_DRINK : 1.2433593308466147
23. EDUCATION : 1.1642364643381937
24. ENTERTAINMENT : 0.9607776647451114
25. LIBRARIES_AND_DEMO : 0.938171131456991
26. AUTO_AND_VEHICLES : 0.9268678648129309
27. HOUSE_AND_HOME : 0.8025319317282694
28. WEATH

The most common categories for Android are Family, Game, and Tools. While no single category dominates the way Games does on iOS, the top 20% of categories account for roughly 50% of all apps.
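
The concentration in the top categories can be checked with a running total. The sketch below copies the top seven percentages from the output above. Assuming the 33 categories that appear in the later median tables, the top 20% corresponds to between six and seven categories.

```python
# Percentages copied from the display_table output above (top seven categories)
top_shares = [
    ('FAMILY', 18.932971628800725),
    ('GAME', 9.698202780603594),
    ('TOOLS', 8.45484344975698),
    ('BUSINESS', 4.600429524132474),
    ('PRODUCTIVITY', 3.8996269922007465),
    ('LIFESTYLE', 3.888323725556686),
    ('FINANCE', 3.7074714592517237),
]

# Accumulate the share of apps covered as each category is added
running_total = 0.0
cumulative = []
for category, share in top_shares:
    running_total += share
    cumulative.append((category, round(running_total, 2)))

for category, total in cumulative:
    print(category, ':', total)
```

The running total passes 49% at the sixth category and 53% at the seventh, which is where the figure of roughly 50% comes from.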

The Android dataset provides an additional column for subgenres that could be useful too.

In [55]:
display_table(android_free_english, 9)

1. Tools : 8.44354018311292
2. Entertainment : 6.081157454504352
3. Education : 5.357748389284503
4. Business : 4.600429524132474
5. Productivity : 3.8996269922007465
6. Lifestyle : 3.8770204589126256
7. Finance : 3.7074714592517237
8. Medical : 3.537922459590822
9. Sports : 3.458799593082401
10. Personalization : 3.3231603933536795
11. Communication : 3.2327342602011986
12. Action : 3.0970950604724763
13. Health & Fitness : 3.0857917938284163
14. Photography : 2.950152594099695
15. News & Magazines : 2.803210127726913
16. Social : 2.6675709279981916
17. Travel & Local : 2.3284729286763874
18. Shopping : 2.2493500621679665
19. Books & Reference : 2.136317395727365
20. Simulation : 2.045891262574884
21. Dating : 1.8650389962699219
22. Arcade : 1.8424324629818016
23. Video Players & Editors : 1.7746128631174407
24. Casual : 1.763309596473381
25. Maps & Navigation : 1.3903017972193965
26. Food & Drink : 1.2433593308466147
27. Puzzle : 1.1303266644060133
28. Racing : 0.9946874646772917
29.

Family and Game no longer appear in this list, so they must have been broken down into subgenres. Tools is still present. Entertainment and Education, which rank second and third here, were also in the top 4 of the iOS genres.

Now that we have a sense of the most common apps out there, we'll take a look at the average number of ratings per genre.

## Average Number of Ratings by Category
### iOS

In [19]:
ios_reviews = freq_table(ios_free_english, -5)

ios_review_display = []

for genre in ios_reviews:
    total = 0
    len_genre = 0
    for row in ios_free_english:
        genre_app = row[-5]
        if genre_app == genre:
            ratings = float(row[5])
            total += ratings
            len_genre += 1
    average_user_ratings = total / len_genre
    ios_review_display.append((average_user_ratings, genre))
    
table_sorted = sorted(ios_review_display, reverse = True)

count = 1
for entry in table_sorted:
    print(str(count) + '.', entry[1], ':', entry[0])
    count += 1

1. Navigation : 86090.33333333333
2. Reference : 79350.4705882353
3. Social Networking : 71548.34905660378
4. Music : 57326.530303030304
5. Weather : 52279.892857142855
6. Book : 46384.916666666664
7. Food & Drink : 33333.92307692308
8. Finance : 32367.02857142857
9. Photo & Video : 28441.54375
10. Travel : 28243.8
11. Shopping : 27230.734939759037
12. Health & Fitness : 23298.015384615384
13. Sports : 23008.898550724636
14. Games : 22886.36709539121
15. News : 21248.023255813954
16. Productivity : 21028.410714285714
17. Utilities : 19156.493670886077
18. Lifestyle : 16815.48
19. Entertainment : 14195.358565737051
20. Business : 7491.117647058823
21. Education : 7003.983050847458
22. Catalogs : 4004.0
23. Medical : 612.0


Interestingly, although Games and Entertainment account for the most iOS apps, their average number of ratings falls in the bottom half. Navigation, Reference, and Social Networking make up the top three.
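
The per-genre averages above rescan the whole dataset once for each genre. As an alternative, the totals and counts can be collected in a single pass. This is a sketch on made-up rows, and `average_by_group` is a hypothetical helper, not part of the analysis code.

```python
from collections import defaultdict

def average_by_group(rows, group_index, value_index):
    """Return {group: mean of the values at value_index} in one pass over rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        group = row[group_index]
        totals[group] += float(row[value_index])
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}

# Hypothetical rows for illustration only: [name, genre, rating count]
sample = [
    ['a', 'Games', '100'],
    ['b', 'Games', '300'],
    ['c', 'Music', '50'],
]
averages = average_by_group(sample, 1, 2)
print(averages)
```
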

### Android

In [20]:
android_reviews = freq_table(android_free_english, 1)

android_review_display = []

for genre in android_reviews:
    total = 0
    len_category = 0
    for row in android_free_english:
        category_app = row[1]
        if category_app == genre:
            ratings = int(row[3])
            total += ratings
            len_category += 1
    average_user_ratings = total / len_category
    android_review_display.append((average_user_ratings, genre))
    
table_sorted = sorted(android_review_display, reverse = True)
count = 1
for entry in table_sorted:
    print(str(count) + '.', entry[1], ':', entry[0])
    count += 1

1. COMMUNICATION : 999089.6118881119
2. SOCIAL : 965830.9872881356
3. GAME : 684290.0629370629
4. VIDEO_PLAYERS : 425350.08176100627
5. PHOTOGRAPHY : 404081.3754789272
6. TOOLS : 306550.3034759358
7. ENTERTAINMENT : 301752.24705882353
8. SHOPPING : 223887.34673366835
9. PERSONALIZATION : 181122.31632653062
10. WEATHER : 173679.5285714286
11. PRODUCTIVITY : 160634.5420289855
12. MAPS_AND_NAVIGATION : 143611.27642276423
13. TRAVEL_AND_LOCAL : 129484.42512077295
14. SPORTS : 117317.25666666667
15. FAMILY : 113210.54626865672
16. NEWS_AND_MAGAZINES : 93088.03225806452
17. BOOKS_AND_REFERENCE : 88460.62962962964
18. HEALTH_AND_FITNESS : 78094.9706959707
19. FOOD_AND_DRINK : 57478.79090909091
20. EDUCATION : 56293.09708737864
21. COMICS : 43371.57407407407
22. FINANCE : 38535.8993902439
23. LIFESTYLE : 34118.90406976744
24. HOUSE_AND_HOME : 27113.309859154928
25. ART_AND_DESIGN : 24699.42105263158
26. BUSINESS : 24239.727272727272
27. DATING : 21953.272727272728
28. PARENTING : 16378.7068965

Game is the only category that ranks in the top 3 both for share of apps and for average number of reviews. Communication and Social take the top two slots here even though neither is in the top 20% of categories by number of apps.

That said, a few apps could be skewing the numbers we're seeing. Ubiquitous apps like Facebook, Instagram, and Zoom will inevitably pull the average number of reviews up. We'll now revisit these lists using the median for each category instead.
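
A toy example shows how strongly one very large app can pull the mean while leaving the median almost untouched. The review counts below are invented for illustration.

```python
import statistics

# Hypothetical review counts: three small apps and one very large one
review_counts = [100, 200, 300, 1_000_000]

mean_reviews = statistics.mean(review_counts)
median_reviews = statistics.median(review_counts)

print('Mean:', mean_reviews)
print('Median:', median_reviews)
```

Here the mean is in the hundreds of thousands, while the median stays close to what a typical app in the list receives.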

## Median Number of Ratings by Category
### iOS

In [26]:
from collections import defaultdict
import statistics

ios_reviews = freq_table(ios_free_english, -5)

ios_review_dict = defaultdict(list)
ios_review_display = []

for genre in ios_reviews:
    for row in ios_free_english:
        genre_app = row[-5]
        if genre_app == genre:
            ios_review_dict[genre_app].append(float(row[5]))
    median_user_ratings = statistics.median(ios_review_dict[genre])
    ios_review_display.append((median_user_ratings, genre))
    
table_sorted = sorted(ios_review_display, reverse = True)
count = 1
for entry in table_sorted:
    print(str(count) + '.', entry[1], ':', entry[0])
    count += 1

1. Productivity : 8737.5
2. Reference : 8535.0
3. Navigation : 8196.5
4. Shopping : 6408.0
5. Social Networking : 4199.0
6. Music : 3850.0
7. Health & Fitness : 2459.0
8. Finance : 2207.0
9. Photo & Video : 2206.0
10. Sports : 1628.0
11. Food & Drink : 1490.5
12. Utilities : 1341.0
13. Catalogs : 1229.0
14. Entertainment : 1205.0
15. Lifestyle : 1183.0
16. Business : 1150.0
17. Games : 913.5
18. Travel : 798.5
19. Book : 665.0
20. Education : 606.5
21. Medical : 566.5
22. News : 373.0
23. Weather : 289.0


As expected, the medians are dramatically lower than the averages. Interestingly, Reference and Navigation still sit near the top.

### Android

In [28]:
android_reviews = freq_table(android_free_english, 1)

android_review_dict = defaultdict(list)
android_review_display = []

for genre in android_reviews:
    for row in android_free_english:
        genre_app = row[1]
        if genre_app == genre:
            android_review_dict[genre_app].append(float(row[3]))
    median_user_ratings = statistics.median(android_review_dict[genre])
    android_review_display.append((median_user_ratings, genre))
    
table_sorted = sorted(android_review_display, reverse = True)
count = 1
for entry in table_sorted:
    print(str(count) + '.', entry[1], ':', entry[0])
    count += 1

1. GAME : 35371.5
2. ENTERTAINMENT : 35279.0
3. PHOTOGRAPHY : 31985.0
4. EDUCATION : 13612.0
5. SHOPPING : 13085.0
6. WEATHER : 11403.5
7. COMMUNICATION : 6527.5
8. VIDEO_PLAYERS : 5555.0
9. HEALTH_AND_FITNESS : 3908.0
10. SOCIAL : 3884.0
11. FOOD_AND_DRINK : 3779.0
12. HOUSE_AND_HOME : 3522.0
13. TRAVEL_AND_LOCAL : 2277.0
14. PRODUCTIVITY : 2131.0
15. SPORTS : 1942.5
16. COMICS : 1844.5
17. MAPS_AND_NAVIGATION : 1688.0
18. FAMILY : 879.0
19. TOOLS : 659.0
20. NEWS_AND_MAGAZINES : 656.5
21. PERSONALIZATION : 652.0
22. PARENTING : 528.5
23. ART_AND_DESIGN : 486.0
24. DATING : 478.0
25. FINANCE : 467.5
26. AUTO_AND_VEHICLES : 352.0
27. BOOKS_AND_REFERENCE : 314.0
28. BEAUTY : 187.0
29. LIFESTYLE : 180.5
30. LIBRARIES_AND_DEMO : 131.0
31. EVENTS : 48.0
32. MEDICAL : 22.0
33. BUSINESS : 15.0


Game, Entertainment, and Photography make up the top 3 for Android. The closest match between the iOS and Android rankings is Shopping, which placed in the top 5 on each. While a shopping app does seem like a great opportunity, it would likely require more supporting infrastructure to be built first.

The next closest matches are Health & Fitness and Photography (Photo & Video on iOS). Both finish in the top 9 for median reviews on each platform.

Between these two, my recommendation would be to create a Photography app. According to the frequency tables, it accounts for only about 5% of apps on iOS and 3% on Android (Health & Fitness accounts for about 2% and 3%), so the competition is relatively low. Photography apps are undeniably popular on Android, ranking in the top 3. Although they aren't quite as popular on iOS, their median number of ratings is still about a quarter of the top genre's, so the falloff isn't steep. If the app gains a foothold in the Android market, it's reasonable to expect that the same could happen for the iOS app later. Health & Fitness is only mildly more popular on iOS and much less popular on Android, giving it less of an opportunity to succeed in its initial market.
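
The comparison with the top genre is simple arithmetic on the median tables. The sketch below copies the relevant medians from the outputs above; the variable names are only illustrative.

```python
# Median rating counts copied from the median tables above
ios_photo_video = 2206.0
ios_top_genre = 8737.5          # Productivity
android_photography = 31985.0
android_top_category = 35371.5  # GAME

# Express each Photography median as a fraction of the platform's top median
ios_ratio = ios_photo_video / ios_top_genre
android_ratio = android_photography / android_top_category

print('iOS ratio:', round(ios_ratio, 2))
print('Android ratio:', round(android_ratio, 2))
```
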

One last measure we wanted to look at is the number of app installations by category. Only the Android dataset provides this number. Even then, the numbers are imprecise, since only ranges (such as 10,000+) are provided. We'll look at the average first and then the median.
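
Because each Installs value is a range such as 10,000+, the conversion to a number keeps only the lower bound. A minimal sketch of that conversion follows; `parse_installs` is a hypothetical helper that performs the same string replacements as the cells below.

```python
def parse_installs(text):
    """Convert an install range such as '1,000,000+' to its lower bound as an int."""
    return int(text.replace('+', '').replace(',', ''))

print(parse_installs('10,000+'))
print(parse_installs('1,000,000,000+'))
```

Since every app is counted at the bottom of its range, the averages and medians that follow should be read as lower bounds.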

## Average Number of Downloads by Category
### Android

In [44]:
android_reviews = freq_table(android_free_english, 1)

android_review_display = []

for genre in android_reviews:
    total = 0
    len_category = 0
    for row in android_free_english:
        category_app = row[1]
        if category_app == genre:
            n_installs = row[5]
            n_installs = n_installs.replace('+', '')
            n_installs = int(n_installs.replace(',', ''))
            total += n_installs
            len_category += 1
    average_installs = total / len_category
    android_review_display.append((average_installs, genre))
    
table_sorted = sorted(android_review_display, reverse = True)
count = 1
for entry in table_sorted:
    print(str(count) + '.', entry[1], ':', entry[0])
    count += 1

1. COMMUNICATION : 38590581.08741259
2. VIDEO_PLAYERS : 24727872.452830188
3. SOCIAL : 23253652.127118643
4. PHOTOGRAPHY : 17840110.40229885
5. PRODUCTIVITY : 16787331.344927534
6. GAME : 15544014.51048951
7. TRAVEL_AND_LOCAL : 13984077.710144928
8. ENTERTAINMENT : 11640705.88235294
9. TOOLS : 10830251.970588235
10. NEWS_AND_MAGAZINES : 9549178.467741935
11. BOOKS_AND_REFERENCE : 8814199.78835979
12. SHOPPING : 7036877.311557789
13. PERSONALIZATION : 5201482.6122448975
14. WEATHER : 5145550.285714285
15. HEALTH_AND_FITNESS : 4188821.9853479853
16. MAPS_AND_NAVIGATION : 4049274.6341463416
17. FAMILY : 3697848.1731343283
18. SPORTS : 3650602.276666667
19. ART_AND_DESIGN : 1986335.0877192982
20. FOOD_AND_DRINK : 1924897.7363636363
21. EDUCATION : 1833495.145631068
22. BUSINESS : 1712290.1474201474
23. LIFESTYLE : 1446158.2238372094
24. FINANCE : 1387692.475609756
25. HOUSE_AND_HOME : 1360598.042253521
26. DATING : 854028.8303030303
27. COMICS : 832613.8888888889
28. AUTO_AND_VEHICLES : 64

## Median Number of Downloads by Category
### Android

In [48]:
android_reviews = freq_table(android_free_english, 1)

android_install_dict = defaultdict(list)
android_review_display = []

for genre in android_reviews:
    for row in android_free_english:
        genre_app = row[1]
        if genre_app == genre:
            n_installs = row[5]
            n_installs = n_installs.replace('+', '')
            n_installs = int(n_installs.replace(',', ''))
            android_install_dict[genre_app].append(n_installs)
    median_installs = int(statistics.median(android_install_dict[genre]))
    android_review_display.append((median_installs, genre))
    
table_sorted = sorted(android_review_display, reverse = True)
count = 1
for entry in table_sorted:
    print(str(count) + '.', entry[1], ':', entry[0])
    count += 1

1. WEATHER : 1000000
2. VIDEO_PLAYERS : 1000000
3. SHOPPING : 1000000
4. PHOTOGRAPHY : 1000000
5. GAME : 1000000
6. ENTERTAINMENT : 1000000
7. EDUCATION : 1000000
8. HOUSE_AND_HOME : 500000
9. HEALTH_AND_FITNESS : 500000
10. FOOD_AND_DRINK : 500000
11. COMMUNICATION : 500000
12. TRAVEL_AND_LOCAL : 100000
13. TOOLS : 100000
14. SPORTS : 100000
15. SOCIAL : 100000
16. PRODUCTIVITY : 100000
17. PERSONALIZATION : 100000
18. PARENTING : 100000
19. MAPS_AND_NAVIGATION : 100000
20. FAMILY : 100000
21. COMICS : 100000
22. AUTO_AND_VEHICLES : 100000
23. ART_AND_DESIGN : 100000
24. NEWS_AND_MAGAZINES : 50000
25. BOOKS_AND_REFERENCE : 50000
26. BEAUTY : 50000
27. LIFESTYLE : 10000
28. LIBRARIES_AND_DEMO : 10000
29. FINANCE : 10000
30. DATING : 10000
31. MEDICAL : 1000
32. EVENTS : 1000
33. BUSINESS : 1000


Encouragingly, Photography landed in the top 4 for both average and median downloads. Health & Fitness only placed in the top half.

To conclude, a Photography app would be the best app to develop at this point in time. Competition is fairly low, no additional infrastructure needs to be developed to get it off the ground, and users engage heavily with these apps, as evidenced by the high number of reviews on both platforms and the number of downloads on Android.