# Tamco Apps

This project demonstrates my data analytics skills by analyzing data from CSV files about apps on Google Play and the App Store.

It imitates an analysis for a company that only builds free apps and earns its revenue from ads. The results are meant to help developers understand what types of apps are likely to attract customers.

In [1]:
from csv import reader

#Apple Store csv file
opened_file = open('AppleStore.csv', encoding='utf8')
read_file = reader(opened_file)
apple = list(read_file)
apple_header = apple[0]
apple = apple[1:]

#Google Play store
opened_file = open('googleplaystore.csv', encoding='utf8')
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]
android = android[1:]
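
Both files are loaded with the same steps, so the pattern could be wrapped in one helper. The following is only a sketch, assuming the files are UTF-8 encoded; `split_header` and `load_csv` are hypothetical names that do not appear in the original notebook, and the demonstration rows are made up.

```python
from csv import reader


def split_header(lines):
    """Parse CSV text lines and return (header, data_rows)."""
    rows = list(reader(lines))
    return rows[0], rows[1:]


def load_csv(path, encoding='utf8'):
    """Open a CSV file, parse it, and close the file again."""
    # newline='' is what the csv module documentation recommends for files.
    with open(path, encoding=encoding, newline='') as opened_file:
        return split_header(opened_file)


# Small in-memory demonstration (made-up rows, not the real data):
sample = ['App,Category', 'Example App,TOOLS']
demo_header, demo_rows = split_header(sample)
print(demo_header)
print(demo_rows)

# With the real files this would be:
# apple_header, apple = load_csv('AppleStore.csv')
# android_header, android = load_csv('googleplaystore.csv')
```

Using `with` closes each file as soon as it has been read, which the cell above leaves to the interpreter.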

In [2]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n') 
        
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [3]:
print(apple_header)
print('\n')
explore_data(apple, 0, 3, True)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0', '2974676', '212', '3.5', '3.5', '95', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0', '2161558', '1289', '4.5', '4', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 7197
Number of columns: 16


In [4]:
print(android_header)
print('\n')
explore_data(android, 0, 3, True)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


Possible columns of interest for the Google Play store are:
- App: name of the app
- Category: type of app
- Rating: user rating of the app
- Installs: how many users there are
- Price: cost of the app
- Genres: app genre

Possible columns of interest for the App Store are:
- track_name: name of the app
- price: cost of the app
- user_rating: rating on the App Store
- prime_genre: app genre
- rating_count_tot: total ratings (closest thing to how many users there are)

# Data Cleaning

Below I will check the data for errors by printing every row whose length differs from the header's, together with its row number.

In [5]:
row_number = 0
for row in android:
    row_number += 1
    if len(row) != len(android_header):
        print(row)
        print(row_number)

# Deleting Data

If I needed to delete some of the entries I could use the following:

In [6]:
# del android[row_i_want_to_delete]
# only run it once
# del android[10472]

The original output of the check above indicated that there was an error in row 10473:

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']

It looks like the value for the "Category" column was missing. I found out the category it should be, "Lifestyle", and added it in.
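
The notebook does not show how the category was added, so the following is only a sketch of one way to repair the row in code instead of deleting it. `insert_missing_value` is a hypothetical helper, and writing the category as 'LIFESTYLE' in upper case is an assumption made to match the other Category values in the dataset.

```python
def insert_missing_value(row, index, value):
    """Return a copy of row with value inserted at position index."""
    fixed = list(row)
    fixed.insert(index, value)
    return fixed


android_header = ['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs',
                  'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated',
                  'Current Ver', 'Android Ver']
broken_row = ['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M',
              '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018',
              '1.0.19', '4.0 and up']

# Assumption: upper case, like the other Category values such as 'ART_AND_DESIGN'.
category_index = android_header.index('Category')
fixed_row = insert_missing_value(broken_row, category_index, 'LIFESTYLE')

print(len(broken_row), len(fixed_row))
print(fixed_row[:3])
```

After the insert the row has the same length as the header and every later value lines up with its column again. The Genres value stays empty, as it was in the original row.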

To check for duplicate apps, I will create two new lists, adding every duplicate name to one list and every unique name to the other. The number of entries in the duplicate list shows how many duplicates there are.

Also, since I know there are duplicates of Twitter in the list, I will print out all the Twitter entries from the original android list. Since these duplicates differ mainly in the number of reviews, I will keep the entry with the most reviews in the working dataset.

In [7]:
duplicate_apps = []
unique_apps = []

for app in android:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)

print("Number of duplicate apps: " + str(len(duplicate_apps)))
print(duplicate_apps[:5])
print('\n')

for app in android:
    name = app[0]
    if name == 'Twitter':
        print(app)

Number of duplicate apps: 1181
['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings']


['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11667403', 'Varies with device', '500,000,000+', 'Free', '0', 'Mature 17+', 'News & Magazines', 'August 6, 2018', 'Varies with device', 'Varies with device']
['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11667403', 'Varies with device', '500,000,000+', 'Free', '0', 'Mature 17+', 'News & Magazines', 'August 6, 2018', 'Varies with device', 'Varies with device']
['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11657972', 'Varies with device', '500,000,000+', 'Free', '0', 'Mature 17+', 'News & Magazines', 'July 30, 2018', 'Varies with device', 'Varies with device']


Below I will pick the most up-to-date entries by building a dictionary that maps each app name to its highest number of reviews. If an app has duplicates, the row with the most reviews is the one that is kept.

Then I will create a new, cleaned-up list of apps and store it in android_clean.

In [8]:
reviews_max = {}

for app in android:
    name = app[0]
    number_of_reviews = float(app[3])
    if name in reviews_max and reviews_max[name] < number_of_reviews:
        reviews_max[name] = number_of_reviews
    elif name not in reviews_max:
        reviews_max[name] = number_of_reviews

In [9]:
print(len(reviews_max))

9660


In [10]:
android_clean = []
android_dup = []

for app in android:
    name = app[0]
    number_of_reviews = float(app[3])
    if (number_of_reviews == reviews_max[name]) and (name not in android_dup):
        android_clean.append(app)
        android_dup.append(name)

In [11]:
explore_data(android_clean, 0, 3, True)
print(len(android) - len(android_clean))

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9660
Number of columns: 13
1181
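
The two-step approach above (first the dictionary of maximum reviews, then the filtered list) could also be done in a single pass. This is a sketch rather than the notebook's method: `keep_most_reviewed` is a hypothetical name, the sample rows are trimmed to four columns, and the 'Box' values are made up for illustration.

```python
def keep_most_reviewed(dataset, name_index=0, reviews_index=3):
    """Keep one row per app name: the row with the most reviews."""
    best_rows = {}
    for row in dataset:
        name = row[name_index]
        reviews = float(row[reviews_index])
        # Strict '>' keeps the first row when review counts tie.
        if (name not in best_rows
                or reviews > float(best_rows[name][reviews_index])):
            best_rows[name] = row
    return list(best_rows.values())


sample = [
    ['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11657972'],
    ['Box', 'BUSINESS', '4.2', '159872'],  # made-up values
    ['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11667403'],
    ['Twitter', 'NEWS_AND_MAGAZINES', '4.3', '11667403'],
]

deduplicated = keep_most_reviewed(sample)
for row in deduplicated:
    print(row)
```

One difference from the cells above: the result is ordered by the first appearance of each app name, not by the position of the row that was kept.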


# Removing non ASCII app names

Using the ord() function, we can get the number associated with each character. ASCII uses the numbers 0-127, so we can remove any apps whose names contain characters outside that range.

In [12]:
# def is_english(string):
#     for letter in string:
#             if ord(letter) > 127:
#                 return False
#     return True

# print(is_english('Instagram'))
# print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
# print(is_english('Docs To Go™ Free Office Suite'))
# print(is_english('Instachat 😜'))

The above function returns False even for apps that are English but have an unusual character in the name, which would cause the loss of useful data. To prevent this, we will put a limit on how many non-ASCII characters a name can contain.

In [13]:
def is_english(string):
    non_eng = 0
    for letter in string:
        if ord(letter) > 127:
            non_eng += 1
            
    if non_eng > 3:
        return False
    else:
        return True

print(is_english('Instachat 😜'))
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))

True
True
False


I set the limit to 3 characters, so the function now returns False only if an app name has more than 3 non-ASCII characters.
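
The limit of 3 is a judgment call, so it can be useful to make it a parameter and compare different values. The following is a sketch of the same check with a hypothetical `max_non_ascii` argument, which is not part of the original function.

```python
def is_english(string, max_non_ascii=3):
    """Return True if the string has at most max_non_ascii non-ASCII characters."""
    non_ascii = sum(1 for character in string if ord(character) > 127)
    return non_ascii <= max_non_ascii


names = [
    'Instachat 😜',
    'Docs To Go™ Free Office Suite',
    '爱奇艺PPS -《欢乐颂2》电视剧热播',
]

for name in names:
    # Compare the strict check (no non-ASCII characters) with the relaxed one.
    print(name, is_english(name, max_non_ascii=0), is_english(name))
```

With the default limit the results match the cell above, while `max_non_ascii=0` reproduces the stricter first version that rejected the emoji and the trademark sign.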

In [14]:
android_english = []
apple_english = []

for app in android_clean:
    name = app[0]
    if is_english(name):
        android_english.append(app)

for app in apple:
    name = app[1]
    if is_english(name):
        apple_english.append(app)

In [15]:
explore_data(android_english, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9615
Number of columns: 13


In [16]:
explore_data(apple_english, 0, 3, True)

['284882215', 'Facebook', '389879808', 'USD', '0', '2974676', '212', '3.5', '3.5', '95', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0', '2161558', '1289', '4.5', '4', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 6183
Number of columns: 16


The cells above create the new lists with the non-English apps removed.

# Removing Non-Free Apps

Since we are only looking at free apps, below I will remove all the non-free apps from both lists. I will loop through the apps in each list: for the android list I will check whether the Type column (index 6) says Free, and for the apple list I will check whether the price column (index 4) is 0.

In [17]:
android_free = []
apple_free = []

for row in android_english:
    price = row[6]
    if price == 'Free':
        android_free.append(row)
        
for row in apple_english:
    price = float(row[4])
    if price == 0:
        apple_free.append(row)

In [18]:
explore_data(apple_free, 0, 5, True)

['284882215', 'Facebook', '389879808', 'USD', '0', '2974676', '212', '3.5', '3.5', '95', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0', '2161558', '1289', '4.5', '4', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0', '1724546', '3842', '4.5', '4', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['284035177', 'Pandora - Music & Radio', '130242560', 'USD', '0', '1126879', '3594', '4', '4.5', '8.4.1', '12+', 'Music', '37', '4', '1', '1']


Number of rows: 3222
Number of columns: 16


In [19]:
explore_data(android_free, 0, 5, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design', 'June 20, 2018', '1.1', '4.4 and up']


['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1', '2.3 and up']


Number of rows: 8864
Number of columns: 13


# Which app to make

To figure out which app to make, we will find which genres are the most common by creating a frequency table. We want to find an app that will work in both markets.

I will create a frequency table that shows the percentage of apps in each genre for each list.

The frequency tables for android will be based on the Category and Genres columns, and for apple on prime_genre.

To make them easier to view, I will sort the tables in descending order.

In [20]:
def freq_table(dataset, index):
    table = {}
    total = 0
    
    for row in dataset:
        total += 1
        value = row[index]
        if value in table:
            table[value] += 1
        else:
            table[value] = 1

    for key in table:
        percent = (table[key] / total) * 100
        table[key] = percent
    return table

In [21]:
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
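
The same frequency table can be built with `collections.Counter` from the standard library. This is an alternative sketch, not the code used for the outputs below; `sorted_table` is a hypothetical helper and the sample rows are made up.

```python
from collections import Counter


def freq_table(dataset, index):
    """Map each value in the given column to its percentage of all rows."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    return {value: count / total * 100 for value, count in counts.items()}


def sorted_table(dataset, index):
    """Return (value, percentage) pairs, largest percentage first."""
    table = freq_table(dataset, index)
    return sorted(table.items(), key=lambda item: item[1], reverse=True)


# Made-up sample rows: [name, genre]
sample = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]

for genre, percent in sorted_table(sample, 1):
    print(genre, ':', percent)
```

Sorting with a `key` avoids building the intermediate list of (percentage, name) tuples. Values with equal percentages keep their first-seen order, whereas the tuple sort above orders ties by name.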

In [22]:
display_table(android_free, 1) # category
print('\n')
display_table(android_free, 9) # genre

FAMILY : 18.896660649819495
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.91471119133574
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 0.

In [23]:
display_table(apple_free, 11) # prime_genre

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


# Analyzing the tables

### Apple Store

For the App Store list, apple_free, the most common genre is Games, at 58.2%, and the second most common is Entertainment at 7.9%.

It looks like most of the apps on the App Store are made for entertainment. Based on this table, the best option for an App Store app seems to be a game.

### Google Play Store

In the Google Play store, there are two different columns describing the type of app: Category and Genres. This makes it a little harder to determine which app to make, since there is no clear differentiation between the two.

The most common values in the Category column are Family at 18.9%, Game at 9.7%, and Tools at 8.5%. The most common genres are Tools at 8.5%, Entertainment at 6.4%, and Education at 5.9%.

Instead of one broad "games" genre, the Genres column seems to break games down by type. Also, the Family apps look like they are still games, just aimed at kids.

It seems like the best app to make for android would be a game or tool.

### Summary

While these tables provide quite a bit of useful data to determine which app would be good to create, there is still some information that we are missing, like how many users use these apps. 

# Finding Total Users

For the apple store, I will look at the rating_count_tot column, and for the play store I will use Installs, to determine how many users there are for each app, and then determine how many users there are for apps in each genre. 



In [24]:
apple_genres = freq_table(apple_free, 11)

for genre in apple_genres:
    total = 0 # sum of ratings in each genre
    len_genre = 0 # number of apps in each genre
    for app in apple_free:
        genre_app = app[11]
        if genre_app == genre:
            user_ratings = float(app[5])
            total += user_ratings
            len_genre += 1

    avg_user_ratings = total / len_genre
    print(genre, ':', avg_user_ratings)

Social Networking : 71548.34905660378
Photo & Video : 28441.54375
Games : 22788.6696905016
Music : 57326.530303030304
Reference : 74942.11111111111
Health & Fitness : 23298.015384615384
Weather : 52279.892857142855
Utilities : 18684.456790123455
Travel : 28243.8
Shopping : 26919.690476190477
News : 21248.023255813954
Navigation : 86090.33333333333
Lifestyle : 16485.764705882353
Entertainment : 14029.830708661417
Food & Drink : 33333.92307692308
Sports : 23008.898550724636
Book : 39758.5
Finance : 31467.944444444445
Education : 7003.983050847458
Productivity : 21028.410714285714
Business : 7491.117647058823
Catalogs : 4004.0
Medical : 612.0
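
The nested loop above scans the whole app list once per genre. The averages can also be collected in a single pass over the data. The following is a sketch with a hypothetical `average_by_group` helper; the sample holds only four apps from the summaries in this notebook, trimmed to three columns, so its averages describe that subset and not the full genres.

```python
from collections import defaultdict


def average_by_group(dataset, group_index, value_index):
    """Average a numeric column per group in one pass over the rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in dataset:
        group = row[group_index]
        totals[group] += float(row[value_index])
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Subset of rows, trimmed to [track_name, prime_genre, rating_count_tot]
sample = [
    ['Waze - GPS Navigation, Maps & Real-time Traffic', 'Navigation', '345046'],
    ['Facebook', 'Social Networking', '2974676'],
    ['Google Maps - Navigation & Transit', 'Navigation', '154911'],
    ['Pinterest', 'Social Networking', '1061624'],
]

averages = average_by_group(sample, 1, 2)
for genre, average in averages.items():
    print(genre, ':', average)
```

For the full notebook data the call would be `average_by_group(apple_free, 11, 5)`.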


In [25]:
def app_summary(dataset, genre):
    app_count = 0

    for app in dataset:
        col = app[11]
        if app_count == 11:
            break
        elif col == genre:
            app_count += 1
            print(app[1], ';', app[11], ';', app[5])            

In [26]:
app_summary(apple_free, 'Navigation')

Waze - GPS Navigation, Maps & Real-time Traffic ; Navigation ; 345046
Google Maps - Navigation & Transit ; Navigation ; 154911
Geocaching® ; Navigation ; 12811
CoPilot GPS – Car Navigation & Offline Maps ; Navigation ; 3582
ImmobilienScout24: Real Estate Search in Germany ; Navigation ; 187
Railway Route Search ; Navigation ; 5


In [27]:
app_summary(apple_free, 'Social Networking')

Facebook ; Social Networking ; 2974676
Pinterest ; Social Networking ; 1061624
Skype for iPhone ; Social Networking ; 373519
Messenger ; Social Networking ; 351466
Tumblr ; Social Networking ; 334293
WhatsApp Messenger ; Social Networking ; 287589
Kik ; Social Networking ; 260965
ooVoo – Free Video Call, Text and Voice ; Social Networking ; 177501
TextNow - Unlimited Text + Calls ; Social Networking ; 164963
Viber Messenger – Text & Call ; Social Networking ; 164249
Followers - Social Analytics For Instagram ; Social Networking ; 112778


In [28]:
app_summary(apple_free, 'Music')

Pandora - Music & Radio ; Music ; 1126879
Spotify Music ; Music ; 878563
Shazam - Discover music, artists, videos & lyrics ; Music ; 402925
iHeartRadio – Free Music & Radio Stations ; Music ; 293228
SoundCloud - Music & Audio ; Music ; 135744
Magic Piano by Smule ; Music ; 131695
Smule Sing! ; Music ; 119316
TuneIn Radio - MLB NBA Audiobooks Podcasts Music ; Music ; 110420
Amazon Music ; Music ; 106235
SoundHound Song Search & Music Player ; Music ; 82602
Sonos Controller ; Music ; 48905


In [29]:
app_summary(apple_free, 'Weather')

The Weather Channel: Forecast, Radar & Alerts ; Weather ; 495626
The Weather Channel App for iPad – best local forecast, radar map, and storm tracking ; Weather ; 208648
WeatherBug - Local Weather, Radar, Maps, Alerts ; Weather ; 188583
MyRadar NOAA Weather Radar Forecast ; Weather ; 150158
AccuWeather - Weather for Life ; Weather ; 144214
Yahoo Weather ; Weather ; 112603
Weather Underground: Custom Forecast & Local Radar ; Weather ; 49192
NOAA Weather Radar - Weather Forecast & HD Radar ; Weather ; 45696
Weather Live Free - Weather Forecast & Alerts ; Weather ; 35702
Storm Radar ; Weather ; 22792
QuakeFeed Earthquake Map, Alerts, and News ; Weather ; 6081


In [30]:
app_summary(apple_free, 'Book')

Kindle – Read eBooks, Magazines & Textbooks ; Book ; 252076
Audible – audio books, original series & podcasts ; Book ; 105274
Color Therapy Adult Coloring Book for Adults ; Book ; 84062
OverDrive – Library eBooks and Audiobooks ; Book ; 65450
HOOKED - Chat Stories ; Book ; 47829
BookShout: Read eBooks & Track Your Reading Goals ; Book ; 879
Dr. Seuss Treasury — 50 best kids books ; Book ; 451
Green Riding Hood ; Book ; 392
Weirdwood Manor ; Book ; 197
MangaZERO - comic reader ; Book ; 9
ikouhoushi ; Book ; 0


In [31]:
app_summary(apple_free, 'Reference')

Bible ; Reference ; 985920
Dictionary.com Dictionary & Thesaurus ; Reference ; 200047
Dictionary.com Dictionary & Thesaurus for iPad ; Reference ; 54175
Google Translate ; Reference ; 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran ; Reference ; 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition ; Reference ; 17588
Merriam-Webster Dictionary ; Reference ; 16849
Night Sky ; Reference ; 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) ; Reference ; 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools ; Reference ; 4693
GUNS MODS for Minecraft PC Edition - Mods Tools ; Reference ; 1497


In [32]:
app_summary(apple_free, 'Finance')

Chase Mobile℠ ; Finance ; 233270
Mint: Personal Finance, Budget, Bills & Money ; Finance ; 232940
Bank of America - Mobile Banking ; Finance ; 119773
PayPal - Send and request money safely ; Finance ; 119487
Credit Karma: Free Credit Scores, Reports & Alerts ; Finance ; 101679
Capital One Mobile ; Finance ; 56110
Citi Mobile® ; Finance ; 48822
Wells Fargo Mobile ; Finance ; 43064
Chase Mobile ; Finance ; 34322
Square Cash - Send Money for Free ; Finance ; 23775
Capital One for iPad ; Finance ; 21858


In [33]:
app_summary(apple_free, 'Games')

Clash of Clans ; Games ; 2130805
Temple Run ; Games ; 1724546
Candy Crush Saga ; Games ; 961794
Angry Birds ; Games ; 824451
Subway Surfers ; Games ; 706110
Solitaire ; Games ; 679055
CSR Racing ; Games ; 677247
Crossy Road - Endless Arcade Hopper ; Games ; 669079
Injustice: Gods Among Us ; Games ; 612532
Hay Day ; Games ; 567344
PAC-MAN ; Games ; 508808


## App Recommendation for App Store

There are a few genres that have a good number of users, mainly Social Networking, Music, Weather, Navigation, and Book.

Navigation is almost entirely Waze and Google Maps, so I do not think that would be a good choice.

Reference is heavily skewed by the Bible and Dictionary.com, but there is still potential, possibly with a daily quote app.

Book, Navigation, Weather, Music and Social Networking are all in a similar situation, being dominated by a small number of apps.
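
How strongly a genre is dominated by a few apps can be put into a number by computing the share of all ratings that belongs to the top apps. This is a sketch with a hypothetical `top_share` helper; the counts are the six Navigation values printed in the summary above.

```python
def top_share(counts, top_n=2):
    """Fraction of the total that belongs to the top_n largest counts."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    return sum(ordered[:top_n]) / total


# rating_count_tot for the free Navigation apps listed above
navigation_ratings = [345046, 154911, 12811, 3582, 187, 5]

share = top_share(navigation_ratings, top_n=2)
print(round(share * 100, 1), '% of Navigation ratings belong to the top 2 apps')
```

Running the same check on other genres would show whether their averages are also driven by a handful of apps.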

Finance is another possibility: a couple of apps raise the average considerably, but the rest still have many ratings and are fairly evenly distributed. A possible issue is that it seems hard to incorporate ads.

Games has a low average number of total ratings, but that is due to the sheer number of apps in the Games genre, and it is very easy to incorporate ads into these apps.

I would recommend either a Game or Reference app.

In [34]:
display_table(android_free, 5)

1,000,000+ : 15.726534296028879
100,000+ : 11.552346570397113
10,000,000+ : 10.548285198555957
10,000+ : 10.198555956678701
1,000+ : 8.404783393501805
100+ : 6.915613718411552
5,000,000+ : 6.825361010830325
500,000+ : 5.561823104693141
50,000+ : 4.7721119133574
5,000+ : 4.512635379061372
10+ : 3.5424187725631766
500+ : 3.2490974729241873
50,000,000+ : 2.3014440433213
100,000,000+ : 2.1322202166064983
50+ : 1.917870036101083
5+ : 0.78971119133574
1+ : 0.5076714801444043
500,000,000+ : 0.2707581227436823
1,000,000,000+ : 0.22563176895306858
0+ : 0.04512635379061372


In order to do computations with this data, we will need to remove the + signs and the commas and convert the values to floats.
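
The conversion can be sketched as a small function; `parse_installs` is a hypothetical name. Note that a value such as '1,000,000+' only gives a lower bound, so the resulting averages are rough estimates.

```python
def parse_installs(text):
    """Convert an Installs value such as '1,000,000+' to a float."""
    return float(text.replace(',', '').replace('+', ''))


for value in ['1,000,000+', '500+', '0+']:
    print(value, '->', parse_installs(value))
```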

In [35]:
freq_table(android_free, 1)

{'ART_AND_DESIGN': 0.6430505415162455,
 'AUTO_AND_VEHICLES': 0.9250902527075812,
 'BEAUTY': 0.5979241877256317,
 'BOOKS_AND_REFERENCE': 2.1435018050541514,
 'BUSINESS': 4.591606498194946,
 'COMICS': 0.6204873646209386,
 'COMMUNICATION': 3.2378158844765346,
 'DATING': 1.861462093862816,
 'EDUCATION': 1.1620036101083033,
 'ENTERTAINMENT': 0.9589350180505415,
 'EVENTS': 0.7107400722021661,
 'FINANCE': 3.7003610108303246,
 'FOOD_AND_DRINK': 1.2409747292418771,
 'HEALTH_AND_FITNESS': 3.0798736462093865,
 'HOUSE_AND_HOME': 0.8235559566787004,
 'LIBRARIES_AND_DEMO': 0.9363718411552346,
 'LIFESTYLE': 3.91471119133574,
 'GAME': 9.724729241877256,
 'FAMILY': 18.896660649819495,
 'MEDICAL': 3.531137184115524,
 'SOCIAL': 2.6624548736462095,
 'SHOPPING': 2.2450361010830324,
 'PHOTOGRAPHY': 2.944494584837545,
 'SPORTS': 3.395758122743682,
 'TRAVEL_AND_LOCAL': 2.33528880866426,
 'TOOLS': 8.461191335740072,
 'PERSONALIZATION': 3.3167870036101084,
 'PRODUCTIVITY': 3.892148014440433,
 'PARENTING': 0.654

In [45]:
android_genres = freq_table(android_free, 1)

for category in android_genres:
    total = 0
    len_category = 0
    for app in android_free:
        category_app = app[1]
        if category_app == category:
            installs = app[5].replace('+', '')
            installs = installs.replace(',', '')
            total += float(installs)
            len_category += 1
      
    avg_installs = total / len_category
    print(category, ':', avg_installs)

ART_AND_DESIGN : 1986335.0877192982
AUTO_AND_VEHICLES : 647317.8170731707
BEAUTY : 513151.88679245283
BOOKS_AND_REFERENCE : 8767811.894736841
BUSINESS : 1712290.1474201474
COMICS : 817657.2727272727
COMMUNICATION : 38456119.167247385
DATING : 854028.8303030303
EDUCATION : 1833495.145631068
ENTERTAINMENT : 11640705.88235294
EVENTS : 253542.22222222222
FINANCE : 1387692.475609756
FOOD_AND_DRINK : 1924897.7363636363
HEALTH_AND_FITNESS : 4188821.9853479853
HOUSE_AND_HOME : 1331540.5616438356
LIBRARIES_AND_DEMO : 638503.734939759
LIFESTYLE : 1433675.5878962537
GAME : 15588015.603248259
FAMILY : 3697848.1731343283
MEDICAL : 120550.61980830671
SOCIAL : 23253652.127118643
SHOPPING : 7036877.311557789
PHOTOGRAPHY : 17840110.40229885
SPORTS : 3638640.1428571427
TRAVEL_AND_LOCAL : 13984077.710144928
TOOLS : 10801391.298666667
PERSONALIZATION : 5201482.6122448975
PRODUCTIVITY : 16787331.344927534
PARENTING : 542603.6206896552
WEATHER : 5074486.197183099
VIDEO_PLAYERS : 24727872.452830188
NEWS_AND_

# App Recommendation for Play Store

The top categories in the Play Store are Books and Reference, Communication, Entertainment, Health and Fitness, Game, Social, Shopping, Photography, Travel and Local, Tools, Personalization, Productivity, Weather, Video Players, and News and Magazines.

Since Books and Reference apps and Game apps are very popular on both platforms and seem relatively easy to incorporate ads into, I would recommend one of these two genres for the app to make.

In [46]:
def app_summary_android(dataset, genre):
    app_count = 0

    for app in dataset:
        col = app[1]
        if app_count == 11:
            break
        elif col == genre:
            app_count += 1
            print(app[0], ';', app[1], ';', app[5]) 
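
app_summary and app_summary_android differ only in the column positions they read, so they could share one function that takes the indexes as arguments. This is a sketch under that assumption; `first_apps` is a hypothetical name and the sample rows are trimmed to three columns.

```python
def first_apps(dataset, genre, genre_index, name_index, metric_index, limit=11):
    """Return up to `limit` (name, genre, metric) tuples for the given genre."""
    matches = []
    for app in dataset:
        if len(matches) == limit:
            break
        if app[genre_index] == genre:
            matches.append((app[name_index], app[genre_index], app[metric_index]))
    return matches


# Rows trimmed to [App, Category, Installs]
sample = [
    ['Solitaire', 'GAME', '10,000,000+'],
    ['Wikipedia', 'BOOKS_AND_REFERENCE', '10,000,000+'],
    ['Sonic Dash', 'GAME', '100,000,000+'],
    ['PAC-MAN', 'GAME', '100,000,000+'],
]

games = first_apps(sample, 'GAME', genre_index=1, name_index=0,
                   metric_index=2, limit=2)
for name, genre, installs in games:
    print(name, ';', genre, ';', installs)

# With the full lists the indexes would be:
# first_apps(apple_free, 'Games', genre_index=11, name_index=1, metric_index=5)
# first_apps(android_free, 'GAME', genre_index=1, name_index=0, metric_index=5)
```

Like the two original functions, this returns the first matching rows in dataset order and does not sort them by popularity.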

In [48]:
app_summary_android(android_free, 'GAME')

Solitaire ; GAME ; 10,000,000+
Sonic Dash ; GAME ; 100,000,000+
PAC-MAN ; GAME ; 100,000,000+
Bubble Witch 3 Saga ; GAME ; 50,000,000+
Race the Traffic Moto ; GAME ; 10,000,000+
Marble - Temple Quest ; GAME ; 10,000,000+
Shooting King ; GAME ; 10,000,000+
Geometry Dash World ; GAME ; 10,000,000+
Jungle Marble Blast ; GAME ; 5,000,000+
Roll the Ball® - slide puzzle ; GAME ; 100,000,000+
Block Craft 3D: Building Simulator Games For Free ; GAME ; 50,000,000+


In [49]:
app_summary_android(android_free, 'BOOKS_AND_REFERENCE')

E-Book Read - Read Book for free ; BOOKS_AND_REFERENCE ; 50,000+
Download free book with green book ; BOOKS_AND_REFERENCE ; 100,000+
Wikipedia ; BOOKS_AND_REFERENCE ; 10,000,000+
Cool Reader ; BOOKS_AND_REFERENCE ; 10,000,000+
Free Panda Radio Music ; BOOKS_AND_REFERENCE ; 100,000+
Book store ; BOOKS_AND_REFERENCE ; 1,000,000+
FBReader: Favorite Book Reader ; BOOKS_AND_REFERENCE ; 10,000,000+
English Grammar Complete Handbook ; BOOKS_AND_REFERENCE ; 500,000+
Free Books - Spirit Fanfiction and Stories ; BOOKS_AND_REFERENCE ; 1,000,000+
Google Play Books ; BOOKS_AND_REFERENCE ; 1,000,000,000+
AlReader -any text book reader ; BOOKS_AND_REFERENCE ; 5,000,000+


# Summary

For these two app stores, I would recommend making either a game or a book/reference app. These appear to be the best types of app to offer for free while still making money from ad revenue. Both genres are very popular in both stores, and their users are fairly spread out among the apps. These types of apps also seem easy to incorporate ads into without seeming intrusive to the consumer.