Amanda Baker, andrewID: adbaker

### Part 5: Analyze the Data
#### i. Clean the data

In [1]:
import pandas as pd
import numpy as np

appstore_apps = pd.read_csv('a3-adbaker-itunes-topapps-2.csv')

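def convert_tick_values(v):
    'Accepts a number v expressed as a tick-value string (e.g. 1M, 2.5K) and returns its integer value.'
    if (v[-1] == 'K'):
        full_value = float(v[:-1]) * 1000
    elif (v[-1] == 'M'):
        full_value = float(v[:-1]) * 1000000
    elif (v[-1] == 'B'):
        full_value = float(v[:-1]) * 1000000000
    else:
        # no suffix: parse the plain number so a fractional string does not break int()
        full_value = float(v)
    # round before converting so floating-point error cannot truncate the result
    return int(round(full_value))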

# make sure star ratings are stored as floats
appstore_apps['star_rating'] = appstore_apps['star_rating'].astype(float)
# drop the 8-character text suffix that follows each count, then expand K/M/B tick values to integers
appstore_apps['count_ratings'] = appstore_apps['count_ratings'].apply(lambda s: s[:-8].strip())
appstore_apps['count_ratings'] = appstore_apps['count_ratings'].apply(convert_tick_values)

appstore_apps.head()

,app_name,category,appstore_link_url,img_src_url,star_rating,count_ratings
0,PLANK!,Games,https://itunes.apple.com/us/app/plank/id137048...,https://www.apple.com/autopush/us/itunes/chart...,4.5,93700
1,Perfect Hit . . .,Games,https://itunes.apple.com/us/app/perfect-hit/id...,https://www.apple.com/autopush/us/itunes/chart...,4.5,53600
2,Google Photos,Photo & Video,https://itunes.apple.com/us/app/google-photos/...,https://www.apple.com/autopush/us/itunes/chart...,4.7,136000
3,Cash App,Finance,https://itunes.apple.com/us/app/cash-app/id711...,https://www.apple.com/autopush/us/itunes/chart...,4.6,150400
4,Wish - Shopping Made Fun,Shopping,https://itunes.apple.com/us/app/wish-shopping-...,https://www.apple.com/autopush/us/itunes/chart...,4.6,612400
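
The cleaning step above can also be sketched as a single lookup-based helper. This is a minimal illustration, not the submitted solution: the name `parse_rating_count` and the `SUFFIX_MULTIPLIERS` table are hypothetical, and the sketch assumes the raw strings look like `'93.7K Ratings'`, which is inferred from the 8-character slice rather than confirmed by the data file.

```python
# Hypothetical helper (not part of the submitted notebook): parse strings such as
# '93.7K Ratings', '4.8M Ratings', or '1,234' into integer counts.
SUFFIX_MULTIPLIERS = {'K': 1_000, 'M': 1_000_000, 'B': 1_000_000_000}

def parse_rating_count(text):
    """Return the integer count encoded in a tick-style string."""
    # Assumed input shape: a number, an optional K/M/B suffix, an optional 'Ratings' label.
    token = text.replace('Ratings', '').replace('Rating', '').replace(',', '').strip()
    multiplier = SUFFIX_MULTIPLIERS.get(token[-1].upper(), 1)
    if multiplier != 1:
        token = token[:-1]  # drop the suffix letter before parsing the number
    # round() rather than int() so floating-point error cannot truncate the result
    return round(float(token) * multiplier)

print(parse_rating_count('93.7K Ratings'))
print(parse_rating_count('4.8M Ratings'))
print(parse_rating_count('1,234'))
```

A dictionary lookup keeps the suffix handling in one place, so supporting another suffix would only mean adding an entry to the table.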


#### ii. List the names of the top apps sorted in descending order by star rating; among apps with the same star rating, sort by number of ratings. If the number of ratings is also the same, sort by app_name.

In [2]:
appstore_apps_sorted = appstore_apps.sort_values(by=['star_rating', 'count_ratings', 'app_name'], ascending=[False, False, True])
appstore_apps_sorted.head(10)

,app_name,category,appstore_link_url,img_src_url,star_rating,count_ratings
73,Venmo: Send & Receive Money,Finance,https://itunes.apple.com/us/app/venmo-send-rec...,https://www.apple.com/autopush/us/itunes/chart...,4.9,4800000
80,Lyft,Travel,https://itunes.apple.com/us/app/lyft/id5293790...,https://www.apple.com/autopush/us/itunes/chart...,4.9,3800000
28,Shazam,Music,https://itunes.apple.com/us/app/shazam/id28499...,https://www.apple.com/autopush/us/itunes/chart...,4.9,2800000
64,Bible,Reference,https://itunes.apple.com/us/app/bible/id282935...,https://www.apple.com/autopush/us/itunes/chart...,4.9,2700000
63,PayPal: Mobile Cash,Finance,https://itunes.apple.com/us/app/paypal-mobile-...,https://www.apple.com/autopush/us/itunes/chart...,4.9,1400000
27,Lime - Your Ride Anytime,Travel,https://itunes.apple.com/us/app/lime-your-ride...,https://www.apple.com/autopush/us/itunes/chart...,4.9,673200
37,Bird - Enjoy The Ride,Travel,https://itunes.apple.com/us/app/bird-enjoy-the...,https://www.apple.com/autopush/us/itunes/chart...,4.9,408300
23,Nike,Shopping,https://itunes.apple.com/us/app/nike/id1095459...,https://www.apple.com/autopush/us/itunes/chart...,4.9,384200
52,Wordscapes,Games,https://itunes.apple.com/us/app/wordscapes/id1...,https://www.apple.com/autopush/us/itunes/chart...,4.9,276300
97,Instagram,Photo & Video,https://itunes.apple.com/us/app/instagram/id38...,https://www.apple.com/autopush/us/itunes/chart...,4.8,9500000
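
To see how the per-column `ascending` list resolves ties, here is a small sketch on made-up rows (the `toy` frame and its values are invented for illustration and do not come from the App Store data). Two apps share both the star rating and the rating count, so only `app_name` separates them.

```python
import pandas as pd

# Made-up rows chosen to exercise both tie-breakers; not taken from the real data.
toy = pd.DataFrame({
    'app_name': ['Beta', 'Alpha', 'Gamma', 'Delta'],
    'star_rating': [4.9, 4.9, 4.8, 4.9],
    'count_ratings': [1000, 1000, 9000, 5000],
})

# Descending by rating, then descending by count, then ascending (A to Z) by name.
ordered = toy.sort_values(
    by=['star_rating', 'count_ratings', 'app_name'],
    ascending=[False, False, True],
)
print(ordered['app_name'].tolist())  # ['Delta', 'Alpha', 'Beta', 'Gamma']
```

Gamma has the most ratings but the lowest star rating, so it sorts last; the rating count only matters among rows that tie on the first key.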


#### iii. For each category list the number of apps.

In [3]:
# 'count' tallies the non-null app names in each category
app_cat_pvt = appstore_apps_sorted.pivot_table(index='category', values='app_name', aggfunc='count')
app_cat_pvt.columns=['apps_per_category']
app_cat_pvt.sort_values(by='apps_per_category', ascending=False)

category,apps_per_category
Games,45
Shopping,10
Social Networking,7
Food & Drink,5
Music,5
Travel,5
Entertainment,4
Photo & Video,4
Finance,3
Productivity,3
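
As a cross-check on the pivot table, the same per-category tally can be sketched with `Series.value_counts`, which counts rows per distinct value and returns them in descending order of frequency. The `toy` frame below is invented for illustration only.

```python
import pandas as pd

# Invented sample; the real notebook would use appstore_apps instead.
toy = pd.DataFrame({
    'app_name': ['A', 'B', 'C', 'D', 'E', 'F'],
    'category': ['Games', 'Games', 'Finance', 'Games', 'Travel', 'Finance'],
})

# value_counts() already sorts from most to least frequent.
apps_per_category = toy['category'].value_counts().rename('apps_per_category')
print(apps_per_category)
```

One difference worth keeping in mind: `value_counts` counts rows by category, while the pivot table counts values in the `app_name` column, so the two could disagree if some app names were missing.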


#### iv. For each category of app, list the average rating of all apps in that category and sort in descending order by average rating.

In [4]:
# code written if the user wanted to display the count of apps and the avg. star rating together
'''app_rating_pvt = appstore_apps_sorted.pivot_table(index='category', values=['app_name', 'star_rating'], aggfunc=[np.count_nonzero, np.mean])
app_rating_pvt.columns = ['apps_per_category','star_rating_counts', 'avg_star_rating']
app_rating_pvt = app_rating_pvt.drop(labels='star_rating_counts', axis=1).sort_values(by='avg_star_rating', ascending=False)
app_rating_pvt'''

# submitted answer
app_rating_pvt = appstore_apps_sorted.pivot_table(index='category', values='star_rating', aggfunc='mean')
app_rating_pvt.columns = ['avg_star_rating']
app_rating_pvt.sort_values(by='avg_star_rating', ascending=False)

category,avg_star_rating
Education,4.8
Finance,4.8
Travel,4.76
Food & Drink,4.76
Music,4.76
Navigation,4.75
Reference,4.7
Productivity,4.666667
Lifestyle,4.6
Shopping,4.57
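
The commented-out variant in cell 4 builds the count and the average together and then drops a redundant column. One possible alternative, shown here only as a sketch on invented data, is `groupby(...).agg` with named aggregation, which produces exactly the two labelled columns in one step. The `toy` frame and the `summary` name are illustrative assumptions.

```python
import pandas as pd

# Invented sample with ratings that average to exact values.
toy = pd.DataFrame({
    'app_name': ['A', 'B', 'C', 'D', 'E', 'F'],
    'category': ['Games', 'Games', 'Finance', 'Games', 'Travel', 'Finance'],
    'star_rating': [4.0, 4.5, 5.0, 3.5, 4.25, 4.5],
})

# Named aggregation: output_column=(input_column, aggregation).
summary = (
    toy.groupby('category')
       .agg(apps_per_category=('app_name', 'count'),
            avg_star_rating=('star_rating', 'mean'))
       .sort_values('avg_star_rating', ascending=False)
)
print(summary)
```

Because each output column is named up front, there is no multi-level header to flatten and no extra count column to drop afterwards.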


#### v. For each category, list the app with the highest star rating. If several apps tie for the highest star rating, list the one with the most ratings.

In [5]:
cat_grps_max_rating = appstore_apps.groupby('category')
# for each category, keep the single row with the highest star rating (ties broken by number of ratings)
highest_rated_in_grp = [group.nlargest(1, ['star_rating', 'count_ratings'])
                        for _, group in cat_grps_max_rating]

# concatenate the per-category results into a single dataframe (no hard-coded group count)
# and rename the app column appropriately
highest_rated_by_cat = pd.concat(highest_rated_in_grp)
highest_rated_by_cat = highest_rated_by_cat.rename(columns={'app_name': 'highest_rated_app'})
highest_rated_by_cat

,highest_rated_app,category,appstore_link_url,img_src_url,star_rating,count_ratings
72,Remind: School Communication,Education,https://itunes.apple.com/us/app/remind-school-...,https://www.apple.com/autopush/us/itunes/chart...,4.8,222100
47,Amazon Prime Video,Entertainment,https://itunes.apple.com/us/app/amazon-prime-v...,https://www.apple.com/autopush/us/itunes/chart...,4.8,925300
73,Venmo: Send & Receive Money,Finance,https://itunes.apple.com/us/app/venmo-send-rec...,https://www.apple.com/autopush/us/itunes/chart...,4.9,4800000
24,Starbucks,Food & Drink,https://itunes.apple.com/us/app/starbucks/id33...,https://www.apple.com/autopush/us/itunes/chart...,4.8,2700000
52,Wordscapes,Games,https://itunes.apple.com/us/app/wordscapes/id1...,https://www.apple.com/autopush/us/itunes/chart...,4.9,276300
29,Zillow: Houses For Sale & Rent,Lifestyle,https://itunes.apple.com/us/app/zillow-houses-...,https://www.apple.com/autopush/us/itunes/chart...,4.8,2700000
28,Shazam,Music,https://itunes.apple.com/us/app/shazam/id28499...,https://www.apple.com/autopush/us/itunes/chart...,4.9,2800000
5,Waze Navigation & Live Traffic,Navigation,https://itunes.apple.com/us/app/waze-navigatio...,https://www.apple.com/autopush/us/itunes/chart...,4.8,1500000
97,Instagram,Photo & Video,https://itunes.apple.com/us/app/instagram/id38...,https://www.apple.com/autopush/us/itunes/chart...,4.8,9500000
53,Google Drive,Productivity,https://itunes.apple.com/us/app/google-drive/i...,https://www.apple.com/autopush/us/itunes/chart...,4.8,2100000
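
The same "best app per category" result can also be sketched without building a list of per-group frames: sort so the best row of each category comes first, then keep the first row per category with `drop_duplicates`. The `toy` data below is invented, and this is an alternative technique rather than the approach used in the cell above.

```python
import pandas as pd

# Invented sample: Games has a star-rating tie that the rating count must break.
toy = pd.DataFrame({
    'app_name': ['A', 'B', 'C', 'D', 'E'],
    'category': ['Games', 'Games', 'Games', 'Finance', 'Finance'],
    'star_rating': [4.9, 4.9, 4.8, 4.7, 4.9],
    'count_ratings': [100, 500, 9000, 50, 10],
})

top_per_category = (
    toy.sort_values(['star_rating', 'count_ratings'], ascending=False)
       .drop_duplicates('category')   # keeps the first (best) row for each category
       .sort_values('category')
)
print(top_per_category[['category', 'app_name']])
```

In Games, apps A and B tie at 4.9 stars, so B wins on rating count even though C has far more ratings at a lower star rating.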
