# Free App Engagement Analysis

Many companies build free apps to serve their customers and to drive consumer engagement with advertising partners. This analysis examines how customers engage with apps through downloads, use, and clicks on ads.

## Prepare the Data

We will start by piloting analyses with two test data sets from the App Store (data on approximately 7,000 apps collected in July 2017) and the Google Play Store (data on approximately 10,000 apps collected in August 2018).

After downloading the data sets, we'll open each file, read it with the `reader` function from the `csv` module, and convert the result into a list of lists (one inner list per row).

In [1]:
# Import the reader function from the csv module
from csv import reader

# Prepare the Apple Store data
opened_app_file = open('AppleStore.csv')
read_app_file = reader(opened_app_file)
app_data = list(read_app_file)
header_apple = app_data[0]

# Prepare the Google Play store data
opened_play_file = open('googleplaystore.csv')
read_play_file = reader(opened_play_file)
play_data = list(read_play_file)
header_play = play_data[0]

*Note*: if there is an error named `UnicodeDecodeError` caused by the `open()` function, then add `encoding='utf8'` to the `open()` function. For example, `open('AppleStore.csv', encoding='utf8')`.
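As a minimal sketch, the loading steps can be wrapped in a small helper that always passes an explicit encoding and closes the file when done. The name `load_csv` and its return shape are illustrative assumptions, not part of the original notebook.

```python
from csv import reader


def load_csv(path, encoding='utf8'):
    """Return (header, rows) for the CSV file at `path`.

    Hypothetical helper: assumes the first row of the file is a header.
    """
    # The context manager closes the file even if reading fails.
    # newline='' is the setting the csv module recommends for file objects.
    with open(path, encoding=encoding, newline='') as csv_file:
        table = list(reader(csv_file))
    return table[0], table[1:]


# Example usage (assumes the file exists in the working directory):
# header_apple, apple_rows = load_csv('AppleStore.csv')
```

Unlike the cells above, this sketch returns the header separately from the data rows, so later loops would not need the `[1:]` slice.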

## Explore the Data

Next, we'll create a function `explore_data()` to facilitate viewing the data.

In [2]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n')
        
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of column:', len(dataset[0]))
        

Some data sets include a header row. We'll check whether ours do as we look at the first few rows.

- Apple Store:

In [3]:
explore_data(app_data, 0, 5, True)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


Number of rows: 7198
Number of column: 16


The Apple Store data contains a header row, 7,197 rows of data, and 16 columns. The columns most relevant to this project include 'price' and 'rating_count_tot', among others.

For complete documentation of this data, visit the following [web site](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/home).

And now for a preview of the Google Play store data.

- Google Play store:

In [4]:
explore_data(play_data, 0, 5, True)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 10842
Number of column: 13


The Google Play store data contains a header row, 10,841 rows of data, and 13 columns.

The columns most relevant to this project include 'App' and 'Installs', among others.

For complete documentation of this data, visit the following [web site](https://www.kaggle.com/lava18/google-play-store-apps/home).

## Clean the Data

Discussion on the data documentation web site in late 2018 uncovered an error in the Google Play data at row 10,472 (or a neighboring row, depending on whether the header row is counted). One value is missing in this row, which shifts every subsequent value in the row into the wrong column.

First, we'll verify whether the error is present by printing a slice of the rows around the reported position:

In [5]:
print('Header:', header_play, end='\n\n')

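# enumerate() tracks the index directly; list.index() would return the
# first matching row, which is wrong when a row has an exact duplicate
for label, row in enumerate(play_data[10470:10477], start=10470):
    print('Row Number', label, end=': ')
    print(row, end='\n \n') #print row data and add some nice spacing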
    

Header: ['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']

Row Number 10470: ['TownWiFi | Wi-Fi Everywhere', 'COMMUNICATION', '3.9', '2372', '58M', '500,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', '4.2.1', '4.2 and up']
 
Row Number 10471: ['Jazz Wi-Fi', 'COMMUNICATION', '3.4', '49', '4.0M', '10,000+', 'Free', '0', 'Everyone', 'Communication', 'February 10, 2017', '0.1', '2.3 and up']
 
Row Number 10472: ['Xposed Wi-Fi-Pwd', 'PERSONALIZATION', '3.5', '1042', '404k', '100,000+', 'Free', '0', 'Everyone', 'Personalization', 'August 5, 2014', '3.0.0', '4.0.3 and up']
 
Row Number 10473: ['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']
 
Row Number 10474: ['osmino Wi-Fi: free WiFi', 'TOOLS', '4.2', '134203', '4.1M', '10,000,000+', 'Free', '0', 'Everyone', 'Tools', 'Augu

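Sure enough, row number 10,473 is incomplete: it holds 12 values instead of 13. The 'Category' value is missing, so the rating ('1.9') sits in the category position and every later value is shifted one column to the left; the value between 'Everyone' and 'February 11, 2018' is also an empty string. (Our data includes a header row, whereas the user who reported the error had removed the header and thus arrived at 10,472 for the row number.)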
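Rather than relying on a reported row number, rows like this one can be found by comparing each row's length with the header's. The sketch below uses a hypothetical helper, `find_malformed_rows`, and a three-column sample modeled on the rows printed above.

```python
def find_malformed_rows(dataset):
    """Return (index, length) pairs for rows whose length differs from the header's.

    Assumes dataset[0] is the header row.
    """
    expected = len(dataset[0])
    return [(i, len(row)) for i, row in enumerate(dataset) if len(row) != expected]


# Shortened sample: the last row is missing its category value.
sample = [
    ['App', 'Category', 'Rating'],
    ['Jazz Wi-Fi', 'COMMUNICATION', '3.4'],
    ['Life Made WI-Fi Touchscreen Photo Frame', '1.9'],
]
print(find_malformed_rows(sample))  # [(2, 2)]
```

A length check only catches rows with missing or extra values; it would not notice a row whose values are all present but in the wrong order.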

### Remove the Incomplete Row

For now, we'll exclude this incomplete row from our data set.

In [6]:
print(len(play_data)) #verify the length of the data set
del play_data[10473] #run this only ONCE to prevent multiple data deletions
print(len(play_data)) #show length of data set after row removed

10842
10841


### Duplicate Entries

The Google Play store data has duplicate entries for certain apps because the data for some apps was collected at different points in time. For the present analysis, we'll keep only the most recent entry. We will treat the entry with the highest number of user reviews as the most recent one, on the assumption that review counts only grow over time.

- Here is a sample of duplicate entries for the Instagram app:

In [7]:
for app in play_data:
    name = app[0]
    if name == 'Instagram':
        print(app, end='\n\n')

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']

['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']

['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']



In total, there are 1,181 cases where an app occurs more than once:

In [8]:
duplicate_apps = []
unique_apps = []

for app in play_data:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)
        
print('Number of duplicate apps:', len(duplicate_apps))
print('\n')
print('Examples of duplicate apps:', duplicate_apps[:15])


Number of duplicate apps: 1181


Examples of duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack', 'FreshBooks Classic', 'Insightly CRM', 'QuickBooks Accounting: Invoicing & Expenses', 'HipChat - Chat Built for Teams', 'Xero Accounting Software']
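The loop above tests `name in unique_apps` against a list, which rescans the list for every row. As a sketch of an alternative, `collections.Counter` tallies names in a single pass; `count_duplicates` is an illustrative helper name, and the sample names are a made-up list.

```python
from collections import Counter


def count_duplicates(names):
    """Return (number of extra occurrences, names that appear more than once)."""
    counts = Counter(names)
    # Each name contributes (occurrences - 1) duplicate entries.
    extra = sum(count - 1 for count in counts.values())
    repeated = [name for name, count in counts.items() if count > 1]
    return extra, repeated


sample_names = ['Box', 'Slack', 'Box', 'Instagram', 'Box', 'Slack']
print(count_duplicates(sample_names))  # (3, ['Box', 'Slack'])
```

The first number corresponds to `len(duplicate_apps)` in the cell above: every occurrence after the first counts as one duplicate.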


### Dealing with Duplicates

First, we will create a dictionary containing only unique instances of a particular app and its respective highest number of reviews.

In [9]:
# for Google Play data

reviews_max = {}

for app in play_data[1:]:
    name = app[0]
    n_reviews = float(app[3])
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
        
    elif name not in reviews_max:
        reviews_max[name] = n_reviews
        
print(len(reviews_max))

9659


So, given the total number of apps, the number of duplicate apps, and the number of entries in this new dictionary, we should be able to verify the correct unduplicated count of apps.

In [10]:
print('Expected length:', len(play_data[1:]) - len(duplicate_apps))
print('Actual length:', len(reviews_max))


Expected length: 9659
Actual length: 9659


### Remove Duplicates

Next, we will remove the duplicates from the Google Play data. In brief,

- create two empty lists: one for the cleaned rows and one for the names already added
- loop through the Google Play data
    - if the number of reviews in the current row equals the known maximum for that app AND the app has not already been added to our tracking list, append the row to the clean data list. The second condition matters because several rows for one app can share the same maximum review count.


In [11]:
# Google Play store data

android_clean = [] #cleaned data containing all information for each app
already_added = [] #list of app names added to the clean data set

for app in play_data[1:]:
    name = app[0]
    n_reviews = float(app[3])
                 
    if (reviews_max[name] == n_reviews) and (name not in already_added):
        android_clean.append(app)
        already_added.append(name)

In [12]:
explore_data(android_clean, 0, 4, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


Number of rows: 9659
Number of column: 13
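The two-pass approach above (build `reviews_max`, then filter) can also be sketched as a single pass that stores the best row per app in a dictionary. `keep_most_reviewed` is a hypothetical helper, and the 'Box' row in the sample uses made-up values; only the Instagram review counts come from the output shown earlier.

```python
def keep_most_reviewed(rows, name_idx=0, reviews_idx=3):
    """Keep, for each app name, the first row with the highest review count."""
    best = {}
    for row in rows:
        name = row[name_idx]
        n_reviews = float(row[reviews_idx])
        # Strict '>' keeps the earlier row when review counts tie.
        if name not in best or n_reviews > float(best[name][reviews_idx]):
            best[name] = row
    return list(best.values())


sample_rows = [
    ['Instagram', 'SOCIAL', '4.5', '66577313'],
    ['Box', 'BUSINESS', '4.2', '159872'],  # illustrative values
    ['Instagram', 'SOCIAL', '4.5', '66577446'],
]
for row in keep_most_reviewed(sample_rows):
    print(row)
```

One difference from the cells above: the result is ordered by each app's first appearance, not by the position of its most-reviewed row, so row order may differ even though the set of kept rows is the same.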


### Remove Non-English Language Apps

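We market exclusively to English-speaking consumers, so we'll exclude apps whose names suggest they are aimed at non-English-speaking customers. As a first pass, we flag any name that contains a character outside the ASCII range (code points 0 to 127).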

In [13]:
def check_engl(a_string):
    
    for a_char in a_string:
        if ord(a_char) > 127:
            return False
        
    return True

print(check_engl('Instagram'))
print(check_engl('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(check_engl('Instachat 😜'))
print(check_engl('Docs To Go™ Free Office Suite'))

True
False
False
False


**Note**: this is not a perfect language check because emojis and trademark symbols used in English names are counted as non-English. The last two examples above should have been labeled `True`. This criterion is too strict and would exclude numerous apps unnecessarily, so the revised function below tolerates up to three non-ASCII characters per name.

In [14]:
def is_engl(a_string):
    num_non_engl_chars = 0
    
    for a_char in a_string:
        if ord(a_char) > 127:
            num_non_engl_chars += 1
    
    if num_non_engl_chars > 3:
        return False
    else:
        return True

print(is_engl('Instagram'))
print(is_engl('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_engl('Instachat 😜'))
print(is_engl('Docs To Go™ Free Office Suite'))

True
False
True
True
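To see why a threshold of three works for these examples, it helps to count the non-ASCII characters in each name. The sketch below separates the counting from the decision and makes the threshold a parameter; the names `count_non_ascii` and `is_mostly_ascii` are illustrative, not from the notebook.

```python
def count_non_ascii(text):
    """Count characters whose code point falls outside the ASCII range (0-127)."""
    return sum(1 for char in text if ord(char) > 127)


def is_mostly_ascii(text, max_non_ascii=3):
    """Same rule as is_engl(): allow at most `max_non_ascii` non-ASCII characters."""
    return count_non_ascii(text) <= max_non_ascii


for name in ['Instagram', 'Instachat 😜', 'Docs To Go™ Free Office Suite']:
    print(name, '->', count_non_ascii(name), is_mostly_ascii(name))
```

The rule remains a heuristic: a non-English name written entirely in ASCII letters still passes, and an English name with more than three emojis is still rejected.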


#### Apply Language Filter to Both Data Sets

In [15]:
ios_engl = []
android_engl = []

# create an English-only Android list
for app in android_clean:
    name = app[0]
    
    if is_engl(name):
        android_engl.append(app)
        
explore_data(android_engl, 0, 3, True)

# create an English-only App Store list
for app in app_data[1:]:
    name = app[1]
    
    if is_engl(name):
        ios_engl.append(app)
        
explore_data(ios_engl, 0, 3, True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9614
Number of column: 13
['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Ga

### Isolate the Free Apps

Our company focuses on free apps, so we'll narrow the data down to free apps only.

In [16]:
# for Google Play data
android_free = []

for app in android_engl:
    price = app[7]
    
    if price == '0':
        android_free.append(app)
        
# for App Store data
ios_free = []

for app in ios_engl:
    price = app[4]
    
    if price == '0.0':
        ios_free.append(app)
        
explore_data(android_free, 0, 5, True)
explore_data(ios_free, 0, 5, True)
        

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up']


Number of rows: 8864
Number of column: 13
['284882215', 'Facebook', '389879808', 'USD', '0.
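The two string comparisons above (`'0'` for Google Play, `'0.0'` for the App Store) depend on how each store formats a zero price. A numeric comparison is one way to avoid that dependency. The sketch below assumes that paid prices may carry a leading `$`, which is an assumption about the data and not something shown in the rows above; `is_free` is a hypothetical helper.

```python
def is_free(price_text):
    """Return True when a price string such as '0', '0.0', or '$0.00' equals zero."""
    # Assumption: a currency symbol, if present, is a leading '$'.
    cleaned = price_text.strip().lstrip('$')
    try:
        return float(cleaned) == 0.0
    except ValueError:
        # Not a number (for example, a value shifted in from another column).
        return False


for price in ['0', '0.0', '$4.99']:
    print(price, '->', is_free(price))
```

With a helper like this, the same test could be applied to both data sets, for example `if is_free(app[7])` for Google Play rows and `if is_free(app[4])` for App Store rows.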

## Ideal App Profile for Our Company

To minimize overhead and maximize profits, we validate each app in stages:

- develop it for Android first
- evaluate its performance after 6 months on the Google Play store
- if it is successful, develop an iOS version

Because a successful app ends up in both stores, the apps we choose to develop should have potential appeal in both. Here, genre is key:

The columns corresponding to genre in each data set:

- Google Play, column index 1 and index 9
- App Store, column index 11

### Building Frequency Tables

In [19]:
def freq_table(dataset, index):
    frequency_table = {}
    total = 0
    
    for row in dataset:
        total += 1
        a_data_point = row[index]
        
        if a_data_point in frequency_table:
            frequency_table[a_data_point] += 1
        
        else:
            frequency_table[a_data_point] = 1
        
    table_percent = {}
    
    for key in frequency_table:
        percentage = (frequency_table[key] / total) * 100
        table_percent[key] = percentage
        
    return table_percent

def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
        
    table_sorted = sorted(table_display, reverse = True)
    
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
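
As a sketch of an alternative, the counting and sorting done by `freq_table()` and `display_table()` can be expressed with `collections.Counter`, whose `most_common()` method returns entries from most to least frequent. The function name `freq_table_counter` and the sample rows are illustrative.

```python
from collections import Counter


def freq_table_counter(dataset, index):
    """Return {value: percentage} for one column, most frequent value first."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    # most_common() yields (value, count) pairs sorted by descending count.
    return {value: count / total * 100 for value, count in counts.most_common()}


sample = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
for genre, pct in freq_table_counter(sample, 1).items():
    print(genre, ':', pct)
```

For values with equal counts, `most_common()` keeps them in first-seen order, whereas `display_table()` orders ties by the key in reverse alphabetical order, so tied rows may print in a different order.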

## Analyze the Tables

- App Store:


In [20]:
display_table(ios_free, -5) #explore prime genre

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


We can see that among the free English apps, more than half (58.16%) are games. Entertainment apps are close to 8%, followed by photo and video apps, which are close to 5%. Only 3.66% of the apps are designed for education, followed by social networking apps, which account for 3.29% of the apps in our data set.

The general impression is that the App Store (at least the part containing free English apps) is dominated by apps designed for fun (games, entertainment, photo and video, social networking, sports, music, etc.), while apps with practical purposes (education, shopping, utilities, productivity, lifestyle, etc.) are rarer. However, the fact that fun apps are the most numerous doesn't imply that they also have the greatest number of users: the demand might not match the supply.

Let's continue by examining the Genres and Category columns of the Google Play data set (two columns which seem to be related).

In [21]:
display_table(android_free, 1) #explore category

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

In [22]:
display_table(android_free, -4) #Genres

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Finance : 3.7003610108303246
Medical : 3.531137184115524
Sports : 3.463447653429603
Personalization : 3.3167870036101084
Communication : 3.2378158844765346
Action : 3.1024368231046933
Health & Fitness : 3.0798736462093865
Photography : 2.944494584837545
News & Magazines : 2.7978339350180503
Social : 2.6624548736462095
Travel & Local : 2.3240072202166067
Shopping : 2.2450361010830324
Books & Reference : 2.1435018050541514
Simulation : 2.0419675090252705
Dating : 1.861462093862816
Arcade : 1.8501805054151623
Video Players & Editors : 1.7712093862815883
Casual : 1.7599277978339352
Maps & Navigation : 1.3989169675090252
Food & Drink : 1.2409747292418771
Puzzle : 1.128158844765343
Racing : 0.9927797833935018
Role Playing : 0.9363718411552346
Libraries & Demo : 0.9363718411552346
Auto & Vehicles : 0.9250902527075

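### On to Usage of Apps

The App Store data has no installs column, so we use the average number of user ratings per genre ('rating_count_tot', column index 5) as a proxy for how widely the apps in each genre are used.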

In [23]:
genres_ios = freq_table(ios_free, -5)

for genre in genres_ios:
    total = 0
    len_genre = 0
    for app in ios_free:
        genre_app = app[-5]
        if genre_app == genre:
            num_ratings = float(app[5])
            total += num_ratings
            len_genre += 1
        
    avg_n_ratings = total / len_genre
    print(genre, ':', avg_n_ratings)
    


Social Networking : 71548.34905660378
Productivity : 21028.410714285714
Book : 39758.5
Utilities : 18684.456790123455
Navigation : 86090.33333333333
Medical : 612.0
Catalogs : 4004.0
Music : 57326.530303030304
News : 21248.023255813954
Reference : 74942.11111111111
Weather : 52279.892857142855
Entertainment : 14029.830708661417
Games : 22788.6696905016
Photo & Video : 28441.54375
Lifestyle : 16485.764705882353
Travel : 28243.8
Sports : 23008.898550724636
Health & Fitness : 23298.015384615384
Shopping : 26919.690476190477
Food & Drink : 33333.92307692308
Education : 7003.983050847458
Finance : 31467.944444444445
Business : 7491.117647058823
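
The output above is printed in dictionary order, which makes genres hard to compare. A minimal sketch of the same calculation done in one pass over the data and sorted from highest to lowest average follows; `average_by_group` is a hypothetical helper and the sample rows are made up.

```python
from collections import defaultdict


def average_by_group(dataset, group_idx, value_idx):
    """Return (group, average) pairs sorted by average, highest first."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in dataset:
        totals[row[group_idx]] += float(row[value_idx])
        counts[row[group_idx]] += 1
    averages = {group: totals[group] / counts[group] for group in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Illustrative rows: id, name, genre, rating count
sample = [
    ['1', 'App A', 'Games', '100'],
    ['2', 'App B', 'Games', '300'],
    ['3', 'App C', 'Weather', '500'],
]
print(average_by_group(sample, 2, 3))  # [('Weather', 500.0), ('Games', 200.0)]
```

For the App Store data the call would be `average_by_group(ios_free, -5, 5)`, using the same column indexes as the cell above.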


## Conclusions for iOS

We recommend a profile of apps targeted toward travel, which can also cover related activities such as checking the "Weather", tracking expenses in "Finance", listening to "Music" while on the road or in the airport, and finding "Food & Drink" recommendations while out and about.

## Looking at Google Play Data

The installs data is imprecise: the values are open-ended buckets such as "100,000+" rather than exact counts. To compute averages, we need to convert these strings to numbers by removing the commas and plus signs.
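The conversion can be sketched as a small function; `parse_installs` is an illustrative name, and the sample strings are taken from the install buckets that appear in the data.

```python
def parse_installs(installs_text):
    """Convert an installs string such as '100,000+' to a float (100000.0)."""
    cleaned = installs_text.replace(',', '').replace('+', '')
    return float(cleaned)


for raw in ['100,000+', '1,000,000,000+', '0']:
    print(raw, '->', parse_installs(raw))
```

Because each bucket is a lower bound ("100,000+" means at least 100,000 installs), averages built from these numbers are likely to understate the true install counts.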

In [24]:
display_table(android_free, 5)

1,000,000+ : 15.726534296028879
100,000+ : 11.552346570397113
10,000,000+ : 10.548285198555957
10,000+ : 10.198555956678701
1,000+ : 8.393501805054152
100+ : 6.915613718411552
5,000,000+ : 6.825361010830325
500,000+ : 5.561823104693141
50,000+ : 4.7721119133574
5,000+ : 4.512635379061372
10+ : 3.5424187725631766
500+ : 3.2490974729241873
50,000,000+ : 2.3014440433213
100,000,000+ : 2.1322202166064983
50+ : 1.917870036101083
5+ : 0.78971119133574
1+ : 0.5076714801444043
500,000,000+ : 0.2707581227436823
1,000,000,000+ : 0.22563176895306858
0+ : 0.04512635379061372
0 : 0.01128158844765343


In [25]:
categories_android = freq_table(android_free, 1)
for category in categories_android:
    total = 0
    len_category = 0
    for app in android_free:
        category_app = app[1]
        if category_app == category:
            n_installs = app[5]
            n_installs = n_installs.replace('+', '')
            n_installs = n_installs.replace(',', '')
            n_installs = float(n_installs)
            total += n_installs
            len_category += 1
            
    avg_n_installs = total / len_category
    print(category, ":", avg_n_installs)

COMMUNICATION : 38456119.167247385
MAPS_AND_NAVIGATION : 4056941.7741935486
GAME : 15588015.603248259
PHOTOGRAPHY : 17840110.40229885
BOOKS_AND_REFERENCE : 8767811.894736841
SPORTS : 3638640.1428571427
PARENTING : 542603.6206896552
FINANCE : 1387692.475609756
LIBRARIES_AND_DEMO : 638503.734939759
BEAUTY : 513151.88679245283
HOUSE_AND_HOME : 1331540.5616438356
EVENTS : 253542.22222222222
SOCIAL : 23253652.127118643
ART_AND_DESIGN : 1986335.0877192982
DATING : 854028.8303030303
BUSINESS : 1712290.1474201474
PERSONALIZATION : 5201482.6122448975
VIDEO_PLAYERS : 24727872.452830188
FAMILY : 3695641.8198090694
AUTO_AND_VEHICLES : 647317.8170731707
TRAVEL_AND_LOCAL : 13984077.710144928
FOOD_AND_DRINK : 1924897.7363636363
PRODUCTIVITY : 16787331.344927534
WEATHER : 5074486.197183099
SHOPPING : 7036877.311557789
COMICS : 817657.2727272727
MEDICAL : 120550.61980830671
ENTERTAINMENT : 11640705.88235294
NEWS_AND_MAGAZINES : 9549178.467741935
LIFESTYLE : 1437816.2687861272
TOOLS : 10801391.298666667