# Data Analysis to find which apps are likely to attract more users.
> As a Data Analysts our aim is to find which apps are likely to attract more users for a company that builds free Android and iOS apps, which are available on Google Play and App Store.
>
> The company, only build apps that are free to download and install, and their main source of revenue consists of in-app ads. This means that their revenue for any given app is mostly influenced by the number of users who uses their app. Our goal for this project is to analyze data to help developers understand which apps are likely to attract more users on both Google Play and App Store. And thus profit the company.

In [1]:
def explore_data(dataset, start, end, rows_and_columns = False):
    dataset_slice = dataset[start:end]
    for lst in dataset_slice:
        print(lst)
        print('\n')
    if rows_and_columns:
        print('Number of rows: ', len(dataset[1:]))
        print('Number of columns: ', len(dataset[0]))
        print('\n')

In [2]:
# reading our dataset and preparing it for analysis
openedAppleStore = open('AppleStore.csv', encoding = "utf-8")
openedGoogleplay = open('googleplaystore.csv', encoding = "utf-8")
from csv import reader
dataset_AppleStore = reader(openedAppleStore)
dataset_Googleplay = reader(openedGoogleplay)
datasetlst_AppleStore = list(dataset_AppleStore)
datasetlst_Googleplay = list(dataset_Googleplay)

explore_data(dataset = datasetlst_AppleStore, start = 1, end = 6, rows_and_columns = True)
explore_data(dataset = datasetlst_Googleplay, start = 1, end = 6, rows_and_columns = True)

print(datasetlst_AppleStore[0])
print('\n')
print(datasetlst_Googleplay[0])

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['284035177', 'Pandora - Music & Radio', '130242560', 'USD', '0.0', '1126879', '3594', '4.0', '4.5', '8.4.1', '12+', 'Music', '37', '4', '1', '1']


Number of rows:  7197
Number of columns:  16


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967

>You can download the data set AppleStore.csv and find its documentation on this [link](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/download)
>
>
  | Column Name        |           Description                |
  |:----|:----|
  |"id"                | App ID                               |
  |"track_name"        | App Name                             |
  |"size_bytes"        | Size (in Bytes)                      |
  |"currency"          | Currency Type                        |
  |"price"             | Price amount                         |
  |"rating_count_tot"  | User Rating counts (all ver)         |
  |"rating_count_ver"  | User Rating counts (current ver)     |
  |"user_rating"       | Avg User Rating value (all ver)      |
  |"user_rating_ver"   | Avg User Rating value (current ver)  |
  |"ver"               | Latest version code                  |
  |"cont_rating"       | Content Rating                       |
  |"prime_genre"       | Primary Genre                        |
  |"sup_devices.num"   | Nos of supporting devices            |
  |"ipadSc_urls.num"   | Nos of screenshots showed for display|
  |"lang.num"          | Number of supported languages        |
  |"vpp_lic"           | Vpp Device Based Licensing Enabled   |
                                                                 >    
>You can download the data set googleplaystore.csv and find its documentation on this [link](https://www.kaggle.com/gauthamp10/google-playstore-apps/download#Google-Playstore-Full.csv)
>
> 
  | Column Name        |           Description                |
  |:----|:----|
  |"App"               | Name of the app                      |
  |"Category  "        | Category to which the app belongs    | 
  |"Rating"            | Rating for the app (max - 5)         |
  |"Reviews"           | Review counts                        |
  |"Size"              | Size of the app                      |
  |"Installs"          | Number of app installs               |
  |"Price"             | Price of the app in dollars          |
  |"Content Rating"    | Intended audience or age group target|
  |"Last Updated"      | Last updated date                    |
  |"Minimum Version"   | Minimum android version required     |
  |"Latest Version"    | Current version of the app           |
  

In [3]:
#cleaning our dataset records
# this record has this entry, which has missing 'Rating' datapoint
# and a column shift is happening for next columns
print(datasetlst_Googleplay[10473]) 

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


In [4]:
# deleting the record we found above having 
# missing datapoint in its record
del datasetlst_Googleplay[10473]

In [5]:
#finding the new length of our dataset
print(len(datasetlst_Googleplay[1:]))

10840


In [6]:
#finding duplicates records
counta, countb = 0 , 0
appA = []
appB = []
for applst in datasetlst_Googleplay:
    appname = applst[0]
    if appname =='Instagram':
        appA.append(applst)
        counta += 1
    elif appname == 'Box':
        appB.append(applst)
        countb += 1

print(appA)
print(counta)
print('\n')
print(appB)
print(countb)

[['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]
4


[['Box', 'BUSINESS', '4.2', '159872', 'Varies with device', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Box', 'BUSINESS', '4.2', '159872', 'Varies with device', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 

In [7]:
# finding total duplicates records in our dataset
duplicate_apps = []
unique_apps = []
for lst in datasetlst_Googleplay:
    appname = lst[0]
    if appname in unique_apps:
        duplicate_apps.append(lst)
    else:
        unique_apps.append(appname)
print('Number of duplicate apps', len(duplicate_apps))
print('\n')
print(duplicate_apps[:20])

Number of duplicate apps 1181


[['Quick PDF Scanner + OCR FREE', 'BUSINESS', '4.2', '80805', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'February 26, 2018', 'Varies with device', '4.0.3 and up'], ['Box', 'BUSINESS', '4.2', '159872', 'Varies with device', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 31, 2018', 'Varies with device', 'Varies with device'], ['Google My Business', 'BUSINESS', '4.4', '70991', 'Varies with device', '5,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 24, 2018', '2.19.0.204537701', '4.4 and up'], ['ZOOM Cloud Meetings', 'BUSINESS', '4.4', '31614', '37M', '10,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 20, 2018', '4.1.28165.0716', '4.0 and up'], ['join.me - Simple Meetings', 'BUSINESS', '4.0', '6989', 'Varies with device', '1,000,000+', 'Free', '0', 'Everyone', 'Business', 'July 16, 2018', '4.3.0.508', '4.4 and up'], ['Box', 'BUSINESS', '4.2', '159872', 'Varies with device', '10,000,000+', 'Free', '0', '

> As seen in above last two coding -> we have `duplicates records in googleplaystore.csv file`.
>
> `1181 records are duplicate`.
>
> To remove these records we won't remove it randomly -> as in second last coding gave us the output that we have `4 duplicate records of Instagram` and `3 duplicate records of Box` apps. And if we see the output we may notice that Instagram app has main difference on the fourth position (`Reviews`) of each of the 4 duplicate records. Whereas Box has no difference in its all 3 duplicate records.
>
> So we may use this pick-point for deleting duplicate records - like the `highest number of reviews means it is the most recent record`.
>
> Checking all 1181 records will not be possible so we can use some pick-points if ever they exists to help us delete duplicate records rather than deleting it randomly. Like as mentioned above `Reviews` can be one of them. We can also code for other pick-points:
1. `rating, the highest rating is most recent record`
2. `install, the highest number of install is most reccent record`
3. `last Updated, the most recent date is to be considered`
4. `current ver, the newest version is to be considered`

In [8]:
# calculating what length our dataset will reduce after
# deleting duplicate records
print('Expected length after deleting duplicate records:', len(datasetlst_Googleplay[1:]) - 1181)

Expected length after deleting duplicate records: 9659


> For now to delete the duplicate records the only pick-point we will use in this project is the highest `review`

In [9]:
# creating a sub-dataset having unique records with highest reviews
# and for instance finding Instagram highest review records from 
# all its duplicate records
reviews_max = {}
for lst in datasetlst_Googleplay[1:]:
    name = lst[0]
    n_reviews =float(lst[3])
    if name not in reviews_max:
         reviews_max[name] = n_reviews
    elif name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews

print(len(reviews_max))
print('Instagram:',reviews_max['Instagram'])

9659
Instagram: 66577446.0


> In above code we have succesfully seperated duplicate records and got 9659 unique records.
>
> Also note that our Instagram record has that highest number of reviews.

In [10]:
# making a clean dataset as per our above coding
cleandatasetlst_Googleplay = []
already_added = [] 
cleandatasetlst_Googleplay.append(datasetlst_Googleplay[0])
for lst in datasetlst_Googleplay[1:]:
    name = lst[0]
    n_reviews =float(lst[3])
    if n_reviews == reviews_max[name] and name not in already_added:
        cleandatasetlst_Googleplay.append(lst)
        already_added.append(name)
        
print(len(cleandatasetlst_Googleplay[1:]))
print('\n')
print(cleandatasetlst_Googleplay[:15])
print('\n')
for lstb in cleandatasetlst_Googleplay[1:]:
    if lstb[0] == 'Instagram':
        print(lstb)
        break

9659


[['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver'], ['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up'], ['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up'], ['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up'], ['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up'], ['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Ev

> already_added is a list that acts as a checkpoint to make sure 
that our cleandatasetlst_Googleplay has only unique records as we 
iterate through datasetlst_Googleplay that has duplicate records.

In [11]:
# lets explore our clean dataset
explore_data(dataset = cleandatasetlst_Googleplay, start = 1, end = 6, rows_and_columns = True)

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art Coloring Book', 'ART_AND_DESIGN', '4.3', '967', '2.8M', '100,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'June 20, 2018', '1.1', '4.4 and up']


['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up']


Number of rows:  9659
Number of columns:  13




> Now  if we explore the data long enough, we'll find that both data sets datasetlst_AppleStore and datasetlst_Googleplay have apps with names that suggest they are not directed toward an English-speaking audience. And as the company use English for the apps they develop at their company, and we'd like to analyze only the apps that are directed toward an English-speaking audience. 

In [12]:
#print(datasetlst_AppleStore[813][1])
#print(datasetlst_AppleStore[6731][1])
#print('\n')
#print(cleandatasetlst_Googleplay[4412][0])
#print(datasetlst_Googleplay[7940][0])
#print('\n')
# just finding the characters in app name which are special or 
# non-English words
find = False
for character in datasetlst_AppleStore[1:]:
    find = False
    for eachchr in character[1]:
        if ord(eachchr)>127:
            find = True
    if find:
        print(character[1])
print('\n')
find = False
for characterj in cleandatasetlst_Googleplay[1:]:
    find = False
    for eachchrj in characterj[0]:
        if ord(eachchrj)>127:
            find = True
    if find:
        print(characterj[0])

Google – Search made just for mobile
The Sims™ FreePlay
8 Ball Pool™
Lose It! – Weight Loss Program and Calorie Counter
▻Sudoku
Fruit Ninja®
iHeartRadio – Free Music & Radio Stations
The Simpsons™: Tapped Out
Plants vs. Zombies™ 2
Pokémon GO
Star Wars™: Commander
Kindle – Read eBooks, Magazines & Textbooks
Chase Mobile℠
The Weather Channel App for iPad – best local forecast, radar map, and storm tracking
Call of Duty®: Heroes
ooVoo – Free Video Call, Text and Voice
The Secret Society® - Hidden Mystery
Viber Messenger – Text & Call
Words with Friends – Best Word Game
Jurassic World™: The Game
Flashlight Ⓞ
▻Solitaire
Guess My Age  Math Magic
Tetris® Blitz
Star Wars™: Galaxy of Heroes
Bubble Mania™
Big Fish Casino – Best Vegas Slot Machines & Games
⋆Solitaire
Audible – audio books, original series & podcasts
DoubleDown Casino & Slots  – Vegas Slot Machines!
Walgreens – Pharmacy, Photo, Coupons and Shopping
ABC – Watch Live TV & Stream Full Episodes
QuizUp™
UNO ™ & Friends
Solitaire·
Over

In [13]:
# practice coding to find non-English characters. 
def findEnglishName1(app):
    boolvalue = False
    for character in app:
        if ord(character)>=0 and ord(character)<=127:
            boolvalue = True
        else:
            boolvalue = False
            break
    return boolvalue

print(findEnglishName1('Instagram'))
print(findEnglishName1('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(findEnglishName1('Docs To Go™ Free Office Suite'))
print(findEnglishName1('Instachat 😜'))

True
False
False
False


In [14]:
# creating a function find English app name and 
# having non-English or other special 
# Characters which are not more than 3 in number in our app name
def findEnglishName(app):
    boolvalue = True
    count = 0
    for character in app:
        if not(ord(character)>=0 and ord(character)<=127):
            count +=1 
            if count > 3:
                boolvalue = False
                break
    return boolvalue

print(findEnglishName('Instagram'))
print(findEnglishName('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(findEnglishName('Docs To Go™ Free Office Suite'))
print(findEnglishName('Instachat 😜'))
print(findEnglishName('AJ렌터카 법인 카셰어링'))

True
False
True
True
False


In [15]:
# cleaning our both datasets on basis of above coding
datasetlst_AppleStoreEnglish = []
cleandatasetlst_GoogleplayEnglish = []
datasetlst_AppleStoreEnglish.append(datasetlst_AppleStore[0])
cleandatasetlst_GoogleplayEnglish.append(cleandatasetlst_Googleplay[0])

for lst in datasetlst_AppleStore[1:]:
    appname = lst[1]
    if findEnglishName(appname):
        datasetlst_AppleStoreEnglish.append(lst)
        
for lsta in cleandatasetlst_Googleplay[1:]:
    appname = lsta[0]
    if findEnglishName(appname):
        cleandatasetlst_GoogleplayEnglish.append(lsta)

explore_data(dataset = datasetlst_AppleStoreEnglish, start = 1, end = 6, rows_and_columns = True)
explore_data(dataset = cleandatasetlst_GoogleplayEnglish, start = 1, end = 6, rows_and_columns = True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['284035177', 'Pandora - Music & Radio', '130242560', 'USD', '0.0', '1126879', '3594', '4.0', '4.5', '8.4.1', '12+', 'Music', '37', '4', '1', '1']


Number of rows:  6183
Number of columns:  16


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps'

In [16]:
#for lstc in cleandatasetlst_GoogleplayEnglish[1:]:
    #print(lstc[0])

In [17]:
# cleaning our dataset further to find only free apps
# as the company makes free English apps only
free_AppleStoreEnglish = []
free_GoogleplayEnglish = []
free_AppleStoreEnglish.append(datasetlst_AppleStoreEnglish[0])
free_GoogleplayEnglish.append(cleandatasetlst_GoogleplayEnglish[0])

for lst in datasetlst_AppleStoreEnglish[1:]:
    price = float(lst[4])
    if price == 0.0:
        free_AppleStoreEnglish.append(lst)
        
for lsta in cleandatasetlst_GoogleplayEnglish[1:]:
    price = lsta[7]
    if price == '0':
        free_GoogleplayEnglish.append(lsta)

explore_data(dataset = free_AppleStoreEnglish, start = 1, end = 6, rows_and_columns = True)
explore_data(dataset = free_GoogleplayEnglish, start = 1, end = 6, rows_and_columns = True)

['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


['420009108', 'Temple Run', '65921024', 'USD', '0.0', '1724546', '3842', '4.5', '4.0', '1.6.2', '9+', 'Games', '40', '5', '1', '1']


['284035177', 'Pandora - Music & Radio', '130242560', 'USD', '0.0', '1126879', '3594', '4.0', '4.5', '8.4.1', '12+', 'Music', '37', '4', '1', '1']


Number of rows:  3222
Number of columns:  16


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps'

> Our aim is to determine the kinds of apps that are likely to 
attract more users because the company revenue is highly 
influenced by the number of people using their apps.
>
> To minimize risks and overhead, our validation strategy for 
an app idea is comprised of three steps:
>
1. Build a minimal Android version of the app, and add it to Google Play.
2. If the app has a good response from users, they will develop it further.
3. If the app is profitable after six months, they will build an iOS version of the app and add it to the App Store.
>
> Because their end goal is to add the app on both Google Play and the App Store, we need to find app profiles that are successful on both markets.

In [18]:
# creating function to calculate frequency percentage of a 
# genre or category
def freq_table(dataset, index):
    freqcal_table = {}
    for lst in dataset:
        if lst[index] in freqcal_table:
            freqcal_table[lst[index]] += 1
        else:
            freqcal_table[lst[index]] = 1       
    for lsta in freqcal_table:
        avgcount = (freqcal_table[lsta] / len(dataset)) * 100
        freqcal_table[lsta] = avgcount
    return freqcal_table 

applefreq = {}
applefreq = freq_table(free_AppleStoreEnglish, 11)
for lstc in applefreq:
    print(lstc +': '+ str(applefreq[lstc]))
print('\n')
googleplayfreq = {}
googleplayfreq = freq_table(free_GoogleplayEnglish, 1)
for lstd in googleplayfreq:
    print(lstd +': '+ str(googleplayfreq[lstd]))
print('\n')
googleplayfreqa = {}
googleplayfreqa = freq_table(free_GoogleplayEnglish, 9)
for lstd in googleplayfreqa:
    print(lstd +': '+ str(googleplayfreqa[lstd]))

prime_genre: 0.031026993484331366
Social Networking: 3.288861309339125
Shopping: 2.6062674526838348
Games: 58.144585789636984
Business: 0.5274588892336333
Sports: 2.140862550418864
Book: 0.43437790878063914
Photo & Video: 4.964318957493019
Education: 3.661185231151101
Food & Drink: 0.8067018305926157
Finance: 1.1169717654359292
Reference: 0.5584858827179646
Lifestyle: 1.5823766677008997
Productivity: 1.7375116351225566
Weather: 0.8687558175612783
Catalogs: 0.12410797393732546
Utilities: 2.513186472230841
Navigation: 0.18616196090598822
Entertainment: 7.880856345020168
Health & Fitness: 2.016754576481539
Travel: 1.2410797393732547
Medical: 0.18616196090598822
Music: 2.04778156996587
News: 1.3341607198262488


VIDEO_PLAYERS: 1.793570219966159
PARENTING: 0.6542583192329385
SPORTS: 3.395375070501974
LIFESTYLE: 3.902989283699944
BUSINESS: 4.591088550479413
ART_AND_DESIGN: 0.6429780033840948
PHOTOGRAPHY: 2.9441624365482233
DATING: 1.8612521150592216
AUTO_AND_VEHICLES: 0.924985899605189
HOUSE

In [19]:
# displaying 
def display_table(dataset, index):
    freqcal = {}
    avglistcal = []
    sortedavglist = []
    freqcal = freq_table(dataset, index)
    for lstj in freqcal:
        avglistcal.append((freqcal[lstj], lstj))
    
    sortedavglist = sorted(avglistcal, reverse = True)   
        
    for lstk in sortedavglist:
        print(lstk[1] +': ' + str(lstk[0]))
        
display_table(free_AppleStoreEnglish, 11)
print('\n')
display_table(free_GoogleplayEnglish, 1)
print('\n')
display_table(free_GoogleplayEnglish, 9)
    

Games: 58.144585789636984
Entertainment: 7.880856345020168
Photo & Video: 4.964318957493019
Education: 3.661185231151101
Social Networking: 3.288861309339125
Shopping: 2.6062674526838348
Utilities: 2.513186472230841
Sports: 2.140862550418864
Music: 2.04778156996587
Health & Fitness: 2.016754576481539
Productivity: 1.7375116351225566
Lifestyle: 1.5823766677008997
News: 1.3341607198262488
Travel: 1.2410797393732547
Finance: 1.1169717654359292
Weather: 0.8687558175612783
Food & Drink: 0.8067018305926157
Reference: 0.5584858827179646
Business: 0.5274588892336333
Book: 0.43437790878063914
Navigation: 0.18616196090598822
Medical: 0.18616196090598822
Catalogs: 0.12410797393732546
prime_genre: 0.031026993484331366


FAMILY: 18.905809362662154
GAME: 9.723632261703328
TOOLS: 8.460236886632826
BUSINESS: 4.591088550479413
LIFESTYLE: 3.902989283699944
PRODUCTIVITY: 3.8917089678511
FINANCE: 3.699943598420756
MEDICAL: 3.5307388606880994
SPORTS: 3.395375070501974
PERSONALIZATION: 3.3164128595600673
CO

> Analysing our free_AppleStoreEnglish dataset we found in free app English section Games is on top with 58.14% followed by Entertainment: 7.88%.
> Photo & Video: 4.96%, Education: 3.66%, Social Networking: 3.28%, Shopping: 2.60%, Utilities: 2.51%, Sports: 2.14%, Music: 2.04%, Health & Fitness: 2.01% and Productivity: 1.73%
> Enjoyment app like Games, Entertainment and Photo & Video are high. Whereas Education and Social Networking are next. While Shopping, Utilities , Sports, Music, Health & Fitness are so-so. And all other app including Productivity are towards low.
> Looking at the output we just get a brief idea which genre is more preferred but then it is too broad term. We need to narrow down as to which Games are more in demand like - Puzzle, Brain Games, Action Games etc. Besides having more apps in a particular genre only throws light that this genre is more preferred and so the user are given different options in the same to select for themselves as per their likes - Brain Games / Action Games / etc. But then the point arise which particular app has more likeable and has more users of that app.
>
> Analysing our free_GoogleplayEnglish dataset we found in free app English section category-wise FAMILY: 18.90%, followed by GAME: 9.72% and TOOLS: 8.46%. Then comes BUSINESS: 4.59%, LIFESTYLE: 3.90%, PRODUCTIVITY: 3.89%, FINANCE: 3.69%, MEDICAL: 3.53% and etc. While in genre list we found Tools: 8.44%, 
followed by Entertainment: 6.06% and Education: 5.34%. Then comes Business: 4.59%, Productivity: 3.89%, Lifestyle: 3.89%, Finance: 3.69%, Medical: 3.53% and etc. Here we see somewhat towards even distribution of apps in different categories and genre
>
> The difference we saw between genre in free_AppleStoreEnglish and category and genre in free_GoogleplayEnglish output is that free_AppleStoreEnglish has uneven distribution or a wide difference between total numbers of apps in one genre than in others. Whereas in free_GoogleplayEnglish has somewhat fairly even distribution or a narrow difference between total numbers of apps in one genre than in others. 
>
> As mentioned above total numbers of apps in a genre or a category seems a vague or too broad term to conclude. What we need is see the target user like of a particular app (downloads and frequency of usage)

In [20]:
# cleaning our dataset from unnecessary records
for lsty in applefreq:
    if lsty == 'prime_genre':
        del applefreq['prime_genre']
        break
print('prime_genre' in  applefreq)

False


In [21]:
# just for checking to find any irrelevant data
for lstx in applefreq:
    name = lstx
    print('In freq: ' + name)
    for lsty in free_AppleStoreEnglish:
        name_d = lsty[11]
        #print('In database: ' + name_d)
        if name == name_d:
            print('In database: ' + name_d + ' equal ' + lsty[5])
        

In freq: Social Networking
In database: Social Networking equal 2974676
In database: Social Networking equal 1061624
In database: Social Networking equal 373519
In database: Social Networking equal 351466
In database: Social Networking equal 334293
In database: Social Networking equal 287589
In database: Social Networking equal 260965
In database: Social Networking equal 177501
In database: Social Networking equal 164963
In database: Social Networking equal 164249
In database: Social Networking equal 112778
In database: Social Networking equal 97072
In database: Social Networking equal 90414
In database: Social Networking equal 85535
In database: Social Networking equal 75412
In database: Social Networking equal 71856
In database: Social Networking equal 60659
In database: Social Networking equal 60163
In database: Social Networking equal 52642
In database: Social Networking equal 49510
In database: Social Networking equal 43877
In database: Social Networking equal 39819
In database: S

In [22]:
# finding popular apps for analysis on basis of user rating count for all version  
def user_rating(database, genre_list, genre_index, user_index):
    userrating_lst = []
    #print(genre_list)
    for lstn in genre_list:
        count = 0
        total = 0
        genre_name = lstn
        for lstq in database[1:]:
            gname = lstq[genre_index]
            if gname == genre_name:
                count += 1
                total += float(lstq[user_index])
        avgrating = total / count
        userrating_lst.append((avgrating , genre_name))
    sorted_userrating = []
    sorted_userrating = sorted(userrating_lst, reverse = True)
    for lstu in sorted_userrating:
        print(lstu[1] + ': ' + str(lstu[0]))
        
            
user_rating(database = free_AppleStoreEnglish, genre_list = applefreq, genre_index = 11, user_index = 5)
#user_rating(database = free_GoogleplayEnglish, genre_list = googleplayfreq, genre_index = 1, user_index = 5)

Navigation: 86090.33333333333
Reference: 74942.11111111111
Social Networking: 71548.34905660378
Music: 57326.530303030304
Weather: 52279.892857142855
Book: 39758.5
Food & Drink: 33333.92307692308
Finance: 31467.944444444445
Photo & Video: 28441.54375
Travel: 28243.8
Shopping: 26919.690476190477
Health & Fitness: 23298.015384615384
Sports: 23008.898550724636
Games: 22788.6696905016
News: 21248.023255813954
Productivity: 21028.410714285714
Utilities: 18684.456790123455
Lifestyle: 16485.764705882353
Entertainment: 14029.830708661417
Business: 7491.117647058823
Education: 7003.983050847458
Catalogs: 4004.0
Medical: 612.0


In [23]:
# just checking for irrelevant data
for lstxx in googleplayfreq:
    name = lstxx
    print('In freq: ' + name)
    for lstyy in free_GoogleplayEnglish:
        name_d = lstyy[1]
        #print('In database: ' + name_d)
        if name == name_d:
            print('In database: ' + name_d + ' equal ' + lsty[5])

In freq: VIDEO_PLAYERS
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal 0
In database: VIDEO_PLAYERS equal

In [24]:
# based on installation done in GooglePlay
display_table(free_GoogleplayEnglish, 5)

1,000,000+: 15.724760293288211
100,000+: 11.551043429216017
10,000,000+: 10.547095318668923
10,000+: 10.197405527354766
1,000+: 8.392554991539763
100+: 6.91483361534123
5,000,000+: 6.824591088550479
500,000+: 5.561195713479977
50,000+: 4.771573604060913
5,000+: 4.512126339537507
10+: 3.542019176536943
500+: 3.248730964467005
50,000,000+: 2.3011844331641287
100,000,000+: 2.131979695431472
50+: 1.9176536943034406
5+: 0.7896221094190639
1+: 0.5076142131979695
500,000,000+: 0.2707275803722504
1,000,000,000+: 0.2256063169768754
0+: 0.04512126339537507
Installs: 0.011280315848843767
0: 0.011280315848843767


In [25]:
# finding popular apps for analysis on basis of installation by  
# removing/replacing the '+' and ',' in installation data 

def user_rating(database, genre_list, genre_index, user_index):
    userrating_lst = []
    #print(genre_list)
    for lstn in genre_list:
        count = 0
        total = 0
        genre_name = lstn
        for lstq in database[1:]:
            gname = lstq[genre_index]
            if gname == genre_name:
                count += 1
                #installation = lstq[user_index]
                #print(',' in lstq[user_index])
                #print('+' in lstq[user_index])
                #print(lstq[user_index])
                if ',' in lstq[user_index]:
                    lstq[user_index] = lstq[user_index].replace(',', '')
                    #print(lstq[user_index])
                if '+' in lstq[user_index]:
                    lstq[user_index] = lstq[user_index].replace('+', '')
                    #print(lstq[user_index])
                total += float(lstq[user_index])
                #print(gname + ': '+ str(total) + ' ' + str(count))
        if total != 0:
            avgrating = total / count
        userrating_lst.append((avgrating , genre_name))
    sorted_userrating = []
    sorted_userrating = sorted(userrating_lst, reverse = True)
    for lstu in sorted_userrating:
        print(lstu[1] + ': ' + str(lstu[0]))
        
            
#user_rating(database = free_AppleStoreEnglish, genre_list = applefreq, genre_index = 11, user_index = 5)
user_rating(database = free_GoogleplayEnglish, genre_list = googleplayfreq, genre_index = 1, user_index = 5)

COMMUNICATION: 38456119.167247385
VIDEO_PLAYERS: 24727872.452830188
SOCIAL: 23253652.127118643
PHOTOGRAPHY: 17840110.40229885
PRODUCTIVITY: 16787331.344927534
GAME: 15588015.603248259
TRAVEL_AND_LOCAL: 13984077.710144928
ENTERTAINMENT: 11640705.88235294
TOOLS: 10801391.298666667
NEWS_AND_MAGAZINES: 9549178.467741935
BOOKS_AND_REFERENCE: 8767811.894736841
SHOPPING: 7036877.311557789
PERSONALIZATION: 5201482.6122448975
WEATHER: 5074486.197183099
HEALTH_AND_FITNESS: 4188821.9853479853
MAPS_AND_NAVIGATION: 4056941.7741935486
FAMILY: 3695641.8198090694
Category: 3695641.8198090694
SPORTS: 3638640.1428571427
ART_AND_DESIGN: 1986335.0877192982
FOOD_AND_DRINK: 1924897.7363636363
EDUCATION: 1833495.145631068
BUSINESS: 1712290.1474201474
LIFESTYLE: 1437816.2687861272
FINANCE: 1387692.475609756
HOUSE_AND_HOME: 1331540.5616438356
DATING: 854028.8303030303
COMICS: 817657.2727272727
AUTO_AND_VEHICLES: 647317.8170731707
LIBRARIES_AND_DEMO: 638503.734939759
PARENTING: 542603.6206896552
BEAUTY: 513151.

In [26]:
# just repeating to get both platform data next to each other for analysis
def user_rating(database, genre_list, genre_index, user_index):
    userrating_lst = []
    #print(genre_list)
    for lstn in genre_list:
        count = 0
        total = 0
        genre_name = lstn
        for lstq in database[1:]:
            gname = lstq[genre_index]
            if gname == genre_name:
                count += 1
                total += float(lstq[user_index])
        avgrating = total / count
        userrating_lst.append((avgrating , genre_name))
    sorted_userrating = []
    sorted_userrating = sorted(userrating_lst, reverse = True)
    for lstu in sorted_userrating:
        print(lstu[1] + ': ' + str(lstu[0]))
        
            
user_rating(database = free_AppleStoreEnglish, genre_list = applefreq, genre_index = 11, user_index = 5)
#user_rating(database = free_GoogleplayEnglish, genre_list = googleplayfreq, genre_index = 1, user_index = 5)

Navigation: 86090.33333333333
Reference: 74942.11111111111
Social Networking: 71548.34905660378
Music: 57326.530303030304
Weather: 52279.892857142855
Book: 39758.5
Food & Drink: 33333.92307692308
Finance: 31467.944444444445
Photo & Video: 28441.54375
Travel: 28243.8
Shopping: 26919.690476190477
Health & Fitness: 23298.015384615384
Sports: 23008.898550724636
Games: 22788.6696905016
News: 21248.023255813954
Productivity: 21028.410714285714
Utilities: 18684.456790123455
Lifestyle: 16485.764705882353
Entertainment: 14029.830708661417
Business: 7491.117647058823
Education: 7003.983050847458
Catalogs: 4004.0
Medical: 612.0


In [27]:
def user_rating(database, genre_list, genre_index, user_index):
    userrating_lst = []
    #print(genre_list)
    for lstn in genre_list:
        count = 0
        total = 0
        genre_name = lstn
        for lstq in database[1:]:
            gname = lstq[genre_index]
            if gname == genre_name:
                count += 1
                if ',' in lstq[user_index]:
                    lstq[user_index] = lstq[user_index].replace(',', '')
                    #print(lstq[user_index])
                if '+' in lstq[user_index]:
                    lstq[user_index] = lstq[user_index].replace('+', '')
                    #print(lstq[user_index])
                total += float(lstq[user_index])
                #print(gname + ': '+ str(total) + ' ' + str(count))
        if total != 0:
            avgrating = total / count
        userrating_lst.append((avgrating , genre_name))
    sorted_userrating = []
    sorted_userrating = sorted(userrating_lst, reverse = True)
    for lstu in sorted_userrating:
        print(lstu[1] + ': ' + str(lstu[0]))
        
            
#user_rating(database = free_AppleStoreEnglish, genre_list = applefreq, genre_index = 11, user_index = 5)
user_rating(database = free_GoogleplayEnglish, genre_list = googleplayfreq, genre_index = 1, user_index = 5)

COMMUNICATION: 38456119.167247385
VIDEO_PLAYERS: 24727872.452830188
SOCIAL: 23253652.127118643
PHOTOGRAPHY: 17840110.40229885
PRODUCTIVITY: 16787331.344927534
GAME: 15588015.603248259
TRAVEL_AND_LOCAL: 13984077.710144928
ENTERTAINMENT: 11640705.88235294
TOOLS: 10801391.298666667
NEWS_AND_MAGAZINES: 9549178.467741935
BOOKS_AND_REFERENCE: 8767811.894736841
SHOPPING: 7036877.311557789
PERSONALIZATION: 5201482.6122448975
WEATHER: 5074486.197183099
HEALTH_AND_FITNESS: 4188821.9853479853
MAPS_AND_NAVIGATION: 4056941.7741935486
FAMILY: 3695641.8198090694
Category: 3695641.8198090694
SPORTS: 3638640.1428571427
ART_AND_DESIGN: 1986335.0877192982
FOOD_AND_DRINK: 1924897.7363636363
EDUCATION: 1833495.145631068
BUSINESS: 1712290.1474201474
LIFESTYLE: 1437816.2687861272
FINANCE: 1387692.475609756
HOUSE_AND_HOME: 1331540.5616438356
DATING: 854028.8303030303
COMICS: 817657.2727272727
AUTO_AND_VEHICLES: 647317.8170731707
LIBRARIES_AND_DEMO: 638503.734939759
PARENTING: 542603.6206896552
BEAUTY: 513151.

> As seen in our free_GoogleplayEnglish 
1. `Navigation: 86090.33333333333` is on top followed by 
2. `Reference: 74942.11111111111`
3. `Social Networking: 71548.34905660378` then comes
4. `Music: 57326.530303030304`
5. `Weather: 52279.892857142855` and the rest 
6. `Book: 39758.5`
7 `Food & Drink: 33333.92307692308`
8. `Finance: 31467.944444444445`
9. `Photo & Video: 28441.54375`
10. `Travel: 28243.8`
11. `Shopping: 26919.690476190477` etc
>
> Let us analyse further what makes Navigation, Reference and Social Networking genre be on top three

In [28]:
for app in free_AppleStoreEnglish:
    if app[11] == 'Navigation':
        print(app[1], ':', app[5]) # print name and number of ratings

Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5


In [29]:
for app in free_AppleStoreEnglish:
    if app[11] == 'Reference':
        print(app[1], ':', app[5]) # print name and number of ratings

Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
GUNS MODS for Minecraft PC Edition - Mods Tools : 1497
Guides for Pokémon GO - Pokemon GO News and Cheats : 826
WWDC : 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718
VPN Express : 14
Real Bike Traffic Rider Virtual Reality Glasses : 8
教えて!goo : 0
Jishokun-Japanese English Dictionary & Translator : 0


In [30]:
for app in free_AppleStoreEnglish:
    if app[11] == 'Social Networking':
        print(app[1], ':', app[5]) # print name and number of ratings

Facebook : 2974676
Pinterest : 1061624
Skype for iPhone : 373519
Messenger : 351466
Tumblr : 334293
WhatsApp Messenger : 287589
Kik : 260965
ooVoo – Free Video Call, Text and Voice : 177501
TextNow - Unlimited Text + Calls : 164963
Viber Messenger – Text & Call : 164249
Followers - Social Analytics For Instagram : 112778
MeetMe - Chat and Meet New People : 97072
We Heart It - Fashion, wallpapers, quotes, tattoos : 90414
InsTrack for Instagram - Analytics Plus More : 85535
Tango - Free Video Call, Voice and Chat : 75412
LinkedIn : 71856
Match™ - #1 Dating App. : 60659
Skype for iPad : 60163
POF - Best Dating App for Conversations : 52642
Timehop : 49510
Find My Family, Friends & iPhone - Life360 Locator : 43877
Whisper - Share, Express, Meet : 39819
Hangouts : 36404
LINE PLAY - Your Avatar World : 34677
WeChat : 34584
Badoo - Meet New People, Chat, Socialize. : 34428
Followers + for Instagram - Follower Analytics : 28633
GroupMe : 28260
Marco Polo Video Walkie Talkie : 27662
Miitomo : 2

> As per our above coding we can analyse that Navigation, Reference and Social Networking genre be on top three because of a few giant companies app. Rest other apps in this genre are either performing so-so or struggling.

> Now lets us analyse our free_GoogleplayEnglish
> As seen 
1. `COMMUNICATION: 38456119.167247385` is on top followed by
2. `VIDEO_PLAYERS: 24727872.452830188`
3. `SOCIAL: 23253652.127118643` then comes
4. `PHOTOGRAPHY: 17840110.40229885`
5. `PRODUCTIVITY: 16787331.344927534`
6. `GAME: 15588015.603248259` and the list goes on.
> Let us dig further in Communication category.

In [31]:
for app in free_GoogleplayEnglish:
    if app[1] == 'COMMUNICATION':
        print(app[0], ':', app[5]) # print name and number of ratings

WhatsApp Messenger : 1000000000
Messenger for SMS : 10000000
My Tele2 : 5000000
imo beta free calls and text : 100000000
Contacts : 50000000
Call Free – Free Call : 5000000
Web Browser & Explorer : 5000000
Browser 4G : 10000000
MegaFon Dashboard : 10000000
ZenUI Dialer & Contacts : 10000000
Cricket Visual Voicemail : 10000000
TracFone My Account : 1000000
Xperia Link™ : 10000000
TouchPal Keyboard - Fun Emoji & Android Keyboard : 10000000
Skype Lite - Free Video Call & Chat : 5000000
My magenta : 1000000
Android Messages : 100000000
Google Duo - High Quality Video Calls : 500000000
Seznam.cz : 1000000
Antillean Gold Telegram (original version) : 100000
AT&T Visual Voicemail : 10000000
GMX Mail : 10000000
Omlet Chat : 10000000
My Vodacom SA : 5000000
Microsoft Edge : 5000000
Messenger – Text and Video Chat for Free : 1000000000
imo free video calls and chat : 500000000
Calls & Text by Mo+ : 5000000
free video calls and chat : 50000000
Skype - free IM & video calls : 1000000000
Who : 1000

In [32]:
for app in free_GoogleplayEnglish:
    if app[1] == 'BOOKS_AND_REFERENCE':
        print(app[0], ':', app[5])

E-Book Read - Read Book for free : 50000
Download free book with green book : 100000
Wikipedia : 10000000
Cool Reader : 10000000
Free Panda Radio Music : 100000
Book store : 1000000
FBReader: Favorite Book Reader : 10000000
English Grammar Complete Handbook : 500000
Free Books - Spirit Fanfiction and Stories : 1000000
Google Play Books : 1000000000
AlReader -any text book reader : 5000000
Offline English Dictionary : 100000
Offline: English to Tagalog Dictionary : 500000
FamilySearch Tree : 1000000
Cloud of Books : 1000000
Recipes of Prophetic Medicine for free : 500000
ReadEra – free ebook reader : 1000000
Anonymous caller detection : 10000
Ebook Reader : 5000000
Litnet - E-books : 100000
Read books online : 5000000
English to Urdu Dictionary : 500000
eBoox: book reader fb2 epub zip : 1000000
English Persian Dictionary : 500000
Flybook : 500000
All Maths Formulas : 1000000
Ancestry : 5000000
HTC Help : 10000000
English translation from Bengali : 100000
Pdf Book Download - Read Pdf Boo

> Here too the story is same Giant companies has the hold.
> But spectacularly we found that BOOKS_AND_REFERENCE category is
doing tremenduously well.
>
> The reason we considered to analyse BOOKS_AND_REFERENCE category in free_GoogleplayEnglish as Reference genre is doing well in free_AppleStoreEnglish as well. And since we found this genre has some potential to work well on the GooglePlay as well as on App Store, and as our aim is also to recommend an app that shows potential for being profitable on both the App Store and Google Play.
>
> So what we can suggest is a app in a Reference category which has a potential to do well on both platforms.
>
> One can create an app, name it 'Universal Talents', this app is all about people who have lots of potential but no way to showcase it. Many people on this earth likes many things and this app can be a platform to show their talents.
>
>There are many other things in this app left to be shared...
>
>This will indeed be a profitable app for the company :)
