# ANALYZING PROFITABLE APPLICATIONS IN THE APP-STORE AND GOOGLE STORE.

The main objective of this project is to analyze data about applications in both the Google Play Store and the Apple App Store, with the end goal of understanding which characteristics make one app more profitable than another.

The information gained from this analysis *might* allow developers in a company to know which apps are more likely to gain users and generate more revenue.

Two main data sources are used for this project:
1. Apple Store Data from [Ramanthan on Kaggle](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/home). 
> This data, collected in July 2017, contains information about Apple iOS mobile applications and consists of 7197 rows and 16 columns. 

2. Google Playstore Data from [Lavanya Gupta on Kaggle](https://www.kaggle.com/lava18/google-play-store-apps/home)
> This data, collected in August 2018, contains information about Google Playstore applications and consists of approximately 10,000 rows and 13 columns.

### Step 1: Open the Datasets

In [1]:
open_file1 = open('AppleStore.csv')
open_file2 = open('googleplaystore.csv')
open_file1
open_file2

<_io.TextIOWrapper name='googleplaystore.csv' mode='r' encoding='UTF-8'>

### Step 2: Store the Dataset in a List

In [2]:
from csv import reader
#APP STORE
read_file = reader(open_file1)
appstore_data = list(read_file)
appstore_header = appstore_data[0]   # first row holds the column names
appstore_data = appstore_data[1:]    # remaining rows hold the apps
#PLAY STORE
read_file = reader(open_file2)
playstore_data = list(read_file)
playstore_header = playstore_data[0]
playstore_data = playstore_data[1:]
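
The two cells above leave both files open. As a hedged alternative, the sketch below wraps opening, reading and splitting off the header in one helper that closes the file when done. The name `load_csv` and the small demo file are assumptions made for illustration; they are not part of the original notebook.

```python
import csv
import os
import tempfile

def load_csv(path, encoding="utf8"):
    """Return (header, rows) for the CSV file at `path`."""
    # newline="" is the setting the csv module documentation recommends
    # for files that are handed to csv.reader
    with open(path, encoding=encoding, newline="") as f:
        rows = list(csv.reader(f))
    return rows[0], rows[1:]

# Tiny made-up file, used only to demonstrate the helper
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", encoding="utf8", newline="") as f:
        f.write("App,Price\nDemo App,0\n")
    demo_header, demo_rows = load_csv(demo_path)

print(demo_header)
print(demo_rows)
```

With the real files, the call would look like `load_csv('AppleStore.csv')`.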

### Step 3: Explore the Dataset
1. Create a function for exploring data
2. Print the first 5 rows
3. Get the number of rows and columns of each data set
4. Print the column names of each Dataset

In [3]:

def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line between rows
        
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

In [4]:
print(appstore_header)
print('\n')
explore_data(appstore_data, 0, 3, True)


['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 7197
Number of columns: 16


In [5]:
print(playstore_header)
print('\n')
explore_data(playstore_data, 0, 3, True)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


In [6]:
print('Number of rows and columns for Appstore Data:')
print('Rows' + ': '+ str(len(appstore_data)))
print('Columns' + ': '+ str(len(appstore_data[0])))
print('\n')
print('Number of rows and columns for Playstore Data:')
print('Rows' + ': '+ str(len(playstore_data)))
print('Columns' + ': '+ str(len(playstore_data[0])))

Number of rows and columns for Appstore Data:
Rows: 7197
Columns: 16


Number of rows and columns for Playstore Data:
Rows: 10841
Columns: 13


In [7]:
print('Column Names for Appstore Data')
print(appstore_header)

Column Names for Appstore Data
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


In [8]:
print('Column Names for Playstore Data')
print(playstore_header)

Column Names for Playstore Data
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


### Step 4: Data Cleaning
Before analysis, the data must be accurate and correctly represented, which makes the analysis faster, cleaner and more reliable.

This step uses basic Python syntax to:
1. Detect inaccurate data, and correct or remove it
2. Detect duplicate data, and remove the duplicates

### 4a: Addressing an error in a row of the Google Play Store data set.

Fortunately, the discussion section of the Google Play Store dataset reports that the row at index 10472 is missing its 'Category' value.

This anomaly makes the row shorter than it should be and shifts every column after 'App' one index to the left.

To fix this issue, we first print the row.

In [9]:
print(playstore_data[10472])

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']
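
The row above was found through the dataset's discussion page. A minimal sketch of how such rows could also be detected programmatically is to compare each row's length with the header's length. The function name `find_malformed_rows` and the demo rows are hypothetical and only illustrate the idea.

```python
def find_malformed_rows(header, rows):
    """Return (index, row) pairs whose length differs from the header's."""
    expected = len(header)
    return [(i, row) for i, row in enumerate(rows) if len(row) != expected]

# Made-up rows for illustration; the second one lacks its 'Category' value
demo_header = ['App', 'Category', 'Rating']
demo_rows = [
    ['App A', 'TOOLS', '4.1'],
    ['App B', '4.5'],
    ['App C', 'GAME', '3.9'],
]
print(find_malformed_rows(demo_header, demo_rows))
```

Applied to this project, the call would be `find_malformed_rows(playstore_header, playstore_data)`.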


There are two ways to deal with this anomaly:
1. Delete the row
2. Update or modify the row.

We will use the first option, i.e. the row will be deleted.

In [10]:
print(len(playstore_data))
del playstore_data[10472]
print(len(playstore_data))

10841
10840


### 4b: Removing Duplicate Rows
Datasets often contain duplicate rows. To check whether any exist here, we print every row in the Play Store data set that has a given app name.

In [11]:
for app in playstore_data:
    name = app[0]
    if name == 'Instagram':
        print(app)

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']


From the output above, we can see that there are 4 rows with the name Instagram.

A quick look at the output shows that the rows are identical except for index 3, which is the 'Reviews' column.

The next step is to separate unique apps from duplicated apps in the playstore dataset. This will tell us the number of duplicate apps. 

In [12]:
duplicate_playstore_apps = []
unique_playstore_apps = []

for app in playstore_data:
    name = app[0]
    if name in unique_playstore_apps:
        duplicate_playstore_apps.append(name)
    else:
        unique_playstore_apps.append(name)
print('Number of duplicate apps:', len(duplicate_playstore_apps))
print('\n')

Number of duplicate apps: 1181




As seen above, there are 1181 duplicate rows in the playstore dataset. For each duplicated app name, only one row will be kept in the playstore dataset. 

Since duplicate rows, as in the 'Instagram' example above, have the same data except for the 'Reviews' column, we can deduce that the data was collected at different times: the higher the number of reviews, the more recent the row.

This tells us which duplicate rows to delete and which to keep.

In this case, only the row with the highest number of reviews in each set of duplicates will be kept.

#### Separate Highest Review Count
We create a dictionary that stores the name of each unique app and its number of reviews.
We then loop through the original playstore data. If the app name already exists in the dictionary and the current row has more reviews than the stored value, we update the stored value to the higher one.
If the app name does not exist, we create a new key-value pair for the app and its number of reviews.

At the end, we should have a dictionary that maps each unique app name to the highest number of reviews for that app.

In [13]:
reviews_max = {}
for app in playstore_data:
    name = app[0]
    number_of_reviews = float(app[3])
    if name in reviews_max and reviews_max[name] < number_of_reviews:
        reviews_max[name] = number_of_reviews
    elif name not in reviews_max:
        reviews_max[name] = number_of_reviews
print(len(reviews_max))

9659


Once we have the highest number of reviews for each app, we create two empty lists:
1. A list to store the clean, unique rows
2. A list to track which app names have already been stored

We loop through the original dataset and check whether the number of reviews in the current row equals the one in the dictionary (i.e. it is the highest) AND whether the app has not already been added to the clean data set. The second check is needed because duplicate rows can share the same number of reviews.

If both conditions are met, we append the row to the clean data list and the app name to the already-added list.

Lastly, we print the length of the clean data list to confirm that it matches that of the dictionary, which is 9659.

In [14]:
clean_playstore_data = []
already_added = []
for app in playstore_data:
    name = app[0]
    number_of_reviews = float(app[3])
    if (number_of_reviews == reviews_max[name]) and name not in already_added:
        clean_playstore_data.append(app)
        already_added.append(name)
print(len(clean_playstore_data))

9659
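
The two passes above can also be folded into one. The sketch below is an alternative, not the notebook's method: it keeps the most-reviewed row per app name directly in a dictionary, which avoids the repeated `name not in already_added` list scans. The function name `keep_most_reviewed` and the shortened demo rows are assumptions for illustration. Note that the result is ordered by the first appearance of each name, which may differ from the row order produced by the cells above.

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep one row per app name: the row with the highest review count."""
    best = {}  # app name -> best row seen so far
    for row in rows:
        name = row[name_index]
        reviews = float(row[reviews_index])
        # Only a strictly higher count replaces the stored row, so ties keep the first row
        if name not in best or reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())

# Shortened, made-up rows: name, category, rating, reviews
demo_rows = [
    ['Instagram', 'SOCIAL', '4.5', '66577313'],
    ['Instagram', 'SOCIAL', '4.5', '66577446'],
    ['Other App', 'TOOLS', '4.0', '10'],
    ['Instagram', 'SOCIAL', '4.5', '66577313'],
]
demo_clean = keep_most_reviewed(demo_rows)
print(demo_clean)
```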


To keep the data consistent and in a single list, we clear the playstore data list and then refill it with all the rows from the clean data set.

In [15]:
playstore_data.clear()
print(len(playstore_data))

0


In [16]:
for app in clean_playstore_data:
    playstore_data.append(app)
print(len(playstore_data))

9659


### Step 5: Removing Non-English Apps
As the company develops apps only for an English-speaking audience, the analysis will be limited to English apps.

Three main steps will be carried out in this process:
1. Flag app names that contain more than three characters outside the ASCII range. Characters commonly used in English text have code points between 0 and 127; up to three non-ASCII characters are allowed so that names containing an emoji or a symbol such as ™ are not dropped.
2. Filter out the non-English apps from the English apps.
3. Consolidate the original dataset.

In [17]:
def is_English(string):
    non_ascii = 0
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
    if non_ascii > 3:
        return False
    return True
print(is_English('Instagram'))
print(is_English('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_English('Docs To Go™ Free Office Suite'))
print(is_English('Instachat 😜'))

True
False
True
True
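
To make the threshold in `is_English` more concrete, the sketch below counts the non-ASCII characters in the same four test names. The helper `count_non_ascii` is a hypothetical name introduced only for this illustration.

```python
def count_non_ascii(text):
    """Count the characters whose code point lies outside the ASCII range 0-127."""
    return sum(1 for ch in text if ord(ch) > 127)

demo_names = [
    'Instagram',
    'Docs To Go™ Free Office Suite',
    'Instachat 😜',
    '爱奇艺PPS -《欢乐颂2》电视剧热播',
]
for name in demo_names:
    print(name, '->', count_non_ascii(name))
```

The names with a single symbol or emoji stay at or below the limit of three, while the name written mostly in Chinese characters exceeds it.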


In [18]:
english_Ios_Apps = []
english_Playstore_Apps = []
#IOS
for app in appstore_data:
    if is_English(app[1]): # index 1 is track_name (index 0 is the app id)
        english_Ios_Apps.append(app)
#PLAYSTORE
for app in playstore_data:
    if is_English(app[0]):
        english_Playstore_Apps.append(app)

    

In [19]:
print(len(english_Ios_Apps))
print(len(english_Playstore_Apps))

6183
9614


In [20]:
print(len(playstore_data))
print(len(appstore_data))
appstore_data.clear()
playstore_data.clear()
for app in english_Ios_Apps:
    appstore_data.append(app)
for app in english_Playstore_Apps:
    playstore_data.append(app)
print(len(playstore_data))
print(len(appstore_data))  

9659
7197
9614
6183


### Step 6: Isolate Free Apps 
The company develops only free apps, and its source of revenue is in-app ads.
Hence, we have to separate the free apps from the paid apps.

In [21]:
android_final = []
ios_final = []

for app in appstore_data:
    price = app[4]
    if price == '0.0':
        ios_final.append(app)
        
for app in playstore_data:
    price = app[7]
    if price == '0':
        android_final.append(app)
print(len(android_final))
print(len(ios_final))

8864
3222
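
The comparison above depends on the exact strings `'0.0'` and `'0'`. A slightly more defensive sketch converts the price to a number first. It assumes that paid Play Store prices may carry a leading `$` (for example `'$4.99'`); that format and the helper names `parse_price` and `is_free` are assumptions, not something shown in the output above.

```python
def parse_price(text):
    """Convert a price string such as '0', '0.0' or '$4.99' to a float."""
    cleaned = text.strip().lstrip('$')  # assumption: an optional leading dollar sign
    return float(cleaned)

def is_free(text):
    """Return True when the price string represents zero."""
    return parse_price(text) == 0.0

for demo_price in ['0', '0.0', '$4.99', '2.99']:
    print(demo_price, '->', is_free(demo_price))
```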


In [22]:
print(len(playstore_data))
print(len(appstore_data))
appstore_data.clear()
playstore_data.clear()
for app in ios_final:
    appstore_data.append(app)
for app in android_final:
    playstore_data.append(app)
print(len(playstore_data))
print(len(appstore_data)) 

9614
6183
8864
3222


Finally, the data is cleaned, accurate, free of duplicates and thus ready for analysis.

For the Appstore dataset, we are left with 3222 rows, and for the Playstore dataset, we are left with 8864 rows.

### Step 7: Most common apps by Genre (Part 1)
Since the revenue model of the apps (i.e. in-app ads) depends heavily on the number of users, our main focus is to determine which kinds of apps are most popular in both the App Store and Play Store markets. 

We start by analyzing the most common genres.

To do this, we need to build a frequency table for the prime_genre column of the App Store data and for the Genres and Category columns of the Play Store data.

In [24]:
def freq_table(dataset, index):
    column_table = {}
    for row in dataset:
        row_value_freq = row[index]
        if row_value_freq in column_table:
            column_table[row_value_freq] += 1
        else:
            column_table[row_value_freq] = 1
    for key in column_table:
        column_table[key] /= len(dataset)
        column_table[key] *= 100
    return column_table
freq_table(appstore_data,11)

{'Book': 0.4345127250155183,
 'Business': 0.5276225946617008,
 'Catalogs': 0.12414649286157665,
 'Education': 3.662321539416512,
 'Entertainment': 7.883302296710118,
 'Finance': 1.1173184357541899,
 'Food & Drink': 0.8069522036002483,
 'Games': 58.16263190564867,
 'Health & Fitness': 2.0173805090006205,
 'Lifestyle': 1.5828677839851024,
 'Medical': 0.186219739292365,
 'Music': 2.0484171322160147,
 'Navigation': 0.186219739292365,
 'News': 1.3345747982619491,
 'Photo & Video': 4.9658597144630665,
 'Productivity': 1.7380509000620732,
 'Reference': 0.5586592178770949,
 'Shopping': 2.60707635009311,
 'Social Networking': 3.2898820608317814,
 'Sports': 2.1415270018621975,
 'Travel': 1.2414649286157666,
 'Utilities': 2.5139664804469275,
 'Weather': 0.8690254500310366}

After creating a function that generates a frequency table, we create another function that displays the table in sorted order, by converting the dictionary into a list of tuples and sorting the tuples in descending order of frequency.

In [29]:
def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)

    table_sorted = sorted(table_display, reverse = True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])
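
As a point of comparison, the same percentage table can be sketched with `collections.Counter` from the standard library, whose `most_common()` method already returns entries sorted from most to least frequent. The function name `freq_table_counter` and the demo rows are hypothetical; this is an alternative sketch, not the approach used in the notebook.

```python
from collections import Counter

def freq_table_counter(dataset, index):
    """Return (value, percentage) pairs for one column, most frequent first."""
    counts = Counter(row[index] for row in dataset)
    total = len(dataset)
    return [(value, count / total * 100) for value, count in counts.most_common()]

# Made-up rows: name, genre
demo_rows = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
demo_table = freq_table_counter(demo_rows, 1)
for genre, percentage in demo_table:
    print(genre, ':', percentage)
```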


#### Display Frequency Table for Appstore's Genre Column

In [42]:
display_table(appstore_data, 11)  # prints the table; the function returns None

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


The top categories in the APPstore dataset are:
1. Games
2. Entertainment
3. Photo and Video
4. Education
5. Social Networking

We can say that most free English apps in the App Store dataset are designed for fun, as games make up over 50% of the data. This may suggest that many iOS users are gamers, although the share of apps in a genre does not by itself tell us how many users those apps have.

Entertainment, Photo & Video and Social Networking apps (such as streaming apps, YouTube and Instagram) also take up a sizeable portion of the App Store.

At the very bottom, we have genres such as Medical, Navigation, Book, Business and Catalogs.

The general impression from this short analysis of free English apps by genre in the App Store dataset is that more fun apps than practical apps are developed for this platform. This could mean that iOS devices are used more for socializing and entertainment than for practical purposes.

#### Display Frequency Table for Playstore's Category Column

In [38]:
display_table(playstore_data, 1)  # prints the table; the function returns None

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

The top categories in the playstore dataset are:
1. Family
2. Games
3. Tools 
4. Business
5. Lifestyle

#### Display Frequency Table for Playstore's Genre Column

In [36]:
display_table(playstore_data, 9)  # prints the table; the function returns None

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Finance : 3.7003610108303246
Medical : 3.531137184115524
Sports : 3.463447653429603
Personalization : 3.3167870036101084
Communication : 3.2378158844765346
Action : 3.1024368231046933
Health & Fitness : 3.0798736462093865
Photography : 2.944494584837545
News & Magazines : 2.7978339350180503
Social : 2.6624548736462095
Travel & Local : 2.3240072202166067
Shopping : 2.2450361010830324
Books & Reference : 2.1435018050541514
Simulation : 2.0419675090252705
Dating : 1.861462093862816
Arcade : 1.8501805054151623
Video Players & Editors : 1.7712093862815883
Casual : 1.7599277978339352
Maps & Navigation : 1.3989169675090252
Food & Drink : 1.2409747292418771
Puzzle : 1.128158844765343
Racing : 0.9927797833935018
Role Playing : 0.9363718411552346
Libraries & Demo : 0.9363718411552346
Auto & Vehicles : 0.9250902527075

The Category and Genres columns in the playstore dataset differ in that the Genres column is much more granular, whereas the Category column is broader.

However, we can see some overlap among the most common entries in both columns, including tools, business, productivity and games.

From this we can deduce that more practical applications are being developed for the Playstore market. However, games also make up a notable share of the dataset, at about 10%. 

The Playstore market seems to have a more balanced mix of practical apps and social/fun apps on its platform.

#### Analyzing Most Common Apps by Genre.

Having seen which kinds of applications are most common in the datasets, we need to know which genres have the highest number of installs by users.

The Google dataset has an 'Installs' column from which we can get this number. The iOS dataset has no such column, so we use the total number of ratings (rating_count_tot) as a proxy.

We do this by storing the frequency table in a variable and looping through its keys, i.e. the genres.

For each genre, a nested loop over the whole dataset compares the genre of each app with the current genre from the frequency table. When they match, we extract the number of ratings of the app and add it to the running total for that genre. 

Lastly, the average number of ratings per app in the genre is calculated.

In [44]:
appstore_genres = freq_table(appstore_data, 11)

for genre in appstore_genres:
    total  = 0
    len_genre = 0
    for app in appstore_data:
        genre_app = app[11]
        if genre_app == genre:
            rating = float(app[5])
            total += rating
            len_genre += 1
    average_rating = total / len_genre
    print(genre, ': ', average_rating)


Travel :  28243.8
Productivity :  21028.410714285714
Business :  7491.117647058823
Book :  39758.5
Utilities :  18684.456790123455
Medical :  612.0
Sports :  23008.898550724636
Education :  7003.983050847458
Food & Drink :  33333.92307692308
Entertainment :  14029.830708661417
News :  21248.023255813954
Navigation :  86090.33333333333
Shopping :  26919.690476190477
Social Networking :  71548.34905660378
Weather :  52279.892857142855
Music :  57326.530303030304
Photo & Video :  28441.54375
Catalogs :  4004.0
Finance :  31467.944444444445
Games :  22788.6696905016
Health & Fitness :  23298.015384615384
Lifestyle :  16485.764705882353
Reference :  74942.11111111111
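
The nested loop above scans the whole dataset once per genre. A hedged alternative is to accumulate totals and counts in a single pass and then sort the averages, which also makes the ranking easier to read. The function name `average_by_key` and the demo rows are assumptions made for this sketch.

```python
from collections import defaultdict

def average_by_key(rows, key_index, value_index):
    """Return (key, average) pairs sorted from highest to lowest average."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += float(row[value_index])
        counts[row[key_index]] += 1
    averages = {key: totals[key] / counts[key] for key in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

# Made-up rows: genre, number of ratings
demo_rows = [['Games', '100'], ['Games', '300'], ['Music', '500']]
demo_averages = average_by_key(demo_rows, 0, 1)
for genre, average in demo_averages:
    print(genre, ':', average)
```

For the App Store data, the call would be `average_by_key(appstore_data, 11, 5)`.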


### Recommendation for Appstore Market Only
The genres with the highest average number of ratings, in descending order, are:
1. Navigation
2. Reference
3. Social Networking
4. Music
5. Weather

Combining this finding with the previous analysis of which genres are most common in the App Store, the best recommendation for an application to develop for the iOS store is 'Social Networking', as it appears near the top of both lists.

In [45]:
playstore_categories = freq_table(playstore_data,1)

for category in playstore_categories:
    total_install = 0
    len_category = 0
    for app in playstore_data:
        category_app = app[1]
        if category_app == category:
            num_install = app[5]
            num_install = num_install.replace('+', '')
            num_install = num_install.replace(',', '')
            total_install += float(num_install)
            len_category += 1
    average_install = total_install / len_category
    print(category, ': ', average_install)

PRODUCTIVITY :  16787331.344927534
MEDICAL :  120550.61980830671
SHOPPING :  7036877.311557789
MAPS_AND_NAVIGATION :  4056941.7741935486
TOOLS :  10801391.298666667
FAMILY :  3695641.8198090694
PHOTOGRAPHY :  17840110.40229885
SPORTS :  3638640.1428571427
EVENTS :  253542.22222222222
GAME :  15588015.603248259
BEAUTY :  513151.88679245283
BUSINESS :  1712290.1474201474
COMMUNICATION :  38456119.167247385
VIDEO_PLAYERS :  24727872.452830188
WEATHER :  5074486.197183099
LIBRARIES_AND_DEMO :  638503.734939759
AUTO_AND_VEHICLES :  647317.8170731707
ENTERTAINMENT :  11640705.88235294
LIFESTYLE :  1437816.2687861272
BOOKS_AND_REFERENCE :  8767811.894736841
ART_AND_DESIGN :  1986335.0877192982
PERSONALIZATION :  5201482.6122448975
PARENTING :  542603.6206896552
FINANCE :  1387692.475609756
EDUCATION :  1833495.145631068
HEALTH_AND_FITNESS :  4188821.9853479853
COMICS :  817657.2727272727
HOUSE_AND_HOME :  1331540.5616438356
FOOD_AND_DRINK :  1924897.7363636363
NEWS_AND_MAGAZINES :  9549178.46
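
The clean-up of the 'Installs' values inside the loop above can be isolated in a small helper, sketched below. The name `parse_installs` is hypothetical. Because the values are open-ended buckets such as '10,000+', the resulting averages are best read as lower-bound estimates rather than exact install counts.

```python
def parse_installs(text):
    """Convert an install bucket such as '1,000,000+' to a float."""
    return float(text.replace('+', '').replace(',', ''))

for demo_value in ['10,000+', '500,000+', '1,000,000,000+']:
    print(demo_value, '->', parse_installs(demo_value))
```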

### Recommendation for PlayStore Market Only.
The numbers for the playstore dataset are higher than those for the appstore dataset, because they count installs rather than ratings. To apply the same criteria used for the appstore recommendation, we take the most common categories obtained earlier, which are:
1. Family
2. Games
3. Tools 
4. Business
5. Lifestyle

And contrast them with the categories that have the highest average installs:
1. COMMUNICATION : 38456119.167247385
2. VIDEO_PLAYERS : 24727872.452830188
3. SOCIAL : 23253652.127118643
4. PHOTOGRAPHY : 17840110.40229885
5. PRODUCTIVITY : 16787331.344927534
6. GAME : 15588015.603248259


As we can see, productivity apps are not only common in the playstore market but also have a high number of installs. Games are also a major force in the Playstore, being both common and widely installed. 

Communication apps are not among the most common in the playstore, yet they have the highest average number of installs. Communication apps, which include the likes of WhatsApp, Skype and Hangouts, are often lumped together with social media apps like Facebook, Twitter and Snapchat. 

Although this dataset makes a distinction between the two, social media apps still have one of the highest average numbers of installs in the playstore.

Thus the recommendations for the playstore market are productivity tools, apps that encompass both social media and communication, and games.

### Recommendation For Both AppStore & Playstore Markets.

There seems to be an overlap between what is popular and has a high number of users in each market, and thus what would be suitable for both markets.

That overlap is **Social and Games**. 

Social Networking has one of the highest average numbers of ratings in the Appstore market, and Games is the most common genre there, accounting for over half of the free English apps.

In comparison, Communication and Social have some of the highest average installs in the playstore market, and Games is the second most common category in the playstore.

In conclusion, Social and Games look like the best fit for a free English app for both the Play Store and App Store markets.

&copy; All Rights Reserved - Khadijah Lawal Shuaib 2019.