# Profitable App Profiles for the App Store and Google Play Markets

Our goal for this project is to analyze app data to help our developers understand what types of apps are likely to attract more users. 

At our company we only make free apps, so our revenue comes from in-app ads and grows with the number of users who interact with them. By determining which apps generate the most traffic, we also determine which apps are likely to be the most profitable.

## Opening app data files and creating a list of lists for containing each dataset 

Collecting data on thousands or millions of apps ourselves would require a lot of time and resources, so we will analyze samples of data that are already available. There are two data sets, available at no cost, that are suitable for the purposes of this project:

- [A data set](https://www.kaggle.com/lava18/google-play-store-apps) containing data about apps from Google Play can be downloaded using this [link](https://dq-content.s3.amazonaws.com/350/googleplaystore.csv).
- [A data set](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps) containing data about iOS apps from the Apple App Store can be downloaded using this [link](https://dq-content.s3.amazonaws.com/350/AppleStore.csv).

In [1]:
# opening sample data sets
from csv import reader #importing reader

opened_file = open('googleplaystore.csv') # opening file
read_file = reader(opened_file) # reading file with reader
google_apps = list(read_file) # creating a list of lists
google_header = google_apps[0] # isolating header row
google_data = google_apps[1:] # isolating rows containing data

# Same as above but for Apple App Store data set
opened_file = open('AppleStore.csv')
read_file = reader(opened_file)
apple = list(read_file)
apple_header = apple[0]
apple_data = apple[1:]
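
The two loading steps above repeat the same pattern, so they could be wrapped in a small helper. The sketch below is one possible way to do that, not the method used in this notebook: `load_csv` is a hypothetical name, and `encoding='utf8'` is an assumption about how the files are encoded (the cell above relies on the platform default). The `with` block closes the file automatically.

```python
from csv import reader

def load_csv(path, encoding='utf8'):
    """Return (header, rows) for the CSV file at `path` (hypothetical helper)."""
    # newline='' is what the csv module documentation recommends for reader()
    with open(path, encoding=encoding, newline='') as opened_file:
        rows = list(reader(opened_file))
    # first row is the header, the rest are data rows
    return rows[0], rows[1:]
```

With this helper, the Google Play data could be loaded as `google_header, google_data = load_csv('googleplaystore.csv')`, and the Apple App Store data in the same way.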


## Create `explore_data()` function to repeatedly explore rows

The function can also print the number of rows and columns in the data set.

The header and first few rows of each data set are printed as examples. Columns relevant to the analysis have been isolated and printed for reference.

**Not all of the column names in the Apple App Store data set are clear. Details and documentation for each column can be found [here](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/home)**

In [2]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end] # slices dataset according to inputs    
    for row in dataset_slice: # loops through data slice
        print(row) # prints each row
        print('\n') #prints space between each row

    if rows_and_columns: # executed when 'True'
        print('Number of rows:', len(dataset)) # displays number of rows in data set
        print('Number of columns:', len(dataset[0])) # displays number of columns in data set

# exploring Google Play data set        
print(google_header)
print('\n')
explore_data(google_data, 0, 3, True)

# Relevant Columns to our analysis
print('\n')
print('Google Play relevant columns:')
print(google_header[0])
print(google_header[1])
print(google_header[3])
print(google_header[5])
print(google_header[6])
print(google_header[7])
print(google_header[9])

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


Google Play relevant columns:
App
Category
Reviews
Installs
Type
Price
Genres


In [3]:
# exploring Apple Store data set
print(apple_header)
print('\n')
explore_data(apple_data, 0, 3, True)

# Relevant Columns to our analysis
print('\n')
print('Apple Store relevant columns:')
print(apple_header[1])
print(apple_header[3])
print(apple_header[4])
print(apple_header[5])
print(apple_header[6])
print(apple_header[11])

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 'Games', '38', '5', '18', '1']


Number of rows: 7197
Number of columns: 16


Apple Store relevant columns:
track_name
currency
price
rating_count_tot
rating_count_ver
prime_genre


## Data Cleaning

### Delete Data Containing Errors

The Google Play data set has a dedicated [discussion section](https://www.kaggle.com/lava18/google-play-store-apps/discussion), and in [one of the discussions](https://www.kaggle.com/lava18/google-play-store-apps/discussion/66015) an error for row 10472 is described. 

We will start by printing the incorrect row and comparing it to a couple of rows that we know do not contain errors. If there is truly an error in the row then it will be removed.

**As per the Apple App Store [discussion section](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/discussion), there do not seem to be any errors reported in the iOS app dataset**

In [4]:
print(google_data[10472]) # row containing error
print('\n')
print(google_header) # header row
print('\n')
print(google_data[0]) # row known to be correct

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


We can see that row 10472 refers to an app named *Life Made WI-Fi Touchscreen Photo Frame*. The row has only 12 values while the header has 13: the *Category* value is missing, so every later value is shifted one column to the left. As a result the *Rating* column holds 19, but the maximum rating should be 5 as per the [discussions section](https://www.kaggle.com/lava18/google-play-store-apps/discussion/66015) about the Google Play dataset. Since the reported error is confirmed, we will delete this row.
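
A shifted row like this one can also be found programmatically by comparing each row's length with the header's length. The sketch below illustrates the idea on made-up toy data; `find_malformed_rows` is a hypothetical helper, not part of this notebook.

```python
def find_malformed_rows(header, rows):
    """Return (index, row_length) for each row whose length differs from the header's."""
    expected = len(header)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]

# Toy data (made up): the second row is missing its Category value,
# so it is one item shorter than the header
toy_header = ['App', 'Category', 'Rating']
toy_rows = [
    ['Photo Editor', 'ART_AND_DESIGN', '4.1'],
    ['Photo Frame', '1.9'],
]
print(find_malformed_rows(toy_header, toy_rows))  # [(1, 2)]
```

Called as `find_malformed_rows(google_header, google_data)` before the deletion, it should report index 10472 with a length of 12.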

In [5]:
print(len(google_data)) # number of rows before deletion
del google_data[10472] # only run deletion ONCE
print(len(google_data)) # number of rows after deletion

10841
10840


### Remove Duplicate Data

#### Part One - Exploration

If we explore the [discussion section](https://www.kaggle.com/lava18/google-play-store-apps/discussion) for Google Play further, we can see that there are questions being asked about duplicate data entries.

Let's look at the app *Instagram* as an example as it was reported to have four entries.

**As per the Apple App Store [discussion section](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/discussion), there do not seem to be any duplicates reported in the iOS app dataset**

In [6]:
for app in google_data: # loop Google Play data set
    name = app[0] # sets name
    if name == 'Instagram': 
        print(app) # prints row if name matches

['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']


There are a total of 1,181 cases where an app occurs more than once in this data set:

In [7]:
duplicate_apps = [] # empty list for apps with duplicates
unique_apps = [] # empty list for all apps excluding duplicates

for app in google_data: # loop Google Play data set
    name = app[0] # sets name
    if name in unique_apps: 
        duplicate_apps.append(name) # adds app to duplicate list if already found
    else:
        unique_apps.append(name) # adds app to unique list if not
        
print('Number of duplicate apps:', len(duplicate_apps)) # displays apps with duplicates
print('\n')
print('Examples of duplicate apps:', duplicate_apps[:15]) # displays the first 15 examples in the duplicates list

Number of duplicate apps: 1181


Examples of duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack', 'FreshBooks Classic', 'Insightly CRM', 'QuickBooks Accounting: Invoicing & Expenses', 'HipChat - Chat Built for Teams', 'Xero Accounting Software']


We want to remove duplicate entries in the data set and keep only one entry per app. This ensures that our data analysis is accurate and we aren't counting apps more than once. We could remove the duplicate rows randomly, but it is better to be deliberate in the approach.

If you examine the rows we printed for the *Instagram* app, there are discrepancies in the fourth position of each row, which corresponds to the number of reviews. The different numbers show the data was collected at different times, and we can use this information to build a criterion for removing the duplicates. The more recent data should have a higher number of reviews, so rather than removing duplicates randomly, we'll keep only the row with the highest number of reviews for each app and remove its other entries.

To complete this task we will:
- Create a dictionary using the unique app name as a key, with the value being the highest number of reviews of that app
- Create a new dataset using the dictionary, which will only have one entry per app (only selecting the entries for apps with the highest number of reviews)

#### Part Two - Finalizing

First we will create the dictionary that we mentioned above. 

To confirm that we correctly established the dictionary we will check the length of the one we create and compare it to the expected length. The expected length should be the number of duplicate entries (1181) subtracted from the length of the Google Play data set since we want our dictionary to include only unique apps.

In [8]:
reviews_max = {} # creating empty dictionary

for app in google_data: # loop Google Play data set
    name = app[0] # sets name
    n_reviews = float(app[3]) # sets reviews for that app as a float
    
    if name in reviews_max and reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews # sets new value to updated number of reviews if there's a duplicate
        
    elif name not in reviews_max:
        reviews_max[name] = n_reviews # sets new value of key if not
        
print('Expected length of dictionary:', len(google_data) - 1181) # displays calculated expected dictionary length
print('Actual length of dictionary:', len(reviews_max)) # prints actual length of dictionary created

Expected length of dictionary: 9659
Actual length of dictionary: 9659


Based on the length of our dictionary vs. the expected length, it looks like we created it correctly. We can now use it to remove the duplicate rows that don't have the highest number of reviews for an app.

Below we will:
- Start by initializing two empty lists, `google_clean` and `already_added`.
- Loop through the Google Play data set, and for every iteration:
    - Isolate the name of the app and the number of reviews.
    - Add the current row (`app`) to the `google_clean` list, and the app name (`name`) to the `already_added` list if:
        - The number of reviews of the current app matches the number of reviews of that app as described in the `reviews_max` dictionary; and
        - The name of the app is not already in the `already_added` list. We need this supplementary condition to account for cases where the highest number of reviews of a duplicate app is the same for more than one entry (for example, the Box app has three entries, and the number of reviews is the same). If we just check for `reviews_max[name] == n_reviews`, we'll still end up with duplicate entries for some apps.

We will use the `explore_data()` function to confirm that the number of rows in our cleaned data list is the same as the length of the dictionary we created above (9659).

In [9]:
google_clean = [] # will store cleaned data
already_added = [] # will just store app names

for app in google_data: # loop Google Play data set
    name = app[0] # sets name
    n_reviews = float(app[3]) # sets number of reviews as a float
    
    if (reviews_max[name] == n_reviews) and (name not in already_added):
        google_clean.append(app) # adds row to cleaned list 
        already_added.append(name) # accounts for duplicates with same number of reviews
        
explore_data(google_clean, 0, 3, True) # displays examples

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9659
Number of columns: 13
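
The two-step approach above (a dictionary of maximum review counts, then a filtering loop) could also be collapsed into a single pass that stores the best row per app directly. The following is only a sketch on toy data: `keep_most_reviewed` is a hypothetical helper, the Instagram review counts are taken from the rows printed earlier, and the Box values are made up.

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep one row per app name: the first row seen with the highest review count."""
    best = {}  # app name -> row with the highest review count seen so far
    for row in rows:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        # strict '>' keeps the earliest row when review counts tie
        if name not in best or n_reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())

toy_rows = [
    ['Instagram', 'SOCIAL', '4.5', '66577313'],
    ['Box', 'BUSINESS', '4.2', '100'],
    ['Instagram', 'SOCIAL', '4.5', '66577446'],
    ['Box', 'BUSINESS', '4.2', '100'],
]
for row in keep_most_reviewed(toy_rows):
    print(row)
```

This keeps the same rows as the two-step version, but the order can differ: the result follows the first appearance of each app name rather than the position of the kept row.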


### Removing Non-English Apps

#### Part One - Exploring & Experimenting

At our company we develop apps in English only, so we'd like to analyze only English apps. If we explore the data, we can see that both the Google Play and Apple App Store datasets have apps with names that suggest they are not in English. A couple of examples are printed below for reference:


In [10]:
# Examples
print(apple_data[813][1])
print(apple_data[6731][1])
print('\n')
print(google_clean[4412][0])
print(google_clean[7940][0])

爱奇艺PPS -《欢乐颂2》电视剧热播
【脱出ゲーム】絶対に最後までプレイしないで 〜謎解き＆ブロックパズル〜


中国語 AQリスニング
لعبة تقدر تربح DZ


We will delete the rows for the non-English apps for the purposes of our analysis. One approach is to remove each app whose name contains a character that is not common in English text.

The characters commonly used in English text (letters, digits, punctuation) all have code numbers in the range 0 to 127, as per ASCII (American Standard Code for Information Interchange).

We will build a function that checks an app's name and returns `False` as soon as it finds a character outside that range (0 to 127).

See the created function below:

In [11]:
def english_app(string): # creating function
    
    for letter in string: # loops through a string
        if ord(letter) > 127: # ord() returns the code point number of a character
            return False
    return True

# Examples
print(english_app('Instagram'))
print(english_app('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print('\n')
print(english_app('Docs To Go™ Free Office Suite'))
print(english_app('Instachat 😜'))
print('\n')
print(ord('™'))
print(ord('😜'))

True
False


False
False


8482
128540


The function works as written. However, *Instachat 😜* and *Docs To Go™ Free Office Suite* appear to be English apps, yet both resulted in a `False` output. This is because emojis and special characters such as ™ fall outside the ASCII range of 0 to 127 (the code point numbers of the two example characters are printed above).

We will need to edit the function so that we do not delete rows that we are interested in from the data sets.

#### Part Two - Finalizing

Since we do not want to remove useful data, we will only remove an app if its name has more than three characters whose code numbers fall outside the ASCII range 0 to 127. All English apps with up to three emojis or special characters will still be included. This rule is not perfect, but it should function well for our purposes.

In [12]:
def english_app(string): # creating finalized function
    non_english = 0 # will count characters outside the ASCII range
    
    for letter in string: # loops through a string
        if ord(letter) > 127:
            non_english += 1 # add to count for each non-ASCII character found
    if non_english > 3: # names with more than three non-ASCII characters
        return False
    else:
        return True

# Examples
print(english_app('Docs To Go™ Free Office Suite'))
print(english_app('Instachat 😜'))
print(english_app('爱奇艺PPS -《欢乐颂2》电视剧热播'))

True
True
False


It is possible that some English apps will still be rejected by the `english_app()` function, but it is good enough at this point to move on to filtering out non-English apps from both data sets. The number of rows of both data sets will be reprinted to show the change after extracting the English apps.
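
To make the limits of this rule concrete, the sketch below restates it with the threshold as a parameter and runs it on two hypothetical app names. `is_probably_english` and both names are made up for illustration; the logic is the same "at most three non-ASCII characters" check used above.

```python
def is_probably_english(name, max_non_ascii=3):
    """Return True if `name` has at most `max_non_ascii` characters outside ASCII (0-127)."""
    non_ascii = sum(1 for character in name if ord(character) > 127)
    return non_ascii <= max_non_ascii

# An English name with four emoji is rejected (a false negative)
print(is_probably_english('Fun Quiz 😜😜😜😜'))  # False
# A short non-English name with only three non-ASCII characters is accepted (a false positive)
print(is_probably_english('中国語'))  # True
```

Both kinds of mistakes are possible, which is why the filter should be treated as a rough heuristic rather than a language detector.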

In [13]:
google_english = [] # list containing english apps from Google Play
ios_english = [] # list containing english apps from Apple App Store

for app in google_clean: # loop through cleaned Google Play data set
    name = app[0] # sets name
    if english_app(name): 
        google_english.append(app) # add to list if function evaluates as 'True'
        
for app in apple_data: # loop through Apple App Store data set
    name = app[1] # sets name
    if english_app(name):
        ios_english.append(app) # add to list if function evaluates as 'True'
        
# Examples 
explore_data(google_english, 0, 3, True) 
print('\n')
explore_data(ios_english, 0, 3, True)


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


Number of rows: 9614
Number of columns: 13


['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']


['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']


['529479190', 'Clash of Clans', '116476928', 'USD', '0.0', '2130805', '579', '4.5', '4.5', '9.24.12', '9+', 

### Isolating the Free Apps

Referring to the introduction of this project, we are only interested in free apps. Currently our data sets still contain both free and paid apps. We will need to build a final list to isolate the free apps. We will print the number of rows again to show the change.

This is the last data cleaning step for this project; afterwards we will move on to the analysis.

In [14]:
google_final = [] # finalized list for Google Play data set
ios_final = [] # finalized list for Apple App Store data set

for app in google_english: # loops english Google Play data set
    price = app[7] # sets price
    if price == '0': # according to data
        google_final.append(app) # add to final list
        
for app in ios_english: # loops english Apple App Store data set
    price = app[4] # sets price
    if price == '0.0': # according to data
        ios_final.append(app) # add to final list

# Confirm lengths of final lists of the data sets
print(len(google_final))
print(len(ios_final))

8864
3222
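
The filter above compares price strings exactly (`'0'` for Google Play and `'0.0'` for the App Store), so it depends on how each data set happens to write a zero price. A slightly more tolerant check could parse the number instead. This is a sketch under one assumption that is not shown in this notebook: that paid Google Play prices are written with a leading dollar sign, such as `'$4.99'`. `is_free` is a hypothetical helper.

```python
def is_free(price_text):
    """Return True when a price string such as '0', '0.0' or '$4.99' represents zero."""
    # assumption: the only decoration around the number is whitespace or a leading '$'
    cleaned = price_text.strip().lstrip('$')
    return float(cleaned) == 0.0

print(is_free('0'))      # True
print(is_free('0.0'))    # True
print(is_free('$4.99'))  # False
```

With such a helper the same condition, `if is_free(price):`, could be used in both loops.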


## Data Analysis

### Identify Most Common Apps by Genre

#### Part One - Restating Goals and Establishing Strategy

Referencing the introduction, our goal in this project is to figure out the kinds of apps that are going to create the most user volume. We need to find what apps attract the most users since that translates to in-app ad revenue.

To minimize risks and overhead, our validation strategy for an app idea is comprised of three steps:

   1. Build a minimal Android version of the app, and add it to Google Play.
   2. If the app has a good response from users, we develop it further.
   3. If the app is profitable after six months, we build an iOS version of the app and add it to the App Store.
   
In the end our company would want to add the app being developed to Google Play and the Apple App Store. We need to figure out the kinds of apps that are popular on both markets. Analyzing app profiles that work well in both markets will help us figure out what could be incorporated into the app we are developing to ensure a successful launch.

We can begin by figuring out the most common genres for both markets. To do this we will need to build frequency tables for a few columns in our data sets.

#### Part Two - Creating Functions to Generate Frequency Tables

We will build two functions that we can use to analyze the frequency tables:

   - One function will generate frequency tables that show percentages
   - The other function will be used to display the percentages in a descending order

In [15]:
def freq_table(dataset, index): # dataset is a list of lists
    table = {} # creating dictionary counting number of times a key is in a data set
    table_count = 0 # counting total rows evaluated
    
    for row in dataset: #loop through data set
        table_count += 1 # add to total count
        value = row[index] # set value to targeted column in data set
        if value in table: 
            table[value] += 1 # if value is found add to value
        else:
            table[value] = 1 # if value not found establish value
    
    table_percentages = {} # create dictionary to display key percentages
    for key in table: # loop through table dictionary
        percentage = (table[key] / table_count) * 100 # calculates key count as percentage of total count
        table_percentages[key] = percentage # establishes new key values as the calculated percentage

    return table_percentages



def display_table(dataset, index): #data set is a list of lists
    table = freq_table(dataset, index) # new freq. table created
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple) # creating list of tuples

    table_sorted = sorted(table_display, reverse = True) # puts list of tuples in descending order
    for entry in table_sorted:
        print(entry[1], ':', entry[0]) # prints entries
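
For comparison, the same percentage table can be built with `collections.Counter` from the standard library. This is an alternative sketch, not the implementation used in this notebook; `freq_table_counter` is a hypothetical name and the rows are toy data.

```python
from collections import Counter

def freq_table_counter(dataset, index):
    """Return {value: percentage} for column `index`, like freq_table() above."""
    counts = Counter(row[index] for row in dataset)
    total = sum(counts.values())
    return {value: count / total * 100 for value, count in counts.items()}

toy_rows = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
table = freq_table_counter(toy_rows, 1)
# Sort by percentage, highest first
for genre, percentage in sorted(table.items(), key=lambda item: item[1], reverse=True):
    print(genre, ':', percentage)
# Games : 75.0
# Music : 25.0
```

Sorting with a `key` function avoids building the intermediate list of `(percentage, key)` tuples. One difference: when two percentages are equal, `display_table()` orders them by name in reverse, while this version keeps their original order.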

#### Part Three - Creating and Examining Frequency Tables for App Genres

First we will generate and examine the frequency table for the *prime_genre* column of the Apple App Store data set.

In [16]:
display_table(ios_final, 11) # Freq. for Prime Genre column

Games : 58.16263190564867
Entertainment : 7.883302296710118
Photo & Video : 4.9658597144630665
Education : 3.662321539416512
Social Networking : 3.2898820608317814
Shopping : 2.60707635009311
Utilities : 2.5139664804469275
Sports : 2.1415270018621975
Music : 2.0484171322160147
Health & Fitness : 2.0173805090006205
Productivity : 1.7380509000620732
Lifestyle : 1.5828677839851024
News : 1.3345747982619491
Travel : 1.2414649286157666
Finance : 1.1173184357541899
Weather : 0.8690254500310366
Food & Drink : 0.8069522036002483
Reference : 0.5586592178770949
Business : 0.5276225946617008
Book : 0.4345127250155183
Navigation : 0.186219739292365
Medical : 0.186219739292365
Catalogs : 0.12414649286157665


Among the free English apps, the top five genres in the *prime_genre* column of the Apple App Store data set are:
  1. Games - 58.1626%
  2. Entertainment - 7.8833%
  3. Photo & Video - 4.9659%
  4. Education - 3.6623%
  5. Social Networking - 3.2899%
   
It appears that the Apple App Store is dominated by apps that are for casual use or pleasure. Apps that are created for practical or informational reasons seem to be rarer. Even so, this does not necessarily mean that the genres with the most apps are the most popular or most used. In fact, since there is so much competition in the top genres, it may be tough to release a successful app in them without a deep understanding of how to succeed.

Next we will generate and examine the frequency table for the *Genres* and *Category* columns of the Google Play data set. 

In [17]:
display_table(google_final, 1) # Freq. for Category column

FAMILY : 18.907942238267147
GAME : 9.724729241877256
TOOLS : 8.461191335740072
BUSINESS : 4.591606498194946
LIFESTYLE : 3.9034296028880866
PRODUCTIVITY : 3.892148014440433
FINANCE : 3.7003610108303246
MEDICAL : 3.531137184115524
SPORTS : 3.395758122743682
PERSONALIZATION : 3.3167870036101084
COMMUNICATION : 3.2378158844765346
HEALTH_AND_FITNESS : 3.0798736462093865
PHOTOGRAPHY : 2.944494584837545
NEWS_AND_MAGAZINES : 2.7978339350180503
SOCIAL : 2.6624548736462095
TRAVEL_AND_LOCAL : 2.33528880866426
SHOPPING : 2.2450361010830324
BOOKS_AND_REFERENCE : 2.1435018050541514
DATING : 1.861462093862816
VIDEO_PLAYERS : 1.7937725631768955
MAPS_AND_NAVIGATION : 1.3989169675090252
FOOD_AND_DRINK : 1.2409747292418771
EDUCATION : 1.1620036101083033
ENTERTAINMENT : 0.9589350180505415
LIBRARIES_AND_DEMO : 0.9363718411552346
AUTO_AND_VEHICLES : 0.9250902527075812
HOUSE_AND_HOME : 0.8235559566787004
WEATHER : 0.8009927797833934
EVENTS : 0.7107400722021661
PARENTING : 0.6543321299638989
ART_AND_DESIGN : 

Among the free English apps, the top five of the *Category* column counted in the Google Play data set are:
  1. Family - 18.9079%
  2. Game - 9.7247%
  3. Tools - 8.4612%
  4. Business - 4.5916%
  5. Lifestyle - 3.9034%
  
Looking at the Google Play data set, we can see that while apps for casual use or pleasure still lead, practical apps have higher percentages in Google Play than in the Apple App Store data set. The drop-off from the top *Category* to the next is also not nearly as steep as in the Apple App Store data set, which suggests that apps are spread more evenly across each *Category*.

In [18]:
display_table(google_final, 9) # Freq. Table for Genres column

Tools : 8.449909747292418
Entertainment : 6.069494584837545
Education : 5.347472924187725
Business : 4.591606498194946
Productivity : 3.892148014440433
Lifestyle : 3.892148014440433
Finance : 3.7003610108303246
Medical : 3.531137184115524
Sports : 3.463447653429603
Personalization : 3.3167870036101084
Communication : 3.2378158844765346
Action : 3.1024368231046933
Health & Fitness : 3.0798736462093865
Photography : 2.944494584837545
News & Magazines : 2.7978339350180503
Social : 2.6624548736462095
Travel & Local : 2.3240072202166067
Shopping : 2.2450361010830324
Books & Reference : 2.1435018050541514
Simulation : 2.0419675090252705
Dating : 1.861462093862816
Arcade : 1.8501805054151623
Video Players & Editors : 1.7712093862815883
Casual : 1.7599277978339352
Maps & Navigation : 1.3989169675090252
Food & Drink : 1.2409747292418771
Puzzle : 1.128158844765343
Racing : 0.9927797833935018
Role Playing : 0.9363718411552346
Libraries & Demo : 0.9363718411552346
Auto & Vehicles : 0.9250902527075

Among the free English apps, the top five of the *Genres* column counted in the Google Play data set are:
  1. Tools - 8.4499%
  2. Entertainment - 6.0695%
  3. Education - 5.3475%
  4. Business - 4.5916%
  5. Productivity - 3.8921%
  
Looking at the top *Genres* in the Google Play data set, we can confirm our conclusion above: apps are spread across many kinds. The percentage differences among *Genres* are much smaller than in the previous two frequency tables, which confirms that the kinds of apps in Google Play are more evenly distributed. Only one of the top five *Genres*, *Entertainment*, is related to casual use or pleasure.


The *Genres* column appears to be much more granular than the *Category* column, which gives us a better idea of the true distribution of app types in the Google Play data set. However, since we are looking at the bigger picture right now, we will use the *Category* frequency table moving forward.

So far we have found that the Apple App Store is dominated by apps for casual use or pleasure, whereas Google Play has more distribution in the kinds of apps offered. Now we will move on to figuring out which apps have the most users.

### Most Popular Apps by Genre on the Apple App Store

One way to find which apps are the most popular, rather than just the most common, is to see how many times each app has been downloaded. The Apple App Store data set has no column that specifically addresses downloads, so we can use the *rating_count_tot* column (the total number of user ratings) as an approximation of an app's popularity.

We will start below by calculating the average number of user ratings for each genre in the Apple App Store data set:

In [19]:
ios_genre = freq_table(ios_final, 11) # generates freq. tables for Prime Genre

for genre in ios_genre: # loops each row of freq. table
    total = 0 # total user ratings counter
    len_genre = 0 # total apps in genre counter
    for app in ios_final: # loop through final Apple App Store data set
        genre_app = app[11] # sets app genre
        if genre_app == genre: # when app genre matches genre in freq. table
            user_ratings = float(app[5]) # sets user ratings
            total += user_ratings # adds user rating to total
            len_genre += 1 # adds one to apps in genre
    avg_user_ratings = total / len_genre # calculates average number of ratings for a genre
    print(genre, ':', avg_user_ratings) # displays calculation

Social Networking : 71548.34905660378
Photo & Video : 28441.54375
Games : 22788.6696905016
Music : 57326.530303030304
Reference : 74942.11111111111
Health & Fitness : 23298.015384615384
Weather : 52279.892857142855
Utilities : 18684.456790123455
Travel : 28243.8
Shopping : 26919.690476190477
News : 21248.023255813954
Navigation : 86090.33333333333
Lifestyle : 16485.764705882353
Entertainment : 14029.830708661417
Food & Drink : 33333.92307692308
Sports : 23008.898550724636
Book : 39758.5
Finance : 31467.944444444445
Education : 7003.983050847458
Productivity : 21028.410714285714
Business : 7491.117647058823
Catalogs : 4004.0
Medical : 612.0
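
The nested loop above scans the whole data set once per genre. The averages could also be computed in a single pass by accumulating a total and a count for each genre, then sorted so the highest average comes first. The code below is a sketch on made-up toy rows; `average_by_group` is a hypothetical helper.

```python
def average_by_group(rows, group_index, value_index):
    """Return {group: mean of the values in column `value_index`} in a single pass."""
    totals = {}  # group -> running sum of values
    counts = {}  # group -> number of rows seen
    for row in rows:
        group = row[group_index]
        totals[group] = totals.get(group, 0.0) + float(row[value_index])
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}

# Toy rows: [name, genre, number of ratings] (made-up values)
toy_rows = [
    ['App A', 'Weather', '100'],
    ['App B', 'Weather', '300'],
    ['App C', 'Music', '50'],
]
averages = average_by_group(toy_rows, 1, 2)
for genre, average in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(genre, ':', average)
# Weather : 200.0
# Music : 50.0
```

For the data in this notebook the call would be `average_by_group(ios_final, 11, 5)`.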


The top 5 genres in the Apple App Store according to average number of ratings are:
   1. Navigation: 86,090.33
   2. Reference: 74,942.11
   3. Social Networking: 71,548.35
   4. Music: 57,326.53
   5. Weather: 52,279.89
    
We will look deeper into these genres to draw our conclusions about a preferable app profile on the Apple App Store.

In [20]:
for app in ios_final: # loop through final Apple App Store data set
    if app[11] == 'Navigation': # set genre we are exploring
        print(app[1], ':', app[5]) # prints name and number of ratings
print('\n')

for app in ios_final:
    if app[11] == 'Reference':
        print(app[1], ':', app[5])
print('\n')

for app in ios_final:
    if app[11] == 'Social Networking':
        print(app[1], ':', app[5])
print('\n')

for app in ios_final:
    if app[11] == 'Music':
        print(app[1], ':', app[5])
print('\n')

for app in ios_final:
    if app[11] == 'Weather':
        print(app[1], ':', app[5])


Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5


Bible : 985920
Dictionary.com Dictionary & Thesaurus : 200047
Dictionary.com Dictionary & Thesaurus for iPad : 54175
Google Translate : 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588
Merriam-Webster Dictionary : 16849
Night Sky : 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693
GUNS MODS for Minecraft PC Edition - Mods Tools : 1497
Guides for Pokémon GO - Pokemon GO News and Cheats : 826
WWDC : 762
Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718
VPN

Looking at Navigation apps, we can see that the genre's user ratings are dominated by Waze, Google Maps, and Geocaching®. There are also few apps in this genre compared to some of the others, which suggests that the barrier to entry is likely high.
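
To put a number on that dominance, we can compute the share of all Navigation ratings held by the largest apps. This is only an illustrative sketch: the ratings are copied by hand from the output above, and `navigation_ratings` and `top_two_share` are names introduced here rather than part of the notebook.

```python
# Number of ratings for the six Navigation apps, copied from the output above.
navigation_ratings = {
    'Waze - GPS Navigation, Maps & Real-time Traffic': 345046,
    'Google Maps - Navigation & Transit': 154911,
    'Geocaching®': 12811,
    'CoPilot GPS – Car Navigation & Offline Maps': 3582,
    'ImmobilienScout24: Real Estate Search in Germany': 187,
    'Railway Route Search': 5,
}

total_ratings = sum(navigation_ratings.values())

# Share of all ratings in the genre held by the two largest apps.
top_two = sorted(navigation_ratings.values(), reverse=True)[:2]
top_two_share = sum(top_two) / total_ratings

print(f'Top two apps hold {top_two_share:.1%} of Navigation ratings')
```

A share this close to 100% means the genre average mostly reflects one or two apps rather than a typical Navigation app.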

We can see something similar in the Reference, Social Networking, Music, and Weather genres. There are dominant players in those genres, but there are also many more apps on offer, so not only is the number of users (approximated by the number of ratings) more balanced, but there is also more opportunity to compete to some degree. 

We probably do not want to enter a genre that has few apps and obviously concentrated popularity, nor a genre that has so many apps that ours could get lost among the competition. The Social Networking and Music genres are probably not the most appropriate app profiles for this reason.

Looking back to our earlier conclusion that the Apple App Store mostly offers apps for casual use or pleasure, there is more opportunity for practical apps. Reference and Weather are both appropriate app profiles in that sense. For the Reference genre, we may be able to develop an app that allows users to store their audiobooks more efficiently or even turn a book into an audiobook. We might also be able to develop a translation or language-learning app. It's harder to come up with a Weather app that doesn't already exist, but we could combine elements of currently popular weather apps into one. 

Some other interesting genres that are not in the top five include:
   1. Photo & Video - A very popular practical category that can generate high user volume, with opportunity for in-app ad partnerships. The barriers are highly innovative and popular competitors.
   2. Health & Fitness - There is an emerging market for at-home and subscription-based workouts that could be an opportunity. There are competitors, but none that have gained major momentum with customers, and there is room for innovation and growth in this genre. 
   3. Travel - There is a lot of competition in the Travel genre, but there may be opportunity as the market changes post Covid-19. This is a genre to keep an eye on as the market shifts and customers become reinvigorated. 
   4. Finance - There is growing popularity of fintech and demand for a convenient way to store, send, and invest funds. This would require knowledge of complex financial systems that may not be readily available to us.



### Most Popular Apps on Google Play 

In the Google Play data set, we can use the *Installs* column to get the information we need. However, the installation numbers are not precise, since they are open-ended values like 100+, 1000+, 5000+, etc. (see below).

In [21]:
display_table(google_final, 5) # freq. table for the Installs columns

1,000,000+ : 15.726534296028879
100,000+ : 11.552346570397113
10,000,000+ : 10.548285198555957
10,000+ : 10.198555956678701
1,000+ : 8.393501805054152
100+ : 6.915613718411552
5,000,000+ : 6.825361010830325
500,000+ : 5.561823104693141
50,000+ : 4.7721119133574
5,000+ : 4.512635379061372
10+ : 3.5424187725631766
500+ : 3.2490974729241873
50,000,000+ : 2.3014440433213
100,000,000+ : 2.1322202166064983
50+ : 1.917870036101083
5+ : 0.78971119133574
1+ : 0.5076714801444043
500,000,000+ : 0.2707581227436823
1,000,000,000+ : 0.22563176895306858
0+ : 0.04512635379061372
0 : 0.01128158844765343


We don't know exactly how many installs an app with 100,000+ installs has. We don't necessarily need exact data for our purposes (as with our approximation for the Apple App Store data set). We only want to find the app category that is most popular among users, so these figures will still give us an idea of that.

We can simply leave the numbers as they are and analyze our data as such. For example, an app that has 500,000+ installs will be analyzed as having 500,000 installs. However, we will need to convert the string 500,000+ to the float 500000.0, and to do so we will need to remove the commas and the plus sign. 
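
As a small illustration of that conversion, the cleaning steps can be wrapped in a helper. This is a sketch only; `parse_installs` is a hypothetical name and is not used elsewhere in this notebook.

```python
def parse_installs(raw):
    """Convert an Installs string such as '500,000+' to a float."""
    # Remove thousands separators and the trailing plus sign, then convert.
    cleaned = raw.replace(',', '').replace('+', '')
    return float(cleaned)

print(parse_installs('500,000+'))  # 500000.0
print(parse_installs('0'))         # 0.0
```

Values without a comma or plus sign, such as `0`, pass through unchanged because `str.replace` does nothing when the character is absent.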

We will convert the strings to floats below and will also calculate the average number of installs for each category in the Google Play data set:

In [22]:
google_category = freq_table(google_final, 1) # generates freq. tables Categories


for category in google_category: # loops through freq. table
    total = 0 # total installs counter
    len_category = 0 # total apps in category counter
    for app in google_final: # loop through final Google Play data set
        category_app = app[1] # sets app category
        if category_app == category: # when app category matches category in freq. table
            n_installs = app[5] # sets app installs
            n_installs = n_installs.replace(',', '') # removes , from installs
            n_installs = n_installs.replace('+', '') # removes + from installs
            total += float(n_installs) # convert installs to float
            len_category += 1 # adds one to category counter
    avg_n_installs = total / len_category # calculates average installs for a category
    print(category, ':', avg_n_installs)  # displays calculation

ART_AND_DESIGN : 1986335.0877192982
AUTO_AND_VEHICLES : 647317.8170731707
BEAUTY : 513151.88679245283
BOOKS_AND_REFERENCE : 8767811.894736841
BUSINESS : 1712290.1474201474
COMICS : 817657.2727272727
COMMUNICATION : 38456119.167247385
DATING : 854028.8303030303
EDUCATION : 1833495.145631068
ENTERTAINMENT : 11640705.88235294
EVENTS : 253542.22222222222
FINANCE : 1387692.475609756
FOOD_AND_DRINK : 1924897.7363636363
HEALTH_AND_FITNESS : 4188821.9853479853
HOUSE_AND_HOME : 1331540.5616438356
LIBRARIES_AND_DEMO : 638503.734939759
LIFESTYLE : 1437816.2687861272
GAME : 15588015.603248259
FAMILY : 3695641.8198090694
MEDICAL : 120550.61980830671
SOCIAL : 23253652.127118643
SHOPPING : 7036877.311557789
PHOTOGRAPHY : 17840110.40229885
SPORTS : 3638640.1428571427
TRAVEL_AND_LOCAL : 13984077.710144928
TOOLS : 10801391.298666667
PERSONALIZATION : 5201482.6122448975
PRODUCTIVITY : 16787331.344927534
PARENTING : 542603.6206896552
WEATHER : 5074486.197183099
VIDEO_PLAYERS : 24727872.452830188
NEWS_AND_

The top 5 categories in Google Play according to average number of installs are:
   1. Communication: 38,456,119
   2. Video Players: 24,727,873
   3. Social: 23,253,652
   4. Photography: 17,840,110
   5. Productivity: 16,787,331
   
Since there is more of a balance in the categories in the Google Play data set (as concluded earlier) we could explore both practical app categories and ones that are for casual use or pleasure. However, looking back to our conclusions on an app profile for the Apple App Store, we should focus our Google Play analysis on practical categories.   
    
Below, we will look deeper into the Google Play categories to reach our conclusions about a preferable app profile:

In [23]:
for app in google_final: # loops final Google Play data set
    if app[1] == 'COMMUNICATION': # when app category is Communication
        print(app[0], ':', app[5]) # prints app name and installs
        

WhatsApp Messenger : 1,000,000,000+
Messenger for SMS : 10,000,000+
My Tele2 : 5,000,000+
imo beta free calls and text : 100,000,000+
Contacts : 50,000,000+
Call Free – Free Call : 5,000,000+
Web Browser & Explorer : 5,000,000+
Browser 4G : 10,000,000+
MegaFon Dashboard : 10,000,000+
ZenUI Dialer & Contacts : 10,000,000+
Cricket Visual Voicemail : 10,000,000+
TracFone My Account : 1,000,000+
Xperia Link™ : 10,000,000+
TouchPal Keyboard - Fun Emoji & Android Keyboard : 10,000,000+
Skype Lite - Free Video Call & Chat : 5,000,000+
My magenta : 1,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Seznam.cz : 1,000,000+
Antillean Gold Telegram (original version) : 100,000+
AT&T Visual Voicemail : 10,000,000+
GMX Mail : 10,000,000+
Omlet Chat : 10,000,000+
My Vodacom SA : 5,000,000+
Microsoft Edge : 5,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Calls & Text by Mo+ : 5,000,000+
free 

Let's see if the Google Play data set shows the same pattern as the Apple App Store data set, with the most popular categories being dominated by a few apps. We can do this by isolating the apps within a category that have been installed 100,000,000+, 500,000,000+, or 1,000,000,000+ times.

We will also see how the average installs for the category are affected if we remove those apps from our analysis.

In [24]:
for app in google_final: # loops final Google Play data set
    # when app category is communication and app has been installed certain number of times
    if app[1] == 'COMMUNICATION' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5]) # prints app name and installs

print('\n')

under_100_m_comm = [] # creates list of apps with under 100MM installs

for app in google_final: # loops through final Google Play data set
    n_installs = app[5] # sets number of installs
    n_installs = n_installs.replace(',', '') # removes , from installs
    n_installs = n_installs.replace('+', '') # removes + from installs
    if (app[1] == 'COMMUNICATION') and (float(n_installs) < 100000000):
        under_100_m_comm.append(float(n_installs)) # adds given category apps with under 100MM installs to list
        
sum(under_100_m_comm) / len(under_100_m_comm) # calculates new average installs for given category

WhatsApp Messenger : 1,000,000,000+
imo beta free calls and text : 100,000,000+
Android Messages : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
Messenger – Text and Video Chat for Free : 1,000,000,000+
imo free video calls and chat : 500,000,000+
Skype - free IM & video calls : 1,000,000,000+
Who : 100,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+
LINE: Free Calls & Messages : 500,000,000+
Google Chrome: Fast & Secure : 1,000,000,000+
Firefox Browser fast & private : 100,000,000+
UC Browser - Fast Download Private & Secure : 500,000,000+
Gmail : 1,000,000,000+
Hangouts : 1,000,000,000+
Messenger Lite: Free Calls & Messages : 100,000,000+
Kik : 100,000,000+
KakaoTalk: Free Calls & Text : 100,000,000+
Opera Mini - fast web browser : 100,000,000+
Opera Browser: Fast and Secure : 100,000,000+
Telegram : 100,000,000+
Truecaller: Caller ID, SMS spam blocking & Dialer : 100,000,000+
UC Browser Mini -Tiny Fast Private & Secure : 100,000,000+
Viber Mess

3603485.3884615386

There are definitely some big players with 1,000,000,000+ installs (WhatsApp, Facebook Messenger, Skype, Google Chrome, Gmail, Hangouts) that seem to be skewing the data to some degree. This does not necessarily mean that the data is unreliable, but the major players may make this category seem more popular than it actually is, and their dominance makes the barrier to entry high. 

We can see that removing the apps with 100,000,000 or more installs makes the category appear far less popular than our original category averages suggested: the average drops from about 38.5 million to about 3.6 million installs.
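
Since the same comparison is repeated for several categories, it could be factored into a small helper. The sketch below is one possible way to do that, not the approach used in the cells of this notebook. `average_installs` is a hypothetical function, and `sample` is made-up data that only mimics the column layout of `google_final` (category at index 1, installs at index 5).

```python
def average_installs(dataset, category, cap=None):
    """Average installs for a category, optionally excluding apps at or above cap."""
    installs = []
    for app in dataset:
        if app[1] != category:
            continue
        # Strip ',' and '+' so the Installs string can be converted to a float.
        n_installs = float(app[5].replace(',', '').replace('+', ''))
        if cap is None or n_installs < cap:
            installs.append(n_installs)
    if not installs:
        return None  # no matching apps, so there is no average to report
    return sum(installs) / len(installs)

# Made-up rows for illustration only.
sample = [
    ['App A', 'COMMUNICATION', '', '', '', '1,000,000,000+'],
    ['App B', 'COMMUNICATION', '', '', '', '10,000,000+'],
    ['App C', 'COMMUNICATION', '', '', '', '1,000,000+'],
    ['App D', 'TOOLS', '', '', '', '5,000+'],
]

print(average_installs(sample, 'COMMUNICATION'))                   # 337000000.0
print(average_installs(sample, 'COMMUNICATION', cap=100_000_000))  # 5500000.0
```

Even in this tiny sample, a single app with 1,000,000,000+ installs pulls the category average up by a factor of about sixty.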

Below, we will see if this pattern continues with the runner-up category:

In [25]:
# Please see comments in previous code
for app in google_final:
    if app[1] == 'VIDEO_PLAYERS':
        print(app[0], ':', app[5])

print('\n')
        
for app in google_final:
    if app[1] == 'VIDEO_PLAYERS' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

print('\n')

under_100_m_video = []

for app in google_final:
    n_installs = app[5]
    n_installs = n_installs.replace(',', '')
    n_installs = n_installs.replace('+', '')
    if (app[1] == 'VIDEO_PLAYERS') and (float(n_installs) < 100000000):
        under_100_m_video.append(float(n_installs))
        
sum(under_100_m_video) / len(under_100_m_video)

YouTube : 1,000,000,000+
All Video Downloader 2018 : 1,000,000+
Video Downloader : 10,000,000+
HD Video Player : 1,000,000+
Iqiyi (for tablet) : 1,000,000+
Video Player All Format : 10,000,000+
Motorola Gallery : 100,000,000+
Free TV series : 100,000+
Video Player All Format for Android : 500,000+
VLC for Android : 100,000,000+
Code : 10,000,000+
Vote for : 50,000,000+
XX HD Video downloader-Free Video Downloader : 1,000,000+
OBJECTIVE : 1,000,000+
Music - Mp3 Player : 10,000,000+
HD Movie Video Player : 1,000,000+
YouCut - Video Editor & Video Maker, No Watermark : 5,000,000+
Video Editor,Crop Video,Movie Video,Music,Effects : 1,000,000+
YouTube Studio : 10,000,000+
video player for android : 10,000,000+
Vigo Video : 50,000,000+
Google Play Movies & TV : 1,000,000,000+
HTC Service － DLNA : 10,000,000+
VPlayer : 1,000,000+
MiniMovie - Free Video and Slideshow Editor : 50,000,000+
Samsung Video Library : 50,000,000+
OnePlus Gallery : 1,000,000+
LIKE – Magic Video Maker & Community : 50,

5544878.133333334

The Video Players category appears to be even more skewed than the Communication category. The Video Players apps with 1,000,000,000+ installs are YouTube and Google Play Movies & TV, which both control large parts of the market. Again, the major players may make this category seem more popular than it actually is, and their dominance makes the barrier to entry high. 

Again, we can see that removing the apps with 100,000,000 or more installs makes the category appear far less popular than our original category averages suggested: the average drops from about 24.7 million to about 5.5 million installs.

Since we are continuing to see the same pattern in the top categories of Google Play, we can build off of our conclusion for an Apple App Store app profile to see which category we should explore next. We were looking at a few practical app genres in our Apple App Store conclusion, but mainly focused on Weather and Reference. 

The Weather category in Google Play is slightly less popular than the Books And Reference category, and we also concluded earlier that we might still be dealing with dominant competitors and have trouble coming up with a unique app. For this reason, we will explore the Books And Reference category of the Google Play data set next.

In [26]:
# Please see comments in previous code
for app in google_final:
    if app[1] == 'BOOKS_AND_REFERENCE':
        print(app[0], ':', app[5])

print('\n')
        
for app in google_final:
    if app[1] == 'BOOKS_AND_REFERENCE' and (app[5] == '1,000,000,000+'
                                      or app[5] == '500,000,000+'
                                      or app[5] == '100,000,000+'):
        print(app[0], ':', app[5])

print('\n')

under_100_m_books = []

for app in google_final:
    n_installs = app[5]
    n_installs = n_installs.replace(',', '')
    n_installs = n_installs.replace('+', '')
    if (app[1] == 'BOOKS_AND_REFERENCE') and (float(n_installs) < 100000000):
        under_100_m_books.append(float(n_installs))
        
sum(under_100_m_books) / len(under_100_m_books)

E-Book Read - Read Book for free : 50,000+
Download free book with green book : 100,000+
Wikipedia : 10,000,000+
Cool Reader : 10,000,000+
Free Panda Radio Music : 100,000+
Book store : 1,000,000+
FBReader: Favorite Book Reader : 10,000,000+
English Grammar Complete Handbook : 500,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
Google Play Books : 1,000,000,000+
AlReader -any text book reader : 5,000,000+
Offline English Dictionary : 100,000+
Offline: English to Tagalog Dictionary : 500,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
Recipes of Prophetic Medicine for free : 500,000+
ReadEra – free ebook reader : 1,000,000+
Anonymous caller detection : 10,000+
Ebook Reader : 5,000,000+
Litnet - E-books : 100,000+
Read books online : 5,000,000+
English to Urdu Dictionary : 500,000+
eBoox: book reader fb2 epub zip : 1,000,000+
English Persian Dictionary : 500,000+
Flybook : 500,000+
All Maths Formulas : 1,000,000+
Ancestry : 5,000,000+
HTC Help : 10,000,000+
E

1437212.2162162163

There are still major players that skew this category, but there are fewer than in the other categories we have looked at so far, and only one has 1,000,000,000+ installs (Google Play Books). Even though removing these apps from the data again makes the category appear less popular (the average drops from about 8.8 million to about 1.4 million installs), there might be potential here.

We will next include some of the competitors to see how the next tier of apps are performing compared to the big players.

In [29]:
# loop through final Google Play data set to find apps installed at next tiers 
for app in google_final:
    if app[1] == 'BOOKS_AND_REFERENCE' and (app[5] == '1,000,000+'
                                      or app[5] == '5,000,000+'
                                      or app[5] == '10,000,000+'
                                      or app[5] == '50,000,000+'):
        print(app[0], ':', app[5])

Wikipedia : 10,000,000+
Cool Reader : 10,000,000+
Book store : 1,000,000+
FBReader: Favorite Book Reader : 10,000,000+
Free Books - Spirit Fanfiction and Stories : 1,000,000+
AlReader -any text book reader : 5,000,000+
FamilySearch Tree : 1,000,000+
Cloud of Books : 1,000,000+
ReadEra – free ebook reader : 1,000,000+
Ebook Reader : 5,000,000+
Read books online : 5,000,000+
eBoox: book reader fb2 epub zip : 1,000,000+
All Maths Formulas : 1,000,000+
Ancestry : 5,000,000+
HTC Help : 10,000,000+
Moon+ Reader : 10,000,000+
English-Myanmar Dictionary : 1,000,000+
Golden Dictionary (EN-AR) : 1,000,000+
All Language Translator Free : 1,000,000+
Aldiko Book Reader : 10,000,000+
Dictionary - WordWeb : 5,000,000+
50000 Free eBooks & Free AudioBooks : 5,000,000+
Al-Quran (Free) : 10,000,000+
Al Quran Indonesia : 10,000,000+
Al'Quran Bahasa Indonesia : 10,000,000+
Al Quran Al karim : 1,000,000+
Al Quran : EAlim - Translations & MP3 Offline : 5,000,000+
Koran Read &MP3 30 Juz Offline : 1,000,000+
H

We can see that the next tier of competitors in this category is actually quite successful. In the other categories that we looked at, the market dominators had far more installs than any other app in the category. In this category there are still market dominators, reflected by the decrease in average installs when we remove them, but there is also a good balance in the number of installs among the apps competing at the next popularity tier. 

We can see that this next tier contains a lot of apps that are focused on e-books, dictionaries, and library collections. That market seems saturated, and it may be difficult to gain traction with a new app. There are quite a few apps that focus on language and religion, and it appears there may be potential for a new idea here.
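
One way to check how evenly a category is spread across install tiers is to count the apps in each *Installs* bucket. The following is a minimal sketch under stated assumptions: `install_tiers` is a hypothetical helper, and `sample` contains only a few rows excerpted from the output above, with just the name, category, and installs columns filled in.

```python
from collections import Counter

def install_tiers(dataset, category):
    """Count how many apps in a category fall into each Installs bucket."""
    return Counter(app[5] for app in dataset if app[1] == category)

# A few rows excerpted from the output above; only the name (index 0),
# category (index 1), and installs (index 5) columns are filled in.
sample = [
    ['Wikipedia', 'BOOKS_AND_REFERENCE', '', '', '', '10,000,000+'],
    ['Cool Reader', 'BOOKS_AND_REFERENCE', '', '', '', '10,000,000+'],
    ['Book store', 'BOOKS_AND_REFERENCE', '', '', '', '1,000,000+'],
    ['Ebook Reader', 'BOOKS_AND_REFERENCE', '', '', '', '5,000,000+'],
    ['Google Play Books', 'BOOKS_AND_REFERENCE', '', '', '', '1,000,000,000+'],
]

tiers = install_tiers(sample, 'BOOKS_AND_REFERENCE')
for tier, count in tiers.most_common():
    print(tier, ':', count)
```

Run against the full `google_final` data set, a count like this would show whether the middle tiers are well populated or whether installs are concentrated in a handful of apps.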

## Final Conclusion

My app profile recommendation would be to develop a translation and cultural app that gives users audio translations and insights into travel destinations. This would help users learn the language of a place they might be visiting, or simply help them communicate through the audio. It would also allow users to look into historical sources to find out about the culture and historical sites of a place they are planning to visit. 

This app would fit into the Reference genre in the Apple App Store and the Books And Reference category in Google Play. These are the categories we are targeting in our app development, as they show the most potential in terms of current popularity and how realistic it would be to compete in the market. The obvious app to develop for these categories would be one having to do with e-books, but there already seem to be many apps attempting to do this.

Looking at some of the other genres in the Apple App Store data set, I noticed that Travel was one that was pretty popular. It is also quite popular in Google Play. While there is a lot of competition in the Travel genre, there may be opportunity as the market changes post Covid-19, since most current apps focus on planning travel rather than on helping users deal with a new place or giving them insight into the places to visit.

There does not seem to be a hugely popular app that blends the offerings of this app idea and with travel about to boom post Covid-19 we are in a perfect position to develop an app that would allow us to penetrate bot