# Hispanic Mental Health App Viability Analysis

## Introduction

Protective measures against the spread of COVID-19, including social distancing and the restriction and closure of businesses, have taken a heavy psychological and economic toll on millions of people in the United States. Anxiety, depression, and suicidality in the US have [increased significantly this year](https://www.cdc.gov/mmwr/volumes/69/wr/mm6932a1.htm), especially among racial minorities. In a Centers for Disease Control and Prevention study published in August 2020, Hispanics, the largest ethnic minority in the US, reported higher levels of these conditions than non-Hispanic whites and non-Hispanic Asians.

The current circumstances highlight the already low levels of mental health services sought and received by Hispanic-Americans. [Barriers to treatment](https://www.mhanational.org/issues/latinxhispanic-communities-and-mental-health) include cultural stigma around speaking about or seeking help for mental health disorders; avoiding therapy for religious reasons; language barriers; cultural differences in describing symptoms; insufficient health insurance coverage; and legal status. 

Self-care mental health apps are one solution to address these obstacles. Such apps are an easy, private, and often free medium for receiving mental health information and some of the benefits of traditional therapy. On a global scale, apps in this category are being downloaded and used at a rapidly increasing rate: [they reportedly generated](https://www.prnewswire.com/news-releases/mental-health-apps-market-accounted-for-us-587-9-mn-in-2018-and-is-expected-to-generate-a-revenue-of-us-3-918-40-mn-by-2027--at-a-growth-rate-of-23-7-from-2019--2027--300997559.html) almost \\$600 million in revenue globally in 2018, and are expected to bring in close to \\$4 billion by 2027. While the majority of these apps have not been subject to scientific studies to determine their effectiveness, most mental health experts agree they are a useful tool when based on best practices, especially if combined with traditional therapy.

The nature of self-care mental health apps could make them a particularly good solution for Hispanics. The privacy and low cost (or no cost) of apps address both common cultural and socioeconomic barriers. *However, only about half or fewer of the mental health apps built in English are available in Spanish, and few apps have been built from the ground up to meet the unique mental health needs of Hispanic-Americans.*

In this project, we will look at datasets to (I) determine whether there is a niche opportunity to develop a self-care mental health app specifically designed for Hispanic-Americans; and (II) look for insights in the data that can inform and improve the app concept.


## Choosing Datasets

[As of September 2020](https://www.statista.com/statistics/276623/number-of-apps-available-in-leading-app-stores/), Google Play has the most content of any app store in the world with 2.87 million apps; Apple's App Store trails it with 1.96 million. Due to their market dominance and the availability of datasets, we will analyze data from the App Store and Google Play. Specifically, we'll use two datasets from 2018 posted on Kaggle. One provides data on 7,197 apps from the App Store, and the other on 10,841 apps from Google Play.
<br>
<br>
- The App Store dataset is available [here](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps).
- The Google Play dataset is available [here](https://www.kaggle.com/lava18/google-play-store-apps/notebooks).

(Thank you to Ramanathan for the App Store dataset and to Lavanya Gupta for the Google Play dataset.) 


## Opening and Exploring Datasets

The datasets are available in the .csv format. Python's `csv` module is part of the standard library, so no installation is needed; we'll import its `reader` function and convert both datasets into lists of lists.


In [2]:
from csv import reader

In [3]:
# The encoding is given explicitly because some app names contain non-ASCII characters.
# The with blocks close each file once it has been read.
with open('/Users/fjgaughan94/Desktop/Data Science/My Data Sets/Apps Data (Dataquest)/Hispanic Mental Health App/AppleStore.csv', encoding='utf8') as open_file_ios:
    apps_data_ios = list(reader(open_file_ios))

with open('/Users/fjgaughan94/Desktop/Data Science/My Data Sets/Apps Data (Dataquest)/Hispanic Mental Health App/googleplaystore.csv', encoding='utf8') as open_file_android:
    apps_data_android = list(reader(open_file_android))

Now that both tables are readable, we'll look at the headers and the first couple rows of both datasets.

In [4]:
def explore_data(dataset, start, end, columns_and_rows=True):
    # Print each row in the requested slice, separated by blank lines
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n')
    if columns_and_rows:
        # Assumes the dataset still includes its header row at index 0
        print('Columns: ' + str(len(dataset[0])))
        print('Rows: ' + str(len(dataset) - 1))

ios_header = apps_data_ios[0]
android_header = apps_data_android[0]

print('App Store Header & Sample')
print('\n')
print(ios_header)
print('\n')
# explore_data prints its own output and returns None, so it is called directly
explore_data(apps_data_ios, 1, 4)
print('\n')
print('\n')
print('Google Play Header & Sample')
print('\n')
print(android_header)
print('\n')
explore_data(apps_data_android, 1, 4)
print('\n')

App Store Header & Sample


['', 'id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['1', '281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99', '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5', '10', '1']


['2', '281796108', 'Evernote - stay organized', '158578688', 'USD', '0', '161065', '26', '4', '3.5', '8.2.2', '4+', 'Productivity', '37', '5', '23', '1']


['3', '281940292', 'WeatherBug - Local Weather, Radar, Maps, Alerts', '100524032', 'USD', '0', '188583', '2822', '3.5', '4.5', '5.0.0', '4+', 'Weather', '37', '5', '3', '1']


Columns: 17
Rows: 7197




Google Play Header & Sample


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook'

For our purposes, a few columns look potentially relevant:

* track_name/App
* prime_genre/Category/Genres
* lang.num (number of languages supported)
* rating_count_tot (total number of ratings given)
* Reviews (total number of reviews given)

(If you'd like to know more about the columns in the App Store dataset, its [documentation](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/home) provides clear definitions for each column.)
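
The code cells in this notebook address columns by position (for example, `row[2]` for `track_name`). As a minimal sketch of a more readable alternative, the positions can be looked up from the header row. The header and sample row below are copied from the output above; `col_index` and the `example_` names are hypothetical and are not part of the original notebook.

```python
# Header row copied from the App Store output above
example_header = ['', 'id', 'track_name', 'size_bytes', 'currency', 'price',
                  'rating_count_tot', 'rating_count_ver', 'user_rating',
                  'user_rating_ver', 'ver', 'cont_rating', 'prime_genre',
                  'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']

# First data row copied from the App Store output above
example_row = ['1', '281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99',
               '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5',
               '10', '1']


def col_index(header, column_name):
    """Return the position of column_name in a header row (hypothetical helper)."""
    return header.index(column_name)


name_col = col_index(example_header, 'track_name')
genre_col = col_index(example_header, 'prime_genre')
lang_col = col_index(example_header, 'lang.num')

# Look up values by column name instead of by a hard-coded number
print(example_row[name_col], example_row[genre_col], example_row[lang_col])
```

Looking positions up this way keeps the later cells correct if a column is ever added or removed, at the cost of one extra line per column.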

## Cleaning the Data

Before we go further, we need to clean the data. Both datasets are popular on Kaggle, so most of the relevant issues have already been identified in their respective "Discussion" sections. From our review of these discussion threads, we found the following problems relevant to our analysis:

[App Store Dataset Issues](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/discussion)
* Duplicates
    - There are two rows with non-unique names.
    
[Google Play Dataset Issues](https://www.kaggle.com/lava18/google-play-store-apps/discussion)
* Duplicates
   - There are 1,181 rows with non-unique app names.
* Category column
   - Row 10472 (app name "Life Made WI-Fi Touchscreen Photo Frame") is missing the "Category" data point, shifting the rest of the data points in the row. 
    
We'll start by confirming the number of duplicate app names in the App Store dataset, and then take a closer look at them. 

In [5]:
unique_apps_ios = []
duplicate_apps_ios = []

for row in apps_data_ios[1:]:
    name = row[2]
    if name in unique_apps_ios: 
        duplicate_apps_ios.append(name)
    else:
        unique_apps_ios.append(name)

print('Unique iOS App Names: ' + str(len(unique_apps_ios)))
print('Duplicate iOS App Names: ' + str(len(duplicate_apps_ios)))
print('Duplicate iOS Apps: ' + str(duplicate_apps_ios))
        

Unique iOS App Names: 7195
Duplicate iOS App Names: 2
Duplicate iOS Apps: ['VR Roller Coaster', 'Mannequin Challenge']


There are, in fact, only two duplicate names. Based on [this discussion thread](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/discussion/90409) on Kaggle, we expected the duplicate names to be "VR Roller Coaster" and "Mannequin Challenge", and the thread indicates that the rows sharing each name are distinct apps by different developers. The dataset therefore has no true duplicates.
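
The duplicate check above scans a list for every row, which is fine at this size. As a hedged sketch of an alternative, `collections.Counter` from the standard library can count every name in one pass. The helper `find_duplicate_names` and the toy rows (including their IDs) are made up for illustration.

```python
from collections import Counter


def find_duplicate_names(rows, name_index):
    """Return {name: count} for every name that occurs more than once."""
    counts = Counter(row[name_index] for row in rows)
    return {name: n for name, n in counts.items() if n > 1}


# Toy rows shaped like (index, id, track_name); the IDs are invented
toy_rows = [
    ['1', '1001', 'VR Roller Coaster'],
    ['2', '1002', 'Mannequin Challenge'],
    ['3', '1003', 'VR Roller Coaster'],
    ['4', '1004', 'Headspace'],
]

duplicates = find_duplicate_names(toy_rows, 2)
print(duplicates)
```

On the real data this would be called as `find_duplicate_names(apps_data_ios[1:], 2)`, assuming the header row is still at index 0.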

Now we'll turn to the Google Play dataset. First, we'll delete row 10472, and then remove duplicate rows. We'll start that process by deleting any rows that are exactly identical. We'll then create a frequency table of rows that have the same app name but are *not* identical, and print a portion of it showing all app names that are repeated five times or more.

In [6]:
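# Row 10472 is counted without the header; with the header at index 0,
# the malformed row sits at list index 10473
apps_data_android.pop(10473)


unique_rows_android = []

# Keep only the first occurrence of each fully identical row (header included)
for row in apps_data_android:
    if row not in unique_rows_android:
        unique_rows_android.append(row)

apps_data_android = unique_rows_android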

unique_appnames_android = []
duplicate_appnames_android = []
duplicate_appnames_android_fq = {}

for row in apps_data_android[1:]:
    name = row[0]
    if name in unique_appnames_android:
        duplicate_appnames_android.append(name)
    else:
        unique_appnames_android.append(name)

for name in duplicate_appnames_android:
    if name in duplicate_appnames_android_fq:
        duplicate_appnames_android_fq[name] += 1
    else:
        duplicate_appnames_android_fq[name] = 1

print('Unique Google Play App Names: ' + str(len(unique_appnames_android)))
print('\n')
print('Remaining Rows with Non-Unique App Names: ' + str(len(duplicate_appnames_android)))
print('\n')
print('Updated Dataset--Total Rows: ' + str(len(apps_data_android[1:])))
print('\n')
print('Deleted rows: ' + str(10840 - len(apps_data_android[1:])))
print('\n')


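# duplicate_appnames_android_fq counts repeats beyond the first occurrence,
# so a count above 3 means the name appears five or more times in total
DAG_fq_5_plus = {}

for name in duplicate_appnames_android_fq:
    if duplicate_appnames_android_fq[name] > 3:
        DAG_fq_5_plus[name] = duplicate_appnames_android_fq[name] + 1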

print('Apps Duplicated 5+ Times: ' + str(DAG_fq_5_plus))

Unique Google Play App Names: 9659


Remaining Rows with Non-Unique App Names: 698


Updated Dataset--Total Rows: 10357


Deleted rows: 483


Apps Duplicated 5+ Times: {'Duolingo: Learn Languages Free': 5, 'Subway Surfers': 5, 'ROBLOX': 9, '8 Ball Pool': 7, 'Candy Crush Saga': 5, 'Bubble Shooter': 6, 'Granny': 5, 'Zombie Catchers': 6, 'Temple Run 2': 5, 'Zombie Tsunami': 5, 'Farm Heroes Saga': 5, 'slither.io': 5, 'Angry Birds Classic': 5, 'Helix Jump': 6, 'Bowmasters': 5}


Despite deleting 483 identical rows, there are still 698 rows with non-unique app names. We know from [this discussion thread](https://www.kaggle.com/lava18/google-play-store-apps/discussion/136133) that some otherwise identical rows differ in the Reviews column. However, this is not the entire story. We'll print out the dataset header and all the rows for two apps with repeated names, Duolingo (the first entry in the "Apps Duplicated 5+ Times" frequency table above) and Nick, and take a closer look.

In [7]:
def print_app_dups(app_name):
    # Print every data row whose app name matches app_name
    for row in apps_data_android[1:]:
        if row[0] == app_name:
            print(row)
            print('\n')

print(apps_data_android[0:1])
print('\n')
print_app_dups('Duolingo: Learn Languages Free')
print('\n')
print('\n')
print_app_dups('Nick')

[['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']]


['Duolingo: Learn Languages Free', 'EDUCATION', '4.7', '6289924', 'Varies with device', '100,000,000+', 'Free', '0', 'Everyone', 'Education;Education', 'August 1, 2018', 'Varies with device', 'Varies with device']


['Duolingo: Learn Languages Free', 'EDUCATION', '4.7', '6290507', 'Varies with device', '100,000,000+', 'Free', '0', 'Everyone', 'Education;Education', 'August 1, 2018', 'Varies with device', 'Varies with device']


['Duolingo: Learn Languages Free', 'FAMILY', '4.7', '6294400', 'Varies with device', '100,000,000+', 'Free', '0', 'Everyone', 'Education;Education', 'August 1, 2018', 'Varies with device', 'Varies with device']


['Duolingo: Learn Languages Free', 'FAMILY', '4.7', '6294397', 'Varies with device', '100,000,000+', 'Free', '0', 'Everyone', 'Education;Education', 'August 1, 2018', 'Varies with device', 'Varies wi

In these rows, there are variances in three columns: Category, Reviews, and Last Updated. If you compare the Category and Reviews columns for both sets of rows, it's reasonable to assume that these are duplicates of the same app captured at different times: for both Duolingo and Nick, the Category change correlates with rising total reviews, implying that the category was changed at some point.

We'll now remove the redundant rows for apps with non-unique names. For any given app name, we will keep only the row with the highest number in the Reviews column, since we can infer that it is the most recently captured data for that app. We'll verify the outcome by printing all "Duolingo" and "Nick" rows in the new dataset and comparing them to the rows above.

We may lose a handful of rows with non-unique app names that are actually distinct apps, as we saw with the App Store dataset; but given the large size of the dataset, that's not a cause for concern.

In [8]:
# Reviews are read from the CSV as strings, so they are converted to int
# before comparing; as strings, '999' would rank above '1000'
tot_reviews = {}

for row in apps_data_android[1:]:
    name = row[0]
    reviews = int(row[3])
    if name not in tot_reviews or reviews > tot_reviews[name]:
        tot_reviews[name] = reviews

highest_reviews_android = [apps_data_android[0]]  # keep the header row
already_added = set()

for row in apps_data_android[1:]:
    name = row[0]
    reviews = int(row[3])
    # Keep only the first row that matches the highest review count,
    # so rows tied on that count do not survive as duplicates
    if reviews == tot_reviews[name] and name not in already_added:
        highest_reviews_android.append(row)
        already_added.add(name)

apps_data_android = highest_reviews_android

def print_app_dups_ver2(app_name):
    # Print every row (header included in the search) whose name matches
    for row in apps_data_android:
        if row[0] == app_name:
            print(row)
            print('\n')

print_app_dups_ver2('Duolingo: Learn Languages Free')
print_app_dups_ver2('Nick')

['Duolingo: Learn Languages Free', 'FAMILY', '4.7', '6297590', 'Varies with device', '100,000,000+', 'Free', '0', 'Everyone', 'Education;Education', 'August 6, 2018', 'Varies with device', 'Varies with device']


['Nick', 'FAMILY', '4.2', '123322', '25M', '10,000,000+', 'Free', '0', 'Everyone 10+', 'Entertainment;Music & Video', 'January 24, 2018', '2.0.8', '4.4 and up']




Based on the output above, it appears we were successful.
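
One detail is worth watching in this kind of step: values read with `csv.reader` are strings, and Python compares strings character by character, so review counts need to be converted to numbers before looking for a maximum. The following is a minimal sketch on made-up rows; the app names and counts are illustrative, and `keep_most_reviewed` is a hypothetical helper rather than the notebook's own function.

```python
def keep_most_reviewed(rows, name_index=0, reviews_index=3):
    """Keep one row per app name: the row with the highest review count."""
    best = {}
    for row in rows:
        name = row[name_index]
        reviews = int(row[reviews_index])  # compare as numbers, not strings
        if name not in best or reviews > int(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Made-up rows shaped like (App, Category, Rating, Reviews)
toy_rows = [
    ['Example App', 'EDUCATION', '4.7', '999'],
    ['Example App', 'FAMILY', '4.7', '1000'],
    ['Other App', 'MEDICAL', '4.0', '12'],
]

string_comparison = '999' > '1000'   # True: '9' sorts after '1'
numeric_comparison = 999 > 1000      # False
deduplicated = keep_most_reviewed(toy_rows)

print(string_comparison, numeric_comparison)
print(deduplicated)
```

Storing the whole winning row per name, rather than only the winning count, also guarantees a single row per app when several rows tie on the highest count.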

## Filtering the Data

Now that the datasets have been cleaned sufficiently for our needs, we'll filter out irrelevant rows. Since we want to look at mental health apps, we'll filter both datasets based on the "prime_genre" column in the App Store dataset, and the "Category" and "Genres" columns in the Google Play dataset.

First, we need to determine which categories and genres we want to include. We'll create a dictionary for each of these columns.

In [9]:
genre_apps_ios = {}
category_apps_android = {}
genre_apps_android = {}


for row in apps_data_ios[1:]:
    genre = row[12]
    if genre in genre_apps_ios:
        genre_apps_ios[genre] += 1
    else:
        genre_apps_ios[genre] = 1

for row in apps_data_android[1:]:
    category = row[1]
    genre = row[9]
    if category in category_apps_android:
        category_apps_android[category] += 1
    else: 
        category_apps_android[category] = 1
    if genre in genre_apps_android:
        genre_apps_android[genre] += 1
    else:
        genre_apps_android[genre] = 1
        
print('App Store Genres:')
print('\n')
print(genre_apps_ios)
print('\n')
print('Google Play Categories:') 
print('\n')
print(category_apps_android)
print('\n')
print('Google Play Genres')
print('\n')
print(genre_apps_android)


App Store Genres:


{'Games': 3862, 'Productivity': 178, 'Weather': 72, 'Shopping': 122, 'Reference': 64, 'Finance': 104, 'Music': 138, 'Utilities': 248, 'Travel': 81, 'Social Networking': 167, 'Sports': 114, 'Business': 57, 'Health & Fitness': 180, 'Entertainment': 535, 'Photo & Video': 349, 'Navigation': 46, 'Education': 453, 'Lifestyle': 144, 'Food & Drink': 63, 'News': 75, 'Book': 112, 'Medical': 23, 'Catalogs': 10}


Google Play Categories:


{'ART_AND_DESIGN': 61, 'AUTO_AND_VEHICLES': 85, 'BEAUTY': 53, 'BOOKS_AND_REFERENCE': 222, 'BUSINESS': 420, 'COMICS': 56, 'COMMUNICATION': 315, 'DATING': 171, 'EDUCATION': 108, 'ENTERTAINMENT': 87, 'EVENTS': 64, 'FINANCE': 345, 'FOOD_AND_DRINK': 112, 'HEALTH_AND_FITNESS': 288, 'HOUSE_AND_HOME': 73, 'LIBRARIES_AND_DEMO': 84, 'LIFESTYLE': 369, 'GAME': 943, 'FAMILY': 1880, 'MEDICAL': 395, 'SOCIAL': 239, 'SHOPPING': 203, 'PHOTOGRAPHY': 281, 'SPORTS': 325, 'TRAVEL_AND_LOCAL': 219, 'TOOLS': 829, 'PERSONALIZATION': 376, 'PRODUCTIVITY': 374, 'PARENTIN

From both the App Store and Google Play, the Health & Fitness and Medical genres/categories look potentially relevant. It appears that the Genres column in the Google Play data is simply a more granular version of the Category column. This would be useful, hypothetically, but it doesn't include any new genres related to Health & Fitness, Medical, or any other topic under which a mental health app would likely fall. So, we'll ignore the "Genres" column in our filtering for the Google Play data.
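
One way to check how the two Google Play columns relate is to group the Genres values by Category. The sketch below does this on a few made-up rows; `genres_by_category` is a hypothetical helper, and the toy rows use only three columns (name, category, genre) rather than the dataset's full layout.

```python
from collections import defaultdict


def genres_by_category(rows, category_index, genre_index):
    """Map each Category value to the set of Genres values seen with it."""
    mapping = defaultdict(set)
    for row in rows:
        mapping[row[category_index]].add(row[genre_index])
    return dict(mapping)


# Made-up rows: (App, Category, Genres)
toy_rows = [
    ['App A', 'MEDICAL', 'Medical'],
    ['App B', 'HEALTH_AND_FITNESS', 'Health & Fitness'],
    ['App C', 'FAMILY', 'Education;Education'],
    ['App D', 'FAMILY', 'Entertainment;Music & Video'],
]

mapping = genres_by_category(toy_rows, 1, 2)
for category in sorted(mapping):
    print(category, sorted(mapping[category]))
```

On the real data, the call would be `genres_by_category(apps_data_android[1:], 1, 9)`, assuming the column positions shown in the header output earlier. A category that maps to a single genre adds no extra detail for filtering.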
<br>
<br>
We'll now filter down both datasets to only apps that fall under the "Health & Fitness" or "Medical" genre/category, and then print out the number of remaining rows.

In [10]:
h_appsdata_ios = []
h_appsdata_android = []

# The header rows are carried along, so each filtered list holds
# one more row than it has apps
for row in apps_data_ios:
    genre = row[12]
    if genre in ('prime_genre', 'Health & Fitness', 'Medical'):
        h_appsdata_ios.append(row)

for row in apps_data_android:
    category = row[1]
    if category in ('Category', 'HEALTH_AND_FITNESS', 'MEDICAL'):
        h_appsdata_android.append(row)

print('App Store Health Apps: ' + str(len(h_appsdata_ios)))
print('\n')
print('Google Play Health Apps: ' + str(len(h_appsdata_android)))

App Store Health Apps: 204


Google Play Health Apps: 684


Next, we need to further narrow both datasets from all apps related to Health & Fitness and Medical to just apps relating to mental health. Because many app names don't include intuitively searchable keywords related to their purpose ("Happify", for example), we can't simply search for mental health terms. Instead, we'll eliminate rows whose app names contain specific words that would likely *not* be included in the name of a mental health app. To determine the most-used words in the app names in these datasets, we'll create a frequency table for words in app names, order them from most frequent to least frequent, and print a portion of the result.

In [11]:
app_words_android = {}

for row in h_appsdata_android:
    string = row[0]
    string_list = string.split()
    for word in string_list:
        if word in app_words_android:
            app_words_android[word] +=1
        else:
            app_words_android[word] = 1

app_words_android = sorted(app_words_android.items(), key=lambda x: x[1], reverse=True)

print(app_words_android[:125])


[('-', 98), ('&', 59), ('Tracker', 42), ('AH', 30), ('CT', 26), ('Fitness', 24), ('Blood', 24), ('App', 23), ('Workout', 23), ('for', 19), ('and', 19), ('Pressure', 18), ('Health', 17), ('My', 16), ('Counter', 15), ('Calorie', 15), ('Weight', 13), ('Anatomy', 13), ('Sleep', 12), ('Workouts', 12), ('Ab', 12), ('Diet', 11), ('Guide', 11), ('BP', 11), ('Free', 10), ('Abs', 10), ('Running', 10), ('Trainer', 10), ('Yoga', 10), ('Meditation', 10), ('Period', 10), ('Calculator', 10), ('Pro', 10), ('CF', 10), ('30', 9), ('Mobile', 9), ('Prep', 9), ('Loss', 8), ('Challenge', 8), ('Log', 8), ('Pocket', 8), ('Pregnancy', 8), ('Ovulation', 8), ('Calendar', 8), ('Diabetes', 8), ('The', 7), ('by', 7), ('of', 7), ('Care', 7), ('Nursing', 7), ('Bacterial', 7), ('GPS', 6), ('Cycling', 6), ('Home', 6), ('with', 6), ('Coach', 6), ('Daily', 6), ('Relax', 6), ('Diary', 6), ('Monitor', 6), ('Medicine', 6), ('Medical', 6), ('Drug', 6), ('Super', 6), ('Vaginosis', 6), ('in', 5), ('Days', 5), ('Day', 5), ('Wal

Based on this list of the top 125 words, we can remove almost half of the remaining apps from the Google Play dataset. 

In [12]:
mh_appsdata_android = []
non_mh_apps_android = []
exclude_android = ['Fitness', 'Blood', 'Workout', 'Pressure', 'Counter', 
                    'Calorie', 'Weight', 'Anatomy', 'Sleep', 'Workouts', 
                    'Ab', 'Diet', 'Abs', 'Running', 'Period', 'Calculator', 
                    'Loss', 'Pregnancy', 'Ovulation', 'Diabetes', 'Nursing',
                    'Belly', 'Fat', 'Abdomen', 'Bike', 'Cycling', 'Run', 
                    'Pedometer', 'Packs', 'Period', 'Medical', 'Walk', 
                    'Hike', 'BP', 'CF', 'CB', 'CK', 'Pharmacy', 'Fitbit',
                    'Yoga', 'EP', 'CT', 'Paramedic', 'Medicine', 'Acupuncture',
                    'Drug', 'Ear', 'Hearing', 'Medication', 'Pet', 'Cancer',
                    'Deaf', 'Anime', 'Fever', 'Birth', 'Drink', 'AB', 'Bacterial',
                    'Fit', 'CR', 'Smoking', 'Vaginosis', 'EMT', 'Gym', 
                    'Nutrition', 'GPS', 'Run', 'Runtastic', 'AH']

# A row is set aside if any whitespace-separated word in its app name
# is on the exclusion list; every other row is kept
for row in h_appsdata_android:
    words = row[0].split()
    if any(word in exclude_android for word in words):
        non_mh_apps_android.append(row)
    else:
        mh_appsdata_android.append(row)

print('Filtered Google Play Health Apps: ' + str(len(mh_appsdata_android)))

Filtered Google Play Health Apps: 334


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['LG Health', 'HEALTH_AND_FITNESS', '3.3', '20098', '25M', '10,000,000+', 'Free', '0', 'Everyone', 'Health & Fitness', 'May 14, 2018', '5.31.75', '4.4 and up']


['Aunjai i lert u', 'HEALTH_AND_FITNESS', '4.2', '1140', '5.5M', '500,000+', 'Free', '0', 'Teen', 'Health & Fitness', 'January 10, 2017', '1.5', '3.0 and up']


['Garmin Connect™', 'HEALTH_AND_FITNESS', '3.9', '232153', 'Varies with device', '10,000,000+', 'Free', '0', 'Everyone', 'Health & Fitness', 'August 2, 2018', 'Varies with device', 'Varies with device']


['The TK-App - everything under control', 'HEALTH_AND_FITNESS', '4.5', '8642', '18M', '100,000+', 'Free', '0', 'Everyone', 'Health & Fitness', 'August 2, 2018', '1.20', '4.4 and up']


['Recipes for hair and face tried', 'HEALTH_AND_FITNESS', '4.5', '1203', '6.5M', '500,000+', 

['Navi Radiography Pro', 'MEDICAL', '4.7', '11', '100M', '500+', 'Paid', '$15.99', 'Everyone', 'Medical', 'January 14, 2018', '1.2.0', '4.0.3 and up']


['palmPEDi: Pediatric Tape', 'MEDICAL', '4.6', '66', '2.6M', '5,000+', 'Paid', '$0.99', 'Everyone', 'Medical', 'December 2, 2013', '4.1', '3.2 and up']


['MyChart', 'MEDICAL', '4.2', '19473', 'Varies with device', '1,000,000+', 'Free', '0', 'Everyone', 'Medical', 'July 24, 2018', 'Varies with device', 'Varies with device']


['FollowMyHealth®', 'MEDICAL', '4.6', '73118', '37M', '1,000,000+', 'Free', '0', 'Everyone', 'Medical', 'May 10, 2018', '3.3', '4.4 and up']


['CareZone', 'MEDICAL', '4.4', '27524', 'Varies with device', '1,000,000+', 'Free', '0', 'Everyone', 'Medical', 'July 30, 2018', 'Varies with device', 'Varies with device']


['Teladoc Member', 'MEDICAL', '4.0', '2094', '23M', '500,000+', 'Free', '0', 'Everyone', 'Medical', 'July 26, 2018', '3.19', '4.3 and up']


['myAir™ for Air10™ by ResMed', 'MEDICAL', '3.7', '236', '18



['Smartshading AI', 'MEDICAL', 'NaN', '0', '10M', '10+', 'Free', '0', 'Everyone', 'Medical', 'June 21, 2018', '1.0', '4.1 and up']


['KBA-EZ Health Guide', 'MEDICAL', '5.0', '4', '25M', '1+', 'Free', '0', 'Everyone', 'Medical', 'August 2, 2018', '1.0.72', '4.0.3 and up']


['FoothillsVet', 'MEDICAL', '5.0', '2', '29M', '50+', 'Free', '0', 'Everyone', 'Medical', 'July 11, 2018', '300000.1.11', '4.0.3 and up']


['Eversense', 'MEDICAL', 'NaN', '3', '17M', '100+', 'Free', '0', 'Everyone', 'Medical', 'July 18, 2018', '2.0.101', '4.4 and up']


['PrimeDelivery', 'MEDICAL', '5.0', '3', '53M', '10+', 'Free', '0', 'Everyone', 'Medical', 'July 13, 2018', '0.1', '4.1 and up']


['HACH Cares', 'MEDICAL', 'NaN', '0', '28M', '10+', 'Free', '0', 'Everyone', 'Medical', 'July 25, 2018', '300000.1.11', '4.0.3 and up']


['WAH 247', 'MEDICAL', 'NaN', '0', '29M', '1+', 'Free', '0', 'Everyone', 'Medical', 'July 20, 2018', '300000.1.11', '4.0.3 and up']


['VetCode', 'MEDICAL', '4.9', '28', '5.7M', '5,0

We'll now manually comb through the remaining 334 rows and create a new list of only mental health apps.

In [24]:
appnames_android = ['Free Meditation - Take a Break', 'Meditate OM', 'My Chakra Meditation', 
                    'Relax with Andrew Johnson Lite', 'Meditation Studio', '21-Day Meditation Experience',
                    'My Chakra Meditation 2', 'Simple Habit Meditation', 'Headspace: Meditation & Mindfulness',
                    'Self Healing', 'Happify', 'Binaural Beats Therapy', 'Pacifica - Stress & Anxiety', 
                    'Insight Timer - Free Meditation App', 'Self-help Anxiety Management', 'Brain Waves - Binaural Beats',
                    'Prana Breath: Calm & Meditate', '7 Cups: Anxiety & Stress Chat', 'Calm - Meditate, Sleep, Relax',
                    'Stop, Breathe & Think: Meditation & Mindfulness', 'Advanced Comprehension Therapy', 'Breathing Zone',
                    'End Anxiety Pro - Stress, Panic Attack Help', 'Number Therapy', 
                    'Zocdoc: Find Doctors & Book Appointments', 'Ada - Your Health Guide', 'All Mental disorders', 
                    'MoodSpace', 'HealtheLife', 'Moodpath - Depression & Anxiety Test', 'MDLIVE: Talk to a Doctor 24/7',
                    '5-Minute Clinical Consult', '5 Minute Clinical Consult 2019 - #1 for 25 years', 'Free Hypnosis', 
                    'GGDE: Prevent & Beat Depression Symptoms', 'Youper - AI Therapy', 'K Health', 'TelaDoc', 
                    'Cures A-Z', 'EO App. SelfCompassion to you', ]

mh_appsdata_android_final = []

for row in mh_appsdata_android:
    name = row[0]
    if name in appnames_android:
        mh_appsdata_android_final.append(row)

mh_appsdata_android = mh_appsdata_android_final

for row in mh_appsdata_android:
    print(row)
    print('\n')

['Free Meditation - Take a Break', 'HEALTH_AND_FITNESS', '4.2', '1608', '39M', '100,000+', 'Free', '0', 'Everyone', 'Health & Fitness', 'July 28, 2016', '6.7', '2.3.3 and up']


['Meditate OM', 'HEALTH_AND_FITNESS', '4.5', '19074', '4.2M', '1,000,000+', 'Free', '0', 'Everyone', 'Health & Fitness', 'January 17, 2018', '16.0', '4.0 and up']


['My Chakra Meditation', 'HEALTH_AND_FITNESS', '4.4', '7586', '45M', '500,000+', 'Free', '0', 'Everyone', 'Health & Fitness', 'April 10, 2018', '1.0.6', '2.3.3 and up']


['Relax with Andrew Johnson Lite', 'HEALTH_AND_FITNESS', '4.3', '2885', 'Varies with device', '100,000+', 'Free', '0', 'Everyone', 'Health & Fitness', 'June 19, 2012', 'Varies with device', 'Varies with device']


['Meditation Studio', 'HEALTH_AND_FITNESS', '4.6', '1026', '29M', '10,000+', 'Paid', '$3.99', 'Everyone', 'Health & Fitness', 'May 15, 2018', '1.0.6', '4.3 and up']


['21-Day Meditation Experience', 'HEALTH_AND_FITNESS', '4.4', '11506', '15M', '100,000+', 'Free', '0', 'E

We now have a workable set of data on mental health apps from the original Google Play dataset. We'll repeat the process with the iOS dataset that has been narrowed down to apps under the "Health & Fitness" and "Medical" genres.

In [14]:
app_words_ios = {}

for row in h_appsdata_ios:
    string = row[2]
    string_list = string.split()
    for word in string_list:
        if word in app_words_ios:
            app_words_ios[word] +=1
        else:
            app_words_ios[word] = 1

app_words_ios = sorted(app_words_ios.items(), key=lambda x: x[1], reverse=True)

print(app_words_ios[:125])


[('-', 51), ('&', 36), ('Tracker', 30), ('and', 25), ('for', 24), ('Workout', 17), ('Sleep', 17), ('Fitness', 16), ('–', 11), ('Weight', 10), ('My', 9), ('Runtastic', 9), ('Diet', 9), ('Anatomy', 9), ('to', 9), ('Trainer', 9), ('Training', 8), ('by', 8), ('Period', 7), ('Calorie', 7), ('with', 7), ('Human', 7), ('7', 7), ('Minute', 7), ('Counter', 6), ('Yoga', 6), ('Meditation', 6), ('PRO', 6), ('Fitbit', 6), (':', 6), ('The', 6), ('Workouts', 5), ('Cycle', 5), ('Food', 5), ('3D', 5), ('Exercise', 5), ('Your', 5), ('Alarm', 5), ('~', 5), ('Baby', 4), ('Guided', 4), ('Daily', 4), ('Heart', 4), ('Pro', 4), ('Gym', 4), ('Body', 4), ('your', 4), ('Workouts,', 4), ('meal', 4), ('Challenge', 4), ('Health', 4), ('Lose', 3), ('Loss', 3), ('GPS', 3), ('Running', 3), ('Route', 3), ('Walking', 3), ('Relaxation', 3), ('Diary', 3), ('Mindfulness', 3), ('Instant', 3), ('Monitor', 3), ('Plus', 3), ('Bodyweight', 3), ('5K', 3), ('App', 3), ('Sworkit', 3), ('Clock', 3), ('workout', 3), ('plans', 3), ('

In [15]:
mh_appsdata_ios = []
non_mh_apps_ios = []
exclude_ios = ['Fitness', 'Blood', 'Workout', 'Pressure', 'Counter', 
                    'Calorie', 'Weight', 'Anatomy', 'Sleep', 'Workouts', 
                    'Ab', 'Diet', 'Abs', 'Running', 'Period', 'Calculator', 
                    'Loss', 'Pregnancy', 'Ovulation', 'Diabetes', 'Nursing',
                    'Belly', 'Fat', 'Abdomen', 'Bike', 'Cycling', 'Run', 
                    'Pedometer', 'Packs', 'Period', 'Medical', 'Walk', 
                    'Hike', 'BP', 'CF', 'CB', 'CK', 'Pharmacy', 'Fitbit',
                    'Yoga', 'EP', 'CT', 'Paramedic', 'Medicine', 'Acupuncture',
                    'Drug', 'Ear', 'Hearing', 'Medication', 'Pet', 'Cancer',
                    'Deaf', 'Anime', 'Fever', 'Birth', 'Drink', 'AB', 'Bacterial',
                    'Fit', 'CR', 'Smoking', 'Vaginosis', 'EMT', 'Gym', 
                    'Nutrition', 'GPS', 'Run', 'Runtastic', 'AH']

# A row is set aside if any whitespace-separated word in its app name
# is on the exclusion list; every other row is kept
for row in h_appsdata_ios:
    words = row[2].split()
    if any(word in exclude_ios for word in words):
        non_mh_apps_ios.append(row)
    else:
        mh_appsdata_ios.append(row)

print('Filtered App Store Health Apps: ' + str(len(mh_appsdata_ios)))

Filtered App Store Health Apps: 101
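
Both filtering passes follow the same pattern, so the logic could be factored into one function. The sketch below is one possible shape for it; `filter_by_excluded_words` is a hypothetical helper and the rows are made up. Like the cells above, it matches whole whitespace-separated words and is case-sensitive.

```python
def filter_by_excluded_words(rows, name_index, excluded_words):
    """Split rows into (kept, removed) based on words in the app name."""
    excluded = set(excluded_words)  # set lookup; duplicates in the list are harmless
    kept, removed = [], []
    for row in rows:
        words = row[name_index].split()
        if excluded.intersection(words):
            removed.append(row)
        else:
            kept.append(row)
    return kept, removed


# Made-up single-column rows holding only an app name
toy_rows = [
    ['Calm - Meditate, Sleep, Relax'],
    ['Headspace: Meditation & Mindfulness'],
    ['Blood Pressure Tracker'],
    ['Sleep Sounds'],
]

kept, removed = filter_by_excluded_words(toy_rows, 0, ['Blood', 'Pressure', 'Sleep'])
print(kept)
print(removed)
```

Note that the first toy row is kept even though it mentions sleep: splitting on whitespace yields the token `Sleep,` with its trailing comma, which does not equal `Sleep`. Punctuation attached to a word therefore affects which rows the exclusion list catches.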


In [16]:
appnames_ios = ['Simply Being - Guided Meditation for Relaxation and Presence', 
                'Breathing Zone: Guided Breathing for Mindfulness', 'WebMD for iPad',
                'Headspace', 'buddhify - modern mindfulness for busy lives', 'Mindfulness Daily',
                '平安好医生-要健康上平安好医生', 'PAUSE - Relaxation at your fingertip',
                'Moodnotes - Thought Journal / Mood Diary', 'Meditation Studio – Guided Meditations and Courses',
                'Away ~ Meditation & mindfulness to sleep, relax, focus, breathe', 
                'Wildfulness - Unwind in nature and calm your mind', 'Wellbeyond Meditation for Kids',
                'Zen', 'Flowing ~ Meditation & Mindfulness', 
                ]

mh_appsdata_ios_final = []

for row in mh_appsdata_ios:
    name = row[2]
    if name in appnames_ios:
        mh_appsdata_ios_final.append(row)

mh_appsdata_ios = mh_appsdata_ios_final

for row in mh_appsdata_ios:
    print(row)
    print('\n')

['330', '347418999', 'Simply Being - Guided Meditation for Relaxation and Presence', '100048896', 'USD', '1.99', '2417', '1366', '4.5', '4.5', '6.0', '4+', 'Health & Fitness', '38', '2', '1', '1']


['456', '369838631', 'Breathing Zone: Guided Breathing for Mindfulness', '37140480', 'USD', '3.99', '511', '57', '4.5', '4.5', '3.1', '4+', 'Health & Fitness', '37', '5', '1', '1']


['472', '373185673', 'WebMD for iPad', '17613824', 'USD', '0', '9142', '22', '3.5', '4', '3.5', '12+', 'Health & Fitness', '26', '5', '1', '1']


['1332', '493145008', 'Headspace', '121170944', 'USD', '0', '12819', '1326', '5', '5', '2.13.2', '4+', 'Health & Fitness', '37', '0', '1', '1']


['2484', '687421118', 'buddhify - modern mindfulness for busy lives', '285338624', 'USD', '4.99', '501', '12', '4.5', '4', '2.6.9', '4+', 'Health & Fitness', '37', '4', '1', '1']


['2548', '701112447', 'Mindfulness Daily', '101669888', 'USD', '1.99', '1245', '693', '5', '5', '1.072', '4+', 'Health & Fitness', '38', '5', '1'

We'll now turn both mental health app datasets into Pandas DataFrames, delete columns irrelevant to our analysis, and turn the app name column into the index. 

In [17]:
import pandas as pd
import numpy as np

mh_appsdata_android_df = pd.DataFrame(data=mh_appsdata_android)
mh_appsdata_android_df.columns = android_header

In [19]:
mh_appsdata_android_df.drop(['Category', 'Type', 'Price', 'Content Rating',
                       'Genres', 'Size'], axis=1, inplace=True)

mh_appsdata_android_df = mh_appsdata_android_df.set_index('App')

mh_appsdata_android_df

| App | Rating | Reviews | Installs | Last Updated | Current Ver | Android Ver |
|---|---|---|---|---|---|---|
| Free Meditation - Take a Break | 4.2 | 1608 | 100,000+ | July 28, 2016 | 6.7 | 2.3.3 and up |
| Meditate OM | 4.5 | 19074 | 1,000,000+ | January 17, 2018 | 16.0 | 4.0 and up |
| My Chakra Meditation | 4.4 | 7586 | 500,000+ | April 10, 2018 | 1.0.6 | 2.3.3 and up |
| Relax with Andrew Johnson Lite | 4.3 | 2885 | 100,000+ | June 19, 2012 | Varies with device | Varies with device |
| Meditation Studio | 4.6 | 1026 | 10,000+ | May 15, 2018 | 1.0.6 | 4.3 and up |
| 21-Day Meditation Experience | 4.4 | 11506 | 100,000+ | August 2, 2018 | 3.0.0 | 4.1 and up |
| My Chakra Meditation 2 | 4.3 | 1288 | 100,000+ | April 10, 2018 | 2.0.4 | 2.3.3 and up |
| Simple Habit Meditation | 4.7 | 11689 | 500,000+ | July 27, 2018 | 1.29.15 | 4.4 and up |
| Headspace: Meditation & Mindfulness | 4.6 | 77563 | 10,000,000+ | July 23, 2018 | 3.6.4 | 4.2 and up |
| Self Healing | 4.5 | 14394 | 500,000+ | July 16, 2017 | Public.Heal | 4.0.3 and up |


In [20]:
mh_appsdata_ios_df = pd.DataFrame(data=mh_appsdata_ios, columns=ios_header)
mh_appsdata_ios_df.drop(['id', 'currency', 'price', 'cont_rating', 'prime_genre',
                         'vpp_lic', 'sup_devices.num', 'ipadSc_urls.num', 'size_bytes', 
                         ], axis=1, inplace=True)

mh_appsdata_ios_df.drop(mh_appsdata_ios_df.columns[0], axis=1, inplace=True)

mh_appsdata_ios_df = mh_appsdata_ios_df.set_index('track_name')

print(mh_appsdata_ios_df)

                                                   rating_count_tot  \
track_name                                                            
Simply Being - Guided Meditation for Relaxation...             2417   
Breathing Zone: Guided Breathing for Mindfulness                511   
WebMD for iPad                                                 9142   
Headspace                                                     12819   
buddhify - modern mindfulness for busy lives                    501   
Mindfulness Daily                                              1245   
平安好医生-要健康上平安好医生                                                   0   
PAUSE - Relaxation at your fingertip                            180   
Moodnotes - Thought Journal / Mood Diary                        625   
Meditation Studio – Guided Meditations and Courses             2491   
Away ~ Meditation & mindfulness to sleep, relax...              184   
Wildfulness - Unwind in nature and calm your mind                70   
Wellbe

# Analyzing The Data

Now that we have clean, relevant data in DataFrames, we can start our analysis. We want to determine if there is a market opportunity for a self-care mental health app designed for Hispanic-Americans. While many Hispanic-Americans are bilingual or speak only English, as of 2017 over [40 million Americans](https://archive.vn/20200214011034/https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_17_1YR_S1601&prodType=table) spoke Spanish at home. Given that there were 60 million Hispanics in the US in 2017, it's reasonable to assume that Spanish is the preferred language of at least half, and possibly a majority, of Hispanic-Americans. Thus, whether a self-care mental health app is built in Spanish or available in Spanish will be our primary criterion for identifying potential competitors. Specifically, we want to know about:
<br>
1. Self-care mental health apps built in Spanish for Hispanics
<br>
<br>
Unfortunately, it's apparent from reviewing the data that no apps with Spanish names made it into the datasets. We know from searches that there are apps written in Spanish on both app platforms, especially Google Play. As a "poor man's" alternative to gauge how much competition is out there, we'll review the top 10 search results on the App Store and Google Play for the search terms "salud mental" ("mental health"), "depresión" ("depression"), and "ansiedad" ("anxiety"), and make dictionaries of all relevant apps built in Spanish and of all relevant apps built in other languages. We'll then compare the sizes of both groups. (Using current search results means we're mixing some 2020 data with 2018 data. This is a conscious trade-off, even though it slightly undermines the integrity of our analysis.)
<br>
<br>
2. Self-care mental health apps built in another language that are available in Spanish
<br>
<br>
For this, we can look at the "lang.num" (number of languages) column in the original App Store dataset, which indicates whether the app could be available in Spanish. Because the Google Play dataset does not include a column showing the number of languages, we cannot use that dataset for this step. To compensate, we've produced a table of 11 sample self-care mental health apps, showing which ones include a Spanish language option.
<br>
<br>
We'll now take a look at apps built in Spanish.
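
The tallying step described above (counting how often each app shows up across the three searches) can be sketched with `collections.Counter`. This is an illustrative sketch only: the per-term result lists and app names are placeholders, not the actual search results, and `tally_results` is a hypothetical helper.

```python
from collections import Counter

def tally_results(results_by_term):
    """Count how many times each app appears across all search terms."""
    counts = Counter()
    for apps in results_by_term.values():
        counts.update(apps)
    return counts

# Placeholder search results (hypothetical app names)
results_by_term = {
    'salud mental': ['App A', 'App B'],
    'depresión': ['App A', 'App C'],
    'ansiedad': ['App A', 'App B'],
}

counts = tally_results(results_by_term)
print(counts.most_common())   # apps sorted by number of appearances
print(sum(counts.values()))   # total appearances across all searches
```

Keeping the results grouped by search term, rather than in one flat list, makes it easier to check later that each term contributed its ten results.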

In [21]:
spn_mhapps_android = {'Yana: Tu acompañante emocional': 3, 'Ansiedad y estrés': 2,  
                      'Autoayuda Depresión Ansiedad': 2,'Salud Mental': 1, 
                      'Psicologia de la Depresión': 1, 'Vive Sin Ansiedad': 1,
                      'Meyo: Ansiedad, Autoestima y Crecimiento Personal': 1}

nonspn_mhapps_android = ['Wysa: stress, depression & anxiety therapy chatbot', 
                      'Remente: Self Help, Mental Health & Improvement', 
                      'InnerHour Self-Care Therapy - Anxiety & Depression',
                      'Sanvello for Anxiety, Depression & Stress', 
                      'Mind journal: anxiety relief & mental health diary',
                      'Cíngulo: Terapia Guiada', 
                      'MoodSpace - Stress, anxiety, & low mood self-help',
                      'Depression Test', 'Sanvello for Anxiety, Depression & Stress',
                      'Mood Tracker, Journal, Diary | Anti Depression app', 
                      'InnerHour Self-Care Therapy - Anxiety & Depression',
                      'MoodSpace - Stress, anxiety, & low mood self-help',
                      'Wysa: stress, depression & anxiety therapy chatbot',
                      'Control and Monitor: Anxiety, Mood and Self-Esteem',
                      'Rootd - Panic Attack & Anxiety Relief', 
                      'Control and Monitor: Anxiety, Mood and Self-Esteem',
                      'Lojong: Meditação e Mindfulness +Calma -Ansiedade (Early Access)',
                      'Anxiety Test', 'Sanvello for Anxiety, Depression & Stress']

nonspn_mhapps_android_dict = {}

for item in nonspn_mhapps_android:
    if item in nonspn_mhapps_android_dict:
        nonspn_mhapps_android_dict[item] += 1
    else:
        nonspn_mhapps_android_dict[item] = 1
        
nonspn_mhapps_android = sorted(nonspn_mhapps_android_dict.items(), key=lambda x: x[1], reverse=True)

print('Native Spanish Apps from Google Play: ' + str(spn_mhapps_android))
print('\n')
print('Total Native Spanish Apps from Google Play: ' + str(sum(spn_mhapps_android.values())))
print('\n')
print('Non-Native Spanish Apps from Google Play: ' + str(nonspn_mhapps_android))
print('\n')
print('Total Non-Native Spanish Apps from Google Play: ' + str(sum(nonspn_mhapps_android_dict.values())))

Native Spanish Apps from Google Play: {'Yana: Tu acompañante emocional': 3, 'Ansiedad y estrés': 2, 'Autoayuda Depresión Ansiedad': 2, 'Salud Mental': 1, 'Psicologia de la Depresión': 1, 'Vive Sin Ansiedad': 1, 'Meyo: Ansiedad, Autoestima y Crecimiento Personal': 1}


Total Native Spanish Apps from Google Play: 11


Non-Native Spanish Apps from Google Play: [('Sanvello for Anxiety, Depression & Stress', 3), ('Wysa: stress, depression & anxiety therapy chatbot', 2), ('InnerHour Self-Care Therapy - Anxiety & Depression', 2), ('MoodSpace - Stress, anxiety, & low mood self-help', 2), ('Control and Monitor: Anxiety, Mood and Self-Esteem', 2), ('Remente: Self Help, Mental Health & Improvement', 1), ('Mind journal: anxiety relief & mental health diary', 1), ('Cíngulo: Terapia Guiada', 1), ('Depression Test', 1), ('Mood Tracker, Journal, Diary | Anti Depression app', 1), ('Rootd - Panic Attack & Anxiety Relief', 1), ('Lojong: Meditação e Mindfulness +Calma -Ansiedade (Early Access)', 1), ('Anx

In [22]:
spn_mhapps_ios = {'Guia Bio Emocional': 1, }

nonspn_mhapps_ios = {'BetterMe: Meditation & Sleep': 1, 'Sanvello for Anxiety, Depression & Stress': 2, 'Mindfulness': 1,
                     'Happify: for Stress & Worry': 1, 'eMoods Bipolar Mood Tracker': 1,
                     'Introspection Exercises': 1, 'Calm': 2, 'I am: Positive Affirmations': 1,
                     'Replika - My AI Friend': 1, 'MindDoc: Depression & Anxiety': 1,
                     'Depression Test': 1, 'Headspace: Meditation & Sleep': 1, 
                     'Daylio Journal': 1, 'AntiStress Anxiety Relief Game': 1, 
                     'Motivation - Daily quotes': 1, 'Anxiety Log': 1, 
                     'Zen: Guided Meditation & Sleep': 1}

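nonspn_mhapps_ios_dict = {}

# Note: nonspn_mhapps_ios is a dict, and iterating over a dict yields only its
# keys. Each app is therefore counted once here, and the appearance counts
# stored above (e.g. 2 for 'Calm') are not carried over. The total printed
# below is the number of distinct apps, not the number of appearances.
for item in nonspn_mhapps_ios:
    if item in nonspn_mhapps_ios_dict:
        nonspn_mhapps_ios_dict[item] += 1
    else:
        nonspn_mhapps_ios_dict[item] = 1

nonspn_mhapps_ios = sorted(nonspn_mhapps_ios_dict.items(), key=lambda x: x[1], reverse=True)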

print('Native Spanish Apps from App Store: ' + str(spn_mhapps_ios))
print('\n')
print('Total Native Spanish Apps from App Store: ' + str(sum(spn_mhapps_ios.values())))
print('\n')
print('Non-Native Spanish Apps from App Store: ' + str(nonspn_mhapps_ios))
print('\n')
print('Total Non-Native Spanish Apps from App Store: ' + str(sum(nonspn_mhapps_ios_dict.values())))

Native Spanish Apps from App Store: {'Guia Bio Emocional': 1}


Total Native Spanish Apps from App Store: 1


Non-Native Spanish Apps from App Store: [('BetterMe: Meditation & Sleep', 1), ('Sanvello for Anxiety, Depression & Stress', 1), ('Mindfulness', 1), ('Happify: for Stress & Worry', 1), ('eMoods Bipolar Mood Tracker', 1), ('Introspection Exercises', 1), ('Calm', 1), ('I am: Positive Affirmations', 1), ('Replika - My AI Friend', 1), ('MindDoc: Depression & Anxiety', 1), ('Depression Test', 1), ('Headspace: Meditation & Sleep', 1), ('Daylio Journal', 1), ('AntiStress Anxiety Relief Game', 1), ('Motivation - Daily quotes', 1), ('Anxiety Log', 1), ('Zen: Guided Meditation & Sleep', 1)]


Total Non-Native Spanish Apps from App Store: 17


From the Google Play search, 37% (11 of 30) of the results were apps built in Spanish; the remaining 63% (19) were apps built in other languages, primarily English. Of those built in Spanish, it's unclear if any are designed specifically for Hispanic-Americans. From the App Store search, only 3% (1) of the results were apps built in Spanish; 57% were apps built in English; and 40% were not self-care mental health apps.
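
The Google Play shares follow from the two tallies in the code above: 11 appearances of apps built in Spanish and 19 appearances of apps built in other languages. A minimal sketch of that arithmetic, with `share` as a hypothetical helper:

```python
def share(part, total):
    """Return `part` as a percentage of `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)

spanish = 11       # appearances of apps built in Spanish (Google Play)
non_spanish = 19   # appearances of apps built in other languages (Google Play)
total = spanish + non_spanish

print(share(spanish, total))       # share built in Spanish
print(share(non_spanish, total))   # share built in other languages
```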
<br> 
<br>
The other competition to our proposed app is popular self-care mental health apps built in English that are available in Spanish. If we look at the iOS dataset, it appears that only 27% (4 of 15) of the mental health apps are available in multiple languages. 
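
One way such a share could be computed from list-of-lists data is sketched below. It treats any language count above 1 as "multiple languages". The sample rows are shortened placeholders rather than the real dataset, and `multilingual_share` is a hypothetical helper; in the full iOS rows printed earlier, "lang.num" appears to sit at index 15, which is an assumption worth checking against the header.

```python
def multilingual_share(rows, lang_index):
    """Return (count, percent) of rows whose language count exceeds 1."""
    multi = [row for row in rows if int(row[lang_index]) > 1]
    percent = round(100 * len(multi) / len(rows)) if rows else 0
    return len(multi), percent

# Placeholder rows: [app name, lang.num]; values are illustrative only
sample_rows = [['App A', '1'], ['App B', '12'], ['App C', '1'], ['App D', '3']]

count, percent = multilingual_share(sample_rows, lang_index=1)
print(f'{count} of {len(sample_rows)} ({percent}%)')
```

A language count above 1 only shows that an app is multilingual; it does not confirm that Spanish is one of the languages offered.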

To compensate for the lack of a similar "number of languages" column in the Google Play data, we separately analyzed 11 self-care mental health apps found in articles written in Spanish, such as [this one](https://code.tutsplus.com/es/articles/world-mental-health-day-apps-for-a-changing-world--cms-31998), on top mental health apps. Of the 11 apps, we found that 45% (5) were available partially or entirely in Spanish (see below).


In [23]:
article_apps_lang = [['Happify', 'Yes'], ['Ada', 'Yes'], ['Headspace', 'Yes'],
                     ['Moodnotes', 'Yes'], ['Talkspace', 'Yes'], ['Mindshift', 'No'],
                     ['MindDoc', 'No'], ['Bloom', 'No'], ['Reflectly', 'No'],
                     ['Moodkit', 'No'], ["What's Up", 'No']]

articles_apps_df = pd.DataFrame(data=article_apps_lang, columns=['App', 'Av. In Spn.'])
print(articles_apps_df)

          App Av. In Spn.
0     Happify         Yes
1         Ada         Yes
2   Headspace         Yes
3   Moodnotes         Yes
4   Talkspace         Yes
5   Mindshift          No
6     MindDoc          No
7       Bloom          No
8   Reflectly          No
9     Moodkit          No
10  What's Up          No
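
The 45% figure can be reproduced from the table with a short check. This sketch uses plain Python and repeats the list from the cell above so that it runs on its own.

```python
# Same app list as in the cell above
article_apps_lang = [['Happify', 'Yes'], ['Ada', 'Yes'], ['Headspace', 'Yes'],
                     ['Moodnotes', 'Yes'], ['Talkspace', 'Yes'], ['Mindshift', 'No'],
                     ['MindDoc', 'No'], ['Bloom', 'No'], ['Reflectly', 'No'],
                     ['Moodkit', 'No'], ["What's Up", 'No']]

# Keep only the apps marked as available in Spanish
available = [app for app, in_spanish in article_apps_lang if in_spanish == 'Yes']
percent = round(100 * len(available) / len(article_apps_lang))

print(f'{len(available)} of {len(article_apps_lang)} apps ({percent}%) offer Spanish')
```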


# Conclusions

We need to take these findings with several large grains of salt, for three reasons:
<br>
<br>
I) The datasets we started with represented only 0.036% of all App Store apps, and 0.052% of all Google Play apps;
<br>
<br>
II) These datasets are over two years old, meaning they do not represent the hundreds of thousands of apps that have been added to Google Play and the App Store since September 2018--they also don't reflect that some apps have added additional languages in the last two years, such as [Headspace](https://help.headspace.com/hc/en-us/articles/115003050408-Is-Headspace-offered-in-other-languages-); and,
<br>
<br>
III) The original datasets did not include any apps with names in Spanish, meaning we are missing relevant apps.  
<br>
<br>
To solve all of these issues, we could have conducted targeted web scraping of the App Store and Google Play. However, that would have gone beyond the scope of this exercise.
<br>
<br>
Based on our available data, **it appears there is a good market for a self-care mental health app specifically designed to meet the unique needs of Hispanic-Americans.** This conclusion is supported by our keyword search in Spanish on the two most popular app platforms, in which we found that the majority of results are apps built in another language, with a different culture and a different set of needs and obstacles in mind.
<br>
<br>
For Hispanic-Americans whose first language is Spanish, there are few obvious options for self-care mental health apps that are I) built in Spanish and II) designed with their unique cultures and obstacles in mind. There very much appears to be a market for such an app, and it could be highly profitable if executed well.