# Scraping Teletherapy App Data from the Apple App Store

This notebook uses the itunes_app_scraper and app_store_scraper packages to scrape reviews of 10 digital therapy apps from the Apple App Store, in order to examine the benefits and shortfalls of mental health apps like these.

The apps were selected based on search results for "therapy" and "CBT" on the app store, as well as online reviews of mental health apps. The dataframe containing their names and ids is manually created here. Then, the packages are used to extract app information and app ratings. 

Because ratings are inherently biased, I will use these reviews to isolate potential problems or solutions unique to these apps, rather than as a metric for efficacy. I will use regular expressions to explore the following questions:
- Do users experience quicker or slower response times than they might with in-person therapy or more traditional, appointment-based telehealth?
- Do users experience issues navigating insurance claims with these apps? Do they report feeling unfairly charged?
- If users feel the apps helped them, how?
- ...and more to come as I continue the analysis!

## Imports

In [None]:
!pip install itunes-app-scraper-dmi==0.9.4
!pip install app-store-scraper==0.3.5

In [26]:
import pandas as pd

from itunes_app_scraper.scraper import AppStoreScraper
from app_store_scraper import AppStore

import time
import re

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

In [4]:
pd.set_option('display.max_columns', None)

In [5]:
#Testing on one app

In [6]:
my_app = AppStore(
  country='us',       
  app_name='talkspace-therapy-counseling', 
  app_id=661829386    
) 
    
my_app.review(
  how_many=100,
  sleep=1  # seconds to pause between requests; time.sleep(1) would just run once and pass None
)

2022-02-11 15:34:56,705 [INFO] Base - Initialised: AppStore('us', 'talkspace-therapy-counseling', 661829386)
2022-02-11 15:34:56,706 [INFO] Base - Ready to fetch reviews from: https://apps.apple.com/us/app/talkspace-therapy-counseling/id661829386
2022-02-11 15:34:59,442 [INFO] Base - [id:661829386] Fetched 100 reviews (100 fetched in total)


In [7]:
reviews = my_app.reviews

In [None]:
reviews[:5]

## Building dataframe of 10 apps 

This section builds a dataframe of the therapy apps to analyze, taken from the App Store website (https://www.apple.com/app-store/). Each app's name and id appear in its App Store URL.

The apps are:
- [Talkspace](https://apps.apple.com/us/app/talkspace-therapy-counseling/id661829386) 
- [BetterHelp](https://apps.apple.com/us/app/betterhelp-therapy/id995252384)
- [Wysa](https://apps.apple.com/us/app/wysa-mental-health-support/id1166585565)
- [Bloom](https://apps.apple.com/us/app/bloom-cbt-therapy-self-care/id1475128511)
- [7 Cups](https://apps.apple.com/us/app/7-cups-online-therapy-chat/id921814681)
- [Sanvello](https://apps.apple.com/us/app/sanvello-anxiety-depression/id922968861)
- [Woebot](https://apps.apple.com/us/app/woebot-your-self-care-expert/id1305375832)
- [Larkr](https://apps.apple.com/us/app/larkr-on-demand-video-therapy/id1253710426)
- [Youper](https://apps.apple.com/us/app/youper-online-therapy/id1060691513)
- [Replika](https://apps.apple.com/us/app/replika-virtual-ai-friend/id1158555867)
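
Since each name and id is embedded in the app's URL, they could also be pulled out with a regular expression instead of being typed by hand. The following is a minimal sketch, assuming every URL follows the `apps.apple.com/<country>/app/<name>/id<digits>` shape seen in the list above; `parse_app_url` is a hypothetical helper, not part of either scraper package.

```python
import re

# Assumed URL shape: https://apps.apple.com/<country>/app/<name>/id<digits>
APP_URL_PATTERN = re.compile(
    r"apps\.apple\.com/(?P<country>[a-z]{2})/app/(?P<name>[^/]+)/id(?P<id>\d+)"
)


def parse_app_url(url):
    """Return the country, name, and numeric id found in an App Store URL,
    or None if the URL does not match the assumed shape."""
    match = APP_URL_PATTERN.search(url)
    if match is None:
        return None
    return {
        "country": match["country"],
        "name": match["name"],
        "id": int(match["id"]),
    }


example = parse_app_url(
    "https://apps.apple.com/us/app/talkspace-therapy-counseling/id661829386"
)
print(example)
```

The `name` and `id` values in the returned dict line up with the keys used in the list of apps defined in the next cell.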

In [11]:
app_names_ids = [
    {'name':'talkspace-therapy-counseling',
    'id': 661829386},
    {'name':'betterhelp-therapy',
    'id':995252384},
    {'name': 'wysa-mental-health-support',
    'id':1166585565},
    {'name': 'bloom-cbt-therapy-self-care',
    'id':1475128511},
    {'name': '7-cups-online-therapy-chat',
    'id':921814681},
    {'name': 'sanvello-anxiety-depression',
    'id':922968861},
    {'name': 'woebot-your-self-care-expert',
    'id':1305375832},
    {'name': 'larkr-on-demand-video-therapy',
    'id':1253710426},
    {'name':'youper-online-therapy',
    'id':1060691513},
    {'name': 'replika-virtual-ai-friend',
    'id':1158555867}]

In [14]:
therapy_apps = pd.DataFrame(app_names_ids)

In [16]:
therapy_apps

Unnamed: 0,name,id
0,talkspace-therapy-counseling,661829386
1,betterhelp-therapy,995252384
2,wysa-mental-health-support,1166585565
3,bloom-cbt-therapy-self-care,1475128511
4,7-cups-online-therapy-chat,921814681
5,sanvello-anxiety-depression,922968861
6,woebot-your-self-care-expert,1305375832
7,larkr-on-demand-video-therapy,1253710426
8,youper-online-therapy,1060691513
9,replika-virtual-ai-friend,1158555867


## Scraping reviews for the apps

In [28]:
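#Avoiding max retries error
#Note: this session is not passed to AppStore in the cells below, so it only affects requests made through `session` itself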
session = requests.Session()
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

In [33]:
all_reviews = []
for app in app_names_ids:
    my_app = AppStore(
      country='us',       
      app_name=app['name'], 
      app_id=app['id']    
    ) 

    my_app.review(
      how_many=1500,
      sleep=2  # seconds to pause between requests; time.sleep(2) would just run once and pass None
    )
    
    reviews = my_app.reviews
    
    for review in reviews:
        review['app_name'] = app['name']
        review['app_id'] = app['id']
        
    reviews_df = pd.DataFrame(reviews)
    all_reviews.append(reviews_df)
    time.sleep(5)

2022-02-11 18:19:56,118 [INFO] Base - Initialised: AppStore('us', 'talkspace-therapy-counseling', 661829386)
2022-02-11 18:19:56,119 [INFO] Base - Ready to fetch reviews from: https://apps.apple.com/us/app/talkspace-therapy-counseling/id661829386
2022-02-11 18:20:16,777 [ERROR] Base - Something went wrong: HTTPSConnectionPool(host='amp-api.apps.apple.com', port=443): Max retries exceeded with url: /v1/catalog/us/apps/661829386/reviews?l=en-GB&offset=0&limit=20&platform=web&additionalPlatforms=appletv%2Cipad%2Ciphone%2Cmac (Caused by ResponseError('too many 429 error responses'))
2022-02-11 18:20:16,781 [INFO] Base - [id:661829386] Fetched 0 reviews (0 fetched in total)
2022-02-11 18:20:22,702 [INFO] Base - Initialised: AppStore('us', 'betterhelp-therapy', 995252384)
2022-02-11 18:20:22,703 [INFO] Base - Ready to fetch reviews from: https://apps.apple.com/us/app/betterhelp-therapy/id995252384
2022-02-11 18:20:43,602 [ERROR] Base - Something went wrong: HTTPSConnectionPool(host='amp-api.

ConnectionError: HTTPSConnectionPool(host='apps.apple.com', port=443): Max retries exceeded with url: /us/app/replika-virtual-ai-friend/id1158555867 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x126748b20>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
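
The log above shows the scraper receiving 429 responses and fetching 0 reviews. One possible way to handle this is to retry an app after progressively longer waits whenever it comes back empty. The sketch below is library-agnostic: `fetch_with_backoff`, `max_attempts`, and `base_delay` are hypothetical names, and the demo uses a stand-in function rather than a real scraper call. Whether longer waits are enough to get past the rate limit is an assumption that has not been tested here.

```python
import time


def fetch_with_backoff(fetch, max_attempts=4, base_delay=30, sleep=time.sleep):
    """Call fetch() until it returns a non-empty result or attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between
    attempts. Returns the last result, which may still be empty.
    """
    result = []
    for attempt in range(max_attempts):
        result = fetch()
        if result:
            return result
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return result


# Stand-in for a real scraper call: empty twice, then one made-up review.
responses = iter([[], [], [{"review": "example"}]])
waits = []  # records the requested waits instead of actually sleeping
reviews = fetch_with_backoff(
    lambda: next(responses), base_delay=1, sleep=waits.append
)
print(len(reviews), waits)
```

In the loop above, `fetch` could be a small function that builds the `AppStore` object, calls `review`, and returns `my_app.reviews`.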

## Analysis

I haven't done much yet! But I plan to start with search terms such as insurance, love, refund, hate, worth, and customer service.
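
As a starting point, the search could look something like the sketch below, which counts how many reviews mention each term. It assumes each review is a dict whose text lives under a `review` key (an assumption about the scraper's output, adjustable through `text_key`), and the sample reviews are made up for illustration. `count_term_mentions` is a hypothetical helper.

```python
import re

SEARCH_TERMS = ["insurance", "love", "refund", "hate", "worth", "customer service"]

# One case-insensitive pattern per term; \b keeps "love" from matching "glove".
# Inflected forms such as "loved" or "refunds" are NOT matched by this sketch.
PATTERNS = {
    term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    for term in SEARCH_TERMS
}


def count_term_mentions(reviews, text_key="review"):
    """Count how many reviews mention each search term at least once."""
    counts = {term: 0 for term in SEARCH_TERMS}
    for review in reviews:
        text = review.get(text_key) or ""
        for term, pattern in PATTERNS.items():
            if pattern.search(text):
                counts[term] += 1
    return counts


# Made-up reviews, for illustration only
sample_reviews = [
    {"review": "They would not take my insurance and I could not get a refund."},
    {"review": "I LOVE my therapist, totally worth it."},
    {"review": "Customer service never replied."},
]
print(count_term_mentions(sample_reviews))
```

Counting reviews rather than total occurrences keeps one long, repetitive review from dominating a term's tally.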

### Insurance and cost issues

## Getting app details (ignore for now)

In [None]:
#potentially delete next few rows

In [4]:
scraper = AppStoreScraper()

In [13]:
data = scraper.get_multiple_app_details([661829386])

In [14]:
df = pd.DataFrame(data)

https://itunes.apple.com/lookup?id=661829386&country=nl&entity=software


Unnamed: 0,screenshotUrls,ipadScreenshotUrls,appletvScreenshotUrls,artworkUrl60,artworkUrl512,artworkUrl100,artistViewUrl,features,isGameCenterEnabled,supportedDevices,advisories,kind,minimumOsVersion,trackCensoredName,languageCodesISO2A,fileSizeBytes,sellerUrl,formattedPrice,contentAdvisoryRating,averageUserRatingForCurrentVersion,userRatingCountForCurrentVersion,averageUserRating,trackViewUrl,trackContentRating,bundleId,trackId,trackName,releaseDate,primaryGenreName,genreIds,isVppDeviceBasedLicensingEnabled,currentVersionReleaseDate,sellerName,releaseNotes,primaryGenreId,currency,description,artistId,artistName,genres,price,version,wrapperType,userRatingCount
0,https://is4-ssl.mzstatic.com/image/thumb/Purpl...,https://is4-ssl.mzstatic.com/image/thumb/Purpl...,,https://is4-ssl.mzstatic.com/image/thumb/Purpl...,https://is4-ssl.mzstatic.com/image/thumb/Purpl...,https://is4-ssl.mzstatic.com/image/thumb/Purpl...,https://apps.apple.com/nl/developer/groop-inte...,iosUniversal,False,"iPhone5s-iPhone5s,iPadAir-iPadAir,iPadAirCellu...","Soms/Milde medische/behandelingsinformatie,Som...",software,13.0,Talkspace Therapy & Counseling,EN,109459456,https://www.talkspace.com,Gratis,12+,0,0,0,https://apps.apple.com/nl/app/talkspace-therap...,12+,com.talktala.talktala,661829386,Talkspace Therapy & Counseling,2013-07-11T17:51:11Z,Health & Fitness,60136020,True,2022-02-02T17:44:11Z,Groop Internet Platform inc.,This release improves performance on certain d...,6013,EUR,Talkspace is the most convenient and affordabl...,661829389,Groop Internet Platform inc.,"Gezondheid en fitness,Geneeskunde",0.0,8.92.85,software,0


In [None]:
#end of rows to potentially delete