# AppStore 
As of 2022, Apple's App Store was home to some 1.76 million apps and over 460,000 games. The aim of this exercise is to obtain app rating and review data for selected categories, in support of sentiment analysis, product analytics, opportunity discovery, and selection. Our data acquisition centers on three entities: AppData, Rating, and Review, which are described below.

## AppData
The AppData entity encapsulates the core data for each app and is defined as follows.  

| #  | attribute    | type  | description                                 | API Field                      |
|----|--------------|-------|---------------------------------------------|--------------------------------|
| 1  | id           | int   | Unique Apple app identifier                 | trackId                        |
| 2  | name         | str   | Name of the app                             | trackName                      |
| 3  | description  | str   | Description of the app                      | description                    |
| 4  | category_id  | int   | Four-digit category identifier              | primaryGenreId                 |
| 5  | category     | str   | Category name                               | primaryGenreName               |
| 6  | price        | float | Cost of the app                             | price                          |
| 7  | rating       | float | Average user rating                         | averageUserRating              |
| 8  | ratings      | int   | Number of user ratings                      | userRatingCount                |
| 9  | developer_id | int   | App developer identifier                    | artistId                       |
| 10 | developer    | str   | App developer name                          | artistName                     |
| 11 | released     | str   | Date of initial release                     | releaseDate                    |
| 12 | source       | str   | Host from which the data were obtained      | itunes.apple.com (fixed value) |
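
To make the mapping between attributes and API fields concrete, here is a minimal sketch of the entity as a Python dataclass. The class and the `from_api` helper are illustrative assumptions, not the actual `aimobile` implementation; only the attribute names, types, and API field names come from the table above. Defaulting missing price and rating fields to zero is also an assumption.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class AppData:
    """Illustrative container mirroring the AppData attribute table."""

    id: int
    name: str
    description: str
    category_id: int
    category: str
    price: float
    rating: float
    ratings: int
    developer_id: int
    developer: str
    released: str
    source: str = "itunes.apple.com"

    @classmethod
    def from_api(cls, result: Mapping[str, Any]) -> "AppData":
        """Build an AppData from one API result dict (hypothetical helper)."""
        return cls(
            id=int(result["trackId"]),
            name=result["trackName"],
            description=result["description"],
            category_id=int(result["primaryGenreId"]),
            category=result["primaryGenreName"],
            # Assumption: these fields may be absent, so fall back to zero.
            price=float(result.get("price", 0.0)),
            rating=float(result.get("averageUserRating", 0.0)),
            ratings=int(result.get("userRatingCount", 0)),
            developer_id=int(result["artistId"]),
            developer=result["artistName"],
            released=result["releaseDate"],
        )
```

A record would then be built with `AppData.from_api(result)`, where `result` is one entry of the API response.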

The data acquisition pipeline will obtain app data for the following categories and persist it in an RDBMS table.

1. business
2. education
3. entertainment
4. health
5. lifestyle
6. medical
7. productivity
8. social_networking


### Imports

In [1]:
from aimobile.service.appstore.controller import AppStoreAppController, AppStoreReviewController
from aimobile.container import AIMobileContainer

In [2]:
TERMS = ["business", "education", "entertainment", "health", "lifestyle", "medical", "productivity", "social_networking"]
CATEGORIES = [6000, 6017, 6016, 6013, 6012, 6020, 6007, 6005]
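
The two lists appear to be positionally aligned, with each search term paired with the App Store genre ID at the same index. Assuming that reading is correct, a small lookup table makes the pairing explicit and guards against the lists drifting out of sync. `CATEGORY_BY_TERM` is a hypothetical name used only for illustration.

```python
TERMS = ["business", "education", "entertainment", "health", "lifestyle",
         "medical", "productivity", "social_networking"]
CATEGORIES = [6000, 6017, 6016, 6013, 6012, 6020, 6007, 6005]

# Fail early if an entry is added to one list but not the other.
if len(TERMS) != len(CATEGORIES):
    raise ValueError("TERMS and CATEGORIES must have the same length")

# Assumption: the i-th term corresponds to the i-th genre ID.
CATEGORY_BY_TERM = dict(zip(TERMS, CATEGORIES))
```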

### Dependencies

In [3]:
container = AIMobileContainer()
container.init_resources()
container.wire(packages=["aimobile.service.appstore"])

### AppData Scraper
The AppStoreAppController object iterates through the TERMS defined above, engaging a scraper to extract the app data described in the AppData section from the App Store. The results are persisted in an RDBMS and archived.
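
How the scraper turns a term into a request is not shown in this notebook. As a rough sketch, assuming it queries the public iTunes Search API (whose result fields match the API Field column of the AppData table), the per-term URLs could be built as follows. The function names, the `limit` and `country` defaults, and the replacement of underscores with spaces are all assumptions made for illustration.

```python
from typing import Iterable, List
from urllib.parse import urlencode

# Assumption: the scraper targets the public iTunes Search API endpoint.
SEARCH_URL = "https://itunes.apple.com/search"


def build_search_url(term: str, limit: int = 200, country: str = "us") -> str:
    """Return a search URL for one term (hypothetical helper)."""
    params = {
        # Assumption: terms such as "social_networking" are searched with spaces.
        "term": term.replace("_", " "),
        "media": "software",
        "country": country,
        "limit": limit,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def build_search_urls(terms: Iterable[str]) -> List[str]:
    """Build one search URL per term, preserving the input order."""
    return [build_search_url(term) for term in terms]
```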

In [4]:
controller = AppStoreAppController()
controller.scrape(terms=TERMS)
controller.summarize()


[04/22/2023 10:19:52 AM] [INFO] [AppStoreAppController] [scrape] : 

Appstore AppData Scraped Status is Complete. Skipping App Store App Data Scraping Operation.


| Category          | App Count | Average Rating | Average Rating Count | Total Rating Count |
|-------------------|-----------|----------------|----------------------|--------------------|
| Medical           | 138680    | 1.54           | 203.74               | 28255057           |
| Health & Fitness  | 67879     | 2.74           | 1189.69              | 80755198           |
| Social Networking | 63761     | 1.78           | 994.0                | 63378258           |
| Education         | 27807     | 2.56           | 2281.81              | 63450186           |
| Business          | 27534     | 2.68           | 2321.21              | 63912296           |
| Lifestyle         | 21507     | 3.09           | 3687.58              | 79308887           |
| Games             | 19887     | 4.09           | 12061.64             | 239869878          |
| Productivity      | 15179     | 3.11           | 4786.31              | 72651335           |
| Utilities         | 12559     | 2.8            | 3567.25              | 44801098           |
| Entertainment     | 10278     | 3.33           | 10063.51             | 103432717          |
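
The summary columns are consistent with a per-category aggregation sorted by app count: for Medical, 28,255,057 total ratings across 138,680 apps gives roughly 203.74 ratings per app. The internals of `summarize()` are not shown, so the following is only a sketch of one way such a table could be computed. It assumes Average Rating is the unweighted mean of per-app ratings, and the input tuples are toy values, not real data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def summarize_apps(apps: Iterable[Tuple[str, float, int]]) -> List[Dict[str, object]]:
    """Aggregate (category, rating, rating_count) tuples by category."""
    grouped = defaultdict(list)
    for category, rating, rating_count in apps:
        grouped[category].append((rating, rating_count))

    summary = []
    for category, rows in grouped.items():
        app_count = len(rows)
        total_ratings = sum(count for _, count in rows)
        summary.append({
            "Category": category,
            "App Count": app_count,
            # Assumption: simple (unweighted) mean of per-app ratings.
            "Average Rating": round(sum(r for r, _ in rows) / app_count, 2),
            "Average Rating Count": round(total_ratings / app_count, 2),
            "Total Rating Count": total_ratings,
        })

    # Largest categories first, matching the ordering of the output above.
    summary.sort(key=lambda row: row["App Count"], reverse=True)
    return summary


# Toy input for illustration only.
rows = summarize_apps([("A", 4.0, 10), ("A", 2.0, 30), ("B", 5.0, 1)])
```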


### Review Scraper
AppStoreReviewController manages the extraction of review data from the App Store. For each category ID in CATEGORIES, a scraper retrieves the reviews for every app of that category in the repository. As before, the results are stored in an RDBMS and archived.

In [5]:
controller = AppStoreReviewController()
controller.scrape(category_ids=CATEGORIES)
controller.summarize()
controller.archive()

[04/22/2023 10:20:37 AM] [DEBUG] [AppStoreReviewScraper] [_set_url] : 
URL: Start Index: 0 End Index: 400.
[04/22/2023 10:20:45 AM] [DEBUG] [urllib3.connectionpool] [_new_conn] : Starting new HTTPS connection (1): itunes.apple.com:443
[04/22/2023 10:20:46 AM] [DEBUG] [urllib3.connectionpool] [_make_request] : https://itunes.apple.com:443 "GET /WebObjects/MZStore.woa/wa/userReviewsRow?id=444553167&displayable-kind=11&startIndex=0&endIndex=400&sort=1 HTTP/1.1" 200 156282
[04/22/2023 10:20:46 AM] [DEBUG] [SessionHandler] [_teardown] : 
Request status code: 200. Session: 0
[04/22/2023 10:20:46 AM] [DEBUG] [AppStoreReviewScraper] [_parse_response] : 
Results returned: 400
[04/22/2023 10:20:46 AM] [DEBUG] [ReviewRepo] [add] : Added 400 rows to the review repository.
[04/22/2023 10:20:46 AM] [DEBUG] [AppStoreReviewScraper] [_set_url] : 
URL: Start Index: 400 End Index: 800.
[04/22/2023 10:20:55 AM] [DEBUG] [urllib3.connectionpool] [_make_request] : https://itunes.apple.com:443 "GET /WebObject
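
The log above shows review requests advancing in windows of 400 (`startIndex=0&endIndex=400`, then 400 to 800). A minimal sketch of that pagination is given below, reusing the URL and query parameters visible in the log. The function names and the `max_pages` cap are hypothetical; a real scraper would presumably stop when a page returns fewer results than requested, but the notebook does not show that logic.

```python
from typing import Iterator
from urllib.parse import urlencode

# Endpoint and parameters as they appear in the request log.
REVIEW_URL = "https://itunes.apple.com/WebObjects/MZStore.woa/wa/userReviewsRow"
PAGE_SIZE = 400  # window size observed in the log


def review_page_url(app_id: int, start: int, page_size: int = PAGE_SIZE) -> str:
    """Return the URL for one window of reviews (hypothetical helper)."""
    params = {
        "id": app_id,
        "displayable-kind": 11,
        "startIndex": start,
        "endIndex": start + page_size,
        "sort": 1,
    }
    return f"{REVIEW_URL}?{urlencode(params)}"


def review_page_urls(app_id: int, max_pages: int,
                     page_size: int = PAGE_SIZE) -> Iterator[str]:
    """Yield URLs for consecutive review windows, up to max_pages of them."""
    for page in range(max_pages):
        yield review_page_url(app_id, page * page_size, page_size)
```

For example, `list(review_page_urls(444553167, max_pages=2))` produces the two windows seen in the log.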

In [None]:
controller.summarize()