# AppStore 
As of 2022, Apple's App Store was home to some 1.76 million apps and over 460,000 games. The aim of this exercise is to obtain app rating and review data for selected categories to support sentiment analysis, product analytics, and opportunity discovery and selection. Our data acquisition centers on three entities: AppData, Rating, and Review, which are described below.

## AppData
The AppData entity encapsulates the core data for each app and is defined as follows.  

| #  | Attribute    | Type  | Description                                 | API Field         |
|----|--------------|-------|---------------------------------------------|-------------------|
| 1  | id           | int   | Unique Apple app identifier                 | trackId           |
| 2  | name         | str   | Name of the app                             | trackName         |
| 3  | description  | str   | Description of the app                      | description       |
| 4  | category_id  | int   | Four-digit category identifier              | primaryGenreId    |
| 5  | category     | str   | Category name                               | primaryGenreName  |
| 6  | price        | float | Cost of the app                             | price             |
| 7  | rating       | float | Average user rating                         | averageUserRating |
| 8  | ratings      | int   | Number of user ratings                      | userRatingCount   |
| 9  | developer_id | int   | App developer identifier                    | artistId          |
| 10 | developer    | str   | App developer name                          | artistName        |
| 11 | released     | str   | Date of initial release                     | releaseDate       |
| 12 | source       | str   | Host from which the data were obtained      | itunes.apple.com  |
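
As a rough illustration of the mapping in the table, the entity could be expressed as a dataclass that is populated from one API result. This is a minimal sketch, not the aimobile implementation: the `from_api` helper and the fallback values for missing fields are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class AppData:
    """Illustrative container for the twelve attributes in the table above."""

    id: int
    name: str
    description: str
    category_id: int
    category: str
    price: float
    rating: float
    ratings: int
    developer_id: int
    developer: str
    released: str
    source: str

    @classmethod
    def from_api(
        cls, result: Mapping[str, Any], source: str = "itunes.apple.com"
    ) -> "AppData":
        """Build an AppData from one API result dict (hypothetical helper).

        Fields that may be absent fall back to an empty string or zero;
        that default policy is an assumption, not documented behavior.
        """
        return cls(
            id=int(result["trackId"]),
            name=result["trackName"],
            description=result.get("description", ""),
            category_id=int(result["primaryGenreId"]),
            category=result["primaryGenreName"],
            price=float(result.get("price", 0.0)),
            rating=float(result.get("averageUserRating", 0.0)),
            ratings=int(result.get("userRatingCount", 0)),
            developer_id=int(result["artistId"]),
            developer=result["artistName"],
            released=result.get("releaseDate", ""),
            source=source,
        )
```

Calling `AppData.from_api(result)` on a single result dict yields an immutable record whose attribute names match the left-hand column of the table.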

The data acquisition pipeline will obtain app data for the following categories and persist them in an RDBMS table.

1. business
2. education
3. entertainment
4. health
5. lifestyle
6. medical
7. productivity
8. social_networking
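
To make the persistence step concrete, the sketch below stores AppData rows in a relational table. It assumes SQLite and a table named `appdata` purely for illustration; the actual RDBMS, schema, and repository code used by the pipeline are not shown in this notebook. No uniqueness constraint is placed on `id`, since the notebook deduplicates in a later step.

```python
import sqlite3

# Hypothetical schema: columns follow the AppData attribute table.
# The table name and the choice of SQLite are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS appdata (
    id            INTEGER NOT NULL,
    name          TEXT NOT NULL,
    description   TEXT,
    category_id   INTEGER,
    category      TEXT,
    price         REAL,
    rating        REAL,
    ratings       INTEGER,
    developer_id  INTEGER,
    developer     TEXT,
    released      TEXT,
    source        TEXT
)
"""

COLUMNS = (
    "id", "name", "description", "category_id", "category", "price",
    "rating", "ratings", "developer_id", "developer", "released", "source",
)


def save_apps(conn, rows):
    """Insert rows (dicts keyed by COLUMNS) and return the table's row count."""
    conn.execute(SCHEMA)
    placeholders = ", ".join(f":{column}" for column in COLUMNS)
    sql = f"INSERT INTO appdata ({', '.join(COLUMNS)}) VALUES ({placeholders})"
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(sql, rows)
    return conn.execute("SELECT COUNT(*) FROM appdata").fetchone()[0]
```

With `conn = sqlite3.connect(":memory:")`, calling `save_apps(conn, rows)` creates the table on first use and appends the given rows.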


### Imports

In [1]:
from aimobile.data.appstore.controller import AppStoreAppController
from aimobile.container import AIMobileContainer

In [2]:
TERMS = ["health", "productivity", "social", "business", "education", "entertainment", "lifestyle", "medical"]

### Dependencies

In [3]:
container = AIMobileContainer()
container.init_resources()
container.wire(packages=["aimobile.service.appstore"])

### AppData Scraper
The AppStoreAppController object iterates over the search terms in TERMS, engaging a scraper to extract the app data described above from the App Store. The results are persisted in an RDBMS and archived.
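
The term-by-term loop described above might look roughly like the following. This is a sketch under stated assumptions, not the controller's actual code: it assumes the data come from the public iTunes Search API endpoint, which is consistent with the host and field names in the AppData table, and the `build_url`, `http_get`, and `scrape` functions are hypothetical names. Paging, throttling, and retries are omitted.

```python
import json
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint (iTunes Search API); the real scraper may differ.
SEARCH_URL = "https://itunes.apple.com/search"


def build_url(term: str, limit: int = 200) -> str:
    """Compose a software search URL for one term."""
    query = urlencode(
        {"term": term, "media": "software", "entity": "software", "limit": limit}
    )
    return f"{SEARCH_URL}?{query}"


def http_get(url: str) -> str:
    """Default fetcher: a plain HTTP GET returning the response body."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def scrape(
    terms: Iterable[str], fetch: Callable[[str], str] = http_get
) -> Dict[str, List[dict]]:
    """Return the raw result dicts for each term, keyed by term."""
    results = {}
    for term in terms:
        payload = json.loads(fetch(build_url(term)))
        results[term] = payload.get("results", [])
    return results
```

Passing the fetcher in as a parameter keeps the loop testable without network access: a stub that returns canned JSON can stand in for `http_get`.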

In [4]:
controller = AppStoreAppController()
controller.scrape(terms=TERMS)
controller.summarize()


[04/26/2023 08:00:22 PM] [INFO] [AppStoreAppController] [scrape] : 

Appstore AppData Scraped Status is Complete. Skipping App Store App Data Scraping Operation.


| # | Category          | App Count | Average Rating | Average Rating Count | Total Rating Count |
|---|-------------------|-----------|----------------|----------------------|--------------------|
| 0 | Medical           | 71617     | 1.52           | 405.38               | 29032210           |
| 1 | Health & Fitness  | 40353     | 3.03           | 3660.09              | 147695666          |
| 2 | Social Networking | 31615     | 2.93           | 2037.27              | 64408273           |
| 3 | Business          | 22314     | 3.18           | 2877.8               | 64215156           |
| 4 | Education         | 20386     | 3.16           | 3225.57              | 65756519           |
| 5 | Games             | 19856     | 4.16           | 12395.58             | 246126698          |
| 6 | Productivity      | 13279     | 3.58           | 5559.64              | 73826499           |
| 7 | Lifestyle         | 12260     | 3.48           | 6686.94              | 81981934           |
| 8 | Utilities         | 9901      | 3.24           | 4712.43              | 46657724           |
| 9 | Entertainment     | 8345      | 3.76           | 12445.95             | 103861437          |
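
The columns in the summary above can be reproduced with a simple group-by over the scraped records. The sketch below is an illustration only, since the internals of `controller.summarize()` are not shown here. It assumes each record is a dict with `category`, `rating`, and `ratings` keys, and it computes Average Rating Count as the total rating count divided by the app count, which matches the figures reported above.

```python
from collections import defaultdict
from statistics import mean


def summarize(apps):
    """Aggregate per-category statistics from a list of app dicts."""
    by_category = defaultdict(list)
    for app in apps:
        by_category[app["category"]].append(app)

    summary = []
    for category, group in by_category.items():
        counts = [app["ratings"] for app in group]
        summary.append({
            "Category": category,
            "App Count": len(group),
            "Average Rating": round(mean(app["rating"] for app in group), 2),
            "Average Rating Count": round(mean(counts), 2),
            "Total Rating Count": sum(counts),
        })
    # Largest categories first, matching the ordering of the summary above.
    return sorted(summary, key=lambda row: row["App Count"], reverse=True)
```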


### AppData Exploration

In [5]:
repo = container.data.appdata_repo()
repo.dedup()

[04/26/2023 08:02:38 PM] [INFO] [AppStoreAppDataRepo] [dedup] : Removed 131105 duplicates from the appdata repository.
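
For readers unfamiliar with the step, deduplication of this kind can be sketched as follows. The example assumes that duplicates are records sharing the same app `id` and that the first record seen is kept; the actual key and policy used by `repo.dedup()` are not documented in this notebook. Duplicates plausibly arise because the same app can be returned for more than one search term, though that too is an assumption.

```python
def dedup(apps):
    """Keep the first record seen for each app id.

    Returns a tuple of (unique_records, number_removed).
    """
    seen = set()
    unique = []
    for app in apps:
        if app["id"] in seen:
            continue  # a record with this id was already kept
        seen.add(app["id"])
        unique.append(app)
    return unique, len(apps) - len(unique)
```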
