# 1. <a id='toc1_'></a>[AppStore Exploratory Data Analysis](#toc0_)
As of 2022, Apple's App Store was home to some 1.76 million apps and over 460,000 games. App and review data were obtained from the App Store for the following nine categories:
1. business
2. education
3. entertainment
4. health
5. lifestyle
6. medical
7. productivity
9. social_networking

This exploratory data analysis aims to expose patterns and generate insights into customer satisfaction, sentiment, and opinion in the mobile app market. The remainder of this section is organized as follows:

**Table of contents**<a id='toc0_'></a>    
- 1. [AppStore Exploratory Data Analysis](#toc1_)    
  - 1.1. [AppData](#toc1_1_)    
    - 1.1.1. [Descriptive Analysis (Univariate)](#toc1_1_1_)    
      - 1.1.1.1. [Data Types and Missing Values](#toc1_1_1_1_)    

<!-- vscode-jupyter-toc-config
	numbering=true
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

**Imports** 

In [1]:
from aimobile.container import AIMobileContainer

**Dependencies**

In [2]:
container = AIMobileContainer()
container.init_resources()
container.wire(packages=["aimobile.data.acquisition.appstore"])

<a id='appdata'></a>

## 1.1. <a id='toc1_1_'></a>[AppData](#toc0_)
AppData comprises the core descriptive data for each app and contains the following variables:

| #  | attribute    | type  | description                                | API Field         |
|----|--------------|-------|--------------------------------------------|-------------------|
| 1  | id           | int   | Unique Apple app identifier                | trackId           |
| 2  | name         | str   | Name of the app                            | trackName         |
| 3  | description  | str   | Description of the app                     | description       |
| 4  | category_id  | int   | Four-digit category identifier             | primaryGenreId    |
| 5  | category     | str   | Category name                              | primaryGenreName  |
| 6  | price        | float | Cost of the app                            | price             |
| 7  | rating       | float | Average user rating                        | averageUserRating |
| 8  | ratings      | int   | Number of user ratings                     | userRatingCount   |
| 9  | developer_id | int   | App developer identifier                   | artistId          |
| 10 | developer    | str   | App developer name                         | artistName        |
| 11 | released     | str   | Date of initial release                    | releaseDate       |
| 12 | source       | str   | Host from which the data were obtained     | itunes.apple.com  |
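
To make the attribute-to-field mapping in the table above concrete, here is a minimal sketch of how a single API result might be renamed into an AppData record. The names `FIELD_MAP` and `to_appdata` and the sample result are hypothetical illustrations, not part of the `aimobile` package, which may handle this differently.

```python
from typing import Any, Dict

# Attribute name -> API field, copied from the table above.
FIELD_MAP: Dict[str, str] = {
    "id": "trackId",
    "name": "trackName",
    "description": "description",
    "category_id": "primaryGenreId",
    "category": "primaryGenreName",
    "price": "price",
    "rating": "averageUserRating",
    "ratings": "userRatingCount",
    "developer_id": "artistId",
    "developer": "artistName",
    "released": "releaseDate",
}


def to_appdata(result: Dict[str, Any], source: str = "itunes.apple.com") -> Dict[str, Any]:
    """Rename API fields to AppData attributes; absent fields become None."""
    record = {attr: result.get(field) for attr, field in FIELD_MAP.items()}
    record["source"] = source  # not an API field: records the host that was queried
    return record


# Hypothetical API result, invented for illustration only.
example = {"trackId": 1, "trackName": "Demo App", "price": 0.0, "artistName": "Demo Dev"}
record = to_appdata(example)
print(record)
```

Using `dict.get` means a result that lacks a field (for example, an app with no ratings yet) yields `None` rather than raising a `KeyError`.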

Let's instantiate the appdata repository, remove the duplicates (should any exist) and get a sense of the overall profile of the data.

In [3]:
appdata_repo = container.data.appdata_repo()
appdata_repo.dedup()

### 1.1.1. <a id='toc1_1_1_'></a>[Descriptive Analysis (Univariate)](#toc0_)

#### 1.1.1.1. <a id='toc1_1_1_1_'></a>[Data Types and Missing Values](#toc0_)

The following table provides information about the appdata attributes, data types, and missing values.

In [4]:
appdata_repo.info()

| Attribute    | Dtype   | Non-Null Count | Bytes     | Cardinality |
|--------------|---------|----------------|-----------|-------------|
| id           | int64   | 229940         | 1839520   | 200871      |
| name         | object  | 229940         | 19285232  | 200691      |
| description  | object  | 229940         | 708288356 | 197531      |
| category_id  | int64   | 229940         | 1839520   | 26          |
| category     | object  | 229940         | 15456489  | 26          |
| price        | float64 | 229940         | 1839520   | 92          |
| developer_id | int64   | 229940         | 1839520   | 130768      |
| developer    | object  | 229940         | 17994432  | 130533      |
| rating       | float64 | 229940         | 1839520   | 39290       |
| ratings      | int64   | 229940         | 1839520   | 14647       |
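
For readers without access to the repository, a profile with the same four columns can be approximated in plain pandas. This is a minimal sketch on toy data; the `profile` helper is hypothetical, and it is an assumption that `appdata_repo.info()` computes its columns in a comparable way.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, non-null count, memory footprint, and distinct values."""
    return pd.DataFrame(
        {
            "Dtype": df.dtypes.astype(str),
            "Non-Null Count": df.notna().sum(),
            # deep=True measures the actual size of Python string objects
            "Bytes": df.memory_usage(index=False, deep=True),
            # nunique() ignores missing values by default
            "Cardinality": df.nunique(),
        }
    )


# Toy data, invented for illustration only.
toy = pd.DataFrame(
    {
        "id": [1, 2, 2],
        "name": ["a", "b", None],
        "price": [0.0, 0.99, 0.99],
    }
)
info = profile(toy)
print(info)
```

A Non-Null Count equal to the number of rows means the column has no missing values, and a Cardinality below the row count means at least one value repeats.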


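Every attribute has 229,940 non-null values, so there are no missing values. We appear to have no duplicate rows; however, the cardinality of `id` (200,871) is lower than the row count (229,940), which indicates that 29,069 rows share an id with an earlier row.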
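
The distinction between duplicate rows and duplicate ids can be checked directly in pandas. The sketch below uses toy data rather than the repository, and the case it shows (one id listed under two categories) is only one way ids can repeat without whole rows repeating; the source does not establish the actual cause.

```python
import pandas as pd

# Toy data, invented for illustration only: id 2 appears twice.
toy = pd.DataFrame(
    {
        "id": [1, 2, 2, 3],
        "category": ["Finance", "Finance", "Business", "Games"],
        "rating": [4.0, 3.5, 3.5, 5.0],
    }
)

# Rows that are identical to an earlier row in every column.
n_dup_rows = int(toy.duplicated().sum())

# Rows beyond the first occurrence of each id (row count minus cardinality).
n_dup_ids = len(toy) - toy["id"].nunique()

# All rows whose id occurs more than once, kept together for inspection.
repeated = toy[toy["id"].duplicated(keep=False)]

print(n_dup_rows, n_dup_ids)
print(repeated)
```

Inspecting `repeated` side by side is usually the quickest way to see which columns differ between rows that share an id.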

In [5]:
appdata_repo.summary



**Appdata Repository Summary**

| Statistic    | Value     |
|--------------|-----------|
| Examples     | 229940    |
| Variables    | 12        |
| Size (Bytes) | 806685437 |



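| #  | Category          | Examples | Apps  | Average Rating | Rating Count |
|----|-------------------|----------|-------|----------------|--------------|
| 0  | Finance           | 37287    | 14545 | 4.1            | 448935205    |
| 1  | Medical           | 33757    | 33245 | 1.62           | 16720187     |
| 2  | Social Networking | 32458    | 32146 | 1.78           | 56242780     |
| 3  | Health & Fitness  | 28229    | 27610 | 3.19           | 69230739     |
| 4  | Business          | 15734    | 13796 | 3.05           | 75173086     |
| 5  | Education         | 12950    | 12457 | 3.1            | 52517795     |
| 6  | Games             | 12898    | 12643 | 4.21           | 199500174    |
| 7  | Lifestyle         | 11642    | 11075 | 3.28           | 77736895     |
| 8  | Productivity      | 8395     | 7908  | 3.44           | 68914704     |
| 9  | Utilities         | 7680     | 7334  | 3.32           | 45575904     |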
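
A per-category summary of this shape can be sketched with a pandas `groupby`. The column definitions below are assumptions rather than the repository's documented behavior: Examples as the row count, Apps as the number of distinct ids, Average Rating as the mean of `rating`, and Rating Count as the sum of `ratings`. The data are toy values.

```python
import pandas as pd

# Toy data, invented for illustration only: id 1 appears twice in Finance.
toy = pd.DataFrame(
    {
        "id": [1, 1, 2, 3],
        "category": ["Finance", "Finance", "Finance", "Games"],
        "rating": [4.0, 4.0, 5.0, 3.0],
        "ratings": [10, 10, 20, 5],
    }
)

summary = (
    toy.groupby("category")
    .agg(
        examples=("id", "count"),          # rows per category
        apps=("id", "nunique"),            # distinct app ids per category
        average_rating=("rating", "mean"),
        rating_count=("ratings", "sum"),
    )
    .sort_values("examples", ascending=False)
    .reset_index()
)
print(summary)
```

Under these definitions, a gap between `examples` and `apps` within a category corresponds to repeated ids, and repeated rows also inflate `rating_count`, so deduplicating by id before aggregating may change the totals.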
