# 1. <a id='toc1_'></a>[AppStore Exploratory Data Analysis](#toc0_)
As of 2022, Apple's App Store was home to some 1.76 million apps and over 460,000 games. App and review data were obtained from the App Store for the following nine categories:
1. business
2. education
3. entertainment
4. health
5. lifestyle
6. medical
7. productivity
9. social_networking

This exploratory data analysis is undertaken to expose patterns and generate insights into customer satisfaction, sentiment and opinion in the mobile app markets. The remainder of this section is organized as follows:

**Table of contents**<a id='toc0_'></a>    
- 1. [AppStore Exploratory Data Analysis](#toc1_)    
  - 1.1. [AppData](#toc1_1_)    
    - 1.1.1. [Descriptive Analysis (Univariate)](#toc1_1_1_)    
      - 1.1.1.1. [Data Types and Missing Values](#toc1_1_1_1_)    

<!-- vscode-jupyter-toc-config
	numbering=true
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

**Imports** 

In [1]:
from aimobile.container import AIMobileContainer

**Dependencies**

In [2]:
container = AIMobileContainer()
container.init_resources()
container.wire(packages=["aimobile.service.appstore"])

<a id='appdata'></a>

## 1.1. <a id='toc1_1_'></a>[AppData](#toc0_)
AppData comprises the core descriptive data for each app and contains the following variables:

| #  | attribute    | type  | description                                | API Field         |
|----|--------------|-------|--------------------------------------------|-------------------|
| 1  | id           | int   | Unique Apple app identifier                | trackId           |
| 2  | name         | str   | Name of the app                            | trackName         |
| 3  | description  | str   | Description                                | description       |
| 4  | category_id  | int   | Four-digit category identifier             | primaryGenreId    |
| 5  | category     | str   | Category name                              | primaryGenreName  |
| 6  | price        | float | Cost of the app                            | price             |
| 7  | rating       | float | The average user rating                    | averageUserRating |
| 8  | ratings      | int   | The rating count                           | userRatingCount   |
| 9  | developer_id | int   | The app developer identifier               | artistId          |
| 10 | developer    | str   | The app developer name                     | artistName        |
| 11 | released     | str   | The date of initial release                | releaseDate       |
| 12 | source       | str   | The host from which the data were obtained | itunes.apple.com  |
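
As a rough illustration of how the API fields in the table could map onto AppData attributes, here is a minimal sketch. The field names come from the table; the `to_appdata` helper and the sample record are hypothetical and are not part of the `aimobile` package.

```python
# Attribute -> API field mapping, copied from the table above.
API_FIELDS = {
    "id": "trackId",
    "name": "trackName",
    "description": "description",
    "category_id": "primaryGenreId",
    "category": "primaryGenreName",
    "price": "price",
    "rating": "averageUserRating",
    "ratings": "userRatingCount",
    "developer_id": "artistId",
    "developer": "artistName",
    "released": "releaseDate",
}


def to_appdata(result: dict, source: str = "itunes.apple.com") -> dict:
    """Hypothetical helper: rename API fields to AppData attributes.

    Fields absent from the API result are set to None.
    """
    row = {attr: result.get(field) for attr, field in API_FIELDS.items()}
    row["source"] = source  # the host is recorded as a value, not read from the result
    return row


# Made-up API result, for illustration only.
sample = {"trackId": 1, "trackName": "Example App", "price": 0.0, "artistId": 42}
row = to_appdata(sample)
```

Keeping the mapping in a single dictionary means a renamed attribute only has to be changed in one place.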

Let's instantiate the appdata repository, remove any duplicates, and get a sense of the data's overall profile.

In [3]:
appdata_repo = container.data.appdata_repo()
appdata_repo.dedup()

[04/28/2023 01:18:54 PM] [INFO] [AppStoreAppDataRepo] [dedup] : There are no duplicate rows; however, there may be duplicate ids. Check your data.


### 1.1.1. <a id='toc1_1_1_'></a>[Descriptive Analysis (Univariate)](#toc0_)

#### 1.1.1.1. <a id='toc1_1_1_1_'></a>[Data Types and Missing Values](#toc0_)

The following table provides information about the appdata attributes, data types, and missing values.

In [4]:
appdata_repo.info()

| Column       | Dtype   | Non-Null Count | Bytes     | Cardinality |
|--------------|---------|----------------|-----------|-------------|
| id           | int64   | 192555         | 1540440   | 189753      |
| name         | object  | 192555         | 15440685  | 189563      |
| description  | object  | 192555         | 391006390 | 186449      |
| category_id  | int64   | 192555         | 1540440   | 26          |
| category     | object  | 192555         | 13057010  | 26          |
| price        | float64 | 192555         | 1540440   | 92          |
| developer_id | int64   | 192555         | 1540440   | 122414      |
| developer    | object  | 192555         | 14757485  | 122186      |
| rating       | float64 | 192555         | 1540440   | 36636       |
| ratings      | int64   | 192555         | 1540440   | 13857       |
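
For readers without access to the repository class, a profile with the same four columns can be approximated in plain pandas. This is a sketch under the assumption that the data is held in a `DataFrame`; the `profile` function and the toy frame are hypothetical, and the repository's own `info()` may compute these values differently.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for a per-column data profile."""
    return pd.DataFrame(
        {
            "Dtype": df.dtypes.astype(str),
            "Non-Null Count": df.notna().sum(),
            # deep=True also counts the memory held by Python string objects
            "Bytes": df.memory_usage(deep=True, index=False),
            # nunique() ignores missing values by default
            "Cardinality": df.nunique(),
        }
    )


# Toy frame, for illustration only.
toy = pd.DataFrame({"id": [1, 2, 2], "name": ["a", "b", None]})
toy_profile = profile(toy)
```

A non-null count below the number of rows flags missing values, and a cardinality below the number of rows flags repeated values.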


There appear to be no duplicate rows; however, `id` has a cardinality of 189,753 across 192,555 rows, so 2,802 rows repeat an id that appears in an earlier row.
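
The difference between duplicate rows and duplicate ids can be checked directly in pandas. The following is a minimal sketch on made-up data, assuming the app data is available as a `DataFrame` with an `id` column; it does not use the repository's API.

```python
import pandas as pd

# Toy data (hypothetical): id 2 appears twice, but the two rows differ in `name`.
toy_apps = pd.DataFrame(
    {
        "id": [1, 2, 2, 3],
        "name": ["alpha", "beta", "beta (updated)", "gamma"],
    }
)

# Rows identical across every column.
n_duplicate_rows = int(toy_apps.duplicated().sum())

# Rows beyond the first occurrence of each id: row count minus id cardinality.
n_excess_ids = len(toy_apps) - toy_apps["id"].nunique()

# All rows that share an id with at least one other row, for inspection.
shared_id_rows = toy_apps[toy_apps.duplicated(subset="id", keep=False)]
```

Inspecting `shared_id_rows` shows which columns differ between records with the same id, which helps decide which record to keep.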

In [5]:
appdata_repo.summary



Appdata Repository Summary
	               Examples: | 192555
	              Variables: | 12
	           Size (Bytes): | 656762753



| # | Category          | Examples | Apps  | Average Rating | Rating Count |
|---|-------------------|----------|-------|----------------|--------------|
| 0 | Medical           | 33667    | 33245 | 1.62           | 16647542     |
| 1 | Social Networking | 32410    | 32146 | 1.78           | 55993854     |
| 2 | Health & Fitness  | 28097    | 27609 | 3.18           | 68152920     |
| 3 | Business          | 13875    | 13546 | 2.94           | 60460674     |
| 4 | Games             | 12742    | 12587 | 4.2            | 198175680    |
| 5 | Education         | 12604    | 12354 | 3.08           | 52165199     |
| 6 | Lifestyle         | 11337    | 11019 | 3.25           | 70406288     |
| 7 | Productivity      | 8129     | 7859  | 3.42           | 68619647     |
| 8 | Utilities         | 7151     | 7124  | 3.26           | 42744285     |
| 9 | Entertainment     | 5674     | 5505  | 3.42           | 70839812     |
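
A per-category summary of this shape can be sketched with a pandas `groupby`. The sketch assumes that "Examples" counts rows and "Apps" counts distinct ids, which is a guess based on the column names and not confirmed by the repository code; the toy data is made up.

```python
import pandas as pd

# Toy data (hypothetical): one Games app appears in two rows.
toy = pd.DataFrame(
    {
        "id": [1, 1, 2, 3],
        "category": ["Games", "Games", "Games", "Medical"],
        "rating": [4.0, 4.0, 5.0, 2.0],
        "ratings": [10, 10, 20, 5],
    }
)

category_summary = (
    toy.groupby("category")
    .agg(
        **{
            "Examples": ("id", "size"),  # rows per category (assumed meaning)
            "Apps": ("id", "nunique"),  # distinct ids per category (assumed meaning)
            "Average Rating": ("rating", "mean"),
            "Rating Count": ("ratings", "sum"),
        }
    )
    .sort_values("Examples", ascending=False)
)
```

Under these assumptions, the gap between "Examples" and "Apps" in a category is the number of rows that repeat an id within it. Rows with a repeated id also contribute more than once to the average rating and the rating count here.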
