# Project Overview
This is a guided project at the end of module 1. The purpose of this project is to apply the skills learnt in this module, including:
- The basics of programming in Python (arithmetical operations, variables, common data types, etc.)
- Lists and for loops
- Conditional statements
- Dictionaries and frequency tables
- Functions
- Jupyter Notebook

### Project scenario
For this project, we'll pretend we're working as data analysts for a company that builds Android and iOS mobile apps. We make our apps available on Google Play and the App Store.

We only build apps that are free to download and install, and our main source of revenue is in-app ads. This means our revenue for any given app depends mostly on the number of users: the more users who see and engage with the ads, the better. Our goal for this project is to analyze data to help our developers understand what types of apps are likely to attract more users.

In [1]:
from csv import reader

# Load the App Store data; the with-block closes the file after reading
with open('Datasets/Apple iOS Store/AppleStore.csv') as opened_file:
    ios_apps_data = list(reader(opened_file))

# Load the Google Play data
with open('Datasets/Google Play Store/googleplaystore.csv') as opened_file:
    gp_apps_data = list(reader(opened_file))

def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n') # leaves two blank lines after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))
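
The loading steps in this cell are the same for both files, so they could be wrapped in a small helper. The following is a minimal sketch, not part of the guided project: the name `load_csv` is hypothetical, and the UTF-8 encoding is an assumption about how the CSV files are stored.

```python
from csv import reader

def load_csv(path, encoding='utf8'):
    """Read a CSV file and return (header, rows) as lists of strings.

    Hypothetical helper; the encoding default is an assumption.
    """
    # The with-block closes the file even if reading raises an error.
    with open(path, encoding=encoding, newline='') as opened_file:
        data = list(reader(opened_file))
    # The first row is the header; the remaining rows are the apps.
    return data[0], data[1:]

# Possible usage with the paths from this notebook:
# ios_header, ios_rows = load_csv('Datasets/Apple iOS Store/AppleStore.csv')
```

Returning the header separately means later row counts would not need the `[1:]` slice.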

## iOS Data Summary

The iOS dataset has 7197 rows (excluding the header) and 17 columns.
\
Here is a [link](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps) to the source data.
\
The columns are:

| 0   | 1      | 2          | 3               | 4             | 5            | 6                                    | 7                                        | 8                                           | 9                                               | 10                  | 11             | 12             | 13                            | 14                                       | 15                            | 16                                 |
|:----|:-------|:-----------|:----------------|:--------------|:-------------|:-------------------------------------|:-----------------------------------------|:--------------------------------------------|:------------------------------------------------|:--------------------|:---------------|:---------------|:------------------------------|:-----------------------------------------|:------------------------------|:-----------------------------------|
|     | id     | track_name | size_bytes      | currency      | price        | rating_count_tot                     | rating_count_ver                         | user_rating                                 | user_rating_ver                                 | ver                 | cont_rating    | prime_genre    | sup_devices.num               | ipadSc_urls.num                          | lang.num                      | vpp_lic                            |
| Row | App ID | App Name   | Size (in Bytes) | Currency Type | Price amount | User Rating counts (for all versions) | User Rating counts (for current version) | Average User Rating value (for all versions) | Average User Rating value (for current version) | Latest version code | Content Rating | Primary Genre  | Number of supporting devices  | Number of screenshots shown for display  | Number of supported languages | Vpp Device Based Licensing Enabled |
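
Because each row is a plain list, a column has to be accessed by its index. One way to avoid hard-coding the numbers from the table is to build a dictionary from the header row. This is a minimal sketch: the function name `build_column_lookup` is hypothetical, and the header list is copied from the dataset's header row.

```python
# Header row of the iOS dataset (first row of the CSV file).
ios_header = ['', 'id', 'track_name', 'size_bytes', 'currency', 'price',
              'rating_count_tot', 'rating_count_ver', 'user_rating',
              'user_rating_ver', 'ver', 'cont_rating', 'prime_genre',
              'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']

def build_column_lookup(header):
    """Map each column name to its position in the row (hypothetical helper)."""
    lookup = {}
    for index, name in enumerate(header):
        lookup[name] = index
    return lookup

ios_columns = build_column_lookup(ios_header)
print(ios_columns['price'])        # 5
print(ios_columns['prime_genre'])  # 12
```

With this lookup, `row[ios_columns['prime_genre']]` reads the genre of a row without relying on a remembered index.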



In [2]:
print('iOS Data Summary')
print('Number of rows excl header: ', len(ios_apps_data[1:]))
print('Number of columns: ', len(ios_apps_data[0]))
print('\n')
explore_data(ios_apps_data, 0, 6)

iOS Data Summary
Number of rows excl header:  7197
Number of columns:  17


['', 'id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['1', '281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99', '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5', '10', '1']


['2', '281796108', 'Evernote - stay organized', '158578688', 'USD', '0', '161065', '26', '4', '3.5', '8.2.2', '4+', 'Productivity', '37', '5', '23', '1']


['3', '281940292', 'WeatherBug - Local Weather, Radar, Maps, Alerts', '100524032', 'USD', '0', '188583', '2822', '3.5', '4.5', '5.0.0', '4+', 'Weather', '37', '5', '3', '1']


['4', '282614216', 'eBay: Best App to Buy, Sell, Save! Online Shopping', '128512000', 'USD', '0', '262241', '649', '4', '4.5', '5.10.0', '12+', 'Shopping', '37', '5', '9', '1']


['5', '282935706', 'Bible', '92774400'

## Google Play Store Data Summary
The Google Play Store dataset has 10841 rows (excluding the header) and 13 columns.
\
Here is a [link](https://www.kaggle.com/lava18/google-play-store-apps) to the source data.
\
The columns are:


| 0        | 1                           | 2                              | 3                                  | 4               | 5                                             | 6            | 7                | 8                                                                | 9                                   | 10                                               | 11                                                 | 12                           |
|:---------|:----------------------------|:-------------------------------|:-----------------------------------|:----------------|:----------------------------------------------|:-------------|:-----------------|:-----------------------------------------------------------------|:-------------------------------------|:-------------------------------------------------|:---------------------------------------------------|:-----------------------------|
| App      | Category                    | Rating                         | Reviews                            | Size            | Installs                                      | Type         | Price            | Content Rating                                                   | Genres                               | Last Updated                                     | Current Ver                                        | Android Ver                  |
| App name | Category the App belongs to | Overall user rating of the app | Number of user reviews for the app | Size of the app | Number of user downloads/installs for the app | Paid or Free | Price of the app | Age group the app is targeted at - Children / Mature 21+ / Adult | An app can belong to multiple genres | Date when the app was last updated on Play Store | Current version of the app available on Play Store | Min required Android version |
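
Since an app can belong to multiple genres, a `Genres` value may hold several names separated by a semicolon, such as `'Art & Design;Pretend Play'`. A frequency table of genres would therefore need to split each value first. The sketch below illustrates one way to do that under those assumptions: the function name `genre_freq_table` is hypothetical, and the two sample rows are taken from the dataset.

```python
def genre_freq_table(rows, genres_index=9):
    """Count how often each genre appears (hypothetical helper).

    genres_index=9 is the position of the Genres column in the table above.
    """
    table = {}
    for row in rows:
        # Split multi-genre values such as 'Art & Design;Pretend Play'.
        for genre in row[genres_index].split(';'):
            if genre in table:
                table[genre] += 1
            else:
                table[genre] = 1
    return table

# Two data rows from the Google Play dataset.
sample_rows = [
    ['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1',
     '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design',
     'January 7, 2018', '1.0.0', '4.0.3 and up'],
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+',
     'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018',
     '2.0.0', '4.0.3 and up'],
]

print(genre_freq_table(sample_rows))
# {'Art & Design': 2, 'Pretend Play': 1}
```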


In [3]:
print('Google Play Data Summary')
print('Number of rows excl header: ', len(gp_apps_data[1:]))
print('Number of columns: ', len(gp_apps_data[0]))
print('\n')
explore_data(gp_apps_data, 0, 6)

Google Play Data Summary
Number of rows excl header:  10841
Number of columns:  13


['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


['Sketch - Draw & Paint', 'ART_AND_DESIGN', '4.5', '215644', '25M', '50,000,000+', 'Free', '0', 'Teen', 'Art & Design', 'June 8, 2018', 'Varies with device', '4.2 and up']


['Pixel Draw - Number Art 