<div>
    <b>Description:</b> Exploring the Google Play and App Store App Markets<br>
    <b>Author:</b> Maika Carmelle Henry Northrop
</div>
<br>

In [1]:
# import modules
from csv import reader

## Profitable App Profiles for the App Store and Google Play Markets

The objective of this data analysis project is to identify mobile app profiles that are likely to be profitable on the App Store and Google Play markets. As a Data Scientist and Full Stack Web Developer at the XYZ startup company, my job is to facilitate the development of Android and iOS mobile apps and to enable our team and stakeholders to make data-driven decisions about the kinds of apps they should build.




At XYZ startup company, we build apps that are free to download and install, and we take a user-centric approach to designing both the front end and the back end of each app. Our primary source of revenue is in-app ads, so the revenue for any given app depends mostly on the number of people who use it. That is why user-friendly design and the choice of which apps we bring to market are critical elements of our business model. The main goal of this project is therefore to analyze data that helps our team understand what kinds of apps are likely to attract more users.

## Collecting the Data

Presently, there are over 4 million iOS and Android apps available on the market. 

Collecting data on all of these apps would be neither practical nor a sound business strategy, as it would require a considerable amount of time and money. Instead of spending company resources collecting new data ourselves, we've decided to analyze a sample of existing data. The following two data sets serve our purpose:

* A [data set](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/home) containing data about approximately ten thousand Android apps from Google Play
* A [data set](https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/home) containing data about approximately seven thousand iOS apps from the App Store


![Source: Statista 2018](https://github.com/MHNorth/Profitable-App-Profiles-for-the-App-Store-and-Google-Play-Markets-Jupyter-Notebook/blob/master/img/apps_stats.png "App Market Statistics as of 2018")

## Let's explore the Google Play and App Store data sets

The following function follows the DRY (Don't Repeat Yourself) principle so that we can print rows repeatedly in a more readable way. It also takes an optional flag, `rows_and_columns`, that prints the number of rows and columns of any data set.

In [12]:
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row
        
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))
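
As a quick illustration of how the function behaves, here is a minimal sketch that calls it on a tiny, made-up list of rows. The `sample` data is hypothetical and is not taken from either data set; the function body is repeated only so that the snippet runs on its own.

```python
def explore_data(dataset, start, end, rows_and_columns=False):
    # Same logic as the function defined above, repeated so this sketch is self-contained.
    for row in dataset[start:end]:
        print(row)
        print('\n')
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

# Hypothetical toy data (header row already removed).
sample = [
    ['App A', 'GAMES', '4.5'],
    ['App B', 'TOOLS', '3.9'],
    ['App C', 'GAMES', '4.1'],
]

# Prints the first two rows, then the row and column counts.
explore_data(sample, 0, 2, True)
```

Note that the counts describe the whole `dataset`, not the printed slice, and that `len(dataset[0])` assumes the list contains at least one row.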

### Let's begin by opening and reading both data sets.

In [10]:
### The Google Play data set ###
# The with-statement closes the file once it has been read
with open('datasets/googleplaystore.csv', encoding="utf8") as opened_file:
    android = list(reader(opened_file))
android_header = android[0]  # first row holds the column names
android = android[1:]        # remaining rows are the app records

### The App Store data set ###
with open('datasets/AppleStore.csv', encoding="utf8") as opened_file:
    ios = list(reader(opened_file))
ios_header = ios[0]
ios = ios[1:]
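
Because both data sets are loaded with the same steps, the logic could be folded into one helper, in keeping with the DRY idea mentioned earlier. The sketch below is only one possible way to do it: `load_csv` is a hypothetical name that does not appear in the notebook, and the demo writes a small throwaway file so that the snippet runs without the real data sets.

```python
from csv import reader
import os
import tempfile

def load_csv(path, encoding="utf8"):
    """Hypothetical helper: return (header, rows) for a CSV file."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, encoding=encoding, newline="") as f:
        rows = list(reader(f))
    return rows[0], rows[1:]

# Demo on a temporary file with made-up contents.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False,
                                 encoding="utf8", newline="") as tmp:
    tmp.write("App,Price\nDemo App,0\nOther App,1.99\n")
    demo_path = tmp.name

demo_header, demo_rows = load_csv(demo_path)
os.remove(demo_path)  # clean up the throwaway file

print(demo_header)
print(len(demo_rows))
```

With the real files, the equivalent calls would be `load_csv('datasets/googleplaystore.csv')` and `load_csv('datasets/AppleStore.csv')`.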

In [13]:
### Explore Android data set
print(android_header)
print('\n')
explore_data(android, 0, 3, True)

['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']


['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']


['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']


['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']


Number of rows: 10841
Number of columns: 13


In [14]:
### Explore iOS data set
print(ios_header)
print('\n')
explore_data(ios, 0, 3, True)

['', 'id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


['1', '281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99', '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5', '10', '1']


['2', '281796108', 'Evernote - stay organized', '158578688', 'USD', '0', '161065', '26', '4', '3.5', '8.2.2', '4+', 'Productivity', '37', '5', '23', '1']


['3', '281940292', 'WeatherBug - Local Weather, Radar, Maps, Alerts', '100524032', 'USD', '0', '188583', '2822', '3.5', '4.5', '5.0.0', '4+', 'Weather', '37', '5', '3', '1']


Number of rows: 7197
Number of columns: 17
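
Since every row is a plain list, a column has to be located by its position in the header. The sketch below shows one way to look up values by column name instead of hard-coding indices. The header and first row are copied from the output above; the variable names `first_row`, `price_idx`, and `record` are illustrative only.

```python
# Header and first row copied from the App Store output above.
ios_header = ['', 'id', 'track_name', 'size_bytes', 'currency', 'price',
              'rating_count_tot', 'rating_count_ver', 'user_rating',
              'user_rating_ver', 'ver', 'cont_rating', 'prime_genre',
              'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']
first_row = ['1', '281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99',
             '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5',
             '10', '1']

# Option 1: find a column's position by name.
price_idx = ios_header.index('price')
print(price_idx, first_row[price_idx])

# Option 2: pair each column name with its value for one row.
record = dict(zip(ios_header, first_row))
print(record['track_name'], record['prime_genre'])
```

All values are read as strings, so a field such as `price` would need converting with `float()` before any numeric comparison.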
