## 1. Introduction
<p><img src="https://assets.datacamp.com/production/project_1197/img/google_play_store.png" alt="Google Play logo"></p>
<p>Mobile apps are everywhere. They are easy to create and can be very lucrative from the business standpoint. Specifically, Android is expanding as an operating system and has captured more than 74% of the total market<sup><a href="https://www.statista.com/statistics/272698/global-market-share-held-by-mobile-operating-systems-since-2009">[1]</a></sup>. </p>
<p>The Google Play Store apps data has enormous potential to facilitate data-driven decisions and insights for businesses. In this notebook, we will analyze the Android app market by comparing ~10k apps in Google Play across different categories. We will also use the user reviews to draw a qualitative comparison between the apps.</p>
<p>The dataset you will use here was scraped from Google Play Store in September 2018 and was published on <a href="https://www.kaggle.com/lava18/google-play-store-apps">Kaggle</a>. Here are the details: <br>
<br></p>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:20px"><b>datasets/apps.csv</b></div>
This file contains all the details of the apps on Google Play. There are 9 features that describe a given app.
<ul>
    <li><b>App:</b> Name of the app</li>
    <li><b>Category:</b> Category of the app. Some examples are: ART_AND_DESIGN, FINANCE, COMICS, BEAUTY etc.</li>
    <li><b>Rating:</b> The current average rating (out of 5) of the app on Google Play</li>
    <li><b>Reviews:</b> Number of user reviews given on the app</li>
    <li><b>Size:</b> Size of the app in MB (megabytes)</li>
    <li><b>Installs:</b> Number of times the app was downloaded from Google Play</li>
    <li><b>Type:</b> Whether the app is paid or free</li>
    <li><b>Price:</b> Price of the app in US$</li>
    <li><b>Last Updated:</b> Date on which the app was last updated on Google Play </li>

</ul>
</div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:20px"><b>datasets/user_reviews.csv</b></div>
This file contains a random sample of 100 <i><a href="https://www.androidpolice.com/2019/01/21/google-play-stores-redesigned-ratings-and-reviews-section-lets-you-easily-filter-by-star-rating/">most helpful first</a></i> user reviews for each app. The text in each review has been pre-processed and passed through a sentiment analyzer.
<ul>
    <li><b>App:</b> Name of the app on which the user review was provided. Matches the <code>App</code> column of the <code>apps.csv</code> file</li>
    <li><b>Review:</b> The pre-processed user review text</li>
    <li><b>Sentiment Category:</b> Sentiment category of the user review - Positive, Negative or Neutral</li>
    <li><b>Sentiment Score:</b> Sentiment score of the user review. It lies in the range [-1, 1]. A higher score denotes a more positive sentiment.</li>

</ul>
</div>
<p>From here on, it will be your task to explore and manipulate the data until you are able to answer the three questions described in the instructions panel.<br></p>

## Question 1

In [11]:
# Importing Python packages
import pandas as pd
import numpy as np

# Loading apps data
apps = pd.read_csv('datasets/apps.csv')

# Defining function to clean the Installs column
def clean_installs(x):
    # Remove commas and the plus (+) sign
    return x.replace("+", "").replace(",", "")

# Replacing NaN values with "0" first (so the string cleaning never receives a float),
# then applying the function and converting to integer
apps['Installs'] = apps['Installs'].fillna("0").apply(clean_installs).astype(int)

# Return first five rows
apps.head()

Unnamed: 0,App,Category,Rating,Reviews,Size,Installs,Type,Price,Last Updated
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,19.0,10000,Free,0.0,"January 7, 2018"
1,Coloring book moana,ART_AND_DESIGN,3.9,967,14.0,500000,Free,0.0,"January 15, 2018"
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,8.7,5000000,Free,0.0,"August 1, 2018"
3,Sketch - Draw & Paint,ART_AND_DESIGN,4.5,215644,25.0,50000000,Free,0.0,"June 8, 2018"
4,Pixel Draw - Number Art Coloring Book,ART_AND_DESIGN,4.3,967,2.8,100000,Free,0.0,"June 20, 2018"
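
As an alternative to the element-wise function above, the same cleaning can be sketched with pandas' vectorised string methods. This is a minimal sketch, not the notebook's method: the helper name `clean_installs_column` and the sample values are hypothetical, and it assumes the raw `Installs` values look like `"10,000+"`.

```python
import pandas as pd

# Hypothetical sample mimicking raw Installs strings (not the real dataset)
raw = pd.DataFrame({"Installs": ["10,000+", "500,000+", None, "5,000,000+"]})

def clean_installs_column(col: pd.Series) -> pd.Series:
    """Strip '+' and ',' from install counts and return integers (missing -> 0)."""
    cleaned = col.str.replace(r"[+,]", "", regex=True)
    # errors="coerce" turns missing or non-numeric entries into NaN instead of raising
    return pd.to_numeric(cleaned, errors="coerce").fillna(0).astype(int)

raw["Installs"] = clean_installs_column(raw["Installs"])
print(raw["Installs"].tolist())
```

Coercing before `fillna(0)` means an unexpected non-numeric entry becomes 0 rather than stopping the cell with an error. Whether that trade-off is acceptable depends on the analysis.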


## Question 2

In [12]:
# Creating a dataframe containing the number of apps in each category, the average price, and the average rating
# (a list of columns is passed in double brackets; the bare tuple form is not accepted by recent pandas versions)
app_category_info = apps.groupby("Category", as_index=False)[['App', 'Rating', 'Price']].agg({'App': 'count', 'Rating': 'mean', 'Price': 'mean'})

# Renaming the three aggregated columns
app_category_info.rename(columns={"App": "Number of apps", "Rating": "Average rating", "Price": "Average price"}, inplace=True)

# Printing app_category_info
print(app_category_info)

               Category  Number of apps  Average rating  Average price
0        ART_AND_DESIGN              64        4.357377       0.093281
1     AUTO_AND_VEHICLES              85        4.190411       0.158471
2                BEAUTY              53        4.278571       0.000000
3   BOOKS_AND_REFERENCE             222        4.344970       0.539505
4              BUSINESS             420        4.098479       0.417357
5                COMICS              56        4.181481       0.000000
6         COMMUNICATION             315        4.121484       0.263937
7                DATING             171        3.970149       0.160468
8             EDUCATION             119        4.364407       0.150924
9         ENTERTAINMENT             102        4.135294       0.078235
10               EVENTS              64        4.435556       1.718594
11               FAMILY            1832        4.179664       1.309967
12              FINANCE             345        4.115563       8.408203
13    

### Findings: App Category Analysis (Q2)

This analysis summarises, for each category, the number of apps, their average user rating, and their average price. The results are stored in the DataFrame "app_category_info," which contains four columns: Category, Number of apps, Average rating, and Average price.

### Category Distribution

The analysis revealed a diverse distribution of apps across various categories. Here are some key findings:

1. **ART_AND_DESIGN**: This category contains 64 apps, with an average rating of 4.36 and a low average price of $0.09.

2. **AUTO_AND_VEHICLES**: The Auto and Vehicles category contains 85 apps, with an average rating of 4.19 and an average price of $0.16.

3. **BEAUTY**: The Beauty category contains 53 apps with an average rating of 4.28. The average price is $0.00, so the apps in this category are free.

4. **BOOKS_AND_REFERENCE**: This category contains 222 apps, with an average rating of 4.34 and an average price of $0.54.

5. **BUSINESS**: The Business category is well represented with 420 apps. The average rating is 4.10, and the average price is $0.42.

6. **COMICS**: The Comics category contains 56 apps with an average rating of 4.18. The average price is $0.00, so the apps in this category are free.

7. **COMMUNICATION**: Communication apps are abundant, with 315 in this category. The average rating is 4.12, and the average price is $0.26.

8. **DATING**: There are 171 dating apps, with an average rating of 3.97 and an average price of $0.16.

9. **EDUCATION**: The Education category contains 119 apps, with a high average rating of 4.36 and a low average price of $0.15.

10. **ENTERTAINMENT**: There are 102 entertainment apps, with an average rating of 4.14 and a low average price of $0.08.

### Pricing and User Ratings

The average price of apps varies widely across categories. Among the categories shown in the output above, BEAUTY and COMICS have an average price of $0.00, while FINANCE has by far the highest average price at $8.41. Because the mean is sensitive to outliers, a handful of expensive apps can be enough to produce a high category average. Average ratings sit in a much narrower band, from 3.97 (DATING) to 4.44 (EVENTS).

In summary, the per-category app counts, average prices, and average ratings give developers a rough picture of how crowded each category is and what pricing and ratings are typical there.
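
The per-category summary can also be built and ranked in one step with pandas named aggregation, which avoids the separate rename. The sketch below is illustrative only: `sample_apps` is hypothetical data standing in for `apps.csv`, and its values are not taken from the dataset.

```python
import pandas as pd

# Hypothetical rows standing in for apps.csv (values are illustrative only)
sample_apps = pd.DataFrame({
    "App": ["a", "b", "c", "d", "e"],
    "Category": ["FINANCE", "FINANCE", "BEAUTY", "BEAUTY", "EVENTS"],
    "Rating": [4.0, 4.2, 4.5, None, 4.4],
    "Price": [0.0, 9.99, 0.0, 0.0, 1.99],
})

summary = (
    sample_apps.groupby("Category", as_index=False)
    # Named aggregation: output column name -> (input column, aggregation)
    .agg(**{
        "Number of apps": ("App", "count"),
        "Average rating": ("Rating", "mean"),   # missing ratings are skipped
        "Average price": ("Price", "mean"),
    })
    # Most expensive categories first
    .sort_values("Average price", ascending=False)
)
print(summary)
```

Sorting by a column makes the extremes easy to read off, rather than scanning the alphabetical listing.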


## Question 3

In [13]:
# Loading the user reviews data and dropping rows with missing values
user = pd.read_csv('datasets/user_reviews.csv').dropna()

# First five rows
user.head()

Unnamed: 0,App,Review,Sentiment Category,Sentiment Score
0,10 Best Foods for You,I like eat delicious food. That's I'm cooking ...,Positive,1.0
1,10 Best Foods for You,This help eating healthy exercise regular basis,Positive,0.25
3,10 Best Foods for You,Works great especially going grocery store,Positive,0.4
4,10 Best Foods for You,Best idea us,Positive,1.0
5,10 Best Foods for You,Best way,Positive,1.0


In [14]:
# subsetting rows for free finance apps 
free_finance_apps = apps[(apps['Category'] == 'FINANCE') & (apps['Type'] == "Free")]

free_finance_apps.head()

Unnamed: 0,App,Category,Rating,Reviews,Size,Installs,Type,Price,Last Updated
837,K PLUS,FINANCE,4.4,124424,,10000000,Free,0.0,"June 26, 2018"
838,ING Banking,FINANCE,4.4,39041,,1000000,Free,0.0,"August 3, 2018"
839,Citibanamex Movil,FINANCE,3.6,52306,42.0,5000000,Free,0.0,"July 27, 2018"
840,The postal bank,FINANCE,3.7,36718,,5000000,Free,0.0,"July 16, 2018"
841,KTB Netbank,FINANCE,3.8,42644,19.0,5000000,Free,0.0,"June 28, 2018"


In [15]:
# Merging the user dataset and free finance apps
df = user.merge(free_finance_apps, on='App')

df.head()

Unnamed: 0,App,Review,Sentiment Category,Sentiment Score,Category,Rating,Reviews,Size,Installs,Type,Price,Last Updated
0,A+ Mobile,"I rated higher, lowering rating. It simply wor...",Negative,-0.063889,FINANCE,3.9,730,6.3,10000,Free,0.0,"June 26, 2018"
1,A+ Mobile,It tells I need update option update. So I uni...,Positive,0.15625,FINANCE,3.9,730,6.3,10000,Free,0.0,"June 26, 2018"
2,A+ Mobile,issues remembering device. Fingerprint scan al...,Positive,0.416667,FINANCE,3.9,730,6.3,10000,Free,0.0,"June 26, 2018"
3,A+ Mobile,The mobile check deposit ia add easy,Positive,0.433333,FINANCE,3.9,730,6.3,10000,Free,0.0,"June 26, 2018"
4,A+ Mobile,Being prompted rate time already rated annoying,Negative,-0.8,FINANCE,3.9,730,6.3,10000,Free,0.0,"June 26, 2018"
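
The merge above uses pandas' default inner join, so free finance apps with no rows in the reviews table drop out silently. One way to see which apps are affected is a left join with `indicator=True`. This is a minimal sketch on hypothetical data: the frames `reviews` and `finance` and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the reviews table and the free finance apps table
reviews = pd.DataFrame({
    "App": ["A+ Mobile", "A+ Mobile", "Other App"],
    "Sentiment Score": [0.4, -0.8, 0.1],
})
finance = pd.DataFrame({
    "App": ["A+ Mobile", "K PLUS"],
    "Category": ["FINANCE", "FINANCE"],
})

# A left join keeps every finance app; the _merge column records where each row matched
checked = finance.merge(reviews, on="App", how="left", indicator=True)

# Apps flagged "left_only" have no reviews and would vanish in an inner join
no_reviews = checked.loc[checked["_merge"] == "left_only", "App"].tolist()
print(no_reviews)
```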


In [16]:
# Creating a dataframe containing the top 10 free FINANCE apps with the highest average sentiment score
top_10_user_feedback = df.groupby("App",as_index=False)['Sentiment Score'].mean().sort_values('Sentiment Score', ascending = False).head(10)

print(top_10_user_feedback)

                                           App  Sentiment Score
6                                   BBVA Spain         0.515086
4               Associated Credit Union Mobile         0.388093
9                          BankMobile Vibe App         0.353455
0                                    A+ Mobile         0.329592
24   Current debit card and app made for teens         0.327258
7                               BZWBK24 mobile         0.326883
29  Even - organize your money, get paid early         0.283929
21                                Credit Karma         0.270052
32                Fortune City - A Finance App         0.266966
14                                      Branch         0.264230


### Findings: Top 10 Finance Apps with Highest Sentiment Scores (Q3)

This analysis identifies the 10 free finance apps with the highest average sentiment scores, where each review's score lies in the range [-1, 1]. The results are presented in the DataFrame "top_10_user_feedback," which includes two columns: "App" and "Sentiment Score."

**Key Findings:**

1. **BBVA Spain (Sentiment Score: 0.515):** BBVA Spain has the highest average sentiment score, well ahead of the rest of the list and the only score above 0.5.

2. **Associated Credit Union Mobile (Sentiment Score: 0.388):** The Associated Credit Union Mobile app takes second place.

3. **BankMobile Vibe App (Sentiment Score: 0.353):** BankMobile Vibe App follows in third place.

4. **A+ Mobile (Sentiment Score: 0.330):** A+ Mobile ranks fourth.

5. **Current debit card and app made for teens (Sentiment Score: 0.327):** This app, designed for teens, places fifth.

6. **BZWBK24 mobile (Sentiment Score: 0.327):** BZWBK24 mobile ranks sixth, a fraction behind fifth place (0.3269 versus 0.3273).

7. **Even - organize your money, get paid early (Sentiment Score: 0.284):** Even is in seventh position.

8. **Credit Karma (Sentiment Score: 0.270):** Credit Karma ranks eighth.

9. **Fortune City - A Finance App (Sentiment Score: 0.267):** Fortune City is ninth on the list.

10. **Branch (Sentiment Score: 0.264):** Branch completes the top 10.

**Implications:**

These findings highlight the free finance apps whose sampled reviews were scored most positively. All ten averages are above zero on the [-1, 1] scale, which indicates net-positive sentiment on average, although only BBVA Spain exceeds 0.5. Developers and financial institutions can use this ranking as a starting point for understanding user preferences and improving their apps.

In conclusion, these 10 apps have the highest average review sentiment among free finance apps in the data. The ranking is indicative rather than definitive: each score is an average over a sample of reviews (at most 100 per app, fewer once rows with missing values are dropped), and only apps that appear in both datasets are ranked.
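
Since an average over very few reviews can be noisy, one possible refinement is to report the number of reviews next to each mean and require a minimum count before ranking. The sketch below is not part of the original analysis: the data, the column names `mean_score` and `n_reviews`, and the `MIN_REVIEWS` threshold are all hypothetical.

```python
import pandas as pd

# Hypothetical merged review data (illustrative values only)
merged = pd.DataFrame({
    "App": ["X", "X", "X", "Y", "Z", "Z"],
    "Sentiment Score": [0.5, 0.3, 0.4, 1.0, 0.2, 0.4],
})

MIN_REVIEWS = 2  # hypothetical threshold; choose based on the data at hand

ranked = (
    merged.groupby("App", as_index=False)
    # Compute the mean score and the number of reviews behind it
    .agg(mean_score=("Sentiment Score", "mean"),
         n_reviews=("Sentiment Score", "count"))
    # Keep only apps with enough reviews to trust the average
    .loc[lambda d: d["n_reviews"] >= MIN_REVIEWS]
    .sort_values("mean_score", ascending=False)
)
print(ranked)
```

In this toy data, app `Y` has the highest mean but only one review, so it is excluded from the ranking.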
