## 1. Introduction
<p><img src="https://assets.datacamp.com/production/project_1197/img/google_play_store.png" alt="Google Play logo"></p>
<p>Mobile apps are everywhere. They are easy to create and can be very lucrative from a business standpoint. Android in particular keeps expanding as an operating system and has captured more than 74% of the total market<sup><a href="https://www.statista.com/statistics/272698/global-market-share-held-by-mobile-operating-systems-since-2009">[1]</a></sup>.</p>
<p>The Google Play Store apps data has enormous potential to facilitate data-driven decisions and insights for businesses. In this notebook, we will analyze the Android app market by comparing ~10k apps in Google Play across different categories. We will also use the user reviews to draw a qualitative comparison between the apps.</p>
<p>The dataset you will use here was scraped from Google Play Store in September 2018 and was published on <a href="https://www.kaggle.com/lava18/google-play-store-apps">Kaggle</a>. Here are the details: <br>
<br></p>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:20px"><b>datasets/apps.csv</b></div>
This file contains all the details of the apps on Google Play. There are 9 features that describe a given app.
<ul>
    <li><b>App:</b> Name of the app</li>
    <li><b>Category:</b> Category of the app. Some examples are: ART_AND_DESIGN, FINANCE, COMICS, BEAUTY etc.</li>
    <li><b>Rating:</b> The current average rating (out of 5) of the app on Google Play</li>
    <li><b>Reviews:</b> Number of user reviews given on the app</li>
    <li><b>Size:</b> Size of the app in MB (megabytes)</li>
    <li><b>Installs:</b> Number of times the app was downloaded from Google Play</li>
    <li><b>Type:</b> Whether the app is paid or free</li>
    <li><b>Price:</b> Price of the app in US$</li>
    <li><b>Last Updated:</b> Date on which the app was last updated on Google Play </li>

</ul>
</div>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:20px"><b>datasets/user_reviews.csv</b></div>
This file contains a random sample of 100 <i><a href="https://www.androidpolice.com/2019/01/21/google-play-stores-redesigned-ratings-and-reviews-section-lets-you-easily-filter-by-star-rating/">most helpful first</a></i> user reviews for each app. The text in each review has been pre-processed and passed through a sentiment analyzer.
<ul>
    <li><b>App:</b> Name of the app on which the user review was provided. Matches the <code>App</code> column of the <code>apps.csv</code> file</li>
    <li><b>Review:</b> The pre-processed user review text</li>
    <li><b>Sentiment Category:</b> Sentiment category of the user review - Positive, Negative or Neutral</li>
    <li><b>Sentiment Score:</b> Sentiment score of the user review, in the range [-1, 1]. A higher score denotes a more positive sentiment.</li>

</ul>
</div>
<p>From here on, it will be your task to explore and manipulate the data until you are able to answer the three questions described in the instructions panel.<br></p>

In [256]:
# Use this cell to begin your analysis, and add as many as you would like!

In [257]:
import pandas as pd

file = pd.read_csv('datasets/apps.csv')
print(file.head())
print(file.columns)
file.shape

                                                 App        Category  Rating  \
0     Photo Editor & Candy Camera & Grid & ScrapBook  ART_AND_DESIGN     4.1   
1                                Coloring book moana  ART_AND_DESIGN     3.9   
2  U Launcher Lite – FREE Live Cool Themes, Hide ...  ART_AND_DESIGN     4.7   
3                              Sketch - Draw & Paint  ART_AND_DESIGN     4.5   
4              Pixel Draw - Number Art Coloring Book  ART_AND_DESIGN     4.3   

   Reviews  Size     Installs  Type  Price      Last Updated  
0      159  19.0      10,000+  Free    0.0   January 7, 2018  
1      967  14.0     500,000+  Free    0.0  January 15, 2018  
2    87510   8.7   5,000,000+  Free    0.0    August 1, 2018  
3   215644  25.0  50,000,000+  Free    0.0      June 8, 2018  
4      967   2.8     100,000+  Free    0.0     June 20, 2018  
Index(['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type',
       'Price', 'Last Updated'],
      dtype='object')


(9659, 9)

In [258]:
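# Work on a copy so the raw data in `file` stays untouched.
apps = file.copy()
print(apps.head())
# Strip the thousands separators and the trailing '+' from Installs.
# regex=False treats '+' as a literal character rather than a regex quantifier.
apps['Installs'] = apps['Installs'].str.replace(',', '', regex=False)
apps['Installs'] = apps['Installs'].str.replace('+', '', regex=False)
apps = apps.astype({'Installs': 'int'})
print(apps.head())
print(type(apps['Installs'][1]))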

                                                 App        Category  Rating  \
0     Photo Editor & Candy Camera & Grid & ScrapBook  ART_AND_DESIGN     4.1   
1                                Coloring book moana  ART_AND_DESIGN     3.9   
2  U Launcher Lite – FREE Live Cool Themes, Hide ...  ART_AND_DESIGN     4.7   
3                              Sketch - Draw & Paint  ART_AND_DESIGN     4.5   
4              Pixel Draw - Number Art Coloring Book  ART_AND_DESIGN     4.3   

   Reviews  Size     Installs  Type  Price      Last Updated  
0      159  19.0      10,000+  Free    0.0   January 7, 2018  
1      967  14.0     500,000+  Free    0.0  January 15, 2018  
2    87510   8.7   5,000,000+  Free    0.0    August 1, 2018  
3   215644  25.0  50,000,000+  Free    0.0      June 8, 2018  
4      967   2.8     100,000+  Free    0.0     June 20, 2018  
                                                 App        Category  Rating  \
0     Photo Editor & Candy Camera & Grid & ScrapBook  ART_AND
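The `Installs` cleanup in the cell above can also be written as a small stand-alone helper, which makes the parsing rule easy to check on its own. The sketch below is an illustration only: `parse_installs` is a hypothetical name that does not appear in the notebook, and it assumes every value consists of digits with optional commas and an optional trailing `+`, as in the rows printed above.

```python
def parse_installs(raw: str) -> int:
    """Convert an install bucket such as '10,000+' to an int.

    Hypothetical helper; assumes digits, optional commas, optional trailing '+'.
    """
    cleaned = raw.strip().replace(",", "").rstrip("+")
    if not cleaned.isdigit():
        # Fail loudly on anything that does not match the assumed format.
        raise ValueError(f"unexpected Installs value: {raw!r}")
    return int(cleaned)


# Sample values copied from the printed rows above, plus a bare '0'.
samples = ["10,000+", "500,000+", "5,000,000+", "0"]
parsed = [parse_installs(s) for s in samples]
print(parsed)
```

With pandas, a helper like this could be applied to the whole column via `apps['Installs'].map(parse_installs)`. Raising on unexpected input is a design choice that surfaces malformed rows early instead of silently producing wrong counts.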

In [259]:
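# pivot_table averages by default; aggfunc='mean' makes that explicit.
app_category_info = apps.pivot_table(index='Category', values=['Price', 'Rating'], aggfunc='mean')
app_category_info = app_category_info.rename(columns={'Price': 'Average price', 'Rating': 'Average rating'})
# value_counts() returns a Series indexed by category, which aligns with the pivot index.
app_category_info['Number of apps'] = apps['Category'].value_counts()
print(app_category_info)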


                     Average price  Average rating  Number of apps
Category                                                          
ART_AND_DESIGN            0.093281        4.357377              64
AUTO_AND_VEHICLES         0.158471        4.190411              85
BEAUTY                    0.000000        4.278571              53
BOOKS_AND_REFERENCE       0.539505        4.344970             222
BUSINESS                  0.417357        4.098479             420
COMICS                    0.000000        4.181481              56
COMMUNICATION             0.263937        4.121484             315
DATING                    0.160468        3.970149             171
EDUCATION                 0.150924        4.364407             119
ENTERTAINMENT             0.078235        4.135294             102
EVENTS                    1.718594        4.435556              64
FAMILY                    1.309967        4.179664            1832
FINANCE                   8.408203        4.115563            
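The per-category summary above could also be built in one step with `groupby` and named aggregation. This is a sketch on a small made-up sample, not the notebook's data. It assumes the `App` column has no missing values, so counting it gives the number of apps per category.

```python
import pandas as pd

# Tiny made-up sample; the notebook itself reads datasets/apps.csv.
sample = pd.DataFrame({
    "App": ["a", "b", "c", "d", "e"],
    "Category": ["FINANCE", "FINANCE", "BEAUTY", "BEAUTY", "BEAUTY"],
    "Price": [0.0, 4.0, 0.0, 0.0, 0.0],
    "Rating": [4.0, 3.0, 4.5, None, 3.5],
})

# Output column names contain spaces, so they are passed as a dict via **.
summary = sample.groupby("Category").agg(**{
    "Number of apps": ("App", "count"),    # assumes App is never missing
    "Average price": ("Price", "mean"),
    "Average rating": ("Rating", "mean"),  # mean skips missing ratings
})
print(summary)
```

Note that the mean ignores missing ratings, so "Average rating" describes only the rated apps in each category, while "Number of apps" counts all of them.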

In [260]:
free_apps = apps[apps['Type'] == 'Free']
print(free_apps.shape)

(8903, 9)


In [261]:
free_finance_apps = free_apps[free_apps['Category'] == 'FINANCE']
print(free_finance_apps.head())

                   App Category  Rating  Reviews  Size  Installs  Type  Price  \
837             K PLUS  FINANCE     4.4   124424   NaN  10000000  Free    0.0   
838        ING Banking  FINANCE     4.4    39041   NaN   1000000  Free    0.0   
839  Citibanamex Movil  FINANCE     3.6    52306  42.0   5000000  Free    0.0   
840    The postal bank  FINANCE     3.7    36718   NaN   5000000  Free    0.0   
841        KTB Netbank  FINANCE     3.8    42644  19.0   5000000  Free    0.0   

       Last Updated  
837   June 26, 2018  
838  August 3, 2018  
839   July 27, 2018  
840   July 16, 2018  
841   June 28, 2018  


In [262]:
file_2 = pd.read_csv('datasets/user_reviews.csv')
# Inner join (the default): apps with no rows in user_reviews.csv are dropped.
apps_merged = free_finance_apps.merge(file_2, on='App', how='inner')
print(apps_merged.head())

                 App Category  Rating  Reviews  Size  Installs  Type  Price  \
0  Citibanamex Movil  FINANCE     3.6    52306  42.0   5000000  Free    0.0   
1  Citibanamex Movil  FINANCE     3.6    52306  42.0   5000000  Free    0.0   
2  Citibanamex Movil  FINANCE     3.6    52306  42.0   5000000  Free    0.0   
3  Citibanamex Movil  FINANCE     3.6    52306  42.0   5000000  Free    0.0   
4  Citibanamex Movil  FINANCE     3.6    52306  42.0   5000000  Free    0.0   

    Last Updated                                             Review  \
0  July 27, 2018  Forget paying app, designed make fail payments...   
1  July 27, 2018  It's working expected, talking best bank Mexic...   
2  July 27, 2018  It has many problems with Android 8.1. You can...   
3  July 27, 2018  I changed my phone to a Xiaomi Redmi Note 5, t...   
4  July 27, 2018  In her eagerness to make her look pretty with ...   

  Sentiment Category  Sentiment Score  
0           Negative        -0.500000  
1           Positi
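Because `merge` performs an inner join by default, free finance apps that have no rows in the reviews file drop out of the result without any warning. One way to see how many are lost is a left join with `indicator=True`. The sketch below uses made-up app names and scores rather than the notebook's data.

```python
import pandas as pd

# Made-up stand-ins for free_finance_apps and the user reviews table.
apps_sample = pd.DataFrame({
    "App": ["Bank A", "Bank B", "Bank C"],
    "Category": ["FINANCE", "FINANCE", "FINANCE"],
})
reviews_sample = pd.DataFrame({
    "App": ["Bank A", "Bank A", "Bank C"],
    "Sentiment Score": [0.5, -0.1, 0.2],
})

# Default inner join: only apps present in both tables survive.
inner = apps_sample.merge(reviews_sample, on="App")

# Left join with indicator=True adds a '_merge' column that flags
# apps found only in the left table, i.e. apps without reviews.
checked = apps_sample.merge(reviews_sample, on="App", how="left", indicator=True)
no_reviews = checked.loc[checked["_merge"] == "left_only", "App"].tolist()

print(len(inner), no_reviews)
```

Running the same check on the real tables would show which free finance apps are excluded from the sentiment ranking simply because they have no reviews in the sample.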

In [263]:
df = apps_merged.pivot_table(index = 'App', values = 'Sentiment Score')
print(df)

                                                 Sentiment Score
App                                                             
A+ Mobile                                               0.329592
ACE Elite                                               0.252171
Acorns - Invest Spare Change                            0.046667
Amex Mobile                                             0.175666
Associated Credit Union Mobile                          0.388093
BBVA Compass Banking                                    0.205590
BBVA Spain                                              0.515086
BZWBK24 mobile                                          0.326883
Bank of America Mobile Banking                          0.180027
BankMobile Vibe App                                     0.353455
Banorte Movil                                           0.116999
Barclays US for Android                                 0.017928
Betterment                                              0.143252
Bloomberg Professional   

In [264]:
# Sort apps from most to least positive average sentiment, then keep the first 10.
df = df.sort_values('Sentiment Score', ascending=False)
top_10_user_feedback = df.head(10)
print(top_10_user_feedback)

                                            Sentiment Score
App                                                        
BBVA Spain                                         0.515086
Associated Credit Union Mobile                     0.388093
BankMobile Vibe App                                0.353455
A+ Mobile                                          0.329592
Current debit card and app made for teens          0.327258
BZWBK24 mobile                                     0.326883
Even - organize your money, get paid early         0.283929
Credit Karma                                       0.270052
Fortune City - A Finance App                       0.266966
Branch                                             0.264230
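An average sentiment score says nothing about how many reviews it is based on. A variation on the ranking above keeps the review count next to the mean and selects the top rows with `nlargest`. This is a sketch on made-up data and picks the top 2 instead of the top 10.

```python
import pandas as pd

# Made-up merged sample; one review has a missing sentiment score.
merged_sample = pd.DataFrame({
    "App": ["Bank A", "Bank A", "Bank B", "Bank C", "Bank C"],
    "Sentiment Score": [0.5, 0.25, 0.75, None, -0.25],
})

# 'mean' and 'count' both skip missing scores, so 'count' is the number
# of reviews that actually contributed to the average.
per_app = merged_sample.groupby("App")["Sentiment Score"].agg(["mean", "count"])

# nlargest sorts by the given column and keeps the first n rows.
top_2 = per_app.nlargest(2, "mean")
print(top_2)
```

Keeping the `count` column visible makes it easier to judge whether a high average rests on many reviews or only a handful.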
