# Objective

The objective of this analysis is to explore the Google Play Store dataset to understand app characteristics, user engagement patterns, and the factors that influence app ratings and popularity. The study focuses on trends across free and paid apps, reviews, ratings, installs, and other relevant attributes.

## Data Description

The dataset contains information about applications available on the Google Play Store. Each row represents an individual app entry. Key features include App, Category, Rating (user rating), Reviews, Installs, Type (free or paid), Price, Content Rating, and Genres, among others.
The dataset is used to compute descriptive statistics and to group, filter, and compare apps in order to uncover meaningful insights.


In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [3]:
data = pd.read_csv('googleplaystore.csv')
data

,App,Category,Rating,Reviews,Size,Installs,Type,Price,Content Rating,Genres,Last Updated,Current Ver,Android Ver
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,19M,"10,000+",Free,0,Everyone,Art & Design,"January 7, 2018",1.0.0,4.0.3 and up
1,Coloring book moana,ART_AND_DESIGN,3.9,967,14M,"500,000+",Free,0,Everyone,Art & Design;Pretend Play,"January 15, 2018",2.0.0,4.0.3 and up
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,8.7M,"5,000,000+",Free,0,Everyone,Art & Design,"August 1, 2018",1.2.4,4.0.3 and up
3,Sketch - Draw & Paint,ART_AND_DESIGN,4.5,215644,25M,"50,000,000+",Free,0,Teen,Art & Design,"June 8, 2018",Varies with device,4.2 and up
4,Pixel Draw - Number Art Coloring Book,ART_AND_DESIGN,4.3,967,2.8M,"100,000+",Free,0,Everyone,Art & Design;Creativity,"June 20, 2018",1.1,4.4 and up
...,...,...,...,...,...,...,...,...,...,...,...,...,...
10836,Sya9a Maroc - FR,FAMILY,4.5,38,53M,"5,000+",Free,0,Everyone,Education,"July 25, 2017",1.48,4.1 and up
10837,Fr. Mike Schmitz Audio Teachings,FAMILY,5.0,4,3.6M,100+,Free,0,Everyone,Education,"July 6, 2018",1.0,4.1 and up
10838,Parkinson Exercices FR,MEDICAL,,3,9.5M,"1,000+",Free,0,Everyone,Medical,"January 20, 2017",1.0,2.2 and up
10839,The SCP Foundation DB fr nn5n,BOOKS_AND_REFERENCE,4.5,114,Varies with device,"1,000+",Free,0,Mature 17+,Books & Reference,"January 19, 2015",Varies with device,Varies with device


In [4]:
data.columns

Index(['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type',
       'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver',
       'Android Ver'],
      dtype='str')

### Total Number of App Titles Containing Astrology

In [5]:
data['App'].str.contains('Astrology', case = False, na = False).sum() # Number of app titles containing 'Astrology' (case-insensitive)

np.int64(3)

In [6]:
data[data['App'].str.contains('Astrology', case = False, na = False)] # Full rows for the apps whose title contains 'Astrology'

,App,Category,Rating,Reviews,Size,Installs,Type,Price,Content Rating,Genres,Last Updated,Current Ver,Android Ver
1570,Horoscopes – Daily Zodiac Horoscope and Astrology,LIFESTYLE,4.6,161143,11M,"10,000,000+",Free,0,Everyone 10+,Lifestyle,"June 25, 2018",5.2.4(881),4.0.3 and up
1592,သိင်္ Astrology - Min Thein Kha BayDin,LIFESTYLE,4.7,2225,15M,"100,000+",Free,0,Everyone,Lifestyle,"July 26, 2018",4.2.1,4.0.3 and up
10840,iHoroscope - 2018 Daily Horoscope & Astrology,LIFESTYLE,4.5,398307,19M,"10,000,000+",Free,0,Everyone,Lifestyle,"July 25, 2018",Varies with device,Varies with device


### Average App Rating

In [7]:
data['Rating'].mean() 

np.float64(4.193338315362443)

### Categories with the Highest Average Rating

In [8]:
data.groupby('Category')['Rating'].mean().sort_values(ascending=False)

Category
1.9                    19.000000
EVENTS                  4.435556
EDUCATION               4.389032
ART_AND_DESIGN          4.358065
BOOKS_AND_REFERENCE     4.346067
PERSONALIZATION         4.335987
PARENTING               4.300000
GAME                    4.286326
BEAUTY                  4.278571
HEALTH_AND_FITNESS      4.277104
SHOPPING                4.259664
SOCIAL                  4.255598
WEATHER                 4.244000
SPORTS                  4.223511
PRODUCTIVITY            4.211396
HOUSE_AND_HOME          4.197368
FAMILY                  4.192272
PHOTOGRAPHY             4.192114
AUTO_AND_VEHICLES       4.190411
MEDICAL                 4.189143
LIBRARIES_AND_DEMO      4.178462
FOOD_AND_DRINK          4.166972
COMMUNICATION           4.158537
COMICS                  4.155172
NEWS_AND_MAGAZINES      4.132189
FINANCE                 4.131889
ENTERTAINMENT           4.126174
BUSINESS                4.121452
TRAVEL_AND_LOCAL        4.109292
LIFESTYLE               4.094904
V
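
The `1.9` entry at the top of the output above, with an average rating of 19.0, is not a plausible category: ratings run from 1 to 5. It appears to come from a malformed row. The sketch below shows one way to flag and drop such rows before grouping. It runs on a small made-up frame (`toy`, with hypothetical values) rather than the real data, and the validity rules are assumptions, not part of the original analysis.

```python
import pandas as pd

# Toy frame standing in for the real data (hypothetical values);
# the last row mimics a record whose fields look shifted.
toy = pd.DataFrame({
    "App": ["A", "B", "C"],
    "Category": ["EVENTS", "GAME", "1.9"],
    "Rating": [4.5, 4.2, 19.0],
    "Type": ["Free", "Paid", "0"],
})

# Assumed rules: a recorded rating lies between 1 and 5,
# and a recorded Type is either 'Free' or 'Paid'.
# Missing values are not flagged by either rule.
bad_rating = toy["Rating"].notna() & ~toy["Rating"].between(1, 5)
bad_type = toy["Type"].notna() & ~toy["Type"].isin(["Free", "Paid"])
suspect = toy[bad_rating | bad_type]

# Drop the flagged rows, then repeat the category ranking.
clean = toy.drop(index=suspect.index)
category_means = (
    clean.groupby("Category")["Rating"].mean().sort_values(ascending=False)
)
print(suspect)
print(category_means)
```

Inspecting `suspect` before dropping anything keeps the decision visible; on the real data the same masks would be built from `data` instead of `toy`.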

### Apps with a 5-Star Rating

In [9]:
len(data[data['Rating']==5.0])

274
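
A perfect rating says little when it rests on a handful of reviews. As a rough sketch, the 5-star apps can be filtered by a minimum review count. The frame below is made up for illustration, `min_reviews` is a hypothetical threshold, and the code assumes `Reviews` is already numeric (in this notebook that conversion happens in a later cell).

```python
import pandas as pd

# Made-up apps for illustration; Reviews is assumed to be numeric already.
toy = pd.DataFrame({
    "App": ["A", "B", "C", "D"],
    "Rating": [5.0, 5.0, 4.6, 5.0],
    "Reviews": [4, 2, 161143, 350],
})

min_reviews = 100  # hypothetical cut-off, chosen only for this example

# All apps with a perfect rating.
five_star = toy[toy["Rating"] == 5.0]

# Perfect ratings backed by at least `min_reviews` reviews.
well_reviewed = five_star[five_star["Reviews"] >= min_reviews]

print(len(five_star), len(well_reviewed))
```

Comparing the two counts gives a quick sense of how many perfect scores are supported by a meaningful number of reviews.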

### Average Number of Reviews

In [10]:
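data['Reviews'] = data['Reviews'].replace('3.0M', 3.0) # '3.0M' cannot be cast to float as-is; it is replaced with 3.0 as a placeholder so the cast below succeeds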

In [11]:
data['Reviews'] = data['Reviews'].astype('float') # Converting data type from string to float

In [12]:
data['Reviews'].dtype # Successfully converted data type to float

dtype('float64')

In [13]:
data['Reviews'].mean() # Average number of reviews per app

np.float64(444111.9265750392)
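
An alternative to replacing individual bad strings by hand is `pd.to_numeric` with `errors="coerce"`, which turns anything unparseable into `NaN` and makes the offending entries easy to list. The sketch below uses a short made-up series (`reviews_raw`) and also reports the median, which is less sensitive than the mean to a few very large values. It is an illustration, not the method used in the cells above.

```python
import pandas as pd

# Made-up review counts stored as strings, including one unparseable entry.
reviews_raw = pd.Series(["159", "967", "3.0M", "87510"])

# Unparseable strings become NaN instead of raising an error.
reviews_num = pd.to_numeric(reviews_raw, errors="coerce")

# The original strings that failed to parse, for manual inspection.
unparsed = reviews_raw[reviews_num.isna()]

# mean() and median() skip NaN by default.
mean_reviews = reviews_num.mean()
median_reviews = reviews_num.median()

print(unparsed.tolist(), mean_reviews, median_reviews)
```

A large gap between the mean and the median is a simple signal that the distribution is skewed.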

### Number of Free and Paid Apps

In [14]:
data['Type'].value_counts()

Type
Free    10039
Paid      800
0           1
Name: count, dtype: int64
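
Shares are often easier to read than raw counts. The sketch below rebuilds a `Type` column from the counts printed above (done only so the example is self-contained) and uses `value_counts(normalize=True)` to get proportions; on the real data the same call would be made on `data['Type']`.

```python
import pandas as pd

# Rebuild a Type column from the counts shown in the output above.
types = pd.Series(["Free"] * 10039 + ["Paid"] * 800 + ["0"])

# normalize=True returns proportions instead of counts.
shares = types.value_counts(normalize=True)

print((shares * 100).round(1))
```

The stray `0` value is kept here on purpose so the proportions match the counts above; it would normally be investigated or removed first.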

### App with the Most Reviews

In [25]:
data.loc[data['Reviews'] == data['Reviews'].max(), 'App'] # Row(s) whose review count equals the maximum

2544    Facebook
Name: App, dtype: str

### Top 5 Apps with the Most Reviews

In [30]:
index = data['Reviews'].sort_values(ascending = False).head().index

In [32]:
data.loc[index, 'App'] # index holds row labels, so select with .loc rather than .iloc

2544              Facebook
3943              Facebook
381     WhatsApp Messenger
336     WhatsApp Messenger
3904    WhatsApp Messenger
Name: App, dtype: str
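
The same app names appear more than once in the list above, which suggests the dataset contains repeated entries for some apps. A minimal sketch of a top-N ranking that keeps only the highest-review row per app name follows. The frame and its review counts are made up, and treating rows with the same `App` value as duplicates is an assumption.

```python
import pandas as pd

# Made-up review counts; repeated names mimic the duplicate entries above.
toy = pd.DataFrame({
    "App": ["Facebook", "Facebook", "WhatsApp Messenger",
            "WhatsApp Messenger", "Sketch"],
    "Reviews": [100.0, 90.0, 80.0, 80.0, 10.0],
})

# Sort by reviews, keep the first (largest) row per app name, then take the top 5.
top_unique = (
    toy.sort_values("Reviews", ascending=False)
       .drop_duplicates(subset="App", keep="first")
       .head(5)
)

print(top_unique["App"].tolist())
```

With duplicates collapsed, each of the five slots goes to a distinct app.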

### Average Rating of Free and Paid Apps

In [37]:
data.groupby('Type')['Rating'].mean()

Type
0       19.000000
Free     4.186203
Paid     4.266615
Name: Rating, dtype: float64
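
A group mean is easier to judge when the group size is shown next to it. The sketch below uses `agg` to report both for each `Type`; the frame is made up, with a single `0` row standing in for the odd group above.

```python
import pandas as pd

# Made-up ratings; the '0' row mimics the unexpected Type value above.
toy = pd.DataFrame({
    "Type": ["Free", "Free", "Paid", "0"],
    "Rating": [4.0, 4.4, 4.3, 19.0],
})

# Mean rating and number of rated apps per Type.
summary = toy.groupby("Type")["Rating"].agg(["mean", "count"])

print(summary)
```

A group with a count of 1 and an out-of-range mean is a strong hint that the row behind it needs cleaning rather than interpretation.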

### Top 5 Apps by Number of Installs

In [86]:
data['Installs_1'] = data['Installs'].str.replace(',','') # Replacing commas with empty string

In [87]:
data['Installs_1'] = data['Installs_1'].str.replace('+', '', regex=False) # Removing the plus sign; regex=False treats '+' as a literal character

In [88]:
data['Installs_1'] = data['Installs_1'].str.replace('Free','0') # 'Free' is not a valid install count; mapped to '0' so the integer cast below succeeds

In [89]:
data['Installs_1'] = data['Installs_1'].astype('int') # Converting to integer values

In [90]:
index1 = data['Installs_1'].sort_values(ascending = False).head().index

In [91]:
data.loc[index1, 'App'] # index1 holds row labels, so select with .loc rather than .iloc

5856    Google Play Games
5395        Google Photos
2853        Google Photos
2884        Google Photos
4170         Google Drive
Name: App, dtype: str
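
The three `str.replace` steps and the cast can also be collapsed into one pass. The sketch below strips commas and plus signs with a single regular expression and lets `pd.to_numeric` turn anything non-numeric (such as `Free`) into `NaN` rather than 0. The series is made up, and this is an alternative to the cells above, not a description of them.

```python
import pandas as pd

# Made-up install strings in the dataset's format, plus one invalid entry.
installs_raw = pd.Series(["10,000+", "5,000,000+", "100+", "Free"])

# Remove commas and plus signs in one step, then parse;
# entries that are still non-numeric become NaN.
installs_num = pd.to_numeric(
    installs_raw.str.replace(r"[,+]", "", regex=True),
    errors="coerce",
)

print(installs_num.tolist())
```

Using `NaN` for the invalid entry keeps it out of sums and rankings, whereas mapping it to 0 treats it as a real app with no installs.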

# Conclusion

- Free apps dominate the Google Play Store: 10,039 of the 10,840 apps with a recorded type (about 93%) are free, which points to a strong preference for the free app model.
- Paid apps have a slightly higher average rating than free apps (about 4.27 versus 4.19).
- Most apps are rated highly (the overall mean rating is about 4.19), suggesting generally positive user feedback.
- Social and messenger-type apps, such as Facebook and WhatsApp Messenger, have the most reviews.
- Perfect ratings (5.0), held by 274 apps, should be interpreted cautiously, as they may result from a small number of reviews.
- Reviews and installs are highly skewed, with a small number of apps accounting for a disproportionate share of user engagement.