<a href="https://colab.research.google.com/github/vamsibitra/Capstone1-Telecom_churn_analysis/blob/main/EDA_Submission_Template_Day3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    -  Play Store App Review Analysis



##### **Project Type**    - EDA
##### **Contribution**    - Team
##### **Team Member 1 -** Vamsi Bitra
##### **Team Member 2 -** Anjali Bathula

# **Project Summary**

Developers and businesses continually look for ways not only to build innovative apps but also to make them succeed on the Google Play Store, one of the world's largest app distribution platforms. Thriving in this environment requires data analytics. Over 1.5 billion Android smartphones were shipped last year (2022), a rough indication of how many users the Play Store has. The most downloaded apps worldwide so far include TikTok, Instagram, Facebook, WhatsApp, and Telegram, which suggests that many users review these apps favorably. Apps with very few downloads, by contrast, may be reviewed negatively.

The Play Store Apps Review Analysis project aims to unlock the vast potential hidden within app reviews and metadata to provide actionable insights for developers, thereby driving app-making businesses towards success.

# **Project Data**

The core of the Play Store Apps Review Analysis project revolves around two datasets:


**Play Store Apps Dataset:** This dataset contains comprehensive information about various Android applications available on the Play Store. Each entry provides valuable details such as the app's category, user ratings, size, and several other attributes.

**Customer Reviews Dataset:** Complementing the primary dataset is a repository of customer reviews for these Android apps. These reviews are a valuable source of user-generated content, offering insights into user sentiments, opinions, and feedback.

The journey of this project begins with an extensive data exploration phase. Here, the team conducts comprehensive data cleaning and preprocessing, ensuring that the datasets are in optimal shape for subsequent analysis. This crucial step lays the foundation for all subsequent insights and findings.

Sentiment analysis is a pivotal aspect of the project, leveraging advanced natural language processing techniques to delve into the Customer Reviews Dataset. The objective is to extract sentiment insights from user-generated content, shedding light on how users perceive and engage with different apps. This analysis goes beyond quantitative metrics, providing a nuanced understanding of user sentiments, opinions, and feedback.

Identifying the key factors responsible for app engagement and success is a fundamental goal of the project. This involves a multifaceted approach, including the identification of trends among highly-rated apps, an examination of the impact of app size on download rates, and an investigation into whether specific app categories possess a more substantial user appeal. By identifying these factors, developers and businesses can make data-informed decisions about their app development, marketing strategies, and user experience enhancements.

# **Benefits**

The Play Store Apps Review Analysis project offers several benefits:

**Informed Decision-Making:** Developers and businesses can make informed decisions based on user data, enhancing their app's design, marketing strategies, and user experience.

**Competitive Advantage:** Access to actionable insights can give businesses a competitive edge in a crowded app marketplace, helping them stand out and attract more users.

**Improved User Satisfaction:** Addressing user preferences and pain points can lead to increased user satisfaction, reflected in higher app ratings and more downloads.

**Data-Driven Growth:** Businesses can use data-driven strategies to drive growth, expand their presence in the Android app market, and adapt to changing trends.

# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


**The Google Play Store is one of the most widely used app markets in the world, but some of the apps on it have shortcomings. We explore and analyse the data provided by both Google and its users to identify the main factors that make an app successful and profitable.**

#### **Define Your Business Objective?**

Effectively managing and leveraging app reviews can contribute to the overall success of an app in the marketplace.

The business objectives for a Google Play Store app are:


* **User Feedback and Improvement:** We collect the Play Store reviews given to each app. Developers can study these reviews to fix issues, bugs, and user experience problems, and to keep improving the app's functionality.
* **App Rating and Visibility:** Positive reviews can improve an app's overall rating on the Play Store, making it more attractive to potential users. A higher rating can increase an app's visibility in search results and category rankings, leading to more downloads and installs.
* **User Engagement:** Engaging with users through the review section can help build a sense of community around the app. Responding to reviews, both positive and negative, demonstrates that the developer values user input and is actively working to address concerns. This can enhance user loyalty and encourage more reviews.
* **Competitive Analysis:** Analyzing competitors' app reviews can provide insights into what users like and dislike about competing apps. This information can be used to refine your app's features and user experience to gain a competitive edge.
* **Feature Prioritization:** Reviews often contain feature requests and suggestions from users. Analyzing these requests can help prioritize which new features or improvements to implement, aligning the app's development roadmap with user preferences.
* **Customer Support and Issue Resolution:** Negative reviews can highlight specific problems that users are facing. Responding to these reviews promptly and resolving issues can prevent users from churning and leaving poor ratings. Effective customer support through the review section can lead to improved user satisfaction and loyalty.
* **Marketing and Promotion:** Positive reviews and high app ratings can be used as marketing assets. Developers can showcase these reviews in promotional materials, advertisements, and app descriptions to attract potential users and build trust in their app.
* **User Acquisition:** Encouraging satisfied users to leave reviews can serve as a form of user-generated content that can attract new users. Apps with positive reviews and high ratings are more likely to be considered by potential users looking for new apps to download.
* **App Monetization:** Some apps monetize through in-app purchases, ads, or subscriptions. Positive reviews and high user satisfaction can lead to increased engagement and monetization opportunities.
* **Compliance and Quality Assurance:** Monitoring reviews can help developers ensure that their app complies with Play Store policies and quality guidelines. It can also help identify any issues that may arise from software updates or changes in the app's functionality.


# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is followed for each chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
*   Will the gained insights help create a positive business impact? Are there any insights that lead to negative growth? Justify with a specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [1]:
# Import Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

### Dataset Loading

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
# Load Dataset
play_store_data = pd.read_csv("/content/drive/MyDrive/eda capstone dataset/Play Store Data.csv")
user_reviews = pd.read_csv("/content/drive/MyDrive/eda capstone dataset/User Reviews.csv")
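
The guidelines above ask for exception handling, so the loading step could be wrapped in a small helper. The following is a minimal sketch only: `load_csv` is a hypothetical helper name, not part of the original notebook, and the error messages are illustrative.

```python
from pathlib import Path

import pandas as pd


def load_csv(path):
    """Read a CSV into a DataFrame, failing early with a readable message.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Dataset not found: {csv_path}")
    try:
        return pd.read_csv(csv_path)
    except pd.errors.EmptyDataError as exc:
        # Raised by pandas when the file contains no columns to parse.
        raise ValueError(f"Dataset is empty: {csv_path}") from exc
```

With this helper, the two `pd.read_csv(...)` calls above could be written as `load_csv(...)` with the same paths, so a wrong Drive path is reported clearly instead of surfacing later in the analysis.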

### Dataset First View

In [11]:
# Dataset First Look
ps_data = play_store_data.head()
print("The first 5 values of Play Store dataset are below")
ps_data

The first 5 values of Play Store dataset are below


| | App | Category | Rating | Reviews | Size | Installs | Type | Price | Content Rating | Genres | Last Updated | Current Ver | Android Ver |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Photo Editor & Candy Camera & Grid & ScrapBook | ART_AND_DESIGN | 4.1 | 159 | 19M | 10,000+ | Free | 0 | Everyone | Art & Design | January 7, 2018 | 1.0.0 | 4.0.3 and up |
| 1 | Coloring book moana | ART_AND_DESIGN | 3.9 | 967 | 14M | 500,000+ | Free | 0 | Everyone | Art & Design;Pretend Play | January 15, 2018 | 2.0.0 | 4.0.3 and up |
| 2 | U Launcher Lite – FREE Live Cool Themes, Hide ... | ART_AND_DESIGN | 4.7 | 87510 | 8.7M | 5,000,000+ | Free | 0 | Everyone | Art & Design | August 1, 2018 | 1.2.4 | 4.0.3 and up |
| 3 | Sketch - Draw & Paint | ART_AND_DESIGN | 4.5 | 215644 | 25M | 50,000,000+ | Free | 0 | Teen | Art & Design | June 8, 2018 | Varies with device | 4.2 and up |
| 4 | Pixel Draw - Number Art Coloring Book | ART_AND_DESIGN | 4.3 | 967 | 2.8M | 100,000+ | Free | 0 | Everyone | Art & Design;Creativity | June 20, 2018 | 1.1 | 4.4 and up |


In [12]:
ur_data = user_reviews.head()
print("The first 5 values of User reviews dataset are below")
ur_data

The first 5 values of User reviews dataset are below


| | App | Translated_Review | Sentiment | Sentiment_Polarity | Sentiment_Subjectivity |
|---|---|---|---|---|---|
| 0 | 10 Best Foods for You | I like eat delicious food. That's I'm cooking ... | Positive | 1.0 | 0.533333 |
| 1 | 10 Best Foods for You | This help eating healthy exercise regular basis | Positive | 0.25 | 0.288462 |
| 2 | 10 Best Foods for You | NaN | NaN | NaN | NaN |
| 3 | 10 Best Foods for You | Works great especially going grocery store | Positive | 0.4 | 0.875 |
| 4 | 10 Best Foods for You | Best idea us | Positive | 1.0 | 0.3 |


### Dataset Rows & Columns count

In [13]:
# Dataset Rows & Columns count
ps_shape = play_store_data.shape
print(f"The Playstore dataset consists of {ps_shape[0]} rows and {ps_shape[1]} columns.")

The Playstore dataset consists of 10841 rows and 13 columns.


In [14]:
ur_shape = user_reviews.shape
print(f"The User reviews dataset consists of {ur_shape[0]} rows and {ur_shape[1]} columns.")

The User reviews dataset consists of 64295 rows and 5 columns.


### Dataset Information

In [15]:
# Dataset Info
# DataFrame.info() prints its summary directly and returns None, so it is called without print()
play_store_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10841 entries, 0 to 10840
Data columns (total 13 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   App             10841 non-null  object 
 1   Category        10841 non-null  object 
 2   Rating          9367 non-null   float64
 3   Reviews         10841 non-null  object 
 4   Size            10841 non-null  object 
 5   Installs        10841 non-null  object 
 6   Type            10840 non-null  object 
 7   Price           10841 non-null  object 
 8   Content Rating  10840 non-null  object 
 9   Genres          10841 non-null  object 
 10  Last Updated    10841 non-null  object 
 11  Current Ver     10833 non-null  object 
 12  Android Ver     10838 non-null  object 
dtypes: float64(1), object(12)
memory usage: 1.1+ MB


In [16]:
# DataFrame.info() prints its summary directly and returns None, so it is called without print()
user_reviews.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 64295 entries, 0 to 64294
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   App                     64295 non-null  object 
 1   Translated_Review       37427 non-null  object 
 2   Sentiment               37432 non-null  object 
 3   Sentiment_Polarity      37432 non-null  float64
 4   Sentiment_Subjectivity  37432 non-null  float64
dtypes: float64(2), object(3)
memory usage: 2.5+ MB


#### Duplicate Values

In [17]:
# Drop fully duplicated rows from the Play Store data and inspect what remains
new_ps_data = play_store_data.drop_duplicates()
new_ps_data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10358 entries, 0 to 10840
Data columns (total 13 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   App             10358 non-null  object 
 1   Category        10358 non-null  object 
 2   Rating          8893 non-null   float64
 3   Reviews         10358 non-null  object 
 4   Size            10358 non-null  object 
 5   Installs        10358 non-null  object 
 6   Type            10357 non-null  object 
 7   Price           10358 non-null  object 
 8   Content Rating  10357 non-null  object 
 9   Genres          10358 non-null  object 
 10  Last Updated    10358 non-null  object 
 11  Current Ver     10350 non-null  object 
 12  Android Ver     10355 non-null  object 
dtypes: float64(1), object(12)
memory usage: 1.1+ MB


In [18]:
# Drop fully duplicated rows from the user reviews data and inspect what remains
new_ur_data = user_reviews.drop_duplicates()
new_ur_data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 30679 entries, 0 to 64236
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   App                     30679 non-null  object 
 1   Translated_Review       29692 non-null  object 
 2   Sentiment               29697 non-null  object 
 3   Sentiment_Polarity      29697 non-null  float64
 4   Sentiment_Subjectivity  29697 non-null  float64
dtypes: float64(2), object(3)
memory usage: 1.4+ MB
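
The row counts above imply how many duplicates were dropped: 10841 − 10358 = 483 rows in the Play Store data and 64295 − 30679 = 33616 rows in the user reviews data. To get such a count directly, something like the sketch below could be used. `count_duplicates` is a hypothetical helper name, and the sample frame contains made-up rows rather than capstone data.

```python
import pandas as pd


def count_duplicates(df):
    """Return the number of rows that exactly repeat an earlier row."""
    return int(df.duplicated().sum())


# Tiny illustrative frame (made-up rows, not the capstone data).
sample = pd.DataFrame(
    {"App": ["A", "A", "B"], "Rating": [4.1, 4.1, 3.9]}
)
print(count_duplicates(sample))  # 1
```

Applied to the notebook's frames, this would be `count_duplicates(play_store_data)` and `count_duplicates(user_reviews)`.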


#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count

In [None]:
# Visualizing the missing values
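
The `info()` output earlier already hints at the missing values: `Rating` has 10841 − 9367 = 1474 nulls in the Play Store data, and `Translated_Review` has 64295 − 37427 = 26868 nulls in the user reviews data. A minimal sketch of a per-column summary follows. `missing_summary` is a hypothetical helper name, and the sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Return per-column missing counts and percentages, largest first."""
    counts = df.isnull().sum()
    summary = pd.DataFrame(
        {"missing": counts, "percent": (counts / len(df) * 100).round(2)}
    )
    return summary.sort_values("missing", ascending=False)


# Tiny illustrative frame (made-up rows, not the capstone data).
sample = pd.DataFrame(
    {"App": ["A", "B", "C", "D"], "Rating": [4.1, np.nan, 3.9, np.nan]}
)
print(missing_summary(sample))
```

For the visual cell, one common option is `sns.heatmap(df.isnull(), cbar=False)`, which draws each missing cell as a contrasting stripe. Whether that suits this dataset is a judgment call.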

### What did you know about your dataset?

Answer Here

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns

In [None]:
# Dataset Describe

### Variables Description

Answer Here

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
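
One possible way to fill this cell is a small report of distinct values per column. This is a sketch under assumptions: `unique_value_report` and its `max_examples` parameter are hypothetical, and the sample frame only borrows the `Type` and `Content Rating` values seen in the first-view rows.

```python
import pandas as pd


def unique_value_report(df, max_examples=5):
    """Return, per column, the number of distinct non-null values and a few examples."""
    rows = []
    for column in df.columns:
        values = df[column].dropna().unique()
        rows.append({
            "column": column,
            "n_unique": len(values),
            "examples": list(values[:max_examples]),
        })
    return pd.DataFrame(rows).set_index("column")


# Small frame built from values visible in the first-view output above.
sample = pd.DataFrame({
    "Type": ["Free", "Free", "Free"],
    "Content Rating": ["Everyone", "Everyone", "Teen"],
})
report = unique_value_report(sample)
print(report)
```

For a quick count without examples, `df.nunique()` gives the same numbers in one call.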

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
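
The `info()` output shows `Reviews`, `Size`, `Installs`, `Price`, and `Last Updated` stored as `object`, so they need parsing before any numeric analysis. The sketch below is one possible approach, not the notebook's own method. The helper names are hypothetical, and the handling of a `k` suffix and a `$` prefix is an assumption, since the rows shown above only contain values like `19M`, `10,000+`, and `0`.

```python
import pandas as pd


def size_to_mb(value):
    """Convert strings like '19M' to megabytes; unparseable values become NaN.

    Assumption: a trailing 'k' means kilobytes (1024 per megabyte).
    """
    text = str(value).strip()
    try:
        if text.endswith("M"):
            return float(text[:-1])
        if text.endswith("k"):
            return float(text[:-1]) / 1024
    except ValueError:
        pass
    return float("nan")


def clean_play_store(df):
    """Return a copy with Reviews, Installs, Price, Size and Last Updated parsed."""
    out = df.copy()
    out["Reviews"] = pd.to_numeric(out["Reviews"], errors="coerce")
    # Strip the thousands separators and trailing '+' from values like '10,000+'.
    out["Installs"] = pd.to_numeric(
        out["Installs"].astype(str).str.replace(r"[+,]", "", regex=True),
        errors="coerce",
    )
    # Assumption: paid apps carry a leading '$' in Price.
    out["Price"] = pd.to_numeric(
        out["Price"].astype(str).str.replace("$", "", regex=False),
        errors="coerce",
    )
    out["Size"] = out["Size"].map(size_to_mb)
    out["Last Updated"] = pd.to_datetime(
        out["Last Updated"], format="%B %d, %Y", errors="coerce"
    )
    return out


# Two rows taken from the first-view output above.
sample = pd.DataFrame({
    "Reviews": ["159", "967"],
    "Size": ["19M", "14M"],
    "Installs": ["10,000+", "500,000+"],
    "Price": ["0", "0"],
    "Last Updated": ["January 7, 2018", "January 15, 2018"],
})
cleaned = clean_play_store(sample)
print(cleaned.dtypes)
```

Using `errors="coerce"` turns any value that does not parse into `NaN` or `NaT` instead of raising, so malformed rows show up in the missing-value counts rather than stopping the notebook.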

### What all manipulations have you done and insights you found?

Answer Here.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 2

In [None]:
# Chart - 2 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 3

In [None]:
# Chart - 3 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 4

In [None]:
# Chart - 4 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 5

In [None]:
# Chart - 5 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 6

In [None]:
# Chart - 6 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 7

In [None]:
# Chart - 7 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 8

In [None]:
# Chart - 8 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 9

In [None]:
# Chart - 9 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 10

In [None]:
# Chart - 10 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 11

In [None]:
# Chart - 11 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 12

In [None]:
# Chart - 12 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 13

In [None]:
# Chart - 13 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with a specific reason.

Answer Here

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

## **5. Solution to Business Objective**

#### What do you suggest the client do to achieve the Business Objective?
Explain Briefly.

Answer Here.

# **Conclusion**

Write the conclusion here.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***