# Play Store App Review Analysis



##### **Project Type**    - EDA
##### **Contribution**    - Individual

# **Project Summary -**

The Play Store apps data has enormous potential to drive app-making businesses to success. Actionable insights can be drawn for developers to work on and capture the Android market. Each app (row) has values for category, rating, size, and more. Another dataset contains customer reviews of the Android apps. Explore and analyse the data to discover key factors responsible for app engagement and success.


# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


Data cleaning and analysis steps:
1.   Removing Unwanted Values: Delete duplicate, incorrect, or irrelevant values.
2.   Handling Missing Values: Impute or remove missing values in the dataset.
3.   Handling Structural Errors: Fix mislabeled categories, classes, and data types.
4.   Category Analysis: Analyze the distribution of apps across different categories to identify the most popular or competitive categories.
5.   Rating Analysis: Examine the relationship between app ratings and app engagement/success.
6.   User Reviews Analysis: Analyze customer reviews to extract sentiments and identify common themes or issues raised by users.



#### **Define Your Business Objective?**

The objective of this analysis is to explore and analyze the Play Store apps data, along with customer reviews, in order to identify key factors that contribute to app engagement and success in the Android market. By examining attributes such as category, rating, and size, we aim to uncover actionable insights that can guide app-making businesses in capturing the Android market effectively.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure each chart follows the format below and answers the listed questions.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the visualization in a structured way while following the "UBM" rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import numpy as np               # Numerical computing library
import pandas as pd              # Data manipulation and analysis library
import matplotlib.pyplot as plt  # Data visualization library
import seaborn as sns            # Statistical data visualization library


### Dataset Loading

In [None]:
# Load Dataset
data = pd.read_csv("/content/Play Store Data.csv")
# review = pd.read_csv("")
data

In [None]:
from google.colab import drive
drive.mount('/content/drive')
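
The guidelines ask for exception handling so that the notebook runs in one go. A minimal sketch of a more defensive loading step might look like the following. The helper name `load_csv_safely` is hypothetical, and the path is the one used in the loading cell above; adjust both as needed.

```python
from pathlib import Path

import pandas as pd


def load_csv_safely(path):
    """Return a DataFrame, or None if the file is missing or cannot be parsed."""
    csv_path = Path(path)
    if not csv_path.is_file():
        print(f"File not found: {csv_path}")
        return None
    try:
        return pd.read_csv(csv_path)
    except (pd.errors.EmptyDataError, pd.errors.ParserError) as exc:
        print(f"Could not parse {csv_path}: {exc}")
        return None


# Hypothetical usage with the path from the loading cell
loaded = load_csv_safely("/content/Play Store Data.csv")
if loaded is not None:
    print(loaded.shape)
```

Returning `None` instead of raising keeps later cells from failing with an unrelated error, at the cost of requiring an explicit check before the data is used.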

### Dataset First View

In [None]:
# Dataset First Look
data.head(10)

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
rows_count, columns_count = data.shape

print(f"The dataset has {rows_count} rows and {columns_count} columns.")


### Dataset Information

In [None]:
# Dataset Info
data.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
duplicate_count = data.duplicated().sum()
print(f"Duplicate rows: {duplicate_count}")


#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
missing_count = data.isnull().sum()

print("Missing values count:")
print(missing_count)


In [None]:
# Visualizing the missing values
# Create a bar plot of missing values
plt.figure(figsize=(10, 6))
missing_count.plot(kind='bar')
plt.xlabel('Columns')
plt.ylabel('Missing Values Count')
plt.title('Missing Values by Column')
plt.show()

### What did you know about your dataset?

So far, we can observe the following:
*    The dataset has a total of 13 columns: 'App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', and 'Android Ver'.

*    The dataset contains 10,841 entries, ranging from index 0 to 10,840.

*    The missing values count shows the number of missing values in each column. The 'Rating' column has 1,474 missing values, 'Type' has 1 missing value, 'Content Rating' has 1 missing value, 'Current Ver' has 8 missing values, and 'Android Ver' has 3 missing values.

*    The data types of the columns include float64 for the 'Rating' column and object (string) for the rest of the columns.

With these observations in mind, we can apply various data exploration and analysis techniques, such as:

Data Cleaning: Handle missing values by either imputing them or removing rows/columns with excessive missing values.

Data Transformation: Convert relevant columns to their appropriate data types (e.g., convert 'Reviews' column to numeric if it contains numerical values stored as strings).

Descriptive Statistics: Calculate summary statistics (e.g., mean, median, mode) for numeric columns ('Rating', 'Reviews') to understand their distribution and central tendencies.

Data Visualization: Create visualizations (e.g., bar plots, scatter plots, histograms) to explore relationships between different variables and identify patterns or trends.

Feature Engineering: Extract additional features from existing columns that might be relevant for app engagement and success (e.g., extract year or month from 'Last Updated' column).

Correlation Analysis: Calculate correlations between different variables to identify relationships and dependencies (e.g., correlation between 'Rating' and 'Installs').
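
As a rough illustration of the data cleaning step listed above, the sketch below reports missing values as both counts and percentages, which helps decide between imputing and dropping. The helper name `summarize_missing` and the toy values are assumptions made for the example; they are not taken from the Play Store data.

```python
import numpy as np
import pandas as pd


def summarize_missing(df):
    """Return per-column missing counts and percentages, largest first."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    return summary.sort_values("missing", ascending=False)


# Toy frame standing in for the real data (values are made up)
toy = pd.DataFrame({
    "Rating": [4.1, np.nan, 3.9, np.nan],
    "Type": ["Free", "Paid", None, "Free"],
    "App": ["a", "b", "c", "d"],
})
missing_summary = summarize_missing(toy)
print(missing_summary)
```

On the real DataFrame the call would be `summarize_missing(data)`.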

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
print(data.columns)

# Index(['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type',
#        'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver',
#        'Android Ver'],
#       dtype='object')


In [None]:
# Dataset Describe
data.describe()


### Variables Description

Here is a description of the variables in the Play Store apps dataset:

App: The name of the mobile application.

Category: The category or genre to which the app belongs.

Rating: The user rating of the app (ranges from 1 to 5), representing the overall satisfaction level of users.

Reviews: The number of user reviews for the app.

Size: The size of the app in terms of storage space required.

Installs: The number of times the app has been installed by users.

Type: Whether the app is free or paid.

Price: The price of the app, if it is a paid app.

Content Rating: The age group for which the app is suitable, such as Everyone, Teen, Mature, etc.

Genres: Specific genres or sub-categories of the app.

Last Updated: The date when the app was last updated.

Current Ver: The version number of the current release of the app.

Android Ver: The minimum Android version required to run the app.
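
Since 'Size' is stored as text, it has to be converted before it can be used numerically. The sketch below assumes the values look like `'19M'`, `'201k'`, or a non-numeric label such as `'Varies with device'`; that format, the helper name `size_to_mb`, and the choice of 1024 kilobytes per megabyte are assumptions to verify against the unique values printed in the next cell.

```python
import numpy as np
import pandas as pd


def size_to_mb(value):
    """Convert strings such as '19M' or '512k' to megabytes; NaN otherwise."""
    if not isinstance(value, str):
        return np.nan
    text = value.strip()
    try:
        if text.endswith("M"):
            return float(text[:-1])
        if text.endswith("k"):
            return float(text[:-1]) / 1024  # assumed: 1024 k per M
    except ValueError:
        return np.nan
    return np.nan


# Toy values in the assumed format
sizes = pd.Series(["19M", "512k", "Varies with device", None])
sizes_mb = sizes.map(size_to_mb)
print(sizes_mb)
```

Anything that does not match the assumed pattern becomes NaN, so unexpected values show up as missing rather than raising an error.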

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for column in data.columns:
    unique_values = data[column].unique()
    print(f"Unique values for {column}: {unique_values}")

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.

# # Drop unnecessary columns
# data = data.drop(['Current Ver',
#        'Android Ver'], axis=1)

# Convert 'Reviews' column to numeric (non-numeric entries become NaN)
data['Reviews'] = pd.to_numeric(data['Reviews'], errors='coerce')

# Handle missing values: fill missing ratings with the median rating
data['Rating'] = data['Rating'].fillna(data['Rating'].median())

# Convert 'Price' column to numeric
# data['Price'] = data['Price'].str.replace('$', '').astype(float)

# Extract year from 'Last Updated' column (unparseable dates become NaN)
data['Last Updated'] = pd.to_datetime(data['Last Updated'], errors='coerce').dt.year

# Reset index
# data.reset_index(drop=True, inplace=True)

In [None]:
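# Some rows store 'Free' in 'Installs'; treat them as zero installs
data.loc[data['Installs'] == 'Free', 'Installs'] = '0'
# Strip ',' and '+' literally (regex=False) before converting to float
data['Installs'] = data['Installs'].str.replace(',', '', regex=False).str.replace('+', '', regex=False).astype(float)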

### What all manipulations have you done and insights you found?



Dropping unnecessary columns: The drop() call for the 'Current Ver' and 'Android Ver' columns is present but commented out, so both columns are still in the DataFrame.

Converting 'Reviews' column to numeric: The code uses pd.to_numeric() with errors='coerce' to convert the 'Reviews' column to a numeric data type; entries that cannot be parsed become NaN.

Handling missing values: The code fills missing values in the 'Rating' column with the median rating using fillna(). Rows with missing values in the 'Type' and 'Content Rating' columns have not been dropped.

Converting 'Price' column to numeric: The line that removes the dollar sign with str.replace() and converts the column with astype(float) is commented out, so 'Price' is still stored as text.

Extracting year from 'Last Updated' column: The code extracts the year from the 'Last Updated' column using pd.to_datetime() and the .dt.year attribute.

Converting 'Installs' column to numeric: The code replaces the stray 'Free' entry, removes the ',' and '+' characters with str.replace(), and converts the column to float.

Resetting the index: The reset_index() call is commented out, so the original index is kept.
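
The string-to-number conversions described above can be sketched as two small helpers. The names `clean_count` and `clean_price` and the toy rows are hypothetical. Unlike the notebook cell, this sketch turns unparseable entries such as `'Free'` into NaN rather than 0, which is a design choice and not the notebook's behaviour.

```python
import pandas as pd


def clean_count(series):
    """Strip ',' and '+' from strings like '10,000+' and convert to float."""
    cleaned = (
        series.astype(str)
        .str.replace(",", "", regex=False)
        .str.replace("+", "", regex=False)
    )
    return pd.to_numeric(cleaned, errors="coerce")


def clean_price(series):
    """Strip '$' from strings like '$4.99' and convert to float."""
    cleaned = series.astype(str).str.replace("$", "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce")


# Toy rows; the last one mimics a row whose values are in the wrong columns
toy = pd.DataFrame({
    "Installs": ["10,000+", "500+", "Free"],
    "Price": ["0", "$4.99", "Everyone"],
})
toy["Installs"] = clean_count(toy["Installs"])
toy["Price"] = clean_price(toy["Price"])
print(toy)
```

Passing `regex=False` matters because `+` and `$` are regular-expression metacharacters; treating them as literal text avoids version-dependent behaviour in `str.replace()`.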

## ***4. Data Vizualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code
# Visualize the relationship between 'Reviews' and 'Rating'
plt.scatter(data['Reviews'], data['Rating'])
plt.xlabel('Reviews')
plt.ylabel('Rating')
plt.title('Relationship between Rating and Reviews')
plt.ylim(0, 5)  # Set y-axis limit to 0-5 (Rating)
plt.show()

##### 1. Why did you pick the specific chart?




The scatter plot is chosen here because it allows us to observe the correlation or pattern between the 'Rating' and 'Reviews' variables. The x-axis represents the 'Reviews' values, the y-axis represents the 'Rating' values, and each point represents an app in the dataset.

##### 2. What is/are the insight(s) found from the chart?

The graph depicting the relationship between Rating and Reviews suggests that as the number of Reviews increases, the Rating tends to increase as well. Additionally, a large share of apps have a Rating of about 4.5, and a significant number of apps fall within the rating range of 4 to 5. Because missing ratings were filled with the median, part of the clustering at a single value is a side effect of that imputation.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The insights gained from the relationship between Rating and Reviews can potentially have a positive business impact. Here's why:

Positive Impact:

As the number of Reviews increases, indicating higher user engagement and feedback, the Rating tends to increase. This suggests that user reviews and ratings are closely linked to app success and user satisfaction.

The large share of apps with a Rating of about 4.5 suggests a high level of user satisfaction and positive feedback for those apps. This positive rating can help attract new users and retain existing ones.

The significant number of apps falling within the rating range of 4 to 5 indicates a positive perception of app quality among users, which can lead to increased downloads, positive word-of-mouth, and potentially higher revenue.

Negative Impact (if applicable):

From the given insights, no direct negative impact is evident. However, it is essential to consider other factors and conduct a more comprehensive analysis to identify negative growth patterns that this chart does not show. Factors such as negative user reviews, poor app performance, fierce competition, or lack of innovation can impact business growth negatively.
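
Review counts span several orders of magnitude, so a scatter plot alone can overstate or hide the trend. One way to put a number on the relationship is a rank-based (Spearman-style) correlation, computed here as the Pearson correlation of the ranks. The helper name `rank_correlation` and the toy values are assumptions for illustration only.

```python
import pandas as pd


def rank_correlation(df, x, y):
    """Spearman-style correlation: Pearson correlation of the ranks."""
    pair = df[[x, y]].dropna()
    return pair[x].rank().corr(pair[y].rank())


# Toy data with a perfectly monotonic relationship (values are made up)
toy = pd.DataFrame({
    "Reviews": [10, 200, 3_000, 45_000, 1_000_000],
    "Rating": [3.2, 3.9, 4.1, 4.4, 4.6],
})
rho = rank_correlation(toy, "Reviews", "Rating")
print(f"Rank correlation: {rho:.2f}")
```

On the real data, `rank_correlation(data, 'Reviews', 'Rating')` would give a single value between -1 and 1 that is not dominated by the few apps with millions of reviews.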

#### Chart - 2

In [None]:
# Chart - 2 visualization code
# Compare the frequency of apps in each 'Category'
# Count the number of apps in each category
category_counts = data['Category'].value_counts()

# Bar chart
plt.bar(category_counts.index, category_counts.values)
plt.xlabel('Category')
plt.ylabel('Count')
plt.title('App Count by Category')
plt.xticks(rotation=90)
plt.show()

##### 1. Why did you pick the specific chart?

The bar chart is chosen here because it allows us to compare the frequency or count of apps in each 'Category'. Each bar represents a category, and the height of the bar indicates the count of apps in that category.

##### 2. What is/are the insight(s) found from the chart?

Comparing the frequency of apps in each 'Category' shows that the 'Family' category has the highest frequency, while the 'Beauty' category has the lowest.
*     The higher frequency of apps in the 'Family' category may indicate strong demand for apps catering to family-related activities, such as parenting, education, entertainment, and games, although app count measures how many apps are published rather than how often they are installed. Developers and businesses can consider focusing on this category to target a larger user base and potentially capitalize on the popularity of family-oriented apps.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The gained insights from comparing the frequency of apps in each 'Category' can potentially create a positive business impact. Here's why:

Positive Business Impact:

Understanding market demand: Knowing that the 'Family' category has the highest frequency suggests a significant user demand for family-oriented apps. This insight can guide businesses and developers in targeting this popular category to cater to the needs of a large user base, potentially leading to increased downloads, user engagement, and revenue.

Identifying opportunities: While the 'Family' category may have high competition, it also indicates a thriving market with opportunities for innovation and differentiation. By analyzing the frequency distribution, businesses can identify gaps or underserved areas within the 'Family' category, enabling them to create unique and compelling apps that stand out in the market.

Negative growth could potentially occur due to reasons such as:

High competition in popular categories: The high frequency in certain categories, such as 'Family,' may also indicate intense competition. If businesses fail to differentiate their apps or provide added value to users, it could lead to challenges in standing out and achieving sustained growth.

Neglected categories with low frequency: Categories with lower frequency, such as 'Beauty,' may suggest a smaller user base or limited demand. Businesses entering such categories need to carefully assess market potential, user preferences, and competition to determine if the category can support their growth objectives.
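
App count per category describes how crowded a category is, not how much it is used. A minimal sketch that puts app count next to install figures is shown below; the toy values are made up and the output column names are hypothetical.

```python
import pandas as pd

# Toy data standing in for the cleaned Play Store frame (values are made up)
toy = pd.DataFrame({
    "Category": ["FAMILY", "FAMILY", "FAMILY", "GAME", "GAME", "BEAUTY"],
    "Installs": [1_000, 5_000, 10_000, 1_000_000, 500_000, 10_000],
})

# One row per category: number of apps, total installs, and median installs
category_summary = (
    toy.groupby("Category")["Installs"]
    .agg(app_count="count", total_installs="sum", median_installs="median")
    .sort_values("app_count", ascending=False)
)
print(category_summary)
```

In this toy example the category with the most apps is not the one with the most installs, which is the kind of gap worth checking on the real data before drawing conclusions about demand.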

#### Chart - 3

In [None]:
# Chart - 3 visualization code
# Group by 'Last Updated' year and calculate total installs
installs_by_year = data.groupby(data['Last Updated']).agg({'Installs': 'sum'})

# Line chart
plt.plot(installs_by_year.index, installs_by_year['Installs'])
plt.xlabel('Year')
plt.ylabel('Total Installs')
plt.title('Trend of Total Installs over Years')
plt.show()

##### 1. Why did you pick the specific chart?

The line chart is chosen here because it allows us to observe the trend in 'Installs' over different 'Last Updated' years. Each point on the line represents the total installs for a specific year, helping us identify any upward or downward trends over time.

##### 2. What is/are the insight(s) found from the chart?

The line chart, which shows total installs grouped by the year each app was last updated, indicates an increasing trend: apps updated in more recent years account for more installs. Since the x-axis is the last-update year rather than the installation date, this is a proxy for growth over time and not a direct measurement of it.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Business Impact:

Market growth opportunity: The insight that app installations are increasing over time suggests a growing market demand. This can be leveraged by businesses to target a larger user base, expand their customer reach, and potentially increase revenue through app downloads, in-app purchases, or advertising.

Potential for user engagement: With the increasing number of app installations, businesses have a greater opportunity to engage with users and build long-term relationships. By offering high-quality apps, delivering value, and providing excellent user experiences, businesses can drive user satisfaction, loyalty, and positive word-of-mouth, contributing to positive business impact.


Negative growth aspects:

Intensified competition: As app installations increase, the app market becomes more competitive. Businesses may face challenges in standing out from the crowd, capturing user attention, and acquiring a significant market share. Differentiation, innovation, and effective marketing strategies are crucial to overcome these challenges and achieve positive growth.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
# Compare the distribution of 'Rating' for each 'Category'
categories = data['Category'].unique()
plt.boxplot([data[data['Category'] == cat]['Rating'] for cat in categories],
            labels=categories, vert=True)  # vert=True draws vertical boxes (the default)
plt.ylim(0, 6)
plt.xlabel('Category')  # Set x-axis label
plt.ylabel('Rating')  # Set y-axis label
plt.title('Distribution of Rating by Category')
plt.xticks(rotation=90)
plt.show()


##### 1. Why did you pick the specific chart?

The box plot is chosen here because it allows us to compare the distribution of 'Rating' across different 'Category' groups. The boxes represent the interquartile range (IQR), the horizontal line inside the box represents the median, and the whiskers extend to the most extreme points within 1.5 times the IQR of the box, with points beyond them drawn as outliers. It helps us identify variations and outliers in the 'Rating' distribution for each category.

##### 2. What is/are the insight(s) found from the chart?

Consistent Median: The median 'Rating' across different categories is approximately 4.5. This suggests that apps in various categories typically receive similar ratings from users.

Art and Design Category: The 'Art and Design' category has the highest median 'Rating' among all categories. This indicates that, on average, apps in the 'Art and Design' category tend to receive higher ratings compared to other categories.

Video Editor Category: The 'Video Editor' category has the lowest median 'Rating' among all categories. This suggests that apps in the 'Video Editor' category receive comparatively lower ratings on average.

Range of Ratings: The range of 'Rating' values for the 'Video Editor' category appears to be the highest, indicating a wider spread of ratings for these apps. On the other hand, categories such as 'Business', 'Dating', and 'Sports' have smaller ranges, indicating a narrower spread of ratings for apps in these categories.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Positive Business Impact:

Improving user satisfaction: By understanding the consistent median rating of around 4.5 across categories, businesses can focus on improving app quality and user experience to maintain or exceed this average rating. This can result in increased user satisfaction, positive reviews, and recommendations, leading to a positive impact on user acquisition and retention.

Leveraging high-rated categories: The insight that the 'Art and Design' category has the highest median rating suggests a positive reception by users. Businesses can leverage this knowledge to invest in and capitalize on the popularity of this category, potentially attracting a larger user base and generating positive business impact.

Negative Growth Insights (if applicable):
While the insights from the box chart generally point towards positive growth, it's important to consider potential challenges or negative growth aspects:

Challenges in low-rated categories: The insight that the 'Video Editor' category has the lowest median rating and the highest rating range indicates potential challenges in this category. Businesses operating in this category may need to address user concerns, improve app functionality, and enhance user experience to overcome negative feedback and drive growth.

Competitiveness in high-rated categories: Although the 'Art and Design' category has the highest median rating, it may also imply higher competition and saturation. Businesses entering this category need to differentiate their apps, offer unique features, and continuously innovate to stand out and achieve sustained growth.

Addressing specific user needs: The differences in rating ranges across categories, such as 'Business', 'Dating', and 'Sports', indicate varying user expectations and preferences. Businesses operating in these categories need to thoroughly understand their target audience, align their app features with user needs, and continuously adapt to evolving user demands to achieve positive growth.
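
Medians and spreads are hard to read off a box plot with many categories. A hedged sketch of a summary table with the median, IQR, and app count per category follows; `rating_summary` is a hypothetical helper and the toy ratings are made up.

```python
import pandas as pd


def rating_summary(df):
    """Median, IQR, and app count of 'Rating' for each 'Category'."""
    grouped = df.groupby("Category")["Rating"]
    summary = pd.DataFrame({
        "median": grouped.median(),
        "iqr": grouped.quantile(0.75) - grouped.quantile(0.25),
        "count": grouped.count(),
    })
    return summary.sort_values("median", ascending=False)


# Toy data: category A is rated higher and more consistently than B
toy = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "B"],
    "Rating": [4.0, 4.5, 5.0, 3.0, 4.0, 5.0],
})
summary = rating_summary(toy)
print(summary)
```

Including the count column helps avoid over-reading categories that contain only a handful of apps.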

#### Chart - 5

In [None]:
# Chart - 5 visualization code

# Calculate the count of each Content Rating category
content_rating_counts = data['Content Rating'].value_counts()

# Create a pie chart
plt.pie(content_rating_counts, labels=content_rating_counts.index, autopct='%1.1f%%')
plt.title('Content Rating Distribution')

# Adjust the aspect ratio to create a circle
plt.axis('equal')

# Display the chart
plt.show()

##### 1. Why did you pick the specific chart?

Displaying proportions: A pie chart is an effective way to display proportions and relative sizes of different categories. Each slice of the pie represents a Content Rating category, and the size of each slice corresponds to the proportion of apps belonging to that category.



##### 2. What is/are the insight(s) found from the chart?

The insights gained from the pie chart representing the distribution of Content Rating categories are as follows:

Majority in "Everyone" category: The chart shows that the "Everyone" category has the highest proportion, accounting for 84% of the apps in the dataset. This indicates that a significant majority of the apps are suitable for users of all age groups.

Significant presence of "Teen" category: The "Teen" category represents 11.1% of the apps. This suggests that a notable portion of the apps is specifically targeted towards teenage users, catering to their preferences and requirements.

Moderate representation of "Mature" category: The "Mature" category holds a 4.8% proportion in the distribution. This suggests the presence of a moderate number of apps designed for mature audiences, indicating a segment of the market that caters to older users with more mature content or themes.

Limited apps in the "18+" category: The "18+" category accounts for only 3.8% of the apps in the dataset. This indicates a relatively smaller number of apps with content specifically intended for adult audiences, which may involve explicit or restricted content.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.




Positive Business Impact:

Broad target audience: The insight that the majority of apps fall under the "Everyone" category suggests a broad target audience, potentially increasing the market size and user reach. Businesses can benefit from targeting a wide range of users, as it opens up opportunities for app adoption and revenue generation.

Focus on teen users: The significant presence of the "Teen" category indicates a sizable market segment. Businesses that cater to the preferences and needs of teenagers can leverage this insight to develop and market apps tailored specifically for this demographic, potentially leading to increased downloads, user engagement, and monetization opportunities.

Negative Growth Insights (if applicable):
While the insights from the Content Rating distribution are generally positive, it's important to consider potential challenges or negative growth aspects:

Limited market for mature and 18+ apps: The relatively lower proportions of the "Mature" and "18+" categories suggest a narrower target audience for apps with mature or adult-oriented content. Businesses focusing solely on these categories may face limitations in terms of market size and potential user reach.

Regulatory and platform restrictions: App stores and platforms often enforce strict guidelines and restrictions regarding content ratings, especially for mature or explicit content. Businesses operating in the "Mature" or "18+" categories may face increased scrutiny, content moderation challenges, and limited visibility due to these regulations.

Need for careful content curation: With the presence of different content rating categories, businesses must carefully curate and align their app content with the specified ratings. Failure to accurately categorize or misrepresent the content could result in negative user experiences, loss of trust, and potential backlash.
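
The percentages quoted for each content rating can be checked directly instead of being read off the pie chart. The sketch below computes the shares with `value_counts(normalize=True)`; the labels and counts in the toy series are made up for illustration.

```python
import pandas as pd

# Toy series standing in for data['Content Rating'] (labels and counts are made up)
toy = pd.Series(
    ["Everyone"] * 8 + ["Teen"] * 1 + ["Mature 17+"] * 1,
    name="Content Rating",
)

# Share of each label in percent; the shares should add up to about 100
shares = (toy.value_counts(normalize=True) * 100).round(1)
print(shares)
print(f"Total: {shares.sum():.1f}%")
```

Printing the total is a quick sanity check that the slices were labelled and summed correctly.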

#### Chart - 6

In [None]:
# Chart - 6 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 7

In [None]:
# Chart - 7 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 8

In [None]:
# Chart - 8 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 9

In [None]:
# Chart - 9 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 10

In [None]:
# Chart - 10 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 11

In [None]:
# Chart - 11 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 12

In [None]:
# Chart - 12 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 13

In [None]:
# Chart - 13 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code
# Select the columns of interest (copy to avoid modifying `data`)
columns_of_interest = ['Installs', 'Price', 'Rating']
df_subset = data[columns_of_interest].copy()

# Convert 'Price' column to numeric: strip the '$' sign first, then coerce
df_subset['Price'] = pd.to_numeric(
    df_subset['Price'].astype(str).str.replace('$', '', regex=False), errors='coerce')

# Calculate the correlation matrix
corr_matrix = df_subset.corr()

# Generate the heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap: Installs, Price, and Rating')
plt.show()





##### 1. Why did you pick the specific chart?

The heatmap is chosen to visualize the pairwise correlations between the numeric variables 'Installs', 'Price', and 'Rating'. A heatmap is an effective way to represent the correlation matrix, where the values are color-coded to provide a quick and intuitive understanding of the relationships between variables.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code
# Select the numeric columns for the pair plot
# (pairplot draws numeric variables only, so text columns such as 'App' are left out)
columns_of_interest = ['Rating', 'Reviews', 'Installs', 'Last Updated']

# Subset the DataFrame with the selected columns
df_subset = data[columns_of_interest]

# Plot the pair plot
sns.pairplot(df_subset)
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

Focus on High-Rated Categories: Given the insight that the 'Art and Design' category has the highest median rating, the client can consider investing resources and efforts in developing and promoting apps in this category. By leveraging the positive reception and popularity of this category, they can potentially attract a larger user base and increase user satisfaction.

Improve User Engagement and Retention: To foster positive business impact, the client should prioritize user engagement and retention. This can be achieved by continuously improving the quality and user experience of their apps. Regular updates, addressing user feedback, and offering additional features can help keep users engaged and increase their likelihood of staying active and loyal.

Capitalize on Increasing App Installations: Since the trend analysis reveals an increasing trend in app installations over the years, the client should seize this opportunity to capitalize on the growing market demand. They can invest in targeted marketing and promotional strategies to reach a wider audience, improve app visibility, and attract more users.

Address Negative Growth Areas: While the insights generally suggest positive growth, the client should also pay attention to potential negative growth areas. For instance, if the 'Video Editor' category has a lower median rating and limited growth potential, the client can evaluate ways to enhance the quality, features, and user experience of their video editing apps to overcome negative feedback and drive growth.

Continuously Monitor and Adapt: The app market is dynamic and constantly evolving. To stay ahead and achieve business objectives, the client should monitor industry trends, user preferences, and competitor activities. They should be open to adapting their strategies, incorporating user feedback, and exploring new opportunities to maintain a competitive edge and meet evolving user needs.

# **Conclusion**

*    Insights and Recommendations:

Increasing App Installations: The trend analysis indicates a positive growth in app installations over the years. Businesses should focus on leveraging this growing market demand and targeting wider user audiences.

Category Analysis: Different app categories have varying levels of popularity and user engagement. Businesses can benefit from understanding category preferences and investing resources accordingly.

Content Rating Distribution: The distribution of content ratings helps businesses align their apps with appropriate age groups, ensuring compliance and user suitability.


Rating and Reviews Relationship: The scatter plot suggests a positive association between the number of reviews and the app rating. Encouraging user reviews and feedback may therefore go along with higher ratings and increased app success, although the chart alone does not show which drives the other.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***