Sugarfit Google Play Store Review Analysis and Power BI Reporting

Contents📖

Introduction
Data Scraping and Sentiment Analysis
Data Cleaning and Exploration
Data Analysis
Data Visualization
Findings

Introduction

Mobile applications heavily rely on user feedback to refine their offerings and ensure customer satisfaction. The Sugar.fit Android app, aimed at addressing diabetes-related concerns, stands to benefit greatly from an in-depth analysis of user reviews sourced from the Google Play Store. This project endeavors to leverage advanced analytics techniques to dissect these reviews, extracting valuable insights that can inform strategic decisions and drive improvements in app functionality and user experience.

Feel free to reach out for any questions or suggestions about this project. I'm open to discussions and eager to assist. Linkedln | Mariya Joseph

Don't forget to follow and star ⭐ the repository if you find it valuable.

Database:PostgreSQL
Programming Language: Python
Libraries: Pandas, Numpy, tensorflow
IDE: Microsoft Azure Data Studio

Data Scraping & Sentiment Analysis📊

Programming Language: Python
Libraries: Pandas, Numpy, Tensorflow
IDE: Microsoft Azure Data Studio

Import Required Libraries

from google_play_scraper import app, Sort ,reviews_all
from app_store_scraper import AppStore
import pandas as pd 
import numpy as np 
import json,os,uuid

Fetching Google Play Store Reviews for Sugarfit Android app

g_reviews=reviews_all(
    'fit.sugar.android',
    sleep_milliseconds=0,
    lang='en',
    country='us',
    sort=Sort.NEWEST,
)

Transforming and Cleaning Google Play Store Reviews Data for Analysis

g_df=pd.DataFrame(np.array(g_reviews),columns=['review'])
g_df2=g_df.join(pd.DataFrame(g_df.pop('review').tolist()))

g_df2.drop(columns={'userImage','reviewCreatedVersion'},inplace=True)
g_df2.rename(columns={'reviewId':'ID','userName':'Username','content':'Review','score':'AppRating','thumbsUpCount':'ThumbsUpCount','at':'ReviewTime','replyContent':'CompanyReply','repliedAt':'ReplyTime','appVersion':'AppVersion'},inplace=True)
g_df2

Calculating the Mean App Rating of Sugar.fit Android App Reviews

g_df2['AppRating'].mean()

Installing TensorFlow

!pip install tensorflow

import tensorflow as tf

!pip install ipykernel

import tensorflow as tf
print(tf.__version__)

Importing and Configuring Sentiment Analysis Pipeline

from transformers import pipeline
sentiment_analysis = pipeline("sentiment-analysis",model="siebert/sentiment-roberta-large-english")

print(sentiment_analysis("I love this!"))

Checking Data Types of DataFrame Columns

g_df2.dtypes

Converting Review Column to String Data Type

g_df2['Review']=g_df2['Review'].astype('str')

Applying Sentiment Analysis on Review Column

g_df2['result']=g_df2['Review'].apply(lambda x: sentiment_analysis(x))

g_df2.head()

Extracting Sentiment Label and Score from Result Column

g_df2['sentiment']=g_df2['result'].apply(lambda x: (x[0]['label']))
g_df2['score']=g_df2['result'].apply(lambda x: (x[0]['score']))

g_df2.head()

Calculating the Mean Sentiment Score

g_df2['score'].mean()

g_df2['sentiment'].value_counts()

Calculating Normalized Sentiment Distribution

g_df2['sentiment'].value_counts(normalize=True)

Visualizing Sentiment Distribution with Plotly Express

import plotly.express as px
fig=px.histogram(g_df2,x='sentiment',color='sentiment', text_auto=True)
fig.show()

Exporting DataFrame to CSV File and Reading it Back

g_df2.to_csv("C:\\Users\\hp\\Desktop\\newfile.CSV")
g_df2=pd.read_csv("C:\\Users\\hp\\Desktop\\newfile.csv")
g_df2

Data Cleaning and Exploration🧹

Tools Used🛠️:Microsoft Excel

Deleted unwanted columns that is not required for this analysis
Checked and formatted the cells with proper datatypes

Missing Values in each column

used filter function in excel to identify missing/null values

conditional foramtting to identify and highlight the missing values

Removing the duplicates

Handling the missing values by using find & select inbuilt function in excel

Replaced the blank space with NULL for the column that is TEXT Datatype
Replaced the blank space with a default date value for the column that is DATE Datatype

Data Analysis📈

Tools Used⚙️:PostgreSQL

Creating and importing dataset to postgreSQL

CREATE TABLE SF1 (
  ID VARCHAR(50),
  Username VARCHAR(50),
  Review VARCHAR(5000),
  AppRating INT,
  ThumbsUpCount INT,
  ReviewTime DATE,
  CompanyReply VARCHAR(5000),
  ReplyTime DATE,
  sentiment VARCHAR(10),
  score NUMERIC );

Selecting and viewing the dataset

SELECT * FROM SF1

Total Number of reviews

SELECT COUNT(*) AS total_reviews
FROM sf1;

Total number of positive reviews

SELECT COUNT(*)
FROM SF1
WHERE sentiment='POSITIVE';

Total number of negative reviews

SELECT COUNT(*)
FROM SF1
WHERE sentiment='NEGATIVE';

Average App Rating

SELECT AVG(AppRating) AS average_rating
FROM sf1

Review with highest thump count

SELECT Review, ThumbsUpCount
FROM sf1
ORDER BY ThumbsUpCount DESC
LIMIT 1;

Users with high number of reviews

SELECT Username, COUNT(*) AS review_count
FROM sf1
GROUP BY Username
ORDER BY review_count DESC
LIMIT 5;

Average Sentiment Score

SELECT AVG(score) AS average_sentiment_score
FROM sf1;

Count of User Reviews that are not replied by company

SELECT COUNT(*)AS not_replied
FROM SF1
WHERE companyreply='NULL';

Total number of reviews that are replied by company

SELECT id,review
FROM SF1
WHERE companyreply='NULL';

Reviews that are not replied by company

SELECT COUNT(*)AS replied
FROM sf1
WHERE id NOT IN
(SELECT id
FROM SF1
WHERE companyreply='NULL');

Reviews with sentiment negative to analyze the issues

SELECT review
FROM sf1
WHERE sentiment = 'NEGATIVE';

Distribution of ratngs and it's total count

SELECT AppRating, COUNT(*) AS rating_count
FROM sf1
GROUP BY AppRating
ORDER BY rating_count DESC;

Reviews with the text like 'Challenges'

SELECT id,review
FROM sf1
WHERE Review LIKE '%challenges%';

Total number of reviews per year

SELECT EXTRACT(YEAR FROM ReviewTime) AS review_year, COUNT(*) AS review_count
FROM sf1
GROUP BY review_year
ORDER BY review_year;

Reviews with the text like 'Personalized diet'

SELECT id,review
FROM sf1
WHERE Review LIKE '%personalized diet%';

Latest 5 reviews and date

SELECT review,reviewtime
FROM sf1
ORDER BY ReviewTime DESC
LIMIT 5;

------------------------------------------------------------------------------------------------------

Data Visualization

Tools Used⚙️:Microsoft Power BI

Findings

Based on the analysis conducted in the project, the following findings and suggestions can be derived:

Understanding User Issues: By analyzing the negative reviews, we gained insights into user concerns and issues. This enables us to address them more effectively, leading to improved user satisfaction.
Identifying Review Trends: The project helped in identifying trends in user reviews, such as common topics mentioned in negative feedback or frequently praised aspects in positive reviews. This information can guide product development and marketing strategies.
Enhancing Customer Trust: Responding promptly to customer reviews demonstrates attentiveness and care towards users' experiences. It fosters trust and loyalty among customers, showing them that their feedback is valued and acted upon.
Improving Customer Engagement: Engaging with customers through timely replies to reviews creates a positive interaction and strengthens the relationship between the company and its users. It encourages ongoing dialogue and fosters a sense of community around the product.
Leveraging Feedback for Improvement: Analyzing user feedback provides valuable insights for product improvement and feature development. By addressing user concerns and implementing suggestions, the company can enhance the overall user experience and stay competitive in the market.
Understanding Pain Points: Negative reviews often highlight areas where users are facing challenges or experiencing dissatisfaction with the product or service. By analyzing these reviews, we can pinpoint specific pain points and areas for improvement.
Proactive Issue Resolution: Identifying and addressing negative feedback promptly demonstrates a proactive approach to problem-solving. By acknowledging user concerns and taking steps to resolve issues, the company can prevent further escalation and mitigate potential damage to its reputation.
Opportunity for Improvement: Negative reviews present valuable opportunities for improvement. By listening to user feedback and incorporating suggestions for enhancement, the company can iteratively improve its products or services, ultimately leading to a better user experience and increased customer satisfaction.
Building Trust and Credibility: Transparently addressing negative feedback shows users that their concerns are taken seriously and that the company is committed to delivering a high-quality product or service. This builds trust and credibility with users, fostering stronger relationships and brand loyalty over time.
Continuous Feedback Loop: By monitoring and analyzing negative reviews on an ongoing basis, the company can establish a continuous feedback loop for product improvement. This allows for agile and responsive development, ensuring that user needs and preferences are always top of mind.

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
Data Analysis.md		Data Analysis.md
Data Cleaning and Exploration.md		Data Cleaning and Exploration.md
Data Scraping and Sentimental Analysis.md		Data Scraping and Sentimental Analysis.md
Data Visualization.md		Data Visualization.md
Findings.md		Findings.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Data Analysis.md

Data Analysis.md

Data Cleaning and Exploration.md

Data Cleaning and Exploration.md

Data Scraping and Sentimental Analysis.md

Data Scraping and Sentimental Analysis.md

Data Visualization.md

Data Visualization.md

Findings.md

Findings.md

README.md

README.md

Repository files navigation

Sugarfit Google Play Store Review Analysis and Power BI Reporting

Contents📖

Introduction

Data Scraping & Sentiment Analysis📊

Data Cleaning and Exploration🧹

Data Analysis📈

Data Visualization

Findings

About

Releases

Packages

Mariyajoseph24/SugarFit_Google_Play_Store_Review_Analysis_and_Power_BI_Reporting

Folders and files

Latest commit

History

Repository files navigation

Sugarfit Google Play Store Review Analysis and Power BI Reporting

Contents📖

About

Topics

Resources

Stars

Watchers

Forks