# Google Play Store Data Analysis
## Internship Project
This project analyzes Google Play Store data to uncover app market trends, user sentiment, and pricing strategies.

## 1. Data Loading & Cleaning

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load datasets
apps_df = pd.read_csv('apps.csv')
reviews_df = pd.read_csv('user_reviews.csv')

# Display first few rows
apps_df.head()
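
Before any cleaning, it can help to see how much data is missing in each column, since the later `dropna()` call discards every row with at least one gap. The sketch below uses a small made-up frame (the rows are hypothetical, not taken from `apps.csv`); in the notebook the same calls would be applied to `apps_df`.

```python
import pandas as pd

# Hypothetical sample rows standing in for apps_df
sample = pd.DataFrame({
    'App': ['A', 'B', 'C'],
    'Rating': [4.1, None, 3.9],
    'Size': [None, None, 12.0],
})

# Missing values per column, largest first
missing = sample.isna().sum().sort_values(ascending=False)
print(missing)

# Rows that would survive a blanket dropna()
rows_kept = len(sample.dropna())
print(f"{rows_kept} of {len(sample)} rows have no missing values")
```

If one column accounts for most of the gaps, dropping or imputing only that column may preserve more rows than a blanket `dropna()`.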

## 2. Handling Missing Values & Data Type Correction

In [None]:
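# Drop the leftover index column (ignored if it is not present)
apps_df.drop(columns=['Unnamed: 0'], inplace=True, errors='ignore')

# Convert Installs to numeric; regex=False treats ',' and '+' as literal characters
apps_df['Installs'] = (
    apps_df['Installs']
    .str.replace(',', '', regex=False)
    .str.replace('+', '', regex=False)
    .astype(float)
)

# Convert Price to numeric; regex=False keeps '$' from being read as a regex anchor
apps_df['Price'] = apps_df['Price'].str.replace('$', '', regex=False).astype(float)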

# Remove rows with missing values
apps_df.dropna(inplace=True)
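
The conversions above assume every value is a well-formed string. As a hedged alternative, the helpers below (`parse_installs` and `parse_price` are hypothetical names, not part of the original notebook) turn malformed entries into NaN instead of raising, so they can be dropped along with the other missing values.

```python
import math

def _to_float(raw, strip_chars):
    """Strip the given characters and convert to float; NaN if that fails."""
    if isinstance(raw, (int, float)):
        return float(raw)
    if not isinstance(raw, str):
        return math.nan
    cleaned = raw
    for ch in strip_chars:
        cleaned = cleaned.replace(ch, '')
    try:
        return float(cleaned.strip())
    except ValueError:
        return math.nan

def parse_installs(raw):
    """Convert an install count such as '1,000,000+' to a float."""
    return _to_float(raw, ',+')

def parse_price(raw):
    """Convert a price such as '$4.99' to a float."""
    return _to_float(raw, '$')

# Hypothetical sample values, not taken from apps.csv
print(parse_installs('1,000,000+'))  # 1000000.0
print(parse_price('$4.99'))          # 4.99
print(parse_installs('Free'))        # nan
```

In the notebook these could be applied with `apps_df['Installs'].map(parse_installs)` and `apps_df['Price'].map(parse_price)` before the `dropna()` step.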

## 3. Category Exploration - App Distribution

In [None]:
# Countplot of apps by category
plt.figure(figsize=(12,6))
sns.countplot(y=apps_df['Category'], order=apps_df['Category'].value_counts().index, palette='viridis')
plt.xlabel('Number of Apps')
plt.ylabel('Category')
plt.title('App Distribution by Category')
plt.show()
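
The count plot shows relative bar lengths; a small table of counts and percentage shares makes the same distribution easier to quote. A minimal sketch, using a made-up list of categories in place of `apps_df['Category']`:

```python
import pandas as pd

# Hypothetical sample; in the notebook this would be apps_df['Category']
categories = pd.Series(['FAMILY', 'GAME', 'FAMILY', 'TOOLS', 'FAMILY', 'GAME'])

# Apps per category, most common first
counts = categories.value_counts()

# Each category's share of all apps, in percent
shares = (counts / counts.sum() * 100).round(1)

summary = pd.DataFrame({'apps': counts, 'share_pct': shares})
print(summary)
```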

## 4. Metrics Analysis - Ratings, Size, and Pricing Trends

In [None]:
# Scatter plot of App Size vs Ratings
plt.figure(figsize=(8,5))
sns.scatterplot(x=apps_df['Size'], y=apps_df['Rating'], alpha=0.5)
plt.xlabel('App Size (MB)')
plt.ylabel('Rating')
plt.title('App Size vs Rating')
plt.show()
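
A scatter plot alone makes it hard to judge how strong the size/rating relationship is. One way to put a number on it, sketched here with made-up values rather than the real `apps_df`, is a Pearson correlation plus a rank-based (Spearman) correlation, which is less sensitive to a few very large apps. The Spearman value is computed as the Pearson correlation of the ranks.

```python
import pandas as pd

# Hypothetical sample; the notebook would use apps_df[['Size', 'Rating']]
sample = pd.DataFrame({
    'Size': [5.0, 12.0, 25.0, 48.0, 97.0],
    'Rating': [4.1, 4.3, 4.0, 4.5, 4.4],
})

# Linear association between size and rating
pearson = sample['Size'].corr(sample['Rating'])

# Monotonic association: Pearson correlation of the ranked values
spearman = sample['Size'].rank().corr(sample['Rating'].rank())

print(f"Pearson:  {pearson:.2f}")
print(f"Spearman: {spearman:.2f}")
```

Values near zero would suggest that app size tells little about rating; either way, a correlation does not establish that size causes the rating.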

## 5. Sentiment Analysis on User Reviews

In [None]:
# Sentiment distribution in user reviews
plt.figure(figsize=(6,4))
sns.countplot(x=reviews_df['Sentiment'], palette='coolwarm')
plt.title('Sentiment Distribution in Reviews')
plt.show()
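
Review datasets often contain rows with no sentiment label, and the count plot silently leaves them out. The sketch below, on a made-up `Sentiment` column, reports how many labels are missing and the share of each sentiment among the labelled reviews.

```python
import pandas as pd

# Hypothetical sample; the notebook would use reviews_df['Sentiment']
reviews = pd.DataFrame({
    'Sentiment': ['Positive', 'Negative', 'Positive', None, 'Neutral', 'Positive'],
})

# Reviews without a sentiment label
n_missing = reviews['Sentiment'].isna().sum()

# Share of each sentiment among labelled reviews only
shares = reviews['Sentiment'].dropna().value_counts(normalize=True)

print(f"Unlabelled reviews: {n_missing}")
print(shares)
```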

## 6. Visualization - Sentiment Distribution for Top Apps

In [None]:
# Top 10 Apps with Most Reviews
top_apps = reviews_df['App'].value_counts().head(10).index
top_reviews = reviews_df[reviews_df['App'].isin(top_apps)]

# Sentiment distribution for top apps
plt.figure(figsize=(12,6))
sns.countplot(x='App', hue='Sentiment', data=top_reviews, palette='coolwarm')
plt.xticks(rotation=45, ha='right')  # right-align rotated labels under their bars
plt.title('Sentiment Distribution for Top 10 Apps')
plt.show()
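
Raw counts favor apps with more reviews, so comparing apps is easier with per-app proportions. A minimal sketch using `pd.crosstab` with row normalization, on made-up review rows (the app names `A` and `B` are placeholders):

```python
import pandas as pd

# Hypothetical sample; the notebook would use top_reviews
reviews = pd.DataFrame({
    'App': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Sentiment': ['Positive', 'Positive', 'Negative', 'Neutral',
                  'Negative', 'Negative'],
})

# Each row sums to 1: the sentiment mix within a single app
share = pd.crosstab(reviews['App'], reviews['Sentiment'], normalize='index')
print(share.round(2))
```

Sorting this table by the `Negative` column would surface the apps with the largest share of unhappy reviewers.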

## 7. Insights & Business Impact
- **Understand Market Trends**: Identify which app categories contain the most apps.
- **Optimize Pricing Strategies**: Evaluate how pricing relates to installs and ratings.
- **Improve User Engagement**: Use review sentiment to find where app quality needs work.
- **Support Data-Driven Decisions**: Give developers evidence for building better, more competitive apps.
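
The pricing point above can be explored with a simple free-versus-paid comparison. The sketch below assumes the cleaned `Price`, `Installs`, and `Rating` columns from section 2; the `pricing` column and the sample rows are hypothetical and only illustrate the grouping.

```python
import pandas as pd

# Hypothetical sample; the notebook would use the cleaned apps_df
apps = pd.DataFrame({
    'Price': [0.0, 0.0, 0.0, 2.99, 4.99],
    'Installs': [1_000_000, 500_000, 10_000, 50_000, 1_000],
    'Rating': [4.2, 4.0, 3.8, 4.5, 4.6],
})

# Label each app as Free or Paid based on its price
apps['pricing'] = apps['Price'].gt(0).map({True: 'Paid', False: 'Free'})

# Median installs is used because install counts are heavily skewed
summary = apps.groupby('pricing').agg(
    apps=('Price', 'count'),
    median_installs=('Installs', 'median'),
    mean_rating=('Rating', 'mean'),
)
print(summary)
```

Any difference seen this way is descriptive only; category and app age would need to be controlled for before drawing pricing conclusions.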