# Apple Store Apps Analysis

## Introduction
Welcome! This portfolio project analyzes Apple App Store data to uncover insights about app ratings, pricing, and genre trends. It demonstrates end-to-end data science skills: data cleaning, exploratory analysis, visualization, and storytelling. Key business takeaways are summarized at the end.

## Data Loading

We load the dataset directly from the repository. Make sure `Applestore.csv` is present in your project directory.

In [ ]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="whitegrid")  # set_theme is the current name; sns.set is an older alias

# Load data; the path is resolved relative to the current working directory
data = pd.read_csv('Applestore.csv')
data.head()
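
A relative path such as `'Applestore.csv'` is resolved against the current working directory, so the load fails if the notebook is started from a different folder. The sketch below wraps the load in a hypothetical helper, `load_apps` (not part of the original notebook), that fails early with a more readable message when the file is missing.

```python
from pathlib import Path

import pandas as pd


def load_apps(csv_path="Applestore.csv"):
    """Load the app dataset, failing early with a readable message.

    `load_apps` is a hypothetical helper for illustration; the default
    file name matches the one used in this notebook.
    """
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Could not find '{path}' relative to {Path.cwd()}. "
            "Place the CSV in the project directory or pass its full path."
        )
    return pd.read_csv(path)
```

With the file in place, `data = load_apps()` would behave like the `pd.read_csv` call above.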

## Data Overview & Cleaning

Let's inspect the structure and cleanliness of the data.

In [ ]:
# Basic info
print(f"Rows: {data.shape[0]}, Columns: {data.shape[1]}")
data.info()

In [ ]:
# Check for missing values
missing = data.isnull().sum()
print('Missing values per column:')
print(missing)
assert missing.sum() == 0, "Dataset contains missing values!"

No missing values detected. The dataset is ready for analysis.
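
A null check does not catch values that are present but implausible. As a complementary sketch, the hypothetical helper below counts out-of-range entries in `user_rating` and `price`. It assumes ratings lie on a 0 to 5 scale and that prices are never negative; both are assumptions about the dataset, not facts established by this notebook.

```python
import pandas as pd


def out_of_range_counts(df, rating_max=5.0):
    """Count rows whose values fall outside the assumed plausible ranges.

    Assumes ratings lie in [0, rating_max] and prices are non-negative.
    """
    return pd.Series({
        "user_rating": int((~df["user_rating"].between(0, rating_max)).sum()),
        "price": int((df["price"] < 0).sum()),
    })


# Toy rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({"user_rating": [4.5, 0.0, 6.0], "price": [0.0, 2.99, -1.0]})
print(out_of_range_counts(toy))
```

On the real data, `out_of_range_counts(data)` returning all zeros would support treating the dataset as ready for analysis.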

In [ ]:
# Unique values per column
data.nunique()

## Exploratory Data Analysis (EDA)

Let's explore key features and trends using summary statistics and visualizations.

In [ ]:
# Summary statistics
data.describe(include='all')

### Distribution of App Ratings

In [ ]:
plt.figure(figsize=(10, 5))
sns.histplot(data['user_rating'], bins=30, kde=True, color='royalblue')
plt.title('Distribution of User Ratings')
plt.xlabel('User Rating')
plt.ylabel('Frequency')
plt.show()

*Interpretation:* Most apps have user ratings between 3.5 and 5.0, indicating generally positive user experiences.
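
The range stated above can be quantified rather than read off the histogram. The sketch below assumes that a `rating_count_tot` of 0 marks an app nobody has rated yet (an assumption about the dataset that the notebook does not verify) and reports the share of rated apps inside a given band. `share_in_band` is a hypothetical helper.

```python
import pandas as pd


def share_in_band(df, low=3.5, high=5.0):
    """Return (share of rated apps with low <= user_rating <= high, n unrated)."""
    # Assumption: zero total ratings means the app has not been rated.
    rated = df[df["rating_count_tot"] > 0]
    n_unrated = len(df) - len(rated)
    share = rated["user_rating"].between(low, high).mean()
    return share, n_unrated


# Toy rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "user_rating": [4.5, 4.0, 3.0, 0.0],
    "rating_count_tot": [120, 8, 15, 0],
})
share, n_unrated = share_in_band(toy)
print(f"{share:.0%} of rated apps fall in the band; {n_unrated} app(s) unrated")
```

Calling `share_in_band(data)` would give a number to put next to the word "most" in the interpretation.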

### Average Rating by Genre

In [ ]:
avg_rating_by_genre = data.groupby('prime_genre')['user_rating'].mean().sort_values(ascending=False)
plt.figure(figsize=(12,6))
avg_rating_by_genre.plot(kind='bar', color='teal')
plt.title('Average User Rating by App Genre')
plt.xlabel('Genre')
plt.ylabel('Average Rating')
plt.xticks(rotation=45, ha='right')
plt.show()

*Insight:* Productivity and Music genres have the highest average ratings. Utilities and Entertainment genres are also well-received.
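
A genre mean is easier to trust when it is shown next to the number of apps behind it. The sketch below is one possible way to report both; `rating_summary_by_genre` and its `min_apps` cutoff are hypothetical and not part of the original analysis.

```python
import pandas as pd


def rating_summary_by_genre(df, min_apps=1):
    """Mean rating and app count per genre, keeping genres with >= min_apps apps."""
    summary = (
        df.groupby("prime_genre")["user_rating"]
        .agg(mean_rating="mean", n_apps="count")
        .sort_values("mean_rating", ascending=False)
    )
    return summary[summary["n_apps"] >= min_apps]


# Toy rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "prime_genre": ["Music", "Music", "Games", "Games", "Games", "Catalogs"],
    "user_rating": [4.5, 4.0, 3.5, 4.0, 4.5, 5.0],
})
print(rating_summary_by_genre(toy, min_apps=2))
```

In the toy data, the single-app genre has the highest mean but is dropped by the cutoff, which is the kind of small-sample effect the count column is meant to expose.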

### Top 10 Highest-Rated Apps

In [ ]:
top_rated = data.nlargest(10, 'user_rating')[['track_name', 'user_rating']]
top_rated

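*Observation:* All ten apps hold the maximum rating. Because more than ten apps can tie at that value, `nlargest` keeps the first ten in row order, so this list is a sample of perfectly rated apps rather than a definitive ranking.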
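
When many apps share the top rating, one option is to break ties by the number of ratings, so that a top score backed by more reviews ranks first. The sketch below shows this with `nlargest` on two columns; the helper name `top_rated_with_tiebreak` is hypothetical, and the tie-break rule is a suggestion rather than the method used above.

```python
import pandas as pd


def top_rated_with_tiebreak(df, n=10):
    """Top-n apps by rating, breaking ties by total number of ratings."""
    cols = ["track_name", "user_rating", "rating_count_tot"]
    return df.nlargest(n, ["user_rating", "rating_count_tot"])[cols]


# Toy rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "track_name": ["A", "B", "C", "D"],
    "user_rating": [5.0, 5.0, 4.5, 5.0],
    "rating_count_tot": [3, 900, 5000, 40],
})
print(top_rated_with_tiebreak(toy, n=2))
```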

### Apps with the Most Ratings

In [ ]:
most_rated_apps = data.nlargest(10, 'rating_count_tot')[['track_name', 'rating_count_tot']]
most_rated_apps

*Insight:* Social and gaming apps dominate the most-rated list, indicating their massive popularity.
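
The genre claim can be checked by counting genres among the most-rated apps instead of reading app names. A minimal sketch, using the `prime_genre` column from earlier cells and a hypothetical helper name:

```python
import pandas as pd


def genre_mix_of_most_rated(df, n=10):
    """Count how often each genre appears among the n apps with the most ratings."""
    top = df.nlargest(n, "rating_count_tot")
    return top["prime_genre"].value_counts()


# Toy rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "prime_genre": ["Social Networking", "Games", "Games", "Music", "Weather"],
    "rating_count_tot": [1000, 900, 800, 10, 5],
})
print(genre_mix_of_most_rated(toy, n=3))
```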

### Average Price by Genre

In [ ]:
avg_price_by_genre = data.groupby('prime_genre')['price'].mean().sort_values(ascending=False)
plt.figure(figsize=(12,6))
avg_price_by_genre.plot(kind='bar', color='salmon')
plt.title('Average Price by App Genre')
plt.xlabel('Genre')
plt.ylabel('Average Price (USD)')
plt.xticks(rotation=45, ha='right')
plt.show()

*Observation:* Medical and Business apps have the highest average prices, while Social Networking and Shopping apps have the lowest, consistent with most of them being free.
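
An average over all apps mixes two things: how many apps in a genre are free, and how much the paid ones cost. The sketch below separates them. `price_profile_by_genre` is a hypothetical helper, and it treats a `price` of exactly 0 as free.

```python
import pandas as pd


def price_profile_by_genre(df):
    """Per genre: share of free apps and mean price among paid apps only."""
    is_free = df["price"] == 0
    free_share = is_free.groupby(df["prime_genre"]).mean().rename("free_share")
    paid_mean = (
        df.loc[~is_free]
        .groupby("prime_genre")["price"]
        .mean()
        .rename("mean_paid_price")
    )
    # Genres with no paid apps get NaN for mean_paid_price after alignment.
    return pd.concat([free_share, paid_mean], axis=1)


# Toy rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "prime_genre": ["Medical", "Medical", "Medical", "Shopping", "Shopping"],
    "price": [0.0, 9.99, 19.99, 0.0, 0.0],
})
print(price_profile_by_genre(toy))
```

A genre can have a low overall average either because nearly all of its apps are free or because its paid apps are cheap; the two columns distinguish those cases.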

### Distribution of Content Ratings

In [ ]:
plt.figure(figsize=(8,5))
sns.countplot(x='cont_rating', data=data, order=data['cont_rating'].value_counts().index, palette='viridis')
plt.title('Distribution of Content Ratings')
plt.xlabel('Content Rating')
plt.ylabel('Number of Apps')
plt.show()

*Observation:* Most apps are rated 4+, suitable for general audiences.

### Average Rating by Content Rating

In [ ]:
avg_rating_by_content = data.groupby('cont_rating')['user_rating'].mean().sort_values(ascending=False)
avg_rating_by_content

*Insight:* Apps rated 9+ and 4+ have the highest average user ratings, which suggests that apps suitable for younger users may see higher satisfaction. This compares group means only and does not account for how many apps fall into each group.
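
Sorting by mean hides the natural age order of the content ratings. The sketch below orders the groups by their age threshold and adds the group size. It assumes every label has the form of a number followed by `+` (such as `4+` or `9+`); the helper name is hypothetical.

```python
import pandas as pd


def rating_by_content_rating(df):
    """Mean user rating and app count per content rating, ordered by age threshold."""
    summary = df.groupby("cont_rating")["user_rating"].agg(
        mean_rating="mean", n_apps="count"
    )
    # Assumption: labels look like '4+' or '12+'; sort by the numeric part.
    order = sorted(summary.index, key=lambda label: int(label.rstrip("+")))
    return summary.loc[order]


# Toy rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "cont_rating": ["4+", "4+", "12+", "9+", "17+"],
    "user_rating": [4.0, 5.0, 3.0, 4.5, 2.5],
})
print(rating_by_content_rating(toy))
```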

## Key Insights & Conclusions

- The majority of apps are free; among genres, Medical and Business have the highest average prices.
- Productivity and Music apps achieve the highest user satisfaction.
- Social and gaming apps are the most popular by rating count.
- The app ecosystem is family-friendly, with most apps rated 4+.

### Business Recommendations
- Focus on user experience for Productivity and Music apps to maintain high ratings.
- Free apps drive volume; consider freemium models for popularity-heavy genres.
- Target family-friendly content for broad reach.

## About the Author
This analysis was performed by [bamideleadedeji](https://github.com/bamideleadedeji) as part of a professional data science portfolio.