# Data Analysis Notebook

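This notebook walks through a small analysis of sample user data for the sample project: it loads five user records into a pandas DataFrame, computes summary statistics, and plots the age distribution and monthly signups.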

## Overview

- Load and preprocess sample user data
- Compute summary statistics
- Visualize the age distribution and monthly signups

In [None]:
# Import required libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime

# Set up plotting style
plt.style.use('seaborn-v0_8')
print("Libraries imported successfully!")

In [None]:
# Sample user data
user_data = [
    {'name': 'Alice Johnson', 'email': 'alice@example.com', 'age': 28, 'signup_date': '2023-01-15', 'plan': 'premium'},
    {'name': 'Bob Smith', 'email': 'bob@example.com', 'age': 34, 'signup_date': '2023-02-20', 'plan': 'basic'},
    {'name': 'Charlie Brown', 'email': 'charlie@example.com', 'age': 22, 'signup_date': '2023-03-10', 'plan': 'premium'},
    {'name': 'Diana Prince', 'email': 'diana@example.com', 'age': 31, 'signup_date': '2023-04-05', 'plan': 'basic'},
    {'name': 'Eve Wilson', 'email': 'eve@example.com', 'age': 26, 'signup_date': '2023-05-12', 'plan': 'premium'}
]

# Convert to DataFrame
df = pd.DataFrame(user_data)
df['signup_date'] = pd.to_datetime(df['signup_date'])

print("Sample data loaded:")
print(df.head())

In [None]:
# Basic statistics
print("Basic Statistics:")
print(f"Total users: {len(df)}")
print(f"Average age: {df['age'].mean():.1f} years")
print(f"Age range: {df['age'].min()} - {df['age'].max()} years")
print("\nPlan distribution:")
print(df['plan'].value_counts())

In [None]:
# Create age distribution plot
plt.figure(figsize=(10, 6))
plt.hist(df['age'], bins=5, edgecolor='black', alpha=0.7)
plt.title('User Age Distribution')
plt.xlabel('Age')
plt.ylabel('Number of Users')
plt.grid(True, alpha=0.3)
plt.show()

In [None]:
# Monthly signup trend
monthly_signups = df.groupby(df['signup_date'].dt.to_period('M')).size()

plt.figure(figsize=(12, 6))
monthly_signups.plot(kind='bar', color='skyblue', edgecolor='black')
plt.title('Monthly User Signups')
plt.xlabel('Month')
plt.ylabel('Number of Signups')
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## Summary

This notebook demonstrated:
- Data loading and preprocessing
- Basic statistical analysis
- Data visualization with matplotlib
- Monthly aggregation of user signups

### Key Findings:
- Average user age: 28.2 years
- Most popular plan: Premium (60% of users)
- Signups held constant at one per month from January through May 2023, so the user base grew linearly
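
The findings above can be re-derived directly from the sample data. The following is a minimal sketch, assuming only pandas is installed. The names `records`, `check_df`, `plan_share`, and `cumulative_signups` are illustrative and do not appear in the notebook's cells; the records repeat the ages, signup dates, and plans from the data-loading cell.

```python
import pandas as pd

# Same five sample users as the data-loading cell (names and emails omitted).
records = [
    {"age": 28, "signup_date": "2023-01-15", "plan": "premium"},
    {"age": 34, "signup_date": "2023-02-20", "plan": "basic"},
    {"age": 22, "signup_date": "2023-03-10", "plan": "premium"},
    {"age": 31, "signup_date": "2023-04-05", "plan": "basic"},
    {"age": 26, "signup_date": "2023-05-12", "plan": "premium"},
]

# Hypothetical name, chosen so the notebook's own `df` is not overwritten.
check_df = pd.DataFrame(records)
check_df["signup_date"] = pd.to_datetime(check_df["signup_date"])

# Mean age across all users.
average_age = check_df["age"].mean()

# Share of users on each plan, as a percentage.
plan_share = check_df["plan"].value_counts(normalize=True) * 100

# Signups per calendar month, and the running total over time.
monthly_signups = check_df.groupby(check_df["signup_date"].dt.to_period("M")).size()
cumulative_signups = monthly_signups.cumsum()

print(f"Average age: {average_age:.1f} years")
print(f"Premium share: {plan_share['premium']:.0f}%")
print("Signups per month:", monthly_signups.tolist())
print("Cumulative signups:", cumulative_signups.tolist())
```

With this data the per-month count is 1 in every month, so the cumulative total rises by one user per month.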