# Example Notebook: Data Science Demonstrations

This notebook demonstrates various features commonly used in data science projects:

- **Data manipulation** with pandas
- **Visualizations** with matplotlib
- **Mathematical equations** with LaTeX
- **Statistical analysis** examples

## Introduction

This example showcases the capabilities of Jupyter notebooks for data exploration and analysis.

In [None]:
# Import the libraries used throughout the notebook
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

print("Libraries imported successfully!")

## Mathematical Equations

Markdown cells render mathematical equations written in LaTeX. For example, the quadratic formula for the roots of $ax^2 + bx + c = 0$ with $a \neq 0$:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Or the fundamental theorem of calculus, where $F$ is any antiderivative of a continuous function $f$ on $[a, b]$:

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

Inline math works as well. The population standard deviation of $N$ values $x_i$ with mean $\mu$ is given by: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$
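
The formula above divides by $N$, which is the population form. As a minimal sketch, the cell below evaluates it directly and compares it with NumPy, whose `np.std` divides by $N$ by default (`ddof=0`) and by $N - 1$ when `ddof=1`. The array `x` is a hypothetical sample chosen only for illustration.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Direct evaluation of the population formula: sqrt(1/N * sum((x_i - mu)^2))
mu = x.mean()
sigma_manual = np.sqrt(((x - mu) ** 2).sum() / len(x))

# NumPy equivalents: ddof=0 divides by N, ddof=1 divides by N - 1
sigma_population = np.std(x)
sigma_sample = np.std(x, ddof=1)

print(f"manual: {sigma_manual:.4f}")
print(f"population (ddof=0): {sigma_population:.4f}")
print(f"sample (ddof=1): {sigma_sample:.4f}")
```

The sample estimate is always at least as large as the population value, and the gap shrinks as $N$ grows.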

## Creating Sample Data

Let's generate synthetic data representing 30 days of website traffic: daily visits, bounce rate, and conversions.

In [None]:
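# Seed NumPy's global random state so the sample data is reproducible
np.random.seed(42)
dates = pd.date_range(start='2026-01-01', periods=30, freq='D')
# Poisson-distributed base traffic (mean 150) plus uniform integer noise in [-20, 40)
visits = np.random.poisson(lam=150, size=30) + np.random.randint(-20, 40, size=30)
# Bounce rate drawn uniformly between 30% and 60%
bounce_rate = np.random.uniform(0.3, 0.6, size=30)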

# Create a DataFrame
df = pd.DataFrame({
    'Date': dates,
    'Visits': visits,
    'Bounce_Rate': bounce_rate,
    'Conversions': (visits * np.random.uniform(0.02, 0.08, size=30)).astype(int)
})

df.head(10)
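
The cell above seeds NumPy's global random state. NumPy's documentation recommends the `Generator` API for new code, so here is a sketch of the same data built with a local generator. The helper `make_traffic` and the name `df_alt` are hypothetical and not part of the notebook, and the values it draws will differ from those produced by `np.random.seed(42)`.

```python
import numpy as np
import pandas as pd


def make_traffic(seed: int, days: int = 30) -> pd.DataFrame:
    """Build a synthetic traffic table (hypothetical helper for illustration)."""
    # Local generator: the global NumPy random state is left untouched
    rng = np.random.default_rng(seed)
    # Poisson base traffic plus uniform integer noise in [-20, 40)
    visits = rng.poisson(lam=150, size=days) + rng.integers(-20, 40, size=days)
    return pd.DataFrame({
        "Date": pd.date_range(start="2026-01-01", periods=days, freq="D"),
        "Visits": visits,
        "Bounce_Rate": rng.uniform(0.3, 0.6, size=days),
        # Between 2% and 8% of visits convert, truncated to whole conversions
        "Conversions": (visits * rng.uniform(0.02, 0.08, size=days)).astype(int),
    })


df_alt = make_traffic(seed=42)
print(df_alt.head())
```

Passing the seed as an argument keeps each call reproducible on its own, regardless of what other cells have drawn from the global state.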

## Statistical Summary

Let's examine basic statistics about our website traffic data.

In [None]:
# Calculate summary statistics
print("Website Traffic Summary Statistics")
print("=" * 40)
print(f"Average daily visits: {df['Visits'].mean():.1f}")
print(f"Median daily visits: {df['Visits'].median():.1f}")
print(f"Standard deviation (sample, ddof=1): {df['Visits'].std():.1f}")
print(f"Total visits: {df['Visits'].sum()}")
print(f"Overall conversion rate: {(df['Conversions'].sum() / df['Visits'].sum() * 100):.2f}%")
print(f"Average bounce rate: {df['Bounce_Rate'].mean():.1%}")
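
The conversion-rate figure above divides total conversions by total visits, which weights busy days more heavily than an average of the daily rates would. The sketch below shows the difference on a two-day toy table. The frame `toy` and its numbers are hypothetical and chosen so the arithmetic is easy to check by hand.

```python
import pandas as pd

# Hypothetical two-day table: a quiet day with a high rate, a busy day with a low rate
toy = pd.DataFrame({"Visits": [100, 300], "Conversions": [10, 15]})

# Pooled rate: total conversions over total visits (25 / 400)
pooled_rate = toy["Conversions"].sum() / toy["Visits"].sum() * 100

# Mean of daily rates: average of 10% and 5%, each day weighted equally
mean_daily_rate = (toy["Conversions"] / toy["Visits"]).mean() * 100

print(f"Pooled conversion rate: {pooled_rate:.2f}%")
print(f"Mean of daily rates: {mean_daily_rate:.2f}%")
```

Which figure is appropriate depends on the question: the pooled rate describes a typical visit, while the mean of daily rates describes a typical day.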

## Data Visualization

The 2x2 dashboard below plots daily visits over time, the distribution of daily visits, bounce rate over time, and visits against conversions.

In [None]:
# Create a multi-panel visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
fig.suptitle('Website Analytics Dashboard', fontsize=16, fontweight='bold')

# Plot 1: Daily Visits Time Series
axes[0, 0].plot(df['Date'], df['Visits'], marker='o', linewidth=2, markersize=4, color='#2E86AB')
axes[0, 0].fill_between(df['Date'], df['Visits'], alpha=0.3, color='#2E86AB')
axes[0, 0].set_title('Daily Visits Over Time', fontweight='bold')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Number of Visits')
axes[0, 0].grid(True, alpha=0.3)
axes[0, 0].tick_params(axis='x', rotation=45)

# Plot 2: Visits Distribution
axes[0, 1].hist(df['Visits'], bins=15, color='#A23B72', alpha=0.7, edgecolor='black')
axes[0, 1].axvline(df['Visits'].mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {df["Visits"].mean():.1f}')
axes[0, 1].set_title('Distribution of Daily Visits', fontweight='bold')
axes[0, 1].set_xlabel('Number of Visits')
axes[0, 1].set_ylabel('Frequency')
axes[0, 1].legend()
axes[0, 1].grid(True, alpha=0.3, axis='y')

# Plot 3: Bounce Rate Over Time
sc = axes[1, 0].scatter(df['Date'], df['Bounce_Rate'], c=df['Visits'], cmap='viridis', s=100, alpha=0.6, edgecolors='black')
fig.colorbar(sc, ax=axes[1, 0], label='Number of Visits')  # key for the point colors
axes[1, 0].set_title('Bounce Rate Over Time (colored by visits)', fontweight='bold')
axes[1, 0].set_xlabel('Date')
axes[1, 0].set_ylabel('Bounce Rate')
axes[1, 0].tick_params(axis='x', rotation=45)
axes[1, 0].grid(True, alpha=0.3)

# Plot 4: Visits vs Conversions
axes[1, 1].scatter(df['Visits'], df['Conversions'], s=100, alpha=0.6, color='#F18F01', edgecolors='black')
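# Fit a least-squares line (degree-1 polynomial) to conversions as a function of visits
z = np.polyfit(df['Visits'], df['Conversions'], 1)
p = np.poly1d(z)
x_line = np.sort(df['Visits'].to_numpy())  # sorted x so the line is drawn left to right
axes[1, 1].plot(x_line, p(x_line), "r--", linewidth=2, label='Trend Line')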
axes[1, 1].set_title('Visits vs Conversions', fontweight='bold')
axes[1, 1].set_xlabel('Number of Visits')
axes[1, 1].set_ylabel('Number of Conversions')
axes[1, 1].legend()
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Data Transformation

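Let's add some calculated fields to analyze weekly trends. ISO weeks run Monday to Sunday and the data starts on Thursday, 2026-01-01, so the first and last weeks are partial and their totals are not directly comparable with those of full weeks.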

In [None]:
# Add ISO week number, day name, and daily conversion rate (%)
df['Week'] = df['Date'].dt.isocalendar().week
df['DayOfWeek'] = df['Date'].dt.day_name()
df['Conversion_Rate'] = (df['Conversions'] / df['Visits'] * 100).round(2)

# Group by week
weekly_summary = df.groupby('Week').agg({
    'Visits': ['sum', 'mean'],
    'Conversions': 'sum',
    'Bounce_Rate': 'mean'
}).round(2)

weekly_summary.columns = ['Total_Visits', 'Avg_Daily_Visits', 'Total_Conversions', 'Avg_Bounce_Rate']
print("Weekly Summary:")
print(weekly_summary)
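
The cell above flattens the column labels by assigning a list, which silently depends on the order of the aggregation dictionary. One alternative is pandas named aggregation, sketched here on a small hypothetical frame `toy` with 14 days of made-up values. The extra `Days` column is an assumption added for illustration; it shows how many days fall into each ISO week.

```python
import numpy as np
import pandas as pd

# Hypothetical 14-day frame with the same column names as the notebook's df
toy = pd.DataFrame({
    "Date": pd.date_range(start="2026-01-01", periods=14, freq="D"),
    "Visits": np.arange(100, 114),
    "Conversions": np.arange(14) % 5,
    "Bounce_Rate": np.linspace(0.3, 0.6, 14),
})
toy["Week"] = toy["Date"].dt.isocalendar().week

# Named aggregation: each output column is declared as name=(source column, function)
weekly = toy.groupby("Week").agg(
    Total_Visits=("Visits", "sum"),
    Avg_Daily_Visits=("Visits", "mean"),
    Total_Conversions=("Conversions", "sum"),
    Avg_Bounce_Rate=("Bounce_Rate", "mean"),
    Days=("Date", "count"),  # number of days observed in each ISO week
).round(2)

print(weekly)
```

Because every output name sits next to its source column and function, adding or reordering aggregations cannot mislabel a column.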