# Cyclistic Bike-Share Analysis

## Interactive Data Analysis Notebook

This notebook walks through an analysis of Cyclistic bike-share data. Run the cells in order to reproduce each step, and adjust parameters as needed.

---

### Project Overview

**Goal**: Analyze how casual riders and annual members use Cyclistic bikes differently to design marketing strategies for converting casual riders into annual members.

**Data**: Q1 2019 and Q1 2020 Divvy trip data from Chicago's bike-share system.

**Key Questions**:
1. How do annual members and casual riders use Cyclistic bikes differently?
2. What marketing strategies could convert casual riders to annual members?

---

## 1. Setup and Data Loading

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import sys
from pathlib import Path
import warnings
# Silence all warnings to keep cell output readable; comment this out when debugging
warnings.filterwarnings('ignore')

# Add src directory to path
sys.path.append('../src')

# Import our custom modules
from cyclistic_analyzer import CyclisticAnalyzer
from visualizations import CyclisticVisualizer
from data_utils import DataManager

print("✅ Libraries imported successfully!")
print("📊 Ready to analyze Cyclistic data")

In [None]:
# Initialize data manager and analyzer
data_manager = DataManager()
analyzer = CyclisticAnalyzer()

print("🔧 Data manager and analyzer initialized")

In [None]:
# Setup data (this will use sample data if original files are not available)
file_2019, file_2020, is_sample = data_manager.setup_data()

if is_sample:
    print("📊 Using sample data for demonstration")
    print("ℹ️  To use original data, place CSV files in the data/raw/ directory")
else:
    print("📊 Using original Divvy trip data")

print("\nData files:")
print(f"  - 2019 Q1: {file_2019}")
print(f"  - 2020 Q1: {file_2020}")

## 2. Data Preparation and Cleaning

In [None]:
# Prepare data for analysis
# Wrap in Path() so the check also works if setup_data() returned plain strings
if Path(file_2019).exists() and Path(file_2020).exists():
    analyzer.prepare_data(str(file_2019), str(file_2020))
else:
    analyzer.prepare_data()  # Use built-in sample data

# Display basic information about the dataset
print("\n📈 Dataset Overview:")
print(f"Total records: {len(analyzer.df_combined):,}")
print(f"Date range: {analyzer.df_combined['started_at'].min()} to {analyzer.df_combined['started_at'].max()}")
print(f"Columns: {list(analyzer.df_combined.columns)}")

In [None]:
# Display first few rows of the data
print("📋 Sample of the data:")
analyzer.df_combined.head()

In [None]:
# Display data summary statistics
print("📊 Data Summary:")
analyzer.df_combined.describe()

## 3. Exploratory Data Analysis

In [None]:
# Analyze user type distribution
user_counts = analyzer.df_combined['member_casual'].value_counts()
user_percentages = analyzer.df_combined['member_casual'].value_counts(normalize=True) * 100

print("👥 User Type Distribution:")
for user_type, count in user_counts.items():
    percentage = user_percentages[user_type]
    print(f"  {user_type.title()}: {count:,} rides ({percentage:.1f}%)")

# Create a simple visualization
plt.figure(figsize=(8, 6))
user_counts.plot(kind='bar', color=['#3B82F6', '#10B981'])
plt.title('Distribution of Rides by User Type', fontsize=14, fontweight='bold')
plt.ylabel('Number of Rides')
plt.xlabel('User Type')
plt.xticks(rotation=0)
plt.grid(axis='y', alpha=0.3)
plt.show()

## 4. Ride Duration Analysis

In [None]:
# Analyze ride duration patterns
duration_stats = analyzer.analyze_ride_duration()
duration_stats

In [None]:
# Create duration visualization
visualizer = CyclisticVisualizer(analyzer)
visualizer.create_duration_comparison_chart()

## 5. Weekly Usage Patterns

In [None]:
# Analyze weekly usage patterns
weekly_stats = analyzer.analyze_weekly_patterns()
weekly_stats

In [None]:
# Create weekly usage visualization
visualizer.create_weekly_usage_chart()

## 6. Hourly Usage Patterns

In [None]:
# Analyze hourly usage patterns
hourly_stats = analyzer.analyze_hourly_patterns()
hourly_stats

In [None]:
# Create hourly usage visualization
visualizer.create_hourly_usage_chart()

## 7. Comprehensive Analysis

In [None]:
# Run complete analysis
results = analyzer.run_complete_analysis()

In [None]:
# Display analysis results
if results:
    print("\n📊 Key Analysis Results:")
    for key, value in results.items():
        if isinstance(value, float):
            print(f"  {key}: {value:.2f}")
        elif isinstance(value, int):
            print(f"  {key}: {value:,}")
        else:
            print(f"  {key}: {value}")

## 8. Comprehensive Dashboard

In [None]:
# Create comprehensive dashboard
visualizer.create_comprehensive_dashboard()

## 9. Summary Report

In [None]:
# Generate summary report
analyzer.generate_summary_report()

## 10. Business Recommendations

The analysis points to two main findings, each with a recommendation, followed by the overall strategy:

### 🎯 Finding #1: Ride Duration Differences
- **Casual riders** take much longer rides (about 36 minutes on average)
- **Annual members** take shorter, more functional rides (about 12 minutes on average)
- **Recommendation**: Target leisure-focused marketing for casual riders
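
The duration comparison behind this finding can be sketched with plain pandas. This is a minimal illustration, not the code inside `analyze_ride_duration()`: the rows are made up, and the `ended_at` column name is an assumption (only `started_at` and `member_casual` appear in this notebook).

```python
import pandas as pd

# Hypothetical mini-sample; the `ended_at` column name is an assumption.
trips = pd.DataFrame({
    "member_casual": ["member", "member", "casual", "casual"],
    "started_at": pd.to_datetime([
        "2020-01-06 08:00", "2020-01-07 17:30",
        "2020-01-11 13:00", "2020-01-12 15:00",
    ]),
    "ended_at": pd.to_datetime([
        "2020-01-06 08:08", "2020-01-07 17:44",
        "2020-01-11 13:45", "2020-01-12 15:25",
    ]),
})

# Ride length in minutes, derived from the two timestamps.
trips["ride_minutes"] = (
    trips["ended_at"] - trips["started_at"]
).dt.total_seconds() / 60

# Mean and median per user type; the median is less sensitive to a few very long rides.
duration_by_type = trips.groupby("member_casual")["ride_minutes"].agg(["mean", "median"])
print(duration_by_type)
```

Reporting the median alongside the mean is worth considering on the real data, since a handful of multi-hour rides can pull the casual-rider average upward.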

### 📅 Finding #2: Weekly Usage Patterns
- **Casual riders** heavily favor weekends
- **Annual members** show consistent weekday usage (commuting pattern)
- **Recommendation**: Create weekend-focused membership tier
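
One way to quantify the weekday/weekend split is the share of each user type's rides that start on a Saturday or Sunday. The sketch below uses made-up rows and may differ from what `analyze_weekly_patterns()` actually computes.

```python
import pandas as pd

# Hypothetical mini-sample using the notebook's `started_at` and `member_casual` columns.
trips = pd.DataFrame({
    "member_casual": ["member", "member", "member", "casual", "casual", "casual"],
    "started_at": pd.to_datetime([
        "2020-01-06 08:00",  # Monday
        "2020-01-08 17:30",  # Wednesday
        "2020-01-11 10:00",  # Saturday
        "2020-01-10 12:00",  # Friday
        "2020-01-11 13:00",  # Saturday
        "2020-01-12 15:00",  # Sunday
    ]),
})

# dayofweek runs Monday=0 ... Sunday=6, so values 5 and 6 are the weekend.
trips["is_weekend"] = trips["started_at"].dt.dayofweek >= 5

# Fraction of each user type's rides that start on a weekend.
weekend_share = trips.groupby("member_casual")["is_weekend"].mean()
print(weekend_share)
```

A weekend share well above 2/7 (about 29%) for a group would indicate weekend-heavy usage relative to an even spread across the week.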

### 💡 Strategic Recommendations:

1. **Weekend Warrior Membership**: Discounted weekend-only membership
2. **Commuter Benefits Campaign**: Highlight cost-effectiveness for daily commutes
3. **Tiered Membership Options**: Flexible, lower-priced entry points
4. **Seasonal Engagement**: Recreational programs to build loyalty

### 📈 Expected Impact:
- 20-30% of casual riders converting to annual membership (target estimate)
- 25-35% increase in membership revenue (target estimate)
- Stronger market position

## 11. Further Analysis (Optional)

You can extend this analysis by:

1. **Seasonal Analysis**: Compare different quarters or years
2. **Geographic Analysis**: Study station-to-station patterns
3. **Weather Impact**: Correlate usage with weather data
4. **Demographic Analysis**: Segment usage by age or gender, if that data is available
5. **Bike Type Preferences**: Compare electric vs. classic bike usage, if the data records bike type

Feel free to modify any of the code cells above to dive deeper into specific aspects of the data!
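
As a starting point for the geographic idea (item 2), a station-to-station count per user type might look like the sketch below. The `start_station_name` and `end_station_name` column names and the station labels are assumptions for illustration; check `analyzer.df_combined.columns` for the actual names.

```python
import pandas as pd

# Hypothetical mini-sample; station column names are assumed, not taken from this notebook.
trips = pd.DataFrame({
    "member_casual": ["member", "member", "casual", "casual", "casual"],
    "start_station_name": ["A", "A", "B", "B", "A"],
    "end_station_name": ["B", "B", "B", "C", "B"],
})

# Count rides for each (user type, start station, end station) combination,
# then list the busiest routes first within each user type.
route_counts = (
    trips.groupby(["member_casual", "start_station_name", "end_station_name"])
    .size()
    .rename("rides")
    .reset_index()
    .sort_values(["member_casual", "rides"], ascending=[True, False])
)
print(route_counts)
```

Routes that start and end at the same station (such as B to B here) are round trips, which may be worth separating out when comparing leisure and commuting behavior.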

In [None]:
# Space for additional analysis
# Add your custom analysis code here

print("🎉 Analysis complete!")
print("💡 Use the cells above to explore different aspects of the data")
print("📊 Modify parameters and re-run cells to see different results")