# Lecture 7: Data Storytelling & Visualization - Transforming Statistical Insights into Visual Intelligence

## Learning Objectives

By the end of this lecture, you will be able to:

- Define data visualization and explain its fundamental importance for communicating transportation insights
- Identify the key principles of effective data visualization design and their applications to bike-sharing analysis
- Distinguish between different visualization types and select appropriate charts for specific analytical purposes
- Create effective visualizations using Python and matplotlib to communicate transportation patterns

---

## 1. The Presentation That Changes Everything: Visual Storytelling for Transportation Consulting

Six weeks into your engagement, your bike-sharing client's CEO calls an emergency board meeting. "Our investors are flying in tomorrow," she explains urgently. "They want to see the data insights that will justify our Series A expansion. Can you present your findings in a way that will convince them to invest $5 million in our growth strategy?"

This is the defining moment every consultant dreams of and fears. You may have the analytical insights, such as strong temperature correlation patterns, seasonal demand variations showing significant growth from winter to spring, and rush hour peaks demonstrating clear commuter behavior. But these powerful statistical discoveries are meaningless unless you can transform them into **compelling visual narratives** that enable investors to immediately grasp the business opportunity and strategic potential.

Your statistical analysis represents weeks of rigorous work, but success now depends on your ability to communicate complex findings to stakeholders who will make million-dollar decisions based on your presentation. **The difference between securing investment and losing the opportunity often comes down to visualization effectiveness and storytelling mastery.** Tomorrow's presentation will test whether you can convert those analytical insights into $5 million in growth capital for your client's expansion.

## 2. Data Visualization Fundamentals: Building the Visual Foundation

Let's explore the essential foundations of data visualization that will transform your statistical analysis into compelling business communications. This section covers the basics of visual perception and a comprehensive framework for selecting appropriate visualization types. We'll start by understanding how human vision and cognition process graphical information.

### 2.1. Basics of Visual Perception: What Your Eyes Do Best

You're preparing tomorrow's investor presentation. You have three critical insights to communicate: hourly demand peaks, the temperature-demand relationship, and seasonal growth patterns. Here's the question every consultant faces: **Which chart type will help your audience grasp each insight instantly?**

The answer lies in understanding how human visual perception works—and leveraging its strengths while avoiding its weaknesses. Your brain processes different visual elements with dramatically different levels of accuracy. Understanding this **visual hierarchy** enables you to select the optimal chart type for each analytical insight:

1. **Position (Best)**: Comparing heights or positions along a common scale - line charts, bar charts, scatter plots
2. **Length**: Comparing bar lengths or distances
3. **Color**: Distinguishing categories or highlighting specific elements
4. **Area (Worst)**: Comparing sizes of circles or regions - pie charts, bubble charts

When you need stakeholders to **compare exact values** (like identifying the precise peak demand hour), use position. When you need to **distinguish different groups** (weekdays vs. weekends), add color. When you need both, combine them.
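
A minimal sketch of the position-versus-area contrast, using made-up ride counts for four hypothetical stations (the station names, the values, and the `plot_position_vs_area` helper are illustrative assumptions, not part of the bike-sharing dataset). The same numbers are drawn once as bars and once as pie wedges:

```python
import matplotlib.pyplot as plt

# Hypothetical ride counts for four stations (illustrative values only)
stations = ["Station A", "Station B", "Station C", "Station D"]
rides = [230, 245, 260, 265]


def plot_position_vs_area(labels, values):
    """Draw the same values as a bar chart (position) and a pie chart (area/angle)."""
    fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(11, 4))

    # Position/length encoding: bars share a common zero baseline
    ax_bar.bar(labels, values, color="#1f77b4")
    ax_bar.set_ylim(0, max(values) * 1.15)
    ax_bar.set_ylabel("Rides")
    ax_bar.set_title("Position: differences are readable")

    # Area/angle encoding: the same values drawn as wedges
    ax_pie.pie(values, labels=labels, startangle=90)
    ax_pie.set_title("Area: similar wedges are hard to rank")

    fig.tight_layout()
    return fig, ax_bar, ax_pie


fig, ax_bar, ax_pie = plot_position_vs_area(stations, rides)
plt.show()
```

In the bar panel the ranking of the four stations can be read from bar height alone. In the pie panel the wedges differ by only a few degrees, so ranking them usually means reading the labels instead of the shapes.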

Let's see this principle in action with your bike-sharing data.

**Example 1: Position Shows Precise Patterns**

When you want stakeholders to identify **exact peak hours** for operational decisions, position along a shared axis gives them maximum accuracy:

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])
df['hour'] = df['datetime'].dt.hour

hourly = df.groupby('hour')['count'].mean()
hourly.plot(kind='line', marker='o', figsize=(10, 5))
plt.title('Average Hourly Bike Demand - Position Shows Peaks Clearly')
plt.xlabel('Hour of Day')
plt.ylabel('Rentals per Hour')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

**Why this works:** Your eyes immediately spot the 8am and 5pm peaks because vertical position (height) is the most accurate way your brain compares quantities. Operations can see exactly when to deploy rebalancing crews—no guesswork required.

**Example 2: Adding Color to Distinguish Groups**

When comparing **two operational contexts** (not just precise values), color helps stakeholders immediately see which pattern applies:

In [None]:
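# workingday == 0 is labeled 'Weekend' here; if the dataset also flags
# holidays as non-working days, those hours fall into this group too
df['daytype'] = df['workingday'].map({1: 'Weekday', 0: 'Weekend'})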

weekday_hourly = df[df['daytype'] == 'Weekday'].groupby('hour')['count'].mean()
weekend_hourly = df[df['daytype'] == 'Weekend'].groupby('hour')['count'].mean()

plt.figure(figsize=(10, 5))
plt.plot(weekday_hourly.index, weekday_hourly.values, marker='o', label='Weekday', color='#1f77b4')
plt.plot(weekend_hourly.index, weekend_hourly.values, marker='s', label='Weekend', color='#ff7f0e')
plt.title('Demand by Day Type - Color Distinguishes Patterns')
plt.xlabel('Hour of Day')
plt.ylabel('Rentals per Hour')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

**Why this works:** Color (blue vs. orange) instantly signals "these are different operational contexts"—commuter-driven weekdays with dual peaks versus leisure-driven weekends with single midday peaks. But you still rely on **position** to read exact peak times. This combination tells operations: "You need two different capacity strategies."

### 2.2. Visualization Types and Selection Framework

Now that you understand visual perception principles, let's apply them systematically to transportation data analysis. Different analytical purposes require different visualization approaches, and selecting the wrong type can obscure critical insights or mislead stakeholders. We'll explore three fundamental visualization categories - **comparative, relationship, and distribution visualizations** - with specific selection criteria and transportation applications. This framework enables you to match visualization techniques to analytical goals, ensuring your presentations communicate insights effectively to diverse stakeholder audiences.
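
One way to keep the three categories straight is to treat the framework as a lookup from analytical purpose to chart type. The sketch below is only a summary aid: `CHART_GUIDE`, `suggest_chart`, and the key names are hypothetical, chosen here to mirror the chart types discussed in this section.

```python
# Hypothetical lookup that summarizes the three categories in this section
CHART_GUIDE = {
    "comparative": {
        "categories": "bar chart",
        "time periods": "column chart",
    },
    "relationship": {
        "two continuous variables": "scatter plot",
        "ordered sequence": "line plot",
    },
    "distribution": {
        "frequency by range": "histogram",
        "quartile summary": "box plot",
    },
}


def suggest_chart(purpose: str, data_shape: str) -> str:
    """Return the chart type this section associates with a purpose and data shape."""
    try:
        return CHART_GUIDE[purpose][data_shape]
    except KeyError:
        raise ValueError(
            f"No suggestion for purpose={purpose!r}, data_shape={data_shape!r}"
        ) from None


print(suggest_chart("comparative", "categories"))
print(suggest_chart("relationship", "two continuous variables"))
print(suggest_chart("distribution", "quartile summary"))
```

A lookup like this is a starting point for chart selection. The audience and the decision the chart supports still shape the final choice.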

**1. Comparative Visualizations**

**Definition**: Comparative visualizations enable stakeholders to evaluate differences between categories, time periods, or operational conditions, forming the foundation of business decision-making.

**Bar charts** use horizontal or vertical rectangular bars where bar length represents data values, making them ideal for comparing discrete categories or groups. The human visual system excels at comparing bar lengths, enabling precise magnitude assessment between categories. When your statistical analysis reveals demand patterns - weekday mean of 193 rides per hour versus weekend mean of 189 rides per hour - bar chart visualization enables immediate comparison through visual height differences. This precise magnitude assessment directly supports capacity planning decisions, staffing optimization, and resource allocation strategies that maximize operational efficiency.

**Column charts** are vertical bar charts specifically designed for temporal data presentation, where categories represent time periods (months, seasons, years) arranged in chronological order. The vertical orientation naturally suggests progression through time while maintaining the comparative power of bar length encoding. Summer demand averaging 237 rides per hour compared to winter demand averaging 125 rides per hour translates to clear visual differences that enable seasonal planning and resource allocation decisions. The chronological arrangement reveals both individual period performance and temporal trends across the complete cycle.

**Python Example - Creating Comparative Visualizations:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])

# Create weekday vs weekend comparison (bar chart)
weekday_mean = df[df['workingday'] == 1]['count'].mean()
weekend_mean = df[df['workingday'] == 0]['count'].mean()

fig, ax = plt.subplots(figsize=(8, 6))
categories = ['Weekday', 'Weekend']
means = [weekday_mean, weekend_mean]
ax.bar(categories, means, color=['#1f77b4', '#ff7f0e'])
ax.set_ylabel('Mean Hourly Rides', fontsize=12)
ax.set_title('Weekday vs Weekend Demand Comparison', fontsize=14, fontweight='bold')
ax.set_ylim(0, max(means) * 1.2)

# Add value labels on bars
for i, v in enumerate(means):
    ax.text(i, v + 5, f'{v:.0f}', ha='center', fontweight='bold')

plt.tight_layout()
plt.show()

# Print summary statistics
print(f"Weekday mean: {weekday_mean:.1f} rides per hour")
print(f"Weekend mean: {weekend_mean:.1f} rides per hour")
print(f"Weekday advantage: {((weekday_mean/weekend_mean - 1) * 100):.1f}%")

This bar chart visualization **reveals surprisingly similar demand patterns between weekdays and weekends at the hourly aggregation level**. Weekdays average 193 rides per hour compared to weekends' 189 rides per hour—only a 2.4% weekday advantage. This small difference indicates that while weekdays and weekends have different temporal patterns (as we'll see in the daily profile analysis), their average hourly demand remains nearly equivalent. The visual height comparison enables instant pattern recognition, while precise value labels support capacity planning decisions. This comparative visualization type excels when stakeholders need to evaluate categorical differences that drive resource allocation strategies.

**2. Relationship Visualizations**

**Definition**: Relationship visualizations reveal connections between variables that enable predictive understanding and operational optimization.

**Scatter plots** use Cartesian coordinates to display values for two continuous variables, with each observation represented as a point positioned according to its values on horizontal (x-axis) and vertical (y-axis) dimensions. This visualization type excels at revealing correlation patterns, outlier identification, and relationship strength assessment between numerical variables. The human visual system naturally processes spatial relationships, making scatter plots ideal for detecting linear trends, curved relationships, and data clustering patterns that inform predictive modeling decisions.

Your temperature-demand correlation of r = 0.394 achieves clarity through scatter plot positioning that enables pattern recognition and magnitude assessment simultaneously. Each data point represents a specific temperature-demand combination, while the overall pattern reveals relationship strength and the substantial role of other factors. The moderate positive correlation becomes visually apparent through the upward trend of points, while the scatter around any potential trend line demonstrates that temperature alone explains only about 15.6% of demand variation (r² ≈ 0.156, computed from the unrounded correlation).
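
The link between the correlation coefficient and "variation explained" can be checked numerically. The sketch below uses synthetic data (not the bike-sharing dataset; the slope, noise level, and sample size are arbitrary assumptions) to show that, for a straight-line fit with one predictor, the squared correlation equals the share of variance explained by the fitted line:

```python
import numpy as np

# Synthetic data for illustration only (not the bike-sharing dataset)
rng = np.random.default_rng(seed=42)
temp = rng.uniform(0, 40, size=500)                    # temperature in °C
demand = 50 + 9 * temp + rng.normal(0, 150, size=500)  # noisy linear relationship

# Pearson correlation coefficient between the two variables
r = np.corrcoef(temp, demand)[0, 1]

# Fit a least-squares line and measure the share of variance it explains
slope, intercept = np.polyfit(temp, demand, deg=1)
predicted = slope * temp + intercept
ss_residual = np.sum((demand - predicted) ** 2)
ss_total = np.sum((demand - demand.mean()) ** 2)
r_squared_from_fit = 1 - ss_residual / ss_total

print(f"r = {r:.3f}")
print(f"r squared = {r**2:.3f}")
print(f"Variance explained by the fitted line = {r_squared_from_fit:.3f}")
```

The last two printed values agree. This is why a moderate correlation can still leave most of the variation unexplained.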

**Line plots** connect data points with straight line segments to emphasize continuity and change over ordered sequences, typically time. This visualization type transforms discrete observations into continuous visual flow that highlights trends, seasonal patterns, and temporal relationships essential for operational planning. Line plots excel when the connecting dimension (usually time) has meaningful ordering and when understanding change patterns matters more than individual point values.

Daily demand patterns from 5am to midnight show clear commuting peaks and overnight valleys that enable operational understanding and strategic planning. The line plot format reveals temporal continuity essential for identifying rush hour patterns, overnight maintenance windows, and capacity planning requirements across the complete daily cycle.

**Python Example - Relationship Visualization with Binned Scatter Plot:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])

# Create temperature bins to aggregate data and reduce visual clutter
df['temp_bin'] = pd.cut(df['temp'], bins=20)
binned = df.groupby('temp_bin', observed=True)['count'].agg(['mean', 'std', 'count'])
binned['se'] = binned['std'] / np.sqrt(binned['count'])
# Use bin midpoints (converted to plain floats) as the x positions
binned['temp_center'] = binned.index.map(lambda x: x.mid).astype(float)

# Create figure for professional presentation
fig, ax = plt.subplots(figsize=(10, 6))

# Plot binned means with 95% confidence intervals
ax.errorbar(binned['temp_center'], binned['mean'], 
            yerr=binned['se']*1.96,  # 95% confidence interval
            fmt='o', markersize=8, capsize=5, capthick=2,
            color='#2ECC71', ecolor='#2ECC71', alpha=0.7,
            label='Mean Demand (95% CI)')

# Add trend line using original (unbinned) data
slope, intercept, r_value, p_value, std_err = stats.linregress(df['temp'], df['count'])
line_x = np.array([df['temp'].min(), df['temp'].max()])
line_y = slope * line_x + intercept
ax.plot(line_x, line_y, 'r--', linewidth=2.5, label=f'Trend Line (r = {r_value:.3f})')

ax.set_xlabel('Temperature (°C)', fontsize=12, fontweight='bold')
ax.set_ylabel('Hourly Bike Rentals', fontsize=12, fontweight='bold')
ax.set_title('Temperature-Demand Relationship: Warmer Weather Drives Higher Ridership',
             fontsize=13, fontweight='bold')
ax.legend(loc='upper left', fontsize=10)
ax.grid(True, alpha=0.3)

# Add correlation annotation (bottom right, so it does not overlap the legend)
ax.text(0.95, 0.05, f'Correlation: {r_value:.3f}\nR² = {r_value**2:.3f}',
        transform=ax.transAxes, fontsize=11,
        horizontalalignment='right', verticalalignment='bottom',
        bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))

plt.tight_layout()
plt.show()

# Print key insights
print(f"Temperature-demand correlation: r = {r_value:.3f}")
print(f"Temperature explains {(r_value**2)*100:.1f}% of demand variation")
print(f"For each 1°C increase, demand increases by approximately {slope:.1f} rentals")

When visualizing relationships with large datasets (thousands of hourly observations), raw scatter plots create visual clutter that obscures patterns. This **binned scatter approach aggregates observations into temperature bins**, calculating mean demand and confidence intervals for each bin. The result is a clean, professional visualization that immediately reveals the temperature-demand relationship while maintaining statistical rigor through confidence interval display.
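
The error bars in the code above follow the usual normal-approximation interval for a mean. As a sketch, for a temperature bin $b$ with $n_b$ observations, sample mean $\bar{y}_b$, and sample standard deviation $s_b$ (symbols introduced here for illustration), the standard error and the 95% interval are:

$$
\mathrm{SE}_b = \frac{s_b}{\sqrt{n_b}}, \qquad \bar{y}_b \pm 1.96 \, \mathrm{SE}_b
$$

Bins with few observations have a larger standard error and therefore wider error bars. The 1.96 multiplier assumes the bin mean is approximately normally distributed, which is a reasonable assumption for bins with many observations and a weaker one for sparse bins.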

This visualization **clearly demonstrates the positive temperature-demand relationship** through an upward trend of binned means. The correlation coefficient r = 0.394 indicates a moderate relationship, with temperature explaining 15.6% of demand variation. The confidence intervals (error bars) reveal consistent patterns across most temperature ranges, with slightly wider intervals at temperature extremes due to fewer observations. The trend line quantifies the average relationship: **for each 1°C temperature increase, demand increases by approximately 9.2 rentals per hour**. 

This aggregated presentation enables immediate pattern recognition for stakeholders while communicating statistical uncertainty appropriately. The visualization demonstrates that while temperature significantly influences demand, other factors (time of day, day of week, seasonality beyond temperature) play substantial roles—a critical insight for building comprehensive demand forecasting models that weather alone cannot provide.

**3. Distribution Visualizations**

**Definition**: Distribution visualizations reveal data spread, central tendencies, and outlier patterns essential for operational planning and risk management.

**Histograms** are bar charts that display the frequency distribution of continuous data by dividing values into bins and showing how many observations fall within each range. Each bar represents a range of values (bin), and the height indicates how frequently those values occur in the dataset. For bike-sharing demand analysis, histograms reveal **operational patterns essential for capacity planning**: if analysis shows that 45% of operating hours experience 200-400 rides per hour, 23% experience 401-600 rides per hour, and 15% exceed 600 rides per hour, this frequency information enables tiered operational strategies with appropriate staffing and bike availability for different demand conditions.
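
To make frequency shares concrete, here is a small sketch that computes the percentage of hours falling into each demand tier. It uses synthetic right-skewed data and hypothetical tier thresholds (200, 400, and 600 rides per hour), so the printed shares are illustrative and will not match the percentages quoted above:

```python
import numpy as np

# Synthetic right-skewed hourly demand, for illustration only
rng = np.random.default_rng(seed=0)
hourly_demand = rng.gamma(shape=1.2, scale=160, size=2000)

# Hypothetical tier thresholds (rides per hour) and matching labels
thresholds = [200, 400, 600]
tier_labels = ["0-200", "200-400", "400-600", "600+"]

# np.digitize returns, for each hour, the index of the tier it falls into
tier_index = np.digitize(hourly_demand, thresholds)
counts = np.bincount(tier_index, minlength=len(tier_labels))
shares = counts / counts.sum() * 100

for label, share in zip(tier_labels, shares):
    print(f"{label:>8} rides/hour: {share:5.1f}% of hours")
```

A histogram draws the same information, usually with many narrower bins. The tier shares are what an operations team would read off it.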

**Box plots** (also called box-and-whisker plots) provide a standardized summary of data distribution through five key statistics: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum values. The "box" spans from Q1 to Q3 (containing the middle 50% of data), with a line marking the median. In matplotlib's default settings, the "whiskers" extend to the most extreme observations within 1.5 times the interquartile range (IQR = Q3 - Q1) of the box, and any points beyond them are drawn individually as outliers. For hourly bike demand with median = 145 rides, Q1 = 42 rides, and Q3 = 284 rides, the box plot immediately reveals that **half of all operating hours fall between 42 and 284 rides**, providing clear operational planning ranges and performance benchmarks for resource allocation and service level expectations.
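
The statistics behind a box plot can be computed directly. The sketch below assumes the common 1.5 × IQR whisker rule (matplotlib's default `whis=1.5`); `box_plot_summary` and the sample values are hypothetical, introduced only for illustration:

```python
import numpy as np


def box_plot_summary(values, whis=1.5):
    """Compute the statistics a box plot draws, using the whis x IQR whisker rule."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    inside = values[(values >= lower_fence) & (values <= upper_fence)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside.min(),    # most extreme point inside the lower fence
        "whisker_high": inside.max(),   # most extreme point inside the upper fence
        "n_outliers": int(values.size - inside.size),
    }


# Hypothetical hourly ride counts with one extreme hour
sample = [10, 20, 30, 40, 50, 60, 70, 80, 90, 500]
summary = box_plot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Applying the same rule to the quartiles quoted above (Q1 = 42, Q3 = 284) gives an IQR of 242 and an upper fence of 284 + 1.5 × 242 = 647 rides, so hours above that level would be drawn as individual outlier points.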

**Python Example - Distribution Visualization with Histogram and Box Plot:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])

# Create side-by-side distribution visualizations
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
fig.suptitle('Hourly Demand Distribution Analysis', fontsize=14, fontweight='bold')

# Panel 1: Histogram showing frequency distribution
axes[0].hist(df['count'], bins=40, color='#3498DB', edgecolor='black', alpha=0.7)
axes[0].set_xlabel('Hourly Bike Rentals', fontsize=11)
axes[0].set_ylabel('Frequency (Number of Hours)', fontsize=11)
axes[0].set_title('Demand Frequency Distribution', fontsize=12, fontweight='bold')
axes[0].axvline(x=df['count'].mean(), color='red', linestyle='--',
                linewidth=2, label=f'Mean: {df["count"].mean():.0f}')
axes[0].axvline(x=df['count'].median(), color='orange', linestyle='--',
                linewidth=2, label=f'Median: {df["count"].median():.0f}')
axes[0].legend()
axes[0].grid(axis='y', alpha=0.3)

# Panel 2: Box plot showing distribution characteristics
box_data = [df['count']]
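bp = axes[1].boxplot(box_data, patch_artist=True)  # vertical orientation is the default
axes[1].set_xticklabels(['Hourly Demand'])  # label the box in a way that works across matplotlib versions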
bp['boxes'][0].set_facecolor('#E74C3C')
bp['boxes'][0].set_alpha(0.7)
axes[1].set_ylabel('Hourly Bike Rentals', fontsize=11)
axes[1].set_title('Distribution Summary Statistics', fontsize=12, fontweight='bold')
axes[1].grid(axis='y', alpha=0.3)

# Add quartile annotations to box plot
q1 = df['count'].quantile(0.25)
q2 = df['count'].quantile(0.50)  # median
q3 = df['count'].quantile(0.75)
axes[1].text(1.15, q1, f'Q1: {q1:.0f}', fontsize=9, va='center')
axes[1].text(1.15, q2, f'Median: {q2:.0f}', fontsize=9, va='center', fontweight='bold')
axes[1].text(1.15, q3, f'Q3: {q3:.0f}', fontsize=9, va='center')

plt.tight_layout()
plt.show()

# Print distribution insights
print("=== Demand Distribution Summary ===")
print(f"Mean: {df['count'].mean():.1f} rentals per hour")
print(f"Median: {df['count'].median():.1f} rentals per hour")
print(f"Standard deviation: {df['count'].std():.1f} rentals")
print(f"25th percentile (Q1): {q1:.1f} rentals")
print(f"75th percentile (Q3): {q3:.1f} rentals")
print(f"Interquartile range (IQR): {(q3-q1):.1f} rentals")
print(f"\nInterpretation: 50% of hours fall between {q1:.0f} and {q3:.0f} rentals")

These distribution visualizations **reveal critical capacity planning insights** through complementary views. The histogram (left) shows that hourly demand follows a right-skewed distribution with most hours experiencing low-to-moderate demand (under 200 rentals), but a substantial tail extending to 800+ rentals during peak periods. The mean (192 rentals) exceeds the median (145 rentals), confirming the right-skewed pattern where high-demand periods pull the average upward. This pattern tells operations: "Plan for moderate demand most of the time, but maintain surge capacity for frequent high-demand periods."

The box plot (right) provides precise quartiles: 25% of hours see fewer than 42 rentals (overnight/early morning periods), 50% fall between 42 and 284 rentals (typical daytime operations), and 25% exceed 284 rentals (peak commute and weekend afternoon periods). This distribution intelligence enables **tiered operational strategies**: minimal staffing for hours below Q1 (fewer than 42 rentals), standard operations for hours between Q1 and Q3 (42-284 rentals), and surge capacity deployment for high-demand hours above Q3 (more than 284 rentals).
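
The tiering idea can be expressed in a few lines. This is a sketch under stated assumptions: the `assign_tiers` helper, the tier names, and the sample counts are hypothetical, and the thresholds are simply the first and third quartiles of whatever data is passed in.

```python
import numpy as np


def assign_tiers(hourly_counts):
    """Label each hour 'minimal', 'standard', or 'surge' using Q1 and Q3 of the data."""
    counts = np.asarray(hourly_counts, dtype=float)
    q1, q3 = np.percentile(counts, [25, 75])
    tiers = np.where(counts < q1, "minimal",
                     np.where(counts > q3, "surge", "standard"))
    return q1, q3, tiers


# Hypothetical hourly ride counts (illustrative values only)
hourly_counts = [5, 12, 40, 95, 150, 210, 290, 480]
q1, q3, tiers = assign_tiers(hourly_counts)

print(f"Q1 = {q1:.1f}, Q3 = {q3:.1f}")
for count, tier in zip(hourly_counts, tiers):
    print(f"{count:>4} rides -> {tier}")
```

Applied to the real dataset, the same function would use the quartiles reported above as its cut points, since they are computed from the data passed in.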

## 3. Transportation Data Visualization Applications

Now, let's apply these principles to specific transportation analysis challenges. This section demonstrates how theoretical concepts translate into practical visualization solutions for three critical areas: temporal patterns, weather relationships, and business performance. You'll see how to create effective visualizations that communicate complex transportation insights to business stakeholders.

### 3.1. Temporal Pattern Visualization Strategies

Transportation systems exhibit **complex temporal patterns operating simultaneously across multiple time scales**. Let's explore effective visualization strategies that reveal these multi-scale patterns while maintaining clarity and enabling business decision-making. We'll examine daily demand profiles and monthly patterns.

**Daily Demand Profile Visualization**

Daily demand patterns provide essential operational insights that must be communicated clearly for quick understanding and informed decision-making. Creating an effective **daily demand profile** requires careful selection of visualization elements that reveal temporal patterns while supporting comparison across different operational contexts.

A **line plot** offers the clearest representation for hourly demand patterns. Because time progresses continuously throughout the day, connecting data points with lines preserves the natural temporal flow and helps viewers follow demand evolution from hour to hour. The **horizontal axis** should represent the hour of the day (0-23), while the **vertical axis** shows mean hourly demand — enabling both pattern recognition and quantitative interpretation.

**Color-coding with multiple lines** enhances this visualization by allowing direct comparison between different demand contexts, such as weekdays versus weekends. Plotting both patterns on the same chart reveals how temporal behaviors differ across operational scenarios without requiring viewers to mentally compare separate visualizations. This approach maintains data continuity while making categorical contrasts immediately visible.

**Highlighting critical periods** provides additional interpretive guidance. Subtle background shading can draw attention to key operational windows—such as morning and evening rush periods—helping stakeholders quickly identify when demand surges require increased capacity or staffing adjustments.

Together, these visualization choices create an intuitive daily profile that supports both immediate pattern recognition and deeper strategic analysis, enabling operational teams to understand when and how demand shifts throughout different types of days.

**Python Example - Daily Demand Profile Visualization:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])
df['hour'] = df['datetime'].dt.hour

# Calculate mean demand by hour for weekdays vs weekends
weekday_hourly = df[df['workingday'] == 1].groupby('hour')['count'].mean()
weekend_hourly = df[df['workingday'] == 0].groupby('hour')['count'].mean()

# Create line plot showing daily patterns
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(weekday_hourly.index, weekday_hourly.values, marker='o',
        linewidth=2, markersize=6, color='#1f77b4', label='Weekday')
ax.plot(weekend_hourly.index, weekend_hourly.values, marker='s',
        linewidth=2, markersize=6, color='#ff7f0e', label='Weekend')

ax.set_xlabel('Hour of Day', fontsize=12)
ax.set_ylabel('Mean Hourly Demand (rides)', fontsize=12)
ax.set_title('Daily Demand Profile: Weekday vs Weekend Patterns',
             fontsize=14, fontweight='bold')
ax.set_xticks(range(0, 24, 2))
ax.grid(True, alpha=0.3)
# Highlight peak periods
ax.axvspan(7, 9, alpha=0.2, color='yellow', label='Morning Peak')
ax.axvspan(17, 19, alpha=0.2, color='orange', label='Evening Peak')

# Draw the legend last so the shaded peak periods are included
ax.legend(loc='upper left', fontsize=11)

plt.tight_layout()
plt.show()

# Print key insights
print(f"Weekday morning peak (8am): {weekday_hourly[8]:.0f} rides")
print(f"Weekday evening peak (5pm): {weekday_hourly[17]:.0f} rides")
print(f"Weekend midday peak: {weekend_hourly.max():.0f} rides at {weekend_hourly.idxmax()}:00")

This line plot visualization **immediately reveals distinct operational patterns** between weekdays and weekends. Weekdays show clear bimodal patterns with morning commute peaks at 8am (480 rides) and evening peaks at 5pm (529 rides), while weekends exhibit a single broad midday peak at 1pm (388 rides) reflecting recreational usage. The sharp weekday peaks, roughly 90 to 140 rides above the weekend maximum, demonstrate commuter-driven demand requiring surge capacity deployment during rush hours. The temporal continuity preserved by line connections enables stakeholders to understand demand evolution throughout the day, supporting staffing optimization and capacity planning decisions.
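
The bimodal-versus-single-peak distinction can also be checked programmatically. The sketch below is a simple local-maximum test on two hypothetical 24-hour profiles: the values are invented to resemble the shapes described above, and `local_peaks` is an illustrative helper. Real hourly means are noisier and may need smoothing before a test like this is useful.

```python
def local_peaks(hourly_means):
    """Return the hours whose mean demand exceeds both neighboring hours."""
    return [hour for hour in range(1, len(hourly_means) - 1)
            if hourly_means[hour] > hourly_means[hour - 1]
            and hourly_means[hour] > hourly_means[hour + 1]]


# Hypothetical mean rides per hour (index = hour of day, 0-23)
weekday_profile = [20, 10, 6, 4, 5, 25, 100, 300, 480, 250, 150, 160,
                   170, 180, 200, 250, 350, 529, 500, 350, 250, 180, 120, 60]
weekend_profile = [70, 55, 40, 20, 8, 8, 15, 35, 80, 150, 230, 300,
                   360, 388, 380, 370, 350, 320, 270, 220, 170, 130, 100, 80]

print("Weekday peak hours:", local_peaks(weekday_profile))
print("Weekend peak hours:", local_peaks(weekend_profile))
```

Two peak hours indicate a commuter-style profile and a single peak indicates a leisure-style profile, which is the same contrast the line plot shows visually.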

**Monthly Pattern Analysis**

Monthly pattern visualization reveals seasonal cycles and growth trends essential for strategic planning and resource allocation. Creating an effective **monthly demand visualization** requires choosing elements that highlight seasonal transitions while enabling precise quantitative comparisons across the annual cycle.

**Bar charts** (or column charts) excel at presenting monthly patterns because they treat each month as a discrete category, enabling clear magnitude comparisons between periods. Unlike line plots that emphasize continuity, bars focus attention on the relative height of each month's demand, making it immediately clear which months require more or less operational capacity.

**Color-coding by season** significantly enhances pattern recognition in monthly visualizations. Assigning distinct colors to winter, spring, summer, and fall months groups the data into meaningful seasonal clusters without requiring separate charts. This approach helps viewers instantly identify seasonal phases while maintaining the month-by-month detail needed for operational planning.

**Season labels or annotations** provide additional interpretive guidance, helping stakeholders quickly understand which months belong to which seasonal patterns. Combined with the color coding, these labels create an intuitive visualization that serves both high-level strategic review and detailed planning needs.

Together, these visualization choices reveal how demand evolves across the complete annual cycle, supporting decisions about seasonal staffing adjustments, maintenance scheduling during low-demand periods, and capacity expansion for peak seasons.

**Python Example - Monthly Seasonal Pattern Visualization:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])
df['month'] = df['datetime'].dt.month
df['year'] = df['datetime'].dt.year

# Calculate monthly mean demand
monthly_demand = df.groupby('month')['count'].mean()

# Create column chart for monthly patterns
fig, ax = plt.subplots(figsize=(12, 6))
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
colors = ['#5DADE2' if m in [12,1,2] else '#2ECC71' if m in [3,4,5]
          else '#E74C3C' if m in [6,7,8] else '#F39C12'
          for m in range(1, 13)]

ax.bar(months, monthly_demand.values, color=colors, edgecolor='black', linewidth=1.5)
ax.set_xlabel('Month', fontsize=12)
ax.set_ylabel('Mean Hourly Demand (rides)', fontsize=12)
ax.set_title('Seasonal Demand Patterns Across Annual Cycle',
             fontsize=14, fontweight='bold')
ax.grid(axis='y', alpha=0.3)

# Add season labels above the bars. Bar x-positions run from 0 (Jan) to 11 (Dec),
# so each label sits over the middle month of its season (Winter over Jan-Feb;
# December at the far right is also a winter month).
label_y = monthly_demand.max() * 1.05
ax.set_ylim(0, monthly_demand.max() * 1.15)
ax.text(0.5, label_y, 'Winter', fontsize=10,
        ha='center', style='italic', color='#5DADE2')
ax.text(3, label_y, 'Spring', fontsize=10,
        ha='center', style='italic', color='#2ECC71')
ax.text(6, label_y, 'Summer', fontsize=10,
        ha='center', style='italic', color='#E74C3C')
ax.text(9, label_y, 'Fall', fontsize=10,
        ha='center', style='italic', color='#F39C12')

plt.tight_layout()
plt.show()

# Print seasonal insights
winter_months = [12, 1, 2]
spring_months = [3, 4, 5]
summer_months = [6, 7, 8]
fall_months = [9, 10, 11]

winter_mean = monthly_demand[winter_months].mean()
spring_mean = monthly_demand[spring_months].mean()
summer_mean = monthly_demand[summer_months].mean()
fall_mean = monthly_demand[fall_months].mean()

print(f"Winter mean: {winter_mean:.0f} rides per hour")
print(f"Spring mean: {spring_mean:.0f} rides per hour")
print(f"Summer mean: {summer_mean:.0f} rides per hour")
print(f"Fall mean: {fall_mean:.0f} rides per hour")
print(f"Spring growth from winter: {((spring_mean/winter_mean - 1) * 100):.1f}%")

This monthly column chart **reveals clear seasonal cycles** in bike-sharing demand. The color coding by season (winter blue, spring green, summer red, fall orange) enhances pattern recognition while maintaining quantitative precision. The visualization shows winter baseline around 125 rides per hour, spring growth reaching 184 rides per hour (a 46.8% increase), summer peaks at 237 rides per hour, and fall maintaining strong demand at 218 rides per hour. This seasonal intelligence enables **strategic resource allocation and annual planning** that optimizes business performance across the complete annual cycle, with summer showing nearly double (89% higher than) winter demand.

### 3.2. Weather Relationship Visualization Techniques

Weather represents **one of the most significant environmental factors affecting transportation demand**, requiring sophisticated visualization approaches that reveal complex relationships while supporting operational decision-making. Let's explore how to visualize temperature-demand relationships and integrate multiple weather variables for comprehensive understanding.

**Temperature-Demand Relationship Analysis**

Temperature correlation with demand exhibits complex patterns that require careful visualization design to reveal both overall trends and contextual variations. Creating an effective **temperature-demand visualization** requires choosing elements that show the relationship strength while managing visual complexity in large datasets.

**Scatter plots** provide optimal presentation for temperature-demand relationships because both variables are continuous measurements requiring precise positioning. The **horizontal axis** should represent temperature (allowing viewers to easily map environmental conditions), while the **vertical axis** shows demand. This arrangement enables viewers to assess both the overall trend direction and the strength of the relationship through point clustering patterns.

However, raw scatter plots with thousands of data points can become visually cluttered and difficult to interpret. **Binning temperature data** addresses this challenge by grouping observations into temperature ranges and plotting the mean demand for each bin. This aggregation reveals patterns more clearly while reducing visual noise, making trends immediately apparent even in very large datasets.

**Color-coding by season** adds critical contextual information to temperature analysis. Assigning distinct colors to different seasons reveals that identical temperatures may produce different demand levels depending on seasonal context—demonstrating that factors beyond temperature (such as daylight hours, vacation patterns, or school schedules) influence demand. This multi-dimensional view provides richer operational insights than simple temperature correlation alone.

**Trend lines with correlation statistics** quantify relationship strength and enable predictive applications. Adding a best-fit line with its correlation coefficient (r-value) and coefficient of determination (R²) provides precise relationship quantification while maintaining visual clarity. **Confidence intervals** around binned means communicate measurement uncertainty, helping stakeholders understand the reliability of patterns at different temperature ranges.

Together, these visualization choices reveal how environmental conditions influence demand while accounting for seasonal complexity—supporting both operational weather-response planning and longer-term strategic decision-making.
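The confidence-interval arithmetic behind error bars on binned means can be sketched with the standard library alone. The ride counts below are made-up values for a single hypothetical temperature bin, not figures from the dataset, and the 1.96 multiplier assumes a normal approximation, which becomes less reliable for bins with few observations.

```python
import math
import statistics

# Hypothetical hourly ride counts falling in one temperature bin
# (illustrative values only, not taken from the bike-sharing dataset).
bin_counts = [120, 135, 150, 165, 180]

def mean_ci95(values):
    """Return (mean, half_width) of a normal-approximation 95% interval."""
    mean = statistics.mean(values)
    # Standard error = sample standard deviation / sqrt(n)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, 1.96 * se

mean, half_width = mean_ci95(bin_counts)
print(f"Bin mean: {mean:.1f} rides, 95% CI: ±{half_width:.1f}")
```

Because the standard error shrinks with the square root of the bin size, sparsely populated bins at the temperature extremes will tend to show wider error bars than well-populated bins in the middle of the range.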

**Python Example - Temperature-Demand Binned Scatter with Seasonal Context:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])

# Map seasons to colors for enhanced visualization
season_colors = {1: '#5DADE2', 2: '#2ECC71', 3: '#E74C3C', 4: '#F39C12'}
season_names = {1: 'Winter', 2: 'Spring', 3: 'Summer', 4: 'Fall'}

# Create figure for professional presentation
fig, ax = plt.subplots(figsize=(12, 7))

# Create temperature bins for each season to reduce visual clutter
for season in [1, 2, 3, 4]:
    season_data = df[df['season'] == season].copy()
    
    # Bin temperature data within each season
    season_data['temp_bin'] = pd.cut(season_data['temp'], bins=15)
    binned = season_data.groupby('temp_bin', observed=True)['count'].agg(['mean', 'std', 'count'])
    binned['se'] = binned['std'] / np.sqrt(binned['count'])
    binned['temp_center'] = binned.index.map(lambda x: x.mid)
    
    # Plot binned means with 95% confidence intervals by season
    ax.errorbar(binned['temp_center'], binned['mean'], 
                yerr=binned['se']*1.96,  # 95% confidence interval
                fmt='o', markersize=7, capsize=4, capthick=1.5,
                color=season_colors[season], ecolor=season_colors[season], 
                alpha=0.7, label=season_names[season])

# Add overall trend line using original (unbinned) data
slope, intercept, r_value, p_value, std_err = stats.linregress(df['temp'], df['count'])
line_x = np.array([df['temp'].min(), df['temp'].max()])
line_y = slope * line_x + intercept
ax.plot(line_x, line_y, 'k--', linewidth=2.5, label=f'Trend Line (r = {r_value:.3f})')

ax.set_xlabel('Temperature (°C)', fontsize=13, fontweight='bold')
ax.set_ylabel('Hourly Bike Rentals', fontsize=13, fontweight='bold')
ax.set_title('Temperature-Demand Relationship with Seasonal Context',
             fontsize=15, fontweight='bold')
ax.legend(title='Season', loc='upper left', fontsize=10, framealpha=0.9)
ax.grid(True, alpha=0.3)

# Add correlation statistics box
stats_text = f'Correlation: r = {r_value:.3f}\n'
stats_text += f'R² = {r_value**2:.3f}\n'
stats_text += f'Temperature explains {(r_value**2)*100:.1f}%\nof demand variation'
ax.text(0.98, 0.05, stats_text, transform=ax.transAxes,
        fontsize=10, ha='right', va='bottom',
        bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))

plt.tight_layout()
plt.show()

# Print detailed insights
print(f"Overall temperature-demand correlation: r = {r_value:.3f}")
print(f"Temperature explains {(r_value**2)*100:.1f}% of demand variation")
print(f"Trend equation: Demand = {slope:.1f} × Temperature + {intercept:.1f}")
print("\nSeasonal temperature correlations:")
for season in [1, 2, 3, 4]:
    season_data = df[df['season'] == season]
    season_corr = season_data['temp'].corr(season_data['count'])
    print(f"  {season_names[season]}: r = {season_corr:.3f}")

This **binned scatter approach with seasonal color coding** resolves the visual clutter problem inherent in plotting thousands of raw data points while preserving critical seasonal insights. By aggregating observations into temperature bins within each season, the visualization clearly reveals how temperature-demand relationships vary across winter (blue), spring (green), summer (red), and fall (orange) periods. The confidence intervals show pattern consistency within temperature ranges while highlighting data sparsity at temperature extremes.

The visualization **reveals seasonal dynamics** beyond a simple temperature effect. The overall correlation (r = 0.394) masks seasonal variation: winter shows the strongest temperature-demand correlation (r = 0.457), followed by spring (r = 0.404), summer (r = 0.366), and fall (r = 0.324). This declining pattern suggests that **temperature matters most during cold conditions**: when winter temperatures rise from 5°C to 15°C, riders respond strongly. During warmer seasons, other factors (daylight hours, vacation patterns, weekend effects) account for more of the variation, weakening the pure temperature effect. At similar temperatures around 20°C, summer demand (red points) often exceeds spring demand (green points), showing that seasonal context matters beyond temperature alone.

Operationally, this supports season-specific weather forecasting models, recognizes that the strength of the temperature relationship varies systematically across the annual cycle, and indicates that comprehensive demand forecasting should combine temperature with seasonal indicators and temporal patterns.
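To put the correlation coefficients quoted above on a more intuitive scale, the short sketch below converts each r into the share of demand variance a simple linear temperature fit would explain (R²). The r values are the ones reported in this section; the helper name `variance_explained` is introduced here purely for illustration.

```python
# Correlation coefficients quoted in the text above.
correlations = {
    "Overall": 0.394,
    "Winter": 0.457,
    "Spring": 0.404,
    "Summer": 0.366,
    "Fall": 0.324,
}

def variance_explained(r):
    """Share of variance explained by a simple linear fit, in percent."""
    return r ** 2 * 100

shares = {name: variance_explained(r) for name, r in correlations.items()}
for name, share in shares.items():
    print(f"{name}: r = {correlations[name]:.3f} -> R² = {share:.1f}%")
```

Even in winter, where the correlation is strongest, temperature accounts for only about a fifth of the variation in hourly demand, which is consistent with the point that other factors must be modeled alongside it.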

**Multi-Weather Variable Integration**

Comprehensive weather analysis requires **integrating multiple environmental factors** that interact to influence transportation demand. Temperature, precipitation, humidity, and wind speed exhibit complex interactions requiring sophisticated visualization approaches that reveal how these variables combine to drive demand patterns.

Multi-variable scatter plots reveal interaction effects between different weather conditions. **Temperature-demand relationships vary under different precipitation conditions**: clear weather typically shows stronger temperature correlation, while precipitation days may exhibit weaker temperature effects as precipitation becomes a dominant demand driver. Understanding these interactions enables more sophisticated forecasting models that account for weather complexity rather than treating variables in isolation.

Color coding enables third-variable integration within two-dimensional scatter plots. A temperature-demand scatter plot with color indicating precipitation levels reveals that **high demand requires both favorable temperature and absence of precipitation**. This multi-dimensional insight supports operational planning that considers comprehensive weather conditions rather than single-factor responses.

**Python Example - Multi-Weather Variable Integration with Precipitation Context:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])

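# Create precipitation indicator (binary: no precipitation vs. rainy/snowy).
# Assumes the dataset's weather codes are 1 = clear, 2 = mist/cloudy,
# 3 = light rain/snow, 4 = heavy rain/snow, so 'Clear Weather' below
# also includes misty and cloudy hours.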
df['precipitation'] = (df['weather'] >= 3).astype(int)
df['precip_label'] = df['precipitation'].map({0: 'Clear Weather', 1: 'Precipitation'})

# Create temperature bins for aggregation
df['temp_bin'] = pd.cut(df['temp'], bins=15)

# Aggregate by temperature bin and precipitation condition
binned = df.groupby(['temp_bin', 'precip_label'], observed=True)['count'].agg(['mean', 'std', 'count'])
binned['se'] = binned['std'] / np.sqrt(binned['count'])
binned = binned.reset_index()
binned['temp_center'] = binned['temp_bin'].map(lambda x: x.mid)

# Create figure for multi-variable weather integration
fig, ax = plt.subplots(figsize=(12, 7))

# Plot clear weather conditions
clear_data = binned[binned['precip_label'] == 'Clear Weather']
ax.errorbar(clear_data['temp_center'], clear_data['mean'],
            yerr=clear_data['se']*1.96,  # 95% confidence interval
            fmt='o', markersize=8, capsize=5, capthick=2,
            color='#2ECC71', ecolor='#2ECC71', alpha=0.8,
            label='Clear Weather')

# Plot precipitation conditions
precip_data = binned[binned['precip_label'] == 'Precipitation']
ax.errorbar(precip_data['temp_center'], precip_data['mean'],
            yerr=precip_data['se']*1.96,  # 95% confidence interval
            fmt='s', markersize=8, capsize=5, capthick=2,
            color='#3498DB', ecolor='#3498DB', alpha=0.8,
            label='Precipitation')

# Calculate separate correlations for each weather condition
clear_df = df[df['precip_label'] == 'Clear Weather']
precip_df = df[df['precip_label'] == 'Precipitation']

clear_corr = clear_df['temp'].corr(clear_df['count'])
precip_corr = precip_df['temp'].corr(precip_df['count'])

# Add trend lines for each condition
clear_slope, clear_int, _, _, _ = stats.linregress(clear_df['temp'], clear_df['count'])
precip_slope, precip_int, _, _, _ = stats.linregress(precip_df['temp'], precip_df['count'])

temp_range = np.array([df['temp'].min(), df['temp'].max()])
ax.plot(temp_range, clear_slope * temp_range + clear_int,
        '--', linewidth=2.5, color='#27AE60', alpha=0.7,
        label=f'Clear Trend (r = {clear_corr:.3f})')
ax.plot(temp_range, precip_slope * temp_range + precip_int,
        '--', linewidth=2.5, color='#2874A6', alpha=0.7,
        label=f'Precip Trend (r = {precip_corr:.3f})')

ax.set_xlabel('Temperature (°C)', fontsize=13, fontweight='bold')
ax.set_ylabel('Hourly Bike Rentals', fontsize=13, fontweight='bold')
ax.set_title('Multi-Weather Variable Integration: Temperature-Demand Under Different Precipitation Conditions',
             fontsize=14, fontweight='bold')
ax.legend(loc='upper left', fontsize=10, framealpha=0.9)
ax.grid(True, alpha=0.3)

# Add insights box
insights_text = f'Clear Weather:\n  r = {clear_corr:.3f} | Mean = {clear_df["count"].mean():.0f}\n'
insights_text += f'Precipitation:\n  r = {precip_corr:.3f} | Mean = {precip_df["count"].mean():.0f}\n'
insights_text += f'Demand Impact:\n  {((clear_df["count"].mean()/precip_df["count"].mean() - 1)*100):.1f}% higher in clear weather'
ax.text(0.98, 0.05, insights_text, transform=ax.transAxes,
        fontsize=10, ha='right', va='bottom',
        bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.9))

plt.tight_layout()
plt.show()

# Print detailed multi-variable insights
print("=== Multi-Weather Variable Integration Analysis ===")
print(f"\nClear Weather Conditions:")
print(f"  Temperature-demand correlation: r = {clear_corr:.3f}")
print(f"  Mean hourly demand: {clear_df['count'].mean():.0f} rides")
print(f"  Temperature sensitivity: {clear_slope:.1f} rides per °C")

print(f"\nPrecipitation Conditions:")
print(f"  Temperature-demand correlation: r = {precip_corr:.3f}")
print(f"  Mean hourly demand: {precip_df['count'].mean():.0f} rides")
print(f"  Temperature sensitivity: {precip_slope:.1f} rides per °C")

print(f"\nInteraction Effects:")
demand_reduction = ((clear_df['count'].mean() - precip_df['count'].mean()) / clear_df['count'].mean()) * 100
print(f"  Precipitation reduces demand by {demand_reduction:.1f}%")
print(f"  Temperature correlation: clear {clear_corr:.3f} vs precipitation {precip_corr:.3f}")
print(f"  Temperature slope: clear {clear_slope:.1f} vs precipitation {precip_slope:.1f} rides per °C")

This **multi-variable integration visualization** shows how temperature and precipitation interact. The binned scatter with separate trend lines for clear weather (green) and precipitation (blue) demonstrates that **weather variables don't operate independently**: their combined effect determines demand. Clear weather shows a slightly stronger temperature correlation (r = 0.396) than precipitation conditions (r = 0.363). The reduction is modest, so temperature remains relevant during adverse weather, while precipitation acts as an important additional demand driver.

The visualization **quantifies the compound weather impact** on operational planning. Clear weather generates mean demand of 198 rides per hour compared to precipitation's 119 rides per hour—demonstrating that demand is 66.4% higher in clear weather, or equivalently, precipitation reduces demand by 39.9%. This substantial gap demonstrates that **high demand requires both favorable temperature AND absence of precipitation**, not just warm conditions alone. At similar temperatures around 15°C, clear weather consistently produces 50-80 more rides per hour than precipitation conditions, confirming that precipitation significantly suppresses demand even when temperatures are favorable.
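The two percentages quoted above describe the same gap measured against different baselines, which is easy to misread in a presentation. The sketch below reproduces the arithmetic from the rounded means given in the text (198 and 119 rides per hour); the variable names are chosen here for illustration.

```python
# Mean hourly demand figures quoted in the text above (rounded).
clear_mean = 198
precip_mean = 119

# The same 79-ride gap, expressed against two different baselines.
# Baseline = precipitation mean: how much higher is clear weather?
pct_higher_in_clear = (clear_mean / precip_mean - 1) * 100
# Baseline = clear-weather mean: how much lower is precipitation?
pct_lower_in_precip = (clear_mean - precip_mean) / clear_mean * 100

print(f"Clear weather is {pct_higher_in_clear:.1f}% higher than precipitation")
print(f"Precipitation is {pct_lower_in_precip:.1f}% lower than clear weather")
```

When annotating a chart, stating the baseline explicitly ("relative to clear weather") helps avoid an audience mixing up the 66.4% and 39.9% figures.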

These interaction insights enable **sophisticated forecasting approaches** that integrate multiple weather variables rather than treating them as independent factors. Operations teams can anticipate that cold temperatures during clear weather may still generate reasonable demand (100-150 rides per hour), while warm temperatures during precipitation events see demand suppression to similar levels. This understanding supports weather-responsive capacity planning that considers comprehensive environmental conditions rather than single-variable forecasts, improving resource allocation efficiency during variable weather patterns.

### 3.3. Business Performance Visualization Design

Transportation business performance requires visualization approaches that **connect operational metrics to financial outcomes and strategic objectives**. Let's explore how professional performance visualization enables stakeholder understanding and supports evidence-based decision-making through operational efficiency analysis.

**Operational Efficiency Metrics**

Operational efficiency visualization reveals system utilization patterns and optimization opportunities essential for business performance improvement. Creating an effective **operational efficiency dashboard** requires selecting visualization elements that present multiple dimensions of performance simultaneously while maintaining clarity and supporting decision-making across different organizational levels.

**Multi-panel dashboards** excel at presenting comprehensive operational intelligence because they allow multiple related metrics to be viewed together without overwhelming the viewer. Rather than forcing stakeholders to mentally integrate information from separate charts, a coordinated dashboard reveals relationships between different efficiency dimensions—enabling holistic understanding of system performance.

**Line plots for temporal patterns** effectively show how utilization evolves across daily hours or monthly periods, revealing when capacity is well-utilized versus underutilized. The continuous nature of time makes line connections appropriate for tracking efficiency trends and identifying optimization opportunities at different time scales.

**Histograms for distribution analysis** reveal the consistency and variability of operational performance. Rather than just showing average utilization rates, distributions expose whether the system operates at consistently moderate levels or experiences wide swings between overcapacity and shortage conditions. This information directly informs strategic decisions about fleet sizing and demand management.

**Bar charts for categorical comparisons** enable clear performance benchmarking across discrete categories like days of the week. Height differences immediately communicate which operational contexts achieve better or worse efficiency, supporting targeted improvement initiatives.

**Reference lines for benchmarks** (such as mean utilization or target thresholds) provide interpretive context within each panel, helping viewers quickly assess whether observed performance meets operational goals. These visual anchors transform raw patterns into actionable insights.

Together, these coordinated visualization elements create a comprehensive efficiency dashboard that supports both operational monitoring and strategic planning—revealing where resources are well-deployed and where optimization opportunities exist.
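A quick numeric check illustrates why both the mean and the median are worth drawing as reference lines on a distribution panel. The utilization values below are made-up illustrative numbers, not dataset results. A mean that sits noticeably above the median usually points to a right-skewed distribution, although this is a rule of thumb rather than a formal test.

```python
import statistics

# Hypothetical hourly utilization rates in percent (illustrative only).
utilization = [0.5, 1.0, 2.0, 4.0, 5.0, 6.0, 8.0, 25.0]

mean_util = statistics.mean(utilization)
median_util = statistics.median(utilization)

# A few busy hours pull the average up while the typical hour stays low,
# so the mean lands above the median.
mean_above_median = mean_util > median_util

print(f"Mean: {mean_util:.2f}%, median: {median_util:.2f}%")
print(f"Mean above median (suggests right skew): {mean_above_median}")
```

Showing only the mean line would overstate what a typical hour looks like; the median line keeps the "typical" case visible next to it.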

**Python Example - Operational Efficiency Dashboard:**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the bike-sharing dataset
df = pd.read_csv("https://raw.githubusercontent.com/pmarcelino/predictive-modeling/main/datasets/dataset.csv")
df['datetime'] = pd.to_datetime(df['datetime'])
df['date'] = df['datetime'].dt.date
df['hour'] = df['datetime'].dt.hour

# Assume a fleet size for the utilization calculation
# (illustrative value, not part of the dataset)
fleet_size = 3000  # bikes
# Utilization proxy: hourly rentals as a percentage of the assumed fleet size
df['utilization_rate'] = (df['count'] / fleet_size) * 100

# Create multi-panel efficiency dashboard
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
fig.suptitle('Operational Efficiency Dashboard', fontsize=16, fontweight='bold')

# Panel 1: Hourly utilization pattern
hourly_util = df.groupby('hour')['utilization_rate'].mean()
axes[0, 0].plot(hourly_util.index, hourly_util.values, marker='o',
                linewidth=2, markersize=6, color='#2ECC71')
axes[0, 0].set_xlabel('Hour of Day', fontsize=11)
axes[0, 0].set_ylabel('Utilization Rate (%)', fontsize=11)
axes[0, 0].set_title('Daily Utilization Profile', fontsize=12, fontweight='bold')
axes[0, 0].grid(True, alpha=0.3)
axes[0, 0].axhline(y=hourly_util.mean(), color='r', linestyle='--',
                    label=f'Mean: {hourly_util.mean():.1f}%')
axes[0, 0].legend()

# Panel 2: Utilization distribution
axes[0, 1].hist(df['utilization_rate'], bins=30, color='#3498DB',
                edgecolor='black', alpha=0.7)
axes[0, 1].set_xlabel('Utilization Rate (%)', fontsize=11)
axes[0, 1].set_ylabel('Frequency', fontsize=11)
axes[0, 1].set_title('Utilization Rate Distribution', fontsize=12, fontweight='bold')
axes[0, 1].axvline(x=df['utilization_rate'].mean(), color='r',
                   linestyle='--', linewidth=2, label=f'Mean: {df["utilization_rate"].mean():.1f}%')
axes[0, 1].axvline(x=df['utilization_rate'].median(), color='orange',
                   linestyle='--', linewidth=2, label=f'Median: {df["utilization_rate"].median():.1f}%')
axes[0, 1].legend()

# Panel 3: Weekly efficiency comparison
df['weekday'] = df['datetime'].dt.dayofweek  # 0 = Monday ... 6 = Sunday
weekly_util = df.groupby('weekday')['utilization_rate'].mean()
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
axes[1, 0].bar(days, weekly_util.values, color='#E74C3C', edgecolor='black')
axes[1, 0].set_xlabel('Day of Week', fontsize=11)
axes[1, 0].set_ylabel('Mean Utilization Rate (%)', fontsize=11)
axes[1, 0].set_title('Weekly Efficiency Pattern', fontsize=12, fontweight='bold')
axes[1, 0].grid(axis='y', alpha=0.3)

# Panel 4: Monthly efficiency trends
df['month'] = df['datetime'].dt.month
monthly_util = df.groupby('month')['utilization_rate'].mean()
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
axes[1, 1].plot(months, monthly_util.values, marker='s',
                linewidth=2, markersize=8, color='#9B59B6')
axes[1, 1].set_xlabel('Month', fontsize=11)
axes[1, 1].set_ylabel('Mean Utilization Rate (%)', fontsize=11)
axes[1, 1].set_title('Seasonal Efficiency Trends', fontsize=12, fontweight='bold')
axes[1, 1].grid(True, alpha=0.3)
axes[1, 1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

# Print efficiency insights
print("=== Operational Efficiency Summary ===")
print(f"Overall mean utilization: {df['utilization_rate'].mean():.2f}%")
print(f"Peak hour utilization: {hourly_util.max():.2f}% at hour {hourly_util.idxmax()}")
print(f"Lowest utilization: {hourly_util.min():.2f}% at hour {hourly_util.idxmin()}")
print(f"Utilization variability (std): {df['utilization_rate'].std():.2f}%")
print(f"Best day: {days[weekly_util.idxmax()]} ({weekly_util.max():.2f}%)")
print(f"Best month: {months[monthly_util.idxmax()-1]} ({monthly_util.max():.2f}%)")

This multi-panel efficiency dashboard **provides comprehensive operational intelligence** through coordinated visualizations. The daily utilization profile (top-left) shows peak utilization at 5pm (15.63%) and an overnight low at 4am (0.21%), an over 70-fold hourly variation. The distribution histogram (top-right) shows that most hours operate at modest utilization (mean 6.39%, median 4.83%), with a right-skewed distribution indicating occasional high-demand periods. The weekly pattern (bottom-left) is relatively consistent across days, peaking on Friday (6.59%). The monthly trend (bottom-right) shows June with peak utilization (8.07%) and the winter months with the lowest.

This integrated presentation enables **strategic decision-making across multiple operational dimensions**. Mean utilization stays below 10%, which suggests either overcapacity in the fleet or opportunities for demand stimulation during off-peak periods. Note that every percentage here scales with the assumed fleet size of 3,000 bikes, so the pattern across hours, days, and months is more reliable than the absolute levels.
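Because the utilization figures depend on an assumed fleet size, it can help to show stakeholders how the headline number moves under different assumptions. The sketch below back-calculates mean hourly rentals from the figures in the text (6.39% of 3,000 bikes) and recomputes utilization for other fleet sizes. The alternative fleet sizes of 1,000 and 2,000 are hypothetical values chosen only for illustration.

```python
# Back-calculated from the text: 6.39% mean utilization of a 3,000-bike fleet.
mean_hourly_rentals = 0.0639 * 3000  # about 191.7 rentals per hour

def utilization_pct(hourly_rentals, fleet_size):
    """Hourly rentals as a percentage of fleet size (the dashboard's proxy)."""
    return hourly_rentals / fleet_size * 100

# Hypothetical fleet sizes; only 3000 appears in the lecture.
for fleet_size in (1000, 2000, 3000):
    pct = utilization_pct(mean_hourly_rentals, fleet_size)
    print(f"Fleet of {fleet_size}: mean utilization {pct:.2f}%")
```

Halving the assumed fleet doubles every utilization figure, so conclusions about overcapacity should be checked against the client's actual fleet count before they go in front of investors.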

---

## Summary and Transition to Programming Implementation

You've mastered essential data visualization principles: **visual perception fundamentals, chart selection strategies, and professional visualization techniques**. These skills transform statistical analysis results into compelling visual narratives that communicate insights effectively to business stakeholders.

Your ability to select appropriate chart types, apply visual perception principles, and create effective transportation visualizations prepares you to create professional presentations that drive strategic decisions and operational improvements in transportation consulting.

In the next lecture, you'll learn how to implement these visualization concepts using Python libraries, creating production-quality charts that communicate demand patterns, weather relationships, and business recommendations to your bike-sharing client.