The first step, before we start to build a model, is always to take a closer look at the data by
computing some summary statistics of the data set. This step already yields insights into the data that
help you build your model and interpret your results later on.
Explore the provided data set, for example by plotting the individual features. Based on your analysis,
answer the following questions:
(i) Which are the numerical features and which are the categorical features?
(ii) Is there a noticeable trend in when the availability of bicycles needs to be increased? Study this
question from various perspectives:
• Can any trend be seen comparing different hours, weeks, and months?
• Is there any difference between weekdays and holidays?
• Is there any trend depending on the weather (rainy days, snowy days, etc.)?
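One way to approach questions (i) and (ii) is sketched below. This is a minimal illustration, not a prescribed solution: the helper names `split_feature_types` and `demand_rate_by`, the `max_categories` threshold, the toy data, and the column names `hour_of_day` and `temp` are all assumptions made for the example. Only `holiday`, `increase_stock`, and the label `low_bike_demand` appear in the notebook itself; the other label string in the toy data is a placeholder.

```python
import pandas as pd

def split_feature_types(df, target, max_categories=24):
    """Heuristically split feature columns into numerical and categorical.

    Numeric columns with at most `max_categories` distinct values (flags,
    hours, months) are treated as categorical. The threshold is an assumption
    and should be checked against the actual data.
    """
    numerical, categorical = [], []
    for col in df.columns:
        if col == target:
            continue
        is_numeric = pd.api.types.is_numeric_dtype(df[col])
        if is_numeric and df[col].nunique() > max_categories:
            numerical.append(col)
        else:
            categorical.append(col)
    return numerical, categorical

def demand_rate_by(df, feature, target="increase_stock", negative="low_bike_demand"):
    """Percentage of rows NOT labelled `negative` within each level of `feature`."""
    is_high = df[target] != negative
    return is_high.groupby(df[feature]).mean() * 100

# Toy data standing in for the real CSV (values are made up for illustration).
toy = pd.DataFrame({
    "hour_of_day": [8, 8, 17, 17, 3, 3],
    "holiday": [0, 0, 0, 1, 1, 0],
    "temp": [10.5, 12.1, 20.3, 21.0, 5.2, 4.8],
    "increase_stock": [
        "low_bike_demand", "high_bike_demand", "high_bike_demand",
        "high_bike_demand", "low_bike_demand", "low_bike_demand",
    ],
})

numerical, categorical = split_feature_types(toy, target="increase_stock", max_categories=3)
rate_by_hour = demand_rate_by(toy, "hour_of_day")
print(numerical, categorical)
print(rate_by_hour)
```

The same `demand_rate_by` call can be repeated for each perspective in the list (for example a month, weekday, or weather column, whatever they are called in the real data), which keeps the comparisons consistent.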
Write concise answers to each question and support your findings with evidence (statistics, plots,
etc.). Discuss the results. Additionally, you can explore the correlation of features, outliers, the range
of values, and other aspects of the data.
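The optional checks mentioned above (value ranges, correlations, outliers) could be sketched as follows. The column names and values in the toy frame are invented for illustration, `iqr_outlier_counts` is a hypothetical helper, and the 1.5 × IQR rule is just one common outlier convention, not something the assignment prescribes.

```python
import pandas as pd

def iqr_outlier_counts(df, columns, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] for each column."""
    counts = {}
    for col in columns:
        q1 = df[col].quantile(0.25)
        q3 = df[col].quantile(0.75)
        iqr = q3 - q1
        mask = (df[col] < q1 - k * iqr) | (df[col] > q3 + k * iqr)
        counts[col] = int(mask.sum())
    return counts

# Toy numeric features (made-up values, not taken from the real data set).
toy = pd.DataFrame({
    "temp": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
    "humidity": [80.0, 78.0, 76.0, 74.0, 72.0, 70.0],
    "windspeed": [5.0, 6.0, 5.5, 6.5, 5.0, 60.0],
})

value_ranges = toy.agg(["min", "max"])   # range of values per column
correlations = toy.corr()                # pairwise Pearson correlation
outliers = iqr_outlier_counts(toy, toy.columns)

print(value_ranges)
print(correlations.round(2))
print(outliers)
```

On the real data, the numeric columns would be selected first (for example with `df.select_dtypes("number")`) before calling `corr()` or the outlier helper.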

In [2]:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("training_data_ht2025.csv")

# Binary indicator: 1 for high demand, i.e. any label other than "low_bike_demand"
df['high_demand'] = (df['increase_stock'] != 'low_bike_demand').astype(int)

# Group by holiday and calculate percentage of high demand hours
holiday_stats = df.groupby('holiday')['high_demand'].mean() * 100

# Plot
plt.figure(figsize=(8, 6))
bars = plt.bar(
    [0, 1],
    holiday_stats.values,
    color=['#3498db', '#e74c3c'],
    width=0.6
)

# Add percentage labels on top
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, height + 1,
             f'{height:.1f}%',
             ha='center', va='bottom', fontsize=14, fontweight='bold')

# Tidy up the axes
plt.xticks([0, 1], ['Regular Day', 'Holiday'], fontsize=12)
plt.ylabel('High Bike Demand (%)', fontsize=13)
plt.title('Do People Ride More Bikes on Holidays?', fontsize=16, pad=20)
plt.ylim(0, 100)
plt.box(False)  # hide the axes frame (all four borders)

# Save
plt.tight_layout()
plt.savefig("holiday_bike_demand.png", dpi=300, transparent=False)
plt.show()

ModuleNotFoundError: No module named 'matplotlib'
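
The error above indicates that matplotlib is not installed in the environment the notebook kernel runs in, so the cell stops at the import and no plot is produced. A small check like the following can confirm this before running the plotting code. `module_available` is a hypothetical helper name; `importlib.util.find_spec` is part of the standard library.

```python
import importlib.util

def module_available(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None

if not module_available("matplotlib"):
    # In a notebook, `%pip install matplotlib` installs into the kernel's
    # environment; on the command line, `pip install matplotlib` is typical.
    print("matplotlib is missing from this environment; install it and rerun the cell.")
```

After installing the package, rerunning the cell should get past the import.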