
# 🚲 Smart Bike Rental Demand Predictor

This project builds a predictive model for bike rental demand from historical usage patterns and environmental data.
We use the UCI Bike Sharing Dataset, which contains bike rental counts aggregated by day and by hour for a system in Washington, D.C.



## 📁 Dataset Description

### Bike Sharing Dataset

Hadi Fanaee-T

Laboratory of Artificial Intelligence and Decision Support (LIAAD), University of Porto
INESC Porto, Campus da FEUP
Rua Dr. Roberto Frias, 378
4200 - 465 Porto, Portugal


### Background

Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental 
and return, has become automatic. Through these systems, a user can easily rent a bike at one location and return it at 
another. Currently, there are over 500 bike-sharing programs around the world, comprising over 500 thousand bicycles. 
Today, there is great interest in these systems due to their important role in traffic, environmental, and health issues. 

Apart from interesting real world applications of bike sharing systems, the char...

We load both `day.csv` and `hour.csv`; the modeling below uses `day.csv`. The files include columns such as:
- Season, holiday, working day
- Weather situation
- Temperature, humidity, wind speed
- Casual and registered user counts, which together make up the total count `cnt`
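
The total count `cnt` is the sum of the `casual` and `registered` columns, so those two columns must not be used as predictors of `cnt`. The sketch below shows one way to check that relationship. The three-row frame is made-up illustrative data, not values from `day.csv`, and the names `sample` and `is_consistent` are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only; real values would come from day.csv.
sample = pd.DataFrame({
    "casual": [10, 20, 30],
    "registered": [90, 180, 270],
    "cnt": [100, 200, 300],
})

# cnt should equal casual + registered on every row.
is_consistent = (sample["casual"] + sample["registered"] == sample["cnt"]).all()
print(is_consistent)
```

If the same check holds on the real data, keeping either column as a feature would leak the target into the model.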


In [None]:

import pandas as pd

# Load datasets
day_df = pd.read_csv("day.csv")
hour_df = pd.read_csv("hour.csv")

# Display first few rows
day_df.head()


In [None]:

import matplotlib.pyplot as plt
import seaborn as sns

# Plot rental counts by season
sns.boxplot(data=day_df, x='season', y='cnt')
plt.title('Rental Demand by Season')
plt.show()
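
A boxplot is easier to read alongside the numbers behind it. As a rough sketch, a per-season summary can be computed with `groupby`. The frame below is made-up toy data and the names `toy_counts` and `season_summary` are hypothetical; with the real data the same call would be applied to `day_df`.

```python
import pandas as pd

# Made-up toy data; season codes and counts are illustrative only.
toy_counts = pd.DataFrame({
    "season": [1, 1, 2, 2, 3, 3],
    "cnt": [100, 300, 400, 600, 500, 900],
})

# Median, minimum, and maximum rental count for each season code.
season_summary = toy_counts.groupby("season")["cnt"].agg(["median", "min", "max"])
print(season_summary)
```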


In [None]:

# Convert date column to datetime
day_df['dteday'] = pd.to_datetime(day_df['dteday'])
day_df['year'] = day_df['dteday'].dt.year
day_df['month'] = day_df['dteday'].dt.month

# Encode categorical variables
categorical_features = ['season', 'weathersit', 'holiday', 'workingday']
day_df = pd.get_dummies(day_df, columns=categorical_features, drop_first=True)

day_df.head()
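
To make the effect of `drop_first=True` concrete, here is a minimal sketch on a made-up single-column frame (the name `toy_season` is hypothetical). The integer codes mirror how a categorical column such as `season` is stored.

```python
import pandas as pd

# Made-up toy column; the codes 1-4 stand in for a categorical feature.
toy_season = pd.DataFrame({"season": [1, 2, 3, 4, 1]})

encoded = pd.get_dummies(toy_season, columns=["season"], drop_first=True)

# The first level (season_1) is dropped; it is implied when all other
# indicator columns are 0.
print(list(encoded.columns))
```

For a two-level column such as `holiday`, this leaves a single indicator column, which carries the same information as the original 0/1 values.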


In [None]:

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Features and target
X = day_df.drop(['instant', 'dteday', 'casual', 'registered', 'cnt'], axis=1)
y = day_df['cnt']

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
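# Take the square root of the MSE; the `squared=False` argument was deprecated and then removed in newer scikit-learn releases
rmse = mean_squared_error(y_test, y_pred) ** 0.5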
r2 = r2_score(y_test, y_pred)

print(f"RMSE: {rmse:.2f}")
print(f"R^2 Score: {r2:.2f}")
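
For reference, both reported metrics can be computed directly from their definitions. The sketch below uses only the standard library and made-up numbers, not model output; the names `y_true`, `y_hat`, `rmse_manual`, and `r2_manual` are hypothetical and chosen so they do not overwrite the variables above.

```python
import math

# Made-up targets and predictions, for illustration only.
y_true = [100.0, 200.0, 300.0, 400.0]
y_hat = [110.0, 190.0, 310.0, 390.0]

squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_hat)]

# RMSE: square root of the mean squared error, in the units of the target.
rmse_manual = math.sqrt(sum(squared_errors) / len(squared_errors))

# R^2: 1 minus the ratio of the residual sum of squares to the total sum of squares.
mean_true = sum(y_true) / len(y_true)
ss_res = sum(squared_errors)
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2_manual = 1.0 - ss_res / ss_tot

print(f"RMSE: {rmse_manual:.2f}")
print(f"R^2 Score: {r2_manual:.3f}")
```

RMSE is expressed in rentals per day, so it can be compared directly with typical daily counts, while R^2 is unitless.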



## ✅ Conclusion

- Rental demand varies by season in the boxplot above; weather enters the model as a feature but is not examined separately here.
- A Random Forest model with 100 trees provides a prediction baseline, evaluated by RMSE and R^2 on a 20% held-out test split.
- A model of this kind can help optimize resource allocation for bike sharing systems.

Further improvements could include hyperparameter tuning, time series modeling, and use of the hourly data in `hour.csv`.
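
As a starting point for tuning, the sketch below combines a small grid search with time-ordered cross-validation. It is an illustration under assumptions, not a tuned setup: the data is synthetic, the grid values are arbitrary, the names `X_demo`, `y_demo`, and `search` are hypothetical, and the approach assumes rows are sorted by date.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic, time-ordered stand-in data (not the bike sharing dataset).
rng = np.random.default_rng(0)
n_days = 120
X_demo = rng.normal(size=(n_days, 3))
y_demo = 50 + 10 * X_demo[:, 0] + rng.normal(scale=2.0, size=n_days)

# Each fold trains on earlier rows and validates on later ones.
cv = TimeSeriesSplit(n_splits=3)

# Hypothetical small grid; sensible ranges depend on the real data.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=cv,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(f"CV RMSE: {-search.best_score_:.2f}")
```

Unlike a shuffled split, `TimeSeriesSplit` never validates on days that precede the training window, which is closer to how a demand forecast would be used in practice.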
