Bike sharing is now a common means of transportation. This project investigates which model predicts bike rentals most accurately. To that end, it uses a dataset that compiles information about bike rentals in Washington, D.C. The dataset can be downloaded from here.
Some of the relevant columns are:
- instant: An ID for each row
- dteday: The date of the rentals
- season: The season. It can be:
  - 1: spring
  - 2: summer
  - 3: fall
  - 4: winter
- year: Year the rental occurred. It can be:
  - 1: 2011
  - 2: 2012
- mnth: Month of the rental
- hr: Hour of the rental
- holiday: Whether it was a holiday or not
- weekday: The day of the week (1 to 7)
- workingday: Whether it was a working day or not
- weathersit: The weather. It can be:
  - 1: Clear
  - 2: Mist
  - 3: Snow
  - 4: Rain
- temp: Normalized temperature
- atem: Adjusted temperature
- hum: Normalized humidity
- windspeed: Normalized wind speed
- casual: Number of people renting a bike without being signed up
- registered: Number of people renting a bike and being registered
- cnt: Total number of bike rentals (casual + registered)
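Because cnt is defined as casual + registered, those two columns reveal the target and would normally be kept out of the model's inputs. The sketch below checks the relationship and builds a candidate feature list. The helper names `check_total` and `feature_columns`, the choice to also exclude instant and dteday, and the three-row sample frame are all assumptions made for illustration, not part of the original project.

```python
import pandas as pd


def check_total(rentals):
    """Return True if cnt equals casual + registered on every row."""
    return bool((rentals["casual"] + rentals["registered"] == rentals["cnt"]).all())


def feature_columns(rentals):
    """List candidate input columns, leaving out the target and columns that leak it."""
    # Assumption: the row ID and raw date are also excluded, since they carry
    # no direct signal for a per-hour model.
    excluded = {"cnt", "casual", "registered", "instant", "dteday"}
    return [column for column in rentals.columns if column not in excluded]


# Hand-made rows for illustration only; these are not taken from the real dataset.
sample = pd.DataFrame({
    "instant": [1, 2, 3],
    "hr": [0, 1, 2],
    "temp": [0.24, 0.22, 0.22],
    "casual": [3, 8, 5],
    "registered": [13, 32, 27],
    "cnt": [16, 40, 32],
})

print(check_total(sample))
print(feature_columns(sample))
```

On the real data, the same two calls could be run right after loading the file with `pd.read_csv`.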
Objective: Identify the machine learning model that best predicts the total number of bike rentals (cnt) for a given time of day.
Techniques used:
- Pandas, matplotlib
- Linear regression
- Decision tree regressor
- Random forest regressor
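
The three models listed above can be compared on a common held-out split. What follows is a minimal sketch, not the project's actual code: the function name `compare_models`, the use of RMSE as the error metric, the 80/20 split, the hyperparameters, and the synthetic stand-in data are all assumptions made so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def compare_models(rentals, features, target="cnt", seed=0):
    """Fit three regressors and return each one's RMSE on a held-out 20% split."""
    X_train, X_test, y_train, y_test = train_test_split(
        rentals[features], rentals[target], test_size=0.2, random_state=seed
    )
    # Hyperparameters here are illustrative assumptions, not tuned values.
    models = {
        "linear_regression": LinearRegression(),
        "decision_tree": DecisionTreeRegressor(min_samples_leaf=5, random_state=seed),
        "random_forest": RandomForestRegressor(
            n_estimators=50, min_samples_leaf=5, random_state=seed
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        scores[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))
    return scores


# Synthetic stand-in data (NOT the real dataset): rentals vary with hour and temperature.
rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=500)
temps = rng.random(500)
demo = pd.DataFrame({
    "hr": hours,
    "temp": temps,
    "cnt": (120 + 80 * np.sin(hours / 24 * 2 * np.pi) + 50 * temps
            + rng.normal(0, 10, size=500)).round(),
})

scores = compare_models(demo, ["hr", "temp"])
for name, rmse in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {rmse:.1f}")
```

With the real data, `demo` would be replaced by the loaded DataFrame and the feature list by the chosen input columns; the ranking on synthetic data says nothing about the ranking on actual rentals.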