Santander Cycles (also known as Boris Bikes) is a public bicycle hire scheme in London, United Kingdom. I've used Tableau to carry out an Exploratory Data Analysis (EDA), with visualizations of bike usage in 2016, to find out whether weather, season, holidays and other such variables have any effect on the number of bike users. I've created a dashboard in Tableau Public to bring all the insights together.
The data is the historical London bike sharing dataset 'Powered by TFL Open Data', available on Kaggle. I've filtered it to the year 2016, which leaves 8,699 hourly rows and 9 columns, covering a total of 10,129,546 bike shares.
Column Name | Column Description |
---|---|
Timestamp | timestamp field for grouping the data |
Count Of Bikes | the count of new bike shares |
Temp | real temperature in degrees Celsius |
Humidity | humidity in percentage |
Wind Speed | wind speed in km/h |
Weather Code | category of the weather: 1 = clear; 2 = scattered clouds / few clouds; 3 = broken clouds; 4 = cloudy; 7 = rain / light rain shower / light rain; 10 = rain with thunderstorm; 26 = snowfall |
Is holiday | Boolean field: 1 = holiday; 0 = non-holiday |
Is weekend | Boolean field: 1 = weekend; 0 = weekday |
Season | category field for meteorological seasons (spring, summer, fall and winter) |
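
The analysis itself was done in Tableau, but the 2016 filter and the weather code lookup described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary keys (`timestamp`, `count`, `weather_code`) and the helper names are hypothetical, and the sample rows are made up rather than taken from the dataset.

```python
from datetime import datetime

# Weather codes as listed in the column description table.
WEATHER_LABELS = {
    1: "Clear",
    2: "Scattered clouds / few clouds",
    3: "Broken clouds",
    4: "Cloudy",
    7: "Rain / light rain",
    10: "Rain with thunderstorm",
    26: "Snowfall",
}


def filter_year(rows, year):
    """Keep only the rows whose timestamp falls in the given year."""
    return [r for r in rows if datetime.fromisoformat(r["timestamp"]).year == year]


def weather_label(code):
    """Translate a numeric weather code into its label; unknown codes are flagged."""
    return WEATHER_LABELS.get(int(float(code)), "Unknown")


# Made-up sample rows for illustration only, not values from the dataset.
sample = [
    {"timestamp": "2015-12-31 23:00:00", "count": 100, "weather_code": 3},
    {"timestamp": "2016-01-01 00:00:00", "count": 120, "weather_code": 1},
    {"timestamp": "2016-07-15 08:00:00", "count": 900, "weather_code": 7},
]

rows_2016 = filter_year(sample, 2016)
total_shares = sum(r["count"] for r in rows_2016)
```

Applied to the full file, `total_shares` would be the figure to compare against the total number of bike shares quoted above.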
- Bike-sharing companies: These are the entities that own, operate, and maintain the bike-sharing system, providing the bikes, stations, docks, apps, etc.
- The public: These are the potential and actual users of the bike-sharing system, who benefit from its convenience, affordability, and health benefits.
- The media: These are the platforms that communicate and promote the bike-sharing system to the public, such as social media, news outlets, blogs, etc.
- The government: These are the authorities that regulate and support the bike-sharing system through policies, subsidies, infrastructure, etc.
The analysis is broken down into 2 parts:
Part 1: General Statistics for the London Bike Share dataset. In this part, I'll be looking for answers to these questions:
- Which month showed the highest number of bike riders?
- Which weekdays were busiest?
- Is there a difference in the number of bike riders between weekdays and weekends?
- How many bike riders used these public bikes on holidays and non-holidays?
- Which hour sees the highest number of bike users on weekdays and on weekends?
Part 2: Here I'll investigate the effects of season, weather, temperature, humidity and wind speed on bike usage.
Riding a bike is an outdoor activity, so it is naturally affected by both weather and season. Since a summer morning will draw more people out than a rainy or snowy day, it is important to take the effect of different weather conditions into account. In this part, I'll be looking for answers to these questions:
- Which season is the most popular for bike riding?
- How do temperature, humidity and wind speed affect the number of bike users?
- Which weather conditions were most suitable for bike riding?
Part 1:
From the above visualizations, the following insights can be inferred:
- July showed the highest number of bike users in 2016.
- Tuesday, Wednesday and Thursday were the busiest days.
- There is a significant difference between weekday and weekend ridership. People mostly used these public bikes to commute to work, which is consistent with the number of riders on weekdays being 3 times the number on weekends.
- People use these public bikes not only for going to and from work but also for leisure on weekends, particularly in summer, when the weather is mild.
- 99% of the total users, i.e. more than 9 million, rode on non-holidays, because the number of holidays is very small compared to the number of non-holidays.
- Among the holidays, people used the bikes most on the summer bank holiday in August.
- The busiest hours on weekdays in every month are 8 AM and 5 PM; on weekends the busiest period is between 1 PM and 5 PM.
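The monthly totals and busiest hours above come from Tableau aggregations. As a hedged sketch, the same grouping could be reproduced in plain Python as follows. The field names (`timestamp`, `count`, `is_weekend`) and the helper functions are assumptions for illustration, and the sample rows are made up.

```python
from collections import defaultdict
from datetime import datetime


def totals_by(rows, key_fn):
    """Sum the bike-share count for each group produced by key_fn."""
    totals = defaultdict(int)
    for r in rows:
        totals[key_fn(r)] += r["count"]
    return dict(totals)


def hour_of(row):
    """Hour of day (0-23) taken from the row's timestamp."""
    return datetime.fromisoformat(row["timestamp"]).hour


def busiest_hours(rows, top_n=2):
    """Return the top_n busiest hours of day, split into weekday and weekend."""
    result = {}
    for label, flag in (("weekday", 0), ("weekend", 1)):
        subset = [r for r in rows if r["is_weekend"] == flag]
        by_hour = totals_by(subset, hour_of)
        ranked = sorted(by_hour, key=by_hour.get, reverse=True)
        result[label] = sorted(ranked[:top_n])
    return result


# Made-up sample rows for illustration only, not values from the dataset.
sample = [
    {"timestamp": "2016-07-04 08:00:00", "count": 900, "is_weekend": 0},
    {"timestamp": "2016-07-04 12:00:00", "count": 300, "is_weekend": 0},
    {"timestamp": "2016-07-04 17:00:00", "count": 950, "is_weekend": 0},
    {"timestamp": "2016-07-09 10:00:00", "count": 200, "is_weekend": 1},
    {"timestamp": "2016-07-09 14:00:00", "count": 500, "is_weekend": 1},
    {"timestamp": "2016-01-05 08:00:00", "count": 400, "is_weekend": 0},
]

by_month = totals_by(sample, lambda r: datetime.fromisoformat(r["timestamp"]).month)
peak = busiest_hours(sample)
```

Passing a different key function to `totals_by` (day of week, holiday flag, season) would cover the other Part 1 questions with the same grouping step.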
Part 2:
This scatterplot shows the number of riders on clear days.
This scatterplot shows the number of riders during days with scattered clouds.
This scatterplot shows the number of riders during cloudy days.
This scatterplot shows the number of riders during thunderstorms.
This scatterplot shows the number of riders when it was snowing in and around London.
From the analysis of various weather conditions, the following insights can be drawn:
- Summer has the highest number of bike users.
- Mild, temperate weather draws more people out, while extreme heat, cold or rain reduces the number of users.
- Humidity is negatively correlated with the number of users, as the scatterplots show: the number of riders decreases as humidity increases. Autumn (September - November) is usually London's rainiest season, and winter (December - February) is characterised by cold and often rainy weather. During these months humidity was 80-90%, which indicates a higher chance of precipitation.
- Most people rode when the temperature was between 10 and 12 degrees Celsius. Up to a certain point, the number of bike users is positively correlated with temperature, but high temperatures (during summer) are also unfavourable for riding.
- Wind speed has no visible correlation with the number of bike riders, though heavy rainfall and snowstorms are often accompanied by high wind speeds.
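The correlations above were read off the Tableau scatterplots. To put a number on them, one option is the Pearson correlation coefficient, sketched below in plain Python. The function name and the sample values are assumptions for illustration; the numbers are made up and are not results from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only: humidity (%) against hourly bike shares.
humidity = [40, 55, 70, 85, 95]
counts = [1200, 1000, 700, 400, 250]

r_humidity = pearson(humidity, counts)  # negative when counts fall as humidity rises
```

Pearson's coefficient only measures linear association, so a relationship that rises and then falls, like the one described for temperature, would be understated by a single coefficient and is better judged from the scatterplot.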
From the above analysis, the following conclusions can be drawn:
- Weather and season have a noticeable, if not strong, influence on the number of bike riders in London. Temperate conditions, as seen in summer and fall, draw more riders out than rainy or snowy days.
- The number of riders tends to increase with temperature, both across the year and within a day. July and August had the highest number of riders and were also the warmest months of 2016.
- The number of riders gradually decreases as humidity increases, which is consistent with people heading out more on clear days than on cloudy or rainy ones.
- Most people use bikes during the weekday rush hours, i.e. at 8 AM and 5 PM, as they commute to and from work. On weekends most people use bikes in the afternoon (1 PM to 5 PM).
- Bike-sharing companies: They should improve their user-interface design to make it more user-friendly, intuitive, and informative. They should also monitor the feedback and ratings of their products, and adjust their pricing and quality accordingly, to maintain customer satisfaction and loyalty.
- The public: They should use the bike-sharing system responsibly and civilly by following the relevant rules and norms, such as parking in designated areas, returning the bikes on time, reporting any damage, etc. They should also spread the word and encourage others to use the bike-sharing system by sharing their positive experiences and its benefits on social media, blogs, etc.
- The media: They should raise public awareness of and interest in the bike-sharing system by highlighting its advantages and its impact on sustainability, health, and mobility. They should also report on the challenges and issues of the bike-sharing system, and provide constructive suggestions and solutions.
- The government: They should provide financial and policy support to the bike-sharing system, such as grants, subsidies, tax incentives, etc. They should also provide adequate and safe infrastructure for the bike-sharing system, such as bike lanes, parking zones, signage, etc.
For this project I've created an exploratory dashboard providing an overview of the analysis, highlighting important insights.
To build the Tableau visualizations and the dashboard, I've taken the following steps:
- Used floating containers to hold the individual data visualizations.
- Used titles for each of the visualizations and for the dashboard.
- Used filters and annotations to highlight important data points.
- Used a consistent color scheme that enhances the data.
- Cleared chart junk, such as titles of individual visualizations, axes and axis labels, color legends, etc., that doesn't contribute to the data design.
- Visually grouped the data using layout containers and colored backgrounds.
- Added filter and highlight actions to the dashboard to provide user interactivity.
- Added an info button that provides additional information on the dataset and on how to read the dashboard.
- Created 3 device-specific layouts.
Check out the dashboard here: London Bike Share - What do you need to know before heading out?