Exploratory data analysis of the London Public Bikes dataset for 2016 with Tableau

Arpita-deb/London_Bike_Share_Tableau_EDA

Santander Bikes Seasonal Data Analysis for 2016

Introduction:

Santander Cycles (also known as Boris Bikes) is a public bicycle hire scheme in London, United Kingdom. I've used Tableau to carry out an Exploratory Data Analysis (EDA), with visualizations of bike usage in 2016, to find out whether weather, season, holidays and other such variables have any effect on the number of bike users. I've created a dashboard in Tableau Public to bring all the insights together.

Dataset Used:

Historical data for bike sharing in London 'Powered by TFL Open Data' from Kaggle. I've filtered the data to the year 2016 only, which leaves 8,699 hourly rows and 9 columns that together record 10,129,546 bike shares.
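The 2016 filter described above could be reproduced outside Tableau roughly as follows. This is a minimal Python sketch, not the method used for this project: the column names `timestamp` and `cnt`, the timestamp format, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in a CSV layout; the column names "timestamp" and
# "cnt" and the timestamp format are assumptions, not taken from this README.
SAMPLE_CSV = """timestamp,cnt
2015-12-31 23:00:00,120
2016-01-01 00:00:00,200
2016-07-15 08:00:00,4500
2017-01-01 00:00:00,90
"""


def filter_year(rows, year):
    """Keep only the rows whose timestamp falls in the given year."""
    kept = []
    for row in rows:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        if ts.year == year:
            kept.append(row)
    return kept


rows_2016 = filter_year(csv.DictReader(io.StringIO(SAMPLE_CSV)), 2016)
total_shares = sum(int(row["cnt"]) for row in rows_2016)
print(len(rows_2016), total_shares)
```

On the real file, the same two numbers would be the row count and the total number of bike shares quoted above.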

Data Dictionary:

| Column Name | Column Description |
| --- | --- |
| Timestamp | Timestamp field for grouping the data |
| Count Of Bikes | Count of new bike shares |
| Temp | Real temperature in degrees Celsius |
| Humidity | Humidity in percent |
| Wind Speed | Wind speed in km/h |
| Weather Code | Category of the weather: 1 = Clear; 2 = Scattered clouds / few clouds; 3 = Broken clouds; 4 = Cloudy; 7 = Rain / light rain shower / light rain; 10 = Rain with thunderstorm; 26 = Snowfall |
| Is holiday | Boolean field: 1 = holiday; 0 = non-holiday |
| Is weekend | Boolean field: 1 = weekend; 0 = weekday |
| Season | Category field: meteorological seasons (spring, summer, fall and winter) |
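The coded fields in the data dictionary can be turned into readable labels with a simple lookup. The sketch below uses the code-to-label mapping listed above; the function name `decode_row` and the assumption that values may arrive as strings such as "7" or "7.0" are illustrative only.

```python
# Weather code labels as listed in the data dictionary.
WEATHER_LABELS = {
    1: "Clear",
    2: "Scattered clouds / few clouds",
    3: "Broken clouds",
    4: "Cloudy",
    7: "Rain / light rain shower / light rain",
    10: "Rain with thunderstorm",
    26: "Snowfall",
}


def decode_row(weather_code, is_holiday, is_weekend):
    """Translate coded fields into labels and booleans.

    Hypothetical helper: accepts numbers or numeric strings ("7", "7.0").
    """
    return {
        "weather": WEATHER_LABELS.get(int(float(weather_code)), "Unknown"),
        "holiday": int(float(is_holiday)) == 1,
        "weekend": int(float(is_weekend)) == 1,
    }


print(decode_row("7.0", "0", "1"))
```

Codes that are not in the dictionary fall back to "Unknown" rather than raising an error, which keeps a single odd row from stopping the analysis.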

Potential Stakeholders:

  • Bike-sharing companies: These are the entities that own, operate, and maintain the bike-sharing system, including its bikes, stations, docks, and apps.

  • The public: These are the potential and actual users of the bike-sharing system, who benefit from the convenience, affordability, and health benefits of bike-sharing.

  • The media: These are the platforms that communicate and promote the bike-sharing system to the public, such as social media, news outlets, blogs, etc.

  • The government: These are the authorities that regulate and support the bike-sharing system, for example through policies, subsidies, and infrastructure.

Overview of the Project:

The analysis is broken down into 2 parts:

Part 1: General Statistics for the London Bike Share dataset. In this part, I'll be looking for answers to these questions:

  1. Which month showed the highest number of bike riders?
  2. Which weekdays were busiest?
  3. Is there any difference in the number of bike riders between weekdays and weekends?
  4. How many bike riders used these public bikes on holidays and non-holidays?
  5. Which hour sees the highest number of bike users during weekdays as well as weekends?
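Each of the questions above comes down to summing the hourly counts over a different grouping of the timestamp (month, weekday, hour). The analysis itself was done in Tableau; the following is only a Python sketch of the same idea, with hypothetical records and a hypothetical helper `total_by`.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (timestamp, count) records, for illustration only.
RECORDS = [
    ("2016-07-04 08:00:00", 5000),  # a Monday
    ("2016-07-04 17:00:00", 4800),
    ("2016-07-09 14:00:00", 2500),  # a Saturday
    ("2016-01-05 08:00:00", 3000),  # a Tuesday
]


def total_by(records, key):
    """Sum counts per group, where key maps a datetime to a group label."""
    totals = defaultdict(int)
    for stamp, count in records:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        totals[key(ts)] += count
    return dict(totals)


by_month = total_by(RECORDS, lambda ts: ts.month)        # 1 = January
by_weekday = total_by(RECORDS, lambda ts: ts.weekday())  # 0 = Monday
by_hour = total_by(RECORDS, lambda ts: ts.hour)

busiest_month = max(by_month, key=by_month.get)
busiest_hour = max(by_hour, key=by_hour.get)
print(busiest_month, busiest_hour)
```

Passing the grouping as a function keeps one helper for all five questions; only the key changes.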

Part 2: Here I'll investigate the effects of season, weather, temperature, humidity and wind speed on bike usage.

Riding a bike is an outdoor activity, so it is naturally affected by both weather and season. Since a summer morning will draw more people out than a rainy or snowy day, it is important to take the effect of different weather conditions into account. In this part, I'll be looking for answers to these questions:

  1. Which season is the most popular for bike riding?
  2. How do temperature, humidity and wind speed affect the number of bike users?
  3. Which weather conditions were most suitable for bike riding?

Analysis:

Part 1:

Figure: Number of bikes used in each month

Figure: Number of bikes by weekday

Figure: Weekend/weekday

Figure: Holiday/non-holiday donut

Figure: List of holidays

Figure: Hour of the day

Figure: Hours for weekends/weekdays

Figure: Weekends by seasons

From the above visualizations, the following insights can be inferred:

  • July showed the highest number of bike users in 2016.
  • Most of the bikes were used on Tuesday, Wednesday and Thursday.
  • There is a significant difference between weekday and weekend ridership: the number of riders on weekdays was 3 times the number on weekends, which is consistent with people using these public bikes mostly to commute to work.
  • People use these public bikes not only for going to and from work, but also for leisure on weekends, particularly in summer when the weather is mild.
  • 99% of all bike shares, i.e. more than 9 million, took place on non-holidays, because the number of holidays is very small compared to the number of non-holidays.
  • Among the holidays, people used the bikes most on the summer bank holiday in August.
  • The busiest hours on weekdays, in every month, are 8 AM and 5 PM; on weekends the busiest period is between 1 PM and 5 PM.
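When reading the weekday/weekend comparison above, it can help to look at both totals and per-day averages, since a week has five weekdays but only two weekend days. The sketch below computes both from hypothetical records; the record layout and the helper `summarize` are assumptions, and the numbers are invented purely to show how the two ratios can differ.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (timestamp, count, is_weekend) records.
RECORDS = [
    ("2016-07-04 08:00:00", 600, 0),  # Monday
    ("2016-07-05 08:00:00", 650, 0),  # Tuesday
    ("2016-07-06 17:00:00", 550, 0),  # Wednesday
    ("2016-07-09 14:00:00", 400, 1),  # Saturday
    ("2016-07-10 14:00:00", 200, 1),  # Sunday
]


def summarize(records):
    """Return total rides, distinct days and rides per day for each day type."""
    totals = defaultdict(int)
    days = defaultdict(set)
    for stamp, count, is_weekend in records:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        kind = "weekend" if is_weekend else "weekday"
        totals[kind] += count
        days[kind].add(ts.date())
    return {
        kind: {
            "total": totals[kind],
            "days": len(days[kind]),
            "per_day": totals[kind] / len(days[kind]),
        }
        for kind in totals
    }


summary = summarize(RECORDS)
total_ratio = summary["weekday"]["total"] / summary["weekend"]["total"]
per_day_ratio = summary["weekday"]["per_day"] / summary["weekend"]["per_day"]
print(total_ratio, per_day_ratio)
```

In this invented sample the ratio of totals is larger than the ratio of per-day averages, simply because there are more weekdays than weekend days.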

Part 2:

Figure: Seasons donut

Figure: Scatterplot (clear). The scatterplot shows the number of riders on clear days.

Figure: Scatterplot (scattered clouds). This scatterplot shows the number of riders during days with scattered clouds.

Figure: Scatterplot (cloudy). This scatterplot shows the number of riders during cloudy days.

Figure: Scatterplot (thunderstorm). This scatterplot shows the number of riders during thunderstorms.

Figure: Scatterplot (snowing). This scatterplot shows the number of riders when it was snowing in and around London.

Figure: Humidity histogram

Figure: Temperature graph

Figure: Temperature histogram

Figure: Cumulative effects of weather

From the analysis of various weather conditions, the following insights can be drawn:

  • Summer has the highest number of bike users.
  • Mild, temperate weather draws more people out, while extreme heat, cold or rain decreases the number of users.
  • Humidity has a negative correlation with the number of users, as can be seen from the scatterplots: the number of riders decreases as the humidity of the air increases. Autumn (September - November) is usually London's rainiest season, and winter (December - February) is characterised by cold and often rainy weather. During these months humidity was 80-90%, which indicates a higher chance of precipitation.
  • Most people rode when the temperature was between 10 and 12 degrees Celsius. Up to a certain point, the number of bike users has a positive correlation with temperature, but high temperatures (during the summer season) are also unfavourable for riding.
  • Wind speed has no visible correlation with the number of bike riders, though heavy rainfall and snowstorms are often accompanied by high wind speeds.
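The temperature observation above is the kind of result a histogram gives: rides are summed per temperature bin and the fullest bin is read off. The following Python sketch shows that binning step under stated assumptions: the 2-degree bin width, the helper `rides_per_bin`, and the sample readings are all hypothetical, not taken from the Tableau workbook.

```python
from collections import defaultdict

# Hypothetical (temperature in degrees Celsius, ride count) readings.
READINGS = [
    (3.0, 400),
    (10.5, 1500),
    (11.0, 1700),
    (11.9, 1600),
    (18.0, 2000),
    (27.5, 900),
]


def rides_per_bin(readings, width=2):
    """Sum ride counts per temperature bin, keyed by the bin's lower edge."""
    totals = defaultdict(int)
    for temp, count in readings:
        lower = int(temp // width) * width  # floor to the bin's lower edge
        totals[lower] += count
    return dict(totals)


bins = rides_per_bin(READINGS)
busiest = max(bins, key=bins.get)
print(f"Busiest bin: {busiest}-{busiest + 2} degrees Celsius")
```

Note that a bin total reflects both how popular a temperature is for riding and how many hours of the year fall into that bin, so dividing by the number of hours per bin would be a reasonable follow-up.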

Conclusion:

From the above analysis, the following conclusions can be drawn:

  1. Weather and season have a clear, if not strong, influence on the number of bike riders in London. Temperate weather conditions, as seen in summer and fall, draw more riders out than a rainy or snowy day.

  2. The number of riders seems to increase as temperature increases, annually as well as daily. July and August had the highest number of riders, and these were also the warmest months in 2016.

  3. The number of riders gradually decreases as the humidity of the air increases. This is consistent with more people heading out on clear days than on cloudy or rainy days.

  4. Most people tend to use bikes during the daytime rush hours, i.e. at 8 AM and 5 PM on weekdays, as they commute to and from work. On weekends most people use bikes in the afternoon (1 PM to 5 PM).
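The relationships with temperature and humidity in conclusions 2 and 3 were read off scatterplots. One way to put a number on such a relationship is Pearson's correlation coefficient. The sketch below implements it in plain Python; the humidity and count values are invented for illustration and are not results from this dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical humidity (%) and ride counts, invented for illustration.
HUMIDITY = [50, 60, 70, 80, 90]
COUNTS = [2000, 1800, 1500, 1100, 700]

r = pearson_r(HUMIDITY, COUNTS)
print(round(r, 3))  # strongly negative for this made-up sample
```

A value near -1 or +1 indicates a strong linear relationship and a value near 0 indicates none. Pearson's r only captures linear trends, so a relationship that rises and then falls, as described for temperature, would be understated by it.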

Some Recommendations:

  • Bike-sharing companies: They should improve their user-interface design, to make it more user-friendly, intuitive, and informative. They should also monitor the feedback and ratings of their products, and adjust their pricing and quality accordingly, to maintain customer satisfaction and loyalty.

  • The public: They should use the bike-sharing system responsibly and civilly, by following the relevant rules and norms, such as parking in designated areas, returning the bikes on time, reporting any damages, etc. They should also spread the word and encourage others to use the bike-sharing system, by sharing their positive experiences and benefits on social media, blogs, etc.

  • The media: They should raise the awareness and interest of the public towards the bike-sharing system, by highlighting its advantages and impacts on sustainability, health, and mobility. They should also report on the challenges and issues of the bike-sharing system, and provide constructive suggestions and solutions.

  • The government: They should provide financial and policy support to the bike-sharing system, such as offering grants, subsidies, tax incentives, etc. They should also provide adequate and safe infrastructure for the bike-sharing system, such as bike lanes, parking zones, signage, etc.

Dashboard:

Figure: Santander cycles

For this project I've created an exploratory dashboard providing an overview of the analysis, highlighting important insights.

To build the Tableau visualizations and the dashboard, I took the following steps:

  1. Used floating containers to contain the individual data visualizations.

  2. Used titles for each of the visualizations and the dashboard.

  3. Used filters and annotations to highlight important data points.

  4. Used a consistent color scheme that enhances the data.

  5. Removed chart junk, such as titles of individual visualizations, axes and axis labels, and color legends, that doesn't contribute to the data design.

  6. Visually grouped the data by using layout containers and colored backgrounds.

  7. Added filter and highlight actions in the dashboard to provide user interactivity.

  8. Added an info button that provides additional information on the dataset and how to read the dashboard.

  9. Created 3 device-specific layouts.

Check out the dashboard here: London Bike Share - What do you need to know before heading out?
