# Edinburgh bike counter

Arjan Geers

Analysis of [Edinburgh bike counter data](https://data.edinburghopendata.info/dataset/bike-counter-data-set-cluster).

## Preamble

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd

from bikecounter.data import get_edinburgh_bike_counter_data

In [None]:
%matplotlib inline
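# The 'seaborn' style was renamed to 'seaborn-v0_8' in Matplotlib 3.6.
style = 'seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn'
plt.style.use(style)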

## Get data

Download and read the data for all bike counters. Each counter has one or more 'channels', which correspond to directions of travel such as northbound and southbound. For now, we only consider the total count across all channels.

In [None]:
df = get_edinburgh_bike_counter_data()
df.tail()
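
To illustrate what "total count across all channels" means, here is a minimal sketch on made-up data. The two-level column layout (counter, channel), the channel names, and the variables `raw` and `totals` are assumptions for illustration only; the layout handled inside `get_edinburgh_bike_counter_data` may well differ.

```python
import pandas as pd

# Hypothetical per-channel counts: two counters with two channels each.
index = pd.date_range('2015-06-01', periods=3, freq='D')
columns = pd.MultiIndex.from_tuples(
    [('01', 'northbound'), ('01', 'southbound'),
     ('02', 'eastbound'), ('02', 'westbound')],
    names=['counter', 'channel'])
raw = pd.DataFrame(
    [[5, 7, 0, 2],
     [3, 4, 1, 1],
     [6, 2, 0, 0]],
    index=index, columns=columns)

# Sum over the channel level so that each counter ends up with
# a single column holding its total count.
totals = raw.T.groupby(level='counter').sum().T
print(totals)
```

Grouping on the transposed frame avoids the column-wise `groupby`, which newer pandas versions deprecate.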

## Bike counter activity

Check on which days each bike counter recorded at least one bike.

In [None]:
def get_datetime_ranges(datetimes, bin_size='D'):
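    """Given a sorted sequence of datetimes, return a list of
    datetime ranges [(start, end), ...] that correspond to
    periods without gaps at the bin_size resolution.

    """
    datetime_ranges = []
    # Nothing to group, e.g. a counter that never recorded a bike.
    if len(datetimes) == 0:
        return datetime_ranges
    start = datetimes[0]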
    for i in range(1, len(datetimes)):
        if (datetimes[i].to_period(bin_size) -
            datetimes[i - 1].to_period(bin_size)).n > 1:
            end = datetimes[i - 1]
            datetime_ranges.append((start, end))
            start = datetimes[i]
    end = datetimes[-1]
    datetime_ranges.append((start, end))
    return datetime_ranges
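
The same grouping can probably also be done without an explicit loop. The sketch below is an alternative, not the function used in this notebook: it is fixed to daily resolution, and the name `get_daily_ranges` and the example dates are made up for illustration.

```python
import pandas as pd

def get_daily_ranges(datetimes):
    """Group sorted datetimes into runs of consecutive days."""
    days = pd.Series(pd.DatetimeIndex(datetimes).normalize())
    if days.empty:
        return []
    # A new run starts wherever the step from the previous
    # entry is larger than one day.
    run_id = (days.diff() > pd.Timedelta(days=1)).cumsum()
    grouped = days.groupby(run_id)
    return list(zip(grouped.min(), grouped.max()))

# Made-up active days with gaps after 3 January and after 7 January.
active = pd.to_datetime(['2015-01-01', '2015-01-02', '2015-01-03',
                         '2015-01-06', '2015-01-07', '2015-01-10'])
ranges = get_daily_ranges(active)
for start, end in ranges:
    print(start.date(), end.date())
```

A single isolated day shows up as a range whose start and end coincide.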

In [None]:
dfr = df.resample('D').sum()

fig, ax = plt.subplots(figsize=(8, 12))
for y, bike_counter in enumerate(dfr.columns):
    active = dfr[bike_counter].where(dfr[bike_counter] > 0).dropna().index
    active_ranges = get_datetime_ranges(active, bin_size='D')
    for active_range in active_ranges:
        ax.hlines(y, active_range[0], active_range[1])


ax.set_title('Days with bike counter activity')
plt.yticks(range(len(dfr.columns)), dfr.columns);

We notice two distinct sets of bike counters. '01' to '19' were active from 2008 to 2015 and '20' to '48' from 2015 to 2016. Only '27' and '28' were active throughout both periods.

The time series contain many gaps, both large and small. In the second set of counters, gaps often occur at the same time.
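
One way to put a number on gaps that coincide is to count, per day, how many counters recorded nothing. This is a minimal sketch on invented values; the frame `daily` stands in for a daily-resampled frame such as `dfr`, and the counter labels are placeholders.

```python
import pandas as pd

# Made-up daily totals for three counters.
index = pd.date_range('2015-06-01', periods=5, freq='D')
daily = pd.DataFrame(
    {'20': [4, 0, 0, 3, 5],
     '21': [2, 0, 0, 1, 0],
     '22': [7, 0, 6, 2, 1]},
    index=index)

# Number of counters without any count on each day.
inactive_per_day = (daily == 0).sum(axis=1)

# Days on which every counter was inactive at once.
all_inactive = inactive_per_day == daily.shape[1]
shared_gap_days = inactive_per_day[all_inactive].index

print(inactive_per_day)
print(shared_gap_days)
```

Days on which all counters are inactive at once may point to a shared cause, such as an outage in data collection, rather than an absence of cyclists; the data alone cannot confirm this.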