<a href="https://colab.research.google.com/github/vivacitylabs/data-toolkit/blob/master/notebooks/turning_counts_bulk_download_generator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Turning Counts - Bulk Download Generator



## Generate a csv file of turning counts data over multiple days

This notebook is a tool to access VivaCity data via the API. It is intended as an **interim solution** while we're working on new dashboard developments. If you have any issues, you can contact customer support (support@vivacitylabs.com) or raise a ticket on the [Customer Help Portal](https://vivacitylabs.atlassian.net/servicedesk/customer/portal/16).

#### How it works

This notebook will run you through all the necessary steps and will save a csv file locally or in your Google Drive.

You will need to fill in a few details and then hit the run button (▶) next to the code cells.

If you want to make changes to the code and save them, you will first need to save a copy of this notebook to your Google Drive.



**What you will need**

- VivaCity API login credentials
- Sensors and their zones you want to download data for



ℹ  Note the notebook only works for zones that have [turning counts](https://vivacitylabs.customerly.help/vivacity-dashboard/turning-counts) enabled


#### Output format


You will receive turning counts in the selected time buckets in the following format:

- origin: zone_id
- destination: zone_id

| origin | destination | origin_longname | destination_longname | Local Datetime | car | motorbike |
|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|
| 1121 | 1530 | S4 - Zone 1 | S4 - Zone 2 | 2023-02-20 00:00:00 | 2 | 0 |



## Stage 1: Getting Started
Let's begin by importing the packages we'll need and creating some useful functions!

Hit the run button (▶) in the top left corner.

In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this to import packages and define helper functions
import requests
import getpass
import json
from datetime import date, datetime, timedelta
import pytz
import pandas as pd
import csv
import time
from IPython.display import Markdown, display
def printmd(string):
    display(Markdown(string))
from ipywidgets import interact, interactive, fixed, interact_manual, Layout, Box
import ipywidgets as widgets
import warnings
warnings.filterwarnings('ignore')

def get_date_range(start_date, end_date):
    start_dates = []
    end_dates = []

    start_date = datetime.fromisoformat(start_date)
    end_date = datetime.fromisoformat(end_date)
    while True:
        start_dates.append(start_date.strftime('%Y-%m-%dT%H:%M:%S.000Z'))
        end_dates.append((start_date+timedelta(days=1)).strftime('%Y-%m-%dT%H:%M:%S.000Z'))
        start_date = start_date+timedelta(days=1)
        if start_date > end_date:
            break
    date_range = list(zip(start_dates, end_dates))
    return date_range

import plotly.graph_objects as go
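
The date helper above splits the selected period into one-day request windows, with the end date included. A minimal standalone sketch of the same idea follows; `daily_windows` is a hypothetical name used only for illustration, and unlike `get_date_range` it returns an empty list when the start is after the end.

```python
from datetime import datetime, timedelta

def daily_windows(start_iso, end_iso):
    """Split [start, end] into consecutive 24-hour (from, to) pairs, end day included."""
    fmt = '%Y-%m-%dT%H:%M:%S.000Z'
    current = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    windows = []
    while current <= end:
        next_day = current + timedelta(days=1)
        windows.append((current.strftime(fmt), next_day.strftime(fmt)))
        current = next_day
    return windows

# Two selected days produce two windows, i.e. two API requests
windows = daily_windows("2024-11-26 00:00:00+00:00", "2024-11-27 00:00:00+00:00")
for window in windows:
    print(window)
```

Each tuple becomes the `from` and `to` parameters of one request, so the number of requests equals the number of selected days.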

## Stage 2: Data Import
At the end of this stage we will have requested data from the API for every date in the range you choose in the **Select zones and dates** step.

The resulting JSON responses will then be converted into a data table in the **Data Processing** step.

Authentication is handled at this stage.

### Authentication
Now you will need your API login details. The code cell below expects an API key, which is sent in the `x-vivacity-api-key` header of every request. If you don't have one, please contact customer support (support@vivacitylabs.com).


1.   Enter the API key into the field on the right, then hit the run button (▶).


In [None]:
#@title  { run: "auto", vertical-output: true, display-mode: "form" }
#@markdown **Code cell:** Insert your login credentials, then run
key = "api-key" #@param {type:"string"}

### Retrieve available sensors and zones

We'll now use the API key set above to retrieve all sensors and their zones that the API user has access to.

In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this to retrieve sensors and their zones available to you from the API

#get hardware meta data
print("\nRequesting metadata ...")
url = "https://api.vivacitylabs.com/hardware/metadata"

payload = {}
headers = {
  'Accept': 'application/json',
  'x-vivacity-api-key': key
}

response = requests.request("GET", url, headers=headers, data=payload)
if response.status_code == 401:
  print("\n!Error: Can't access the data. Ask customer support if your user is set up correctly on API 3\n")
response.raise_for_status()  # stop on any HTTP error instead of parsing an error body
hardware = response.json()

# Clean up hardware and zones
dict_hard = { "hardware_id" : [], "sensor_name" : [], "zone_id": [], "zone_name" : [], "turning_counts" : [] }

for id in hardware:
    for lens in hardware[id]["view_points"]:
        # derive sensor name from the first countline name
        countlines = hardware[id]["view_points"][lens]["countlines"]
        if len(countlines) == 0:
            cname = "Unknown"
        else:
            cname = list(countlines.values())[0]['name']
        for zone_id in hardware[id]["view_points"][lens]["zones"]:
            dict_hard["hardware_id"].append(id)
            dict_hard["sensor_name"].append(cname)
            dict_hard["zone_id"].append(zone_id)
            dict_hard["zone_name"].append(hardware[id]["view_points"][lens]["zones"][zone_id]['name'])
            dict_hard["turning_counts"].append(hardware[id]["view_points"][lens]["zones"][zone_id]['is_turning'])

#turn into dataframe and clean up
df_hard = pd.DataFrame.from_dict(dict_hard)
df_hard = df_hard[df_hard["turning_counts"] == True].drop_duplicates().reset_index(drop=True)
# keep the first two underscore-separated parts of the countline name as the sensor name
df_hard["sensor_name"] = df_hard["sensor_name"].str.split("_").apply(
    lambda parts: " ".join(parts[:2]) if len(parts) > 1 else parts[0])

#create dropdown labels
df_hard["zones_dropdown"] = (df_hard["sensor_name"].astype(str) + " - " + df_hard["zone_name"].astype(str)
                             + " (" +  df_hard["zone_id"].astype(str) + ")")
print("Done. Successfully retrieved metadata.")


Requesting metadata ...
Done. Successfully retrieved metadata.
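
To make the clean-up step easier to follow, here is a small sketch that flattens a nested metadata dictionary into one row per zone. The `sample` dictionary is made up for illustration; it only mirrors the keys the cell above reads (`view_points`, `countlines`, `zones`, `name`, `is_turning`) and is not a real API response.

```python
import pandas as pd

# Hypothetical sample with the same nesting the cell above reads
sample = {
    "hw1": {"view_points": {"0": {
        "countlines": {"c1": {"name": "S4_harleyRd_crossing"}},
        "zones": {
            "1121": {"name": "Zone 1", "is_turning": True},
            "1530": {"name": "Zone 2", "is_turning": False},
        },
    }}},
}

rows = []
for hardware_id, hw in sample.items():
    for view_point in hw["view_points"].values():
        # Sensor name: first two underscore-separated parts of the first countline name
        first = next(iter(view_point.get("countlines", {}).values()), None)
        sensor = " ".join(first["name"].split("_")[:2]) if first else "Unknown"
        for zone_id, zone in view_point.get("zones", {}).items():
            rows.append({
                "hardware_id": hardware_id,
                "sensor_name": sensor,
                "zone_id": zone_id,
                "zone_name": zone["name"],
                "turning_counts": zone["is_turning"],
            })

zones = pd.DataFrame(rows)
# Keep only zones with turning counts enabled
turning = zones[zones["turning_counts"]].reset_index(drop=True)
print(turning)
```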


### Select zones and dates
Choose one or more zones, the classes you want data for, the date period and the timezone. The time bucket (15min, 1h, 24h) is selected later, in the **Data Processing** stage.

**Note:** The sensor name is derived from countline names (e.g. countline name: S4_harleyRd_crossing => extracted sensor name: S4 harleyRd). Countlines are sometimes named inconsistently, which results in odd sensor names.

In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this and then make your selection below
box_layout = Layout(display='flex', flex_flow='column', align_items='stretch', border=None, width='28%')

start_date_input = widgets.DatePicker(description="Start date",layout=Layout(width='55%'))
end_date_input = widgets.DatePicker(description="End date",layout=Layout(width='55%'))
timezone = widgets.Dropdown(options=["America/New_York",
                                     "Australia/Adelaide","Australia/Brisbane", "Australia/Darwin",
                                     "Australia/Melbourne", "Australia/Perth", "Australia/Sydney",
                                     "Europe/Berlin",'Europe/London', 'NZ', 'NZ-CHAT'],
                            description="Timezone",layout=Layout(width='55%'))
zones_input = widgets.SelectMultiple(
    options=df_hard["zones_dropdown"].unique(),
    description='Zones',
    disabled=False,
    layout=Layout(width='auto', height='170px'))
class_input = widgets.SelectMultiple(
    options=[ "cyclist", "motorbike", "car", "pedestrian", "taxi", "van", "minibus", "bus", "rigid", "truck", "emergency_car", "emergency_van", "fire_engine", "escooter"],
    description='Class',  disabled=False,
    layout=Layout(width='55%', height='235px')
)

items = [start_date_input, end_date_input, timezone, class_input, zones_input]
box = Box(children=items, layout=box_layout)
printmd("**Select date period and zones**")
printmd("Hold  `Ctrl`  to select multiple classes or zones")
print("")
box

**Select date period and zones**

Hold  `Ctrl`  to select multiple classes or zones





### Getting the data


We now query Turning Counts data from the API.

The output will tell you how many requests are made and what the progress is.


In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this to set the API request parameters and check your selection again
params = {}
params['start_zone_ids'] = df_hard[df_hard["zones_dropdown"].isin(zones_input.value)]["zone_id"].to_list()
params['end_zone_ids'] = df_hard[df_hard["zones_dropdown"].isin(zones_input.value)]["zone_id"].to_list()
params['classes'] = list(class_input.value)
params["time_bucket"] = "15m"
params['fill_zeros'] = "true"

#convert local datetime to UTC datetime
current_date = datetime.now()
try:
  start_date_utc = str(pd.to_datetime(start_date_input.value).tz_localize(timezone.value).astimezone(pytz.utc))
  end_date_utc = str(pd.to_datetime(end_date_input.value).tz_localize(timezone.value).astimezone(pytz.utc))
  if start_date_input.value > end_date_input.value:
    print("Warning: Start date is after end date, please correct your date selection")
  elif datetime.combine(end_date_input.value, datetime.min.time()) > current_date:
    print("Warning: End date is in the future. Please select only until today.")
  else:
    date_range = get_date_range(start_date_utc, end_date_utc)
except AttributeError:
  print("Warning: No start and end time selected")
finally:
  if len(df_hard[df_hard["zones_dropdown"].isin(zones_input.value)]["hardware_id"].unique())!=1:
    print("Warning: You selected zones from multiple sensors or no zones at all, please check your selection")
  if len(class_input.value)==0:
    print("Warning: You selected no classes, please check your selection")
  printmd("\n**Check your selection:**\n")
  print("Dates:", start_date_input.value, "to", end_date_input.value, "\nClass:", list(class_input.value),
          "\nZones:", list(zones_input.value))


**Check your selection:**


Dates: 2024-11-26 to 2024-11-27 
Class: ['car'] 
Zones: ['Zone A (10069)', 'Zone C (10070)', 'Zone B (10071)']
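
The selected dates are interpreted as local midnight in the chosen timezone and converted to UTC before being sent to the API. A minimal sketch of that conversion with pandas, using a summer date in `Europe/London` as an arbitrary example:

```python
import pandas as pd

# Local midnight on 1 July 2023 in London (British Summer Time, UTC+1)
local_midnight = pd.Timestamp("2023-07-01").tz_localize("Europe/London")

# The same instant expressed in UTC falls on the previous calendar day
utc_instant = local_midnight.tz_convert("UTC")

print(local_midnight)  # 2023-07-01 00:00:00+01:00
print(utc_instant)     # 2023-06-30 23:00:00+00:00
```

This is why the `from` and `to` values of a request may not start at 00:00 UTC even though you selected whole days.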


In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this to request data from the API (can take a bit)

data = []
for i, (date_from, date_to) in enumerate(date_range):  # avoids shadowing the imported `date`
  params["from"] = date_from
  params["to"] = date_to
  response = requests.get('https://api.vivacitylabs.com/zone/turning_movements', params=params, headers=headers)
  print(str(i+1) + "/" + str(len(date_range)) + ": " + str(response.status_code) + " " + response.reason)
  if response.status_code == 200:
    turning_json = response.json()

    turning_dict = {"origin": [], "destination" : [], "from" : [], "to" : [], "class" : [], "value" : []}

    for origin, destinations in turning_json.items():
          for destination, items in destinations.items():
              for time_bucket in items:
                  for _class in time_bucket["turning_movements"].keys():
                      turning_dict["origin"].append(origin)
                      turning_dict["destination"].append(destination)
                      turning_dict["from"].append(time_bucket["from"])
                      turning_dict["to"].append(time_bucket["to"])
                      turning_dict["class"].append(_class)
                      turning_dict["value"].append(time_bucket["turning_movements"][_class])

    turning_df = pd.DataFrame.from_dict(turning_dict)
    data.append(turning_df)

  else:
    print("Data missing for " + params["from"].split("T")[0] + " to " + params["to"].split("T")[0])
  time.sleep(1)

data = pd.concat(data, axis=0, ignore_index=True)

1/2: 200 OK
2/2: 200 OK
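
The loop above turns each nested JSON response (origin, then destination, then a list of time buckets) into long-format rows. The sketch below shows the same flattening on a made-up response; the values in `sample_response` are illustrative and only follow the structure the loop reads.

```python
import pandas as pd

# Hypothetical response: origin zone -> destination zone -> list of time buckets
sample_response = {
    "1121": {
        "1530": [
            {
                "from": "2023-02-20T00:00:00.000Z",
                "to": "2023-02-20T00:15:00.000Z",
                "turning_movements": {"car": 2, "motorbike": 0},
            },
        ],
    },
}

# One row per origin, destination, time bucket and class
rows = [
    {
        "origin": origin,
        "destination": destination,
        "from": bucket["from"],
        "to": bucket["to"],
        "class": cls,
        "value": count,
    }
    for origin, destinations in sample_response.items()
    for destination, buckets in destinations.items()
    for bucket in buckets
    for cls, count in bucket["turning_movements"].items()
]

long_df = pd.DataFrame(rows)
print(long_df)
```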


## Stage 3: Data Processing
Now we process the raw data output and put it into a nice format. This might take a while depending on the date period chosen.

In [None]:
#@title  { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Make your selections below and then hit run to process data

#@markdown Include same to same movements?
same_to_same = "No" #@param [ "No", "Yes"]
#@markdown Select time bucket
timebucket = "24h" #@param ['15min', "1h", "24h"]

print("Processing data ...")

export = data.copy()

#remove same to same movement
if same_to_same == "No":
  export = export[export['origin'] != export['destination']]

#reshape data
export = export.pivot(index=['origin', 'destination', 'from'], columns='class', values='value').reset_index()

# convert back to local time
export = export.rename(columns={"from":"UTC Datetime"})
export["UTC Datetime"]  = pd.to_datetime(export["UTC Datetime"])
export["Local Datetime"] = export["UTC Datetime"].dt.tz_convert(timezone.value).dt.tz_localize(None)
print("Half way there, still processing ...")

# group by selected time bucket
export = export.groupby([pd.Grouper(key='Local Datetime', freq=timebucket), 'origin', 'destination']).sum(numeric_only=True).reset_index()

#add zone names
export["origin_longname"] = export["origin"].map(dict(zip(df_hard["zone_id"], df_hard["zones_dropdown"])))
export["destination_longname"] = export["destination"].map(dict(zip(df_hard["zone_id"], df_hard["zones_dropdown"])))

#drop columns and reorder for export
columns = ['origin', 'destination', 'origin_longname', 'destination_longname', 'Local Datetime'] + params["classes"]
export = export[columns]
print("Done. Data ready for export.")

Processing data ...
Half way there, still processing ...
Done. Data ready for export.
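
The processing cell pivots the long-format data so each class becomes a column, converts timestamps to local time, and then sums the 15-minute values into the selected time bucket. A small sketch of those three steps on made-up data, assuming `Europe/London` and a 1h bucket:

```python
import pandas as pd

# Made-up long-format data: two 15-minute buckets for one movement
long_df = pd.DataFrame({
    "origin": ["1121"] * 4,
    "destination": ["1530"] * 4,
    "from": ["2023-02-20T00:00:00.000Z", "2023-02-20T00:00:00.000Z",
             "2023-02-20T00:15:00.000Z", "2023-02-20T00:15:00.000Z"],
    "class": ["car", "motorbike", "car", "motorbike"],
    "value": [2, 0, 3, 1],
})

# Step 1: one column per class
wide = long_df.pivot(index=["origin", "destination", "from"],
                     columns="class", values="value").reset_index()

# Step 2: UTC strings -> naive local timestamps
wide["Local Datetime"] = (pd.to_datetime(wide["from"])
                          .dt.tz_convert("Europe/London")
                          .dt.tz_localize(None))

# Step 3: sum the 15-minute buckets into hourly buckets per movement
hourly = (wide.groupby([pd.Grouper(key="Local Datetime", freq="1h"),
                        "origin", "destination"])
              .sum(numeric_only=True)
              .reset_index())
print(hourly)
```

Both 15-minute rows fall into the 00:00 hour, so the result is a single row holding the summed counts.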


## Stage 4: Data Export
Now let's write this to a .csv file. You can either save the file locally (it will show in your Downloads folder) or save it to Google Drive.


* **Local Downloads Folder:** This might not work if your browser or computer is blocking downloads.
* **Google Drive:** If you want to save it in Google Drive, you will be asked for permission to connect to your Google Account.

In [None]:
#@title  { vertical-output: true, display-mode: "form" }
#@markdown Select where to save the csv file
download_location = "Local folder" #@param [ "Local folder", "Google Drive"]
#@markdown Name your file
filename = "turning-counts-test2" #@param {type:"string"}
#@markdown Hit run (>)

if download_location == "Local folder":
  from google.colab import files
  export.to_csv(filename + ".csv", index = False)
  files.download(filename + ".csv")
else:
  from google.colab import drive
  drive.mount('/content/drive')
  path = '/content/drive/My Drive/'
  export.to_csv(path + filename +".csv", index = False)
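
The `google.colab` helpers above only exist inside Colab. If you run the notebook elsewhere, a plain `to_csv` call is enough. The sketch below writes a one-row table (the example row from the output format section) to a temporary folder and reads it back; the file name is arbitrary.

```python
import os
import tempfile

import pandas as pd

# One example row in the export format
export_sample = pd.DataFrame({
    "origin": [1121],
    "destination": [1530],
    "origin_longname": ["S4 - Zone 1"],
    "destination_longname": ["S4 - Zone 2"],
    "Local Datetime": pd.to_datetime(["2023-02-20 00:00:00"]),
    "car": [2],
    "motorbike": [0],
})

# Write without the index column, as the export cell does
path = os.path.join(tempfile.mkdtemp(), "turning-counts-example.csv")
export_sample.to_csv(path, index=False)

# Read it back, parsing the timestamp column as datetimes
reloaded = pd.read_csv(path, parse_dates=["Local Datetime"])
print(reloaded)
```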


## Stage 5: Data Visualisation - Work in Progress

Aim: Produce high level visualisation of turning movements

In [None]:
test = data.groupby(["origin", "destination"])["value"].sum().to_frame().reset_index()
test["origin"] = test["origin"].astype(int)*10
test

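|  | origin | destination | value |
|:---------:|:---------:|:---------:|:---------:|
| 0 | 100690 | 10069 | 0 |
| 1 | 100690 | 10070 | 64 |
| 2 | 100690 | 10071 | 153 |
| 3 | 100700 | 10069 | 141 |
| 4 | 100700 | 10070 | 1 |
| 5 | 100700 | 10071 | 7517 |
| 6 | 100710 | 10069 | 0 |
| 7 | 100710 | 10070 | 3 |
| 8 | 100710 | 10071 | 0 |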


In [None]:

link = dict(source = test["origin"], target = test["destination"], value = test["value"])
test_data = go.Sankey(link = link)

fig = go.Figure(test_data)

fig.show()
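
A Sankey `link` in plotly expects `source` and `target` to be positions in the list of node labels rather than raw zone ids. The sketch below is one possible way to build that mapping, not the notebook's method: it uses the totals from the table above with the original zone ids (without the factor of 10), and keeps origin and destination nodes separate so flows run left to right. The label wording is an assumption.

```python
# (origin, destination, total) triples taken from the table above
flows = [
    ("10069", "10069", 0), ("10069", "10070", 64), ("10069", "10071", 153),
    ("10070", "10069", 141), ("10070", "10070", 1), ("10070", "10071", 7517),
    ("10071", "10069", 0), ("10071", "10070", 3), ("10071", "10071", 0),
]

# Drop empty movements so they do not appear as zero-width links
flows = [flow for flow in flows if flow[2] > 0]

origins = sorted({origin for origin, _, _ in flows})
destinations = sorted({destination for _, destination, _ in flows})

# Origin nodes come first in the label list, destination nodes after them
labels = [f"from {o}" for o in origins] + [f"to {d}" for d in destinations]
source_index = {o: i for i, o in enumerate(origins)}
target_index = {d: len(origins) + i for i, d in enumerate(destinations)}

node = {"label": labels}
link = {
    "source": [source_index[o] for o, _, _ in flows],
    "target": [target_index[d] for _, d, _ in flows],
    "value": [v for _, _, v in flows],
}
print(labels)
print(link)
```

The resulting dictionaries can be passed to plotly as `go.Figure(go.Sankey(node=node, link=link))`.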

In [None]:
import plotly.graph_objects as go
fig = go.Figure( go.Scatter(x=[1,2,3], y=[1,3,2] ) )
fig.show()