<a href="https://colab.research.google.com/github/vivacitylabs/data-toolkit/blob/master/notebooks/sensor_metadata_bulk_download_generator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sensor metadata - Bulk Download Generator



## Generate a CSV file of sensor metadata and all linked entities

This notebook is a tool for accessing VivaCity data via the API. It is intended as an **interim solution** while we work on new dashboard developments. If you have any issues, contact customer support (support@vivacitylabs.com) or raise a ticket on the [Customer Help Portal](https://vivacitylabs.atlassian.net/servicedesk/customer/portal/16).

#### How it works

This notebook will run you through all the necessary steps and will save a csv file locally or in your Google Drive.

You will need to fill in a few details and then hit the run button (▶) next to the code cells.

If you want to make changes to the code and save them, you will first need to save a copy of this notebook to your Google Drive.

**What you will need**

- VivaCity API login credentials
- Sensor combinations you want to download data for

#### Output format

You will receive a list of all your countlines and the associated sensors in the following format.

Please note that sensor numbers are derived from countline names and depend on consistency in countline names. Sensors will appear multiple times if they have more than 1 countline linked to them.


| derived_sensor_number | sensor_lat | sensor_long | countline_id | countline_name | availableClasses | countline_direction | countline_location | deviceuid |
|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|
| 35 | 53.483353 | -2.25362 | 13427 | S35_HighStreet_road | ['pedestrian', 'cyclist', ...] | both | {'start':{'lat': ...} | 6a74a1e0-0477-11ea-dwr4-42010af00327 |
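
To see why a sensor can appear on several rows, here is a minimal plain-Python sketch of the "one row per sensor and countline pair" idea. The sample records, IDs and the helper name `sensor_countline_rows` are made up for illustration; the notebook itself does this step with pandas.

```python
# Hypothetical sample data, loosely shaped like the sensor and countline
# records used later in this notebook. IDs, names and coordinates are made up.
sensors = [
    {"id": "device-a", "location": {"lat": 53.48, "long": -2.25}, "countlines": [101, 102]},
    {"id": "device-b", "location": {"lat": 53.49, "long": -2.26}, "countlines": [103]},
]
countlines = [
    {"id": 101, "name": "S35_HighStreet_road"},
    {"id": 102, "name": "S35_HighStreet_pavement"},
    {"id": 103, "name": "S36_StationRoad_road"},
]


def sensor_countline_rows(sensors, countlines):
    """Return one row per (sensor, countline) pair."""
    names_by_id = {c["id"]: c["name"] for c in countlines}
    rows = []
    for sensor in sensors:
        for countline_id in sensor["countlines"]:
            rows.append({
                "deviceuid": sensor["id"],
                "sensor_lat": sensor["location"]["lat"],
                "sensor_long": sensor["location"]["long"],
                "countline_id": countline_id,
                # None if the countline is not in the countline list
                "countline_name": names_by_id.get(countline_id),
            })
    return rows


rows = sensor_countline_rows(sensors, countlines)
for row in rows:
    print(row["deviceuid"], row["countline_id"], row["countline_name"])
```

In this sketch `device-a` has two countlines, so it produces two rows with identical sensor coordinates.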

## Stage 1: Getting Started
Let's begin by importing the packages we'll need and creating some useful functions!

Hit the run button (▶) in the top left corner.

In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this to import the required packages
import requests
import getpass
import json
import pandas as pd
from datetime import date, datetime, timedelta
import csv
import time
import pytz
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

## Stage 2: Data Import

1. First, we authenticate the API user to get access to the data. If the user isn't set up properly, this will throw an error; check that you have the correct username and password.

2. Second, we get all the sensors and countlines available to the API user.

3. In the future: We will add all related entities to the sensor (countlines and zones)

ℹ  Sensor names are derived from countline names, so they can differ slightly if countlines are not named consistently.
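
As a rough illustration of that derivation, the sketch below uses Python's `re` module with the same idea as the pattern used later in this notebook: an "S" or "s", an optional underscore, then one to three digits. The helper name `derive_sensor_number` and the sample names are hypothetical.

```python
import re

# An "S" or "s", an optional underscore, then 1 to 3 digits (captured).
SENSOR_PATTERN = re.compile(r"[Ss]_?(\d{1,3})")


def derive_sensor_number(countline_name):
    """Return the sensor number found in a countline name, or "Unknown"."""
    if not countline_name:
        return "Unknown"
    match = SENSOR_PATTERN.search(countline_name)
    return match.group(1) if match else "Unknown"


# Made-up countline names, including inconsistent ones
for name in ["S35_HighStreet_road", "s_7 Market St", "HighStreet_road", "Bus12_lane", None]:
    print(name, "->", derive_sensor_number(name))
```

Note that a name such as `Bus12_lane` would yield `12` because the pattern matches the "s" in "Bus", which is the kind of mismatch that inconsistent naming can cause.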


### Authentication
Now you will need your API login details, i.e. a username and a password. If you don't have them, please contact customer support (support@vivacitylabs.com).

1.   Enter the username into the field on the right, then hit the run button (▶).
2.   Input the password in the box that appears below it and hit "enter" on your keyboard.

In [None]:
#@title  {vertical-output: true, display-mode: "form" }
#@markdown Insert your login credentials
username = "api-username" #@param {type:"string"}

auth_body = {}
auth_body['username'] = username
auth_body['password'] = getpass.getpass()

··········


### Get access token
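
The cell below turns the token response into an `Authorization` header. As a minimal sketch of that step in isolation, the hypothetical helper `build_auth_headers` takes a status code and a decoded response body and either returns the header or raises a readable error. Treating only status 200 as success is an assumption, not something confirmed by the API documentation.

```python
def build_auth_headers(status_code, payload):
    """Return the Authorization header dict, or raise ValueError with a readable message."""
    # Assumption: a successful token request returns HTTP 200
    if status_code != 200:
        raise ValueError(f"Authentication failed with HTTP status {status_code}")
    token = payload.get("access_token")
    if not token:
        raise ValueError("Response did not contain an access_token")
    return {"Authorization": "Bearer " + token}


# Made-up payload standing in for a decoded token response
example_headers = build_auth_headers(
    200, {"access_token": "example-token", "refresh_token": "example-refresh"}
)
print(example_headers)
```

Raising on any unexpected status, rather than checking for 401 only, would surface other failures (for example a server error) before the code tries to read the token.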

In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this to get authorized access to the API
print("Authorising...")
auth_response = requests.post("https://api.vivacitylabs.com/get-token", data=auth_body, headers={'Content-Type':'application/x-www-form-urlencoded'})
if auth_response.status_code == 401:
  print("\n!Error: Can't connect to the API. Check your username and password again.\nIf the issue persists, ask customer support whether your user is set up correctly on the API\n")
else:
  headers = {}
  headers['Authorization'] = "Bearer " + auth_response.json()['access_token']
  refresh_body = {}
  refresh_body['refresh_token'] = auth_response.json()['refresh_token']
  start = time.time()
  print("Done. Successfully retrieved access token.")

Authorising...
Done. Successfully retrieved access token.


### Get metadata for sensors and their countlines

In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Run this to get sensor and countline metadata from the API V2

#get sensors
#print("\nRequesting metadata ...")
api_url_base = 'https://api.vivacitylabs.com'
sensor_request = requests.get(f'{api_url_base}/sensor', headers=headers)
sensors = sensor_request.json()

#get countlines
countline_request = requests.get(f'{api_url_base}/countline', headers=headers)
countlines = countline_request.json()

#convert to dataframe and merge both
df_sensors = pd.DataFrame.from_dict(sensors).explode('countlines').reset_index(drop=True).rename(columns={"id":"deviceuid", "location": "sensor_location", "countlines":"countline_id"})
df_countlines = pd.DataFrame.from_dict(countlines).rename(columns={"id": "countline_id", "name":"countline_name", "location":"countline_location", "direction":"countline_direction"})
df_meta = pd.merge(df_sensors, df_countlines, on="countline_id", how="left")

# split sensor coordinates
df_meta["sensor_lat"] = [i["lat"] if i is not None else "None" for i in df_meta["sensor_location"]]
df_meta["sensor_long"] = [i["long"] if i is not None else "None" for i in df_meta["sensor_location"]]

# drop sensors with no coordinates
df_meta = df_meta[df_meta["sensor_lat"]!="None"]

# drop sensors with no countlines
df_meta = df_meta[~df_meta["countline_id"].isna()]
df_meta = df_meta.reset_index(drop=True)

# drop countlines with no coordinates
countline_coords = []
for i in range(len(df_meta["countline_location"])):
  if list(df_meta["countline_location"][i]["start"].items())[0][1] is None:
    countline_coords.append("No")
  else:
    countline_coords.append("Yes")
df_meta["countline_coords"] = countline_coords
df_meta = df_meta[df_meta["countline_coords"]!="No"]

# derive sensor number from countline names
# (rows without a countline name are kept; they end up as "Unknown" below)
sensor_name = df_meta.copy()
sensor_name['derived_sensor_name'] = sensor_name['countline_name'].str.extract(r'([Ss]_?\d{1,3})', expand=False)
sensor_name["derived_sensor_name"] = sensor_name["derived_sensor_name"].fillna("Unknown")
sensor_name = sensor_name.drop_duplicates(subset='deviceuid', keep='first').sort_values(by='derived_sensor_name',ascending=True)

# derived_sensor_number
sensor_name['derived_sensor_number'] = sensor_name['derived_sensor_name'].str.extract(r'(\d+)', expand=False)
sensor_name["derived_sensor_number"] = sensor_name["derived_sensor_number"].fillna("Unknown")

# merge name into main dataframe
df_meta = pd.merge(df_meta, sensor_name[["deviceuid", "derived_sensor_name", "derived_sensor_number"]], on="deviceuid", how="left")

# reorder columns
df_meta = df_meta[['derived_sensor_number', 'sensor_lat', 'sensor_long', 'countline_id',
       'countline_name', 'availableClasses', 'countline_direction', 'countline_location', 'deviceuid']]


## Stage 3: Data Export
Now let's write this to a .csv file. You can either save the file locally (it will show in your Downloads folder) or save it to a Google Drive.

* **Local Downloads Folder:** This might not work if your browser or computer is blocking downloads.
* **Google Drive:** If you want to save it in Google Drive, you will be asked for permission to connect to your Google Account.
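
List and dictionary columns such as `availableClasses` and `countline_location` end up in the CSV as text. If you later need them as Python objects again, one option is `ast.literal_eval`. The sketch below uses only the standard library and a made-up row; the `long` key inside `countline_location` is an assumption, and values that are not plain Python literals (for example `nan`) would need separate handling.

```python
import ast
import csv
import io

# Hypothetical exported row: nested values are stored as their text representation.
exported = io.StringIO()
writer = csv.DictWriter(
    exported, fieldnames=["countline_id", "availableClasses", "countline_location"]
)
writer.writeheader()
writer.writerow({
    "countline_id": 13427,
    "availableClasses": str(["pedestrian", "cyclist"]),
    "countline_location": str({"start": {"lat": 53.4, "long": -2.2}}),
})

# Read the file back and turn the text columns into a list and a dict again
exported.seek(0)
parsed_rows = []
for row in csv.DictReader(exported):
    row["availableClasses"] = ast.literal_eval(row["availableClasses"])
    row["countline_location"] = ast.literal_eval(row["countline_location"])
    parsed_rows.append(row)

print(parsed_rows[0]["availableClasses"])
print(parsed_rows[0]["countline_location"]["start"]["lat"])
```

`ast.literal_eval` only parses Python literals, so it is a safer choice than `eval` for data read from a file.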

In [None]:
#@title  { vertical-output: true, display-mode: "form" }
#@markdown Select where to save the csv file
download_location = "Local folder" #@param [ "Local folder", "Google Drive"]
#@markdown Name your file
filename = "sensors_and_countlines_metadata" #@param {type:"string"}
#@markdown Hit run (>)
df_export = df_meta

if download_location == "Local folder":
  from google.colab import files
  df_export.to_csv(filename + ".csv", index = False)
  files.download(filename + ".csv")
else:
  from google.colab import drive
  drive.mount('/content/drive')
  path = '/content/drive/My Drive/'
  df_export.to_csv(path + filename +".csv", index = False)


### WIP: Add zones and other metadata from API v3

In [None]:
#@title { vertical-output: false, display-mode: "form" }
#@markdown **Code cell:** Once finalised this will match zones to sensors

#get hardware meta data
#print("\nRequesting metadata ...")
api_url_base = 'https://beta.api.vivacitylabs.com'
hardware_request = requests.get(f'{api_url_base}/hardware/metadata', headers=headers)
if hardware_request.status_code == 401:
  print("\n!Error: Can't access the data. Ask customer support whether your user is set up correctly on API v3\n")
hardware = hardware_request.json()

# Get hardware info: one entry per (hardware, countline) pair
dict_hard = { "hardware_id" : [], "countline_id" : [], "countline_name" : [] }
for hardware_id in hardware:
  for lens in hardware[hardware_id]["view_points"]:
    # visit each view point's countlines once; the former extra loop over the
    # view point's keys only produced duplicate rows
    view_point_countlines = hardware[hardware_id]["view_points"][lens].get("countlines", {})
    for countline_id in view_point_countlines:
      dict_hard["hardware_id"].append(hardware_id)
      dict_hard["countline_id"].append(countline_id)
      dict_hard["countline_name"].append(view_point_countlines[countline_id]['name'])

#turn into dataframe and clean up
df_hard = pd.DataFrame.from_dict(dict_hard)
# sensor name = first two characters of the countline name (shorter names are kept whole)
df_hard["sensor_name"] = df_hard["countline_name"].str[:2]
df_hard["countline_name_display"] = df_hard["countline_name"] + " (" + df_hard["countline_id"].astype(str) + ")"
df_hard = df_hard.drop_duplicates()
#print(len(df_hard["countline_id"].unique()), " countlines available")