# Placemeter

This notebook shows how to build a DataFrame from the data of Placemeter's sensors around New York City through their API, and how to plot the results. A Placemeter API token is therefore required.


In [1]:
%load_ext watermark
%watermark -v -m -p pandas,numpy

import json
import codecs

import pandas as pd
from pandas.tseries.offsets import DateOffset
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import time
from datetime import date
from time import mktime

import uncurl
import requests

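### A token.txt file containing the Placemeter API token must sit in the same folder as this notebook
with open('token.txt', 'r') as f:
    api_key = f.read().strip()  # strip the trailing newline so the header value stays valid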

from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from plotly.graph_objs import Bar, Scatter, Figure, Layout

from plotly.graph_objs import *
import numpy as np

import cufflinks as cf

print ("Plotly  ", __version__)
init_notebook_mode(connected=True)

import matplotlib.pyplot as plt
%matplotlib inline 


CPython 3.5.2
IPython 4.2.0

pandas 0.18.1
numpy 1.11.1

compiler   : GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)
system     : Darwin
release    : 15.6.0
machine    : x86_64
processor  : i386
CPU cores  : 8
interpreter: 64bit
Plotly   1.12.5


# Exploring API queries

The API documentation recommends the following GET request for reading data from Placemeter's servers. To use it from Python, it has to be converted to a "requests.get()" call.


In [2]:
uncurl.parse ('GET https://api.placemeter.net/api/v1/measurementpoints/id/data?start=start&end=end&resolution=resolution&metrics=metrics[&classes=classes][&include_unreliable_data=include_unreliable_data]')

'requests.get("https://api.placemeter.net/api/v1/measurementpoints/id/data?start=start&end=end&resolution=resolution&metrics=metrics[&classes=classes][&include_unreliable_data=include_unreliable_data]",\n    headers={},\n    cookies={},\n)'
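The placeholders in the template above (id, start, end and the optional parameters) have to be filled in before the request can be sent. A minimal sketch of that step, using only the standard library: the helper name `build_data_url` is hypothetical, and the optional parameters are passed through without validation because their accepted values are not shown in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.placemeter.net/api/v1/measurementpoints"


def build_data_url(point_id, start, end, **optional):
    """Return the data URL for one measurement point.

    start and end are Unix timestamps; optional holds extra query
    parameters such as resolution or metrics (not validated here).
    """
    params = {"start": start, "end": end}
    params.update(optional)
    return "{}/{}/data?{}".format(BASE_URL, point_id, urlencode(params))


# Same measurement point and time range as the single-sensor query in this notebook
url = build_data_url(14333, 1463270400, 1470009600)
print(url)
```

The resulting string can be passed to "requests.get()" together with the "Authorization" header. Alternatively, requests accepts the same dictionary directly through its `params` argument.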

Now we are going to get the metadata for all the available sensors as a 'list'.


In [3]:
r = requests.get("https://api.placemeter.net/api/v1/measurementpoints", headers={ "Authorization": api_key }) 
data = r.json()  # parse the JSON response into Python objects
print (data)
print (type (data))

[{'name': 'Broadway @ Canal - Street', 'sensor': 7589, 'type': 'turnstile', 'id': 14331, 'metrics': [{'name': 'Direction 1', 'id': 'direction_1'}, {'name': 'Direction 2', 'id': 'direction_2'}], 'location': {'longitude': -74.0018671, 'latitude': 40.7198744}, 'classes': ['all']}, {'name': 'Canal Subway NQR', 'sensor': 7589, 'type': 'doorway', 'id': 14333, 'metrics': [{'name': 'Direction 1', 'id': 'direction_1'}, {'name': 'Direction 2', 'id': 'direction_2'}], 'location': {'longitude': -74.0018671, 'latitude': 40.7198744}, 'classes': ['all']}, {'name': 'Broadway @ Canal - West Sidewalk', 'sensor': 7589, 'type': 'turnstile', 'id': 14332, 'metrics': [{'name': 'Direction 1', 'id': 'direction_1'}, {'name': 'Direction 2', 'id': 'direction_2'}], 'location': {'longitude': -74.0018671, 'latitude': 40.7198744}, 'classes': ['all']}, {'name': 'Canal South Half?', 'sensor': 7665, 'type': 'turnstile', 'id': 17204, 'metrics': [{'name': 'Direction 1', 'id': 'direction_1'}, {'name': 'Direction 2', 'id': '

The above list can be placed in a DataFrame for better readability. 


In [4]:
df1 = pd.DataFrame.from_dict(data)
df1.head()

Unnamed: 0,classes,id,location,metrics,name,sensor,type
0,[all],14331,"{'longitude': -74.0018671, 'latitude': 40.7198...","[{'name': 'Direction 1', 'id': 'direction_1'},...",Broadway @ Canal - Street,7589.0,turnstile
1,[all],14333,"{'longitude': -74.0018671, 'latitude': 40.7198...","[{'name': 'Direction 1', 'id': 'direction_1'},...",Canal Subway NQR,7589.0,doorway
2,[all],14332,"{'longitude': -74.0018671, 'latitude': 40.7198...","[{'name': 'Direction 1', 'id': 'direction_1'},...",Broadway @ Canal - West Sidewalk,7589.0,turnstile
3,"[all, ped, car]",17204,"{'longitude': None, 'latitude': None}","[{'name': 'Direction 1', 'id': 'direction_1'},...",Canal South Half?,7665.0,turnstile
4,[all],17205,"{'longitude': None, 'latitude': None}","[{'name': 'Traffic in', 'id': 'direction_1'}, ...",Area 1,7665.0,
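The "location" column in the table above still holds one dictionary per row. If separate longitude and latitude columns are more convenient (for mapping, say), the dictionaries can be expanded. This is a sketch on two records trimmed from the output above, not part of the original workflow; the variable names are illustrative.

```python
import pandas as pd

# Two records trimmed from the API response shown above
points = [
    {"name": "Broadway @ Canal - Street", "id": 14331, "type": "turnstile",
     "location": {"longitude": -74.0018671, "latitude": 40.7198744}},
    {"name": "Canal South Half?", "id": 17204, "type": "turnstile",
     "location": {"longitude": None, "latitude": None}},
]

df_points = pd.DataFrame(points)

# Expand each location dictionary into its own columns, keeping the row index
coords = pd.DataFrame(df_points["location"].tolist(), index=df_points.index)
flat = pd.concat([df_points.drop("location", axis=1), coords], axis=1)
print(flat[["id", "name", "longitude", "latitude"]])
```

Measurement points without coordinates end up with missing values in both new columns.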


# Downloading and plotting data for a single sensor

The following code downloads the data from sensor 14333 and plots the number of people recorded between 2016-05-15 and 2016-08-01. The start and end values in the query, 1463270400 and 1470009600, are the Unix timestamps for midnight UTC on those two dates.

We use a request similar to the one in the previous section to query sensor 14333, which is located at the Canal Subway NQR doorway, and then place the data in a DataFrame.


In [5]:
## 14333
r14333 = requests.get("https://api.placemeter.net/api/v1/measurementpoints/14333/data?start=1463270400&end=1470009600", headers={ "Authorization": api_key })
data14333= r14333.json()
df14333 = pd.DataFrame.from_dict(data14333['data'])
df14333.tail()

Unnamed: 0,all,timestamp,unreliable_periods
1867,"{'direction_2': 140, 'direction_1': 180}",1469991600,[]
1868,"{'direction_2': 155, 'direction_1': 238}",1469995200,[]
1869,"{'direction_2': 125, 'direction_1': 209}",1469998800,[]
1870,"{'direction_2': 114, 'direction_1': 203}",1470002400,[]
1871,"{'direction_2': 75, 'direction_1': 203}",1470006000,[]


As the table above shows, the DataFrame has a column called “all” that holds the number of people counted in each direction. In the following cell, we loop over each row, extract the values of direction_1 and direction_2, and place them in new columns named direction1 and direction2.


In [6]:
df3 = pd.DataFrame(columns= ['direction1', 'direction2'])
for (i,r) in df14333.iterrows():
    e = r['all']
    df3.loc[i] = [e['direction_1'], e['direction_2']]
df3.tail()

Unnamed: 0,direction1,direction2
1867,180.0,140.0
1868,238.0,155.0
1869,209.0,125.0
1870,203.0,114.0
1871,203.0,75.0
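The row-by-row loop works, but it becomes slow for long time ranges. A possible alternative, sketched here on two rows copied from the output above, expands the dictionaries in one step. The variable names are illustrative and this is not the original author's method.

```python
import pandas as pd

# Two rows copied from the df14333 output above
df = pd.DataFrame({
    "all": [{"direction_2": 140, "direction_1": 180},
            {"direction_2": 155, "direction_1": 238}],
    "timestamp": [1469991600, 1469995200],
}, index=[1867, 1868])

# Build a DataFrame from the list of dictionaries, keeping the original index
directions = pd.DataFrame(df["all"].tolist(), index=df.index)
directions = directions.rename(columns={"direction_1": "direction1",
                                        "direction_2": "direction2"})

result = pd.concat([df[["timestamp"]], directions[["direction1", "direction2"]]], axis=1)
print(result)
```

In this version the counts stay integers instead of being converted to floats.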


Now we concatenate df3 to df14333 column by column and drop the original “all” column.


In [7]:
df14333 = pd.concat([df14333, df3], axis =1)
df14333 = df14333[['timestamp','unreliable_periods','direction1','direction2']]
df14333.tail()

Unnamed: 0,timestamp,unreliable_periods,direction1,direction2
1867,1469991600,[],180.0,140.0
1868,1469995200,[],238.0,155.0
1869,1469998800,[],209.0,125.0
1870,1470002400,[],203.0,114.0
1871,1470006000,[],203.0,75.0


Finally, we convert the timestamps from Unix time (seconds since 1970-01-01 UTC) to New York local time, which is EDT (UTC-4) for this date range rather than EST.


In [8]:
# Parse the Unix timestamps (seconds, UTC) and use them as the index
df14333['timestamp'] = pd.to_datetime(df14333['timestamp'], unit='s')
df14333 = df14333.set_index('timestamp')
# Convert UTC to New York wall-clock time, then drop the tz info while keeping local time
df14333 = df14333.tz_localize('UTC').tz_convert('America/New_York').tz_localize(None)
df14333.tail()


timestamp,unreliable_periods,direction1,direction2
2016-07-31 15:00:00,[],180.0,140.0
2016-07-31 16:00:00,[],238.0,155.0
2016-07-31 17:00:00,[],209.0,125.0
2016-07-31 18:00:00,[],203.0,114.0
2016-07-31 19:00:00,[],203.0,75.0
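The conversion can be checked on a few individual timestamps. The sketch below uses two values from the table above plus one winter timestamp (1480550400, midnight UTC on 2016-12-01, added here only for illustration). It shows that the offset from UTC is four hours in summer but five in winter, so a fixed offset would only be correct for part of the year.

```python
import pandas as pd

# Two timestamps from the data above and one winter timestamp for comparison
ts = pd.Series([1469991600, 1470006000, 1480550400])

utc = pd.to_datetime(ts, unit="s", utc=True)     # timezone-aware, UTC
local = utc.dt.tz_convert("America/New_York")    # same instants, New York time
naive_local = local.dt.tz_localize(None)         # drop tz info, keep wall-clock time

print(naive_local)
```

The first two values come out as 2016-07-31 15:00 and 19:00, matching the table, while the winter value becomes 2016-11-30 19:00 (five hours behind UTC).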


Plotting the data with Plotly


In [9]:
iplot(df14333.iplot(asFigure=True,
                            kind='scatter',xTitle='Dates',yTitle='Number of People',title='14333'))

# For any number of sensors and any time period

In the following cells I have put the previous code together and expanded it so that we can get the data for any number of sensors and for a selectable time period.



In [10]:
main_df = pd.DataFrame()

## Sensor IDs
idnumber = '14333', '14331', '14332' 

## Starting day for collecting data.
start = date(2016, 6, 1) 
start =int(mktime(start.timetuple())) 

## End day for collecting data.
end = date(2016, 8, 16) 
end =int(mktime(end.timetuple()))
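Note that "mktime" interprets the date in the time zone of the machine running the notebook, so the resulting timestamps differ from one computer to another. If fixed, reproducible values are preferred, the dates can be pinned to UTC instead. This is a sketch with a hypothetical helper name; whether the API expects UTC boundaries is an assumption that should be checked against the Placemeter documentation.

```python
from datetime import datetime, timezone


def to_unix_utc(year, month, day):
    """Return the Unix timestamp of midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


start = to_unix_utc(2016, 6, 1)
end = to_unix_utc(2016, 8, 16)
print(start, end)
```

As a check, `to_unix_utc(2016, 5, 15)` gives 1463270400, the start value used in the single-sensor query earlier.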

In [11]:
for sensors in idnumber:
    query  = "https://api.placemeter.net/api/v1/measurementpoints/"+sensors+"/data?start="+str(start)+"&end="+str(end)
    r = requests.get(query, headers={ "Authorization": api_key })
    data= r.json()
    df = pd.DataFrame.from_dict(data['data'])
    df.rename(columns={'unreliable_periods':str(sensors)+'_unreliable_periods', 
                       'timestamp':str(sensors)+'_timestamp',
                       'all':str(sensors)+'_all'}, inplace=True)
    
    df3 = pd.DataFrame(columns= [str(sensors)+'_direction1', str(sensors)+'_direction2'])
    
    for (i,r) in df.iterrows():
        e = r[str(sensors)+'_all']
        df3.loc[i] = [e['direction_1'], e['direction_2']]
    
    df = pd.concat([df, df3], axis =1)
    df = df[[str(sensors)+'_timestamp', str(sensors)+'_unreliable_periods', str(sensors)+'_direction1', str(sensors)+'_direction2']]
    
    # Parse the Unix timestamps (seconds, UTC) and use them as the index
    df[str(sensors)+'_timestamp'] = pd.to_datetime(df[str(sensors)+'_timestamp'], unit='s')
    df = df.set_index(str(sensors)+'_timestamp')
    # Convert UTC to New York wall-clock time, then drop the tz info while keeping local time
    df = df.tz_localize('UTC').tz_convert('America/New_York').tz_localize(None)
    
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)
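The body of the loop can also be moved into a small function, which makes it testable without network access. The sketch below is one possible arrangement, not the original code. The function name `tidy_sensor` is hypothetical, the 14333 counts are copied from the earlier output, and the 14331 counts are made up purely to show how the join works.

```python
import pandas as pd


def tidy_sensor(payload, sensor_id):
    """Turn one sensor's JSON payload into a DataFrame indexed by UTC time."""
    rows = payload["data"]
    counts = pd.DataFrame([row["all"] for row in rows])
    counts.index = pd.to_datetime([row["timestamp"] for row in rows], unit="s", utc=True)
    counts.index.name = "timestamp"
    # Prefix the columns so that several sensors can share one table
    return counts.add_prefix("{}_".format(sensor_id))


payloads = {
    "14333": {"data": [
        {"all": {"direction_1": 180, "direction_2": 140}, "timestamp": 1469991600},
        {"all": {"direction_1": 238, "direction_2": 155}, "timestamp": 1469995200},
    ]},
    "14331": {"data": [  # made-up counts, for illustration only
        {"all": {"direction_1": 10, "direction_2": 20}, "timestamp": 1469991600},
        {"all": {"direction_1": 30, "direction_2": 40}, "timestamp": 1469995200},
    ]},
}

frames = [tidy_sensor(payload, sensor_id) for sensor_id, payload in payloads.items()]
main_df = pd.concat(frames, axis=1)  # align all sensors on the shared time index
print(main_df)
```

Because every frame uses the same index name, "pd.concat" with axis=1 lines the sensors up by time, and timestamps missing for one sensor appear as NaN in that sensor's columns.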

    

#### Unfortunately, GitHub does not render iplot output yet, so I have added a couple of screenshots of the plot below and published the notebook on [Anaconda](https://anaconda.org/sohrabrs/retrieving_data_from-placemeter/notebook).

In [12]:
        
iplot(main_df.iplot(asFigure=True,
                        kind='scatter',xTitle='Dates',yTitle='Number of People',title='All the sensors'))

<img src="fig1.png">
<img src="fig2.png">
