# Onboard API Wrapper Introduction
In this notebook we will explore the Onboard API and its Python API wrapper. Run the following cell first to install the wrapper.

These notebooks were adapted from previous notebooks developed for a hackathon we ran. A series of YouTube videos discussed those previous versions of the notebooks. While they are no longer a perfect match for the content of these notebooks, they have some useful information in them, including: 
- [how to get started with your Onboard account](https://youtu.be/WmAlWNSH5Tk) [note: only the first half of this video will likely be relevant for the current version of these notebooks],
- get [acquainted with the Onboard API](https://youtu.be/HqwutnN0Bvc), 
- and start using the [Onboard API wrapper to extract and explore data](https://youtu.be/cn32ma0FaQg).

In [1]:
# install API wrapper first
!pip install onboard.client

## API Keys

For the Onboard API, your API key allows secure, automated access to the Onboard portal, where you can access API endpoints and data by attaching one or more scopes to the key. The key is linked to its creator's user account, and can only access information that is allowed according to the scopes attached to it.

So to start using the Onboard API, **make sure that you not only have an account, but have also generated an API key**. The following section explains how to do this.

## Generating API keys

In the [official documentation](https://onboard-data-python-client-api.readthedocs.io/en/latest/Initial%20Setup.html#setting-up-api-access) you can find the instructions to generate your API Key:

> If you are an existing Onboard user you can head over to [your account’s api keys page](https://portal.onboarddata.io/account?tab=api) and generate a new key and grant scopes for “general” and “buildings:read”. If you would like to get access to Onboard and start prototyping against an example building please request [access here](https://www.onboarddata.io/contact-us).

After generating your API key with the **general** and **buildings:read** scopes, we use it to connect to the client.

### Connecting to the API

One trick for using your API key while still keeping it (relatively) secret is to create a separate `key.py` file that only defines the string variable `api_key`, like so:

```
api_key = 'ob-p-{the rest of your key here}'
```

So now we can import that variable and provide it to the client, without having to display the key in plain text in this notebook. Isn't that neat? ![Isn't that neat?](https://media.giphy.com/media/CWKcLd53mbw0o/giphy.gif "neat")

In [3]:
from onboard.client import OnboardClient # import wrapper

try: # option 1: create a key.py file with one line: api_key = 'your api key here'
    from key import api_key
except ImportError: # option 2: enter your api key when you get the prompt
    api_key = input('please enter your api_key')

client = OnboardClient(api_key=api_key) # use the api_key to connect to the API
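If you would rather not keep a `key.py` file next to the notebook, an environment variable is another common way to keep the key out of plain sight. The following is a minimal sketch, assuming a variable named `ONBOARD_API_KEY`; that name and the `load_api_key` helper are chosen here for illustration and are not something the Onboard client looks for on its own.

```python
import os

def load_api_key(env=os.environ, var_name="ONBOARD_API_KEY"):
    """Return the API key stored in an environment variable, or None if unset.

    `ONBOARD_API_KEY` is a hypothetical name used only for this sketch.
    """
    key = env.get(var_name, "").strip()
    return key or None

api_key_from_env = load_api_key()
if api_key_from_env is None:
    print("ONBOARD_API_KEY is not set; fall back to key.py or the input prompt")
```

If the variable is set, the returned string can be passed to `OnboardClient(api_key=...)` exactly as in the cell above.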

In [4]:
client.whoami() # to confirm you've connected to the API and get info about your account and this key's scopes

{'result': 'ok',
 'apiKeyInHeader': True,
 'apiKeyScopes': ['general',
  'buildings:read',
  'auth',
  'collection:admin',
  'buildings:write',
  'collection'],
 'apiVersion': '2022-04-14',
 'userInfo': {'token': None,
  'name': 'Christopher',
  'full_name': 'Christopher DT',
  'username': 'christopher',
  'palette': {'primary': {'main': ''}, 'secondary': {'main': ''}},
  'org_short_name': 'Onboard',
  'org_id': 2,
  'logo_url': '',
  'mfa_required': True},
 'authLevel': 0}
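The `apiKeyScopes` entry in this response is a quick way to confirm that the key carries the two scopes these notebooks need. Below is a small sketch of such a check. The `missing_scopes` helper is hypothetical, and it runs against a trimmed copy of the response above so that it works offline.

```python
REQUIRED_SCOPES = {"general", "buildings:read"}

def missing_scopes(whoami_response, required=REQUIRED_SCOPES):
    """Return the required scopes that are absent from a whoami()-style response."""
    granted = set(whoami_response.get("apiKeyScopes", []))
    return sorted(set(required) - granted)

# Trimmed-down copy of the response shown above, used so the sketch runs offline
example_response = {"result": "ok", "apiKeyScopes": ["general", "buildings:read", "auth"]}
print(missing_scopes(example_response))  # prints []
```

With a live connection you could pass `client.whoami()` instead of `example_response`; an empty list means both scopes are attached.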

For these notebooks, we will work with the example/mock building data (ids 427 and 428), which anyone with API access will be able to work with.

In [4]:
import pandas as pd # we will manipulate and analyze data using Pandas
building_ids = [427, 428] # these are the building IDs for our mock example buildings
buildings = pd.DataFrame(client.get_all_buildings()).query('id in @building_ids')
buildings

Unnamed: 0,id,org_id,name,address,sq_ft,image_src,bms_manufacturer,bms_product_name,bms_version,timezone,info,status,equip_count,point_count
14,428,6,Laboratory,,,,,,,America/New_York,,LIVE,122,1654
21,427,6,Office Building,,,,,,,America/New_York,,LIVE,137,2422
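The `.query('id in @building_ids')` call relies on pandas' `@` prefix to refer to a Python variable from inside the query string. The same filter can also be written with boolean indexing and `isin`. Here is a sketch on a mock data frame (the third row is invented for illustration) showing that both forms select the same rows:

```python
import pandas as pd

# Mock stand-in for client.get_all_buildings(); the third row is invented
mock_buildings = pd.DataFrame(
    [
        {"id": 428, "name": "Laboratory"},
        {"id": 427, "name": "Office Building"},
        {"id": 999, "name": "Some Other Building"},
    ]
)
building_ids = [427, 428]

# Two equivalent ways to keep only the rows whose id is in building_ids
via_query = mock_buildings.query("id in @building_ids")
via_isin = mock_buildings[mock_buildings["id"].isin(building_ids)]

print(via_query.equals(via_isin))
```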


In [5]:
buildings.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 14 to 21
Data columns (total 14 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   id                2 non-null      int64  
 1   org_id            2 non-null      int64  
 2   name              2 non-null      object 
 3   address           0 non-null      object 
 4   sq_ft             0 non-null      float64
 5   image_src         0 non-null      object 
 6   bms_manufacturer  0 non-null      object 
 7   bms_product_name  0 non-null      object 
 8   bms_version       0 non-null      object 
 9   timezone          2 non-null      object 
 10  info              0 non-null      object 
 11  status            2 non-null      object 
 12  equip_count       2 non-null      int64  
 13  point_count       2 non-null      int64  
dtypes: float64(1), int64(4), object(9)
memory usage: 240.0+ bytes


We can also get the equipment for those buildings. How about the first one?

In [6]:
# We can get all the equipment from one of our buildings
bid = buildings.id.tolist()[0]
all_equipment = pd.DataFrame(client.get_building_equipment(bid))
all_equipment.head()

Unnamed: 0,id,building_id,equip_id,suffix,equip_type_name,equip_type_id,equip_type_abbr,equip_type_tag,equip_subtype_name,equip_subtype_id,equip_subtype_tag,flow_order,floor_num_physical,floor_num_served,area_served_desc,equip_dis,parent_equip,child_equip,points,tags
0,27294,428,exhaustFan-01,1,Fan,26,FAN,fan,Exhaust Fan,12.0,exhaustFan,4,,,Bathrooms,,[],[],"[{'id': 290774, 'building_id': 428, 'last_upda...","[fan, hvac, exhaustFan]"
1,27295,428,exhaustFan-021,21,Fan,26,FAN,fan,Exhaust Fan,12.0,exhaustFan,4,1004.0,1004.0,Lab 021,,[],[],"[{'id': 289684, 'building_id': 428, 'last_upda...","[fan, hvac, exhaustFan]"
2,27296,428,exhaustFan-022,22,Fan,26,FAN,fan,Exhaust Fan,12.0,exhaustFan,4,1004.0,1004.0,Lab 022,,[],[],"[{'id': 289653, 'building_id': 428, 'last_upda...","[fan, hvac, exhaustFan]"
3,27297,428,exhaustFan-115,115,Fan,26,FAN,fan,Exhaust Fan,12.0,exhaustFan,4,1.0,1.0,Lab 115,,[],[],"[{'id': 290137, 'building_id': 428, 'last_upda...","[fan, hvac, exhaustFan]"
4,27298,428,exhaustFan-116,116,Fan,26,FAN,fan,Exhaust Fan,12.0,exhaustFan,4,1.0,1.0,Lab 116,,[],[],"[{'id': 290088, 'building_id': 428, 'last_upda...","[fan, hvac, exhaustFan]"


In [7]:
all_equipment.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 122 entries, 0 to 121
Data columns (total 20 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   id                  122 non-null    int64  
 1   building_id         122 non-null    int64  
 2   equip_id            122 non-null    object 
 3   suffix              120 non-null    object 
 4   equip_type_name     122 non-null    object 
 5   equip_type_id       122 non-null    int64  
 6   equip_type_abbr     122 non-null    object 
 7   equip_type_tag      122 non-null    object 
 8   equip_subtype_name  44 non-null     object 
 9   equip_subtype_id    44 non-null     float64
 10  equip_subtype_tag   44 non-null     object 
 11  flow_order          122 non-null    int64  
 12  floor_num_physical  115 non-null    float64
 13  floor_num_served    121 non-null    float64
 14  area_served_desc    122 non-null    object 
 15  equip_dis           0 non-null      object 
 16  parent_e

In this data frame we have listed all the equipment in the selected building. Check out the column `points`: these are all the data points associated with that equipment. You can [query specific points](https://onboard-data-python-client-api.readthedocs.io/en/latest/Querying%20Building-Specific%20Data.html#querying-specific-points) with certain conditions using *PointSelector* (we'll get deeper into this in the following notebook).
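Because each cell of `points` holds a list of dictionaries, it can be handy to flatten it into one row per point. The sketch below does this on a small mock frame. The real dictionaries carry many more fields than the two used here, and the `parent_equip_id` column name is an assumption made for this example.

```python
import pandas as pd

# Mock equipment rows; the real `points` dicts carry many more fields
mock_equipment = pd.DataFrame(
    {
        "equip_id": ["exhaustFan-01", "exhaustFan-021"],
        "points": [
            [{"id": 1, "building_id": 428}, {"id": 2, "building_id": 428}],
            [{"id": 3, "building_id": 428}],
        ],
    }
)

# One row per (equipment, point) pair; equipment without points is dropped
exploded = (
    mock_equipment.explode("points")
    .dropna(subset=["points"])
    .reset_index(drop=True)
)

# Expand each point dict into columns and keep track of the owning equipment
points_flat = pd.json_normalize(exploded["points"].tolist())
points_flat["parent_equip_id"] = exploded["equip_id"]

print(points_flat)
```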

In [8]:
from onboard.client.models import PointSelector

First create your *PointSelector* object:

In [9]:
query = PointSelector()

And you can specify all the conditions you want ([check the doc](https://onboard-data-python-client-api.readthedocs.io/en/latest/Querying%20Building-Specific%20Data.html#querying-specific-points)):

In [10]:
query.point_types = ['Zone Temperature']
query.equipment_types = ['fcu']
query.buildings = [427, 428] # again, the example buildings

And when you execute that query using `select_points` you will get the points that satisfy those conditions:

In [11]:
selection = client.select_points(query)
selection

{'orgs': [6],
 'buildings': [427, 428],
 'equipment': [27136,
  27137,
  27063,
  27065,
  27067,
  27070,
  27074,
  27076,
  27077,
  27079,
  27084,
  27088,
  27096,
  27099,
  27356,
  27357,
  27106,
  27107,
  27108,
  27109,
  27110,
  27120,
  27123,
  27127,
  27128,
  27129,
  27130,
  27131,
  27132,
  27133,
  27134,
  27135],
 'equipment_types': [9],
 'point_types': [77],
 'points': [284803,
  286085,
  285964,
  286353,
  284822,
  285719,
  288153,
  289701,
  289575,
  284841,
  285745,
  286258,
  286641,
  286898,
  285621,
  284860,
  285889,
  285766,
  286284,
  286668,
  288847,
  286160,
  286805,
  288982,
  288859,
  285791,
  286306,
  287714,
  288612,
  286566,
  288623,
  286328]}

And those are the points, identified by id:

In [12]:
points = selection["points"]
points

[284803,
 286085,
 285964,
 286353,
 284822,
 285719,
 288153,
 289701,
 289575,
 284841,
 285745,
 286258,
 286641,
 286898,
 285621,
 284860,
 285889,
 285766,
 286284,
 286668,
 288847,
 286160,
 286805,
 288982,
 288859,
 285791,
 286306,
 287714,
 288612,
 286566,
 288623,
 286328]

And you can get all the details about those points using `get_points_by_ids`:

In [13]:
sensor_metadata = client.get_points_by_ids(points)
pd.DataFrame(sensor_metadata).head()

Unnamed: 0,id,building_id,last_updated,first_updated,device,network_device,objectId,name,description,units,tagged_units,raw_unit_id,value,type,point_type_id,measurement_id,datasource_hash,topic,state_text,equip_id
0,284803,427,1665012000000.0,1626898000000.0,,,,zone_temp_1,Zone Temp,degreesFahrenheit,f,2,70.0,Zone Temperature,77,1,41140c75b1f87dc3a1d9d158efdd51ad,onboard/officebldg1/device1003815/analogInput/2,,27133
1,284822,427,1665012000000.0,1626898000000.0,,,,zone_temp_1,Zone Temp,degreesFahrenheit,f,2,70.0,Zone Temperature,77,1,d8b90a1d2b4d098ba8753d972d2030ac,onboard/officebldg1/device1003816/analogInput/2,,27134
2,284841,427,1665012000000.0,1626898000000.0,,,,zone_temp_1,Zone Temp,degreesFahrenheit,f,2,70.0,Zone Temperature,77,1,6e8253c604490a22bf4f5c2e5c0d8430,onboard/officebldg1/device1003817/analogInput/2,,27135
3,284860,427,1665012000000.0,1626898000000.0,,,,zone_temp_1,Zone Temp,degreesFahrenheit,f,2,70.0,Zone Temperature,77,1,f7f34df6351edaaa6aca3218cb10bd12,onboard/officebldg1/device1003818/analogInput/2,,27136
4,285621,427,1665012000000.0,1626898000000.0,,,,zone_temp_1,Zone Temp,degreesFahrenheit,f,2,74.8,Zone Temperature,77,1,fa4560f8a10571788756c5f65cb6352e,onboard/officebldg1/device10311/analogInput/4,,27123


Let's stop for a moment on `last_updated` and `first_updated`. They look weird for datetimes, right? This is because they are in [Unix Time](https://en.wikipedia.org/wiki/Unix_time), expressed as the milliseconds elapsed since the Unix epoch, which is 00:00:00 UTC on 1 January 1970 (an arbitrary date). Fortunately, it's very easy to convert those milliseconds to UTC:

In [8]:
from datetime import datetime, timezone, timedelta

# The starting point of Unix Time
datetime.fromtimestamp(0, timezone.utc)

datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)

We can do the same for each value in `last_updated` using pandas' [`apply` method](https://pandas.pydata.org/docs/reference/api/pandas.Series.apply.html) (remember that we are converting milliseconds, so we divide by 1000). First, let's put the metadata into a data frame:

In [18]:
sensor_metadata = pd.DataFrame(sensor_metadata)

Let's inspect the datetime period we are working with; we have to convert the `first_updated` and `last_updated` from unix timestamp to a datetime, like we just did, to find the limits.

In [19]:
sensor_metadata.first_updated.apply(lambda x: datetime.fromtimestamp(x/1000, timezone.utc)).min()

Timestamp('2021-07-21 20:04:07.444000+0000', tz='UTC')

In [20]:
sensor_metadata.last_updated.apply(lambda x: datetime.fromtimestamp(x/1000, timezone.utc)).max()

Timestamp('2022-10-05 23:12:41.215000+0000', tz='UTC')
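As an alternative to `apply` with a lambda, pandas can convert a whole column of millisecond timestamps in one call with `pd.to_datetime(..., unit="ms", utc=True)`. A minimal sketch on two sample values (the epoch itself and a value like those in `first_updated`):

```python
import pandas as pd

# Millisecond timestamps like those in `first_updated`; 0 is the Unix epoch itself
ms_values = pd.Series([0, 1626898000000.0])

# unit="ms" tells pandas the numbers are milliseconds; utc=True makes the result timezone-aware
converted = pd.to_datetime(ms_values, unit="ms", utc=True)
print(converted)
```

Applied to the metadata, the same call would take `sensor_metadata.first_updated` in place of `ms_values`.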

We have data from between 2021 and 2022, which we will use soon when getting the time-series using Onboard Client's methods for [querying time-series data](https://onboard-data-python-client-api.readthedocs.io/en/latest/Querying%20Building-Specific%20Data.html#querying-time-series-data). Let's import the needed libraries first:

In [10]:
import pytz
from onboard.client.models import TimeseriesQuery, PointData
from onboard.client.dataframes import points_df_from_streaming_timeseries

First we have to select our time period in UTC. As we saw before, we have 2021-2022 data. Let's select a period of data, using the [python library datetime](https://docs.python.org/3/library/datetime.html) to create datetime objects. Remember that all the data from the API is in UTC, so you have to make all your datetimes timezone-aware; [PyTZ](https://pythonhosted.org/pytz/) is a library designed for that purpose.

In [11]:
# Select your timezone
tz = pytz.timezone('UTC')

# to set specific absolute times:
# start = tz.localize(datetime(2022,9,15,0,0,0))
# end = tz.localize(datetime(2022,10,1,0,0,0))

#set relative dates/times
end = datetime.now(pytz.utc)
start = end - timedelta(days = 30)

print(f"from {start} to {end}")

from 2022-10-01 20:31:59.691301+00:00 to 2022-10-31 20:31:59.691301+00:00
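If you prefer to think in a building's local time (both example buildings report `America/New_York`), you can localize a naive datetime and then convert it to UTC. With PyTZ, `localize` is the recommended way to attach a non-UTC timezone, since `replace(tzinfo=...)` can pick up a historical offset. A short sketch:

```python
from datetime import datetime
import pytz

local_tz = pytz.timezone("America/New_York")  # timezone reported for the example buildings

# Midnight local time on 15 September 2022, attached with localize()
local_start = local_tz.localize(datetime(2022, 9, 15, 0, 0, 0))

# Convert to UTC before handing the value to a query
utc_start = local_start.astimezone(pytz.utc)
print(utc_start.isoformat())  # 2022-09-15T04:00:00+00:00
```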


And let's get the time-series data from the previously selected `points`. Using `TimeseriesQuery` we create the query we want to execute:

In [23]:
timeseries_query = TimeseriesQuery(point_ids = points[0:10], start = start, end = end)
timeseries_query

TimeseriesQuery(start=datetime.datetime(2022, 9, 5, 23, 15, 7, 37360, tzinfo=<UTC>), end=datetime.datetime(2022, 10, 5, 23, 15, 7, 37360, tzinfo=<UTC>), selector=None, point_ids=[284803, 286085, 285964, 286353, 284822, 285719, 288153, 289701, 289575, 284841], units={})
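Note that the query above only requests the first ten ids (`points[0:10]`). One way to cover the full list is to split it into batches and build one query per batch. The `batched` helper below is a hypothetical utility written for this sketch, and whether batching is actually needed for your data volume is an assumption.

```python
def batched(ids, size):
    """Yield consecutive slices of `ids` containing at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for i in range(0, len(ids), size):
        yield ids[i:i + size]

example_ids = list(range(32))  # stand-in for the 32 point ids selected earlier
batches = list(batched(example_ids, 10))
print([len(b) for b in batches])  # [10, 10, 10, 2]
```

Each batch could then be passed as `point_ids` to its own `TimeseriesQuery`, with the same `start` and `end`.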

And now we execute it and convert the result to a data frame:

In [24]:
# Execute query (will return an object)
query_results = client.stream_point_timeseries(timeseries_query)

# Convert to dataframe
sensor_data = points_df_from_streaming_timeseries(query_results)

In [25]:
sensor_data

Unnamed: 0,timestamp,285719,288153,284803,286085,289575,289701,284841,285964,284822,286353
0,2022-09-05T23:15:08.350000Z,,,,,,77.0,,,,
1,2022-09-05T23:15:10.307000Z,,,,,77.0,,,,,
2,2022-09-05T23:15:35.186000Z,75.599998,,,,,,,,,
3,2022-09-05T23:15:41.962000Z,,,,,,,,74.800003,,
4,2022-09-05T23:15:42.640000Z,,72.400002,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...
413529,2022-10-05T23:12:05.351000Z,,,,,,,70.0,,,
413530,2022-10-05T23:12:08.740000Z,73.900002,,,,,,,,,
413531,2022-10-05T23:12:17.636000Z,,73.300003,,,,,,,,
413532,2022-10-05T23:12:20.653000Z,,,70.0,,,,,,,
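As the output shows, each row holds a reading for a single point and `NaN` everywhere else, because the sensors report at slightly different instants. One common way to get a denser table is to parse the timestamps and resample onto a regular grid. The sketch below uses a tiny mock with the same shape as `sensor_data`; the 15-minute interval is an arbitrary choice for illustration.

```python
import pandas as pd

nan = float("nan")

# Tiny mock with the same shape as `sensor_data`: one reading per row, NaN elsewhere
mock_sensor_data = pd.DataFrame(
    {
        "timestamp": [
            "2022-09-05T23:15:08.350000Z",
            "2022-09-05T23:15:35.186000Z",
            "2022-09-05T23:20:10.000000Z",
            "2022-09-05T23:31:00.000000Z",
        ],
        284803: [70.0, nan, 72.0, nan],
        285719: [nan, 75.0, nan, 76.0],
    }
)

# Parse the ISO strings, index by time, and average readings within each 15-minute bin
regular = (
    mock_sensor_data
    .assign(timestamp=pd.to_datetime(mock_sensor_data["timestamp"], utc=True))
    .set_index("timestamp")
    .resample("15min")
    .mean()
)
print(regular)
```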


And now we have the time-series for our selected points. You can export it as a CSV file:

In [None]:
sensor_data.to_csv("sensor_data.csv", index=False)

Finally, let's visualize our sensor data! (We will go a bit deeper into this subject in the following notebooks.)

In [105]:
import matplotlib.pyplot as plt
import seaborn as sns

# This is for the visual style, I like "ggplot"
plt.style.use('ggplot')

# This for the figure size
plt.rcParams["figure.figsize"] = (20,9)

In [None]:
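plot_data = sensor_data.assign(timestamp=pd.to_datetime(sensor_data["timestamp"])).set_index("timestamp")
sns.lineplot(data=plot_data) # wide-form data: one line per point id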

In [None]:
timeseries_query