## OAI_PI_Connector

Welcome to the README for the OAI PI Connector. By the end of this session, you should have learned the following Optimus principles:

- understanding the PI Web API and its importance
- configuring server parameters
- exploring a PI server
- creating a tag dictionary
- generating unified sample data for multiple sensors
- streaming all recorded values for sensors

---

## Where to find the OAI connector
The connector lives under `src/optimus_core/core/data_ingestion`. To understand how the Optimus codebase is structured, it helps to have worked through the [Kedro docs and tutorials](https://kedro.readthedocs.io/en/stable/).


```
└── src
    └── optimus_core
        └── core
            ├── data_ingestion
            │   ├── __init__.py
            │   ├── pi_connector.py
            │   ├── streams.py
            │   └── utils.py
```
---

## Overview
- OSIsoft develops the PI System, application software for real-time data management. It is widely used in industrial production, e.g. mining, pharmaceutical manufacturing, and basic materials production.
- The PI System collects and stores data and provides tools to extract and analyze it.
- The OSIsoft software suite includes PI ProcessBook, PI Vision, PI BatchView, PI System Management Tools, PI DataLink, and PI Web API. PI Web API allows interaction with the PI server through an API from any language.

## Why use the PI Web API connector

Now that we know where to find the connector code, let's talk about why to use it.

- Collect metadata for all sensors without SME input, i.e. generate an initial **tag dictionary**.
- Retrieve data for multiple sensors with a single [batch API query](https://devdata.osisoft.com/piwebapi/help/controllers/batch/actions/execute). **Note: this helps decrease server load.**
- Work with 3 different types of data `streams`:
    - `summary`: summarizes all data captured within a time interval. The summary includes total, mean, standard deviation, last value, etc.
    - `recorded pi points`: all data changes captured within the specified time frame
    - `current value`: the last captured data point
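
To make the batch idea concrete, here is a minimal sketch of a batch request body with one sub-request per sensor. It only builds the payload, which would then be POSTed as JSON to the `/piwebapi/batch` endpoint. The helper name `build_batch_payload` and the placeholder WebIDs are assumptions for illustration, not part of the connector.

```python
# Base URL of the OSIsoft dev endpoint used throughout this README.
BASE_URL = "https://devdata.osisoft.com/piwebapi"

# Final path segment of the stream resource for each stream type.
STREAM_TYPES = {"value", "recorded", "summary"}


def build_batch_payload(web_ids, stream_type="value", base_url=BASE_URL):
    """Return a batch request body with one GET sub-request per WebID.

    Hypothetical helper: the connector's own implementation may differ.
    """
    if stream_type not in STREAM_TYPES:
        raise ValueError(f"unsupported stream type: {stream_type}")
    return {
        # Each key is an arbitrary request id, echoed back in the response.
        str(i): {
            "Method": "GET",
            "Resource": f"{base_url}/streams/{web_id}/{stream_type}",
        }
        for i, web_id in enumerate(web_ids, start=1)
    }


# Placeholder WebIDs; real ones come from the tag dictionary.
payload = build_batch_payload(["WEBID_A", "WEBID_B"], stream_type="recorded")
```

Because all sub-requests travel in one HTTP call, the server handles a single round trip instead of one per sensor.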

### PI Web API concepts
#### WebID
Resources in PI Web API are addressed by WebIDs, which are persistent, URL-safe identifiers that encode the GUIDs and/or paths associated with objects in the PI System. WebIDs are often used in links, and can also be found inside JSON responses. Client applications can then use these WebIDs as opaque identifiers in other URLs or query parameters. Because WebIDs are persistent, clients may cache URLs containing WebIDs for future use.
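
As a rough illustration, a WebID for a PI point can be looked up by its path, which has the form `\\<data server>\<tag name>`. The sketch below only builds the lookup URL for the `points` controller; the JSON returned by that request carries the identifier in its `WebId` field. The function name `point_lookup_url` is an assumption for this example.

```python
from urllib.parse import urlencode


def point_lookup_url(data_server, tag_name,
                     base_url="https://devdata.osisoft.com/piwebapi"):
    """Build the URL that resolves a PI point path to its metadata.

    Hypothetical helper for illustration only.
    """
    # PI point paths look like \\SERVER\TAG
    path = f"\\\\{data_server}\\{tag_name}"
    # urlencode percent-encodes the backslashes and the colon in the tag name
    return f"{base_url}/points?{urlencode({'path': path})}"


url = point_lookup_url("PISRV1", "BA:TEMP.1")
print(url)
```

Once resolved, the WebID can be cached and reused in stream URLs without repeating the lookup.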



## Parametrization
To test this API, we use the OSIsoft developer API endpoint (`devdata.osisoft.com`).

#### Example `params_ingestion.yml`
```yaml
# user name created on the PI server
pi_api_user: "webapiuser"
# password created on the PI server
pi_api_password: "!try3.14webapi!"
# PI server endpoint
pi_endpoint: "devdata.osisoft.com"
# timezone to return data in; the full list is available in pytz
timezone: "UTC"
# API security certificate for the server (.pem file)
api_certificate: False

# Times should be in ISO 8601 datetime format or a PI time string; see
# https://devdata.osisoft.com/piwebapi/help/topics/time-strings for more
# stream start time
start_time: "*-7d"    # 7 days before now
# stream end time
end_time: "*"    # `*` represents the current time

# interval is paired with the hourly flag
# e.g. interval=1 & hourly=True means 1 hour; interval=1 & hourly=False means 1 minute
interval: 1
hourly: True

# type of aggregation to use on the sensor data
# https://devdata.osisoft.com/piwebapi/help/topics/summary-type for others
aggregation: "Average"

# list of tags to stream data
tag_list: ["CDEP158", "CDF144_Repeated24h_forward", "CDM158"]
```
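
The pairing of `interval` and `hourly` can be read as a PI duration string such as `1h` or `15m`. The sketch below shows one way that mapping might look. `StreamParams` and `summary_duration` are hypothetical names, and how the connector translates these parameters internally is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamParams:
    """Hypothetical container mirroring the stream settings in the YAML above."""

    start_time: str = "*-7d"
    end_time: str = "*"
    interval: int = 1
    hourly: bool = True
    aggregation: str = "Average"

    @property
    def summary_duration(self) -> str:
        """Return the interval as a PI duration string, e.g. '1h' or '15m'."""
        if self.interval < 1:
            raise ValueError("interval must be a positive integer")
        unit = "h" if self.hourly else "m"
        return f"{self.interval}{unit}"


params = StreamParams(interval=15, hourly=False)
print(params.summary_duration)  # 15m
```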

### Example Usage

#### Authenticate and validate the PI Web API connection

```python
# set param vars
pi_api_user, pi_api_password = "webapiuser", "!try3.14webapi!"
pi_endpoint = "devdata.osisoft.com"
tag_list = ["CDEP158", "CDF144_Repeated24h_forward", "CDM158"]
```

```python
from optimus_core.core.data_ingestion.streams import OAICurrentValueStreams

current_value_stream = OAICurrentValueStreams(pi_endpoint, pi_api_user, pi_api_password)
```

#### Use OAIConnector to retrieve all plant sensor properties, which can be used to create a tag dictionary

```python
tag_dict = current_value_stream.plant_data_attrs
tag_dict
```

|   | web_id | name | description | data_type | eng_units | low_limit | high_limit | value_link | data_server_id | data_server_name |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | F1DPW6Wlk0_Utku9vWTvxg45oACQAAAAUElTUlYxXEJBOk... | BA:ACTIVE.1 | Batch Active Reactor 1 | Digital | STATE | 1 | 1 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| 1 | F1DPW6Wlk0_Utku9vWTvxg45oACAAAAAUElTUlYxXEJBOk... | BA:CONC.1 | Concentration Reactor 1 | Float32 | DEG. C | 0 | 200 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| 2 | F1DPW6Wlk0_Utku9vWTvxg45oABwAAAAUElTUlYxXEJBOk... | BA:LEVEL.1 | Level Reactor 1 | Float32 |  | 0 | 100 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| 3 | F1DPW6Wlk0_Utku9vWTvxg45oACgAAAAUElTUlYxXEJBOl... | BA:PHASE.1 | Phase Reactor 1 | Digital | STATE | 2 | 7 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| 4 | F1DPW6Wlk0_Utku9vWTvxg45oABgAAAAUElTUlYxXEJBOl... | BA:TEMP.1 | Temperature Reactor 1 | Float32 |  | 0 | 100 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | F1DPW6Wlk0_Utku9vWTvxg45oAzegAAAUElTUlYxXENJVF... | CityBikes_austin_84d65c63f179f3dacf87104409dbf... |  | Int32 |  | 0 | 100 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| 996 | F1DPW6Wlk0_Utku9vWTvxg45oAzugAAAUElTUlYxXENJVF... | CityBikes_austin_8723bfa08ec83b133f6a9aeecd075... |  | Int32 |  | 0 | 100 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| 997 | F1DPW6Wlk0_Utku9vWTvxg45oAz-gAAAUElTUlYxXENJVF... | CityBikes_austin_8723bfa08ec83b133f6a9aeecd075... |  | Int32 |  | 0 | 100 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |
| 998 | F1DPW6Wlk0_Utku9vWTvxg45oA0OgAAAUElTUlYxXENJVF... | CityBikes_austin_889e7ef356b0f9c187e604560c022... |  | Int32 |  | 0 | 100 | `https://devdata.osisoft.com/piwebapi/streams/F...` | F1DSW6Wlk0_Utku9vWTvxg45oAUElTUlYx | PISRV1 |


### Query the server with a list of tags for current sensor data
- Required parameter: list of sensors
- Uses a batch API call

#### Extra Params
`timezone`: timezone to retrieve data in

```python
current_data_stream = current_value_stream.query_sensor_data(tag_list)
current_data_stream
```

|   | timestamp | CDF144_Repeated24h_forward | CDEP158 | CDM158 |
|---|---|---|---|---|
| 0 | 2020-08-24 19:38:37.190334 | 17719.2832 | 47 | 1 |
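
For intuition, a batch response is keyed by the same request ids that were sent, and each entry carries a `Status` and a `Content` body. The sketch below flattens such a response into one row of current values. The function name `batch_to_row` is hypothetical, and `sample_response` is a hand-written stand-in, not real server output.

```python
def batch_to_row(batch_response, tags_by_id):
    """Flatten a batch response into {tag: current value}.

    Hypothetical helper; failed sub-requests map to None.
    """
    row = {}
    for request_id, tag in tags_by_id.items():
        result = batch_response.get(request_id, {})
        if result.get("Status") != 200:
            row[tag] = None
            continue
        value = result["Content"]["Value"]
        # Digital points report their state as an object; keep its name.
        if isinstance(value, dict):
            value = value.get("Name")
        row[tag] = value
    return row


# Hand-written sample for illustration only.
sample_response = {
    "1": {"Status": 200, "Content": {"Timestamp": "2020-08-24T19:38:37Z", "Value": 47}},
    "2": {"Status": 404, "Content": {"Errors": ["not found"]}},
}
row = batch_to_row(sample_response, {"1": "CDEP158", "2": "CDM158"})
print(row)
```

Mapping failed sub-requests to `None` keeps one bad tag from discarding the values of the others.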


### Query the server for a unified stream of summary data from multiple sensors at a specified interval
For most Optimus projects, ingesting data with the right summary (average, max, standard deviation, etc.) is critical to the success of the model. With this module, we can stream data and specify the level of aggregation to use, letting PI do the calculations for us.

- Required parameter: list of sensors
- Available aggregations: None, Total, Average, Minimum, Maximum, Range, StdDev, PopulationStdDev, Count

#### Extra Params
`timezone`: timezone to retrieve data in. Default is UTC <br>
`start_time`: start time of stream (ISO format date) <br>
`end_time`: end time of stream (ISO format date) <br>
`interval`: optional stream interval. Default is 1 hour <br>
`hourly`: optional hourly stream flag. Default is True <br>
`aggregation`: optional aggregation method. Default is Average
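
To show how these parameters might reach the server, the sketch below builds the URL of a stream summary request using the PI Web API query parameters `startTime`, `endTime`, `summaryType`, and `summaryDuration`. The function name `summary_url` and the placeholder WebID are assumptions; the connector's actual request construction is not shown in this README.

```python
from urllib.parse import urlencode


def summary_url(web_id, start_time="*-7d", end_time="*",
                aggregation="Average", duration="1h",
                base_url="https://devdata.osisoft.com/piwebapi"):
    """Build the URL of a summary request for one stream (illustrative only)."""
    query = urlencode({
        "startTime": start_time,
        "endTime": end_time,
        "summaryType": aggregation,
        "summaryDuration": duration,
    })
    return f"{base_url}/streams/{web_id}/summary?{query}"


# "WEBID_A" is a placeholder; real WebIDs come from the tag dictionary.
url = summary_url("WEBID_A", aggregation="Maximum", duration="15m")
print(url)
```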

```python
from optimus_core.core.data_ingestion.streams import OAISummaryStreams

summary_data_stream = OAISummaryStreams(pi_endpoint, pi_api_user, pi_api_password)
tag_list = ["CDEP158", "CDF144_Repeated24h_forward", "CDM158"]
current_data_stream = summary_data_stream.query_sensor_data(tag_list)

current_data_stream
```

|   | timestamp | CDF144_Repeated24h_forward | CDEP158 | CDM158 |
|---|---|---|---|---|
| 0 | 2020-08-23 21:00:00 | 17459.1 | 11.1243 | 1 |
| 1 | 2020-08-23 22:00:00 | 17516.5 | 25.9469 | 2 |
| 2 | 2020-08-23 23:00:00 | 17481.9 | 30.0573 | 0 |
| 3 | 2020-08-24 00:00:00 | 17601.3 | 24.7097 | 2 |
| 4 | 2020-08-24 01:00:00 | 17640.3 | 26.0903 | 3 |
| 5 | 2020-08-24 02:00:00 | 17434.4 | 26.5488 | 3 |
| 6 | 2020-08-24 03:00:00 | 17596.1 | 14.6329 | 4 |
| 7 | 2020-08-24 04:00:00 | 17516.4 | 21.2328 | 1 |
| 8 | 2020-08-24 05:00:00 | 17558.4 | 26.396 | 0 |
| 9 | 2020-08-24 06:00:00 | 17533.0 | 28.5607 | 2 |


### Query the server for a stream of recorded values from multiple sensors
This is mostly used to validate the data flow from a sensor, e.g. validating a control loop by analyzing all recorded sensor changes.

**Note: this returns a list of DataFrames, one per sensor. Access each element of the list to get the required data.**

- Required parameter: list of sensors

#### Extra Params
`timezone`: timezone to retrieve data in. Default is UTC <br>
`start_time`: start time of stream (ISO format date) <br>
`end_time`: end time of stream (ISO format date) <br>

```python
from optimus_core.core.data_ingestion.streams import OAIRecordedStreams

recorded_data_stream = OAIRecordedStreams(pi_endpoint, pi_api_user, pi_api_password)
current_data_stream = recorded_data_stream.query_sensor_data(tag_list)
```

```python
# first sensor
current_data_stream[0]
```

|   | timestamp | CDEP158 |
|---|---|---|
| 0 | 2020-08-23 20:15:58 | 7 |
| 1 | 2020-08-23 22:01:28 | 34 |
| 2 | 2020-08-23 23:13:58 | 24 |
| 3 | 2020-08-24 01:39:28 | 28 |
| 4 | 2020-08-24 02:40:28 | 9 |
| 5 | 2020-08-24 03:03:28 | 19 |
| 6 | 2020-08-24 05:00:28 | 29 |
| 7 | 2020-08-24 07:14:58 | 27 |
| 8 | 2020-08-24 08:51:58 | 38 |
| 9 | 2020-08-24 10:07:58 | 48 |


```python
# second sensor
current_data_stream[1]
```

|   | timestamp | CDF144_Repeated24h_forward |
|---|---|---|
| 0 | 2020-08-23 20:01:28 | 17523.7 |
| 1 | 2020-08-23 20:01:58 | 17145.5 |
| 2 | 2020-08-23 20:02:58 | 17215.8 |
| 3 | 2020-08-23 20:03:28 | 17529.7 |
| 4 | 2020-08-23 20:03:58 | 17523.8 |
| ... | ... | ... |
| 1027 | 2020-08-24 19:51:58 | 17640.1 |
| 1028 | 2020-08-24 19:55:28 | 17552.8 |
| 1029 | 2020-08-24 19:55:58 | 17117.8 |
| 1030 | 2020-08-24 19:57:58 | 17283.2 |
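
Because recorded values arrive on each sensor's own timestamps, the per-sensor results rarely line up row for row. The sketch below shows one way to outer-join them on timestamp using plain dictionaries; with pandas the same result would typically come from an outer merge on the `timestamp` column. `unify_series` is a hypothetical helper, and the sample uses the first readings shown above.

```python
def unify_series(series_by_tag):
    """Outer-join per-tag {timestamp: value} mappings on timestamp.

    Timestamps missing for a tag are filled with None.
    """
    tags = list(series_by_tag)
    # Same-format timestamp strings sort chronologically as plain text.
    timestamps = sorted({ts for series in series_by_tag.values() for ts in series})
    return [
        {"timestamp": ts, **{tag: series_by_tag[tag].get(ts) for tag in tags}}
        for ts in timestamps
    ]


recorded = {
    "CDEP158": {"2020-08-23 20:15:58": 7, "2020-08-23 22:01:28": 34},
    "CDF144_Repeated24h_forward": {
        "2020-08-23 20:01:28": 17523.7,
        "2020-08-23 20:01:58": 17145.5,
    },
}
rows = unify_series(recorded)
for row in rows:
    print(row)
```

The `None` gaps make it explicit where a sensor recorded no change, which is useful when validating a control loop.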
