# Beer Cooling Modeling - Solution - Bifrost-OCS Interpolation 

---
## NOTE: This notebook has been modified to support the latest version of Dataviews and SDS interpolation. More specifically:

## * Digital state errors are returned as null (empty string) from Dataviews; they are mapped to the string 'Bad Input' before being stored in a pandas dataframe.

## * User-defined digital states are returned as integer values instead of strings.
---

Using the same dataset as the ADF Prediction Demo notebook, this time we'll model the cooling phase. As with the fermentation stages in the ADF Prediction notebook, we need to identify the cooling stages and compute elapsed times to align the data for regression and comparison.

![Beer Cooling](https://academicpi.blob.core.windows.net/software/beer-cooling-setting.png)

In [22]:
# For interaction with OCS
# from ocs_academic_hub import OCSClient, timer
from ocs_datascience import OCSClient, timer

import configparser
import datetime as dt
from dateutil import parser
import functools
import time
from enum import Enum
from pathlib import Path

import plotly.graph_objs as go
import plotly.io as po
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

pd.set_option('display.expand_frame_repr', False)
pd.options.mode.chained_assignment = None

The main function is `compute_cooling_predictions` with the following specification: 

### Input parameters:

* Brand of beer
* Which set of temperature sensors to use: bottom, middle, top
* Training days: how many days (starting at 2017-03-17) to consider for cooling curve regression

### Output: 

* Data used for regression
* Data for regression curve 
* Number of fermentations found (must be at least 1)

### Function steps (the numbers are referred to in the function body)

| Step # | Function called | Description |  
|-------|-----------------|:-------------:|
| 0 | `get_all_brand_data` | get data for all 6 fermenters (this step happens before calling `compute_cooling_predictions`) |
| 1 | none | keep only data of the selected brand given in input |
| 2 | `brand_df_cleanup` | clean data: remove bad values, keep only right stages |
| 3 | `fermentation_starts` | identify all fermentation starts |
| 4 | `cooling_data_extraction` | build a dataframe with all cooling data |
 
All possible beer brands are:
* Realtime Hops (3)
* 5450
* Alistair
* Kerberos
* Red Wonder 
* Grey Horse 

We'll start with the following input parameters: 

* Brand: Realtime Hops
* Temperature sensor: Middle
* Training days: 20 days starting at 2017-03-17T07:00
* Interval: 2 minutes (00:02:00)

## Your task 

Function `compute_cooling_predictions` in the next cell contains `TODO` items in comments. Complete each of them to get a working notebook. If your code is correct, you should see the following graph appear at the bottom of this notebook:

![Beer Cooling Prediction](https://academicpi.blob.core.windows.net/software/beer-cooling-prediction.png)

## Function `compute_cooling_predictions`

In [23]:
# %%debug
# import pdb
# from pdb import set_trace as bp
@timer
def compute_cooling_predictions(all_brands_df, brand, temp_sensors, training_days, interval='00:01:00'):
    """
    Input parameters:
    * brand to consider
    * temperature sensor position to use for computation
    * number of days to compute prediction parameters
    """
    # All possible brands, start with Realtime Hops 
    # ['5450' 'Bad Input' 5450 nan 'Alistair' 'Kerberos' 'Realtime Hops'
    #  'Red Wonder' 'Grey Horse']
    use_temp_position = {
        Pos.bottom: temp_sensors['bottom'],
        Pos.middle: temp_sensors['middle'],
        Pos.top: temp_sensors['top']
    }
    # STEP 1: Keep only data for input brand
    # TODO: write filter expression for all_brands_df, return result in brand_df
    # 
    # =========== STUDENT BEGIN ==========
    brand_df = all_brands_df[all_brands_df['Brand'] == brand] 
    # =========== STUDENT END ==========
    # 
    # STEP 2: clean data: remove bad values, keep only right stages
    # TODO: complete code block within function brand_df_cleanup 
    # 
    brand_status_df = brand_df_cleanup(brand_df)
    # 
    # STEP 3: identify all fermentation starts
    # TODO: complete code of function fermentation_starts 
    #
    fermentation_df = fermentation_starts(brand_status_df)
    #
    if len(fermentation_df) == 0:  
        raise Exception(f'!!! No fermentation data for brand: {brand}')
    else:
        print(f'  @@@ Number of fermentation for brand {brand}: {len(fermentation_df)}')
    # 
    # STEP 4: build a dataframe with all cooling data 
    # TODO: complete code of function cooling_data_extraction
    #
    cooling_data = cooling_data_extraction(fermentation_df, brand_status_df, use_temp_position)
    # print(cooling_data)
    # 
    # Verify that it was possible to extract the data for a complete cooling phase 
    # 
    if len(cooling_data) == 0:
        raise Exception(f'!!! Error, no cooling data for brand: {brand}')
    else:       
        ############### CURVE FIT REGRESSION BEGIN - DO NOT CHANGE #############
        # Get all cooling data in a single dataframe
        cool_df = pd.concat(cooling_data)

        # sort the temperatures in a descending fashion
        cool_df = cool_df.sort_values(by=['temperature'], ascending=False)

        # get the y value for the x, this will be used in curve fitting
        cool_df['temp_y'] = cool_df['temperature'].shift(-1)
        cool_df = cool_df[:-1]  # drop the last row

        # Select first label which has cooling data
        cool_df_training = pd.DataFrame()
        lbl = 0
        while cool_df_training.empty:
            cool_df_training = cool_df[cool_df.label == lbl]
            lbl += 1
        print(cool_df_training.Volume.unique())    
        x1_train = cool_df_training.temperature.values  # training temperature feature
        x2_train = cool_df_training.Volume.values.astype(float)  # training Volume feature
        x = [x1_train, x2_train]  # [temperature, volume]

        # Training of non-linear least squares model
        # Nonlinear curve-fitting pass a tuple in curve fitting
        popt, pcov = curve_fit(temperature_profile, x, cool_df_training.temp_y.values) 
        
        a = popt[0]  # get the coefficient a (alpha) in the model
        b = popt[1]  # get the coefficient b (beta) in the model
 
        # Get the initial point of all temperature curves
        # y_first = [x1_train[0] + i for i in range(-8, 9, 4)]  # plot on either side of the initial temperature
        y_first = [x1_train[0]]  # if you want to plot a single data field

        # Compute the prediction for each individual start temperature
        for y_predicted in y_first:
            y_pred = [y_predicted]
            cool_df_training = cool_df_training.sort_values(by=['tsc'])
            for i in range(1, len(x2_train)):
                y_predicted = y_predicted * (1 + (a / x2_train[i])) - (a * b / x2_train[i])
                y_pred.append(y_predicted)
                
        ############### CURVE FIT REGRESSION END - DO NOT CHANGE #############

    return cool_df, y_pred, cool_df_training, y_first[0], len(fermentation_df), len(cooling_data)

### Standard OCS initialization code

In [24]:
config = configparser.ConfigParser()
config.read('config.ini')

ocs_client = OCSClient(config.get('Access', 'ApiVersion'),config.get('Access', 'Tenant'), config.get('Access', 'Resource'), 
                     config.get('Credentials', 'ClientId'), config.get('Credentials', 'ClientSecret'))

namespace_id = config.get('Configurations', 'Namespace')
headers = ocs_client.authorization_headers(namespace_id)
headers

{'Authorization': 'bearer <token redacted>',
 'Content-type': 'application/json',
 'Accept': 'text/plain',
 'Request-Timeout

### Auxiliary variables to make code more readable 

In [25]:
# Sensor positions 
class Pos(Enum):
    bottom = 1
    middle = 2
    top = 3

# Legend: 
# TIC == Temperature Indicator Controller, PV == Process Value

# TIC PV column names 
TIC_PV_COLUMNS = ['Bottom TIC PV', 'Middle TIC PV', 'Top TIC PV']
# Dictionary of column names indexed by position 
process_value = {Pos.bottom: 'Bottom TIC PV', Pos.middle: 'Middle TIC PV', Pos.top: 'Top TIC PV'}

# TIC OUT column names 
TIC_OUT_COLUMNS = ['Bottom TIC OUT', 'Middle TIC OUT', 'Top TIC OUT'] 

# Digital states - present in Dataview results, indicates a problem
BAD_INPUT = 'Bad Input'

## NOTE: USING_OCS_DATAVIEWS should be set to True for the modifications described at the top of this notebook to kick in 

In [26]:
USING_OCS_DATAVIEWS = True
if not USING_OCS_DATAVIEWS: 
    FERMENTATION_STAGE = 'Fermentation'
    IO_TIMEOUT = 'I/O Timeout'
    COMM_FAIL = 'Comm Fail'
    # All stages associated to the full cooling phase 
    #                                 7              9              10           11
    POST_FERMENTATION_STAGES = ['Fermentation', 'Free Rise', 'Diacetyl Rest', 'Cooling']
else:
    # All 'null' values indicating a system digital state are mapped to 'Bad Input' by our code
    IO_TIMEOUT = BAD_INPUT
    COMM_FAIL = BAD_INPUT
    
    # As of now, SDS returns the numerical value of user-defined digital state instead of the string name
    FERMENTATION_STAGE = 7
    POST_FERMENTATION_STAGES = [7, 9, 10, 11]
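
Because statuses come back as integers when Dataviews are used, a small lookup can make intermediate dataframes easier to read while debugging. The sketch below assumes the code-to-name pairing suggested by the comment in the cell above (7, 9, 10, 11); `STATUS_NAMES` and `status_name` are hypothetical helpers, not part of `ocs_datascience`.

```python
import pandas as pd

# Assumed pairing, taken from the comment above POST_FERMENTATION_STAGES
STATUS_NAMES = {7: 'Fermentation', 9: 'Free Rise', 10: 'Diacetyl Rest', 11: 'Cooling'}

def status_name(status_column):
    """Map status codes to readable names; anything unknown becomes 'Other'."""
    # Coerce strings such as '11' to numbers and 'Bad Input' to NaN
    codes = pd.to_numeric(status_column, errors='coerce')
    return codes.map(lambda code: STATUS_NAMES.get(code, 'Other'))

demo = pd.Series([7, 9.0, '11', 12, 'Bad Input'])
print(status_name(demo).tolist())
```

This is only meant for inspection; the filtering code in this notebook keeps working on the integer codes.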

### STEP 0 Cell: get fermenter vessels data 

Complete the function `get_all_brand_data` using what you've seen in the ADF Prediction notebook.

In [27]:
@timer
def get_all_brand_data(num_days, start_timestamp, interval):
    #
    # 
    # TODO: complete code to return a single dataframe with all the required data 
    #   
    # =========== STUDENT BEGIN ==========
    start_time = parser.parse(start_timestamp)
    delta_time = dt.timedelta(days=num_days)
    end_timestamp = (start_time + delta_time).isoformat()
    df = ocs_client.get_all_fermenters_dataviews(start_timestamp, end_timestamp, interval)
    # =========== STUDENT END ==========
    
    return df 

# Test code 
# all_brands_df = get_all_brand_data(20, '2017-03-17T07:00', '00:01:00')
# all_brands_df

### STEP 2 Cell: clean data 

Complete each `TODO` section in the function `brand_df_cleanup`

In [28]:
@timer
def brand_df_cleanup(brand_df):
    # TODO: Remove all data points with bad input. 
    # All the following columns can have value BAD_INPUT: 
    #   Brand, Status, Bottom TIC PV, Middle TIC PV, Top TIC PV, Volume
    #     
    brand_df = brand_df.drop(brand_df[brand_df['Brand'] == BAD_INPUT].index)
    brand_df = brand_df.drop(brand_df[brand_df['Status'] == BAD_INPUT].index)
    brand_df = brand_df.drop(brand_df[brand_df['Top TIC PV'] == BAD_INPUT].index)
    # =========== STUDENT BEGIN ==========
    brand_df = brand_df.drop(brand_df[brand_df['Middle TIC PV'] == BAD_INPUT].index) 
    brand_df = brand_df.drop(brand_df[brand_df['Bottom TIC PV'] == BAD_INPUT].index)
    brand_df = brand_df.drop(brand_df[brand_df['Volume'] == BAD_INPUT].index)
    # =========== STUDENT END ==========

    # Keep only fermentation or post-fermentation stages
    brand_status_df = brand_df[brand_df['Status'].isin(POST_FERMENTATION_STAGES)]

    # Remove all data points from brand_status_df dataframe with communication issues
    # TODO: for columns in TIC_PV_COLUMNS, remove all rows with communication failures status (COMM_FAIL)
    #            and IO timeout (IO_TIMEOUT) 
    for tic_pv in TIC_PV_COLUMNS:
        # =========== STUDENT BEGIN ==========
        brand_status_df = brand_status_df.drop(brand_status_df[brand_status_df[tic_pv] == IO_TIMEOUT].index)
        brand_status_df = brand_status_df.drop(brand_status_df[brand_status_df[tic_pv] == COMM_FAIL].index)
        # =========== STUDENT END ==========
        brand_status_df[tic_pv] = brand_status_df[tic_pv].astype(float)

    return brand_status_df
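
The repeated `drop` calls above can also be written as a single boolean mask over several columns. The following is a minimal sketch of that idea on made-up data; `drop_bad_rows` is a hypothetical helper and the demo values are illustrative only.

```python
import pandas as pd

BAD_INPUT = 'Bad Input'

def drop_bad_rows(df, columns):
    """Return df without the rows where any of `columns` equals BAD_INPUT."""
    # Cast to str so numeric columns can be compared with the marker string
    is_bad = df[columns].astype(str).eq(BAD_INPUT).any(axis=1)
    return df[~is_bad]

demo = pd.DataFrame({
    'Brand': ['3', 'Bad Input', '3'],
    'Volume': [716.5, 716.5, 'Bad Input'],
    'Status': [7, 7, 7],
})
clean = drop_bad_rows(demo, ['Brand', 'Volume', 'Status'])
print(clean)
```

Only the first demo row survives, since each of the other two carries the marker in one column.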

### STEP 3 Cell: get the list of rows when fermentation starts 

You need to identify rows where the Status is 'Fermentation' and the previous row is not 'Fermentation'. The syntax to access the status of the previous row is:

    brand_df['Status'].shift(1)
    
Moreover it is possible to combine conditions to select dataframe rows with the syntax:

    (condition1) & (condition2)
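
As a toy illustration of these two idioms together, the sketch below marks the rows where a made-up status series switches into fermentation. The status values are invented for the example; 7 is the fermentation code used when `USING_OCS_DATAVIEWS` is `True`.

```python
import pandas as pd

FERMENTATION_STAGE = 7

# Made-up status sequence: two separate fermentations
status = pd.Series([12, 7, 7, 9, 11, 7, 7])

# True where the current row is fermentation and the previous row is not
is_start = (status == FERMENTATION_STAGE) & (status.shift(1) != FERMENTATION_STAGE)
print(status[is_start].index.tolist())  # → [1, 5]
```

Note that `shift(1)` yields NaN for the first row, so a series that begins in fermentation is also reported as a start.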

In [29]:
# Return the list of rows where fermentation starts for a brand
@timer 
def fermentation_starts(brand_df):
    # =========== STUDENT BEGIN ==========
    df = brand_df[(brand_df['Status'] == FERMENTATION_STAGE) & (brand_df['Status'].shift(1) != FERMENTATION_STAGE)]
    # =========== STUDENT END ==========
    fermentation_starts = [row for _, row in df.iterrows()]
    return fermentation_starts

### STEP 4: Extract all rows related to cooling phase

In [30]:
@timer
def cooling_data_extraction(fermentation_df, brand_status_df, use_temp_position):
    # Provides the corrected time offset post fermentation
    brand_status_df = fermentation_times(brand_status_df, fermentation_df, brand)

    for tic_out in TIC_OUT_COLUMNS:
        brand_status_df[tic_out] = pd.to_numeric(brand_status_df[tic_out], errors='coerce')
        
    # condition for it to be in cooling phase
    # TODO: the condition is that 'Top TIC OUT', 'Middle TIC OUT' and 'Bottom TIC OUT' are above 99.99
    #          
    # =========== STUDENT BEGIN ==========
    cool_stage = brand_status_df[
        (brand_status_df['Top TIC OUT'] > 99.99) &
        (brand_status_df['Middle TIC OUT'] > 99.99) &
        (brand_status_df['Bottom TIC OUT'] > 99.99)
    ]
    # =========== STUDENT END ==========

    # get the first cooling step for each fermentation stage
    cooling_start_frame = cool_stage.groupby('label').first().reset_index()

    # Collect data only for the selected temperature position 
    cooling_data = []
    for position in use_temp_position:
        if use_temp_position[position]:
            cooling_data.append(get_cooling_frames(cool_stage, cooling_start_frame, position))

    return cooling_data
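
The cooling condition and the "first row per label" step can be tried out on a small made-up frame. This is a sketch under the assumption that the three TIC OUT columns may contain non-numeric markers; it uses `all(axis=1)` as an equivalent way to combine the three comparisons.

```python
import pandas as pd

TIC_OUT_COLUMNS = ['Bottom TIC OUT', 'Middle TIC OUT', 'Top TIC OUT']

# Made-up rows: 'label' identifies the fermentation each row belongs to
demo = pd.DataFrame({
    'label': [0, 0, 0, 1, 1],
    'Bottom TIC OUT': [100.0, 100.0, 55.7, 100.0, 100.0],
    'Middle TIC OUT': [0.0, 100.0, 100.0, 100.0, 100.0],
    'Top TIC OUT': [100.0, 100.0, 100.0, 'Bad Input', 100.0],
})

# Non-numeric entries become NaN, and NaN > 99.99 evaluates to False
outputs = demo[TIC_OUT_COLUMNS].apply(pd.to_numeric, errors='coerce')
cool_stage = demo[(outputs > 99.99).all(axis=1)]

# First cooling row of each fermentation
first_rows = cool_stage.groupby('label').first().reset_index()
print(cool_stage.index.tolist())
print(first_rows['label'].tolist())
```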

## Legacy code cell --- do not change unless you know what you're doing

In [31]:
def get_cooling_frames(cool_stage, cooling_start_frame, position): 
    start_time = 0
    end_time = 3.5  # in days, longest cooling period possible

    cooling_column = 'Time since cooling'
    cool_stage.loc[:, cooling_column] = -1
    cooling_stage = pd.DataFrame()
    if len(cooling_start_frame) > 0:
        for index, row in cooling_start_frame.iterrows():
            label = row['label2']  # get the unique label
            cool_start_time = row['tsf3']  # get the unique start of cooling to each label
            # Each unique label is associated with a fermentation stage for a brand
            mask = (cool_stage['label2'] == label)  # get those rows with that same label
            cool_stage_valid = cool_stage[mask]

            tic_pv = process_value[position]
            # keep only frames where the process value at the selected position is greater than 50
            if float(row[tic_pv]) > 50:  # and keep [CF]
                # subtract the start of cooling from each individual cooling step
                cool_stage.loc[mask, cooling_column] = cool_stage_valid['tsf3'] - cool_start_time

                cool_stage_current = cool_stage[(cool_stage[cooling_column] >= start_time) &
                                                (cool_stage[cooling_column] < end_time)]
                # make sure the labels are all positive, make sure these are post fermentation stages
                cool_stage_current = cool_stage_current[cool_stage_current['label'] >= 0]
                # get only the max of the post fermentation stages
                cool_stage_current[tic_pv] = cool_stage_current.groupby([cooling_column])[tic_pv].transform(max)
                cool_stage_current = cool_stage_current.rename(columns={tic_pv: 'temperature', cooling_column: 'tsc'})
                cool_stage_current = cool_stage_current[['temperature', 'tsc', 'Brand', 'label', 'Volume']]
                cooling_stage = cool_stage_current
    else:
        print("!!! Sorry no cooling stage found!")

    return cooling_stage

# Get the time since fermentation
@timer
def fermentation_times(brand_frame, fermentation_frames, brand):
    brand_frame['tsf2'] = 100000  # initializing the temp variables
    brand_frame['tsf3'] = 100000  # init the temp variables
    brand_frame['label'] = -1  # label is to label all fermentation processes
    count = 0
    for index, fermentation_frame in enumerate(fermentation_frames):
        fermentation_time = fermentation_frame['Timestamp']
        brand_frame['label'] = brand_frame['Timestamp'].apply(
            lambda x: count if pd.Timestamp(x) >= pd.Timestamp(fermentation_time) else -1)
        brand_frame['tsf2'] = brand_frame['Timestamp'].apply(
            lambda x: ((pd.Timestamp(x)) - (pd.Timestamp(fermentation_time))).total_seconds() if pd.Timestamp(
                x) >= pd.Timestamp(fermentation_time) else 1000000000)
        brand_frame['tsf2'] = brand_frame['tsf2'].apply(lambda x: x / 86400)  # convert time to days
        if count > 0:
            # the min of the two is the actual time since fermentation start
            brand_frame['tsf2'] = brand_frame[['tsf2', 'tsf3']].min(axis=1)
            mask = (brand_frame['label'] == -1)
            brand_frame_valid = brand_frame[mask]
            brand_frame.loc[mask, 'label'] = brand_frame_valid['label2']

        brand_frame['tsf3'] = brand_frame['tsf2']
        brand_frame['label2'] = brand_frame['label']
        count += 1

    # keep only rows assigned to a fermentation (drops rows before the first fermentation start)
    brand_frame = brand_frame[(brand_frame['tsf3'] <= 100000) & (brand_frame['label2'] >= 0)]

    return brand_frame
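
For readers who want to understand what `fermentation_times` computes, the sketch below expresses the same idea (label each row with its most recent fermentation start and the days elapsed since that start) in vectorized form. It is not a drop-in replacement for the legacy cell: `time_since_last_start` is a hypothetical helper, it assumes timezone-naive timestamps and at least one start, and the demo dates are made up.

```python
import numpy as np
import pandas as pd

def time_since_last_start(timestamps, starts):
    """Label each timestamp with its most recent start and the days since it."""
    ts = pd.to_datetime(pd.Series(timestamps)).to_numpy()
    st = np.sort(pd.to_datetime(pd.Series(starts)).to_numpy())
    # Index of the latest start that is <= each timestamp, -1 if none yet
    label = np.searchsorted(st, ts, side='right') - 1
    elapsed = (ts - st[np.clip(label, 0, None)]) / np.timedelta64(1, 'D')
    return pd.DataFrame({
        'Timestamp': ts,
        'label': label,
        'days_since_start': np.where(label >= 0, elapsed, np.nan),
    })

out = time_since_last_start(
    ['2017-03-17 00:00', '2017-03-18 00:00', '2017-03-19 12:00', '2017-03-21 00:00'],
    ['2017-03-18 00:00', '2017-03-20 00:00'],
)
print(out)
```

Rows with label -1 precede the first start, which corresponds to the rows the legacy function filters out at the end.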

## Temperature equation

The cell below implements this equation:

![Cooling equation](https://academicpi.blob.core.windows.net/software/cooling-equation.png)

The curve fitting algorithm finds the values of `a` (alpha) and `b` (beta).

In [32]:
def temperature_profile(x, a, b):
    # Unpack x values
    temperature = x[0]
    volume = x[1]
    return np.multiply(1 + np.multiply(a, np.reciprocal(volume)), temperature) - a * b * np.reciprocal(volume)
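
One way to sanity-check the fit is to generate synthetic data from known parameters and confirm that `curve_fit` recovers them. The sketch below does this with an equivalent rewrite of `temperature_profile`; the "true" values, the volume, the start temperature, and the initial guess `p0` are all made up for illustration and are not taken from the brewery data.

```python
import numpy as np
from scipy.optimize import curve_fit

def temperature_profile(x, a, b):
    # Same model as above: next temperature from current temperature and volume
    temperature, volume = x
    return temperature * (1 + a / volume) - a * b / volume

# Made-up "true" parameters and a constant made-up vessel volume
a_true, b_true = -5.0, 30.0
volume = np.full(200, 700.0)

# Roll the recurrence forward from a made-up start temperature
temps = np.empty(200)
temps[0] = 70.0
for i in range(1, 200):
    temps[i] = temperature_profile((temps[i - 1], volume[i - 1]), a_true, b_true)

# Pair each temperature with the next one, as the notebook does with shift(-1)
x = np.vstack([temps[:-1], volume[:-1]])
y = temps[1:]

popt, pcov = curve_fit(temperature_profile, x, y, p0=[-1.0, 20.0])
print(popt)
```

With noise-free data the fitted pair should land very close to `a_true` and `b_true`; real sensor data will of course fit less cleanly.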

---
---
# Main section 
---
---
Once all the functions above are fully implemented, use the cells below to:

1. Set the input parameters
2. Read the input dataframe
3. Call `compute_cooling_predictions`
4. Plot result data

Note that each time you touch the code of a function in a cell, you have to execute that cell for that code to become effective. You can come back here and then rerun the 1-2-3-4 sequence to check the new result. 

In [33]:
# Selected brand
if not USING_OCS_DATAVIEWS: 
    brand = 'Realtime Hops'
else:
    brand = '3'
# Temperature sensor position to consider
temp_sensors = {'bottom': False, 'middle': True, 'top': False}
training_days = 20
interval = '00:02:00'

## Fermenter Vessel Dataview Explanation 

### Here are all the stream names of interest for Fermenter Vessel ID 31 with their target column in the Dataview 

| Stream Name | DV Column Name | Description | 
|-------------|----------------|-------------|
| acsbrew.BREWERY.B2_CL_C2_FV31_LT1360/PV.CV | `Volume` | Vessel Volume |
| acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360C/PV.CV | `Top TIC PV` | Vessel Top Temperature Indicator Controller Process Value |
| acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360C/OUT.CV | `Top TIC OUT` | Vessel Top Temperature Indicator Controller Output |
| acsbrew.BREWERY.B2_CL_C2_FV31/Plato | `Plato` | The specific gravity of the vessel in Plato |
| acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360B/PV.CV | `Middle TIC PV` | Vessel Middle Temperature Indicator Controller Process Value |
| acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360B/OUT.CV | `Middle TIC OUT` | Vessel Middle Temperature Indicator Controller Output |
| acsbrew.BREWERY.B2_CL_C2_FV31/DcrsFvFullPlato | `FV Full Plato` | The specific gravity of the vessel in Plato at the end of filling |
| acsbrew.BREWERY.FV31.Fermentation ID.194fa814-869f-5f35-3501-0b9198ac52e1 | `Fermentation ID` | Unique ID for fermentation batch |
| acsbrew.BREWERY.B2_CL_C2_FV31/BRAND.CV | `Brand` | Vessel Brand |
| acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/PV.CV | `Bottom TIC PV` | Vessel Bottom Temperature Indicator Controller Process Value |
| acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/OUT.CV | `Bottom TIC OUT` | Vessel Bottom Temperature Indicator Controller Output |
| acsbrew.BREWERY.FV31.ADF2 | `ADF` | Apparent Degree of Fermentation |
| acsbrew.BREWERY.B2_CL_C2_FV31/STATUS.CV | `Status` | Vessel Status |

The streams of the other 5 fermenter vessels (IDs 32 to 36) have a similar structure, but each differs slightly. For example, below are the stream names for the `Volume` column: 

| FVID | Stream Name |
|------|-------------|
| 32 | acsbrew.BREWERY.B2_CL_C2_FV32_LT1380/PV.CV |
| 33 | acsbrew.BREWERY.B2_CL_C2_FV33_LT1400/PV.CV |
| 34 | acsbrew.BREWERY.B2_CL_C2_FV34_LT1420/PV.CV |
| 35 | acsbrew.BREWERY.B2_CL_C2_FV35_LT1440/PV.CV |
| 36 | acsbrew.BREWERY.B2_CL_C2_FV36_LT1460/PV.CV |

Each fermenter has its own Dataview, which is built within function `fermenter_dataview_def`. The mapping of streams to Dataview columns is done by `__dv_column_mappings`, which:
  
1. Extracts all streams for a given fermenter vessel ID `fv_id`
2. Iterates over the list `full_map` (defined below) to extract the required stream (by filtering) and map it to the right column
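
The filtering idea in the two steps above can be sketched as a substring match of each filter against the stream names. This is an illustration only, not the actual body of `__dv_column_mappings`; `map_streams_to_columns` is a hypothetical helper, and the streams and filters are a subset of those listed in this notebook.

```python
def map_streams_to_columns(streams, full_map):
    """Return {column: [matching streams]} using substring filters."""
    return {
        column: [stream for stream in streams if filtr in stream]
        for filtr, column in full_map
    }

# Subset of the FV31 streams and of full_map, for illustration
streams = [
    'acsbrew.BREWERY.B2_CL_C2_FV31_LT1360/PV.CV',
    'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/PV.CV',
    'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/OUT.CV',
    'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/SP.CV',
    'acsbrew.BREWERY.B2_CL_C2_FV31/BRAND.CV',
    'acsbrew.BREWERY.B2_CL_C2_FV31/STATUS.CV',
]
full_map = [
    ('_LT', 'Volume'),
    ('A/PV.CV', 'Bottom TIC PV'),
    ('A/OUT.CV', 'Bottom TIC OUT'),
    ('BRAND', 'Brand'),
    ('STATUS', 'Status'),
]

mapping = map_streams_to_columns(streams, full_map)
for column, matches in mapping.items():
    print(f'{column:15} {matches}')
```

Keeping a list per column makes it easy to spot a filter that matches zero or several streams, which would signal that the filter needs tightening for that vessel.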

In [34]:
streams, stream_tags = ocs_client.extract_streams_for_fermenter(31)

In [35]:
streams

['acsbrew.BREWERY.B2_CL_C2_FV31/YEAST.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31_PIC1362/SP.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/OUT.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31/BRAND.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360B/SP.CV',
 'acsbrew.BREWERY.B2_%@Area%_FV31.OEE.Performance',
 'acsbrew.BREWERY.B2_CL_C2_FV31/STATUS.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31_LT1360/PV.CV',
 'acsbrew.BREWERY.FV31.ADF2',
 'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/SP.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360B/OUT.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31/HARVEST.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31_PIC1362/PV.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360B/PV.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31/PCD.CV',
 'acsbrew.BREWERY.B2_%@Area%_FV31.OEE.Quality',
 'acsbrew.BREWERY.B2_CL_C2_FV31/DcrsFvFullPlato',
 'acsbrew.BREWERY.B2_%@Area%_FV31.OEE.Availability',
 'acsbrew.BREWERY.FV31.Fermentation ID.194fa814-869f-5f35-3501-0b9198ac52e1',
 'acsbrew.BREWERY.B2_CL_C2_FV31/YEASTGEN.CV',
 'acsbrew.BREWERY.B2_CL_C2_FV31/PULLS

In [36]:
import ocs_datascience
full_map = ocs_datascience.full_map
full_map

[('_LT', 'Volume'),
 ('C/PV.CV', 'Top TIC PV'),
 ('C/OUT.CV', 'Top TIC OUT'),
 ('/Plato', 'Plato'),
 ('B/PV.CV', 'Middle TIC PV'),
 ('B/OUT.CV', 'Middle TIC OUT'),
 ('FullPlato', 'FV Full Plato'),
 ('Fermentation', 'Fermentation ID'),
 ('BRAND', 'Brand'),
 ('A/PV.CV', 'Bottom TIC PV'),
 ('A/OUT.CV', 'Bottom TIC OUT'),
 ('ADF2', 'ADF'),
 ('STATUS', 'Status')]

In [37]:
for filtr, column in full_map:
    print(f"{filtr:13}: ", [f"{stream:70} {column}" for stream in streams if filtr in stream])

_LT          :  ['acsbrew.BREWERY.B2_CL_C2_FV31_LT1360/PV.CV                             Volume']
C/PV.CV      :  ['acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360C/PV.CV                           Top TIC PV']
C/OUT.CV     :  ['acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360C/OUT.CV                          Top TIC OUT']
/Plato       :  ['acsbrew.BREWERY.B2_CL_C2_FV31/Plato                                    Plato']
B/PV.CV      :  ['acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360B/PV.CV                           Middle TIC PV']
B/OUT.CV     :  ['acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360B/OUT.CV                          Middle TIC OUT']
FullPlato    :  ['acsbrew.BREWERY.B2_CL_C2_FV31/DcrsFvFullPlato                          FV Full Plato']
Fermentation :  ['acsbrew.BREWERY.FV31.Fermentation ID.194fa814-869f-5f35-3501-0b9198ac52e1 Fermentation ID']
BRAND        :  ['acsbrew.BREWERY.B2_CL_C2_FV31/BRAND.CV                                 Brand']
A/PV.CV      :  ['acsbrew.BREWERY.B2_CL_C2_FV31_TIC1360A/PV.CV               

In [38]:
dv_urls = ocs_client.install_fermenter_dataviews(version='test-v1', last_fvid=31, num_days=20, verbose=True)

DV ID: DVtest-v1_FV31, URL: https://dat-b.osisoft.com/api/v1-preview/Tenants/65292b6c-ec16-414a-b583-ce7ae04046d4/Namespaces/fermenter_vessels/Dataviews/DVtest-v1_FV31, 
  Defn: {
    "Description": "Fermentor 31 DV",
    "GroupRules": [],
    "Id": "DVtest-v1_FV31",
    "IndexConfig": {
        "EndIndex": "2017-04-06T07:00:00+00:00",
        "Interval": "00:01:00",
        "Mode": "Interpolated",
        "StartIndex": "2017-03-17T07:00:00Z"
    },
    "IndexDataType": "DateTime",
    "Mappings": {
        "Columns": [
            {
                "DataType": "DateTime",
                "IsKey": true,
                "MappingRule": {
                    "PropertyPaths": [
                        "Timestamp"
                    ]
                },
                "Name": "Timestamp"
            },
            {
                "MappingRule": {
                    "ItemIdentifier": {
                        "Field": "Name",
                        "Function": "Equals",
               

### Development tip (WARNING: executing the next cell (dataview) takes up to 30 secs)

You've seen that requesting a Dataview result takes some time. Developing a notebook involves running code over and over, so you'll want to avoid long-running steps when possible. This is why you can run the cell below once and keep the resulting dataframe in the variable `all_brands_df`. As long as you don't change any of its input parameters, `all_brands_df` remains valid and can be reused when you run the main function `compute_cooling_predictions` below. 
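
The same tip can be extended across kernel restarts by caching the dataframe on disk. The sketch below is one possible approach, not part of the notebook's required code; `load_or_fetch` is a hypothetical helper, and it assumes that a CSV round trip is acceptable for this data.

```python
from pathlib import Path

import pandas as pd

def load_or_fetch(csv_path, fetch):
    """Read the dataframe from csv_path if present, otherwise call fetch() and save it."""
    path = Path(csv_path)
    if path.exists():
        return pd.read_csv(path)
    df = fetch()
    df.to_csv(path, index=False)
    return df

# Possible usage in this notebook (assumes the names defined in earlier cells):
# all_brands_df = load_or_fetch(
#     'all_brands_df.csv',
#     lambda: get_all_brand_data(training_days, '2017-03-17T07:00', interval))
```

The cache does not know about the input parameters, so delete the CSV file whenever the number of days, the start timestamp, or the interval changes.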

In [18]:
all_brands_df = get_all_brand_data(training_days, '2017-03-17T07:00', interval)
all_brands_df.to_csv('all_brands_df.csv', index=False)
all_brands_df

Urls: ['https://dat-b.osisoft.com/api/v1-preview/Tenants/65292b6c-ec16-414a-b583-ce7ae04046d4/Namespaces/fermenter_vessels/Dataviews/DV_FV31/preview/interpolated?startIndex=2017-03-17T07:00&endIndex=2017-04-06T07:00:00&interval=00:02:00&form=csvh&maxcount=200000', 'https://dat-b.osisoft.com/api/v1-preview/Tenants/65292b6c-ec16-414a-b583-ce7ae04046d4/Namespaces/fermenter_vessels/Dataviews/DV_FV32/preview/interpolated?startIndex=2017-03-17T07:00&endIndex=2017-04-06T07:00:00&interval=00:02:00&form=csvh&maxcount=200000', 'https://dat-b.osisoft.com/api/v1-preview/Tenants/65292b6c-ec16-414a-b583-ce7ae04046d4/Namespaces/fermenter_vessels/Dataviews/DV_FV33/preview/interpolated?startIndex=2017-03-17T07:00&endIndex=2017-04-06T07:00:00&interval=00:02:00&form=csvh&maxcount=200000', 'https://dat-b.osisoft.com/api/v1-preview/Tenants/65292b6c-ec16-414a-b583-ce7ae04046d4/Namespaces/fermenter_vessels/Dataviews/DV_FV34/preview/interpolated?startIndex=2017-03-17T07:00&endIndex=2017-04-06T07:00:00&interva

Unnamed: 0,Timestamp,Volume,Top TIC PV,Top TIC OUT,Plato,Middle TIC PV,Middle TIC OUT,FV Full Plato,Fermentation ID,Brand,Bottom TIC PV,Bottom TIC OUT,ADF,Status
0,2017-03-17 07:00:00+00:00,716.566,29.6131516,0,Bad Input,29.35638,0,Bad Input,Fermentor 31201731179653,4,29.8845711,10.9353266,Bad Input,12.0
1,2017-03-17 07:02:00+00:00,716.566,29.497858,0,Bad Input,29.4008923,0,Bad Input,Fermentor 31201731179653,4,29.978569,13.9562464,Bad Input,12.0
2,2017-03-17 07:04:00+00:00,716.566,29.438652,0,Bad Input,29.44079,0,Bad Input,Fermentor 31201731179653,4,30.0834045,25.9587173,Bad Input,12.0
3,2017-03-17 07:06:00+00:00,716.566,29.430151,0,Bad Input,29.4890137,0,Bad Input,Fermentor 31201731179653,4,30.1937943,41.36798,Bad Input,12.0
4,2017-03-17 07:08:00+00:00,716.566,29.4686356,0,Bad Input,29.41609,0,Bad Input,Fermentor 31201731179653,4,30.1847954,48.0421,Bad Input,12.0
5,2017-03-17 07:10:00+00:00,716.566,29.4288158,0,Bad Input,29.4333019,0,Bad Input,Fermentor 31201731179653,4,30.1727276,53.8346825,Bad Input,12.0
6,2017-03-17 07:12:00+00:00,716.566,29.43836,0,Bad Input,29.45985,0,Bad Input,Fermentor 31201731179653,4,30.15905,55.6999245,Bad Input,12.0
7,2017-03-17 07:14:00+00:00,716.566,29.4848652,0,Bad Input,29.4944572,0,Bad Input,Fermentor 31201731179653,4,30.1473484,46.2898865,Bad Input,12.0
8,2017-03-17 07:16:00+00:00,716.566,29.4225044,0,Bad Input,29.4748135,0,Bad Input,Fermentor 31201731179653,4,30.13346,36.87985,Bad Input,12.0
9,2017-03-17 07:18:00+00:00,716.566,29.46285,0,Bad Input,29.44973,0,Bad Input,Fermentor 31201731179653,4,30.12059,33.32891,Bad Input,12.0


## Link to see CSV data before analysis 

**Click this:** [all_brands_df.csv](./all_brands_df.csv)

# Main analysis function 

In [19]:
cool_df, predictions, cool_df_training, start_temp, num_fermentations, num_coolings = \
    compute_cooling_predictions(all_brands_df, brand, temp_sensors, training_days, interval)


elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison



  ==> Finished 'brand_df_cleanup' in               0.1316 secs
  ==> Finished 'fermentation_starts' in            0.0056 secs
  @@@ Number of fermentation for brand 3: 4
  ==> Finished 'fermentation_times' in             2.6558 secs
  ==> Finished 'cooling_data_extraction' in        2.7240 secs
['706.8821' '715.3655']
  ==> Finished 'compute_cooling_predictions' in    2.8966 secs


## Plot prediction curve along with actual data 

Note: you can zoom into the graph to see how the prediction and data actually differ 

In [21]:
# Plotly trace for prediction curve
label = f'Prediction curve, start temp: {start_temp:5.2f}F'
prediction_trace = go.Scatter(x = cool_df_training.tsc.values, 
                              y = predictions, 
                              mode='lines', 
                              name=label, 
                              marker=dict(color='blue'))


data_trace = go.Scatter(x = cool_df.tsc.values, 
                        y = cool_df.temperature.values, 
                        mode='markers', 
                        name='Actual Data', 
                        opacity=0.4,
                        marker=dict(color='orange'))


plot_title = f'OCS/Dataview: Cooling of Realtime Hops ({brand}) beer, {training_days} days, {num_fermentations} fermentation(s),<br>' \
             f'interval={interval} {temp_sensors}'
layout =  go.Layout(xaxis=dict(title='Cooling time (days)'), 
                    yaxis=dict(title='Temperature (F)'), 
                    width=950,
                    title=plot_title)

fig = go.FigureWidget(data=[prediction_trace, data_trace], layout=layout)
fig

FigureWidget({
    'data': [{'marker': {'color': 'blue'},
              'mode': 'lines',
              'name':…

## ---------- Your graph will appear above this line if no error occurred ----------
## ---------- Reference graph below ----------

![Beer Cooling Prediction](https://academicpi.blob.core.windows.net/software/beer-cooling-prediction.png)

-----
-----
-----
# Extra Credits

![Beer Cooling Outlier Extra](https://academicpi.blob.core.windows.net/software/beer-cooling-prediction-extra.png)