In [1]:
import json
import os
import pandas as pd
import shutil
from seeq import spy

# Set the compatibility option so that you maximize the chance that SPy will remain compatible with your notebook/script
spy.options.compatibility = 193

In [2]:
# Log into Seeq Server if you're not using Seeq Data Lab:
spy.login(url='http://localhost:34216', credentials_file='../credentials.key', force=False)

# Workbook Jobs

In [spy.workbooks.ipynb](spy.workbooks.ipynb), you can learn to push and pull workbooks (Workbench Analyses and Organizer Topics) to/from the Seeq service/server using SPy.

You may need to do something "in bulk," in one of the following scenarios:

- Re-mapping references (e.g. historian tags/signals) from one datasource to another, or one asset tree to another
- Transferring work from one Seeq service/server to another, possibly including data

The functions in the `spy.workbooks.job` module are suitable for this work. Each function operates within a "job folder" that captures the state of the job. Unlike `spy.workbooks.pull()` and `spy.workbooks.push()`, the equivalent commands in `spy.workbooks.job` do not require all workbooks to be held in memory. This allows very large jobs to be executed (as long as there is sufficient disk space). All parts of the process are _resumable_: SPy will pick up where it left off if the operation is interrupted for any reason (e.g. a network error).

This notebook will walk through the use of this module, referencing the scenarios above. In general, commands are executed in the following order:

1. `spy.workbooks.job.pull()`
2. `spy.workbooks.job.data.pull()` (optional)
3. `spy.workbooks.job.push()`
4. `spy.workbooks.job.data.push()` (optional)
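
The order above can be sketched as a small helper. This is only an illustration: `run_workbook_job` and its `include_data` flag are hypothetical names, and the `job` argument is assumed to be the `spy.workbooks.job` module, passed in so that the sketch itself does not need a Seeq connection. It uses only the call forms that appear later in this notebook.

```python
def run_workbook_job(job, job_folder, workbooks_df, include_data=False, **push_kwargs):
    """Run the job steps in the documented order (hypothetical helper).

    `job` is expected to be the spy.workbooks.job module. Extra keyword
    arguments (e.g. path=..., label=..., errors=...) are passed to job.push().
    """
    # 1. Pull the workbook definitions to the job folder on disk
    job.pull(job_folder, workbooks_df)

    # 2. Optionally pull the data referenced by those workbooks
    if include_data:
        job.data.pull(job_folder)

    # 3. Push the workbooks to the target server
    push_df = job.push(job_folder, **push_kwargs)

    # 4. Optionally push the previously pulled data
    if include_data:
        job.data.push(job_folder)

    return push_df
```

In practice the pull and push halves often run against different servers, with a login to the destination in between, so treat this as a picture of the ordering rather than a script to run unchanged.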

## Establish the Job Folder

The parameter that defines a job is a _job folder_. It is the first argument for all job functions, and it is managed entirely by SPy. The folder is laid out in an intuitive way that allows you to inspect it, and, in some troubleshooting cases, make modifications yourself.

In [3]:
job_folder = 'Output/My First Workbooks Job'

# Remove the job folder so that old files/artifacts don't affect the tutorial
if os.path.exists(job_folder):
    shutil.rmtree(job_folder)

## Let's Make Something to Work With...

We need some Analyses/Topics to work with for the purposes of demonstrating the functionality, so let's make sure the example workbooks have been pushed.

In [4]:
example_workbooks = spy.workbooks.load('Support Files/Example Export.zip')
spy.workbooks.push(example_workbooks,
                   path='SPy Documentation Examples >> Workbook Job Import',
                   label=f'{spy.session.user.name} Workbook Job Example', 
                   refresh=False, 
                   errors='raise')

0,1,2,3,4,5,6,7,8,9,10
,ID,Name,Type,Workbook Type,Count,Time,Errors,Result,Pushed Workbook ID,URL
0.0,D833DC83-9A38-48DE-BF45-EB787E9E8375,Example Analysis,Workbook,Analysis,53,00:00:02.56,0,Success,0EECC4B1-AF85-7580-8F20-D91393026BA1,link
1.0,811B1488-297A-4FD2-AE7C-A1FE0E3B3641,Example Topic,Workbook,Topic,5,00:00:00.60,0,Success,0EECC4B1-C7E3-6230-B51A-DC819224BB74,link


Unnamed: 0,ID,Name,Type,Workbook Type,Count,Time,Errors,Result,Pushed Workbook ID,URL
0,D833DC83-9A38-48DE-BF45-EB787E9E8375,Example Analysis,Workbook,Analysis,53,0:00:02.557404,0,Success,0EECC4B1-AF85-7580-8F20-D91393026BA1,http://localhost:34216/0EECC4B1-AF3C-71A0-A425...
1,811B1488-297A-4FD2-AE7C-A1FE0E3B3641,Example Topic,Workbook,Topic,5,0:00:00.602075,0,Success,0EECC4B1-C7E3-6230-B51A-DC819224BB74,http://localhost:34216/0EECC4B1-AF3C-71A0-A425...


## Pulling Workbooks

Start the job cycle by calling `spy.workbooks.job.pull()` to grab a set of workbooks and write them to disk.

As with `spy.workbooks.pull()`, we create a DataFrame full of workbooks to pull by using the `spy.workbooks.search()` function. Then we can supply that DataFrame to `spy.workbooks.job.pull()`, which takes many of the same parameters as `spy.workbooks.pull()`.

In [5]:
workbooks_df = spy.workbooks.search({
    'Path': 'SPy Documentation Examples >> Workbook Job Import'
})

# Store these in variables that we'll use later
example_analysis_workbook_id = workbooks_df[workbooks_df['Name'] == 'Example Analysis'].iloc[0]['ID']
example_topic_workbook_id = workbooks_df[workbooks_df['Name'] == 'Example Topic'].iloc[0]['ID']

workbooks_df

0,1,2
,Count,Time
Results,2,00:00:00.07


Unnamed: 0,Archived,Created At,ID,Name,Path,Pinned,Search Folder ID,Type,Updated At,Workbook Type
0,,2024-02-15 21:42:24.216606+00:00,0EECC4B1-AF85-7580-8F20-D91393026BA1,Example Analysis,SPy Documentation Examples >> Workbook Job Import,,0EECC4B1-AF3C-71A0-A425-D0461BC34EDB,Workbook,2024-02-15 21:42:27.423060+00:00,Analysis
1,,2024-02-15 21:42:26.771990300+00:00,0EECC4B1-C7E3-6230-B51A-DC819224BB74,Example Topic,SPy Documentation Examples >> Workbook Job Import,,0EECC4B1-AF3C-71A0-A425-D0461BC34EDB,Workbook,2024-02-15 21:42:27.453059200+00:00,Topic


In [6]:
spy.workbooks.job.pull(job_folder, workbooks_df)

0,1,2,3,4,5,6,7,8
,ID,Path,Name,Workbook Type,Count,Time,Errors,Result
0.0,0EECC4B1-AF85-7580-8F20-D91393026BA1,SPy Documentation Examples >> Workbook Job Import,Example Analysis,Analysis,53,00:00:01.57,0,Success
1.0,0EECC4B1-C7E3-6230-B51A-DC819224BB74,SPy Documentation Examples >> Workbook Job Import,Example Topic,Topic,9,00:00:00.44,0,Success


Unnamed: 0,ID,Path,Name,Workbook Type,Count,Time,Errors,Result
0,0EECC4B1-AF85-7580-8F20-D91393026BA1,SPy Documentation Examples >> Workbook Job Import,Example Analysis,Analysis,53,0:00:01.573021,0,Success
1,0EECC4B1-C7E3-6230-B51A-DC819224BB74,SPy Documentation Examples >> Workbook Job Import,Example Topic,Topic,9,0:00:00.440331,0,Success


As mentioned earlier, jobs are _resumable_. If you execute the above cell again, you will see that the **Result** column indicates `Already pulled`.

If you would like to force a job to redo its work, supply the `resume=False` argument. You can also inspect the job folder's `Workbooks` subfolder and selectively delete workbook folders therein to force SPy to re-pull workbooks.
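
To see what is currently in the `Workbooks` subfolder before deleting anything, a small helper like the one below may help. It is a sketch: `list_pulled_workbook_folders` is a hypothetical name, and the only layout assumption is the `Workbooks` subfolder mentioned above. How SPy names the individual workbook folders is not assumed.

```python
from pathlib import Path


def list_pulled_workbook_folders(job_folder):
    """Return the sorted names of the sub-folders of <job_folder>/Workbooks."""
    workbooks_dir = Path(job_folder) / 'Workbooks'
    if not workbooks_dir.is_dir():
        return []  # Nothing has been pulled yet (or the layout differs)
    return sorted(p.name for p in workbooks_dir.iterdir() if p.is_dir())
```

After inspecting the list, a single folder could be removed with `shutil.rmtree()` so that the next `spy.workbooks.job.pull()` call pulls that workbook again.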

## Pushing Workbooks

As mentioned above, there are two primary scenarios where you want to push workbooks in bulk:

- Re-mapping references (e.g. historian tags/signals) from one datasource to another, or one asset tree to another
- Transferring work from one Seeq service/server to another, possibly including data

### Datasource Maps

In either case, it's important to understand the concept of _datasource maps_. These are JSON files that contain instructions for SPy as it maps the identifiers in the pulled workbook definitions to identifiers on the target system. These maps can incorporate relatively complex Regular Expression specifications that allow you to re-orient workbooks from one set of input data to another.

The `spy.workbooks.job.pull()` command will create a `Datasource Maps` folder inside the job folder. There will be one file for every datasource that was encountered during the pull operation -- if a workbook touched a datasource in some way, there will be a file for it.
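
One way to check which map files a pull produced is to load them all into a dictionary. This is a minimal sketch, assuming the files are plain `.json` files directly inside `Datasource Maps`. `load_datasource_maps` is a hypothetical helper, not part of SPy.

```python
import json
from pathlib import Path


def load_datasource_maps(job_folder):
    """Return {file name: parsed JSON} for each .json file in 'Datasource Maps'."""
    maps_dir = Path(job_folder) / 'Datasource Maps'
    maps = {}
    for map_file in sorted(maps_dir.glob('*.json')):
        with open(map_file, encoding='utf-8') as f:
            maps[map_file.name] = json.load(f)
    return maps
```

For example, `list(load_datasource_maps(job_folder))` gives the file names, one per datasource encountered during the pull.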

Here's what a typical file looks like:

```
{
    "Datasource Class": "Time Series CSV Files",
    "Datasource ID": "Example Data",
    "Datasource Name": "Example Data",
    "Item-Level Map Files": [],
    "RegEx-Based Maps": [
        {
            "Old": {
                "Type": "(?<type>.*)",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Data ID": "(?<data_id>.*)"
            },
            "New": {
                "Type": "${type}",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Data ID": "${data_id}"
            }
        }
    ]
}
```

You can make modifications to these files by loading them into an editor, including Jupyter's text editor. The most common action is to add or change entries in the `RegEx-Based Maps` block. That section is a _list_ of _dictionaries_ that each have an `Old` and a `New` subsection. Within the `Old` block, you can specify properties to match on. The key is the property name and the value is a [regular expression](https://en.wikipedia.org/wiki/Regular_expression), often employing a _capture group_. In the example above, the `Data ID` field is matched using the `.*` regex and the matched text is stored in a capture group called `data_id`. The `New` block then contains the properties and values to search upon to "map" to target items. In the example above, the `"Data ID": "${data_id}"` specification just means that the Data ID is being used "as-is" without any alteration.

(If you happen to be familiar with [Connector Property Transforms](https://telemetry.seeq.com/support-link/kb/latest/cloud/connector-property-transforms), this regex approach may feel familiar.)
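
To make the `Old`/`New` mechanics concrete, here is a rough Python imitation of a single map being applied to an item's properties. It is not SPy's implementation. The function names are hypothetical, and two simplifying assumptions are made: each regex must match the whole property value, and a property missing from the item counts as a non-match. Python's `re` module spells named groups `(?P<name>...)`, so the `(?<name>...)` form used in the map files is converted first.

```python
import re


def to_python_regex(pattern):
    """Convert (?<name>...) named groups to Python's (?P<name>...) syntax."""
    return re.sub(r'\(\?<([A-Za-z_]\w*)>', r'(?P<\1>', pattern)


def apply_regex_map(regex_map, item):
    """Return the "New" search properties, or None if "Old" does not match."""
    captures = {}
    for prop, pattern in regex_map['Old'].items():
        value = item.get(prop)
        if value is None:
            return None  # Assumption: a missing property is a non-match
        match = re.fullmatch(to_python_regex(pattern), value)
        if match is None:
            return None
        captures.update(match.groupdict())

    # Substitute each ${name} placeholder with the captured text
    def fill(template):
        return re.sub(r'\$\{(\w+)\}', lambda m: captures[m.group(1)], template)

    return {prop: fill(template) for prop, template in regex_map['New'].items()}


example_map = {
    "Old": {"Type": "(?<type>.*)", "Data ID": "(?<data_id>.*)"},
    "New": {"Type": "${type}", "Data ID": "${data_id}"},
}
item = {"Type": "StoredSignal", "Data ID": "[Tag] Area A_Temperature.sim.ts.csv"}
print(apply_regex_map(example_map, item))
```

Because both capture groups are passed through unchanged, the printed search properties are identical to the item's own `Type` and `Data ID`.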

Let's look at a more complicated example:

```
{
    "Datasource Class": "Time Series CSV Files",
    "Datasource ID": "Example Data",
    "Datasource Name": "Example Data",
    "Item-Level Map Files": [],
    "RegEx-Based Maps": [
        {
            "Old": {
                "Type": "(?<type>.*)",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Data ID": "(?<data_id>.*)"
            },
            "New": {
                "Type": "${type}",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Data ID": "${data_id}"
            }
        },
        {
            "Old": {
                "Type": "(?<type>.*)",
                "Path": "Example >> Cooling Tower 1",
                "Asset": "Area (?<subarea>[ABC])",
                "Name": "(?<name>.*)"
            },
            "New": {
                "Type": "${type}",
                "Path": "Example >> Cooling Tower 2",
                "Asset": "Area ${subarea}",
                "Name": "${name}"
            }
        }
    ]
}
```

In this example there are two RegEx-Based Maps specified. The first map is identical to the previous example, and it will be tried first. If there is no match on its `Old` regex specifications, SPy moves on to the next map. The second map matches on a particular asset path (`Example >> Cooling Tower 1`) and a set of subareas (`A`, `B`, or `C`) and then maps them to the same area underneath `Example >> Cooling Tower 2`.

In this manner, you can use arbitrarily-complex mapping logic to accomplish the goal of re-mapping a workbook within the same Seeq server or properly mapping from one Seeq server to another.
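
The "try each map in order" behavior described above can be pictured with the following sketch. It is an illustration only, not SPy's code: the helper names are hypothetical, each regex is assumed to need a full match, and a property missing from the item is treated as a non-match.

```python
import re


def old_block_matches(old_block, item):
    """True if every property regex in the "Old" block fully matches the item."""
    for prop, pattern in old_block.items():
        value = item.get(prop)
        # Python spells named groups (?P<name>...). This simple replace is
        # enough for the examples here but would also rewrite lookbehinds.
        python_pattern = pattern.replace('(?<', '(?P<')
        if value is None or re.fullmatch(python_pattern, value) is None:
            return False
    return True


def first_matching_map(regex_maps, item):
    """Return the index of the first map whose "Old" block matches, else None."""
    for index, regex_map in enumerate(regex_maps):
        if old_block_matches(regex_map['Old'], item):
            return index
    return None
```

With the two-map file shown above, an item that has a `Data ID` in the `Example Data` datasource would be handled by map 0, while an asset tree item under `Example >> Cooling Tower 1` that lacks those datasource properties would fall through to map 1.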

#### Datasource Mapping in Action

Let's run through an actual mapping scenario to see how it works and how to troubleshoot it when it goes wrong.

First we have to grab a couple of signal IDs so that we can use them later to illustrate some functionality.

In [7]:
area_a_temperature_id = spy.search({'Datasource Name': 'Example Data', 'Name': 'Area A_Temperature'}).iloc[0]['ID']
area_a_optimizer_id = spy.search({'Datasource Name': 'Example Data', 'Name': 'Area A_Optimizer'}).iloc[0]['ID']

0,1,2,3,4,5,6
,Datasource Name,Name,Time,Count,Pages,Result
0.0,Example Data,Area A_Optimizer,00:00:00.02,1,1,Success


Now we will write out a datasource map file that has, as its first map, a `New` block that will map to a `Name` that does not exist. This will let us see what happens both when the mapping is successful and when there are errors.

In [8]:
datasource_map = {
    "Datasource Class": "Time Series CSV Files",
    "Datasource ID": "Example Data",
    "Datasource Name": "Example Data",
    "Item-Level Map Files": [],
    "RegEx-Based Maps": [
        {
            "Old": {
                "Type": "(?<type>.*)",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Name": "Area A_Optimizer"
            },
            "New": {
                "Type": "${type}",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Name": "Area NonExistent_Optimizer"
            },
            # In this contrived example, if we match on the "Old" criteria, we don't want to continue to the next regex map
            "On Match": "Stop"
        },
        {
            "Old": {
                "Type": "(?<type>.*)",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Data ID": "(?<data_id>.*)"
            },
            "New": {
                "Type": "${type}",
                "Datasource Class": "Time Series CSV Files",
                "Datasource Name": "Example Data",
                "Data ID": "${data_id}"
            }        
        }
    ]
}

with open(os.path.join(job_folder, 'Datasource Maps', 'Datasource_Map_Time Series CSV Files_Example Data_Example Data.json'), 'w') as f:
    json.dump(datasource_map, f)

Now we push to the server using a label that is guaranteed to differentiate our activity from that of other users. As you will see, errors are reported in the `Result` column because one item won't be mapped.

In [9]:
push_df = spy.workbooks.job.push(job_folder,
                                 path='SPy Documentation Examples >> Workbook Jobs',
                                 label=f'{spy.session.user.name} Workbook Job Example',
                                 errors='catalog')

0,1,2,3,4,5,6,7,8,9,10
,ID,Name,Type,Workbook Type,Count,Time,Errors,Result,Pushed Workbook ID,URL
0.0,0EECC4B1-AF85-7580-8F20-D91393026BA1,Example Analysis,Workbook,Analysis,52,00:00:03.00,1,"Success, but with errors: StoredSignal ""Area A_Optimizer"" (0EECC4AC-CDE9-7710-8028-B7DE1EA451C8) not mapped, only override maps used Using overrides from \\?\C:\dev\develop\sdk\pypi\seeq\spy\docs\Documentation\Output\My First Workbooks Job\Datasource Maps:Used ""\\?\C:\dev\develop\sdk\pypi\seeq\spy\docs\Documentation\Output\My First Workbooks Job\Datasource Maps\Datasource_Map_Time Series CSV Files_Example Data_Example Data.json""RegEx-Based Map 0: Item not found on server. Details:  ""Type""  regex ""(?.*)""  matched on ""StoredSignal""  searched for ""Signal""  ""Datasource Class""  regex ""Time Series CSV Files""  matched on ""Time Series CSV Files""  searched for ""Time Series CSV Files""  ""Name""  regex ""Area A_Optimizer""  matched on ""Area A_Optimizer""  searched for ""Area NonExistent_Optimizer""  ""Datasource ID""  searched for ""Example Data""  Capture groups:  type ""StoredSignal""",0EECC4B1-E608-66C0-9E13-B2F80838C5AC,link
1.0,0EECC4B1-C7E3-6230-B51A-DC819224BB74,Example Topic,Workbook,Topic,5,00:00:00.60,0,Success,0EECC4B2-020D-6460-87D5-93CF2AC495AF,link


You can see the error, but it's not formatted very well. So let's use a troubleshooting tool, the "explain" function on the returned DataFrame:

In [10]:
print(push_df.spy.item_map.explain(area_a_optimizer_id))

StoredSignal "Area A_Optimizer" (0EECC4AC-CDE9-7710-8028-B7DE1EA451C8) not mapped, only override maps used
Using overrides from \\?\C:\dev\develop\sdk\pypi\seeq\spy\docs\Documentation\Output\My First Workbooks Job\Datasource Maps:
- Used "\\?\C:\dev\develop\sdk\pypi\seeq\spy\docs\Documentation\Output\My First Workbooks Job\Datasource Maps\Datasource_Map_Time Series CSV Files_Example Data_Example Data.json"
- RegEx-Based Map 0: Item not found on server. Details:
    "Type"
        regex          "(?<type>.*)"
        matched on     "StoredSignal"
        searched for   "Signal"
    "Datasource Class"
        regex          "Time Series CSV Files"
        matched on     "Time Series CSV Files"
        searched for   "Time Series CSV Files"
    "Name"
        regex          "Area A_Optimizer"
        matched on     "Area A_Optimizer"
        searched for   "Area NonExistent_Optimizer"
    "Datasource ID"
        searched for   "Example Data"
    Capture groups:
        type           "StoredSignal"

This detailed explanation is intended to give you a starting point for troubleshooting. You can see the **regex** that was specified, the property values that were **matched on**, the **Capture groups** that resulted from the RegEx specifications, and the property values that were subsequently **searched for**. Since `Area NonExistent_Optimizer` does not exist, the explanation for `RegEx-Based Map 0` says _Item not found on server_.

Now let's look at the explanation for a successful map (`Area A_Temperature`):

In [11]:
print(push_df.spy.item_map.explain(area_a_temperature_id))

Using overrides from \\?\C:\dev\develop\sdk\pypi\seeq\spy\docs\Documentation\Output\My First Workbooks Job\Datasource Maps:
- Used "\\?\C:\dev\develop\sdk\pypi\seeq\spy\docs\Documentation\Output\My First Workbooks Job\Datasource Maps\Datasource_Map_Time Series CSV Files_Example Data_Example Data.json"
- RegEx-Based Map 0: Unsuccessful match. Details:
    "Name"
        regex          "Area A_Optimizer"
        does not match "Area A_Temperature"
- RegEx-Based Map 1: Successfully matched. Details:
    "Type"
        regex          "(?<type>.*)"
        matched on     "StoredSignal"
        searched for   "Signal"
        and found      "StoredSignal"
    "Datasource Class"
        regex          "Time Series CSV Files"
        matched on     "Time Series CSV Files"
        searched for   "Time Series CSV Files"
        and found      "Time Series CSV Files"
    "Data ID"
        regex          "(?<data_id>.*)"
        matched on     "[Tag] Area A_Temperature.sim.ts.csv"
        searched

### Dummy Items

In cases where we couldn't map successfully, we can tell SPy to create "dummy" items. A dummy item is a signal, condition or scalar that has all the properties of the original item but has no data. (We'll show how to push data to dummy items later...)

Note the use of `create_dummy_items=True`, and also note `resume=False` so that SPy tries to push the workbooks again:

In [12]:
push_df = spy.workbooks.job.push(
    job_folder,
    path='SPy Documentation Examples >> Workbook Jobs',
    label=f'{spy.session.user.name} Workbook Job Example',
    create_dummy_items=True,
    resume=False,
    errors='catalog')

0,1,2,3,4,5,6,7,8,9,10
,ID,Name,Type,Workbook Type,Count,Time,Errors,Result,Pushed Workbook ID,URL
0.0,0EECC4B1-AF85-7580-8F20-D91393026BA1,Example Analysis,Workbook,Analysis,53,00:00:02.98,0,Success,0EECC4B1-E608-66C0-9E13-B2F80838C5AC,link
1.0,0EECC4B1-C7E3-6230-B51A-DC819224BB74,Example Topic,Workbook,Topic,5,00:00:01.14,0,Success,0EECC4B2-020D-6460-87D5-93CF2AC495AF,link


Now there are no errors, because any item that couldn't be mapped would be replaced by a dummy item.

Find the `Example Analysis` row of the output table above and click on the _link_ in the `URL` column to take a look at the resulting workbook. You'll see that the **Details Pane** worksheet contains **Area A_Optimizer**, which is a blank "dummy item".

We can look at what dummy items were created by inspecting `push_df.spy.item_map.dummy_items`. You can see that the `Name` is the same as the original and important properties like `Maximum Interpolation` have made their way to the dummy item.

In [13]:
push_df.spy.item_map.dummy_items

Unnamed: 0,Archived,Cache Enabled,Cache ID,Data ID,Enabled,Interpolation Method,Key Unit Of Measure,Maximum Interpolation,Name,Source Maximum Interpolation,...,Datasource Name,Original ID,Original Datasource Class,Original Datasource ID,Original Data ID,Scoped To,Datasource Class,Datasource ID,Formula Parameters,ID
0,False,False,0eecc4ac-cde9-7710-9ba8-bdbd7cda94ba,[0EECC4B2-09C4-75F0-A556-91E2096CC581] {Signal...,True,Linear,ns,2min,Area A_Optimizer,2min,...,Example Data,0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,Time Series CSV Files,Example Data,[Tag] Area A_Optimizer.sim.ts.csv,0EECC4B2-09C4-75F0-A556-91E2096CC581,Seeq Data Lab,Seeq Data Lab,[],0EECC4B2-0F1B-E8F0-9654-58C4BB0AD540


## Including Data

Dummy items are helpful, but they are "blank": they do not have any data associated with them. If you are transferring workbooks between servers and the destination server doesn't have access to the same datasources, it is useful to be able to transfer the data itself from the source server to the dummy items on the destination server. A set of SPy functions is provided in `spy.workbooks.job.data` for this purpose.

As `spy.workbooks.job.pull()` pulls workbook information, it also tracks the usage of data items on Workbench Worksheets and in Organizer Topic Documents. This information is collated and saved to disk as the _data manifest_. You can inspect the manifest like so:

In [14]:
manifest_df = spy.workbooks.job.data.manifest(job_folder)

# Simplify the DataFrame so that it fits on the screen better
manifest_df[['ID', 'Path', 'Asset', 'Name', 'Start', 'End', 'Calculation']]

Unnamed: 0,ID,Path,Asset,Name,Start,End,Calculation
0,0EECC4AC-CD98-EE00-A7AE-F057000E4F51,,,Area A_Temperature,2018-11-01 12:36:56.759000+00:00,2024-02-15 23:42:28.015433+00:00,"$within = condition(\n capsule(""2018-11-01T1..."
1,0EECC4AC-D00E-EC20-8933-BEFAD4423372,,,Area A_Compressor Stage,2018-11-01 12:36:56.759000+00:00,2024-02-15 21:42:28.015433+00:00,"$within = condition(\n capsule(""2018-11-01T1..."
2,0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,,,Area A_Optimizer,2018-11-11 04:22:45.084000+00:00,2024-02-15 21:42:28.015433+00:00,"$within = condition(\n capsule(""2018-11-11T0..."
3,0EECC4AC-D2B5-7780-99B9-17F6440D6464,,,Area A_Compressor Power,2018-11-10 16:31:09.311000+00:00,2024-02-15 21:42:28.015433+00:00,"$within = condition(\n capsule(""2018-11-10T1..."
4,0EECC4AC-D293-64A0-B188-6B0363BC3917,,,Area A_Wet Bulb,2018-11-11 04:22:45.084000+00:00,2018-12-17 06:18:49.287000+00:00,"$within = condition(\n capsule(""2018-11-11T0..."
5,0EECC4AC-CE3A-6020-B5E1-30F6D0EC3E6A,,,Area A_Relative Humidity,2018-11-01 12:36:56.759000+00:00,2018-11-12 04:22:45.084000+00:00,"$within = condition(\n capsule(""2018-11-01T1..."
6,0EECC4AC-D1F4-F990-B25B-871ACD3DC386,,,Area C_Temperature,2018-11-10 16:31:09.311000+00:00,2024-02-15 21:42:28.015433+00:00,"$within = condition(\n capsule(""2018-11-10T1..."
7,0EECC4AC-CC98-E870-973E-05647554A760,Example >> Cooling Tower 1,Area A,Temperature,2019-09-07 13:23:27.130000+00:00,2019-10-07 23:23:27.130000+00:00,"$within = condition(\n capsule(""2019-09-07T1..."


You can see `Start` and `End` columns that provide the overall time bounds that were detected in the workbook data references. There is also a `Calculation` column that refines this broad time period into specific "chunks" of data, defined as individual capsules and using the `within()` and `touches()` Seeq Formula functions to pull data only for those time periods -- not just everything between `Start` and `End`.

This manifest DataFrame can be fed directly into `spy.pull()` but it is recommended that you use `spy.workbooks.job.data.pull()` like so:

In [15]:
spy.workbooks.job.data.pull(job_folder)

0,1,2,3,4,5,6,7,8,9
,ID,Path,Asset,Name,Time,Count,Pages,Data Processed,Result
0.0,0EECC4AC-CD98-EE00-A7AE-F057000E4F51,,,Area A_Temperature,00:00:03.21,13330,1,22 MB,Success
1.0,0EECC4AC-D00E-EC20-8933-BEFAD4423372,,,Area A_Compressor Stage,00:00:02.50,6518,1,22 MB,Success
2.0,0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,,,Area A_Optimizer,00:00:02.54,756,1,22 MB,Success
3.0,0EECC4AC-D2B5-7780-99B9-17F6440D6464,,,Area A_Compressor Power,00:00:02.81,3242,1,22 MB,Success
4.0,0EECC4AC-D293-64A0-B188-6B0363BC3917,,,Area A_Wet Bulb,00:00:00.66,1446,1,415 KB,Success
5.0,0EECC4AC-CE3A-6020-B5E1-30F6D0EC3E6A,,,Area A_Relative Humidity,00:00:00.63,5766,1,122 KB,Success
6.0,0EECC4AC-D1F4-F990-B25B-871ACD3DC386,,,Area C_Temperature,00:00:03.16,2886,1,22 MB,Success
7.0,0EECC4AC-CC98-E870-973E-05647554A760,Example >> Cooling Tower 1,Area A,Temperature,00:00:01.04,21903,1,350 KB,Success


Unnamed: 0,Result,ID,Type,Path,Asset,Name,Time,Count,Pages,Data Processed
0EECC4AC-CD98-EE00-A7AE-F057000E4F51,Success,0EECC4AC-CD98-EE00-A7AE-F057000E4F51,,,,Area A_Temperature,0:00:03.212532,13330,1,22 MB
0EECC4AC-D00E-EC20-8933-BEFAD4423372,Success,0EECC4AC-D00E-EC20-8933-BEFAD4423372,,,,Area A_Compressor Stage,0:00:02.499201,6518,1,22 MB
0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,Success,0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,,,,Area A_Optimizer,0:00:02.543195,756,1,22 MB
0EECC4AC-D2B5-7780-99B9-17F6440D6464,Success,0EECC4AC-D2B5-7780-99B9-17F6440D6464,,,,Area A_Compressor Power,0:00:02.805694,3242,1,22 MB
0EECC4AC-D293-64A0-B188-6B0363BC3917,Success,0EECC4AC-D293-64A0-B188-6B0363BC3917,,,,Area A_Wet Bulb,0:00:00.664129,1446,1,415 KB
0EECC4AC-CE3A-6020-B5E1-30F6D0EC3E6A,Success,0EECC4AC-CE3A-6020-B5E1-30F6D0EC3E6A,,,,Area A_Relative Humidity,0:00:00.631141,5766,1,122 KB
0EECC4AC-D1F4-F990-B25B-871ACD3DC386,Success,0EECC4AC-D1F4-F990-B25B-871ACD3DC386,,,,Area C_Temperature,0:00:03.159527,2886,1,22 MB
0EECC4AC-CC98-E870-973E-05647554A760,Success,0EECC4AC-CC98-E870-973E-05647554A760,,Example >> Cooling Tower 1,Area A,Temperature,0:00:01.044136,21903,1,350 KB


Data has now been added to the job folder for the time periods identified in the manifest and can be pushed to the dummy items like so:

In [16]:
spy.workbooks.job.data.push(job_folder)

0,1,2,3,4,5,6,7
,ID,Type,Name,Count,Pages,Time,Result
0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,0EECC4B2-0F1B-E8F0-9654-58C4BB0AD540,StoredSignal,Area A_Optimizer,753,1,00:00:00.04,Success


Unnamed: 0,Result,ID,Type,Path,Asset,Name,Time,Count,Pages
0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,Success,0EECC4B2-0F1B-E8F0-9654-58C4BB0AD540,StoredSignal,,,Area A_Optimizer,0:00:00.042000,753,1


If you refresh the page of the **Details Pane** within the **Example Analysis**, you'll now see data for **Area A_Optimizer**.

However, if you move the Display Range to the left, you'll see that **Area A_Optimizer** data only exists for the time period that was originally on the screen.

If you want to expand how much data is pulled for a particular item, you can execute a command to alter the manifest and then pull/push again:

In [17]:
# Expand the time periods for Area A_Optimizer by 2 weeks on either side
spy.workbooks.job.data.expand(job_folder, {'Name': 'Area A_Optimizer'}, by='2w')

spy.workbooks.job.data.pull(job_folder, resume=False)
spy.workbooks.job.data.push(job_folder, resume=False)

0,1,2,3,4,5,6,7
,ID,Type,Name,Count,Pages,Time,Result
0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,0EECC4B2-0F1B-E8F0-9654-58C4BB0AD540,StoredSignal,Area A_Optimizer,30993,1,00:00:00.31,Success


Unnamed: 0,Result,ID,Type,Path,Asset,Name,Time,Count,Pages
0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,Success,0EECC4B2-0F1B-E8F0-9654-58C4BB0AD540,StoredSignal,,,Area A_Optimizer,0 days 00:00:00.313617,30993,1
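
As a rough picture of what "expanding by 2 weeks on either side" means for a single time period, the arithmetic looks like this. The period below is made up for illustration, `expand_period` is a hypothetical helper, and how SPy treats periods that overlap after expansion is not modeled.

```python
from datetime import datetime, timedelta, timezone


def expand_period(start, end, by):
    """Widen a (start, end) period by `by` on both sides."""
    return start - by, end + by


# A made-up one-day period, widened by two weeks on each side
start = datetime(2018, 11, 11, 4, 22, 45, tzinfo=timezone.utc)
end = datetime(2018, 11, 12, 4, 22, 45, tzinfo=timezone.utc)
new_start, new_end = expand_period(start, end, timedelta(weeks=2))
print(new_start.isoformat(), new_end.isoformat())
```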


There is a series of functions to alter the manifest:

- `spy.workbooks.job.data.expand()` - Expand the existing time periods.
- `spy.workbooks.job.data.add()` - Add a specific time period.
- `spy.workbooks.job.data.remove()` - Remove a specific time period.
- `spy.workbooks.job.data.calculation()` - Apply a specific calculation, such as resample().

Documentation for these functions is found under the **Detailed Help** section below.

## Redoing Specific Workbooks/Items

In the process of pushing and pulling, you will usually use the `errors='catalog'` flag, which means that errors will be enumerated but the operation will keep going if at all possible. When you resume an operation, those items that had errors will not be re-attempted, because SPy (by default) assumes that you don't care about them.

But you will often care about errors, and you will figure out how to fix them (say, by altering a Datasource Map). You can force SPy to redo a push or pull operation for a particular item or set of items using the `redo` family of functions:

In [18]:
spy.workbooks.job.redo(job_folder, example_analysis_workbook_id)

0,1,2
,ID,Result
0.0,0EECC4B1-AF85-7580-8F20-D91393026BA1,Pull will be redone


In [19]:
spy.workbooks.job.data.redo(job_folder, area_a_optimizer_id)

0,1,2
,ID,Result
0.0,0EECC4AC-CDE9-7710-8028-B7DE1EA451C8,Data pull will be redone


## Zip/Unzip the Job Folder

If you are intending to transfer workbook information to another Seeq server, it is convenient to package up the job folder as a zip file. There are two functions for this purpose:

In [20]:
spy.workbooks.job.zip(job_folder, overwrite=True)

In [21]:
spy.workbooks.job.unzip(job_folder + '.zip', overwrite=True)
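
Before transferring the zip file, it can be worth confirming that it opens and contains what you expect. The sketch below uses Python's standard `zipfile` module. `list_zip_contents` is a hypothetical helper, and nothing is assumed about how SPy arranges the entries inside the archive.

```python
import zipfile


def list_zip_contents(zip_path, limit=10):
    """Return up to `limit` entry names from the zip file at `zip_path`."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()[:limit]
```

For the job in this notebook, the call would be `list_zip_contents(job_folder + '.zip')`.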

## Detailed Help

All SPy functions have detailed documentation to help you use them. Just execute `help(spy.<func>)` as
shown below.

**Make sure you re-execute the cells below to see the latest documentation. It otherwise might be from an
earlier version of SPy.**

In [22]:
help(spy.workbooks.job.pull)

Help on function pull in module seeq.spy.workbooks.job._pull:

pull(job_folder: 'str', workbooks_df: 'Union[pd.DataFrame, str]', *, resume: 'bool' = True, include_referenced_workbooks: 'bool' = True, include_rendered_content: 'bool' = False, errors: 'Optional[str]' = None, quiet: 'Optional[bool]' = None, status: 'Optional[Status]' = None, session: 'Optional[Session]' = None) -> 'pd.DataFrame'
    Pulls the definitions for each workbook specified by workbooks_df on to
    disk, in a restartable "job"-like fashion.
    
    Parameters
    ----------
    job_folder : {str}
        A full or partial path to a folder on disk where the workbooks
        definitions will be saved. If the folder does not exist, it will be
        created. If the folder exists, the job will continue where it left off.
    
    workbooks_df : {str, pd.DataFrame}
        A DataFrame containing 'ID', 'Type' and 'Workbook Type' columns that
        can be used to identify the workbooks to pull. This is usually crea

In [23]:
help(spy.workbooks.job.push)

Help on function push in module seeq.spy.workbooks.job._push:

push(job_folder, *, resume: 'bool' = True, path: 'str' = None, owner: 'str' = None, label: 'str' = None, datasource=None, use_full_path: 'bool' = False, access_control: 'str' = None, override_max_interp: 'bool' = False, scope_globals_to_workbook: 'bool' = False, create_dummy_items: 'bool' = False, errors: 'Optional[str]' = None, quiet: 'Optional[bool]' = None, status: 'Optional[Status]' = None, session: 'Optional[Session]' = None) -> 'pd.DataFrame'
    Pushes the definitions for each workbook that was pulled by the
    spy.workbooks.job.pull() function, in a restartable "job"-like fashion.
    
    Parameters
    ----------
    job_folder : {str}
        A full or partial path to the job folder created by
        spy.workbooks.job.pull().
    
    resume : bool, default True
        True if the push should resume from where it left off, False if it
        should push everything again.
    
    path : str, default None
    

In [24]:
help(spy.workbooks.job.data.pull)

Help on function pull in module seeq.spy.workbooks.job.data._pull:

pull(job_folder, *, resume: 'bool' = True, errors: 'Optional[str]' = None, quiet: 'Optional[bool]' = None, status: 'Optional[Status]' = None, session: 'Optional[Session]' = None) -> 'pd.DataFrame'
    Pulls all the data that is used by the workbooks according to the Data
    Usages sections of the data_usage.json file in the job folder.
    
    Parameters
    ----------
    job_folder : {str}
        A full or partial path to the job folder created by
        spy.workbooks.job.pull().
    
    resume : bool, default True
        True if the pull should resume from where it left off, False if it
        should pull everything again.
    
    errors : {'raise', 'catalog'}, default 'raise'
        If 'raise', any errors encountered will cause an exception. If
        'catalog', errors will be added to a 'Result' column in the status.df
        DataFrame (errors='catalog' must be combined with
        status=<Status objec

In [25]:
help(spy.workbooks.job.data.manifest)
help(spy.workbooks.job.data.expand)
help(spy.workbooks.job.data.add)
help(spy.workbooks.job.data.remove)
help(spy.workbooks.job.data.calculation)

Help on function manifest in module seeq.spy.workbooks.job.data._pull:

manifest(job_folder, *, reset=False)
    Generates and returns a DataFrame with the list of items and data to be
    pulled. The manifest is initially generated by spy.workbooks.job.pull(),
    and is constructed by examining all of the Analyses and Topics that "touch"
    a signal or condition and noting the display ranges at play.
    
    You can modify the manifest using its sibling functions: expand(), add()
    or remove().
    
    The DataFrame returned is in a format suitable for spy.pull(), but in
    general it is expected that you will just use
    spy.workbooks.job.data.pull(), which has all the resume-ability of the
    spy.workbooks.job family of functions.
    
    Parameters
    ----------
    job_folder : {str}
        A full or partial path to the job folder created by
        spy.workbooks.job.pull().
    
    reset: {bool}, default False
        If True, the manifest will be reset to the origin

In [26]:
help(spy.workbooks.job.data.push)

Help on function push in module seeq.spy.workbooks.job.data._push:

push(job_folder, *, resume: 'bool' = True, replace: 'Optional[dict]' = None, datasource: 'str' = None, errors: 'Optional[str]' = None, quiet: 'Optional[bool]' = None, status: 'Optional[Status]' = None, session: 'Optional[Session]' = None) -> 'pd.DataFrame'
    Pulls all the data that is used by the workbooks according to the Data
    Usages sections of the data_usage.json file in the job folder.
    
    Parameters
    ----------
    job_folder : {str}
        A full or partial path to the job folder created by
        spy.workbooks.job.pull() and populated with data by
        spy.workbooks.job.pull_data().
    
    resume : bool, default True
        True if the pull should resume from where it left off, False if it
        should pull everything again.
    
    replace : dict, default None
        A dict with the keys 'Start' and 'End'. If provided, any existing samples
        or capsules with the start date in the

In [27]:
help(spy.workbooks.job.redo)
help(spy.workbooks.job.data.redo)

Help on function redo in module seeq.spy.workbooks.job._redo:

redo(job_folder: 'str', workbooks_df: 'Union[pd.DataFrame, str, list]', action: 'Optional[str]' = None, *, quiet: 'Optional[bool]' = None, status: 'Optional[Status]' = None)
    Creates a zip file of the job folder for easy sharing.
    
    Parameters
    ----------
    job_folder : {str}
        A full or partial path to the job folder to be zipped.
    
    workbooks_df : {pd.DataFrame, str, list}
        A DataFrame containing an 'ID' column that can be used to identify the
        workbooks to affect. These IDs are based on the source system (not the
        destination system). Alternatively, you can supply a workbook ID
        directly as a str or list of strs.
    
    action : str
        If supplied, limits the redo to the specified actions. You can specify
        'pull' or 'push'. If not supplied, both pull and push are affected.
        Note that 'pull' automatically includes 'push'.
    
    quiet : bool
    