In [1]:
from seeq import spy
import pandas as pd

In [2]:
# Log into Seeq Server if you're not using Seeq Data Lab:
spy.login(url='http://localhost:34216', credentials_file='../credentials.key', force=False)

# spy.search

Finds signals (tags), conditions (with capsules), scalars (constants), assets or any other type of item that Seeq indexes or keeps track of.

You will generally use this command before executing `spy.pull()`.

`spy.search(query, all_properties=False, workbook='Data Lab >> Data Lab Analysis')`

## Query Syntax

The `query` parameter is a dictionary of _property_: _filter_ values that are applied using AND logic by Seeq Server. Let's use the following examples to illustrate the important parts:

In [3]:
spy.search({
    'Name': 'Humid',
    'Path': 'Example >> Cooling Tower 1'
})

0,1,2,3,4,5
,Name,Path,Time,Count,Result
0.0,Humid,Example >> Cooling Tower 1,00:00:00.09,8,Success


Unnamed: 0,ID,Path,Asset,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived
0,BA0FA2B3-29D2-43B4-86C8-A58D44297781,Example >> Cooling Tower 1,Area B,Relative Humidity,,StoredSignal,%,Example Data,False
1,42233FDA-BCA9-44C9-A26E-FFAB86EA7AEC,Example >> Cooling Tower 1,Area G,Relative Humidity,,StoredSignal,%,Example Data,False
2,642A67B0-AE03-4E8E-98FD-4F2A7D28B6DD,Example >> Cooling Tower 1,Area C,Relative Humidity,,StoredSignal,%,Example Data,False
3,ABFBE527-AADC-480A-ACFA-49798B069D9D,Example >> Cooling Tower 1,Area J,Relative Humidity,,StoredSignal,%,Example Data,False
4,4207957F-BAC1-4392-80A0-0579E733CB7A,Example >> Cooling Tower 1,Area I,Relative Humidity,,StoredSignal,%,Example Data,False
5,174CE5D8-560C-408F-93FA-11E77A21C45E,Example >> Cooling Tower 1,Area H,Relative Humidity,,StoredSignal,%,Example Data,False
6,0641FF9A-B7A3-408A-8892-1AE57A564A29,Example >> Cooling Tower 1,Area A,Relative Humidity,,StoredSignal,%,Example Data,False
7,285DCED1-1BDA-4A7B-9FE9-ADB5F185FD0B,Example >> Cooling Tower 1,Area K,Relative Humidity,,StoredSignal,%,Example Data,False


This query returns anything with `Humid` in its `Name` property that also lives somewhere under `Example >> Cooling Tower 1` in an asset tree.

There are several capabilities and some constraints associated with querying:

1. The `Name` and `Description` properties support the same wildcard and RegEx syntax as the Data tab in Seeq Workbench: https://seeq12.atlassian.net/wiki/spaces/KB/pages/146472969/Searching+for+Items

2. The `Path` property is a _virtual_ property that specifies the path through an asset tree from its root, with `>>` as delimiters for each level in the asset tree.

3. The returned `Asset` property is just the leaf asset node in the tree. It can only be used in tandem with the `Path` property.

4. For the `Type` property, you can specify just `Signal`, `Condition` or `Scalar` if you want to return both _stored_ and _calculated_ items.

5. You can specify `Datasource Name` _or_ you can specify both `Datasource ID` and `Datasource Class` to differentiate between datasources with the same name.

6. You can filter on `Data ID`, which is a unique identifier that is generally the same across instances of Seeq Server.

7. You can filter on `Archived` and `Cache Enabled` using either `True` or `False` Python boolean values.
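
As a rough illustration of how the rules above combine, the sketch below assembles a single query dictionary. `build_path` is a hypothetical helper (it is not part of SPy), and the property values are taken from the example data used in this notebook. The `spy.search` call is left as a comment so that the snippet runs without a Seeq Server connection.

```python
def build_path(*levels):
    """Join asset tree levels, root first, using the '>>' delimiter."""
    return ' >> '.join(levels)


query = {
    'Name': 'Humid',                                   # "contains" match on Name
    'Path': build_path('Example', 'Cooling Tower 1'),  # virtual Path property
    'Type': 'Signal',                                  # stored and calculated signals
    'Datasource Name': 'Example Data',
    'Archived': False,                                 # Python boolean, not a string
}

print(query['Path'])

# With an active session, the dictionary would be passed as-is:
# results = spy.search(query)
```

All entries are combined with AND logic, so each additional property narrows the result set.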


## Retrieving all properties

`spy.search()` only returns a subset of item properties as can be seen in the output above. If you want to retrieve all properties, use `all_properties=True`. Note that this can be an expensive (slow) operation for queries that return many rows.

In [4]:
spy.search({
    'Name': 'Area A_*Humid*',
    'Datasource Name': 'Example Data'
}, all_properties=True)

0,1,2,3,4,5
,Datasource Name,Name,Time,Count,Result
0.0,Example Data,Area A_*Humid*,00:00:00.04,1,Success


Unnamed: 0,ID,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived,Scoped To,Cache Enabled,Data ID,Datasource Class,Datasource ID,Interpolation Method,Key Unit Of Measure,Maximum Interpolation,Source Maximum Interpolation,Source Value Unit Of Measure,Sync Token
0,08B8564B-EDD4-461D-B8D4-01022BE06AA6,Area A_Relative Humidity,,StoredSignal,%,Example Data,False,,True,[Tag] Area A_Relative Humidity.sim.ts.csv,Time Series CSV Files,Example Data,Linear,ns,2min,2min,%,2020-07-21T15:15:25.060879700Z


## Estimating signal sample period
The `estimate_sample_period` parameter adds an `Estimated Sample Period` column containing a value for each signal 
returned by the search query, estimated over the specified time frame.

In [5]:
spy.search({
    'Name': 'Area ?_Compressor Stage',
    'Datasource Name': 'Example Data'
}, estimate_sample_period=dict(Start='2019-01-01', End='2019-01-30'))

0,1,2,3,4,5
,Datasource Name,Name,Time,Count,Result
0.0,Example Data,Area ?_Compressor Stage,00:00:05.31,11,Success


Unnamed: 0,ID,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived,Estimated Sample Period
0,2E322239-B5AD-4E49-BB8A-AD3C6FB80660,Area Z_Compressor Stage,,StoredSignal,string,Example Data,False,00:00:02
1,26CA21C5-104C-4D82-8808-0E8CA9F92430,Area J_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
2,7470F359-F502-4768-827C-4B0C04C94364,Area G_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
3,0B64B51F-4D92-4E24-BCCF-5E1BB005C82C,Area B_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
4,BBBA3D24-2818-47A5-87C8-FC9D50A6BC7C,Area K_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
5,67E1DC8D-5FC0-4731-B105-1097E22F1B17,Area E_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
6,C31BD387-101B-419F-9D99-E15D42A39D81,Area A_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
7,BFB4C09D-9913-4DC1-8D6F-012602EC4422,Area D_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
8,0D445008-7882-4626-A480-86DC0E68CD48,Area H_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
9,114808AF-5AFF-41FA-ABBA-45AE564BA562,Area C_Compressor Stage,,StoredSignal,string,Example Data,False,00:02:00
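
If the time frame needs to be computed rather than typed in, one option is to build the `Start`/`End` dictionary from Python dates. This is a minimal sketch: `sample_period_window` is a hypothetical helper (not part of SPy), and it emits date strings in the same `YYYY-MM-DD` form used in the cell above.

```python
from datetime import date, timedelta


def sample_period_window(end, days):
    """Return a Start/End dict covering `days` days up to and including `end`."""
    start = end - timedelta(days=days)
    return dict(Start=start.isoformat(), End=end.isoformat())


window = sample_period_window(date(2019, 1, 30), days=29)
print(window)  # {'Start': '2019-01-01', 'End': '2019-01-30'}

# With an active session, the window would be passed like this:
# spy.search({'Name': 'Area ?_Compressor Stage'}, estimate_sample_period=window)
```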


## Workbook scoping

The `workbook=<workbook_path>` parameter allows you to include items in your results that are scoped to a particular workbook. If you omit the argument, then by default your search results will include both globally-scoped items and workbook-scoped items from the **Data Lab >> Data Lab Analysis** workbook (folder path delimited by `>>`, with the workbook name at the end). This workbook generally exists only if you have previously called `spy.push()`. This default behavior forces your `spy.push()` and subsequent `spy.search()` activities to be _sandboxed_, meaning that they will only be visible to you within a particular workbook.

If you want to only return globally-scoped items, specify `workbook=None` as the argument.

Another option is to specify `Scoped To` within your query block. You must supply a Workbook ID -- not a workbook path like the `workbook` parameter. This approach will limit your search to just those items scoped to a particular workbook (it will _exclude_ globally-scoped items).
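
The scoping options described above can be laid out side by side as keyword arguments. This is only a sketch: the workbook path and the Workbook ID are made-up placeholders, and the `spy.search` calls are left as comments so the snippet runs without a Seeq Server connection.

```python
query = {'Name': 'Humid'}

# Default: globally-scoped items plus items scoped to
# 'Data Lab >> Data Lab Analysis' (no workbook argument supplied).
default_scope = dict(query=query)

# Globally-scoped items plus items scoped to a workbook given by its path.
# 'My Folder >> My Analysis' is a hypothetical path.
path_scope = dict(query=query, workbook='My Folder >> My Analysis')

# Globally-scoped items only.
global_only = dict(query=query, workbook=None)

# Only items scoped to one workbook: 'Scoped To' takes a Workbook ID,
# not a path. The ID below is a placeholder, not a real workbook.
workbook_only = dict(
    query={**query, 'Scoped To': '00000000-0000-0000-0000-000000000000'}
)

for name, kwargs in [('default', default_scope), ('path', path_scope),
                     ('global', global_only), ('workbook', workbook_only)]:
    print(name, kwargs)
    # With an active session: spy.search(**kwargs)
```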

## DataFrame as input

Instead of a Python dictionary for the `query` parameter, you can supply a Pandas DataFrame.

This is generally useful when you have a DataFrame full of tag names but don't know the Seeq `ID` value and therefore can't retrieve data via `spy.pull()`. Calling `spy.search(data_frame)` effectively "fills in" the `ID` field for you wherever possible.

The column headers specify the properties to search on and the column values specify the match criteria.

If you don't specify wildcards or use a RegEx, the match must be exact. (This behavior is in contrast to the dictionary case, where the non-wildcard/RegEx match is a "contains" comparison.) It is assumed that your DataFrame property values should match exactly so that you can have a large set of items to query for and there won't be ambiguity between item names like `F1843CC` and `F1843CC.SP`.

In [6]:
my_items = pd.DataFrame({
    'Name': [
        'Area A_Temperature',
        'Area B_Compressor Power',
        'Optimize'
    ],
    'Datasource Name': 'Example Data'
})

my_items

Unnamed: 0,Name,Datasource Name
0,Area A_Temperature,Example Data
1,Area B_Compressor Power,Example Data
2,Optimize,Example Data


In [7]:
spy.search(my_items)

0,1,2,3,4,5
,Datasource Name,Name,Time,Count,Result
0.0,Example Data,Area A_Temperature,00:00:00.03,1,Success
1.0,Example Data,Area B_Compressor Power,00:00:00.01,1,Success
2.0,Example Data,Optimize,00:00:00.01,0,Success


Unnamed: 0,ID,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived
0,0181190A-01B7-4720-9ED4-C54FC6A7E696,Area A_Temperature,,StoredSignal,°F,Example Data,False
1,F0C9C217-C0CF-4D10-BF02-A213C9FB4AC1,Area B_Compressor Power,,StoredSignal,kW,Example Data,False


Notice that there are no results with `Optimize` in the name because it does not exactly match any items.
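
The difference between the two matching modes can be illustrated locally in plain Python. This is not how Seeq Server implements matching, and it ignores wildcards, RegEx and case handling. The item names are a made-up list and both functions are hypothetical.

```python
names = ['F1843CC', 'F1843CC.SP', 'Area A_Optimizer']


def contains_match(term, candidates):
    """Dictionary-style query: the term may appear anywhere in the name."""
    return [c for c in candidates if term in c]


def exact_match(term, candidates):
    """DataFrame-style query: the whole name must equal the term."""
    return [c for c in candidates if c == term]


print(contains_match('F1843CC', names))   # ['F1843CC', 'F1843CC.SP']
print(exact_match('F1843CC', names))      # ['F1843CC']
print(contains_match('Optimize', names))  # ['Area A_Optimizer']
print(exact_match('Optimize', names))     # []
```

The exact comparison is what keeps a row for `F1843CC` from also picking up `F1843CC.SP`.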

## Workbench Analysis URL as input
Alternatively, you can supply the URL of a Workbench Analysis worksheet to the `query` parameter. The URL must be 
provided as a `str`.

This is generally useful when you already have an Analysis worksheet loaded with signals, scalars or conditions 
and want to programmatically know what is loaded in that particular worksheet from Seeq Data Lab. A common scenario is 
that you want to pull all signals or conditions from a worksheet into the Data Lab notebook but you want to see first 
how many or which signals and conditions are loaded in that worksheet. 

`worksheet_url = 'http://localhost:34216/workbook/<workbookID>/worksheet/<worksheetID>'`

`spy.search(worksheet_url)`
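
When the workbook and worksheet IDs are already known, the URL string can be assembled following the pattern shown above. This is a small sketch: `build_worksheet_url` is a hypothetical helper (not part of SPy) and the two IDs are made-up placeholders.

```python
def build_worksheet_url(base_url, workbook_id, worksheet_id):
    """Format a worksheet URL as <base>/workbook/<id>/worksheet/<id>."""
    base = base_url.rstrip('/')
    return f'{base}/workbook/{workbook_id}/worksheet/{worksheet_id}'


# Placeholder IDs, for illustration only.
worksheet_url = build_worksheet_url(
    'http://localhost:34216/',
    '11111111-1111-1111-1111-111111111111',
    '22222222-2222-2222-2222-222222222222',
)
print(worksheet_url)

# With an active session, the URL is passed as a plain str:
# spy.search(worksheet_url)
```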

## Properties stored in the output DataFrame
If you assign the results of `spy.search()` to a variable, you can also access properties stored on the DataFrame 
that contain metadata about the `spy.search()` function call. 

One scenario where this is useful is for expensive searches that take a long time to complete.
In that case, it is common to pickle the resulting DataFrame for later use. Knowing what values for input parameters 
were used to obtain that pickled DataFrame can then be helpful when it is re-used. 

The properties stored in the output DataFrame of `spy.search()` are:

Property | Description
---------|------------------------------------------------------------------------------------------------
func     | A str value of 'spy.search'
kwargs   | A dict with the values of the input parameters passed to spy.search to get the output DataFrame
status   | A spy.Status object with the status of the spy.search call


In [8]:
my_items = spy.search({
    'Name': 'Area ?_Compressor Stage',
    'Datasource Name': 'Example Data'
}, estimate_sample_period=dict(Start='2019-01-01', End='2019-01-30'))
my_items.to_pickle('pickled_search.pkl')

0,1,2,3,4,5
,Datasource Name,Name,Time,Count,Result
0.0,Example Data,Area ?_Compressor Stage,00:00:00.78,11,Success


To find out which function generated this DataFrame, use:

In [9]:
unpickled_search = pd.read_pickle('pickled_search.pkl')
unpickled_search.func

'spy.search'

To access the values of the input parameters passed to `spy.search`, use:

In [10]:
unpickled_search.kwargs

{'query': {'Name': 'Area ?_Compressor Stage'},
 'all_properties': False,
 'workbook': 'Data Lab >> Data Lab Analysis',
 'recursive': True,
 'include_archived': False,
 'estimate_sample_period': {'Start': '2019-01-01', 'End': '2019-01-30'},
 'quiet': False,
 'status': None}

You can even reuse the kwargs in a new `spy.search` call:

In [11]:
my_items_again = spy.search(**unpickled_search.kwargs)

0,1,2,3,4
,Name,Time,Count,Result
0.0,Area ?_Compressor Stage,00:00:00.25,11,Success


To access the `spy.Status` object, use `unpickled_search.status`. For example, if you want to know which queries 
succeeded and which failed, access the status DataFrame with:

In [12]:
unpickled_search.status.df

Unnamed: 0,Datasource Name,Name,Time,Count,Result
0,Example Data,Area ?_Compressor Stage,0:00:00.778813,11,Success
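
Before reusing the kwargs from a pickled DataFrame, it can be worth checking that the DataFrame really came from `spy.search`. The sketch below shows one way to do that. `recyclable_kwargs` is a hypothetical helper (not part of SPy), and a `SimpleNamespace` stands in for the unpickled DataFrame so that the snippet runs without Seeq or a pickle file.

```python
from types import SimpleNamespace


def recyclable_kwargs(results, expected_func='spy.search'):
    """Return a copy of results.kwargs after checking results.func."""
    func = getattr(results, 'func', None)
    if func != expected_func:
        raise ValueError(f'Expected output of {expected_func}, got {func!r}')
    return dict(results.kwargs)


# Stand-in for the unpickled DataFrame (assumption, for illustration only).
stub = SimpleNamespace(
    func='spy.search',
    kwargs={'query': {'Name': 'Area ?_Compressor Stage'},
            'all_properties': False},
)

kwargs = recyclable_kwargs(stub)
print(kwargs)

# With an active session and a real unpickled DataFrame:
# my_items_again = spy.search(**recyclable_kwargs(unpickled_search))
```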
