In [1]:
from seeq import spy
import pandas as pd

# Set the compatibility option so that you maximize the chance that SPy will remain compatible with your notebook/script
spy.options.compatibility = 193

In [2]:
# Log into Seeq Server if you're not using Seeq Data Lab:
spy.login(url='http://localhost:34216', credentials_file='../credentials.key', force=False)

# spy.search

Finds signals (tags), conditions (with capsules), scalars (constants), assets or any other type of item that Seeq indexes or keeps track of.

You will generally use this command before executing `spy.pull()`.

`spy.search(query, all_properties=False, workbook='Data Lab >> Data Lab Analysis')`

## Query Syntax

The `query` parameter is a dictionary of `'Property': 'Filter'` values that are applied using AND logic by Seeq Server. Let's use the following examples to illustrate the important parts:

In [3]:
spy.search({
    'Name': 'Humid',
    'Path': 'Example >> Cooling Tower 1'
})

0,1,2,3,4,5,6
,Name,Path,Time,Count,Pages,Result
0.0,Humid,Example >> Cooling Tower 1,00:00:00.04,8,1,Success


Unnamed: 0,ID,Path,Asset,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived
0,0EF58FC2-7E6E-7160-ACE8-E1D5408B23E1,Example >> Cooling Tower 1,Area A,Relative Humidity,,StoredSignal,%,Example Data,False
1,0EF58FC2-7EA1-75B0-B352-19D4D8CD997E,Example >> Cooling Tower 1,Area B,Relative Humidity,,StoredSignal,%,Example Data,False
2,0EF58FC2-7F95-77F0-96CD-7C9D1C3AA19A,Example >> Cooling Tower 1,Area C,Relative Humidity,,StoredSignal,%,Example Data,False
3,0EF58FC2-81A7-6480-BBBC-E2C2978378BC,Example >> Cooling Tower 1,Area G,Relative Humidity,,StoredSignal,%,Example Data,False
4,0EF58FC2-8945-FF70-B3B6-A1D7044F3E0E,Example >> Cooling Tower 1,Area K,Relative Humidity,,StoredSignal,%,Example Data,False
5,0EF58FC2-89C4-EEB0-B2A7-B8E1DA1D6274,Example >> Cooling Tower 1,Area J,Relative Humidity,,StoredSignal,%,Example Data,False
6,0EF58FC2-89F3-64E0-92BC-972EE275A66D,Example >> Cooling Tower 1,Area I,Relative Humidity,,StoredSignal,%,Example Data,False
7,0EF58FC2-8A80-EE80-87A2-FC5D1E032DB1,Example >> Cooling Tower 1,Area H,Relative Humidity,,StoredSignal,%,Example Data,False


This query returns anything with `Humid` in its `Name` property that also lives somewhere under `Example >> Cooling Tower 1` in an asset tree.

There are several capabilities and some constraints associated with querying:

1. The `Name` and `Description` properties are queried upon with the [same wildcard and RegEx support as the Data tab in Seeq Workbench](https://telemetry.seeq.com/support-link/kb/latest/cloud/basic-item-search-and-filtering).

2. The `Path` property is a _virtual_ property that specifies the path through an asset tree from its root, with `>>` as delimiters for each level in the asset tree.

3. The returned `Asset` property is just the leaf asset node in the tree. It can only be used in tandem with the `Path` property.

4. For the `Type` property, you can specify just `Signal`, `Condition` or `Scalar` if you want to return both _stored_ and _calculated_ items.

5. You can specify `Datasource Name` _or_ you can specify both `Datasource ID` and `Datasource Class` to differentiate between datasources with the same name.

6. You can filter on `Data ID`, which is a unique identifier that is generally the same across instances of Seeq Server.

7. You can filter on `Archived` and `Cache Enabled` using either `True` or `False` Python boolean values.
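
As a rough illustration of the points above, the sketch below assembles a query dictionary that combines several of these properties. The helper name `build_signal_query` is hypothetical and not part of SPy; the property names and example values come from this notebook. The resulting dictionary would be passed to `spy.search()`, which is left out here so that the snippet runs without a Seeq Server connection.

```python
def build_signal_query(name_filter, path=None, asset=None,
                       datasource_name=None, archived=None):
    """Hypothetical helper (not part of SPy): build a query dict for spy.search()."""
    # Item 3 above: 'Asset' is only meaningful in tandem with 'Path'
    if asset is not None and path is None:
        raise ValueError("'Asset' can only be used together with 'Path'")

    # Item 4 above: plain 'Signal' covers both stored and calculated signals
    query = {'Name': name_filter, 'Type': 'Signal'}
    if path is not None:
        query['Path'] = path
    if asset is not None:
        query['Asset'] = asset
    if datasource_name is not None:
        query['Datasource Name'] = datasource_name
    if archived is not None:
        # Item 7 above: use a Python boolean, not a string
        query['Archived'] = bool(archived)
    return query


query = build_signal_query('Humid', path='Example >> Cooling Tower 1', archived=False)
print(query)
```

Because Seeq Server applies all entries with AND logic, each extra key narrows the result set.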


## Retrieving all properties

`spy.search()` only returns a subset of item properties as can be seen in the output above. If you want to retrieve all properties, use `all_properties=True`. Note that when interfacing with Seeq Server version R61 and earlier, this can be an expensive (slow) operation for queries that return many rows. In R62 and later, this query has been optimized to be much faster.

In [4]:
spy.search({
    'Name': 'Area A_*Humid*',
    'Datasource Name': 'Example Data'
}, all_properties=True)

0,1,2,3,4,5,6
,Name,Datasource Name,Time,Count,Pages,Result
0.0,Area A_*Humid*,Example Data,00:00:00.02,1,1,Success


Unnamed: 0,ID,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived,Source Value Unit Of Measure,Datasource ID,Sync Token,...,Cache ID,Datasource Class,Cache Enabled,Data ID,Interpolation Method,Stored Series Cache Version,Unsearchable,Source Maximum Interpolation,Locked,Maximum Interpolation
0,0EF58FC2-83AA-66B0-8B9F-E021808A2620,Area A_Relative Humidity,,StoredSignal,%,Example Data,False,%,Example Data,2024-08-12T22:42:29.096271800Z,...,0EF58FC2-83AA-66B0-97B4-77D1DFC3C715,Time Series CSV Files,False,[Tag] Area A_Relative Humidity.sim.ts.csv,Linear,1,False,2min,False,2min


You can also use the `include_properties=['Property Name 1', 'Property Name 2']` argument if you want to explicitly return certain properties.

## Estimating signal sample period
The `estimate_sample_period` parameter adds an `Estimated Sample Period` column with a value for each signal 
returned by the search query, estimated over the time frame you specify. This period is calculated using the `estimateSamplePeriod()` Formula function.

In [5]:
spy.search({
    'Name': 'Area ?_Compressor Stage',
    'Datasource Name': 'Example Data'
}, estimate_sample_period=dict(Start='2019-01-01', End='2019-01-30'))

0,1,2,3,4,5,6
,Name,Datasource Name,Time,Count,Pages,Result
0.0,Area ?_Compressor Stage,Example Data,00:00:00.04,11,1,Success


Unnamed: 0,ID,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived,Estimated Sample Period
0,0EF58FC2-85EA-F970-941E-20CF2C149E2D,Area A_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
1,0EF58FC2-83F6-71A0-AD5D-9221D44C09F7,Area B_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
2,0EF58FC2-82B6-6470-8624-89662770C268,Area C_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
3,0EF58FC2-87A4-77C0-834E-6FC3BE50DF96,Area D_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
4,0EF58FC2-8862-EEA0-B745-5FB881E17467,Area E_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
5,0EF58FC2-8C3F-EAF0-A54D-2EDBDA836AD2,Area G_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
6,0EF58FC2-827B-EAF0-ABEB-B847CFA93BC5,Area H_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
7,0EF58FC2-8AEE-EC50-9B3A-ADC758E457C3,Area I_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
8,0EF58FC2-87EB-6490-B2D3-F03E6A073359,Area J_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
9,0EF58FC2-8561-FDF0-8A83-A26B3E1B3A2B,Area K_Compressor Stage,,StoredSignal,string,Example Data,False,0 days 00:02:00
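
The `Estimated Sample Period` values print like pandas `Timedelta` objects (`0 days 00:02:00`). Assuming that is the column's type, the sketch below converts them to seconds and filters on them. Two rows are copied by hand from the output above into a stand-in DataFrame so the snippet runs without a Seeq Server; the `Sample Period (s)` column name is made up for this example.

```python
import pandas as pd

# Stand-in for the spy.search() result above (two rows copied by hand)
results = pd.DataFrame({
    'Name': ['Area A_Compressor Stage', 'Area B_Compressor Stage'],
    'Estimated Sample Period': pd.to_timedelta(['0 days 00:02:00', '0 days 00:02:00']),
})

# pd.to_timedelta also parses strings, so this works either way
period = pd.to_timedelta(results['Estimated Sample Period'])
results['Sample Period (s)'] = period.dt.total_seconds()

# Keep only signals sampled at least once every five minutes
fast_signals = results[period <= pd.Timedelta(minutes=5)]
print(fast_signals[['Name', 'Sample Period (s)']])
```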


## Workbook scoping

The `workbook=<workbook_path>` parameter allows you to include items in your results that are scoped to a particular workbook. If you omit the argument, then by default your search results will include both globally-scoped items and workbook-scoped items from the **Data Lab >> Data Lab Analysis** workbook (folder path delimited by `>>`, workbook name at the end). This workbook generally will only exist if you have previously called `spy.push()`. This default behavior forces your `spy.push()` and subsequent `spy.search()` activities to be _sandboxed_, meaning that they will only be visible to you within a particular workbook.

If you want all items regardless of scope, specify `workbook=spy.GLOBALS_AND_ALL_WORKBOOKS`.

If you want only globally-scoped items, specify `workbook=spy.GLOBALS_ONLY`.
        
Another option is to specify `Scoped To` within your query block. You must supply a Workbook ID -- not a workbook path like the `workbook` parameter. This approach will limit your search to just those items scoped to a particular workbook (it will _exclude_ globally-scoped items).
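
The `workbook` options described above can be sketched as a small wrapper. The function `search_by_scope` and its `scope` keywords are hypothetical and not part of SPy; only `spy.search()`, `spy.GLOBALS_AND_ALL_WORKBOOKS` and `spy.GLOBALS_ONLY` come from this notebook. The `spy` module is passed in as an argument so the function can be defined without an active connection.

```python
def search_by_scope(spy, query, scope='default'):
    """Hypothetical helper: call spy.search() with one of the scoping options.

    `spy` is the imported seeq.spy module.
    """
    if scope == 'default':
        # Global items plus items scoped to 'Data Lab >> Data Lab Analysis'
        return spy.search(query)
    if scope == 'everything':
        # All items regardless of scope
        return spy.search(query, workbook=spy.GLOBALS_AND_ALL_WORKBOOKS)
    if scope == 'globals':
        # Globally-scoped items only
        return spy.search(query, workbook=spy.GLOBALS_ONLY)
    # Anything else is treated as a workbook path, e.g. 'My Folder >> My Workbook'
    return spy.search(query, workbook=scope)
```

With a logged-in session, a call might look like `search_by_scope(spy, {'Name': 'Humid'}, scope='globals')`.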

## DataFrame as input

Instead of a Python dictionary for the `query` parameter, you can supply a Pandas DataFrame.

This is generally useful when you have a DataFrame full of tag names but don't know the Seeq `ID` value and therefore can't retrieve data via `spy.pull()`. Calling `spy.search(data_frame)` effectively "fills in" the `ID` field for you wherever possible.

The column headers specify the properties to search on and the column values specify the match criteria.

If you don't specify wildcards or use a RegEx, the match must be exact. (This behavior is in contrast to the dictionary case, where the non-wildcard/RegEx match is a "contains" comparison.) It is assumed that your DataFrame property values should match exactly so that you can have a large set of items to query for and there won't be ambiguity between item names like `F1843CC` and `F1843CC.SP`.
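
To make the difference concrete, the snippet below imitates the two comparison modes with plain pandas string operations on the item names mentioned above. It only illustrates the semantics; it is not how Seeq Server evaluates a query and it ignores details such as case handling.

```python
import pandas as pd

names = pd.Series(['F1843CC', 'F1843CC.SP'])

# Dictionary-style query: a plain string acts like a "contains" comparison
contains_matches = names[names.str.contains('F1843CC', regex=False)]

# DataFrame-style query: a plain string must equal the whole name
exact_matches = names[names == 'F1843CC']

print(contains_matches.tolist())  # ['F1843CC', 'F1843CC.SP']
print(exact_matches.tolist())     # ['F1843CC']
```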

In [6]:
my_items = pd.DataFrame({
    'Name': [
        'Area A_Temperature',
        'Area B_Compressor Power',
        'Optimize'
    ],
    'Datasource Name': 'Example Data'
})

my_items

Unnamed: 0,Name,Datasource Name
0,Area A_Temperature,Example Data
1,Area B_Compressor Power,Example Data
2,Optimize,Example Data


In [7]:
spy.search(my_items)

0,1,2,3,4,5,6
,Name,Datasource Name,Time,Count,Pages,Result
0.0,Area A_Temperature,Example Data,00:00:00.02,1,1,Success
1.0,Area B_Compressor Power,Example Data,00:00:00.01,1,1,Success
2.0,Optimize,Example Data,00:00:00.01,0,1,Success


Unnamed: 0,ID,Name,Description,Type,Value Unit Of Measure,Datasource Name,Archived
0,0EF58FC2-8206-77F0-8519-ECD251B886A6,Area A_Temperature,,StoredSignal,°F,Example Data,False
1,0EF58FC2-85A1-7590-AA5F-C66CA82059F8,Area B_Compressor Power,,StoredSignal,kW,Example Data,False


Notice that there are no results with `Optimize` in the name because it does not exactly match any items.
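
With a long list of names, it can help to list the ones that came back empty. The sketch below does this with a pandas `isin` comparison. The `found` DataFrame is a hand-copied stand-in for the `spy.search(my_items)` result above so that the snippet runs without a Seeq Server, and the variable names are chosen for this example only.

```python
import pandas as pd

requested = pd.DataFrame({
    'Name': ['Area A_Temperature', 'Area B_Compressor Power', 'Optimize'],
    'Datasource Name': 'Example Data'
})

# Stand-in for the spy.search(requested) result shown above
found = pd.DataFrame({
    'ID': ['0EF58FC2-8206-77F0-8519-ECD251B886A6',
           '0EF58FC2-85A1-7590-AA5F-C66CA82059F8'],
    'Name': ['Area A_Temperature', 'Area B_Compressor Power'],
})

# Requested names that do not appear among the returned names
missing = requested[~requested['Name'].isin(found['Name'])]
print(missing['Name'].tolist())  # ['Optimize']
```

This comparison only makes sense for exact names. If the input DataFrame contains wildcards or RegEx patterns, the returned names will not equal the filter strings.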

## Workbench Analysis URL as input
Alternatively, you can supply the URL of a Workbench Analysis worksheet to the `query` parameter. The URL must be 
provided as a `str`.

This is generally useful when you already have an Analysis worksheet loaded with signals, scalars or conditions 
and want to programmatically know what is loaded in that particular worksheet from Seeq Data Lab. A common scenario is 
that you want to pull all signals or conditions from a worksheet into the Data Lab notebook but you want to see first 
how many or which signals and conditions are loaded in that worksheet. 

```
spy.search('http://localhost:34216/workbook/<workbookID>/worksheet/<worksheetID>')
```

## Pickling
Sometimes a search is "expensive" and takes a long time to complete. In that case, it is common to
"pickle" the resulting DataFrame for later use, like so:

In [8]:
my_items = spy.search({
    'Name': 'Area ?_Compressor Stage',
    'Datasource Name': 'Example Data'
}, estimate_sample_period=dict(Start='2019-01-01', End='2019-01-30'))
my_items.to_pickle('pickled_search.pkl')

0,1,2,3,4,5,6
,Name,Datasource Name,Time,Count,Pages,Result
0.0,Area ?_Compressor Stage,Example Data,00:00:00.03,11,1,Success


Now you can "unpickle" the DataFrame and use it for a pull without incurring the cost of re-executing the search:

In [9]:
unpickled_search = pd.read_pickle('pickled_search.pkl')

You can access some of the original context of the search via the `spy` attribute of the unpickled
DataFrame. For example, `spy.status` tells you which queries succeeded and which failed.

In [10]:
unpickled_search.spy.status.df

Unnamed: 0,Name,Datasource Name,Time,Count,Pages,Result
0,Area ?_Compressor Stage,Example Data,0 days 00:00:00.034308,11,1,Success
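
The pickle-then-reuse pattern above can be wrapped in a small caching helper. This is a minimal sketch: `load_or_search` is a hypothetical name, not part of SPy, and `fake_search` stands in for an expensive `spy.search()` call so that the snippet runs without a Seeq Server.

```python
import os
import tempfile

import pandas as pd


def load_or_search(pickle_path, run_search):
    """Hypothetical helper: reuse a pickled result if present, else search and pickle."""
    if os.path.exists(pickle_path):
        return pd.read_pickle(pickle_path)
    results = run_search()
    results.to_pickle(pickle_path)
    return results


calls = []

def fake_search():
    # Stand-in for an expensive spy.search() call; records each invocation
    calls.append(1)
    return pd.DataFrame({'Name': ['Area A_Compressor Stage']})


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'pickled_search.pkl')
    first = load_or_search(path, fake_search)   # runs the search, writes the pickle
    second = load_or_search(path, fake_search)  # reads the pickle, no second search

print(len(calls))  # 1
```

In a notebook, `run_search` could be something like `lambda: spy.search({...})`. The pickle file is never refreshed by this helper, so delete it whenever the underlying items may have changed.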


## Detailed Help

All SPy functions have detailed documentation to help you use them. Just execute `help(spy.<func>)`, as
shown below.

**Make sure you re-execute the cell below to see the latest documentation. It otherwise might be from an
earlier version of SPy.**

In [11]:
help(spy.search)

Help on function search in module seeq.spy._search:

search(query, *, all_properties=False, include_properties: 'List[str]' = None, workbook: 'Optional[str]' = 'Data Lab >> Data Lab Analysis', recursive: 'bool' = True, ignore_unindexed_properties: 'bool' = True, include_archived: 'bool' = False, include_swappable_assets: 'bool' = False, estimate_sample_period: 'Optional[dict]' = None, old_asset_format: 'bool' = None, order_by: 'Union[str, List[str]]' = None, limit: 'Optional[int]' = -1, errors: 'str' = None, quiet: 'bool' = None, status: 'Status' = None, session: 'Optional[Session]' = None) -> 'pd.DataFrame'
    Issues a query to the Seeq Server to retrieve metadata for signals,
    conditions, scalars and assets. This metadata can then be used to retrieve
    samples, capsules for a particular time range via spy.pull().
    
    Parameters
    ----------
    query : {str, dict, list, pd.DataFrame, pd.Series}
        A mapping of property / match-criteria pairs or a Seeq Workbench URL
