# Kestrel Huntflow Compilation Across Multiple Execution Environments

#### Execution Environments in This Demo
- The first SQL-supported data store: `sqlite:///onpremise.db`
- The second SQL-supported data store: `sqlite:///cloud.db`
- A Python analytics, `ask-AI`, which could run in a sandboxed environment such as AWS Lambda

#### How It Works

1. Kestrel compiles the entire huntflow into an Intermediate Representation Graph (IRGraph)
2. Kestrel segments the IRGraph along the execution boundaries between environments
3. Kestrel sends each subgraph to the Kestrel interface associated with its execution environment
4. Each Kestrel interface compiles its subgraph into the environment's native execution language, e.g., SQL
5. Each Kestrel interface executes its subgraph in its own environment and returns the results as DataFrames
6. A per-session Kestrel cache (a special store interface) holds the intermediate results passed between interfaces
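
The segmentation in step 2 can be illustrated with a toy model. This is only a sketch, not Kestrel's implementation: it flattens the huntflow of this demo into a linear chain (a real IRGraph is a graph, not a list), and the `Step` class and `segment` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Step:
    """One huntflow step (hypothetical model, not a Kestrel class)."""
    name: str         # variable or analytics name in the huntflow
    environment: str  # where the step has to run


def segment(steps):
    """Group consecutive steps that share an environment into one segment."""
    return [
        (env, [step.name for step in group])
        for env, group in groupby(steps, key=lambda step: step.environment)
    ]


# The five steps of this demo, in huntflow order.
huntflow = [
    Step("wid_named_pipe", "sqlite:///onpremise.db"),
    Step("ask-AI", "python"),
    Step("reader", "sqlite:///onpremise.db"),
    Step("susp_mde_events", "sqlite:///cloud.db"),
    Step("queries", "sqlite:///cloud.db"),
]

segments = segment(huntflow)
for env, names in segments:
    print(env, names)
```

Run as written, this yields four segments: the first SQL query, the Python analytics, the `reader` lookup back in the on-premise store, and a final segment that holds both `susp_mde_events` and `queries` because they share `sqlite:///cloud.db`.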

In [1]:
# first query into the `windows` table of `sqlite:///onpremise.db`

wid_named_pipe = GET file FROM sqlalchemy://GoldenSAML-WindowsEvents
                 WHERE name LIKE r'%##SSEE\sql\query%' # WID 2008
                    OR name LIKE r'%##WID\tsql\query%' # WID 2021+

In [2]:
# Python analytics does not run in the data store
# Kestrel detects the execution boundary of the Python analytics and puts it in a separate subgraph
# this subgraph will be executed after the first one (the SQL query) has executed

APPLY python://ask-AI ON wid_named_pipe WITH prompt='What is the following pipe in Windows?', field='name'

In [3]:
# push the enriched data back to the data store (as a temp table) and build a query for the following statement
# Kestrel will take care of entity identification, relation resolution, mapping, etc.

reader = FIND process CREATED wid_named_pipe

In [4]:
# another execution boundary is detected since we switch to `sqlite:///cloud.db` (where the `msdefender` table is)
# Kestrel will handle the data movement between the two data stores using a temp table
# note that Kestrel will not execute the query yet because of its lazy evaluation

susp_mde_events = GET event FROM sqlalchemy://GoldenSAML-Microsoft365DefenderEvents
                  WHERE actor.process.pid = reader.pid
                    AND device.hostname = reader.endpoint.hostname
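
The data movement described in the cell above can be sketched with the standard `sqlite3` module. This is an illustration under stated assumptions, not what Kestrel emits: two in-memory databases stand in for `onpremise.db` and `cloud.db`, and the table names, column names, and rows are all invented for the example.

```python
import sqlite3

# Two in-memory databases stand in for the two data stores (assumption).
onprem = sqlite3.connect(":memory:")
cloud = sqlite3.connect(":memory:")

# Invented sample data; not taken from the demo dataset.
onprem.execute("CREATE TABLE process (pid INTEGER, hostname TEXT)")
onprem.executemany(
    "INSERT INTO process VALUES (?, ?)",
    [(4120, "host-a"), (988, "host-b")],
)
cloud.execute("CREATE TABLE events (pid INTEGER, hostname TEXT, action TEXT)")
cloud.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (4120, "host-a", "action-1"),
        (4120, "host-c", "action-2"),
        (77, "host-a", "action-3"),
    ],
)

# 1. Evaluate the upstream variable in the first store.
reader_rows = onprem.execute(
    "SELECT pid, hostname FROM process WHERE pid = 4120"
).fetchall()

# 2. Move the rows into a temp table in the second store.
cloud.execute("CREATE TEMP TABLE reader (pid INTEGER, hostname TEXT)")
cloud.executemany("INSERT INTO reader VALUES (?, ?)", reader_rows)

# 3. Join against the temp table, mirroring the two WHERE conditions above.
matched = cloud.execute(
    "SELECT e.pid, e.hostname, e.action "
    "FROM events AS e "
    "JOIN reader AS r ON e.pid = r.pid AND e.hostname = r.hostname"
).fetchall()
print(matched)
```

Only the event whose `pid` and `hostname` both match a row in the temp table survives the join, so `matched` holds a single row here.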

In [5]:
# this resides in the same data store as the last query
# Kestrel will put it in the same execution boundary as the last one,
# resulting in a single nested query in `sqlite:///cloud.db`

queries = FIND query_info RESPONDED susp_mde_events
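
How two statements in the same store can collapse into one nested query, evaluated only on demand, can be sketched as follows. The `LazyQuery` class is hypothetical and far simpler than Kestrel's compiler; the tables and rows are invented for illustration.

```python
import sqlite3


class LazyQuery:
    """Holds SQL text without running it (hypothetical helper, not a Kestrel API)."""

    def __init__(self, sql):
        self.sql = sql

    def find_in(self, table, column, key):
        """Return a new query over `table` that nests this query as a subquery."""
        return LazyQuery(
            f"SELECT * FROM {table} "
            f"WHERE {column} IN (SELECT {key} FROM ({self.sql}))"
        )


conn = sqlite3.connect(":memory:")
# Invented sample tables and rows.
conn.execute("CREATE TABLE events (uid INTEGER, pid INTEGER)")
conn.execute("CREATE TABLE query_info (uid INTEGER, search_filter TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, 4120), (2, 77)])
conn.executemany(
    "INSERT INTO query_info VALUES (?, ?)",
    [(1, "filter-a"), (2, "filter-b")],
)

# Building the two steps only composes SQL text; nothing is sent to the store yet.
events_q = LazyQuery("SELECT uid, pid FROM events WHERE pid = 4120")
queries_q = events_q.find_in("query_info", "uid", "uid")

# Execution happens once, on the single nested statement.
rows = conn.execute(queries_q.sql).fetchall()
print(queries_q.sql)
print(rows)
```

The design point is that the store receives one statement instead of two round trips, so the intermediate event rows never leave the database.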

In [6]:
# instead of executing the huntflow, show the instructions for its execution

EXPLAIN queries

Analytics
~/workspace/kestrel/black-hat-us-2024/analytics/genai_emulator.py::analytics


In [7]:
# let's execute the entire graph

DISP queries

uid,attr_list,search_filter
113597,"[""objectClass""]",(objectClass=*)
113598,"[""thumbnailphoto""]",(&(objectclass=contact)(!name=CryptoPolicy)(ThumbnailPhoto=*))
113608,"[""""]",(name=CryptoPolicy)
113616,"[""thumbnailphoto""]",(l=9736f74f-fd37-4b02-80e8-8120a72ad6c2)
113771,"[""""]","(&(objectCategory=user)(memberOf=CN=Domain Admins,CN=Users,DC=simulandlabs,DC=com))"
