# Tutorial: Manage graphs with Kusto Persistent Graph & Graphistry

## Microsoft Azure Data Explorer

**Microsoft has introduced Persistent Graph Semantics in Azure Data Explorer (ADX)**.  
This feature, announced in [this official Azure update](https://azure.microsoft.com/en-us/updates?id=495985), allows users to define **graph projections as persistent database objects**. 

### Key Benefits:
- Defines graph relationships as persisted `graph semantics` that live alongside your ADX tables.
- Simplifies repeatable graph analysis workflows.
- Enables performance optimizations by reusing defined graph topologies.

This capability is especially powerful for users working with large-scale telemetry, security, and relational datasets natively in Kusto Query Language (KQL).

---

## Graphistry Adopts ADX Persistent Graph Semantics

The graph visualization and investigation platform **Graphistry** has embraced this enhancement in its latest release.

### What’s New in Graphistry:
- **Native KQL Support**: You can now use KQL directly within Graphistry to run ADX queries and return dataframes.
- **Persistent Graph Queries**: Pass the name of your persistent graph and investigate it in Graphistry's GPU-accelerated visual interface.


### Why It Matters:
With persistent graph semantics and Graphistry’s GPU-accelerated visual interface, analysts can now query, explore, and visualize ADX graph data with minimal friction.

---


# Setup

Install PyGraphistry and the [Kusto Python client](link), then get a free [Graphistry Hub GPU API key](https://hub.graphistry.com) or run your own [server](https://www.graphistry.com/get-started).


Install Graphistry and start your Kusto graph journey.
```bash
# Just Graphistry; bring your own Kusto install
pip install graphistry

# Bundled Kusto install (quoted so shells such as zsh do not expand the brackets)
pip install "graphistry[kusto]"
```
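
Before moving on, it can help to confirm that both packages are importable in the current environment. The sketch below uses only the standard library. It assumes the import names are `graphistry` for PyGraphistry and `azure.kusto.data` for the Kusto Python client, and `missing_modules` is a hypothetical helper, not part of either library:

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            # Raised for dotted names when a parent package is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Assumed import names (see the note above).
REQUIRED = ["graphistry", "azure.kusto.data"]

if __name__ == "__main__":
    absent = missing_modules(REQUIRED)
    if absent:
        print("Missing modules:", ", ".join(absent))
    else:
        print("All required modules are importable.")
```

If a module is reported as missing, rerun the matching `pip install` command in the same environment your notebook kernel uses.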
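
Then, in Python, configure the Kusto connection and run a first query:
```python
import graphistry

# KUSTO_CONF holds your cluster URL and database name; it is defined in the next section
graphistry.configure_kusto(cluster=KUSTO_CONF['cluster'], database=KUSTO_CONF['database'])

# Fetch five rows from a table to confirm the connection works
df_5 = graphistry.kql("YourTable | take 5")
```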

# Taking it for a spin

To learn more about the different authentication methods, check out [API authentication to Graphistry servers](https://pygraphistry.readthedocs.io/en/latest/server/register.html).

```python
import graphistry
```

```python
KUSTO_CONF = {
    "cluster": "https://<clustername>.<region>.kusto.windows.net",
    "database": "YourDatabase"
}

GRAPHISTRY_CONF = {
    "personal_key_id": "YOUR_KEY_ID",
    "personal_key_secret": "YOUR_SECRET",
    "server": "hub.graphistry.com"
}
```
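
Keeping keys out of the notebook avoids leaking them when it is shared. A minimal sketch, assuming the credentials live in environment variables. The variable names (`GRAPHISTRY_KEY_ID`, `GRAPHISTRY_KEY_SECRET`, `GRAPHISTRY_SERVER`, `KUSTO_CLUSTER`, `KUSTO_DATABASE`) and the `load_conf` helper are hypothetical, not part of PyGraphistry:

```python
import os

# Hypothetical variable names; choose whatever fits your environment.
REQUIRED_VARS = [
    "GRAPHISTRY_KEY_ID",
    "GRAPHISTRY_KEY_SECRET",
    "KUSTO_CLUSTER",
    "KUSTO_DATABASE",
]


def load_conf(env=None):
    """Build the Graphistry and Kusto config dicts from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    graphistry_conf = {
        "personal_key_id": env["GRAPHISTRY_KEY_ID"],
        "personal_key_secret": env["GRAPHISTRY_KEY_SECRET"],
        # Fall back to Graphistry Hub when no server is specified.
        "server": env.get("GRAPHISTRY_SERVER", "hub.graphistry.com"),
    }
    kusto_conf = {
        "cluster": env["KUSTO_CLUSTER"],
        "database": env["KUSTO_DATABASE"],
    }
    return graphistry_conf, kusto_conf
```

With those variables set, `GRAPHISTRY_CONF, KUSTO_CONF = load_conf()` produces dictionaries with the same keys as the hard-coded ones above.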

```python
graphistry.register(api=3, **GRAPHISTRY_CONF)

graphistry.configure_kusto(**KUSTO_CONF)
```

```
<graphistry.pygraphistry.GraphistryClient at 0x7fc6a1b5efd0>
```

## Ingest data into your Azure Data Explorer cluster

Import the RedTeam50k dataset used in our [UMAP cyber demo notebook](https://github.com/graphistry/pygraphistry/blob/master/demos/ai/cyber/cyber-redteam-umap-demo.ipynb) into your Azure Data Explorer cluster.

The dataset is a preprocessed version of the dataset published by Alexander D. Kent.


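### Executing KQL with Graphistry


With your registered and configured PyGraphistry object, it is now easy to execute KQL queries and management commands.

The command below makes the RedTeam50k dataset available in our cluster by creating a stored function, `graphistryRedTeam50k`, that reads the CSV file through `externaldata`.

The `kql` function returns the query results as dataframes.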

```python
graphistry.kql(""".execute script <|
.create-or-alter function graphistryRedTeam50k () {
    externaldata(index:long, event_time:long, src_domain:string, dst_domain:string, src_computer:string, dst_computer:string, auth_type:string, logontype:string, authentication_orientation:string, success_or_failure:string, RED:int, feats:string, feats2:string)
    [
        h@"https://raw.githubusercontent.com/graphistry/pygraphistry/master/demos/data/graphistry_redteam50k.csv"
    ]
    with(format="csv", ignoreFirstRecord=true)
    | extend event_time = datetime(2024-01-01) + event_time * 1s
}
""")
```

### Grabbing a sample of data

```python
# Fetch a 100-row sample and preview the first 10 rows
df = graphistry.kql("graphistryRedTeam50k | take 100")

df.head(10)
```

Query returned 1 results shapes: [(100, 13)] in 0.217 sec

| | index | event_time | src_domain | dst_domain | src_computer | dst_computer | auth_type | logontype | authentication_orientation | success_or_failure | RED | feats | feats2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30526246 | 2024-01-02 19:16:45+00:00 | C7048$@DOM1 | C7048$@DOM1 | C7048 | TGT | ? | ? | TGS | Success | 0 | C7048 TGT ? ? | C7048 TGT |
| 1 | 5928201 | 2024-01-01 10:28:10+00:00 | C15034$@DOM1 | C15034$@DOM1 | C15034 | C467 | ? | ? | TGS | Success | 0 | C15034 C467 ? ? | C15034 C467 |
| 2 | 21160461 | 2024-01-02 08:29:52+00:00 | U2075@DOM1 | U2075@DOM1 | C529 | C529 | ? | Network | LogOff | Success | 0 | C529 C529 ? Network | C529 C529 |
| 3 | 2182328 | 2024-01-01 06:06:59+00:00 | C3547$@DOM1 | C3547$@DOM1 | C457 | C457 | ? | Network | LogOff | Success | 0 | C457 C457 ? Network | C457 C457 |
| 4 | 28495743 | 2024-01-02 16:26:12+00:00 | C567$@DOM1 | C567$@DOM1 | C574 | C523 | Kerberos | Network | LogOn | Success | 0 | C574 C523 Kerberos Network | C574 C523 |
| 5 | 32107688 | 2024-01-02 21:58:08+00:00 | C567$@DOM1 | C567$@DOM1 | C1065 | C1065 | ? | Network | LogOff | Success | 0 | C1065 C1065 ? Network | C1065 C1065 |
| 6 | 8110749 | 2024-01-01 13:00:38+00:00 | U7039@DOM1 | U7039@DOM1 | C467 | C467 | ? | Network | LogOff | Success | 0 | C467 C467 ? Network | C467 C467 |
| 7 | 500380 | 2024-01-01 01:25:21+00:00 | U762@DOM1 | U762@DOM1 | C467 | C467 | ? | Network | LogOff | Success | 0 | C467 C467 ? Network | C467 C467 |
| 8 | 342574 | 2024-01-01 00:58:40+00:00 | ANONYMOUS LOGON@C586 | ANONYMOUS LOGON@C586 | C2578 | C586 | NTLM | Network | LogOn | Success | 0 | C2578 C586 NTLM Network | C2578 C586 |
| 9 | 38998325 | 2024-01-03 09:09:08+00:00 | U2872@DOM1 | U2872@DOM1 | C2740 | C612 | Kerberos | Network | LogOn | Success | 0 | C2740 C612 Kerberos Network | C2740 C612 |
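
The `event_time` values in this sample are produced by the `extend event_time = datetime(2024-01-01) + event_time * 1s` step of the stored function, which treats the raw column as a number of seconds after a base date. The sketch below mirrors that arithmetic in plain Python and works backwards from the row with index 500380 to its raw offset. The helper names are hypothetical, and the base date is taken as UTC to match the `+00:00` shown in the output:

```python
from datetime import datetime, timedelta, timezone

# Base timestamp used by the `extend` step in the stored function.
BASE = datetime(2024, 1, 1, tzinfo=timezone.utc)


def to_event_time(offset_seconds):
    """Mirror `datetime(2024-01-01) + event_time * 1s` for one raw offset."""
    return BASE + timedelta(seconds=offset_seconds)


def to_offset_seconds(event_time):
    """Inverse: recover the raw offset in seconds from a converted timestamp."""
    return int((event_time - BASE).total_seconds())


# The sample row with index 500380 shows 2024-01-01 01:25:21+00:00.
row7 = datetime(2024, 1, 1, 1, 25, 21, tzinfo=timezone.utc)
print(to_offset_seconds(row7))          # 5121
print(to_event_time(5121).isoformat())  # 2024-01-01T01:25:21+00:00
```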


## Building the schema and persisting the graph

A graph model defines the specification of a graph and is stored in your database metadata. It contains the blueprint for creating graph snapshots, not the actual graph data.

A graph model includes:
* Schema definition: node and edge types with their properties
* Data source mappings: instructions for building the graph from tabular data
* Labels: both static (predefined) and dynamic (generated at runtime) labels for nodes and edges

Read more: [Kusto Graph models](https://learn.microsoft.com/en-us/kusto/management/graph/graph-persistent-overview?view=microsoft-fabric#graph-models)
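
Because a graph model is plain JSON, repeated steps can be generated from Python instead of being written out by hand. The sketch below builds `AddNodes` steps for the source and destination computer and domain columns of `graphistryRedTeam50k`. The helper names `add_nodes_step` and `node_steps` are hypothetical, and the step keys simply follow the model definition used in this tutorial:

```python
import json


def add_nodes_step(table, source_column, id_column, label):
    """Build one AddNodes step that projects source_column as the node id."""
    query = (
        f"{table} | project {id_column} = {source_column}, "
        f"RED, nodeType = '{label}'"
    )
    return {
        "Kind": "AddNodes",
        "Query": query,
        "NodeIdColumn": id_column,
        "Labels": [label],
        "LabelsColumn": "nodeType",
    }


def node_steps(table):
    """Generate four AddNodes steps: computers and domains, source and destination."""
    pairs = [
        ("src_computer", "computerName", "Computer"),
        ("dst_computer", "computerName", "Computer"),
        ("src_domain", "domainName", "Domain"),
        ("dst_domain", "domainName", "Domain"),
    ]
    return [add_nodes_step(table, src, node_id, label) for src, node_id, label in pairs]


steps = node_steps("graphistryRedTeam50k")
# Pretty-print the first step to compare it with a hand-written one.
print(json.dumps(steps[0], indent=4))
```

Generating the steps keeps the four node definitions consistent. The resulting list can be placed under `"Steps"` and serialized with `json.dumps` when composing the model payload.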

````python
GRAPH_NAME = "graphistryRedTeamGraph"
graphistry.kql(f".create-or-alter graph_model {GRAPH_NAME}" + """```
{
    "Schema": {
        "Nodes": {
            "Computer": {"computerName": "string", "RED":"int"},
            "Domain": {"domainName": "string", "RED":"int"}
        },
        "Edges": {
            "AUTHENTICATES": {
                "event_time": "datetime",
                "src_computer": "string",
                "dst_computer": "string",
                "src_domain": "string",
                "dst_domain": "string",
                "auth_type": "string",
                "logontype": "string",
                "authentication_orientation": "string",
                "success_or_failure": "string",
                "RED": "int"
            }
        }
    },
    "Definition": {
        "Steps": [
            {
                "Kind": "AddNodes",
                "Query": "graphistryRedTeam50k | project computerName = src_computer, RED, nodeType = 'Computer'",
                "NodeIdColumn": "computerName",
                "Labels": ["Computer"],
                "LabelsColumn": "nodeType"
            },
            {
                "Kind": "AddNodes",
                "Query": "graphistryRedTeam50k | project computerName = dst_computer, RED, nodeType = 'Computer'",
                "NodeIdColumn": "computerName",
                "Labels": ["Computer"],
                "LabelsColumn": "nodeType"
            },
            {
                "Kind": "AddNodes",
                "Query": "graphistryRedTeam50k | project domainName = src_domain, nodeType = 'Domain',RED",
                "NodeIdColumn": "domainName",
                "Labels": ["Domain"],
                "LabelsColumn": "nodeType"
            },
            {
                "Kind": "AddNodes",
                "Query": "graphistryRedTeam50k | project domainName = dst_domain, nodeType = 'Domain',RED",
                "NodeIdColumn": "domainName",
                "Labels": ["Domain"],
                "LabelsColumn": "nodeType"
            },
            {
                "Kind": "AddEdges",
                "Query": "graphistryRedTeam50k | project event_time, src_computer, dst_computer, src_domain, dst_domain, auth_type, logontype, authentication_orientation, success_or_failure, RED",
                "SourceColumn": "src_computer",
                "TargetColumn": "dst_computer",
                "Labels": ["AUTHENTICATES"]
            }
        ]
    }
}```
""")
````

## Making the snapshot

A graph snapshot is the actual graph instance materialized from a graph model. It represents:

* A specific point-in-time view of the data as defined by the model
* The nodes, edges, and their properties in a queryable format
* A self-contained entity that persists until explicitly removed

Snapshots are the entities you query when working with persistent graphs. 
Read more: [Kusto Graph snapshot](https://learn.microsoft.com/en-us/kusto/management/graph/graph-persistent-overview?view=microsoft-fabric#graph-snapshots)


```python
snapshot_name = "InitialSnap"
graph_snapshot_query = f".make graph_snapshot {snapshot_name} from {GRAPH_NAME}"

graphistry.kql(graph_snapshot_query)
```
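
Because each snapshot is a point-in-time view, refreshing the graph means making a new snapshot, and a timestamped name keeps successive snapshots apart. A minimal sketch, assuming names are restricted to letters, digits, and underscores (a deliberately conservative rule, not Kusto's own). `snapshot_name_for` and `make_snapshot_command` are hypothetical helpers that only build the command string shown above:

```python
import re
from datetime import datetime, timezone

# Conservative pattern for names interpolated into a command (an assumption,
# stricter than what Kusto itself may allow).
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def snapshot_name_for(prefix, when=None):
    """Return a name such as Snap_20240101_120000 based on a UTC timestamp."""
    when = datetime.now(timezone.utc) if when is None else when
    return f"{prefix}_{when.strftime('%Y%m%d_%H%M%S')}"


def make_snapshot_command(name, graph_name):
    """Build the `.make graph_snapshot` command after validating both names."""
    for value in (name, graph_name):
        if not _NAME.match(value):
            raise ValueError(f"Unsafe identifier: {value!r}")
    return f".make graph_snapshot {name} from {graph_name}"


fixed = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
name = snapshot_name_for("Snap", fixed)
print(make_snapshot_command(name, "graphistryRedTeamGraph"))
```

The resulting string can be passed to `graphistry.kql` in place of `graph_snapshot_query`.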

# Graph Visualization


Once your **data**, **persistent graph**, and **snapshot** are created in your Azure Data Explorer cluster, it is time to see the power of Graphistry's GPU-accelerated visual interface.

The `kusto_graph` function takes the name of the graph and, optionally, the name of a snapshot (`snap_name="name"`). If you don't provide a snapshot, it uses the latest one.

The function returns a Graphistry plottable object.

You can inspect its nodes and edges, add customizations, or call `.plot()` on it as is.

```python
snapshot_name = "InitialSnap"
g = graphistry.kusto_graph(GRAPH_NAME, snap_name=snapshot_name)
```

Query returned 2 results shapes: [(21984, 5), (50749, 12)] in 1.133 sec


## Plotting your object

```python
g.plot()
```

### Changing colors, icons and more


Our data combines two datasets, one of which contains verified red team activity. Those records are tagged with the value 1 in the **RED** column.

Let's make the red team nodes stand out in the visualization.
Because the nodes come in two types, **"Computer"** and **"Domain"**, we also add icons to make the node types easier to tell apart.


Learn more here: [Graphistry Visualization](https://pygraphistry.readthedocs.io/en/latest/notebooks/visualization.html)

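```python
# Color nodes tagged as red team activity (RED == 1) red, everything else silver
g2 = g.encode_point_color(
    "RED",
    categorical_mapping={
        1: "red"
    },
    default_mapping="silver"
)

# Pick an icon per node type, with a fallback for unknown types
g3 = g2.encode_point_icon(
    "nodeType",
    shape="circle",
    categorical_mapping={
        "Computer": "laptop",
        "Domain": "server"
    },
    default_mapping="question"
)

g3.plot()
```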
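
Conceptually, a categorical mapping with a default behaves like a dictionary lookup with a fallback. The plain-Python sketch below illustrates that idea for the `RED` column. It is not PyGraphistry's implementation, and `resolve_color` is a hypothetical helper:

```python
def resolve_color(value, categorical_mapping, default_mapping):
    """Pick the mapped color for value, or the default when it is unmapped."""
    return categorical_mapping.get(value, default_mapping)


mapping = {1: "red"}
red_flags = [0, 1, 0, 1]
colors = [resolve_color(flag, mapping, "silver") for flag in red_flags]
print(colors)  # ['silver', 'red', 'silver', 'red']
```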

## Next steps

* [Kusto Graph](https://learn.microsoft.com/en-us/kusto/query/graph-semantics-overview?view=microsoft-fabric)
* [10 Minutes to PyGraphistry](https://pygraphistry.readthedocs.io/en/latest/10min.html)
* [10 Min to GFQL (graph query)](https://pygraphistry.readthedocs.io/en/latest/gfql/about.html)
* [GenAI investigations with Louie.ai](https://louie.ai/)

Data citation:
```
A. D. Kent, “Comprehensive, Multi-Source Cybersecurity Events,”
Los Alamos National Laboratory, http://dx.doi.org/10.17021/1179829, 2015.

@Misc{kent-2015-cyberdata1,
  author =     {Alexander D. Kent},
  title =      {{Comprehensive, Multi-Source Cyber-Security Events}},
  year =       {2015},
  howpublished = {Los Alamos National Laboratory},
  doi = {10.17021/1179829}
}
```