# Template Query

This notebook demonstrates the use of [template queries](https://docs.tigergraph.com/graph-ml/current/using-an-algorithm/#_packaged_template_queries), a feature introduced in TigerGraph Database `3.9` and pyTigerGraph `1.3`. The notebook therefore requires DB 3.9 or later and pyTigerGraph 1.3 or later.
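If you want to check the version requirement programmatically, the sketch below compares dotted version strings numerically, so that `3.10` counts as newer than `3.9`. The helper names `parse_version` and `meets_minimum` are made up for this illustration and are not part of pyTigerGraph.

```python
def parse_version(text):
    """Turn a string such as '1.3.2' into (1, 3, 2).

    Parsing stops at the first piece that does not start with a digit,
    so suffixes like 'rc1' are ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.3' and '1.3.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(meets_minimum("1.3.0", "1.3"))
print(meets_minimum("1.2.9", "1.3"))
print(meets_minimum("3.10.1", "3.9"))
```

The installed pyTigerGraph version string can usually be obtained from the standard library with `importlib.metadata.version("pyTigerGraph")`. The database version has to be checked separately.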

## What are template queries?

Template queries, in this context, are the "static" version of the [graph algorithms](https://docs.tigergraph.com/graph-ml/current/intro/). "Static" means that a query is bound to the vertex type(s) and/or edge type(s) given to a query as input parameters at installation time. If you change the input vertex or edge types later, a new query will be generated and installed. 

Note that not every graph algorithm has a template query yet. More template queries will be added in future versions.

## How is current user experience impacted?

As a user, you will see little difference when calling a template graph algorithm (see the examples below). The only noticeable effect is the query installation that happens when you change the input vertex or edge types. Changing other query parameters, such as the iteration limit (`maximum_iteration` in the PageRank example below), won't generate a new query.
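The toy model below illustrates the behavior described above. It is only a conceptual sketch, not how TigerGraph or pyTigerGraph implement template queries: the class `TemplateCacheSketch` and the type names `OtherVertex` and `OtherEdge` are hypothetical. The idea is that an "installation" is keyed by the algorithm plus its vertex and edge types, so other parameters never trigger a new one.

```python
class TemplateCacheSketch:
    """Toy model: one installed query per (algorithm, vertex types, edge types)."""

    def __init__(self):
        self.installed = {}
        self.install_count = 0

    def run(self, algorithm, v_types, e_types, **other_params):
        # Only the algorithm name and the types form the cache key.
        key = (algorithm, tuple(sorted(v_types)), tuple(sorted(e_types)))
        if key not in self.installed:
            # Stand-in for the (slow) query generation and installation step.
            self.install_count += 1
            self.installed[key] = "{}_{}".format(algorithm, self.install_count)
        # other_params would be passed to the installed query at run time.
        return self.installed[key]


cache = TemplateCacheSketch()
cache.run("tg_pagerank", ["Person"], ["Knows"], maximum_iteration=25)
cache.run("tg_pagerank", ["Person"], ["Knows"], maximum_iteration=50)
print(cache.install_count)  # still 1: only a non-type parameter changed

cache.run("tg_pagerank", ["OtherVertex"], ["OtherEdge"], maximum_iteration=25)
print(cache.install_count)  # 2: new types lead to a new installation
```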

## What is the benefit of using template queries?

Because a template query is bound to specific vertex and edge types, it can run faster than the "schema-less" version, which makes it useful when speed is the main concern. The tradeoff is flexibility: each time you experiment with different vertex or edge types, a new query has to be installed.

## Examples

### Connection to Database

The `TigerGraphConnection` class represents a connection to the TigerGraph database. Under the hood, it stores the necessary information to communicate with the database. It is able to perform quite a few database tasks. Please see its [documentation](https://docs.tigergraph.com/pytigergraph/current/intro/) for details.

To connect to your database, modify the `config.json` file accompanying this notebook. Set the value of `getToken` based on whether token authentication is enabled for your database. Token authentication is always enabled for tgcloud databases.
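As a rough guide, the sketch below shows the keys this notebook reads from the config file (`host`, `username`, `password`, `getToken`) together with a small sanity check. All values are placeholders, the helper `load_config` is hypothetical, and the actual `config.json` shipped with the notebook may contain additional keys.

```python
import json

# Keys that the cells in this notebook read from the config.
REQUIRED_KEYS = ("host", "username", "password", "getToken")


def load_config(text):
    """Parse a JSON config string and check the keys used in this notebook."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("config is missing: " + ", ".join(missing))
    if not isinstance(config["getToken"], bool):
        raise TypeError("getToken should be true or false")
    return config


# Placeholder values only; replace them with your own settings.
example = """
{
    "host": "http://127.0.0.1",
    "username": "your_username",
    "password": "your_password",
    "getToken": false
}
"""

config = load_config(example)
print(sorted(config))
```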

In [1]:
from pyTigerGraph import TigerGraphConnection
import json

# Read in DB configs
with open('../config.json', "r") as config_file:
    config = json.load(config_file)
    
conn = TigerGraphConnection(
    host=config["host"],
    username=config["username"],
    password=config["password"]
)

### Ingest Data

In [2]:
from pyTigerGraph.datasets import Datasets

dataset = Datasets("ldbc_snb")

A folder with name ldbc_snb already exists in ./tmp. Skip downloading.


In [3]:
conn.ingestDataset(dataset, getToken=config["getToken"])

---- Checking database ----
A graph with name ldbc_snb already exists in the database. Skip ingestion.
Graph name is set to ldbc_snb for this connection.


### Visualize Schema

In [4]:
from pyTigerGraph.visualization import drawSchema

drawSchema(conn.getSchema(force=True))

CytoscapeWidget(cytoscape_layout={'name': 'circle', 'animate': True, 'padding': 1}, cytoscape_style=[{'selecto…

### Featurizer

`pyTigerGraph` provides the `featurizer` as a friendly interface to the graph algorithms. Please see the `feature_engineering` notebook for details on the `featurizer` and the notebooks under the `algos` folder for details on the algorithms. Below, we first briefly review how to run a non-template graph algorithm with the featurizer, and then show how to run the template version by changing a single parameter.

### Example 1: PageRank

#### Non-Template Query 

In [5]:
# Create a featurizer
f = conn.gds.featurizer()

# Run an algorithm with parameters
params = {
    'v_type': 'Person', 
    'e_type': 'Knows', 
    'max_change': 0.001, 
    'maximum_iteration': 25, 
    'damping': 0.85,
    'top_k': 10, 
    'print_results': True, 
    'result_attribute': '', 
    'file_path': '', 
    'display_edges': False}

res = f.runAlgorithm(
    'tg_pagerank', 
    params=params
)

Cannot read manifest file. Trying master branch.


In [6]:
# Check result
res

[{'@@top_scores_heap': [{'Vertex_ID': '2199023262543', 'score': 24.85992},
   {'Vertex_ID': '6597069777240', 'score': 23.86707},
   {'Vertex_ID': '17592186053137', 'score': 23.6497},
   {'Vertex_ID': '4398046513018', 'score': 23.56558},
   {'Vertex_ID': '30786325585162', 'score': 23.43321},
   {'Vertex_ID': '2199023259756', 'score': 22.87003},
   {'Vertex_ID': '24189255819727', 'score': 22.31711},
   {'Vertex_ID': '19791209302403', 'score': 20.59326},
   {'Vertex_ID': '8796093029267', 'score': 20.49563},
   {'Vertex_ID': '4139', 'score': 20.41319}]}]

In [23]:
# Rerun the algorithm and record its run time for comparison later
import time

start_time = time.perf_counter()
res = f.runAlgorithm(
    'tg_pagerank', 
    params=params
)
non_template_time = time.perf_counter() - start_time
print("Time elapsed: {:.3} seconds".format(non_template_time))

Time elapsed: 1.36 seconds


#### Template Query

To use a template query, only one change is needed: set `templateQuery` to `True` when running an algorithm with the featurizer.

In [9]:
# Create a featurizer
f = conn.gds.featurizer()

# Run an algorithm with parameters
params = {
    'v_type': 'Person', 
    'e_type': 'Knows', 
    'max_change': 0.001, 
    'maximum_iteration': 25, 
    'damping': 0.85,
    'top_k': 10, 
    'print_results': True, 
    'result_attribute': '', 
    'file_path': '', 
    'display_edges': False}

res = f.runAlgorithm(
    'tg_pagerank', 
    params=params,
    templateQuery=True # Set this to True to use template query. Default False.
)

Cannot read manifest file. Trying master branch.


In [10]:
# Check result
res

[{'@@top_scores_heap': [{'score': 24.85992, 'Vertex_ID': '2199023262543'},
   {'score': 23.86707, 'Vertex_ID': '6597069777240'},
   {'score': 23.6497, 'Vertex_ID': '17592186053137'},
   {'score': 23.56558, 'Vertex_ID': '4398046513018'},
   {'score': 23.4332, 'Vertex_ID': '30786325585162'},
   {'score': 22.87003, 'Vertex_ID': '2199023259756'},
   {'score': 22.3171, 'Vertex_ID': '24189255819727'},
   {'score': 20.59327, 'Vertex_ID': '19791209302403'},
   {'score': 20.49563, 'Vertex_ID': '8796093029267'},
   {'score': 20.41318, 'Vertex_ID': '4139'}]}]
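The two PageRank outputs above list the same vertices, but a few scores differ in the fifth decimal place (for example `23.43321` vs. `23.4332`). A minimal sketch for comparing such results with a tolerance is shown below. It uses three rows copied from the outputs above, and the helper name `scores_match` and the tolerance value are choices made for this illustration.

```python
import math

# Three rows copied from the non-template output above.
non_template = [
    {"Vertex_ID": "2199023262543", "score": 24.85992},
    {"Vertex_ID": "30786325585162", "score": 23.43321},
    {"Vertex_ID": "4139", "score": 20.41319},
]

# The same vertices from the template output above.
template = [
    {"score": 24.85992, "Vertex_ID": "2199023262543"},
    {"score": 23.4332, "Vertex_ID": "30786325585162"},
    {"score": 20.41318, "Vertex_ID": "4139"},
]


def scores_match(rows_a, rows_b, abs_tol=1e-4):
    """Check that both results hold the same vertices with close scores."""
    by_id_a = {row["Vertex_ID"]: row["score"] for row in rows_a}
    by_id_b = {row["Vertex_ID"]: row["score"] for row in rows_b}
    if by_id_a.keys() != by_id_b.keys():
        return False
    return all(
        math.isclose(by_id_a[vid], by_id_b[vid], abs_tol=abs_tol)
        for vid in by_id_a
    )


print(scores_match(non_template, template))
```

In the notebook itself, `res` is overwritten by each run, so you would need to keep the two `res[0]['@@top_scores_heap']` lists in separate variables before comparing them.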

In [24]:
# Rerun the template query and record its run time.

start_time = time.perf_counter()
res = f.runAlgorithm(
    'tg_pagerank', 
    params=params,
    templateQuery=True
)
template_time = time.perf_counter() - start_time
print("Time elapsed: {:.3} seconds".format(template_time))

Time elapsed: 0.708 seconds
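A single timed run can be noisy. As an optional refinement, the sketch below repeats a call several times and reports the median. The helper `median_runtime` is hypothetical, and a cheap stand-in workload is timed here so that the example runs without a database.

```python
import statistics
import time


def median_runtime(func, repeats=5):
    """Call `func` `repeats` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload; replace it with a call to the algorithm you want to time.
elapsed = median_runtime(lambda: sum(range(100000)), repeats=5)
print("Median time: {:.3} seconds".format(elapsed))
```

With the notebook's variables, the stand-in could be replaced by `lambda: f.runAlgorithm('tg_pagerank', params=params, templateQuery=True)`.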


### Example 2: Breadth-First Search

#### Non-Template Query 

In [15]:
# Create a featurizer
f = conn.gds.featurizer()

# Run an algorithm with parameters
params = {
    "v_type_set": ["Person"],
    "e_type_set": ["Knows"],
    "max_hops": 2,
    "v_start": {"id": "21990232556463", "type": "Person"},  # Format: {"id": "vertex_id", "type": "vertex_type"}
    "print_results": True,
    "result_attribute": "",
    "file_path": "",
    "display_edges": False
}

res = f.runAlgorithm(
    'tg_bfs', 
    params=params
)

Cannot read manifest file. Trying master branch.


In [16]:
# Check result
res[0]['Start'][:10]

[{'v_id': '30786325580605',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '13194139540951',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '6597069769055',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '15393162796423',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '15393162792715',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '28587302332123',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '6597069774914',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '13194139542969',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '15393162795179',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}},
 {'v_id': '4398046519923',
  'v_type': 'Person',
  'attributes': {'Start.@sum_step': 2}}]

In [24]:
# Rerun the algorithm and record its run time for comparison later
import time

start_time = time.perf_counter()
res = f.runAlgorithm(
    'tg_bfs', 
    params=params
)
bfs_non_template_time = time.perf_counter() - start_time
print("Time elapsed: {:.3} seconds".format(bfs_non_template_time))

Time elapsed: 0.14 seconds


#### Template Query

To use a template query, only one change is needed: set `templateQuery` to `True` when running an algorithm with the featurizer.

In [18]:
# Create a featurizer
f = conn.gds.featurizer()

# Run an algorithm with parameters
params = {
    "v_type_set": ["Person"],
    "e_type_set": ["Knows"],
    "max_hops": 2,
    "v_start": {"id": "21990232556463", "type": "Person"},  # Format: {"id": "vertex_id", "type": "vertex_type"}
    "print_results": True,
    "result_attribute": "",
    "file_path": "",
    "display_edges": False
}

res = f.runAlgorithm(
    'tg_bfs', 
    params=params,
    templateQuery=True # Set this to True to use template query. Default False.
)

Cannot read manifest file. Trying master branch.
Running the algorithm. It might take a minute to install the query if this is the first time it runs.


In [19]:
# Check result
res[0]['Start'][:10]

[{'v_id': '30786325580605',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '13194139540951',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '6597069769055',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '15393162796423',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '15393162792715',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '28587302332123',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '6597069774914',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '9079', 'attributes': {'Start.@sum_step': 2}, 'v_type': 'Person'},
 {'v_id': '21990232561273',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'},
 {'v_id': '15393162792433',
  'attributes': {'Start.@sum_step': 2},
  'v_type': 'Person'}]
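The first ten rows printed above are not identical to the first ten rows of the non-template output: the last three differ. One possible explanation, which is an assumption and not something this notebook confirms, is that the order of returned vertices is not fixed. The sketch below compares two results regardless of order. The rows are a small subset copied from the outputs above, and `same_vertices` is a hypothetical helper.

```python
def same_vertices(rows_a, rows_b):
    """Compare two BFS results as unordered collections of (type, id, step)."""
    def key(row):
        return (row["v_type"], row["v_id"], row["attributes"]["Start.@sum_step"])
    return sorted(map(key, rows_a)) == sorted(map(key, rows_b))


# Two rows from the outputs above, listed in a different order on each side.
rows_a = [
    {"v_id": "30786325580605", "v_type": "Person",
     "attributes": {"Start.@sum_step": 2}},
    {"v_id": "13194139540951", "v_type": "Person",
     "attributes": {"Start.@sum_step": 2}},
]
rows_b = [
    {"v_id": "13194139540951", "attributes": {"Start.@sum_step": 2},
     "v_type": "Person"},
    {"v_id": "30786325580605", "attributes": {"Start.@sum_step": 2},
     "v_type": "Person"},
]

print(same_vertices(rows_a, rows_b))
```

To apply this to the notebook, keep the full `res[0]['Start']` list of each run in its own variable and pass both lists to the helper.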

In [25]:
# Rerun the template query and record its run time.

start_time = time.perf_counter()
res = f.runAlgorithm(
    'tg_bfs', 
    params=params,
    templateQuery=True
)
bfs_template_time = time.perf_counter() - start_time
print("Time elapsed: {:.3} seconds".format(bfs_template_time))

Running the algorithm. It might take a minute to install the query if this is the first time it runs.
Time elapsed: 0.146 seconds


### Takeaways

In [25]:
print(
    "The template version of PageRank is {}% faster than the non-template version.".format(
    int(100*(non_template_time-template_time)/non_template_time)))



The template version of PageRank is 47% faster than the non-template version.
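The percentage above follows from the formula in the cell. As a quick check, the sketch below applies the same formula to the rounded timings printed earlier (1.36 and 0.708 seconds). The function name `percent_faster` is introduced here only for illustration.

```python
def percent_faster(baseline_seconds, candidate_seconds):
    """Relative time saved by the candidate, as a whole percentage (truncated)."""
    return int(100 * (baseline_seconds - candidate_seconds) / baseline_seconds)


# Rounded PageRank timings printed earlier in this notebook.
print(percent_faster(1.36, 0.708))  # 47
```

A negative result would mean that the candidate was slower than the baseline.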


In [29]:
print(
    "The template and non-template versions of BFS show almost the same performance ({} vs. {}) as this graph is small.".format(
    bfs_template_time, bfs_non_template_time))

The template and non-template versions of BFS show almost the same performance (0.14555528794880956 vs. 0.14016598195303231) as this graph is small.
