In [1]:
from pyTigerGraph import TigerGraphConnection

#### Initialize connection to database

In [4]:
conn = TigerGraphConnection(
    host="http://127.0.0.1",
    username="tigergraph",
    password="tigergraph",
    graphname="Cora")

In [6]:
# The graph in the database is the Cora graph with vertex type "Paper2"
# and edge type "Cite2". Print the schema to inspect it.

print(conn.gsql("ls"))

#### Get graph sizes

In [4]:
# Get number of vertices of every type
conn.getVertexCount('*')

{'Paper2': 2708}

In [5]:
# Get number of vertices of a specific type
conn.getVertexCount('Paper2')

2708

In [6]:
# Get number of edges of every type
conn.getEdgeCount('*')

{'Cite2': 10556}

In [7]:
# Get number of edges of a specific type
conn.getEdgeCount('Cite2')

10556

#### Randomly split edges to train/test sets

The split results are stored in boolean edge attributes, one per partition. An edge belongs to a partition when the corresponding attribute is True. These attributes must already exist in the graph schema before the splitter runs.

In the code below, about 80% of the edges, chosen at random, have their attribute "train_mask" set to True, and about 20% have their attribute "val_mask" set to True. The two parts are disjoint.

In [8]:
splitter = conn.gds.edgeSplitter(train_mask=0.8, val_mask=0.2)
splitter.run()

Installing and optimizing queries. It might take a minute if this is the first time you use this loader.
Query installation finished.
Splitting edges...
Edge split finished successfully.
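
Conceptually, the split assigns each edge to exactly one partition at random. The following is a minimal pure-Python sketch of that idea, not the query that `edgeSplitter` installs in the database. The function name `split_edges` and the seeded shuffle are assumptions made for illustration.

```python
import random

def split_edges(num_edges, fractions, seed=0):
    """Assign each edge index to at most one named partition.

    `fractions` maps a mask name to the share of edges it should receive.
    Returns a dict of boolean lists, one list per mask name.
    """
    if sum(fractions.values()) > 1.0 + 1e-9:
        raise ValueError("fractions must sum to at most 1")

    # Random permutation of the edge indices
    order = list(range(num_edges))
    random.Random(seed).shuffle(order)

    masks = {name: [False] * num_edges for name in fractions}
    start = 0
    for name, frac in fractions.items():
        count = round(frac * num_edges)
        # Consecutive, non-overlapping slices of the permutation
        for idx in order[start:start + count]:
            masks[name][idx] = True
        start += count
    return masks

# Toy example with 10 edges and the same fractions as above
masks = split_edges(10, {"train_mask": 0.8, "val_mask": 0.2})
```

Because each partition takes a separate slice of one shuffled list, no edge can end up with both masks set to True.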


#### Get subgraphs for each partition of edges

We first get the subgraph formed by the training edges by setting `filter_by="train_mask"`. Since the graph is small, we can pull the whole subgraph at once with `num_batches=1`; for larger graphs, increase `num_batches` to pull the subgraph in batches.

In [9]:
train_loader = conn.gds.graphLoader(
    v_in_feats=["x"],
    v_out_labels=[],
    v_extra_feats=[],
    e_in_feats=["time"],
    e_out_labels=[],
    e_extra_feats=["train_mask", "val_mask"],
    num_batches=1,
    shuffle=False,
    filter_by="train_mask",
    output_format="PyG",
    add_self_loop=False,
    loader_id=None,
    buffer_size=4
)

Installing and optimizing queries. It might take a minute if this is the first time you use this loader.
Query installation finished.


In [10]:
# Get data from the data loader
train_graph = train_loader.data
# Edge features provided by `e_in_feats` are converted to 
# tensor `edge_feat`
train_graph

Data(edge_index=[2, 8426], edge_feat=[8426], train_mask=[8426], val_mask=[8426], x=[2689, 1433])
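
The train subgraph reports 2689 vertices (`x=[2689, 1433]`) although the full graph has 2708. One plausible reading, stated here as an assumption rather than documented loader behavior, is that only vertices touched by at least one selected edge are returned, with their IDs renumbered from 0. A minimal sketch of that idea, using a hypothetical helper `edge_subgraph`:

```python
def edge_subgraph(edges, mask):
    """Keep edges whose mask is True and renumber their endpoints from 0."""
    kept = [e for e, keep in zip(edges, mask) if keep]
    # Vertices that appear in at least one kept edge, in sorted order
    vertices = sorted({v for e in kept for v in e})
    new_id = {v: i for i, v in enumerate(vertices)}
    # Same [2, num_edges] layout as PyG's edge_index
    edge_index = [
        [new_id[src] for src, _ in kept],  # row 0: source vertices
        [new_id[dst] for _, dst in kept],  # row 1: target vertices
    ]
    return vertices, edge_index

# Toy graph: 5 vertices, 4 edges, 2 of them in the selected partition
edges = [(0, 1), (1, 2), (3, 4), (4, 0)]
mask = [True, True, False, False]
vertices, edge_index = edge_subgraph(edges, mask)
```

In this toy graph, vertices 3 and 4 have no selected edge, so the subgraph contains three vertices instead of five.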

We then get the subgraph formed by the validation edges by setting `filter_by="val_mask"`. As before, the graph is small enough to pull the whole subgraph at once with `num_batches=1`; for larger graphs, increase `num_batches` to pull the subgraph in batches.

In [11]:
val_loader = conn.gds.graphLoader(
    v_in_feats=["x"],
    v_out_labels=[],
    v_extra_feats=[],
    e_in_feats=["time"],
    e_out_labels=[],
    e_extra_feats=["train_mask", "val_mask"],
    num_batches=1,
    shuffle=False,
    filter_by="val_mask",
    output_format="PyG",
    add_self_loop=False,
    loader_id=None,
    buffer_size=4
)

In [12]:
# Get data from the validation data loader
val_graph = val_loader.data
# Edge features provided by `e_in_feats` are converted to 
# tensor `edge_feat`
val_graph

Data(edge_index=[2, 2130], edge_feat=[2130], train_mask=[2130], val_mask=[2130], x=[1887, 1433])
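
As a quick sanity check, the edge counts of the two subgraphs can be compared with the total returned by `getEdgeCount`. The numbers below are copied from the outputs in this notebook; nothing is fetched from the database.

```python
total_edges = 10556   # conn.getEdgeCount('Cite2')
train_edges = 8426    # number of columns in train_graph.edge_index
val_edges = 2130      # number of columns in val_graph.edge_index

# Disjoint partitions that together cover every edge must sum to the total
covers_all = train_edges + val_edges == total_edges

# Observed share of each partition
train_share = train_edges / total_edges
val_share = val_edges / total_edges
print(f"covers all edges: {covers_all}")
print(f"train share: {train_share:.3f}, val share: {val_share:.3f}")
```

The observed shares are close to, but not exactly, 0.8 and 0.2, which is what one would expect if each edge is assigned at random rather than by an exact quota.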