# Connecting and Uploading Data to TigerGraph
This notebook demonstrates how to connect to an existing TigerGraph database instance, publish a global graph schema, create a new graph, and upload data to that graph. Before starting, make sure a database instance is running and a graph named 'MyGraph' has been published in GraphStudio.

In [1]:
# Imports
import pyTigerGraph as tg
import json
import os

# Load the TigerGraph instance config (expects 'host' and 'secret' keys)
os.chdir('../config/')
with open('tigergraph.json', 'r') as f:
    config = json.load(f)

# Connection parameters
hostName = config['host']
secret = config['secret']

conn = tg.TigerGraphConnection(host=hostName, gsqlSecret=secret, graphname="MyGraph")
conn.getToken(secret)

('ialravn8hrorjlah61plt14mach58tnj', 1677228480, '2023-02-24 08:48:00')
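
The connection cell above assumes that `tigergraph.json` contains at least the `host` and `secret` keys. A minimal sketch of reading that file by explicit path, and failing early when a key is missing, might look like the following. The helper name `load_tg_config` and the sample values are hypothetical and are not part of pyTigerGraph.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("host", "secret")  # the two keys the connection cell reads


def load_tg_config(path):
    """Read a JSON config file and check that the required keys are present."""
    config = json.loads(Path(path).read_text())
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config file {path} is missing keys: {missing}")
    return config


# Demonstration with a throwaway file; the values are placeholders, not real credentials.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "tigergraph.json"
    sample_path.write_text(
        json.dumps({"host": "https://example.invalid", "secret": "placeholder"})
    )
    sample_config = load_tg_config(sample_path)

print(sorted(sample_config))
```

Reading by explicit path leaves the working directory unchanged, so any later relative paths would have to be written with that in mind.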

### Define and Publish Graph Schema
In this section, we will publish a global graph schema to TigerGraph, from which we will construct our Ethereum transaction graph. The global schema is as follows:
* Nodes = ETH Wallets: ID, label
* Directed Edges = Transactions: From_ID, To_ID, ETH_amount, timestamp

In [2]:
# DEFINE / CREATE ALL EDGES AND VERTICES in Global View
results = conn.gsql('''
  USE GLOBAL
  CREATE VERTEX Wallet (PRIMARY_ID id INT, label FLOAT) WITH primary_id_as_attribute="true"
  CREATE DIRECTED EDGE sent_eth (from Wallet, to Wallet, amount FLOAT, sent_date INT) WITH REVERSE_EDGE="reverse_sent_eth"
  CREATE DIRECTED EDGE received_eth (from Wallet, to Wallet, amount FLOAT, receive_date INT) WITH REVERSE_EDGE="reverse_received_eth"
''')
print(results)

Successfully created vertex types: [Wallet].
Successfully created edge types: [sent_eth].
Successfully created reverse edge types: [reverse_sent_eth].
Successfully created edge types: [received_eth].
Successfully created reverse edge types: [reverse_received_eth].


### Create the Ethereum Transaction Graph
We now create the transaction graph from the global schema and establish a connection to the newly created graph.

In [3]:
# Create a new graph from the global schema
results = conn.gsql('''
  CREATE GRAPH Ethereum(Wallet, sent_eth, reverse_sent_eth, received_eth, reverse_received_eth)
''')
print(results)

The graph Ethereum is created.


In [4]:
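# Connect to the newly created graph
conn.graphname = "Ethereum"
secret = conn.createSecret()  # create a secret for the Ethereum graph
conn = tg.TigerGraphConnection(host=hostName, gsqlSecret=secret, graphname="Ethereum")
conn.getToken(secret)  # request an API token using the new secret
conn.getSchema()  # display the schema to confirm the graph was created as expected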

{'GraphName': 'Ethereum',
 'VertexTypes': [{'Config': {'STATS': 'OUTDEGREE_BY_EDGETYPE',
    'PRIMARY_ID_AS_ATTRIBUTE': True},
   'Attributes': [{'AttributeType': {'Name': 'FLOAT'},
     'AttributeName': 'label'}],
   'PrimaryId': {'AttributeType': {'Name': 'INT'},
    'PrimaryIdAsAttribute': True,
    'AttributeName': 'id'},
   'Name': 'Wallet'}],
 'EdgeTypes': [{'IsDirected': True,
   'ToVertexTypeName': 'Wallet',
   'Config': {'REVERSE_EDGE': 'reverse_sent_eth'},
   'Attributes': [{'AttributeType': {'Name': 'FLOAT'},
     'AttributeName': 'amount'},
    {'AttributeType': {'Name': 'INT'}, 'AttributeName': 'sent_date'}],
   'FromVertexTypeName': 'Wallet',
   'Name': 'sent_eth'},
  {'IsDirected': True,
   'ToVertexTypeName': 'Wallet',
   'Config': {'REVERSE_EDGE': 'reverse_received_eth'},
   'Attributes': [{'AttributeType': {'Name': 'FLOAT'},
     'AttributeName': 'amount'},
    {'AttributeType': {'Name': 'INT'}, 'AttributeName': 'receive_date'}],
   'FromVertexTypeName': 'Wallet',
   

### Create Loading Jobs
We will now create custom loading jobs to map the values from our datasets to vertex and edge attributes for our transaction graph.

#### Wallets

In [5]:
# Custom loading job that maps the values of nodes.csv to VERTEX attributes
results = conn.gsql('''
  USE GRAPH Ethereum
  BEGIN
  CREATE LOADING JOB load_wallets FOR GRAPH Ethereum {
  DEFINE FILENAME MyDataSource;
  LOAD MyDataSource TO VERTEX Wallet VALUES($0, $1) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  }
  END
  ''')
print(results)

Using graph 'Ethereum'
Successfully created loading jobs: [load_wallets].


#### Transactions

In [6]:
# Custom loading job that maps the values of edges.csv to EDGE attributes
results = conn.gsql('''
  USE GRAPH Ethereum
  BEGIN
  CREATE LOADING JOB load_transactions FOR GRAPH Ethereum {
  DEFINE FILENAME MyDataSource;
  LOAD MyDataSource TO EDGE sent_eth VALUES($1, $0, $2, $3) USING SEPARATOR=",", HEADER="true", EOL="\\n";
  LOAD MyDataSource TO EDGE received_eth VALUES($0, $1, $2, $3) USING SEPARATOR=",", HEADER="true", EOL="\\n";
  }
  END''')
print(results)

Using graph 'Ethereum'
Successfully created loading jobs: [load_transactions].
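
In the loading jobs above, `$0`, `$1`, and so on refer to the columns of each CSV line by position, and `HEADER="true"` means the first line is not loaded as data. The sketch below imitates that mapping in plain Python on a tiny in-memory file; it does not call TigerGraph. The column order (`from_id,to_id,amount,timestamp`) and the sample rows are assumptions for illustration, not the actual layout of `edges.csv`, and `map_tokens` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample data; the real edges.csv may order its columns differently.
SAMPLE_EDGES = """from_id,to_id,amount,timestamp
1,2,0.5,1600000000
2,3,1.25,1600000100
"""


def map_tokens(csv_text, positions, has_header=True):
    """Pick columns by position for each data line, like VALUES($i, $j, ...)."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader)  # HEADER="true": skip the first line
    return [tuple(row[i] for i in positions) for row in reader]


# VALUES($1, $0, $2, $3), as in the sent_eth statement
sent_rows = map_tokens(SAMPLE_EDGES, (1, 0, 2, 3))
# VALUES($0, $1, $2, $3), as in the received_eth statement
received_rows = map_tokens(SAMPLE_EDGES, (0, 1, 2, 3))

print(sent_rows[0])
print(received_rows[0])
```

The two `LOAD` statements differ only in the order of the first two tokens, so each CSV line yields a `sent_eth` edge and a `received_eth` edge that point in opposite directions.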


### Load Data

Using the loading jobs we just created, we will upload wallet (node) and transaction (edge) data from our local system into the graph stored in TigerGraph.

In [8]:
os.chdir('../data/')

# Load the nodes file with the 'load_wallets' job
nodes_file = 'nodes.csv'
results = conn.runLoadingJobWithFile(filePath=nodes_file, fileTag='MyDataSource', jobName='load_wallets')
print(json.dumps(results, indent=2))

[
  {
    "sourceFileName": "Online_POST",
    "statistics": {
      "validLine": 86623,
      "rejectLine": 0,
      "failedConditionLine": 0,
      "notEnoughToken": 0,
      "invalidJson": 0,
      "oversizeToken": 0,
      "vertex": [
        {
          "typeName": "Wallet",
          "validObject": 86622,
          "noIdFound": 0,
          "invalidAttribute": 0,
          "invalidVertexType": 0,
          "invalidPrimaryId": 1,
          "invalidSecondaryId": 0,
          "incorrectFixedBinaryLength": 0
        }
      ],
      "edge": [],
      "deleteVertex": [],
      "deleteEdge": []
    }
  }
]


In [9]:
# Load the edges file with the 'load_transactions' job
edges_file = 'edges.csv'
results = conn.runLoadingJobWithFile(filePath=edges_file, fileTag='MyDataSource', jobName='load_transactions')
print(json.dumps(results, indent=2))

[
  {
    "sourceFileName": "Online_POST",
    "statistics": {
      "validLine": 401177,
      "rejectLine": 0,
      "failedConditionLine": 0,
      "notEnoughToken": 0,
      "invalidJson": 0,
      "oversizeToken": 0,
      "vertex": [],
      "edge": [
        {
          "typeName": "sent_eth",
          "validObject": 401176,
          "noIdFound": 0,
          "invalidAttribute": 0,
          "invalidVertexType": 0,
          "invalidPrimaryId": 1,
          "invalidSecondaryId": 0,
          "incorrectFixedBinaryLength": 0
        },
        {
          "typeName": "received_eth",
          "validObject": 401176,
          "noIdFound": 0,
          "invalidAttribute": 0,
          "invalidVertexType": 0,
          "invalidPrimaryId": 1,
          "invalidSecondaryId": 0,
          "incorrectFixedBinaryLength": 0
        }
      ],
      "deleteVertex": [],
      "deleteEdge": []
    }
  }
]
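
The loading reports are long, and the counters that matter are easy to miss. In both reports above, `validObject` is one less than `validLine`, with the difference recorded under `invalidPrimaryId`. A small helper along the following lines could condense a report to one row per vertex or edge type. This is a sketch that assumes the report keeps the structure printed above; `summarize_load` is a hypothetical name, and treating every non-zero counter other than `validObject` as "skipped" is an interpretation made here.

```python
def summarize_load(results):
    """Return one summary row per vertex or edge type in a loading-job report."""
    summary = []
    for report in results:
        stats = report["statistics"]
        for kind in ("vertex", "edge"):
            for entry in stats.get(kind, []):
                # Keep only the non-zero counters besides validObject;
                # they are treated here as objects that were not loaded.
                skipped = {
                    key: value
                    for key, value in entry.items()
                    if key not in ("typeName", "validObject") and value
                }
                summary.append({
                    "kind": kind,
                    "type": entry["typeName"],
                    "valid": entry["validObject"],
                    "valid_lines": stats["validLine"],
                    "skipped": skipped,
                })
    return summary


# Trimmed copy of the wallet report shown earlier.
sample_results = [{
    "sourceFileName": "Online_POST",
    "statistics": {
        "validLine": 86623,
        "rejectLine": 0,
        "vertex": [{
            "typeName": "Wallet",
            "validObject": 86622,
            "noIdFound": 0,
            "invalidPrimaryId": 1,
        }],
        "edge": [],
    },
}]

for row in summarize_load(sample_results):
    print(row)
```

In the notebook, the same helper could be called on the `results` variable returned by `runLoadingJobWithFile`.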


### Exploring the Graph

In [None]:
# TODO - ETHAN

# write GSQL queries to get some summary statistics on edges/nodes
# https://colab.research.google.com/drive/1JhYcnGVWT51KswcXZzyPzKqCoPP5htcC?usp=sharing#scrollTo=Zerob6pDgmq2 

In [None]:
# Get Node/Edge Counts
print("Vertex Counts")
for vertex in conn.getVertexTypes():
    print(f"There are {conn.getVertexCount(vertex)} {vertex} vertices in the graph")

print("--------------")
print("Edge Counts")
for edge in conn.getEdgeTypes():
    print(f"There are {conn.getEdgeCount(edge)} {edge} edges in the graph")

In [None]:
# GSQL query on amounts

In [None]:
# GSQL query on indegree/outdegree
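
As a local cross-check for degree statistics, in-degree and out-degree can also be computed directly from an edge list with `collections.Counter`. The sketch below is plain Python rather than GSQL, and the sample `(from_id, to_id)` pairs are made up for illustration; `degree_counts` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical (from_id, to_id) pairs standing in for rows of the edge file.
sample_edges = [(1, 2), (1, 3), (2, 3), (3, 1)]


def degree_counts(edges):
    """Return (out_degree, in_degree) Counters keyed by wallet ID."""
    out_degree = Counter(src for src, _ in edges)
    in_degree = Counter(dst for _, dst in edges)
    return out_degree, in_degree


out_degree, in_degree = degree_counts(sample_edges)
print("out-degree:", dict(out_degree))
print("in-degree:", dict(in_degree))
```

A `Counter` returns 0 for wallets that never appear as a source or destination, which is convenient when comparing against per-vertex results from the database.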

### Installing Queries
Install user-defined queries on the TigerGraph instance.

In [None]:
conn.gsql('''
  USE GRAPH Ethereum
  INSTALL QUERY hashtags_from_person
''')

### Run Queries


In [None]:
results = conn.runInstalledQuery("hashtags_from_person", params={"inPer": "50"})
print(json.dumps(results, indent=2))