<img src="img/GraphAISummitNEW.png" alt="Drawing" width="1000" height="100"/>

# Alex Infanzon & Bob Hardaway<br>
## Professional Sales Engineers, Amateur Data Scientists<br>
## Intro to Recommendations with TigerGraph, Docker and Python<br>

In the next 40 minutes, we will introduce the pyTigerGraph Python package and
develop a simple recommendation engine running in a portable Docker container.


# pyTigerGraph examples

TigerGraph is a graph database that offers a broad feature set and addresses some of the issues that have plagued other graph databases. This notebook demonstrates basic commands to connect to TigerGraph, create a schema, and load data using the Python pyTigerGraph module.

<img src="img/Architecture_diagram.png" alt="Drawing" width="1000" height="100"/>

## STEP 1: Import Packages

Note: This assumes the pyTigerGraph package is already installed. If it is not, install it with:
```pip install pyTigerGraph```

In [None]:
import pyTigerGraph as tg
import pandas as pd
import json
import re
from IPython.display import display
from PIL import Image

print(tg.__version__)

In [None]:
## Make sure to keep packages up to date

#pip install -U pyTigerGraph pyTigerDriver

## STEP 2: Establishing the connection to a TigerGraph database

<div>
  <img style="vertical-align:top" src="img/connected-icon.png" width="30" height="30"/>
  <span style="">The functionality of pyTigerGraph is implemented by the TigerGraphConnection class. To establish the connection to the database you need to provide the hostname, username and password to access the database.</span>
</div>


<table>
  <tr>
    <th>Connect to localhost</th>
    <th>Connect to TG Cloud</th>
  </tr>
  <tr>
    <td>conn = tg.TigerGraphConnection(<br>host='http://localhost',<br> username="tigergraph",<br> password='tigergraph'<br>) </td>
    <td>conn = tg.TigerGraphConnection(<br>host='https://tgcloud.io/app/solutions',<br> graphname="test",<br> username=userName,<br> password=password,<br> apiToken=apiToken<br>)<br> authToken = conn.getToken(secret)</td>
   </tr>

</table>

In [None]:
conn = tg.TigerGraphConnection(
    host='http://localhost',
    username="tigergraph",
    password='tigergraph')
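
Hard-coded credentials are fine for a local Docker container, but less so for a shared notebook. As a minimal sketch, the connection arguments could instead be read from environment variables. The variable names `TG_HOST`, `TG_USERNAME` and `TG_PASSWORD` and the helper `connection_params` are assumptions made for this example, not part of pyTigerGraph; the fallbacks are the localhost values used above.

```python
import os


def connection_params(env=None):
    """Collect TigerGraphConnection keyword arguments from environment variables.

    TG_HOST, TG_USERNAME and TG_PASSWORD are hypothetical variable names;
    the fallbacks match the localhost values used in this notebook.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("TG_HOST", "http://localhost"),
        "username": env.get("TG_USERNAME", "tigergraph"),
        "password": env.get("TG_PASSWORD", "tigergraph"),
    }


params = connection_params()
print({**params, "password": "***"})  # avoid echoing the real password

# With pyTigerGraph imported as tg, the connection would then be created as:
# conn = tg.TigerGraphConnection(**params)
```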

## STEP 3: Design Schema

<div>
  <img style="vertical-align:top" src="img/graph_img.png" width="30" height="30"/>
  <span style="">Before data can be loaded into the graph store, the user must define a graph schema. A graph schema is a "dictionary" that defines the types of entities in the graph (vertices and edges) and how those types are related to one another.</span>
</div>

### WARNING: DROP ALL - Will Delete everything in your graph!

Execute this cell if you would like to start the notebook lab from the beginning.

In [None]:
print(conn.gsql('''DROP ALL''', options=[]))

----
<img src="img/graph_sch.png" alt="Drawing" width="500" height="100"/>

The CREATE VERTEX statement defines a new global vertex type, with a name and an attribute list. 

The CREATE EDGE statement defines a new global edge type. There are two forms of the CREATE EDGE statement, one for directed edges and one for undirected edges.  Each edge type must specify that it connects FROM one vertex type TO another vertex type.

In [None]:
print(conn.gsql('''

CREATE VERTEX person (PRIMARY_ID name STRING, gender STRING, name STRING, age INT, state STRING) 

CREATE UNDIRECTED EDGE friendship (FROM person, TO person, connect_day datetime)

CREATE GRAPH social (person, friendship)'''
                
, options=[]))
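
To make the two forms of CREATE EDGE concrete, here is a small sketch that assembles the statement text in Python. The helper `create_edge_statement` and the `follows` edge type are hypothetical and exist only for illustration; the resulting string would still have to be sent to the database with conn.gsql().

```python
def create_edge_statement(name, from_type, to_type, attributes=None, directed=False):
    """Build a CREATE EDGE statement.

    `attributes` maps attribute names to GSQL types, e.g. {"connect_day": "DATETIME"}.
    """
    kind = "DIRECTED" if directed else "UNDIRECTED"
    parts = [f"FROM {from_type}", f"TO {to_type}"]
    parts += [f"{attr} {gsql_type}" for attr, gsql_type in (attributes or {}).items()]
    return f"CREATE {kind} EDGE {name} ({', '.join(parts)})"


# The undirected edge used in this notebook
undirected = create_edge_statement(
    "friendship", "person", "person", {"connect_day": "DATETIME"}
)
# A hypothetical directed edge, shown only to contrast the two forms
directed = create_edge_statement("follows", "person", "person", directed=True)

print(undirected)
print(directed)
```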

----
The gsql() function sends arbitrary GSQL statements to the database. Before testing that the schema creation was successful, change the connection to the social graph:

In [None]:
conn.graphname = 'social'

## STEP 4: Load data

<div>
  <img style="vertical-align:top" src="img/load_data.png" width="30" height="30"/>
  <span style="">pyTigerGraph can return results from various built-in endpoints as pandas DataFrames, and it can upsert DataFrames into the graph. To load data, first read each CSV file into a DataFrame inside the notebook. 
</span>
</div>

In [None]:
people = pd.read_csv('data/people.csv')
people

In [None]:
friendships = pd.read_csv('./data/friendships.csv')
friendships

In [None]:
v_person = conn.upsertVertexDataFrame(
      people, "person", "name"
    , attributes={"name": "name", "gender": "gender", "age": "age", "state": "state"})
print(str(v_person) + " person vertices upserted")

In [None]:
numPersons = conn.getVertexCount("person")
print(f"There are currently {numPersons} vertices of type person")

In [None]:
v_friendships = conn.upsertEdgeDataFrame(friendships,"person", "friendship", "person", from_id="person1", to_id="person2", attributes={"connect_day":"date"})
print(str(v_friendships) + " Friendships Edges Upserted")
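
Before upserting edges, it can help to check that every edge endpoint refers to a known person, which catches typos in IDs before they reach the graph. The sketch below works on plain lists of dictionaries (what `people.to_dict("records")` would give for a DataFrame). The column names `name`, `person1` and `person2` follow the cells above; the sample rows and the helper `missing_endpoints` are made up for illustration.

```python
def missing_endpoints(people_rows, friendship_rows):
    """Return the sorted IDs used in friendships that do not appear in people."""
    known = {row["name"] for row in people_rows}
    referenced = set()
    for row in friendship_rows:
        referenced.update((row["person1"], row["person2"]))
    return sorted(referenced - known)


# Made-up sample rows, not the contents of the real CSV files
people_rows = [{"name": "Tom"}, {"name": "Dan"}, {"name": "Jenny"}]
friendship_rows = [
    {"person1": "Tom", "person2": "Dan", "date": "2017-06-03"},
    {"person1": "Dan", "person2": "Kevin", "date": "2015-12-30"},
]

print(missing_endpoints(people_rows, friendship_rows))  # ['Kevin']
```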

## STEP 5: Explore Graph

### The Functions

The functions below are grouped by:

- Schema related functions - get schema information or load data into the graph
- Query related functions - run installed or interpreted GSQL queries
- Vertex related functions - count, retrieve, upsert, and delete vertices
- Edge related functions - count, retrieve, upsert, and delete edges
- Token management - request, refresh, and delete authentication tokens
- Other functions - miscellaneous utilities such as version, endpoint, and license information


| Schema related functions | Query related functions | Vertex related functions | Edge related functions | Token management | Other functions |
| :----------------------- | :---------------------- | :----------------------- | :--------------------- | :--------------- | :-------------- |
| getSchema | runInstalledQuery | getVertexTypes | getEdgeTypes | getToken | echo |
| getUDTs | runInterpretedQuery | getVertexType | getEdgeType | refreshToken | getEndpoints |
| getUDT | | getVertexCount | getEdgeCount | deleteToken | getStatistics |
| upsertData | | upsertVertex | upsertEdge | | getVersion |
| | | upsertVertices | upsertEdges | | getVer |
| | | getVertices | getEdges | | getLicenseInfo |
| | | getVerticesById | getEdgeStats | | |
| | | getVertexStats | delEdges | | |
| | | delVertices | | | |
| | | delVerticesById | | | |

### A simple 'ls' command provides a complete summary of the TigerGraph elements.

In [None]:
print(conn.gsql('''ls''', options=[]))

In [None]:
conn.getVertexTypes()

In [None]:
conn.getVertexType('person')

In [None]:
conn.getVertexStats('person')

In [None]:
conn.getEdgeTypes()

In [None]:
conn.getEdgeStats('friendship', skipNA=False)

## STEP 6: Write Queries

<div>
  <img style="vertical-align:top" src="img/query.png" width="28" height="28"/>
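  <span style="">GSQL queries can be sent from Python with gsql() and runInterpretedQuery(); for simple lookups, the built-in vertex and edge functions are often enough. 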
</span>
</div>

In [None]:
print(conn.getEdgesDataframe("person", "Jenny"))
display(Image.open("img/Explore_fig1.png"))

In [None]:
print(conn.gsql('''USE GRAPH social
   SELECT * FROM person LIMIT 3
'''))


In [None]:
conn.runInterpretedQuery('''INTERPRET QUERY () FOR GRAPH social {
    users = {person.*};
    Result = SELECT p FROM users:u-(friendship)->:p WHERE u.name == "Tom";
  PRINT Result; 
}''')
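
The query above hard-codes the name "Tom". As a sketch, the same lookup could take the name as a query parameter instead. This assumes that interpreted queries accept parameters in their signature and that runInterpretedQuery() takes a `params` dictionary; both should be checked against the installed TigerGraph and pyTigerGraph versions. The helper `friends_request` is hypothetical.

```python
# Same traversal as above, with the name passed in as the parameter `uid`
FRIENDS_QUERY = '''INTERPRET QUERY (STRING uid) FOR GRAPH social {
    users = {person.*};
    Result = SELECT p FROM users:u-(friendship)->:p WHERE u.name == uid;
    PRINT Result;
}'''


def friends_request(name):
    """Return (query_text, params) for looking up the friends of `name`."""
    if not name:
        raise ValueError("name must be a non-empty string")
    return FRIENDS_QUERY, {"uid": name}


query_text, params = friends_request("Tom")
print(params)

# Assumed call, requires a live connection:
# conn.runInterpretedQuery(query_text, params=params)
```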

In [None]:
print(conn.gsql('''USE GRAPH social
   SELECT * FROM person WHERE gender=="female"'''))

### Use the built-in getVertices() function to return individual attributes for all females

In [None]:
res=conn.getVertices('person', select='name,age,gender', where='gender=="female"')
res

### Retrieve the individual attributes for each person, in order to create a feature matrix

In [None]:
attrs=[x['attributes'] for x in res]
attrs
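
One possible way to turn such a list of attribute dictionaries into a numeric feature matrix is sketched below. The sample rows are made up (a male row is included only to show the encoding), and the choice of features, age plus a 0/1 gender flag, is an assumption made for illustration rather than the method used later in this notebook.

```python
def feature_matrix(attribute_rows):
    """Encode each person as [age, is_female] using plain Python lists."""
    rows = []
    for attrs in attribute_rows:
        is_female = 1.0 if attrs["gender"] == "female" else 0.0
        rows.append([float(attrs["age"]), is_female])
    return rows


# Made-up rows shaped like the `attrs` list above
sample_attrs = [
    {"name": "Jenny", "age": 25, "gender": "female"},
    {"name": "Amily", "age": 22, "gender": "female"},
    {"name": "Tom", "age": 40, "gender": "male"},
]

matrix = feature_matrix(sample_attrs)
print(matrix)  # [[25.0, 1.0], [22.0, 1.0], [40.0, 0.0]]
```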

### Use the API to retrieve Dan's friends

In [None]:
sourceVertexType='person'
sourceVertexId='Dan'
conn.getEdges(sourceVertexType, sourceVertexId, edgeType=None, targetVertexType=None, targetVertexId=None, select="", where="", limit="", sort="", timeout=0)

In [None]:
attrs=res[-2]['attributes']
attrs

### Create a pandas DataFrame based on the list returned from TigerGraph

In [None]:
df = pd.DataFrame(res)
df
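
Passing the raw list to pd.DataFrame leaves each vertex's attributes nested in a single dictionary column. A minimal sketch of flattening the records first, so that every attribute becomes its own column, is shown below. It assumes each record carries `v_id`, `v_type` and `attributes` keys; the sample data and the helper `flatten_vertices` are made up for illustration.

```python
def flatten_vertices(vertices):
    """Merge each vertex's nested attributes into one flat dictionary."""
    return [
        {"v_id": v["v_id"], "v_type": v["v_type"], **v["attributes"]}
        for v in vertices
    ]


# Made-up records shaped like the `res` list above (assumed keys)
sample_res = [
    {"v_id": "Jenny", "v_type": "person",
     "attributes": {"name": "Jenny", "age": 25, "gender": "female"}},
    {"v_id": "Amily", "v_type": "person",
     "attributes": {"name": "Amily", "age": 22, "gender": "female"}},
]

flat = flatten_vertices(sample_res)
print(flat[0])

# With pandas imported as pd, the flat records give one column per attribute:
# df = pd.DataFrame(flat)
```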

### You can also directly execute a query via the connection

In [None]:
conn.gsql('''select * from person where primary_id=="Tom"''')

### The power of TigerGraph lies in traversing relationships in the data. The query below retrieves Tom's friends, the first hop toward finding friends of friends

In [None]:
conn.runInterpretedQuery('''
  INTERPRET QUERY () FOR GRAPH social {
  # declaration statements
  STRING uid = "Tom";
  users = {person.*};
  # body statements: follow friendship edges from the person named uid
  friends = SELECT p
    FROM users:u-(friendship)->:p
    WHERE u.name == uid;
  PRINT friends; 
}
''')
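
The query above covers one hop. A friends-of-friends recommendation repeats the hop and ranks the people reached by the number of mutual friends. The sketch below shows that idea in plain Python on a made-up, in-memory adjacency map; it illustrates the traversal logic only and is not a GSQL query or the recommendation engine itself. The helper `recommend_friends` is hypothetical.

```python
from collections import Counter


def recommend_friends(friends, person, top_n=3):
    """Rank friends-of-friends of `person` by their number of mutual friends."""
    direct = friends.get(person, set())
    counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the person themselves and people they already know
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most mutual friends first; ties broken alphabetically
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up undirected friendships, stored in both directions
friends = {
    "Tom": {"Dan", "Jenny"},
    "Dan": {"Tom", "Kevin", "Nancy"},
    "Jenny": {"Tom", "Kevin"},
    "Kevin": {"Dan", "Jenny"},
    "Nancy": {"Dan"},
}

print(recommend_friends(friends, "Tom"))  # [('Kevin', 2), ('Nancy', 1)]
```

Kevin ranks first because both of Tom's friends know him, while Nancy is reachable through Dan only.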

In [None]:
conn.runInterpretedQuery('''
  INTERPRET QUERY () FOR GRAPH social {
    person1 = {person.*};
    Result = SELECT tgt
           FROM person1:s-(friendship:e)-person:tgt;
    PRINT Result; 
}
''')

### Dive deeper into the network....

In [None]:
conn.getEdges('person', 'Jenny'
              , edgeType='friendship'
              , targetVertexType='person'
              , targetVertexId=None, select="connect_day", where="", limit="", sort="", timeout=0)
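
The edge list returned by a call like this can be reshaped into something easier to read, for example friends ordered by the day they connected. The sketch below assumes each edge is a dictionary with `to_id` and `attributes` keys and that connect_day comes back as a sortable timestamp string; the sample edges and the helper `friends_by_date` are made up for illustration.

```python
def friends_by_date(edges):
    """Return (friend, connect_day) pairs ordered from oldest to newest."""
    pairs = [(e["attributes"]["connect_day"], e["to_id"]) for e in edges]
    return [(name, day) for day, name in sorted(pairs)]


# Made-up edges shaped like the assumed getEdges() result
sample_edges = [
    {"e_type": "friendship", "from_id": "Jenny", "to_id": "Tom",
     "attributes": {"connect_day": "2015-01-01 00:00:00"}},
    {"e_type": "friendship", "from_id": "Jenny", "to_id": "Dan",
     "attributes": {"connect_day": "2016-08-03 00:00:00"}},
    {"e_type": "friendship", "from_id": "Jenny", "to_id": "Amily",
     "attributes": {"connect_day": "2015-06-08 00:00:00"}},
]

ordered = friends_by_date(sample_edges)
for name, day in ordered:
    print(name, day)
```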