<a href="https://colab.research.google.com/github/versant2612/jnotebooks/blob/main/tigergraph/pyTigerGraph101.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Install pyTigerGraph

In [1]:
# Setup
!pip install pyTigerGraph



## Add Imports and Establish Initial Connection

In [2]:
# Imports
import pyTigerGraph as tg
import json
import pandas as pd

# Connection parameters
hostName = "https://socialnetwork-tg.i.tgcloud.io"
userName = "tigergraph"
password = "tigergraph"

conn = tg.TigerGraphConnection(host=hostName, username=userName, password=password)

print("Connected")

Connected


## Define and Publish the Schema

Hashtag is modeled as a vertex because we will need to filter by its value. Would that not be possible if it were only an attribute?
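One way to look at that question, as a rough sketch with made-up data rather than a description of TigerGraph internals: if hashtags were only an attribute of a post, finding every post with a given tag would mean scanning all posts, whereas modeling each hashtag as a vertex lets a query start at the tag and follow its edges back to the posts. The dictionaries and post IDs below are hypothetical.

```python
# Hypothetical toy data: post id -> hashtags stored as a plain attribute.
posts = {
    "1": ["Compatible", "Organized", "workforce"],
    "2": ["Organized"],
    "3": ["middleware"],
}

def posts_with_tag_scan(tag):
    """Attribute-style lookup: inspect every post and filter on its tags."""
    return sorted(pid for pid, tags in posts.items() if tag in tags)

# Vertex-style modeling: each hashtag is its own node with reverse edges
# to the posts that carry it (analogous to reverse_has_tag in the schema).
reverse_has_tag = {}
for pid, tags in posts.items():
    for tag in tags:
        reverse_has_tag.setdefault(tag, set()).add(pid)

def posts_with_tag_traverse(tag):
    """Start at the hashtag node and follow its reverse edges."""
    return sorted(reverse_has_tag.get(tag, set()))

print(posts_with_tag_scan("Organized"))
print(posts_with_tag_traverse("Organized"))
```

Both functions return the same posts, so filtering on an attribute is still possible. The vertex model mainly makes it natural to traverse from a tag to its posts and onward to the people who posted or liked them.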

In [3]:
# DEFINE / CREATE ALL EDGES AND VERTICES 
results = conn.gsql('''
  USE GLOBAL
  CREATE VERTEX Person (PRIMARY_ID id STRING, name STRING, email STRING, username STRING, created_at DATETIME) WITH primary_id_as_attribute="true"
  CREATE VERTEX Post (PRIMARY_ID id STRING, content STRING, posted_date DATETIME, deleted BOOL) WITH primary_id_as_attribute="true"
  CREATE VERTEX Hashtag (PRIMARY_ID tag STRING) WITH primary_id_as_attribute="true"
  CREATE VERTEX Message (PRIMARY_ID id STRING, subject STRING, body STRING) WITH primary_id_as_attribute="true"
  CREATE DIRECTED EDGE posted (From Person, To Post, post_date DATETIME) WITH REVERSE_EDGE="reverse_posted"
  CREATE DIRECTED EDGE liked (From Person, To Post, like_date DATETIME) WITH REVERSE_EDGE="reverse_liked"
  CREATE DIRECTED EDGE has_tag (From Post, To Hashtag) WITH REVERSE_EDGE="reverse_has_tag"
  CREATE DIRECTED EDGE sent_message (From Person, To Message, to_person STRING, sent_date DATETIME) WITH REVERSE_EDGE="reverse_sent_message"
  CREATE DIRECTED EDGE received_message (From Message, To Person, from_person STRING, receive_date DATETIME, opened_date DATETIME) WITH REVERSE_EDGE="reverse_received_message"
''')
print(results)

Successfully created vertex types: [Person].
Successfully created vertex types: [Post].
Successfully created vertex types: [Hashtag].
Successfully created vertex types: [Message].
Successfully created edge types: [posted].
Successfully created reverse edge types: [reverse_posted].
Successfully created edge types: [liked].
Successfully created reverse edge types: [reverse_liked].
Successfully created edge types: [has_tag].
Successfully created reverse edge types: [reverse_has_tag].
Successfully created edge types: [sent_message].
Successfully created reverse edge types: [reverse_sent_message].
Successfully created edge types: [received_message].
Successfully created reverse edge types: [reverse_received_message].


## Create the Graph

In [4]:
results = conn.gsql('CREATE GRAPH MyGraph(Person, Post, Hashtag, Message, posted, reverse_posted, liked, reverse_liked, has_tag, reverse_has_tag, sent_message, reverse_sent_message, received_message, reverse_received_message)')

In [5]:
# conn.graphname="MyGraph"
secret = conn.createSecret()
authToken = conn.getToken(secret)
authToken = authToken[0]
# print(authToken)
# authToken = 'rc7reopbis1667ksgcppq5v5fb99p6s1'
conn = tg.TigerGraphConnection(host=hostName, graphname="MyGraph", username=userName, password=password, apiToken=authToken)

def pprint(string):
  print(json.dumps(string, indent=2))

## Clone the Data

In [6]:
!git clone https://github.com/DanBarkus/TigerGraph-101.git

fatal: destination path 'TigerGraph-101' already exists and is not an empty directory.


## Create Loading Jobs

### Posts

Let's take a look at what one of our files looks like so we can write a loading job.

In [7]:
!head -n 2 /content/TigerGraph-101/posts.csv

id,content,posted_date,by_user,deleted,hashtag_1,hashtag_2,hashtag_3,hashtag_4
1,"Proin interdum mauris non ligula pellentesque ultrices. Phasellus id sapien in sapien iaculis congue. Vivamus metus arcu, adipiscing molestie, hendrerit at, vulputate vitae, nisl. Aenean lectus. Pellentesque eget nunc. Donec quis orci eget orci vehicula condimentum. Curabitur in libero ut massa volutpat convallis. Morbi odio odio, elementum eu, interdum eu, tincidunt in, leo.",2019-08-04 20:43:08,69,False,Compatible,Organized,,workforce


This file and the messages file look messy in this preview because they contain sentences of dummy text, which the raw CSV display does not render well. For writing the loading jobs, the header row is all you need.

Here it's important to note that the `$0`, `$1` values line up with the columns of your data.
In this example:
- `$0` is the `id` column,
- `$1` is `content`,
- `$2` is `posted_date`
- and so on
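To double-check the positions before writing a loading job, a small helper can list every `$n` token next to its column name. This is only a sketch: the header string is copied from the `posts.csv` preview above, and `column_positions` is a hypothetical helper, not part of pyTigerGraph.

```python
import csv
import io

# Header line copied from the posts.csv preview above.
header_line = "id,content,posted_date,by_user,deleted,hashtag_1,hashtag_2,hashtag_3,hashtag_4"

def column_positions(header):
    """Return a dict mapping each '$n' token to its column name."""
    names = next(csv.reader(io.StringIO(header)))
    return {f"${i}": name for i, name in enumerate(names)}

positions = column_positions(header_line)
for token, name in positions.items():
    print(f"{token} -> {name}")
```

With this mapping, `$3` is `by_user` and `$5` through `$8` are the four hashtag columns, which is how the `load_posts` job below refers to them.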

In [8]:
results = conn.gsql('''
  USE GRAPH MyGraph
  BEGIN
  CREATE LOADING JOB load_posts FOR GRAPH MyGraph {
  DEFINE FILENAME MyDataSource;
  LOAD MyDataSource TO VERTEX Post VALUES($0, $1, $2, $4) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO VERTEX Hashtag VALUES($5) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO VERTEX Hashtag VALUES($6) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO VERTEX Hashtag VALUES($7) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO VERTEX Hashtag VALUES($8) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO EDGE posted VALUES($3, $0, $2) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO EDGE has_tag VALUES($0, $5) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO EDGE has_tag VALUES($0, $6) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO EDGE has_tag VALUES($0, $7) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO EDGE has_tag VALUES($0, $8) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  }
  END
  ''')
print(results)

Using graph 'MyGraph'
Successfully created loading jobs: [load_posts].


### Likes

In [9]:
!head -n 2 /content/TigerGraph-101/likes.csv

id,by_user,liked_post,liked_date
1,65,798,2019-01-02 08:11:14


In [10]:
results = conn.gsql('''
  USE GRAPH MyGraph
  BEGIN
  CREATE LOADING JOB load_likes FOR GRAPH MyGraph {
  DEFINE FILENAME MyDataSource;
  LOAD MyDataSource TO EDGE liked VALUES($1, $2, $3) USING SEPARATOR=",", HEADER="true", EOL="\\n";
  }
  END
  ''')
print(results)

Using graph 'MyGraph'
Successfully created loading jobs: [load_likes].


### Messages

In [11]:
!head -n 2 /content/TigerGraph-101/messages.csv

id,body,subject,by_user,to_user,send_date,read_date
1,"Integer pede justo, lacinia eget, tincidunt eget, tempus vel, pede. Morbi porttitor lorem id ligula. Suspendisse ornare consequat lectus. In est risus, auctor sed, tristique in, tempus sit amet, sem. Fusce consequat. Nulla nisl.",Nulla facilisi.,22,62,2019-05-01 03:40:51,2019-06-10 03:40:51


In [12]:
results = conn.gsql('''
  USE GRAPH MyGraph
  BEGIN
  CREATE LOADING JOB load_messages FOR GRAPH MyGraph {
  DEFINE FILENAME MyDataSource;
  LOAD MyDataSource TO VERTEX Message VALUES($0, $2, $1) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO EDGE sent_message VALUES($3, $0, $4, $5) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  LOAD MyDataSource TO EDGE received_message VALUES($0, $4, $3, $5, $6) USING SEPARATOR=",", HEADER="true", EOL="\\n", QUOTE="double";
  }
  END''')
print(results)

Using graph 'MyGraph'
Successfully created loading jobs: [load_messages].


### Users

In [13]:
!head -n 2 /content/TigerGraph-101/users.csv

id,email,username,name,join_date
1,mgeistmann0@accuweather.com,mgeistmann0,Marvin Geistmann,2018-09-30 07:31:34


In [14]:
results = conn.gsql('''
  USE GRAPH MyGraph
  BEGIN
  CREATE LOADING JOB load_people FOR GRAPH MyGraph {
  DEFINE FILENAME MyDataSource;
  LOAD MyDataSource TO VERTEX Person VALUES($0, $3, $1, $2, $4) USING SEPARATOR=",", HEADER="true", EOL="\\n";
  }
  END
  ''')
print(results)

Using graph 'MyGraph'
Successfully created loading jobs: [load_people].


In [15]:
results = conn.gsql('''
USE GRAPH MyGraph
SHOW JOB *
''')
pprint(results)


## Load Data

In [16]:
# Load the posts file with the 'load_posts' job
posts_file = '/content/TigerGraph-101/posts.csv'
results = conn.uploadFile(posts_file, fileTag='MyDataSource', jobName='load_posts')
print(json.dumps(results, indent=2))

[
  {
    "sourceFileName": "Online_POST",
    "statistics": {
      "validLine": 1001,
      "rejectLine": 0,
      "failedConditionLine": 0,
      "notEnoughToken": 0,
      "invalidJson": 0,
      "oversizeToken": 0,
      "vertex": [
        {
          "typeName": "Post",
          "validObject": 1000,
          "noIdFound": 0,
          "invalidAttribute": 1,
          "invalidAttributeLines": [
            "1:posted_date"
          ],
          "invalidAttributeLinesData": [
            "id,content,posted_date,by_user,deleted,hashtag_1,hashtag_2,hashtag_3,hashtag_4\n"
          ],
          "invalidVertexType": 0,
          "invalidPrimaryId": 0,
          "invalidSecondaryId": 0,
          "incorrectFixedBinaryLength": 0
        },
        {
          "typeName": "Hashtag",
          "validObject": 1001,
          "noIdFound": 0,
          "invalidAttribute": 0,
          "invalidVertexType": 0,
          "invalidPrimaryId": 0,
          "invalidSecondaryId": 0,
          "inco

In [17]:
# Load the likes file with the 'load_likes' job
likes_file = '/content/TigerGraph-101/likes.csv'
results = conn.uploadFile(likes_file, fileTag='MyDataSource', jobName='load_likes')
print(json.dumps(results, indent=2))

[
  {
    "sourceFileName": "Online_POST",
    "statistics": {
      "validLine": 5001,
      "rejectLine": 0,
      "failedConditionLine": 0,
      "notEnoughToken": 0,
      "invalidJson": 0,
      "oversizeToken": 0,
      "vertex": [],
      "edge": [
        {
          "typeName": "liked",
          "validObject": 5000,
          "noIdFound": 0,
          "invalidAttribute": 1,
          "invalidAttributeLines": [
            "1:like_date"
          ],
          "invalidAttributeLinesData": [
            "id,by_user,liked_post,liked_date\n"
          ],
          "invalidVertexType": 0,
          "invalidPrimaryId": 0,
          "invalidSecondaryId": 0,
          "incorrectFixedBinaryLength": 0
        }
      ],
      "deleteVertex": [],
      "deleteEdge": []
    }
  }
]


In [18]:
# Load the messages file with the 'load_messages' job
messages_file = '/content/TigerGraph-101/messages.csv'
results = conn.uploadFile(messages_file, fileTag='MyDataSource', jobName='load_messages')
print(json.dumps(results, indent=2))

[
  {
    "sourceFileName": "Online_POST",
    "statistics": {
      "validLine": 1001,
      "rejectLine": 0,
      "failedConditionLine": 0,
      "notEnoughToken": 0,
      "invalidJson": 0,
      "oversizeToken": 0,
      "vertex": [
        {
          "typeName": "Message",
          "validObject": 1001,
          "noIdFound": 0,
          "invalidAttribute": 0,
          "invalidVertexType": 0,
          "invalidPrimaryId": 0,
          "invalidSecondaryId": 0,
          "incorrectFixedBinaryLength": 0
        }
      ],
      "edge": [
        {
          "typeName": "sent_message",
          "validObject": 1000,
          "noIdFound": 0,
          "invalidAttribute": 1,
          "invalidAttributeLines": [
            "1:sent_date"
          ],
          "invalidAttributeLinesData": [
            "id,body,subject,by_user,to_user,send_date,read_date\n"
          ],
          "invalidVertexType": 0,
          "invalidPrimaryId": 0,
          "invalidSecondaryId": 0,
          "i

In [19]:
# Load the people file with the 'load_people' job
people_file = '/content/TigerGraph-101/users.csv'
results = conn.uploadFile(people_file, fileTag='MyDataSource', jobName='load_people')
print(json.dumps(results, indent=2))

[
  {
    "sourceFileName": "Online_POST",
    "statistics": {
      "validLine": 101,
      "rejectLine": 0,
      "failedConditionLine": 0,
      "notEnoughToken": 0,
      "invalidJson": 0,
      "oversizeToken": 0,
      "vertex": [
        {
          "typeName": "Person",
          "validObject": 100,
          "noIdFound": 0,
          "invalidAttribute": 1,
          "invalidAttributeLines": [
            "1:created_at"
          ],
          "invalidAttributeLinesData": [
            "id,email,username,name,join_date\n"
          ],
          "invalidVertexType": 0,
          "invalidPrimaryId": 0,
          "invalidSecondaryId": 0,
          "incorrectFixedBinaryLength": 0
        }
      ],
      "edge": [],
      "deleteVertex": [],
      "deleteEdge": []
    }
  }
]
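The loading reports above are long, so it can help to condense them. The sketch below works on a trimmed copy of the `likes.csv` report shown earlier; `summarize_load` is a hypothetical helper, and it assumes the report keeps the structure printed above. In each report the single invalid attribute comes from the CSV header row, as the `invalidAttributeLinesData` entries show.

```python
# Trimmed copy of the report printed for likes.csv above.
report = [
    {
        "sourceFileName": "Online_POST",
        "statistics": {
            "validLine": 5001,
            "rejectLine": 0,
            "vertex": [],
            "edge": [
                {
                    "typeName": "liked",
                    "validObject": 5000,
                    "invalidAttribute": 1,
                    "invalidAttributeLines": ["1:like_date"],
                }
            ],
        },
    }
]

def summarize_load(report):
    """Collect (type name, valid objects, invalid attributes) per vertex/edge type."""
    rows = []
    for source in report:
        stats = source["statistics"]
        for kind in ("vertex", "edge"):
            for entry in stats.get(kind, []):
                rows.append(
                    (entry["typeName"], entry["validObject"], entry.get("invalidAttribute", 0))
                )
    return rows

for name, valid, invalid in summarize_load(report):
    print(f"{name}: {valid} valid, {invalid} with invalid attributes")
```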


## Exploring the Graph

### Get Vertex and Edge Schema

In [20]:
results = conn.getVertexTypes()
print(f"Vertices: {results}")
vertices = results

results = conn.getEdgeTypes()
print(f"Edges: {results}")
edges = results

Vertices: ['Person', 'Post', 'Hashtag', 'Message']
Edges: ['posted', 'liked', 'has_tag', 'sent_message', 'received_message']


In [21]:

print("Results for Message vertex")
pprint(conn.getVertexType("Message"))

print("-----------------")
print("Results for liked edge")
pprint(conn.getEdgeType("liked"))


Results for Message vertex
{
  "Config": {
    "TAGGABLE": false,
    "STATS": "OUTDEGREE_BY_EDGETYPE",
    "PRIMARY_ID_AS_ATTRIBUTE": true
  },
  "Attributes": [
    {
      "AttributeType": {
        "Name": "STRING"
      },
      "IsPartOfCompositeKey": false,
      "PrimaryIdAsAttribute": false,
      "AttributeName": "subject",
      "HasIndex": false,
      "internalAttribute": false,
      "IsPrimaryKey": false
    },
    {
      "AttributeType": {
        "Name": "STRING"
      },
      "IsPartOfCompositeKey": false,
      "PrimaryIdAsAttribute": false,
      "AttributeName": "body",
      "HasIndex": false,
      "internalAttribute": false,
      "IsPrimaryKey": false
    }
  ],
  "PrimaryId": {
    "AttributeType": {
      "Name": "STRING"
    },
    "IsPartOfCompositeKey": false,
    "PrimaryIdAsAttribute": true,
    "AttributeName": "id",
    "HasIndex": false,
    "internalAttribute": false,
    "IsPrimaryKey": false
  },
  "Name": "Message"
}
-----------------
Results for l

## Counting Data

In [22]:
print("Vertex Counts")
for vertex in vertices:
  print(f"There are {conn.getVertexCount(vertex)} {vertex} vertices in the graph")

print("--------------")
print("Edge Counts")
for edge in edges:
  print(f"There are {conn.getEdgeCount(edge)} {edge} edges in the graph")

Vertex Counts
There are 100 Person vertices in the graph
There are 1001 Post vertices in the graph
There are 308 Hashtag vertices in the graph
There are 1001 Message vertices in the graph
--------------
Edge Counts
There are 1000 posted edges in the graph
There are 4869 liked edges in the graph
There are 2567 has_tag edges in the graph
There are 1000 sent_message edges in the graph
There are 1000 received_message edges in the graph


## Extracting Data

### Vertex/Edge Set Format

#### Getting a Vertex

In [23]:
results = conn.getVerticesById("Person", "30")
pprint(results)

[
  {
    "v_id": "30",
    "v_type": "Person",
    "attributes": {
      "id": "30",
      "name": "Garrot Mattin",
      "email": "gmattint@dion.ne.jp",
      "username": "gmattint",
      "created_at": "2018-12-23 08:52:51"
    }
  }
]


#### Or Multiple Vertices

In [24]:
tdf1 = conn.getVerticesById("Post", ["200","400"])
pprint(tdf1)

[
  {
    "v_id": "200",
    "v_type": "Post",
    "attributes": {
      "id": "200",
      "content": "Fusce consequat. Nulla nisl. Nunc nisl. Duis bibendum, felis sed interdum venenatis, turpis enim blandit mi, in porttitor pede justo eu massa. Donec dapibus.",
      "posted_date": "2021-02-01 17:37:03",
      "deleted": true
    }
  },
  {
    "v_id": "400",
    "v_type": "Post",
    "attributes": {
      "id": "400",
      "content": "Nulla neque libero, convallis eget, eleifend luctus, ultricies eu, nibh. Quisque id justo sit amet sapien dignissim vestibulum. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Nulla dapibus dolor vel est. Donec odio justo, sollicitudin ut, suscipit a, feugiat et, eros.",
      "posted_date": "2020-03-16 11:19:30",
      "deleted": false
    }
  }
]


#### Count Edges Connected to a Vertex

In [25]:
results = conn.getEdgeCountFrom("Person", "38")
pprint(results)

{
  "posted": 6,
  "reverse_posted": 0,
  "liked": 38,
  "reverse_liked": 0,
  "has_tag": 0,
  "reverse_has_tag": 0,
  "sent_message": 13,
  "reverse_sent_message": 0,
  "received_message": 0,
  "reverse_received_message": 13
}


#### Show all Edges Connected to a Vertex

In [26]:
results = conn.getEdges("Post", "345")
pprint(results)

[
  {
    "e_type": "reverse_posted",
    "directed": true,
    "from_id": "345",
    "from_type": "Post",
    "to_id": "59",
    "to_type": "Person",
    "attributes": {
      "post_date": "2021-04-05 22:33:38"
    }
  },
  {
    "e_type": "reverse_liked",
    "directed": true,
    "from_id": "345",
    "from_type": "Post",
    "to_id": "63",
    "to_type": "Person",
    "attributes": {
      "like_date": "2020-12-17 22:14:48"
    }
  },
  {
    "e_type": "reverse_liked",
    "directed": true,
    "from_id": "345",
    "from_type": "Post",
    "to_id": "85",
    "to_type": "Person",
    "attributes": {
      "like_date": "2020-03-16 15:25:18"
    }
  },
  {
    "e_type": "reverse_liked",
    "directed": true,
    "from_id": "345",
    "from_type": "Post",
    "to_id": "30",
    "to_type": "Person",
    "attributes": {
      "like_date": "2019-07-25 11:46:19"
    }
  },
  {
    "e_type": "reverse_liked",
    "directed": true,
    "from_id": "345",
    "from_type": "Post",
    "to_id": 

### As Pandas Dataframe
pyTigerGraph can also return all of the above as native pandas DataFrames.

#### All Vertices of one Type

In [27]:
df1 = conn.getVertexDataframe("Hashtag")
print(df1)

                 v_id               tag
0          middleware        middleware
1          Horizontal        Horizontal
2     next generation   next generation
3           Operative         Operative
4     object-oriented   object-oriented
..                ...               ...
303      service-desk      service-desk
304  Open-architected  Open-architected
305         Ergonomic         Ergonomic
306          software          software
307        complexity        complexity

[308 rows x 2 columns]


#### One or More Vertices

In [28]:
df2 = conn.getVertexDataframeById("Post", ["45"])
print(df2)

  v_id  id  ...          posted_date deleted
0   45  45  ...  2019-05-30 13:14:33   False

[1 rows x 5 columns]


#### Convert Vertex/Edge Set to Dataframe
We'll use the results from the 'Or Multiple Vertices' cell. 

In [29]:
df3 = conn.vertexSetToDataFrame(tdf1)
print(df3)

  v_id   id  ...          posted_date deleted
0  200  200  ...  2021-02-01 17:37:03    True
1  400  400  ...  2020-03-16 11:19:30   False

[2 rows x 5 columns]


#### Get Edges

In [30]:
df4 = conn.getEdgesDataframe("Post", "344", limit=3)
print(df4)

  from_type from_id to_type to_id            post_date            like_date
0      Post     344  Person    92  2019-02-10 11:18:04                  NaN
1      Post     344  Person    69                  NaN  2021-05-08 21:31:39
2      Post     344  Person    73                  NaN  2018-12-30 16:54:19


## Path Finding
Find paths between vertices.

Two functions are supported:
- shortestPath - one shortest path between vertices
- allPaths - all paths within the specified edge limit
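For intuition about what a shortest-path search computes, here is a minimal breadth-first search over a toy adjacency list. It is a sketch only: it does not call TigerGraph, it is not a description of how `shortestPath` is implemented, and the toy graph and node labels are hypothetical.

```python
from collections import deque

def shortest_path(adjacency, source, target):
    """Breadth-first search; returns one shortest path as a list of nodes, or None."""
    previous = {source: None}  # node -> node it was reached from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical toy graph: two people connected through a post.
toy_graph = {
    "Person:50": ["Post:614"],
    "Post:614": ["Person:50", "Person:45"],
    "Person:45": ["Post:614"],
}
print(shortest_path(toy_graph, "Person:50", "Person:45"))
```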

In [31]:
results = conn.shortestPath([("Person", "50")], [("Person", "45")])
pprint(results)

[{'type': 'Person', 'id': '50'}]
[{'type': 'Person', 'id': '45'}]
[
  {
    "vertices": [
      {
        "v_id": "50",
        "v_type": "Person",
        "attributes": {
          "id": "50",
          "name": "Hali Dales",
          "email": "hdales1d@comsenz.com",
          "username": "hdales1d",
          "created_at": "2021-02-01 17:45:56"
        }
      },
      {
        "v_id": "45",
        "v_type": "Person",
        "attributes": {
          "id": "45",
          "name": "Agathe Van Zon",
          "email": "avan18@geocities.com",
          "username": "avan18",
          "created_at": "2019-08-04 04:31:23"
        }
      },
      {
        "v_id": "614",
        "v_type": "Post",
        "attributes": {
          "id": "614",
          "content": "Etiam vel augue. Vestibulum rutrum rutrum neque. Aenean auctor gravida sem.",
          "posted_date": "2020-08-27 22:24:11",
          "deleted": false
        }
      }
    ],
    "edges": [
      {
        "e_type": "liked"

## Queries

### All Hashtags from all Posts from Input User

Graph path: Person -posted> Post -has_tag> Hashtag

In [32]:
results = conn.gsql('''
  USE GRAPH MyGraph
  CREATE QUERY hashtags_from_person(VERTEX<Person> inPer) FOR GRAPH MyGraph SYNTAX v2 { 
  
  person = {inPer};

  tag = SELECT t FROM person:pe - (posted>) - Post:p - (has_tag>) - Hashtag:t;

  // Below line does the same thing and looks cleaner, intermediate nodes can be ignored if not being referenced
  // tag = SELECT t FROM person:pe - (posted>.has_tag>) - Hashtag:t;

  PRINT tag;
  }
  ''')
pprint(results)

"Using graph 'MyGraph'\nSuccessfully created queries: [hashtags_from_person]."


### Posts that Users who Liked the Source Post Also Liked

Graph path: Post <liked- Person -liked> Post, with the Person vertex in common.
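The counting this traversal performs can be imitated in plain Python. The sketch below uses hypothetical toy data and a hypothetical function name; it illustrates the idea only and is not a translation of the GSQL query that follows.

```python
from collections import Counter

# Hypothetical toy data: person id -> set of liked post ids.
likes = {
    "p1": {"500", "7", "9"},
    "p2": {"500", "7"},
    "p3": {"9"},
}

def most_common_mutual_liked(likes, in_post, max_return):
    """Count likes per post among people who liked in_post; return the top entries."""
    counts = Counter()
    for liked_posts in likes.values():
        if in_post in liked_posts:      # this person liked the input post
            counts.update(liked_posts)  # one like for each post they liked
    # Sort by like count (descending), then post id, for a deterministic order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:max_return]

print(most_common_mutual_liked(likes, "500", 2))
```

Note that this sketch does not exclude the input post itself, so it always appears among the top results; filtering it out would be a one-line change.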


In [33]:
results = conn.gsql('''
  USE GRAPH MyGraph
  CREATE QUERY most_common_mutual_liked_post(VERTEX<Post> inPost, INT maxReturn) FOR GRAPH MyGraph SYNTAX v2 {
  TYPEDEF tuple<STRING post, INT likes> frequency;
  
  // Find the posts most liked by the group of people who liked the input post
  HeapAccum<frequency>(maxReturn, likes DESC) @@topTestResults;
  SumAccum<INT> @likes;
  
  post = {inPost};
  
  ml = SELECT op FROM post - (<liked) - Person - (liked>) - Post:op
  ACCUM
    op.@likes += 1
  POST-ACCUM
    @@topTestResults += frequency(op.id, op.@likes);
  
  PRINT @@topTestResults;
}
  ''')
pprint(results)

"Using graph 'MyGraph'\nSuccessfully created queries: [most_common_mutual_liked_post]."


### Similarity Algo

Jaccard similarity-esque algorithm that will be referenced as a sub-query.

The Jaccard index is the size of the intersection of two hashtag sets divided by the size of their union. Each person has a set of hashtags derived from the posts they liked.
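As a plain-Python sketch of the same calculation, assuming hashtags are held in ordinary sets (the tag sets below are hypothetical examples):

```python
def similarity(a, b):
    """Jaccard similarity of two sets; 0.0 when either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical hashtag sets for two people.
tags_a = {"middleware", "software", "Ergonomic"}
tags_b = {"software", "Ergonomic", "complexity"}
print(similarity(tags_a, tags_b))  # 2 shared tags out of 4 distinct -> 0.5
```

Like the GSQL query below, this returns 0 when either input set is empty, which avoids dividing by zero.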

In [34]:
results = conn.gsql('''
  USE GRAPH MyGraph
  CREATE QUERY similarity(Set<STRING> A, Set<STRING> B) FOR GRAPH MyGraph RETURNS (FLOAT){ 
	SetAccum<STRING> @@inter, @@uni;
	FLOAT similarity;
	
	IF A.size() != 0 AND B.size() !=0 THEN
	  @@inter = A INTERSECT B;
	  @@uni = A UNION B;
	
	  IF @@inter.size() == 0 THEN
	    similarity = 0;
	  ELSE 
	    similarity = @@inter.size()*1.0/@@uni.size();
	    END;
	ELSE
	  similarity = 0;
	  END;
	
	PRINT similarity;
	RETURN similarity;
}
  ''')
pprint(results)

"Using graph 'MyGraph'\nSuccessfully created queries: [similarity]."


### Find Users Who Like Similar Hashtags to what Input User Posts

In [35]:
results = conn.gsql('''
  USE GRAPH MyGraph
  CREATE QUERY people_with_similar_tags(VERTEX<Person> inPer, INT maxReturn) FOR GRAPH MyGraph SYNTAX v2{ 
  TYPEDEF tuple<STRING person, FLOAT tag> simTags;
  
  HeapAccum<simTags>(maxReturn, tag DESC) @@topTagResults;
  SetAccum<STRING> @tags;
  SetAccum<STRING> @@inTags;
  person = {inPer};
  people = {Person.*};
  
  // Get the hashtags of our input person's posts
  ourTags = SELECT t FROM person - (posted>.has_tag>) - Hashtag:t
  ACCUM
    @@inTags += t.tag;
  
  // Get the hashtags of all posts all people have liked
  simPeople = SELECT p FROM people:p - (liked>.has_tag>) - Hashtag:t
  ACCUM
    p.@tags += t.tag
  POST-ACCUM
  // Compare the group of hashtags from our input person to that of the person we're currently checking
    @@topTagResults += simTags(p.id,similarity(@@inTags,p.@tags));
  
  PRINT @@topTagResults;
  }
  ''')
pprint(results)

"Using graph 'MyGraph'\nSuccessfully created queries: [people_with_similar_tags]."


## Installing Queries

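Installing all four queries takes about 6 minutes in total. The cell below installs only `similarity`; the other three queries must also be installed (for example with `INSTALL QUERY ALL`) before `runInstalledQuery` can call them, which is the likely cause of the `HTTPError` outputs in the next section.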

In [36]:
conn.gsql('''
  USE GRAPH MyGraph
  INSTALL QUERY similarity
''')

'Using graph \'MyGraph\'\nStart installing queries, about 1 minute ...\nsimilarity query: curl -X GET \'https://127.0.0.1:9000/query/MyGraph/similarity?A=VALUE&B=VALUE\'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.\nSelect \'m1\' as compile server, now connecting ...\nNode \'m1\' is prepared as compile server.\n\nQuery installation finished.'

## Run Queries

### Hashtags a Person has Posted

The result is returned as a vertex set. Note that the additional vertex information is returned along with the hashtag name.

In [37]:
results = conn.runInstalledQuery("hashtags_from_person", params={"inPer": "50"})
pprint(results)

HTTPError: ignored

### Posts Most Liked by the People who Liked the Input Post

- `inPost` is the input post, currently set to post id `500`. Feel free to change it and see how other posts relate.
- `maxReturn` is the maximum number of liked posts to return, currently set to `10`.

In [38]:
results = conn.runInstalledQuery("most_common_mutual_liked_post", params={"inPost": "500","maxReturn": "10"})
pprint(results)

HTTPError: ignored

### People who Like Posts with Similar Hashtags to what Input User Posts

- `inPer` is the input person who you are finding similar people to.
- `maxReturn` is the max number of similar people to return.

In [39]:
results = conn.runInstalledQuery("people_with_similar_tags", params={"inPer": "50","maxReturn": "10"})
pprint(results)

HTTPError: ignored

## Clear the Whole Graph
DANGER ZONE: `DROP ALL` removes everything on the instance, including other graphs, not only `MyGraph`.

In [67]:
conn.gsql('''
USE GLOBAL
DROP ALL
''')

'Dropping all, about 1 minute ...\nAbort all active loading jobs\nTry to abort all loading jobs on graph MyGraph, it may take a while ...\n[ABORT_SUCCESS] No active Loading Job to abort.\nTry to abort all loading jobs on graph OutroGrafo, it may take a while ...\n[ABORT_SUCCESS] No active Loading Job to abort.\nResetting GPE...\nSuccessfully reset GPE and GSE\nStopping GPE GSE\nSuccessfully stopped GPE GSE in 0.003 seconds\nClearing graph store...\nSuccessfully cleared graph store\nStarting GPE GSE RESTPP\nSuccessfully started GPE GSE RESTPP in 0.157 seconds\nEverything is dropped.'