# Running PGX in a Jupyter notebook and visualizing the results

## Introduction

This notebook runs [PGX (Parallel Graph AnalytiX)](https://www.oracle.com/technetwork/oracle-labs/parallel-graph-analytix/overview/index.html), Oracle's graph and network analysis tool,
from Jupyter and renders the output as a graph inside the Jupyter notebook.

PGX is run by calling its Java libraries.
[JPype](http://jpype.sourceforge.net/) is the library used to make those calls from Python.

Visualization uses [pyvis](https://pyvis.readthedocs.io/en/latest/install.html).

References:   
https://gianniceresa.com/2017/07/pgx-client-tool-language/   
https://blogs.oracle.com/bigdataspatialgraph/using-pgql-in-python

### Starting the JVM and creating a PGX session

In [None]:
from jpype import JClass, getDefaultJVMPath, shutdownJVM, startJVM
import glob

In [None]:
# build the classpath from every file in the PGX lib directory
filenames = glob.glob('/home/miotakei/Applications/pgx-19.1.0/lib/*')
# ':' is the classpath separator on Linux/macOS
pgx_jar_classpath = ':'.join(filenames)
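
The classpath step can also be written so that it does not depend on the platform's separator. This is a minimal sketch, not part of the original notebook: `build_classpath` is a hypothetical helper, and it assumes only `.jar` files in the directory need to be on the classpath.

```python
import glob
import os


def build_classpath(lib_dir):
    """Join every .jar file in lib_dir into one classpath string.

    Hypothetical helper; assumes only .jar files are needed.
    """
    jars = sorted(glob.glob(os.path.join(lib_dir, '*.jar')))
    if not jars:
        # fail early instead of starting a JVM with an empty classpath
        raise FileNotFoundError('no .jar files found in ' + lib_dir)
    # os.pathsep is ':' on Linux/macOS and ';' on Windows
    return os.pathsep.join(jars)
```

The result could replace `pgx_jar_classpath` above, for example `build_classpath('/home/miotakei/Applications/pgx-19.1.0/lib')`.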

In [None]:
# start the JVM with the PGX jars on the classpath
startJVM(getDefaultJVMPath(), "-ea", "-Djava.class.path=" + pgx_jar_classpath)

# look up the PGX entry-point class
pgxClass = JClass('oracle.pgx.api.Pgx')

In [None]:
# create a session on a PGX server
session = pgxClass.createSession('http://localhost:7007', 'session')

# close the session once it is no longer needed
# session.close()
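
When a piece of work is meant to run start to finish in one cell, the session can be closed automatically even if an error occurs. A minimal sketch, assuming only that the session object has a `close()` method as shown above; `run_with_session` is a hypothetical helper, not a PGX or JPype function.

```python
from contextlib import closing


def run_with_session(session, work):
    """Call work(session) and close the session afterwards, even on error.

    Hypothetical helper; relies only on session.close() existing.
    """
    with closing(session):
        return work(session)
```

For interactive exploration across several cells, keeping the session open and closing it manually at the end, as in the cell above, is probably more convenient.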

### Loading the graph data, running the analysis, and preparing data for visualization

In [None]:
# read the graph; the argument is the path to a JSON file
graph = session.readGraphWithProperties("<path of json file>")
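
The JSON file itself is not shown in this notebook. The following is only a sketch of what such a file might look like and how it could be written from Python. Everything in it is an assumption to check against the PGX documentation for your version: the key names (`uri`, `format`, `vertex_props`, `edge_props`, `separator`), the data file name `sample.edge`, and the property types. Only the property names `name`, `prob`, and `times` come from the queries in this notebook.

```python
import json

# Hypothetical graph config: key names, file name, and types are assumptions
graph_config = {
    "uri": "sample.edge",   # hypothetical data file
    "format": "edge_list",  # assumed format name
    "vertex_props": [
        {"name": "name", "type": "string"},
        {"name": "prob", "type": "double"},    # type assumed
    ],
    "edge_props": [
        {"name": "times", "type": "integer"},  # type assumed
    ],
    "separator": " ",
}


def write_graph_config(path, config):
    """Write the config dict to path as JSON and return the path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return path
```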

In [None]:
print(graph)

In [None]:
# check the graph data
checkgraph = graph.queryPgql("""
  SELECT  n.name, m.name, e.times
  MATCH (n)-[e]->(m)
  ORDER BY e.times
  LIMIT 10
""")

it = checkgraph.getResults().iterator()

while it.hasNext():
    element = it.next()
    print(element.toString())
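
The `hasNext()`/`next()` loop above can be wrapped once and reused for later result sets. A minimal sketch; `drain` is a hypothetical helper that assumes only the two Java-style methods used above.

```python
def drain(java_iterator):
    """Collect the remaining elements of a hasNext()/next() style iterator."""
    items = []
    while java_iterator.hasNext():
        items.append(java_iterator.next())
    return items
```

With it, the cell above could become `for element in drain(checkgraph.getResults().iterator()): print(element.toString())`.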

In [None]:
# analysis: compute degree centrality for the graph
analyst = session.createAnalyst()
dc = analyst.degreeCentrality(graph)
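
For intuition about what this metric measures, degree centrality is commonly taken as the number of edges touching each vertex. The sketch below is a plain-Python illustration under that assumption (in-degree plus out-degree, no normalization); it is not PGX's implementation, and `degree_counts` is a hypothetical name.

```python
from collections import Counter


def degree_counts(edges):
    """Count in-degree plus out-degree per vertex for (source, target) pairs."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # outgoing edge
        degree[target] += 1  # incoming edge
    return degree
```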

In [None]:
# list the vertex properties defined on the graph
print(graph.getVertexProperties())

In [None]:
# node data
pgxResultSetNode = graph.queryPgql("""
  SELECT id(n), n.name, n.prob, e.times
  MATCH (n), (x)-[e]->(y)
  WHERE ((n) = (x) OR (n) = (y))
  AND e.times >= 100
""")

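it_node = pgxResultSetNode.getResults().iterator()
# node_size holds the vertex IDs (despite its name);
# node_label and node_value are parallel lists
node_size = []
node_label = []
node_value = []
for i in it_node:
    size = i.get(0)  # vertex ID
    # the query returns one row per matching edge, so skip IDs already seen
    if size not in node_size:
        node_size.append(size)
        node_label.append(i.get(1))  # n.name
        node_value.append(i.get(2))  # n.prob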
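
The de-duplication above scans a list for every row, which gets slow for large result sets. A minimal sketch of the same idea with a set; `unique_nodes` is a hypothetical helper, and it assumes the rows have already been converted to `(id, label, value)` tuples with hashable IDs.

```python
def unique_nodes(rows):
    """Split (id, label, value) rows into parallel lists, keeping the first row per id."""
    seen = set()
    ids, labels, values = [], [], []
    for node_id, label, value in rows:
        if node_id in seen:
            continue  # this vertex was already recorded
        seen.add(node_id)
        ids.append(node_id)
        labels.append(label)
        values.append(value)
    return ids, labels, values
```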

In [None]:
# preview the collected node data as (id, label, value) rows
list(zip(node_size, node_label, node_value))

In [None]:
# edge data
pgxResultSetEdge = graph.queryPgql("""
  SELECT id(x), id(y), e.times/30 
  MATCH (x)-[e]->(y)
  WHERE e.times >= 100
""")

it_edge = pgxResultSetEdge.getResults().iterator()
edge_list = []
for i in it_edge:
    # (source ID, target ID, scaled e.times)
    edge_tuple = (i.get(0), i.get(1), i.get(2))
    edge_list.append(edge_tuple)

In [None]:
# preview the collected edge tuples
edge_list
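
Before handing the data to pyvis, it can help to confirm that every edge endpoint also appears in the node list, since pyvis is expected to reject edges between nodes that were never added. A minimal sketch; `edges_with_missing_nodes` is a hypothetical helper that assumes edges are tuples whose first two elements are the source and target IDs.

```python
def edges_with_missing_nodes(node_ids, edges):
    """Return the edges whose source or target is not in node_ids."""
    known = set(node_ids)
    return [e for e in edges if e[0] not in known or e[1] not in known]
```

An empty result for the node ID list and `edge_list` built above means the two queries are consistent with each other.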

### Visualization

In [None]:
from pyvis.network import Network

g = Network(notebook=True, height='800px', width='100%', directed=True)
# node_size holds the vertex IDs; value scales each node, label sets its caption
g.add_nodes(node_size, value=node_value, label=node_label)
# each tuple is (source, target, width)
g.add_edges(edge_list)
g.show_buttons()
g.show('graph.html')

In [None]:
# shut down the JVM when finished
shutdownJVM()