# Apache HugeGraph


## Deploy:

`docker compose up`

## Notes:


- Spark integration: write-only, and only from Scala/Java! (https://github.com/apache/incubator-hugegraph-toolchain/tree/master/hugegraph-spark-connector)
- The Python API (pyhugegraph) works well, but I could not use the native TinkerPop API, so Gremlin queries are sent as strings;
- REST API (supports Cypher)
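
The REST bullet can be made concrete with a small sketch that only builds the request URL. It assumes the server from this setup listens on `localhost:8081` and that the Cypher endpoint follows the `/graphs/{graph}/cypher?cypher=...` pattern; both are assumptions to check against the deployed version, and `cypher_url` is a hypothetical helper, not part of any library.

```python
from urllib.parse import quote, urlencode


def cypher_url(query, host="localhost", port=8081, graph="hugegraph"):
    """Build the GET URL for a Cypher query (the endpoint path is an assumption)."""
    base = f"http://{host}:{port}/graphs/{quote(graph)}/cypher"
    # urlencode percent-encodes the query text and turns spaces into '+'
    return f"{base}?{urlencode({'cypher': query})}"


url = cypher_url("MATCH (n:Person) RETURN n.name LIMIT 5")
print(url)
```

Once the server is up, the URL can be fetched with `urllib.request.urlopen(url)`; that call is left out here so the snippet runs without a live server.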

In [3]:
! pip3 install hugegraph-python

Defaulting to user installation because normal site-packages is not writeable

[notice] A new release of pip is available: 24.2 -> 25.2
[notice] To update, run: python3 -m pip install --upgrade pip


In [1]:
from pyhugegraph.client import PyHugeClient

import pandas as pd


client = PyHugeClient("localhost", "8081", user="", pwd="", graph="hugegraph")


In [5]:
graphs = client.graphs().get_all_graphs()
print(f"Graphs via client: {graphs}")

Graphs via client: {"graphs":["hugegraph"]}


In [25]:
schema = client.schema()
schema.propertyKey("name").asText().ifNotExist().create()
schema.propertyKey("birthDate").asText().ifNotExist().create()
schema.vertexLabel("Person").properties("name", "birthDate").usePrimaryKeyId().primaryKeys("name").ifNotExist().create()
schema.vertexLabel("Movie").properties("name").usePrimaryKeyId().primaryKeys("name").ifNotExist().create()
schema.edgeLabel("ActedIn").sourceLabel("Person").targetLabel("Movie").ifNotExist().create()

print(schema.getVertexLabels())
print(schema.getEdgeLabels())
print(schema.getRelations())


[name: person, primary_keys: ['name'], properties: ['name', 'age', 'city'], name: software, primary_keys: ['name'], properties: ['name', 'lang', 'price'], name: Person, primary_keys: ['name'], properties: ['name', 'birthDate'], name: Movie, primary_keys: ['name'], properties: ['name']]
[name: knows, properties: ['weight', 'date'], name: created, properties: ['weight', 'date'], name: ActedIn, properties: []]
['person--knows-->person', 'person--created-->software', 'Person--ActedIn-->Movie']


In [30]:

# Init Graph
g = client.graph()
v_al_pacino = g.addVertex("Person", {"name": "Al Pacino", "birthDate": "1940-04-25"})
v_robert = g.addVertex("Person", {"name": "Robert De Niro", "birthDate": "1943-08-17"})
v_godfather = g.addVertex("Movie", {"name": "The Godfather"})
v_godfather2 = g.addVertex("Movie", {"name": "The Godfather Part II"})
v_godfather3 = g.addVertex("Movie", {"name": "The Godfather Coda The Death of Michael Corleone"})

In [32]:
g.addEdge("ActedIn", v_al_pacino.id, v_godfather.id, {})
g.addEdge("ActedIn", v_al_pacino.id, v_godfather2.id, {})
g.addEdge("ActedIn", v_al_pacino.id, v_godfather3.id, {})
g.addEdge("ActedIn", v_robert.id, v_godfather2.id, {})

res = g.getVertexById(v_al_pacino.id).label
res

'Person'
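
With `usePrimaryKeyId()`, the vertex IDs returned by the server look like `3:Al Pacino` in the Gremlin output further down: a numeric prefix, a colon, then the primary-key value. A small sketch for taking such an ID apart; reading the prefix as the vertex label's internal ID is an assumption, and `split_primary_key_id` is a hypothetical helper.

```python
def split_primary_key_id(vertex_id):
    """Split 'prefix:key' on the first colon only, since keys may contain colons."""
    prefix, _, key = vertex_id.partition(":")
    return int(prefix), key


print(split_primary_key_id("3:Al Pacino"))
print(split_primary_key_id("4:The Godfather"))
```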

In [36]:
g.close()

In [39]:
# Execute a Gremlin query
g = client.gremlin()
res = g.exec("g.V().limit(100)")
for v in res['data']:
    print(v)

{'id': '2:lop', 'label': 'software', 'type': 'vertex', 'properties': {'name': 'lop', 'lang': 'java', 'price': 328}}
{'id': '1:josh', 'label': 'person', 'type': 'vertex', 'properties': {'name': 'josh', 'age': 32, 'city': 'Beijing'}}
{'id': '1:marko', 'label': 'person', 'type': 'vertex', 'properties': {'name': 'marko', 'age': 29, 'city': 'Beijing'}}
{'id': '1:peter', 'label': 'person', 'type': 'vertex', 'properties': {'name': 'peter', 'age': 35, 'city': 'Shanghai'}}
{'id': '1:vadas', 'label': 'person', 'type': 'vertex', 'properties': {'name': 'vadas', 'age': 27, 'city': 'Hongkong'}}
{'id': '2:ripple', 'label': 'software', 'type': 'vertex', 'properties': {'name': 'ripple', 'lang': 'java', 'price': 199}}
{'id': '3:Al Pacino', 'label': 'Person', 'type': 'vertex', 'properties': {'name': 'Al Pacino', 'birthDate': '1940-04-25'}}
{'id': '4:The Godfather', 'label': 'Movie', 'type': 'vertex', 'properties': {'name': 'The Godfather'}}
{'id': '3:Robert De Niro', 'label': 'Person', 'type': 'vertex', 
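
Because Gremlin is sent as a plain string here, any value interpolated into the query needs quoting. A minimal sketch of a hypothetical helper, `movies_of`, that builds the traversal for the movies a person acted in, using the `Person` and `ActedIn` labels defined above. The escaping covers only backslashes and single quotes, which is an assumption about what suffices for a single-quoted Groovy literal, not a full sanitizer.

```python
def gremlin_str(value):
    """Quote a Python string as a single-quoted Groovy literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def movies_of(person_name):
    """Traversal string: names of movies reached via outgoing ActedIn edges."""
    return (
        f"g.V().has('Person', 'name', {gremlin_str(person_name)})"
        ".out('ActedIn').values('name')"
    )


query = movies_of("Al Pacino")
print(query)
# With the client from the cells above: client.gremlin().exec(query)['data']
```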

### PySpark

In [8]:
! /opt/workspace/mpmg/gbms/hugegraph/hugegraph-spark-connector-1.5.0.jar

/bin/bash: line 1: /home/lucasmsp/workspace/mpmg/gbms/hugegraph/hugegraph-spark-connector-1.5.0.jar: Permission denied


In [1]:
from pyspark.sql import SparkSession

# Path to the HugeGraph Spark connector JAR
connector_jar = "/opt/workspace/mpmg/gbms/hugegraph/hugegraph-spark-connector-1.5.0.jar"

# Put the JAR on both the driver and executor classpaths
spark = (
    SparkSession.builder
    .appName("hugegraph-connector")
    .config("spark.jars", connector_jar)
    .config("spark.driver.extraClassPath", connector_jar)
    .config("spark.executor.extraClassPath", connector_jar)
    .getOrCreate()
)


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/workspace/mpmg/gbms/hugegraph/hugegraph-spark-connector-1.5.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark-3.3.0/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]


25/09/09 10:55:58 WARN Utils: Your hostname, lucasmsp-Inspiron-7580 resolves to a loopback address: 127.0.1.1; using 192.168.15.13 instead (on interface wlp3s0)
25/09/09 10:55:58 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
25/09/09 10:55:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).


25/09/09 10:55:59 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.


In [None]:
The connector API still only supports writing!

In [2]:
(spark.read.format("org.apache.hugegraph.spark.connector.DataSource")
  .option("host", "localhost")
  .option("port", "8081")
  .option("graph", "hugegraph")
  .option("data-type", "vertex")
  .option("label", "Person")
  .option("id", "name")
  .option("batch-size", 2)
  .load()
).show()

AnalysisException: org.apache.hugegraph.spark.connector.DataSource is not a valid Spark SQL Data Source.
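
Since the connector rejects `spark.read`, one possible workaround is to fetch vertices through Gremlin and build the DataFrame on the driver. A minimal sketch, assuming the result dictionaries have the shape printed in the Gremlin cell above; `vertices_to_rows` is a hypothetical helper. Everything passes through a single process, so this only suits small graphs.

```python
def vertices_to_rows(vertices, label):
    """Flatten Gremlin vertex dicts of one label into flat row dicts."""
    return [
        {"id": v["id"], **v["properties"]}
        for v in vertices
        if v["label"] == label
    ]


# Sample data in the same shape as the Gremlin output shown earlier
sample = [
    {"id": "3:Al Pacino", "label": "Person", "type": "vertex",
     "properties": {"name": "Al Pacino", "birthDate": "1940-04-25"}},
    {"id": "4:The Godfather", "label": "Movie", "type": "vertex",
     "properties": {"name": "The Godfather"}},
]
rows = vertices_to_rows(sample, "Person")
print(rows)
# With the live session: spark.createDataFrame(rows).show()
```

Filtering by label before flattening keeps each DataFrame's columns consistent, since `Person` and `Movie` vertices carry different properties.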