[GIE Doc] Refine Docs for Cypher #2995

Merged
merged 8 commits into from
Jul 12, 2023
Changes from 5 commits
6 changes: 5 additions & 1 deletion charts/gie-standalone/templates/frontend/statefulset.yaml
@@ -81,7 +81,7 @@ spec:
done

$GRAPHSCOPE_HOME/bin/giectl start_frontend ${GRAPHSCOPE_RUNTIME} ${object_id} \
$json_file $runtime_hosts $GREMLIN_SERVER_PORT $EXTRA_CONFIG
$json_file $runtime_hosts $GREMLIN_SERVER_PORT $CYPHER_SERVER_PORT $EXTRA_CONFIG

exit_code=$?
while [ $exit_code -eq 0 ]
@@ -103,6 +103,8 @@ spec:
value: {{ .Values.executor.service.gaiaRpc | quote }}
- name: GREMLIN_SERVER_PORT
value: {{ .Values.frontend.service.gremlinPort | quote }}
- name: CYPHER_SERVER_PORT
value: {{ .Values.frontend.service.cypherPort | quote }}
- name: DNS_NAME_PREFIX_STORE
value: {{ $storeFullname }}-{}.{{ $storeFullname }}-headless.{{ $releaseNamespace }}.svc.{{ $clusterDomain }}
- name: SERVERSSIZE
@@ -124,6 +126,8 @@ spec:
ports:
- name: gremlin
containerPort: {{ .Values.frontend.service.gremlinPort }}
- name: cypher
containerPort: {{ .Values.frontend.service.cypherPort }}
{{- if .Values.frontend.readinessProbe.enabled }}
readinessProbe:
tcpSocket:
11 changes: 11 additions & 0 deletions charts/gie-standalone/templates/frontend/svc.yaml
@@ -36,5 +36,16 @@ spec:
nodePort: null
{{- end }}
{{- end }}
- name: cypher
port: {{ .Values.frontend.service.cypherPort }}
protocol: TCP
targetPort: cypher
{{- if and (or (eq .Values.frontend.service.type "NodePort") (eq .Values.frontend.service.type "LoadBalancer")) (not (empty .Values.frontend.service.nodePorts.cypher)) }}
{{- if (not (empty .Values.frontend.service.nodePorts.cypher)) }}
nodePort: {{ .Values.frontend.service.nodePorts.cypher }}
{{- else if eq .Values.frontend.service.type "ClusterIP" }}
nodePort: null
{{- end }}
{{- end }}
selector: {{ include "graphscope-store.selectorLabels" . | nindent 4 }}
app.kubernetes.io/component: frontend
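
Reading the added `cypher` port entry as nested conditionals, the `nodePort` line appears to be emitted only when the service type is `NodePort` or `LoadBalancer` and `nodePorts.cypher` is non-empty. The Python sketch below models that branching as it is shown in this diff. It is not how Helm evaluates templates, and `rendered_node_port` is a hypothetical helper that does not exist in the chart.

```python
from typing import Optional


def rendered_node_port(service_type: str, node_port: str) -> Optional[str]:
    """Model the branching around `nodePort` in the cypher port entry.

    Returns the value that would be written, or None when no `nodePort`
    line would be emitted. Hypothetical helper, not part of the chart.
    """
    exposes_node_port = service_type in ("NodePort", "LoadBalancer")
    # Outer guard: the type must expose a node port AND a value must be set.
    if exposes_node_port and node_port != "":
        # Inner guard repeats the non-empty check, so it always holds here.
        if node_port != "":
            return node_port
        elif service_type == "ClusterIP":
            # Unreachable under the outer guard; kept to mirror the template.
            return "null"
    return None
```

Under this reading the `ClusterIP` branch never fires, and with the default `cypher: ""` no `nodePort` key is rendered at all, which leaves Kubernetes to allocate one when the service type requires it.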
4 changes: 4 additions & 0 deletions charts/gie-standalone/values.yaml
@@ -363,12 +363,16 @@ frontend:
##
gremlinPort: 8182

## Cypher server port
cypherPort: 7687

## Specify the nodePort value for the LoadBalancer and NodePort service types.
## ref: https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport
##
nodePorts:
service: ""
gremlin: ""
cypher: ""
## Service clusterIP
##
# clusterIP: None
7 changes: 4 additions & 3 deletions docs/interactive_engine/deployment.md
@@ -74,10 +74,11 @@ deployment and management of applications. To deploy GIE standalone using Helm,
kubectl describe svc [YOUR_RELEASE_NAME]-gie-standalone-frontend \
| grep "Endpoints:" | awk -F' ' '{print $2}'
```
You should see the GIE Frontend service endpoint as `<ip>:<gremlinPort>`.
You should see two exposed endpoints for the GIE Frontend service: `<ip>:<gremlinPort>` for Gremlin queries and `<ip>:<cypherPort>` for Cypher queries.
Collaborator commented:

Explicitly use two commands: one to obtain the Cypher endpoint, the other to obtain the Gremlin endpoint.

Collaborator Author replied:

done


- Connect to the GIE frontend service using the Tinkerpop's official SDKs or Gremlin console, which
can be found [here](./tinkerpop_gremlin.md).
- Connect to the GIE frontend service in either of the following ways:
1. using Tinkerpop's official SDKs or the Gremlin console, as described [here](./tinkerpop/tinkerpop_gremlin.md).
2. using Neo4j's official SDKs or Cypher-Shell, as described [here](./neo4j/cypher_sdk.md).
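
With two ports on the frontend service, the `Endpoints:` values printed by the `kubectl describe svc` command need to be told apart. A minimal sketch, assuming the chart defaults from `values.yaml` (`gremlinPort: 8182`, `cypherPort: 7687`) and that each endpoint arrives as an `<ip>:<port>` string. `classify_endpoints` is a hypothetical helper, not part of GIE, and the IP address is made up for illustration.

```python
def classify_endpoints(endpoints, gremlin_port=8182, cypher_port=7687):
    """Map 'gremlin' / 'cypher' to the matching '<ip>:<port>' endpoint."""
    by_port = {gremlin_port: "gremlin", cypher_port: "cypher"}
    result = {}
    for endpoint in endpoints:
        # rpartition keeps everything before the last ':' as the host.
        host, _, port = endpoint.strip().rpartition(":")
        name = by_port.get(int(port)) if port.isdigit() else None
        if host and name:
            result[name] = f"{host}:{port}"
    return result


# Example input: one line per service port, with an illustrative pod IP.
found = classify_endpoints(["10.244.0.12:8182", "10.244.0.12:7687"])
```

The `gremlin` entry is what a Gremlin client would connect to, while the `cypher` entry would be prefixed with a Bolt scheme such as `neo4j://` before being passed to Cypher-Shell.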

## Remove the GIE Service
```bash
11 changes: 7 additions & 4 deletions docs/interactive_engine/dev_and_test.md
@@ -117,8 +117,11 @@ pegasus.hosts = localhost:1234
# graph schema path
graph.schema = /tmp/<v6d_object_id>.json

## Frontend Config
frontend.service.port = 8182
## Gremlin Server Port
gremlin.server.port = 8182

## Bolt Server Port
neo4j.bolt.server.port = 7687

# disable authentication if username or password is not set
# auth.username = default
@@ -131,9 +134,9 @@ java -cp ".:$GIE_TEST_HOME/lib/*" -Djna.library.path=$GIE_TEST_HOME/lib com.alib
```

With the frontend service, you can open the gremlin console and set the endpoint to
`localhost:8182`, as given [here](./tinkerpop_gremlin.md#gremlin-console).
`localhost:8182`, as given [here](./tinkerpop/tinkerpop_gremlin.md#connecting-via-gremlin-console). Similarly, you can open cypher-shell and set the URL to `neo4j://localhost:7687` with the `-a` option, as given [here](./neo4j/cypher_sdk.md#connecting-via-cypher-shell).
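
The two client addresses follow directly from the `gremlin.server.port` and `neo4j.bolt.server.port` entries in the frontend configuration. A small sketch of that mapping, assuming the simple `key = value` layout shown in the config excerpt and a frontend running on `localhost`. `parse_properties` is a hypothetical helper and only understands `=` as the separator.

```python
def parse_properties(text):
    """Parse simple `key = value` lines, skipping blanks and `#` comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Excerpt of the frontend config with the two port settings from this PR.
sample = """
## Gremlin Server Port
gremlin.server.port = 8182

## Bolt Server Port
neo4j.bolt.server.port = 7687
"""

props = parse_properties(sample)
gremlin_endpoint = f"localhost:{props['gremlin.server.port']}"
cypher_url = f"neo4j://localhost:{props['neo4j.bolt.server.port']}"
```

Here `gremlin_endpoint` is the value to set in the Gremlin console and `cypher_url` is the value to pass to cypher-shell with `-a`.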

7. Kill the services of `vineyardd`, `gaia_executor` and `frontend`:
1. Kill the services of `vineyardd`, `gaia_executor` and `frontend`:
longbinlai marked this conversation as resolved.
```
pkill -f vineyardd
pkill -f gaia_executor
2 changes: 1 addition & 1 deletion docs/interactive_engine/getting_started.md
@@ -102,7 +102,7 @@ You could pass additional key-value pairs to customize the startup configuration

```python
# Set the timeout value to 10 min
g = gs.gremlin(graph, params={'pegasus.timeout': 600000})
g = gs.gremlin(graph, params={'query.execution.timeout.ms': 600000})
Collaborator commented:

How can this be configured with the new Python APIs?

```

## What's the Next
3 changes: 2 additions & 1 deletion docs/interactive_engine/neo4j/cypher_sdk.md
@@ -3,7 +3,8 @@ This document will provide you with step-by-step guidance on how to connect your
FrontEnd service, which offers functionalities similar to the official Tinkerpop service.

Your first step is to obtain the Bolt Connector of GIE Frontend service:
longbinlai marked this conversation as resolved.
- Follow the [instruction](./dev_and_test.md#manually-start-the-gie-services) while starting GIE on a local machine.
- Follow the [instruction](../deployment.md#deploy-your-first-gie-service) while deploying GIE in a K8s cluster,
- Follow the [instruction](../dev_and_test.md#manually-start-the-gie-services) while starting GIE on a local machine.

## Connecting via Python Driver

4 changes: 2 additions & 2 deletions docs/interactive_engine/tinkerpop/tinkerpop_gremlin.md
@@ -3,8 +3,8 @@ This document will provide you with step-by-step guidance on how to connect your
FrontEnd service, which offers functionalities similar to the official Tinkerpop service.

Your first step is to obtain the endpoint of GIE Frontend service:
- Follow the [instruction](./deployment.md#deploy-your-first-gie-service) while deploying GIE in a K8s cluster,
- Follow the [instruction](./dev_and_test.md#manually-start-the-gie-services) while starting GIE on a local machine.
- Follow the [instruction](../deployment.md#deploy-your-first-gie-service) while deploying GIE in a K8s cluster,
- Follow the [instruction](../dev_and_test.md#manually-start-the-gie-services) while starting GIE on a local machine.

## Connecting via Python SDK

10 changes: 7 additions & 3 deletions docs/overview/getting_started.md
@@ -238,7 +238,7 @@ print(ret.to_dataframe(selector={'id': 'v.id', 'distance': 'r'})

## Graph Interactive Query Quick Start
With the `graphscope` package already installed, you can effortlessly engage with a graph on your local machine.
You simply need to create the `gremlin` instance to serve as the conduit for submitting all Gremlin queries.
You simply need to create the `interactive` instance to serve as the conduit for submitting all Gremlin or Cypher queries.

````{dropdown} Example: Run Interactive Queries in GraphScope
```python
@@ -252,14 +252,18 @@ gs.set_option(show_log=True)
#(modern graph is an example property graph for Gremlin queries given by Apache at https://tinkerpop.apache.org/docs/current/tutorials/getting-started/)
graph = load_modern_graph()

# Hereafter, you can use the `graph` object to create an `gremlin` query session
g = gs.gremlin(graph)
# Hereafter, you can use the `graph` object to create an `interactive` query session, which will start one Gremlin service and one Cypher service simultaneously on the backend.
g = gs.interactive(graph)
# then `execute` any supported gremlin query.
q1 = g.execute('g.V().count()')
print(q1.all().result()) # should print [6]

q2 = g.execute('g.V().hasLabel(\'person\')')
print(q2.all().result()) # should print [[v[2], v[3], v[0], v[1]]]

# or `execute` any supported cypher query.
q3 = g.execute("MATCH (n:person) RETURN count(n)", lang="cypher", routing_=RoutingControl.READ)
print(q3.records[0][0]) # should print 4 (the modern graph has four `person` vertices)
```
````

13 changes: 12 additions & 1 deletion docs/overview/graph_interactive_workloads.md
@@ -9,7 +9,7 @@ Graph interactive workloads primarily focus on exploring complex graph structure
all occurrences (or instances) of the pattern in the graph. Pattern matching often involves relational operations to project, order and group the matched instances.

In GraphScope, the Graph Interactive Engine (GIE) has been developed to handle such interactive workloads,
which provides widely used query languages, such as Gremlin, that allow users to easily
which provides widely used query languages, such as Gremlin or Cypher, that allow users to easily
express both graph traversal and pattern matching queries. These queries will be executed with massive
parallelism in a cluster of machines, providing efficient and scalable solutions to graph interactive
workloads.
@@ -87,3 +87,14 @@ g.V().match(

The pattern matching query is declarative in the sense that users only describe the pattern using the `match()` step, while the engine determines how to execute the query (i.e. the execution plan) at runtime according to a pre-defined cost model. For example, a [worst-case optimal](https://vldb.org/pvldb/vol12/p1692-mhedhbi.pdf) execution plan may first compute the matches of `v1` and `v2`, and then intersect the neighbors of `v1` and `v2` as the matches of `v3`.

## Neo4j and Cypher
[Neo4j](https://neo4j.com/docs/) is a popular graph database management system known for its native graph processing capabilities. It provides an efficient and scalable solution for storing, querying, and analyzing graph data. One of its key components is the query language [Cypher](https://neo4j.com/docs/cypher-manual/current/introduction/), which is designed specifically for working with graph data. GIE implements the essential and most impactful Cypher operators, so users can leverage the expressiveness of Cypher for querying and manipulating graph data. GIE also integrates Neo4j's Bolt server, allowing Cypher users to submit their queries using the open SDK. As a result, Cypher users can easily get started with GIE through the existing [Neo4j ecosystem](../interactive_engine/neo4j_eco.md), including the Python language wrapper and Cypher-Shell.

### Pattern Matching
The `MATCH` operator in Cypher provides a declarative syntax for expressing graph patterns concisely and intuitively. This pattern-based approach aligns well with the structure of graph data, which makes queries easier to read and write for beginners and experienced users alike. Moreover, the `MATCH` operator lets you combine multiple patterns, optional patterns, and logical operators, so complex relationships and conditions can be expressed within a single query. The `Triangle` example above can be written in Cypher as:
```bash
Match (v1)-[:Knows]-(v2),
(v1)-[:Purchases]->(v3),
(v2)-[:Purchases]->(v3)
Return DISTINCT v1, v2, v3;
```
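
To make the semantics of the triangle pattern concrete, here is a plain-Python sketch of the plan described earlier: take each `Knows` pair as the match of `v1` and `v2`, then intersect what both have purchased to obtain `v3`. It runs on a tiny in-memory edge list rather than on GIE. The function name and the sample data are hypothetical, and the undirected `Knows` edge is modelled by trying both orientations.

```python
from collections import defaultdict


def triangle_matches(knows, purchases):
    """Return distinct (v1, v2, v3) where v1 and v2 know each other
    (in either direction) and both have a Purchases edge to v3."""
    bought = defaultdict(set)
    for buyer, item in purchases:
        bought[buyer].add(item)

    results = set()
    for a, b in knows:
        # `-[:Knows]-` is undirected, so try both orientations of the edge.
        for v1, v2 in ((a, b), (b, a)):
            # Intersect the purchase neighbors of v1 and v2 to find v3.
            for v3 in bought[v1] & bought[v2]:
                results.add((v1, v2, v3))
    return sorted(results)


# Illustrative data only: alice and bob know each other and both bought "book".
knows = [("alice", "bob"), ("bob", "carol")]
purchases = [("alice", "book"), ("bob", "book"), ("carol", "pen")]
matches = triangle_matches(knows, purchases)
```

Collecting results in a set plays the role of `DISTINCT`; because the `Knows` edge is undirected, the single triangle is reported once for each orientation of `v1` and `v2`.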
2 changes: 1 addition & 1 deletion interactive_engine/compiler/ir_k8s_failover_ci.sh
@@ -48,7 +48,7 @@ wait_role_pods_to_run store ${store_total}

sleep 5

node_port=$(kubectl --namespace=${namespace} get svc ${role_prefix}-frontend -o go-template='{{range.spec.ports}}{{if .nodePort}}{{.nodePort}}{{"\n"}}{{end}}{{end}}')
node_port=$(kubectl --namespace=${namespace} get svc ${role_prefix}-frontend -o go-template='{{range.spec.ports}}{{if .nodePort}}{{.nodePort}}{{"\n"}}{{end}}{{end}}' | head -1)
hostname=$(minikube ip)
python3 ./submit_query.py $hostname:${node_port}
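
Since the frontend service now exposes more than one port, the go-template presumably prints one `nodePort` per line, and the added `| head -1` keeps only the first of them for `submit_query.py`. A minimal Python sketch of that selection, with hypothetical port numbers. `first_node_port` is an illustrative helper and not part of the CI script.

```python
def first_node_port(go_template_output):
    """Return the first non-empty line of the template output, or None.

    The template emits one port per line, so on that output this gives
    the same result as piping through `head -1`.
    """
    for line in go_template_output.splitlines():
        if line.strip():
            return line.strip()
    return None


# Two node ports, as the template might print them for two service ports.
port = first_node_port("31182\n31687\n")
```

Which service port comes first depends on the order of the `ports` list in the rendered service, so this relies on the intended port being listed first.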

6 changes: 5 additions & 1 deletion interactive_engine/compiler/set_properties.sh
@@ -26,6 +26,10 @@ hosts="pegasus.hosts: $DNS_NAME_PREFIX_STORE:$GAIA_RPC_PORT";

hosts="${hosts/"{}"/0}";

gremlin_server_port="gremlin.server.port: $GREMLIN_SERVER_PORT";

cypher_server_port="neo4j.bolt.server.port: $CYPHER_SERVER_PORT";

count=1;
while (($count<$SERVERSSIZE))
do
@@ -37,6 +41,6 @@ done

graph_schema="graph.schema: $GRAPH_SCHEMA"

properties="$worker_num\n$timeout\n$batch_size\n$output_capacity\n$hosts\n$server_num\n$graph_schema"
properties="$worker_num\n$timeout\n$batch_size\n$output_capacity\n$hosts\n$server_num\n$graph_schema\n$gremlin_server_port\n$cypher_server_port"

echo -e $properties > ./conf/ir.compiler.properties