feat: Add nightly ldbc interactive benchmark by lnfjpt · Pull Request #217 · alibaba/neug

lnfjpt · 2026-04-13T06:56:59Z

Fixes #170

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an automated nightly GitHub Actions workflow and supporting scripts to load and serve an LDBC SNB dataset for interactive benchmarking (addresses issue #170).

Changes:

- Add Python scripts to load LDBC schema/data and start a NeuG server for benchmarking.
- Add LDBC SNB schema + COPY script (load.cypher) parameterized by DATA_PATH.
- Add a scheduled GitHub Actions workflow to build, load, serve, and trigger the benchmark driver over SSH.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `benchmark/interactive/serve_ldbc.py` | Starts a NeuG server pointing at an `ldbc` database directory. |
| `benchmark/interactive/load_ldbc.py` | Loads schema/data by executing statements from `load.cypher` with `${DATA_PATH}` substitution. |
| `benchmark/interactive/load.cypher` | Defines LDBC SNB schema and COPY commands for SF datasets. |
| `.github/workflows/ldbc-nightly-benchmark.yml` | Nightly workflow to build NeuG, load data, launch server, and run benchmark remotely. |

Copilot · 2026-04-29T09:15:20Z

+    with open(CYPHER_FILE, "r") as f:
+        buf = ""
+        for line_no, line in enumerate(f, 1):
+            stripped = line.strip()
+            # skip empty lines and comments
+            if not stripped or stripped.startswith("//"):
+                continue
+            buf += " " + stripped if buf else stripped
+            if buf.rstrip().endswith(";"):
+                query = buf.rstrip().replace("${DATA_PATH}", DATA_PATH)
+                print(f"[line {line_no}] Executing: {query}")
+                t0 = time.time()
+                try:
+                    conn.execute(query)
+                    print(f"  -> OK ({time.time() - t0:.2f}s)")
+                except Exception as e:
+                    print(f"  -> FAILED: {e}", file=sys.stderr)
+                buf = ""
+
+    conn.close()
+    db.close()


If the file ends without a trailing semicolon, the final buffered statement in buf is silently dropped. After the loop, handle a non-empty buf by either executing it (if valid) or raising an error so load failures don’t go unnoticed.

Suggested change

-    with open(CYPHER_FILE, "r") as f:
-        buf = ""
-        for line_no, line in enumerate(f, 1):
-            stripped = line.strip()
-            # skip empty lines and comments
-            if not stripped or stripped.startswith("//"):
-                continue
-            buf += " " + stripped if buf else stripped
-            if buf.rstrip().endswith(";"):
-                query = buf.rstrip().replace("${DATA_PATH}", DATA_PATH)
-                print(f"[line {line_no}] Executing: {query}")
-                t0 = time.time()
-                try:
-                    conn.execute(query)
-                    print(f"  -> OK ({time.time() - t0:.2f}s)")
-                except Exception as e:
-                    print(f"  -> FAILED: {e}", file=sys.stderr)
-                buf = ""
-
-    conn.close()
-    db.close()
+    try:
+        with open(CYPHER_FILE, "r") as f:
+            buf = ""
+            for line_no, line in enumerate(f, 1):
+                stripped = line.strip()
+                # skip empty lines and comments
+                if not stripped or stripped.startswith("//"):
+                    continue
+                buf += " " + stripped if buf else stripped
+                if buf.rstrip().endswith(";"):
+                    query = buf.rstrip().replace("${DATA_PATH}", DATA_PATH)
+                    print(f"[line {line_no}] Executing: {query}")
+                    t0 = time.time()
+                    try:
+                        conn.execute(query)
+                        print(f"  -> OK ({time.time() - t0:.2f}s)")
+                    except Exception as e:
+                        print(f"  -> FAILED: {e}", file=sys.stderr)
+                    buf = ""
+        if buf.strip():
+            raise ValueError(
+                f"Unterminated Cypher statement at end of file {CYPHER_FILE}: {buf}"
+            )
+    finally:
+        conn.close()
+        db.close()
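
The unterminated-statement check is easier to exercise in isolation if the splitting logic is pulled out of the execution loop. The following is a minimal sketch of that idea, not code from this PR: `iter_statements` is a hypothetical helper name, and it applies the same skip rules as the loader (blank lines and `//` comments) while raising if the input ends mid-statement.

```python
from typing import Iterable, Iterator, Tuple


def iter_statements(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (end_line_no, statement) for each ';'-terminated statement.

    Blank lines and lines starting with '//' are skipped. Raises ValueError
    if the input ends while a statement is still being buffered.
    """
    buf = ""
    start_line = 0
    for line_no, line in enumerate(lines, 1):
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue
        if not buf:
            start_line = line_no  # remember where this statement began
        buf = f"{buf} {stripped}" if buf else stripped
        if buf.endswith(";"):
            yield line_no, buf
            buf = ""
    if buf:
        raise ValueError(
            f"Unterminated statement starting at line {start_line}: {buf}"
        )
```

A caller could iterate over `iter_statements(f)` for an open file `f`, apply the `${DATA_PATH}` substitution to each statement, and execute it; a missing trailing semicolon then surfaces as an exception instead of a silently skipped statement.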

Copilot · 2026-04-29T09:15:20Z

+    conn = db.connect()
+    print(f"DATA_PATH = {DATA_PATH}")
+
+    with open(CYPHER_FILE, "r") as f:
+        buf = ""
+        for line_no, line in enumerate(f, 1):
+            stripped = line.strip()
+            # skip empty lines and comments
+            if not stripped or stripped.startswith("//"):
+                continue
+            buf += " " + stripped if buf else stripped
+            if buf.rstrip().endswith(";"):
+                query = buf.rstrip().replace("${DATA_PATH}", DATA_PATH)
+                print(f"[line {line_no}] Executing: {query}")
+                t0 = time.time()
+                try:
+                    conn.execute(query)
+                    print(f"  -> OK ({time.time() - t0:.2f}s)")
+                except Exception as e:
+                    print(f"  -> FAILED: {e}", file=sys.stderr)
+                buf = ""
+
+    conn.close()
+    db.close()


Resource cleanup isn’t protected if an exception occurs before conn.close() / db.close() (e.g., file read error or an unhandled exception path). Wrap the load logic in try/finally (or use context-manager support if neug provides it) to ensure connections are always closed.

Suggested change

-    conn = db.connect()
-    print(f"DATA_PATH = {DATA_PATH}")
-
-    with open(CYPHER_FILE, "r") as f:
-        buf = ""
-        for line_no, line in enumerate(f, 1):
-            stripped = line.strip()
-            # skip empty lines and comments
-            if not stripped or stripped.startswith("//"):
-                continue
-            buf += " " + stripped if buf else stripped
-            if buf.rstrip().endswith(";"):
-                query = buf.rstrip().replace("${DATA_PATH}", DATA_PATH)
-                print(f"[line {line_no}] Executing: {query}")
-                t0 = time.time()
-                try:
-                    conn.execute(query)
-                    print(f"  -> OK ({time.time() - t0:.2f}s)")
-                except Exception as e:
-                    print(f"  -> FAILED: {e}", file=sys.stderr)
-                buf = ""
-
-    conn.close()
-    db.close()
+    conn = None
+    try:
+        conn = db.connect()
+        print(f"DATA_PATH = {DATA_PATH}")
+
+        with open(CYPHER_FILE, "r") as f:
+            buf = ""
+            for line_no, line in enumerate(f, 1):
+                stripped = line.strip()
+                # skip empty lines and comments
+                if not stripped or stripped.startswith("//"):
+                    continue
+                buf += " " + stripped if buf else stripped
+                if buf.rstrip().endswith(";"):
+                    query = buf.rstrip().replace("${DATA_PATH}", DATA_PATH)
+                    print(f"[line {line_no}] Executing: {query}")
+                    t0 = time.time()
+                    try:
+                        conn.execute(query)
+                        print(f"  -> OK ({time.time() - t0:.2f}s)")
+                    except Exception as e:
+                        print(f"  -> FAILED: {e}", file=sys.stderr)
+                    buf = ""
+    finally:
+        if conn is not None:
+            conn.close()
+        db.close()
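
If the `neug` objects do not offer context-manager support (the review comment leaves this open), the standard library can still give the same guarantee for anything that exposes `close()`. The sketch below uses `contextlib.closing` with `ExitStack`; `FakeDatabase` and `FakeConnection` are hypothetical stand-ins written only for this illustration and say nothing about the real `neug` API beyond the `connect()`, `execute()` and `close()` calls already seen in the diff.

```python
from contextlib import ExitStack, closing


class FakeConnection:
    """Hypothetical stand-in: only execute() and close() are assumed."""

    def __init__(self, log):
        self.log = log

    def execute(self, query):
        if "BAD" in query:
            raise RuntimeError(f"cannot execute: {query}")

    def close(self):
        self.log.append("conn.close")


class FakeDatabase:
    """Hypothetical stand-in: only connect() and close() are assumed."""

    def __init__(self, log):
        self.log = log

    def connect(self):
        return FakeConnection(self.log)

    def close(self):
        self.log.append("db.close")


def run_queries(db, queries):
    """Run queries, closing the connection and then the database on any exit path."""
    with ExitStack() as stack:
        stack.enter_context(closing(db))  # closed last
        conn = stack.enter_context(closing(db.connect()))  # closed first
        for query in queries:
            conn.execute(query)
```

Because `ExitStack` unwinds in reverse order, the connection is closed before the database, and the database is still closed if `db.connect()` itself raises.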

Copilot · 2026-04-29T09:15:21Z

+
+import neug
+
+DB_DIR = os.path.join(os.getcwd(), "ldbc")


Using os.getcwd() makes the database path depend on the invocation directory, which is brittle outside the workflow. Prefer deriving the path from the script location (e.g., relative to the repo root) or allow overriding via an env var/CLI arg, while keeping the current default.

Suggested change

-DB_DIR = os.path.join(os.getcwd(), "ldbc")
+DEFAULT_DB_DIR = os.path.join(
+    os.path.dirname(os.path.abspath(__file__)), "ldbc"
+)
+DB_DIR = os.environ.get("LDBC_DB_DIR", DEFAULT_DB_DIR)
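
The lookup order in this suggestion (environment variable first, script-relative directory otherwise) can be written as a small function so that it is testable without touching the real environment. This is a sketch under assumptions: `resolve_db_dir` is a hypothetical name, and unlike the one-line suggestion it treats an empty `LDBC_DB_DIR` as unset. Note that the script-relative default is not the same directory as the original `os.getcwd()` default unless the script is launched from its own directory.

```python
import os
from typing import Mapping, Optional


def resolve_db_dir(script_file: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the database directory.

    Uses LDBC_DB_DIR from `env` (defaults to os.environ) when it is set and
    non-empty; otherwise falls back to an "ldbc" directory next to the script.
    """
    env = os.environ if env is None else env
    default = os.path.join(os.path.dirname(os.path.abspath(script_file)), "ldbc")
    return env.get("LDBC_DB_DIR") or default
```

Inside a script this would be called as `resolve_db_dir(__file__)`.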

Copilot · 2026-04-29T09:15:21Z

+      - name: Run benchmark via SSH
+        run: |
+          RESULT_DIR="${{ steps.load.outputs.result_dir }}"
+          ssh -o StrictHostKeyChecking=no client-host "mkdir -p $RESULT_DIR && /data/neng/actions-runner/flex_ldbc_snb/driver/neug/driver/benchmark.sh /data/neng/actions-runner/flex_ldbc_snb/driver/neug/driver/benchmark-sf100.properties 2>&1 | tee $RESULT_DIR/benchmark.log"


Disabling StrictHostKeyChecking makes the SSH step vulnerable to MITM attacks on the runner network. Prefer adding the host key to known_hosts (e.g., via ssh-keyscan into ~/.ssh/known_hosts) or configuring the self-hosted runner image with the correct host key.

Suggested change

-          ssh -o StrictHostKeyChecking=no client-host "mkdir -p $RESULT_DIR && /data/neng/actions-runner/flex_ldbc_snb/driver/neug/driver/benchmark.sh /data/neng/actions-runner/flex_ldbc_snb/driver/neug/driver/benchmark-sf100.properties 2>&1 | tee $RESULT_DIR/benchmark.log"
+          mkdir -p ~/.ssh
+          chmod 700 ~/.ssh
+          if ! ssh-keygen -F client-host >/dev/null; then
+            ssh-keyscan -H client-host >> ~/.ssh/known_hosts
+          fi
+          chmod 600 ~/.ssh/known_hosts
+          ssh client-host "mkdir -p $RESULT_DIR && /data/neng/actions-runner/flex_ldbc_snb/driver/neug/driver/benchmark.sh /data/neng/actions-runner/flex_ldbc_snb/driver/neug/driver/benchmark-sf100.properties 2>&1 | tee $RESULT_DIR/benchmark.log"
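
One caveat on the `ssh-keyscan` route: a key scanned at job time is trusted on first use, so it records whatever host answers at that moment. The second option in the comment, provisioning the host key on the runner ahead of time, avoids this. As a rough illustration of that variant, the sketch below builds an `ssh` invocation that refuses unknown hosts and reads keys from a pre-provisioned file. The function names and the example paths are hypothetical; `StrictHostKeyChecking`, `UserKnownHostsFile` and `BatchMode` are standard OpenSSH client options.

```python
import posixpath
import shlex
from typing import List


def build_ssh_command(host: str, remote_command: str, known_hosts: str) -> List[str]:
    """Build an ssh argv that only connects to hosts listed in `known_hosts`."""
    return [
        "ssh",
        "-o", "StrictHostKeyChecking=yes",  # reject unknown or changed host keys
        "-o", f"UserKnownHostsFile={known_hosts}",  # pre-provisioned key file
        "-o", "BatchMode=yes",  # never prompt in CI
        host,
        remote_command,
    ]


def benchmark_command(result_dir: str, script: str, properties: str) -> str:
    """Build the remote shell command, quoting each path for the remote shell."""
    log_path = posixpath.join(result_dir, "benchmark.log")
    return (
        f"mkdir -p {shlex.quote(result_dir)} && "
        f"{shlex.quote(script)} {shlex.quote(properties)} 2>&1 | "
        f"tee {shlex.quote(log_path)}"
    )
```

The resulting list can be passed to `subprocess.run(..., check=True)`. Because the remote pipeline ends in `tee`, its exit status is that of `tee` rather than the benchmark script unless the remote shell enables `pipefail`.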

zhanglei1949

LGTM

lnfjpt added 11 commits April 13, 2026 10:21

add nightly test for interactive benchmark

09ae2fc

add ldbc benchmark

2d57949

update

633d45f

fix runs-on

a204d02

add tmp dir

59cf5bb

update datapath

1e86b9b

fix

5f265c4

update server config

3a0bd6e

fix

cdca4c5

use hugepage to run benchmark

8218d14

run sf100 benchmark

ad8e196

greptile-apps Bot reviewed Apr 13, 2026

lnfjpt requested a review from zhanglei1949 April 13, 2026 06:57

lnfjpt added 15 commits April 13, 2026 15:05

Merge branch 'main' into main

ebb2d61

add nightly run

8778e75

fixs property path

ab84aeb

update schedule of workflow

7311b67

Merge branch 'main' into main

680532d

Merge branch 'main' into main

bacaa18

add result for loading graph

3125d26

u

24da14e

open debug

c61fcc4

Merge branch 'main' into main

09cad49

update nightly benchmark

2eab953

use sf10 for test

d4959d1

fix clean up

2a3affe

fix server

d14b49c

Merge branch 'main' into main

f01893c

lnfjpt force-pushed the main branch from 68e7dc5 to f01893c Compare April 28, 2026 07:13

Merge branch 'main' into main

fe586c1

lnfjpt added 5 commits April 28, 2026 15:17

use disown

3a73f25

use nohup & setsid

c1ddcd0

update

8a09a12

run sf100 test

91a6129

update runner label

6ec7239

lnfjpt changed the title ~~feat: Add nightly ldbc interactive benchmark (WIP)~~ feat: Add nightly ldbc interactive benchmark Apr 29, 2026

lnfjpt added 2 commits April 29, 2026 15:56

Merge branch 'main' into main

684d72f

rm unused workflow

8f4e55a

zhanglei1949 reviewed Apr 29, 2026

Comment thread benchmark/interactive/load_ldbc.py

zhanglei1949 requested a review from Copilot April 29, 2026 09:05

add license for benchmark

a47acb8

Copilot AI reviewed Apr 29, 2026

lnfjpt added 2 commits April 29, 2026 17:16

Merge branch 'main' into main

572f10c

update ssh

fe837a6

zhanglei1949 approved these changes Apr 29, 2026

Merge branch 'main' into main

da4798f

Copilot started reviewing on behalf of zhanglei1949 April 29, 2026 10:25

lnfjpt merged commit b1c0778 into alibaba:main May 3, 2026
2 checks passed

lnfjpt mentioned this pull request May 26, 2026

feat: Release v0.1.2 #404

Merged

lnfjpt merged 38 commits into alibaba:main from lnfjpt:main

3 participants
