No results being returned by apoc.export.json.all when writeNodeProperties: true for a large dataset when writing to a stream #647

Open
elan-sfrancies opened this issue Jul 22, 2024 · 1 comment

Comments

@elan-sfrancies

I am experiencing issues when running the following query against an on-premises Docker instance of Neo4j Community Edition.

CALL apoc.export.json.all(null, {stream:true, jsonFormat: "JSON_LINES", writeNodeProperties: true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data

Expected Behavior

The query returns results or displays an error showing why results could not be returned.

Actual Behavior

No results are returned (the following message is seen in the web interface):

(no changes, no records)

How to Reproduce the Problem

Steps

  1. Generate a neo4j instance with ~100,000 nodes, ~100,000 relationships and ~17,000,000 properties
  2. Run the following query:
CALL apoc.export.json.all(null, {stream:true, jsonFormat: "JSON_LINES", writeNodeProperties: true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data
  3. Observe that no results are returned.
     I have tested this behaviour using the .NET driver as well as the browser interface.

Screenshots

The results when writeNodeProperties: false:

[Screenshot: WriteNodePropertiesFalse]

The lack of results when writeNodeProperties: true:

[Screenshot: WriteNodePropertiesTrue]

Specifications

Memory: 30GB (memory use appears to increase during the query, topping out at around 9.5GB and then leveling off)
CPU: 20

Versions

  • OS: Docker for Windows (WSL) on Windows 10
  • Neo4j: neo4j:5.20-community-bullseye
  • Neo4j-Apoc: "NEO4J_PLUGINS=["apoc"]" (latest)
@gem-neo4j
Contributor

Hey! Thanks for writing in. I suspect this is an out-of-memory (OOM) error, as APOC does not implement memory tracking. Are you able to check the debug.log file and see if there are errors there? If so, can you send that here too?

Unfortunately, with how APOC is implemented, this isn't something that is easy for us to fix at this time. My suggestion would be to use one of the other apoc.export.json procedures, where you feed the data in using Cypher. That way you can control how much data is consumed at a given time.
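
As a rough illustration of that suggestion, the sketch below builds a series of smaller streamed exports with apoc.export.json.query instead of one apoc.export.json.all call. It is only a sketch under assumptions: the helper name batched_export_queries, the batch size, and SKIP/LIMIT paging ordered by elementId(n) are illustrative choices that do not come from this thread, the config map is simply carried over from the original query, and only nodes are covered (relationships would need a similar statement). The code only produces the Cypher strings; running them through a driver is left out.

```python
from typing import Iterator

# Hypothetical template: one streamed export per page of nodes.
# The inner statement is passed to apoc.export.json.query as a string;
# the file argument is null because the result is streamed back.
EXPORT_TEMPLATE = (
    "CALL apoc.export.json.query("
    "'MATCH (n) RETURN n ORDER BY elementId(n) SKIP {skip} LIMIT {limit}', "
    "null, "
    "{{stream: true, jsonFormat: 'JSON_LINES', writeNodeProperties: true}}) "
    "YIELD nodes, properties, data "
    "RETURN nodes, properties, data"
)


def batched_export_queries(total_nodes: int, batch_size: int) -> Iterator[str]:
    """Yield one export statement per batch of at most batch_size nodes."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for skip in range(0, total_nodes, batch_size):
        yield EXPORT_TEMPLATE.format(skip=skip, limit=batch_size)


if __name__ == "__main__":
    # Assumed figures: roughly 100,000 nodes, exported 25,000 at a time.
    for statement in batched_export_queries(100_000, 25_000):
        print(statement)
```

Each statement can then be run on its own, and the batch size lowered if memory use is still too high. Note that SKIP/LIMIT paging rescans earlier rows on every page, so this trades some speed for a bounded amount of data per call.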
