New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch from neo4jrestclient to neo4j for Neo4j 4.3 compatibility #343
Conversation
I replaced http_driver.query with http_driver.session().run everywhere, but I think it's worth taking a second pass to add transactions. It also appears to be hitting a constraint violation in the tests:
|
@RichardBruskiewich I can't test against 3.5 locally, so I definitely need help with that. I'm also not sure about the duplicate constraint validation messages, for example
I see lots of similar messages when loading an actual kgx file, but my guess is that this might just be something to clean up for tidy output rather than an actual problem? I don't know that it's strictly necessary for this PR, but I think it's probably worth looking at using transactions rather than just This has examples of defining each query as it's own function to use with transactions: https://neo4j.com/docs/python-manual/current/session-api/ which might require a bit of refactoring They're a little less elegant, but it might be reasonable to follow this example for just declaring explicit transactions |
Also, I just realized that the connection problem that's getting hit in the tests might actually be incompatibility with Neo4j 3.5 (uh oh) |
Neo4j 4.* seems to have moved more definitively towards exclusive use of its Bolt port 7687. One suspects that this could be one particular source of gross major release backwards incompatibility with 3.* releases? If KGX needs to support both releases, we may need to parameterize the calling of KGX to know which Neo4j version it is targeting, and this may also require the maintenance of two parallel code implementations, one supporting each version? I can perhaps start taking a closer look at this next week sometime. |
@cmungall @sierra-moxon, I've previously had the suspicion that that there is sufficient incompatibility between 3.5++ and 4.* Neo4j that one size of KGX neo4j sink/source won't fit all. I'm not sure how many KGX users currently depend on 3.5++, for which the pain of upgrade to 4.3++ would be significant. Since freezing KGX at 3.5 does not seem to be a progressive option for the future, do we 1) completely abandon backwards compatibility to 3.5++ in the next KGX release and enforce 4.3++ compliance, or 2) config.yaml parameterize the user choice of Neo4j within a given KGX tool deployment environment, then support both the old (3.5+) and new (4.3+) compliant solutions in the code base (along with unit tests), perhaps marking 3.5++ as deprecated(?). |
It looks like the Neo4j 4.3 driver is compatibly with Neo4j 3.5: https://neo4j.com/developer/kb/neo4j-supported-versions/ The PyPi page lists something about slightly different defaults/details for connection strings that require encryption: https://pypi.org/project/neo4j/ |
@RichardBruskiewich - @kevinschaper and I talked briefly with Justin in the KG hub meeting today. We probably can't support more than one version of neo4j. And the December demo looms large on the Translator side of this fence. What if we plan on migrating to the new version of neo4j, but do the work for it in a branch off master (and don't plan on merging until after December). It sounds like Kevin is ok with this idea, but also suggested we update the KGX README to document the supported version, and give our plan for upgrade. What do you think? |
FWIW, I have been using KGX to load neo4j 3.5.4 graphs and then upgrading the ne04j 3.5.4 data files to neo4j 4.X. I have a script that does it if anyone is interested. |
My one clarification is that I think after switching to the standard neo4j driver, it should still be possible to support 3.5, but I'm 100% on board with waiting until after December to eliminate the chance of downstream disruptions when folks have other priorities. |
OK everyone. Sounds like a sensible plan: we're ultimately only going to support the latest Neo4j (i.e. 4.3) but that work to migrate over to this new version should be in a dedicated branch until after the December demo. We'll document status in the README as required. Thanks all for the feedback. I'll start nibbling away at this starting sometime this week. |
* master: Missing index link to meta-knowledge-graph command
* master: Update requirements.txt of KGX to be more permissive of its linkml version (this eliminates the designated_type error, but still has some other unit test failures)
@kevinschaper (@putmantime @sierra-moxon @cmungall) Moving right along... The Neo4j company may be on a course to make the Community version of Neo4j a bit less palatable, to encourage folks to pay real money to use the technology. I'm hitting a few snags in the unit tests relating to this. I don't yet know how painful they are. Time will tell. Is it the case that Translator will be running KGX only on Community versions of Neo4j or do we just expect serious biomedical systems to use a professional or better version? |
I do not imagine many groups will use the professional/paid version of neo4j. |
…_slate') to 'clean_database'; appended 'test_neo_to_graph_transform' test to 'gets_csv_to_neo_load()' to ensure sequential execution (tweaked 'neo' to 'neo4j' in name)
HI @sierra-moxon (cc: @putmantime @kevinschaper), this branch still has unit test errors but most (all?) of them seem unrelated to the Neo4j 4.3 upgrade target of this branc (and are spun off as PR #351 and issue #353, for remedial attention). I probably now just need to review and update a bit of the documentation, as discussed above, but the code otherwise ready for initial review. I guess we keep this new code tagged for the post-December demo release (but available earlier, for Monarch use)? |
This is awesome, thank you for taking it over @RichardBruskiewich! Waiting until December is still fine, it's only for local development that 3.5 isn't an option for me, so I've been running kgx from this branch when I want to load into a local neo4j. We have a cloud pipeline now too and it's fine using 3.5 with the current kgx release. |
T'was what Tim P. assigned to me (job now "done" after much unavoidable procrastination). @kevinschaper the other unit test failures seem completely unrelated to Neo4j (perhaps rather triggered by recent Biolink model changes plus ???!?). If what you need is the 4.3 Neo4j in your development, should work fine, but let me know if you encounter any issues. |
Yeah. Anyhow, the matter was not as serious as I expected. |
Kudos, SonarCloud Quality Gate passed! |
It's not working yet, rather than querying directly from the driver it looks like a session & transaction are necessary