Add legacy-style-quoting to BioCypher core for use with Neo4j admin-import #343

Open
slobentanzer opened this issue Apr 24, 2024 · 0 comments

slobentanzer commented Apr 24, 2024

Quotes are a constant source of trouble in admin import, because the commonly used quote characters frequently occur in free-text annotations in the imported datasets. To avoid import failures caused by the quote character appearing inside a property string, we could use the legacy-style-quoting parameter, described here: https://neo4j.com/docs/operations-manual/4.3/tools/neo4j-admin/neo4j-admin-import/#import-tool-option-legacy-style-quoting

Task: modify processing in BioCypher to escape the quote character that the user chose to enclose fields. For instance, if the BioCypher quote_character is a double quote, every double quote inside a data field should be replaced with a backslash-escaped double quote: \"
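
A minimal sketch of the replacement described above, in Python. The function names escape_quote_character and quote_field are hypothetical and not part of the current BioCypher API; the sketch also assumes that backslashes already present in the data need no special treatment, which would have to be checked against Neo4j's behaviour.

```python
def escape_quote_character(value: str, quote_character: str = '"') -> str:
    """Backslash-escape every occurrence of the quote character in a field.

    Hypothetical helper; the name and signature are assumptions.
    """
    return value.replace(quote_character, "\\" + quote_character)


def quote_field(value: str, quote_character: str = '"') -> str:
    """Escape inner quotes, then wrap the field in the quote character."""
    escaped = escape_quote_character(value, quote_character)
    return f"{quote_character}{escaped}{quote_character}"


# Example: a free-text annotation that contains the quote character.
print(quote_field('binds the "active" site'))
```

The same helper works for other quote characters, for example escape_quote_character("it's", "'"), since the character to escape is taken from the configured quote_character.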

This behaviour should be available via configuration and default to true. However, it only makes sense for Neo4j offline mode. All other modes should probably not use it, or should at least default to false (not sure how to implement this; it may need to be a warning). When this setting is active, the admin import call we write should be extended by --legacy-style-quoting=true.
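
One way the configuration logic could look, as a rough sketch only. The setting name legacy_style_quoting, the dbms and offline parameters, and both function names are assumptions made for illustration; only the --legacy-style-quoting=true flag comes from the Neo4j documentation linked above.

```python
import warnings
from typing import List, Optional


def resolve_legacy_style_quoting(
    dbms: str, offline: bool, configured: Optional[bool] = None
) -> bool:
    """Decide whether legacy-style quoting is active (hypothetical helper).

    Defaults to True only for Neo4j offline mode; an explicit True in any
    other mode is honoured but triggers a warning.
    """
    is_neo4j_offline = dbms == "neo4j" and offline
    if configured is None:
        return is_neo4j_offline
    if configured and not is_neo4j_offline:
        warnings.warn(
            "legacy_style_quoting is only meaningful for Neo4j offline mode",
            UserWarning,
        )
    return configured


def extend_import_call(args: List[str], legacy_style_quoting: bool) -> List[str]:
    """Return a copy of the import call arguments, with the flag if active."""
    if legacy_style_quoting:
        return args + ["--legacy-style-quoting=true"]
    return list(args)


# Example: Neo4j offline mode without an explicit user setting.
active = resolve_legacy_style_quoting("neo4j", offline=True)
print(extend_import_call(["neo4j-admin", "import"], active))
```

Emitting a warning rather than raising an error keeps other modes usable if a user sets the option by mistake; whether that is the right trade-off is open, as noted above.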

The most logical place to implement the replacement is the write module for Neo4j.

Project status: Todo