Configuration of export CSV is inconsistent with the docs #652

Closed
szarnyasg opened this issue Nov 8, 2017 · 7 comments
@szarnyasg
Contributor

Using this single node as the sample dataset:

CREATE (:SomePerson {id: 1, name: 'John'})

Exporting to CSV does not work as specified in the docs (Table 8 in Export/Import):

CALL apoc.export.csv.query('MATCH (x:SomePerson) RETURN x.id, x.name', 'file.csv', {delim: ';'})

This should give a CSV with semicolons and without quotes. Instead, it gives:

"x.id","x.name"
"1","John"

After diving into the code a bit, I realized that I should look around the ExportConfig class:

public ExportConfig(Map<String,Object> config) {
config = config != null ? config : Collections.emptyMap();
this.silent = toBoolean(config.getOrDefault("silent",false));
this.batchSize = ((Number)config.getOrDefault("batchSize", DEFAULT_BATCH_SIZE)).intValue();
this.delim = delim(config.getOrDefault("d", String.valueOf(DEFAULT_DELIM)).toString());
this.quotes = toBoolean(config.get("quotes"));
this.useTypes = toBoolean(config.get("useTypes"));
this.nodesOfRelationships = toBoolean(config.get("nodesOfRelationships"));
this.format = ExportFormat.fromString((String) config.getOrDefault("format", "neo4j-shell"));
this.cypherFormat = CypherFormat.fromString((String) config.getOrDefault("cypherFormat", "create"));
this.config = config;
}

This suggests that the delimiter is defined by d and indeed it is:

CALL apoc.export.csv.query('MATCH (x:SomePerson) RETURN x.id, x.name', 'file.csv', {d: ';'})

This results in:

"x.id";"x.name"
"1";"John"

Quotes are supposed to be controlled by the boolean quotes option, but I could not turn them off with quotes: false.

@jexp
Member

jexp commented Nov 8, 2017

@szarnyasg Since you are already looking at the code, would you mind fixing both: supporting d as well as delim, and fixing the quotes config?
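One way to support both keys could look like the sketch below. It is not the actual APOC fix: the class and method names (`DelimiterOption`, `resolveDelimiter`) are hypothetical, the default delimiter is assumed to be a comma, and the special cases handled by the original `delim(...)` helper are not reproduced. It only shows the lookup order: the documented `delim` key first, then the existing `d` key, then the default.

```java
import java.util.Collections;
import java.util.Map;

public class DelimiterOption {
    // Assumption: the default delimiter is a comma.
    static final char DEFAULT_DELIM = ',';

    // Hypothetical helper: prefer the documented "delim" key,
    // fall back to the legacy "d" key, then to the default.
    static char resolveDelimiter(Map<String, Object> config) {
        Map<String, Object> cfg = config != null ? config : Collections.emptyMap();
        Object value = cfg.containsKey("delim") ? cfg.get("delim") : cfg.get("d");
        if (value == null) {
            return DEFAULT_DELIM;
        }
        String text = value.toString();
        if (text.length() != 1) {
            throw new IllegalArgumentException("Delimiter must be a single character: " + text);
        }
        return text.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(resolveDelimiter(Map.of("delim", ";"))); // documented key
        System.out.println(resolveDelimiter(Map.of("d", "|")));     // legacy key
        System.out.println(resolveDelimiter(null));                 // default
    }
}
```

With this lookup order, both `{delim: ';'}` and `{d: ';'}` would select the semicolon, and existing calls that rely on `d` would keep working.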

@szarnyasg
Contributor Author

@jexp I'm looking into it this morning. It seems there are multiple issues. The useTypes option does not work either:

CALL apoc.export.csv.query(
  "MATCH (u:User) return u.age, u.name, u.male, u.kids, labels(u)",
  "query.csv",
  {useTypes: true, d: '|', quotes: false}
)

"u.age"|"u.name"|"u.male"|"u.kids"|"labels(u)"
"42"|"foo"|"true"|"[""a"",""b"",""c""]"|"[""User""]"
"42"|"bar"|""|""|"[""User""]"
"12"|""|""|""|"[""User""]"

@szarnyasg
Contributor Author

szarnyasg commented Nov 13, 2017

I looked into it. The delimiter issue was trivial to fix. However, the opencsv 2.3 library does not work correctly when quotes are turned off: a separator character inside a field value is not escaped.

The opencsv 2.3 library was released in 2011, so I suspected that this issue had been fixed since then. To check this, I created an example project that has both opencsv 2.3 (net.sf.opencsv) and 4.0 (com.opencsv) as dependencies: https://github.com/szarnyasg/opencsv-test/tree/master/src/test/java

The results show the difference.

==> /tmp/my23.csv <==
Foo Bar,Bar, F.,123

==> /tmp/my40.csv <==
Foo Bar,Bar\, F.,123
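The difference can be reproduced without either library. The sketch below is hand-written and does not use opencsv's API or mirror its implementation; the class and method names (`UnquotedCsvRow`, `joinRaw`, `joinEscaped`) are hypothetical, and also escaping the escape character itself is an assumption of this sketch. It joins the three fields "Foo Bar", "Bar, F." and "123" once with no escaping and once with the separator escaped by a backslash.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UnquotedCsvRow {
    // Joins fields with the separator and no escaping: a separator inside
    // a field becomes indistinguishable from a field boundary.
    static String joinRaw(List<String> fields, char sep) {
        return String.join(String.valueOf(sep), fields);
    }

    // Joins fields after escaping every separator (and escape character)
    // that occurs inside a field.
    static String joinEscaped(List<String> fields, char sep, char esc) {
        return fields.stream()
                .map(f -> escape(f, sep, esc))
                .collect(Collectors.joining(String.valueOf(sep)));
    }

    static String escape(String field, char sep, char esc) {
        StringBuilder sb = new StringBuilder();
        for (char c : field.toCharArray()) {
            if (c == sep || c == esc) {
                sb.append(esc);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("Foo Bar", "Bar, F.", "123");
        System.out.println(joinRaw(row, ','));            // Foo Bar,Bar, F.,123
        System.out.println(joinEscaped(row, ',', '\\'));  // Foo Bar,Bar\, F.,123
    }
}
```

The unescaped line parses back as four fields instead of three, which is why unquoted export is not safe unless the separator is escaped.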

@jexp do you think it would be possible to update the opencsv dependency in APOC?

@szarnyasg
Contributor Author

See PR #879

@szarnyasg
Contributor Author

szarnyasg commented May 20, 2019

Eventually, PR neo4j/neo4j#12009 was closed as wontfix, so this issue will require another round of thinking.

@jexp
Member

jexp commented May 11, 2020

@mneedham we need to expose the config options of export CSV in the docs (see above).
Quoting and escaping in particular seem important.
