Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#965] Escape delimiters in csv strings #1005

Merged
merged 16 commits into from Oct 8, 2018

Conversation

timo95
Copy link
Contributor

@timo95 timo95 commented Oct 5, 2018

fixes #965

Copy link
Contributor

@merando merando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in most parts.

@@ -72,7 +92,7 @@ public void testWriteWithDifferentPropertyTypes() throws Exception {
* Test CSVDataSink to properly separate the metadata
* of edges and vertices using the same label.
*
* @throws Exception
* @throws Exception on failure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more specific.

/**
* Test CSVDataSink to properly escape strings and labels.
*
* @throws Exception on failure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more specific.

/**
* Test CSVDataSink using existing a metadata.csv
*
* @throws Exception on failure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more specific.

* Test IndexedCSVDataSink to properly escape strings and labels. The graph is created manually,
* since GDL does not support special characters.
*
* @throws Exception on failure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more specific.

}

/**
* Test writing and reading a graph with an existing metadata file.
*
* @throws Exception on failure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a specific exception msg.

@merando merando merged commit 185bcf6 into dbs-leipzig:develop Oct 8, 2018
@timo95 timo95 deleted the 965-csv-escaper branch October 8, 2018 09:13
@ChrizZz110
Copy link
Contributor

In my opinion, it is a bit over-engineered. Please have a look at the `StringEscapeUtils" class from the apache commons package. They are escaping strings in csv files with quotes, if there is a char inside that has to be escaped. It think, searching for chars with loops in the strings is not a good choice. In my opinion it is too much logic for exceptional cases, which would actually have to be done in a data-cleaning step.

galpha pushed a commit to galpha/gradoop that referenced this pull request Oct 9, 2018
0x002A pushed a commit to ChrizZz110/gradoop that referenced this pull request Feb 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CSVDataSource fails when labels or properties contains separator characters
3 participants