added support for saving with a QuoteMode #254

Closed: tobithiel wants to merge 3 commits

tobithiel
Contributor

I'm dealing with some messy CSV files, and being able to quote all fields is very useful: it keeps other applications from misreading the file because of some sketchy characters.
This also makes it possible to disable quoting altogether, which would resolve #119, or to quote only non-numeric fields.
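
To make the quoting options above concrete, here is a minimal sketch of how the modes might differ when a row is written. The names are hypothetical and loosely mirror the `QuoteMode` values in Apache Commons CSV (`ALL`, `MINIMAL`, `NON_NUMERIC`, `NONE`); the logic is illustrative and is not the code in this PR.

```scala
// Hypothetical sketch: illustrates quote-mode semantics only.
// It is not the implementation from this PR or from Apache Commons CSV.
object QuoteModeSketch {
  sealed trait QuoteMode
  case object All extends QuoteMode        // quote every field
  case object Minimal extends QuoteMode    // quote only fields that need it
  case object NonNumeric extends QuoteMode // quote everything except numbers
  case object NoQuotes extends QuoteMode   // never quote

  // Rough numeric check (assumption: anything Double can parse counts as numeric).
  private def isNumeric(s: String): Boolean =
    s.nonEmpty && scala.util.Try(s.toDouble).isSuccess

  // Wrap a value in quotes, doubling any embedded quote characters.
  private def quote(s: String, q: Char): String =
    q.toString + s.replace(q.toString, q.toString * 2) + q.toString

  def formatField(value: String, mode: QuoteMode,
                  delimiter: Char = ',', quoteChar: Char = '"'): String =
    mode match {
      case All        => quote(value, quoteChar)
      case NonNumeric => if (isNumeric(value)) value else quote(value, quoteChar)
      case Minimal    =>
        val needsQuotes = value.exists { c =>
          c == delimiter || c == quoteChar || c == '\n' || c == '\r'
        }
        if (needsQuotes) quote(value, quoteChar) else value
      case NoQuotes   => value
    }

  def formatRow(fields: Seq[String], mode: QuoteMode): String =
    fields.map(formatField(_, mode)).mkString(",")
}
```

For example, `QuoteModeSketch.formatRow(Seq("a,b", "42"), QuoteModeSketch.All)` quotes both fields, while `NonNumeric` leaves `42` bare. Note that with `NoQuotes` a field containing the delimiter is written unprotected, so a reader would see an extra column.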

@codecov-io

Current coverage is 86.18%

Merging #254 into master will increase coverage by +0.11% as of 765ffc9

@@            master    #254   diff @@
======================================
  Files           12      12       
  Stmts          517     521     +4
  Branches       149     151     +2
  Methods          0       0       
======================================
+ Hit            445     449     +4
  Partial          0       0       
  Missed          72      72       


val carsCopy = sqlContext.csvFile(copyFilePath + "/")

assert(carsCopy.count == cars.count)
assert(carsCopy.collect.map(_.toString).toSet == cars.collect.map(_.toString).toSet)
Member

Hm, does this really check whether the output is quoted?

@tobithiel
Contributor Author

Not sure why it's not rerunning the tests, but I pushed extended tests. I'm not that familiar with Scala, so please feel free to rewrite them in a more elegant way.
Most of the other tests that write CSV files don't check what's actually in the file either, so these kinds of checks should probably be extended to those tests as well.
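
A check on the raw file contents, as opposed to the re-parsed DataFrame, could look roughly like the sketch below. `QuotedOutputCheck` and its methods are hypothetical helpers, not part of spark-csv or of the tests in this PR; they assume a comma delimiter, a double-quote character, and no line breaks inside fields.

```scala
// Hypothetical helper: inspects raw CSV text instead of the re-parsed data.
// Assumes one record per line (no embedded line breaks inside fields).
object QuotedOutputCheck {
  // Split a line on delimiters that fall outside quoted sections.
  // Doubled quotes ("") toggle the state twice, so they stay inside the field.
  def splitFields(line: String, delimiter: Char = ',',
                  quoteChar: Char = '"'): Vector[String] = {
    val fields = Vector.newBuilder[String]
    val current = new StringBuilder
    var inQuotes = false
    line.foreach { c =>
      if (c == quoteChar) {
        inQuotes = !inQuotes
        current.append(c)
      } else if (c == delimiter && !inQuotes) {
        fields += current.toString
        current.clear()
      } else {
        current.append(c)
      }
    }
    fields += current.toString
    fields.result()
  }

  // True when every field on the line starts and ends with the quote character.
  def allFieldsQuoted(line: String, quoteChar: Char = '"'): Boolean =
    splitFields(line, quoteChar = quoteChar).forall { f =>
      f.length >= 2 && f.head == quoteChar && f.last == quoteChar
    }
}
```

In a test, the written output could be read back as plain text (for instance with `sc.textFile` on the output directory) and each line passed to `allFieldsQuoted`, so the assertion fails if the writer silently falls back to unquoted output.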

@tobithiel
Contributor Author

@HyukjinKwon Are you interested in merging this in? Do you have more feedback?

@HyukjinKwon
Member

@tobithiel I do not have the permission. cc @falaki

@falaki
Member

falaki commented Feb 18, 2016

@tobithiel This is a cool feature to have in spark-csv. Would you make sure it works with the other parse mode as well?

@tobithiel
Contributor Author

@falaki I'm not sure I understand what you mean. This only deals with writing files, and since parsing CSV files with quotes is already tested and works, parsing the written files again works out of the box.

@falaki
Member

falaki commented Feb 19, 2016

@tobithiel Thanks. This looks good.

@falaki closed this in 3d6cb3a on Feb 19, 2016
4 participants