Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double quotes inside a field aren't handled correctly #18

Closed
oilnam opened this issue Dec 13, 2016 · 3 comments · Fixed by #19
Closed

Double quotes inside a field aren't handled correctly #18

oilnam opened this issue Dec 13, 2016 · 3 comments · Fixed by #19

Comments

@oilnam
Copy link

oilnam commented Dec 13, 2016

PureCSV correctly parses a string like: Jon,"Snow of Winterfell"

but it fails when the quotes are not wrapping the entire field, e.g. Jon,Snow "III" of Winterfell

Steps to reproduce:

import purecsv.unsafe._

case class Person(name: String, surname: String)

scala> s = "Jon,\"Snow of Winterfell\""
s: String = Jon,"Snow of Winterfell"

scala> CSVReader[Person].readCSVFromString(CSVReader[Person].readCSVFromString(s).toCSVLines().head)
res11: List[Person] = List(Person(Jon,Snow of Winterfell))

scala> s = "Jon,Snow \"III\" of Winterfell"
s: String = Jon,Snow "III" of Winterfell

scala> CSVReader[Person].readCSVFromString(CSVReader[Person].readCSVFromString(s).toCSVLines().head)
java.lang.Exception: Line 1 Expected , got 73
  at com.github.marklister.collections.io.CSVReader.processQuotedField(CSVReader.scala:54)
  at com.github.marklister.collections.io.CSVReader.processField(CSVReader.scala:85)
  at com.github.marklister.collections.io.CSVReader.next(CSVReader.scala:97)
  at com.github.marklister.collections.io.CSVReader.next(CSVReader.scala:13)
  at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462)
  at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
  at scala.collection.Iterator$class.foreach(Iterator.scala:893)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
  at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
  at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:183)
  at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
  at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
  at scala.collection.AbstractIterator.to(Iterator.scala:1336)
  at scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:294)
  at scala.collection.AbstractIterator.toList(Iterator.scala:1336)
  at purecsv.unsafe.package$CSVReader$class.readCSVFromString(package.scala:77)
  at purecsv.unsafe.package$CSVReader$$anon$1.readCSVFromString(package.scala:101)
  ... 42 elided
@melrief
Copy link
Collaborator

melrief commented Dec 13, 2016

Thanks for the report. I'm using a library to parse CSV but not to write to CSV and probably that's causing this (very strange) behaviour. Specifically, I think the issue is the toCSVLines. I'll investigate in the next days, thanks!

EDIT: I can confirm the issue

scala> val s = "Jon,Snow \"III\" of Winterfell"
s: String = Jon,Snow "III" of Winterfell
scala> CSVReader[Person].readCSVFromString(s).toCSV()
res4: String = "Jon","Snow "III" of Winterfell"

res4 is clearly not a valid CSV record

@oilnam
Copy link
Author

oilnam commented Dec 14, 2016

Thank you for writing the library! I've used it for a few projects here at The Guardian and it worked great so far ;)

@melrief
Copy link
Collaborator

melrief commented Dec 14, 2016

Nice to know that the library is doing its job :).

I made 19 which should fix this via quoting fields in a way that the underlying parser can parse quote fields. @oilnam do you have sometime to quick check the PR? I made a test which should be representative, can you tell me if this covers the issue? I will publish a new version with the patch as soon as I'm sure it works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants