CSV.parse_line drops part of the stack trace from the exception when parsing malformed data #120

Closed
doliveirakn opened this issue May 4, 2020 · 5 comments

doliveirakn commented May 4, 2020

Consider the following code:

class TestClass
  def self.parsing_method
    CSV.parse_line("\"foo")
  end
end

Calling TestClass.parsing_method should raise a CSV::MalformedCSVError because there is an unclosed quoted field, which makes perfect sense.

However, if we execute this code on version 3.1, we end up with this:

> TestClass.parsing_method
CSV::MalformedCSVError: Unclosed quoted field in line 1.
	from /Users/kd/code/csv/lib/csv/parser.rb:263:in `block in each'
	from /Users/kd/code/csv/lib/csv/parser.rb:92:in `loop'
	from /Users/kd/code/csv/lib/csv/parser.rb:92:in `each'
	from /Users/kd/.rbenv/versions/2.4.3/bin/bundle:1:in `each'

Notice that the stack trace is incomplete: there is no reference to parsing_method, or even to irb.

I have isolated it to this commit: 11f126e#diff-ad40fdf9392e934708ed9ee9ec79bde3R73
It seems to have something to do with the to_enum call, but I haven't been able to figure out what yet.

As of that commit, the stack trace is only partially present. Before that commit, the stack trace is different:

> TestClass.parsing_method
CSV::MalformedCSVError: Unclosed quoted field in line 1.
	from /Users/kd/code/csv/lib/csv/parser.rb:264:in `block in shift'
	from /Users/kd/code/csv/lib/csv/parser.rb:93:in `loop'
	from /Users/kd/code/csv/lib/csv/parser.rb:93:in `shift'
	from /Users/kd/code/csv/lib/csv.rb:1208:in `shift'
	from /Users/kd/code/csv/lib/csv.rb:1171:in `each'
	from /Users/kd/code/csv/lib/csv.rb:1185:in `to_a'
	from /Users/kd/code/csv/lib/csv.rb:1185:in `read'
	from /Users/kd/code/csv/lib/csv.rb:683:in `parse'
	from (irb):3:in `parsing_method'
	from (irb):6

Notice that this one does include a reference to parsing_method.

This means that, for any application using an affected version of CSV, a parsing error is very difficult to track down: the stack trace drops what seems like everything outside of the parser file, so it does not show where the exception came from.
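The report above can also be checked programmatically instead of reading the trace by eye. The following is a minimal sketch, assuming only the TestClass from the description; the helper name malformed_csv_backtrace is hypothetical and not part of the csv library. Which message it prints depends on the csv and Ruby versions in use.

```ruby
require "csv"

class TestClass
  def self.parsing_method
    CSV.parse_line("\"foo")
  end
end

# Hypothetical helper: returns the backtrace of the error raised by
# TestClass.parsing_method (an empty array if nothing is raised).
def malformed_csv_backtrace
  TestClass.parsing_method
  []
rescue CSV::MalformedCSVError => error
  error.backtrace
end

frames = malformed_csv_backtrace

# On an affected version, no frame is expected to mention parsing_method.
if frames.any? { |frame| frame.include?("parsing_method") }
  puts "parsing_method is present in the backtrace"
else
  puts "parsing_method is missing from the backtrace"
end
```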

@doliveirakn
Author

It seems that using to_enum and next drops parts of the stack trace, which is an issue in Ruby itself (https://bugs.ruby-lang.org/issues/16829).

I think the CSV library should revisit the enumerator approach, since it is unclear when this will be fixed in Ruby.
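The to_enum and next behaviour can be illustrated without CSV at all. Enumerator#next runs the enumerated method on a separate Fiber, so a backtrace captured inside it appears to start at that fiber rather than at the code that called next. This is a sketch only: ItemSource, backtrace_via_block and backtrace_via_next are hypothetical names, and whether the caller's frame shows up in the second case depends on the Ruby version.

```ruby
# Hypothetical class with an each-style method that also supports to_enum.
class ItemSource
  def initialize(items)
    @items = items
  end

  def each_item
    return to_enum(:each_item) unless block_given?

    @items.each do |item|
      raise ArgumentError, "bad item" if item.nil?
      yield item
    end
  end
end

# Internal iteration: the block runs on the caller's own stack.
def backtrace_via_block
  ItemSource.new([nil]).each_item { |item| item }
  []
rescue ArgumentError => error
  error.backtrace
end

# External iteration: Enumerator#next runs each_item on a separate Fiber.
def backtrace_via_next
  ItemSource.new([nil]).each_item.next
  []
rescue ArgumentError => error
  error.backtrace
end

block_frames = backtrace_via_block
next_frames = backtrace_via_next

puts "via block mentions its caller: #{block_frames.any? { |f| f.include?('backtrace_via_block') }}"
puts "via next mentions its caller:  #{next_frames.any? { |f| f.include?('backtrace_via_next') }}"
```

Comparing the two lines of output on a given Ruby shows whether external enumeration loses the caller's frames there.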

@kou
Member

kou commented May 4, 2020

I couldn't reproduce this with Ruby 2.7:

require "csv"

class TestClass
  def self.parsing_method
    CSV.parse("\"foo")
  end
end

TestClass.parsing_method

$ ruby -v /tmp/a.rb
ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-linux-gnu]
Traceback (most recent call last):
	15: from /tmp/a.rb:9:in `<main>'
	14: from /tmp/a.rb:5:in `parsing_method'
	13: from /usr/lib/ruby/2.7.0/csv.rb:686:in `parse'
	12: from /usr/lib/ruby/2.7.0/csv.rb:1289:in `read'
	11: from /usr/lib/ruby/2.7.0/csv.rb:1289:in `to_a'
	10: from /usr/lib/ruby/2.7.0/csv.rb:1280:in `each'
	 9: from /usr/lib/ruby/2.7.0/csv.rb:1280:in `each'
	 8: from /usr/lib/ruby/2.7.0/csv/parser.rb:336:in `parse'
	 7: from /usr/lib/ruby/2.7.0/csv/parser.rb:823:in `parse_quotable_loose'
	 6: from /usr/lib/ruby/2.7.0/csv/parser.rb:49:in `each_line'
	 5: from /usr/lib/ruby/2.7.0/csv/parser.rb:49:in `each_line'
	 4: from /usr/lib/ruby/2.7.0/csv/parser.rb:52:in `block in each_line'
	 3: from /usr/lib/ruby/2.7.0/csv/parser.rb:862:in `block in parse_quotable_loose'
	 2: from /usr/lib/ruby/2.7.0/csv/parser.rb:884:in `parse_quotable_robust'
	 1: from /usr/lib/ruby/2.7.0/csv/parser.rb:959:in `parse_column_value'
/usr/lib/ruby/2.7.0/csv/parser.rb:1020:in `parse_quoted_column_value': Unclosed quoted field in line 1. (CSV::MalformedCSVError)

It includes 14: from /tmp/a.rb:5:in `parsing_method'.

@doliveirakn
Author

doliveirakn commented May 4, 2020

@kou Sorry, there was a typo. The CSV.parse call should have been CSV.parse_line. I've updated the description.

@doliveirakn
Author

With the updated description we should have this:

$ ruby -v /tmp/a.rb
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]
Traceback (most recent call last):
	9: from /tmp/a.rb:in `each'
	8: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:336:in `parse'
	7: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:823:in `parse_quotable_loose'
	6: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:49:in `each_line'
	5: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:49:in `each_line'
	4: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:52:in `block in each_line'
	3: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:862:in `block in parse_quotable_loose'
	2: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:884:in `parse_quotable_robust'
	1: from /Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:959:in `parse_column_value'
/Users/kd/.rbenv/versions/2.7.1/lib/ruby/2.7.0/csv/parser.rb:1020:in `parse_quoted_column_value': Unclosed quoted field in line 1. (CSV::MalformedCSVError)
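For applications stuck on an affected version, one possible stopgap is to append the caller's frames to the exception before re-raising it. This is a sketch under assumptions, not something the csv library provides: parse_line_keeping_caller_frames is a hypothetical wrapper, and it relies only on Kernel#caller and Exception#set_backtrace.

```ruby
require "csv"

# Hypothetical wrapper around CSV.parse_line. If the raised error's
# backtrace lacks the frames of the code that called this method,
# they are appended before the error is re-raised.
def parse_line_keeping_caller_frames(line, **options)
  CSV.parse_line(line, **options)
rescue CSV::MalformedCSVError => error
  # Frames of whoever called this wrapper, minus any already present.
  missing = caller.reject { |frame| error.backtrace.include?(frame) }
  error.set_backtrace(error.backtrace + missing)
  raise
end

class TestClass
  def self.parsing_method
    parse_line_keeping_caller_frames("\"foo")
  end
end

begin
  TestClass.parsing_method
rescue CSV::MalformedCSVError => error
  puts error.backtrace
end
```

On a version where the backtrace is already complete, the reject step should find nothing to add, so the wrapper leaves the trace unchanged.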

kou closed this as completed in 2959483 on May 17, 2020
@kou
Member

kou commented May 17, 2020

Thanks.
I've fixed this.

headius added a commit to headius/jruby that referenced this issue Sep 7, 2022
This pulls in a couple years of fixes including ruby/csv#120 which
fixes jruby#7346 (dangling fiber threads after CSV.parse_line).