Add handling for ambiguous parsing options #226

Merged
merged 2 commits into from
Nov 18, 2021

Commits on Nov 17, 2021

  1. 4d3a609

Commits on Nov 18, 2021

  1. Handle ambiguities between col_sep and strip parsing options (#225)

    With Ruby 3.0.2 and csv 3.2.1, the script
    
    ```ruby
    require "csv"
    File.open("example.tsv", "w") { |f| f.puts("foo\t\tbar") }
    CSV.read("example.tsv", col_sep: "\t", strip: true)
    ```
    
    produces the error
    
    ```
    lib/csv/parser.rb:935:in `parse_quotable_robust': TODO: Meaningful
    message in line 1. (CSV::MalformedCSVError)
    ```
    
    However, the CSV in this example is not malformed; instead, ambiguous
    options were provided to the parser. It is not obvious (to me) whether
    the string should be parsed as
    
    - `["foo\t\tbar"]`,
    - `["foo", "bar"]`,
    - `["foo", "", "bar"]`, or
    - `["foo", nil, "bar"]`.
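
    To make the four candidate readings concrete, here is a minimal sketch
    that builds each one with plain `String` methods. It deliberately does
    not call the csv parser; the variable names are hypothetical and only
    illustrate why the options are ambiguous.

    ```ruby
    line = "foo\t\tbar"

    # Reading 1: the tabs are treated as strippable whitespace, not separators.
    one_field = [line]

    # Reading 2: a run of tabs counts as a single separator.
    collapsed = line.split(/\t+/)

    # Reading 3: each tab separates fields; the empty field stays an empty string.
    with_empty = line.split("\t")

    # Reading 4: as reading 3, but the empty field becomes nil.
    with_nil = with_empty.map { |field| field.empty? ? nil : field }
    ```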
    
    This commit adds code that raises an exception when this situation is
    encountered. Specifically, it checks if the column separator either ends
    with or starts with the characters that would be stripped away.
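
    The check described above could look roughly like the following. This
    is a minimal sketch, not the code added to the csv library: the method
    names are hypothetical, and it assumes that `strip: true` strips the
    whitespace characters listed in `ASSUMED_WHITESPACE` while a String
    value for `strip` strips exactly those characters.

    ```ruby
    # Assumption: the characters removed when strip: true is given.
    ASSUMED_WHITESPACE = " \t\r\n\f\v"

    # Hypothetical helper: true when col_sep starts or ends with a character
    # that stripping would remove, which makes the two options ambiguous.
    def ambiguous_strip_and_col_sep?(col_sep, strip)
      return false unless strip

      stripped_chars = strip.is_a?(String) ? strip : ASSUMED_WHITESPACE
      stripped_chars.each_char.any? do |char|
        col_sep.start_with?(char) || col_sep.end_with?(char)
      end
    end

    # Hypothetical validation step: raise early instead of failing mid-parse.
    def validate_parsing_options!(col_sep:, strip:)
      return unless ambiguous_strip_and_col_sep?(col_sep, strip)

      raise ArgumentError,
            "strip (#{strip.inspect}) and col_sep (#{col_sep.inspect}) " \
            "options are incompatible"
    end
    ```

    Under these assumptions, `validate_parsing_options!(col_sep: "\t",
    strip: true)` raises `ArgumentError`, while `col_sep: ","` with
    `strip: true` passes.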
    
    This commit also adds unit tests and updates the documentation.
    adamroyjones committed Nov 18, 2021
    b86b3f4