[ruby/csv] Enhanced Rdoc for CSV (ruby#122)
BurdetteLamar authored and nobu committed Jul 19, 2020
1 parent 033514c commit 6ba1abd
Showing 22 changed files with 962 additions and 216 deletions.
45 changes: 45 additions & 0 deletions doc/csv/col_sep.rdoc
@@ -0,0 +1,45 @@
====== Option +col_sep+

Specifies the \String field separator to be used
for both parsing and generating.
The \String will be transcoded into the data's \Encoding before use.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:col_sep) # => "," (comma)

For examples in this section:
ary = ['a', 'b', 'c']

Using the default:
str = CSV.generate_line(ary)
str # => "a,b,c\n"
ary = CSV.parse_line(str)
ary # => ["a", "b", "c"]

Using +:+ (colon):
col_sep = ':'
str = CSV.generate_line(ary, col_sep: col_sep)
str # => "a:b:c\n"
ary = CSV.parse_line(str, col_sep: col_sep)
ary # => ["a", "b", "c"]

Using +::+ (two colons):
col_sep = '::'
str = CSV.generate_line(ary, col_sep: col_sep)
str # => "a::b::c\n"
ary = CSV.parse_line(str, col_sep: col_sep)
ary # => ["a", "b", "c"]

---

Raises an exception if given the empty \String:
col_sep = ''
# Raises ArgumentError (:col_sep must be 1 or more characters: "")
CSV.parse_line("a:b:c\n", col_sep: col_sep)

Raises an exception if the given value is not String-convertible:
col_sep = BasicObject.new
# Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
CSV.generate_line(ary, col_sep: col_sep)
# Raises NoMethodError (undefined method `to_s' for #<BasicObject:>)
CSV.parse(str, col_sep: col_sep)
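
As a further illustration, a minimal sketch of using a tab as the separator (the sample data is made up for this example). With a tab separator, a comma is ordinary field content:

```ruby
require 'csv'

# Tab-separated input; the comma in the second field is plain data here.
tsv_line = "a\tb,c\td\n"
fields = CSV.parse_line(tsv_line, col_sep: "\t")
fields # => ["a", "b,c", "d"]

# The same separator is used when generating.
out = CSV.generate_line(['a', 'b', 'c'], col_sep: "\t")
out # => "a\tb\tc\n"
```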
45 changes: 45 additions & 0 deletions doc/csv/converters.rdoc
@@ -0,0 +1,45 @@
====== Option +converters+

Specifies a single field converter name or \Proc,
or an \Array of field converter names and Procs.

See {Field Converters}[#class-CSV-label-Field+Converters]

Default value:
CSV::DEFAULT_OPTIONS.fetch(:converters) # => nil

The value may be a single field converter name:
str = '1,2,3'
# Without a converter
ary = CSV.parse_line(str)
ary # => ["1", "2", "3"]
# With built-in converter :integer
ary = CSV.parse_line(str, converters: :integer)
ary # => [1, 2, 3]

The value may be an \Array of field converter names:
str = '1,3.14159'
# Without converters
ary = CSV.parse_line(str)
ary # => ["1", "3.14159"]
# With built-in converters
ary = CSV.parse_line(str, converters: [:integer, :float])
ary # => [1, 3.14159]

The value may be a \Proc custom converter:
str = ' foo , bar , baz '
# Without a converter
ary = CSV.parse_line(str)
ary # => [" foo ", " bar ", " baz "]
# With a custom converter
ary = CSV.parse_line(str, converters: proc {|field| field.strip })
ary # => ["foo", "bar", "baz"]

See also {Custom Converters}[#class-CSV-label-Custom+Converters]

---

Raises an exception if the converter is not a converter name or a \Proc:
str = 'foo,0'
# Raises NoMethodError (undefined method `arity' for nil:NilClass)
CSV.parse(str, converters: :foo)
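
Because the value may be an \Array of converter names and Procs, the two kinds can be mixed. The sketch below assumes a hypothetical custom converter, +to_bool+, that is not part of CSV:

```ruby
require 'csv'

# Hypothetical custom converter: maps "yes"/"no" to booleans
# and returns any other field unchanged.
to_bool = proc {|field| { 'yes' => true, 'no' => false }.fetch(field, field) }

# Built-in :integer is tried first, then the custom converter.
ary = CSV.parse_line('7,yes,maybe', converters: [:integer, to_bool])
ary # => [7, true, "maybe"]
```

A field that no converter recognizes, such as <tt>"maybe"</tt>, is left as a \String.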
13 changes: 13 additions & 0 deletions doc/csv/empty_value.rdoc
@@ -0,0 +1,13 @@
====== Option +empty_value+

Specifies the object that is to be substituted
for each field that has an empty \String.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:empty_value) # => "" (empty string)

With the default, <tt>""</tt>:
CSV.parse_line('a,"",b,"",c') # => ["a", "", "b", "", "c"]

With a different object:
CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"]
39 changes: 39 additions & 0 deletions doc/csv/field_size_limit.rdoc
@@ -0,0 +1,39 @@
====== Option +field_size_limit+

Specifies the \Integer field size limit.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:field_size_limit) # => nil

This is the maximum size CSV will read ahead while looking for the closing quote of a field.
(In truth, it reads to the first line ending beyond this size.)
If the closing quote cannot be found within the limit, CSV raises a MalformedCSVError,
assuming the data is faulty.
You can use this limit to prevent what are effectively DoS attacks on the parser.
However, this limit can cause a legitimate parse to fail;
therefore the default value is +nil+ (no limit).

For the examples in this section:
str = <<~EOT
"a","b"
"
2345
",""
EOT
str # => "\"a\",\"b\"\n\"\n2345\n\",\"\"\n"

Using the default +nil+:
ary = CSV.parse(str)
ary # => [["a", "b"], ["\n2345\n", ""]]

Using <tt>50</tt>:
field_size_limit = 50
ary = CSV.parse(str, field_size_limit: field_size_limit)
ary # => [["a", "b"], ["\n2345\n", ""]]

---

Raises an exception if a field is too long:
big_str = "123456789\n" * 1024
# Raises CSV::MalformedCSVError (Field size exceeded in line 1.)
CSV.parse('valid,fields,"' + big_str + '"', field_size_limit: 2048)
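
When the input is untrusted, the exception can be rescued so that an oversized field does not abort the caller. A minimal sketch, assuming a hypothetical helper named +parse_with_limit+ (not part of CSV):

```ruby
require 'csv'

# Hypothetical helper: returns the parsed rows, or nil when a field
# exceeds the limit and CSV raises MalformedCSVError.
def parse_with_limit(str, limit)
  CSV.parse(str, field_size_limit: limit)
rescue CSV::MalformedCSVError
  nil
end

small = %Q("a","b"\n)
big = 'x,"' + ("123456789\n" * 1024) + '"'

small_rows = parse_with_limit(small, 2048)
small_rows # => [["a", "b"]]
big_rows = parse_with_limit(big, 2048)
big_rows # => nil
```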
17 changes: 17 additions & 0 deletions doc/csv/force_quotes.rdoc
@@ -0,0 +1,17 @@
====== Option +force_quotes+

Specifies the boolean that determines whether each output field is to be double-quoted.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:force_quotes) # => false

For examples in this section:
ary = ['foo', 0, nil]

Using the default, +false+:
str = CSV.generate_line(ary)
str # => "foo,0,\n"

Using +true+:
str = CSV.generate_line(ary, force_quotes: true)
str # => "\"foo\",\"0\",\"\"\n"
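
Note that quoting does not preserve the original types. As a sketch of the round trip, parsing the force-quoted line back (with default parsing options) yields only Strings, and the +nil+ comes back as an empty \String:

```ruby
require 'csv'

ary = ['foo', 0, nil]
str = CSV.generate_line(ary, force_quotes: true)
# Every field was quoted, so the parser sees three String fields.
parsed = CSV.parse_line(str)
parsed # => ["foo", "0", ""]
```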
31 changes: 31 additions & 0 deletions doc/csv/header_converters.rdoc
@@ -0,0 +1,31 @@
====== Option +header_converters+

Specifies a \String converter name or an \Array of converter names.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:header_converters) # => nil

Identical in functionality to option {converters}[#class-CSV-label-Option+converters]
except that:
- The converters apply only to the header row.
- The built-in header converters are +:downcase+ and +:symbol+.

Examples:
str = <<-EOT
foo,0
bar,1
baz,2
EOT
headers = ['Name', 'Value']
# With no header converter
csv = CSV.parse(str, headers: headers)
csv.headers # => ["Name", "Value"]
# With header converter :downcase
csv = CSV.parse(str, headers: headers, header_converters: :downcase)
csv.headers # => ["name", "value"]
# With header converter :symbol
csv = CSV.parse(str, headers: headers, header_converters: :symbol)
csv.headers # => [:name, :value]
# With both
csv = CSV.parse(str, headers: headers, header_converters: [:downcase, :symbol])
csv.headers # => [:name, :value]
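
Since the option works like +converters+, a custom \Proc should also be accepted. A minimal sketch, assuming a hypothetical converter named +upcase+ that is not built in:

```ruby
require 'csv'

str = "foo,0\nbar,1\n"
# Hypothetical custom header converter (not one of the built-ins).
upcase = proc {|header| header.upcase }
csv = CSV.parse(str, headers: ['Name', 'Value'], header_converters: upcase)
headers = csv.headers
headers # => ["NAME", "VALUE"]
```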
63 changes: 63 additions & 0 deletions doc/csv/headers.rdoc
@@ -0,0 +1,63 @@
====== Option +headers+

Specifies a boolean, \Symbol, \Array, or \String to be used
to define column headers.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:headers) # => false

---

Without +headers+:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str)
csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"">
csv.headers # => nil
csv.shift # => ["Name", "Count"]

---

If set to +true+ or the \Symbol +:first_row+,
the first row of the data is treated as a row of headers:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: true)
csv # => #<CSV io_type:StringIO encoding:UTF-8 lineno:2 col_sep:"," row_sep:"\n" quote_char:"\"" headers:["Name", "Count"]>
csv.headers # => ["Name", "Count"]
csv.shift # => #<CSV::Row "Name":"bar" "Count":"1">

---

If set to an \Array, the \Array elements are treated as headers:
str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: ['Name', 'Count'])
csv
csv.headers # => ["Name", "Count"]
csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">

---

If set to a \String +str+, method <tt>CSV::parse_line(str, options)</tt> is called
with the current +options+, and the returned \Array is treated as headers:
str = <<-EOT
foo,0
bar,1
bax,2
EOT
csv = CSV.new(str, headers: 'Name,Count')
csv
csv.headers # => ["Name", "Count"]
csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">
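
As a sketch of why headers are useful in practice: with <tt>headers: true</tt>, CSV.parse returns a CSV::Table whose rows can be read by header name or converted to Hashes (the data here is made up for this example):

```ruby
require 'csv'

str = "Name,Count\nfoo,0\nbar,1\n"
table = CSV.parse(str, headers: true)

# Each row is a CSV::Row; fields are accessible by header.
names = table.map {|row| row['Name'] }
names # => ["foo", "bar"]

# Rows convert to Hashes keyed by header.
first = table.first.to_h
first['Count'] # => "0"
```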
19 changes: 19 additions & 0 deletions doc/csv/liberal_parsing.rdoc
@@ -0,0 +1,19 @@
====== Option +liberal_parsing+

Specifies the boolean value that determines whether
CSV will attempt to parse input not conformant with RFC 4180,
such as double quotes in unquoted fields.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:liberal_parsing) # => false

For examples in this section:
str = 'is,this "three, or four",fields'

Without +liberal_parsing+:
# Raises CSV::MalformedCSVError (Illegal quoting in line 1.)
CSV.parse_line(str)

With +liberal_parsing+:
ary = CSV.parse_line(str, liberal_parsing: true)
ary # => ["is", "this \"three", " or four\"", "fields"]
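
One possible pattern is to parse strictly first and fall back to liberal parsing only for lines that fail. A sketch, assuming a hypothetical helper named +parse_leniently+ (not part of CSV):

```ruby
require 'csv'

# Hypothetical helper: strict parse first, liberal parse as a fallback.
def parse_leniently(line)
  CSV.parse_line(line)
rescue CSV::MalformedCSVError
  CSV.parse_line(line, liberal_parsing: true)
end

strict = parse_leniently('a,b,c')
strict # => ["a", "b", "c"]
liberal = parse_leniently('is,this "three, or four",fields')
liberal.size # => 4
```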
12 changes: 12 additions & 0 deletions doc/csv/nil_value.rdoc
@@ -0,0 +1,12 @@
====== Option +nil_value+

Specifies the object that is to be substituted for each null (no-text) field.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:nil_value) # => nil

With the default, +nil+:
CSV.parse_line('a,,b,,c') # => ["a", nil, "b", nil, "c"]

With a different object:
CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"]
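
A null field (nothing between separators) differs from an empty field (<tt>""</tt>), and options +nil_value+ and +empty_value+ handle them separately. A short sketch with made-up data showing both in one line:

```ruby
require 'csv'

line = 'a,,"",b'
# Defaults: the null field becomes nil, the empty field becomes "".
defaults = CSV.parse_line(line)
defaults # => ["a", nil, "", "b"]

# Each kind of field gets its own substitute.
custom = CSV.parse_line(line, nil_value: 'N', empty_value: 'E')
custom # => ["a", "N", "E", "b"]
```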
32 changes: 32 additions & 0 deletions doc/csv/quote_char.rdoc
@@ -0,0 +1,32 @@
====== Option +quote_char+

Specifies the character (\String of length 1) used to quote fields
in both parsing and generating.
This \String will be transcoded into the data's \Encoding before use.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:quote_char) # => "\"" (double quote)

This is useful for an application that incorrectly uses <tt>'</tt> (single-quote)
to quote fields, instead of the correct <tt>"</tt> (double-quote).

Using the default:
ary = ['a', 'b', '"c"', 'd']
str = CSV.generate_line(ary)
str # => "a,b,\"\"\"c\"\"\",d\n"
ary = CSV.parse_line(str)
ary # => ["a", "b", "\"c\"", "d"]

Using <tt>'</tt> (single-quote):
quote_char = "'"
ary = ['a', 'b', '\'c\'', 'd']
str = CSV.generate_line(ary, quote_char: quote_char)
str # => "a,b,'''c''',d\n"
ary = CSV.parse_line(str, quote_char: quote_char)
ary # => ["a", "b", "'c'", "d"]

---

Raises an exception if the \String length is greater than 1:
# Raises ArgumentError (:quote_char has to be nil or a single character String)
CSV.new('', quote_char: 'xx')
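
The quote character also determines whether a separator inside a field is protected. A brief sketch with made-up data, comparing the default with <tt>'</tt> (single-quote):

```ruby
require 'csv'

line = "a,'b,c',d\n"
# With single-quote as the quote character, the embedded comma is data.
quoted = CSV.parse_line(line, quote_char: "'")
quoted # => ["a", "b,c", "d"]

# With the default, the single quotes are ordinary characters.
plain = CSV.parse_line(line)
plain # => ["a", "'b", "c'", "d"]
```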
12 changes: 12 additions & 0 deletions doc/csv/quote_empty.rdoc
@@ -0,0 +1,12 @@
====== Option +quote_empty+

Specifies the boolean that determines whether an empty value is to be double-quoted.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:quote_empty) # => true

With the default +true+:
CSV.generate_line(['"', ""]) # => "\"\"\"\",\"\"\n"

With +false+:
CSV.generate_line(['"', ""], quote_empty: false) # => "\"\"\"\",\n"
22 changes: 22 additions & 0 deletions doc/csv/return_headers.rdoc
@@ -0,0 +1,22 @@
====== Option +return_headers+

Specifies the boolean that determines whether method #shift
returns or ignores the header row.

Default value:
CSV::DEFAULT_OPTIONS.fetch(:return_headers) # => false

Examples:
str = <<-EOT
Name,Count
foo,0
bar,1
bax,2
EOT
# Without return_headers, the first row returned is the first data row.
csv = CSV.new(str, headers: true)
csv.shift # => #<CSV::Row "Name":"foo" "Count":"0">
# With return_headers first row is headers.
csv = CSV.new(str, headers: true, return_headers: true)
csv.shift # => #<CSV::Row "Name":"Name" "Count":"Count">
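
When the header row is returned, it can be told apart from data rows with CSV::Row#header_row? and CSV::Row#field_row?. A minimal sketch with made-up data:

```ruby
require 'csv'

str = "Name,Count\nfoo,0\n"
csv = CSV.new(str, headers: true, return_headers: true)

first = csv.shift
first.header_row? # => true

second = csv.shift
second.field_row? # => true
second['Name'] # => "foo"
```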
