
CSVReader

do- edited this page Nov 19, 2023 · 21 revisions

CSVReader is an asynchronous CSV parser implemented as a stream.Transform: it consumes a binary readable stream of UTF-8 encoded input and exposes a readable object stream.

Each CSV line read produces one output object. The mapping is defined when creating the CSVReader instance.

const {CSVReader} = require ('csv-events')

const csv = CSVReader ({
//  delimiter: ',',
//  skip: 0,           // header lines
//  rowNumField: '#',  // how to name the line # property
//  rowNumBase: 1,     // what # has the 1st not skipped line
//  empty: null,       // how to interpret empty values (`''`)
    columns: [
      'id',            // 1st column: read as `id`, unquote
      null,            // 2nd column: to ignore
      {
        name: 'name',  // 3rd column: read as `name`
//      raw: true      // if you prefer to leave it quoted
      }, 
    ]
})

myReadUtf8CsvTextStream.pipe (csv)

// inside an async function (or an ES module with top-level await)
for await (const {id, name} of csv) {
  // do something with `id` and `name`
}

Constructor Options

| Name | Default value | Description |
|---|---|---|
| columns | | Array of column definitions (see below) |
| delimiter | ',' | Column delimiter |
| skip | 0 | Number of header lines to ignore |
| rowNumField | null | The name of the line # property (null for no numbering) |
| rowNumBase | 1 - skip | The 1st output record line # |
| empty | null | The value corresponding to zero length cell content |
| maxLength | 1e6 | The maximum cell length allowed (to prevent a memory overflow) |

More on columns

Specifying columns is mandatory when creating a CSVReader. It must be an array in which every element is:

  • either null (for columns to bypass)
  • or a {name, raw} object
    • that can be shortened to a string name.

The names are used as keys when constructing output objects.

The corresponding values are strings, except for zero length cells, for which the value of the empty option (null by default) is used instead.

Normally, those string values come unquoted, but this processing can be turned off with the raw option. This may make sense in two cases:

  • the values read are immediately printed into another CSV stream, so quotes are reused;
  • for data guaranteed to be printed as is, reading raw cell content is slightly faster.

For CSV rows with fewer cells than columns.length, properties may be missing. A CSV consisting of a single \n will be read as a single {} object.
