The aim is to write a package that converts CSV text input into arrays or objects. You must use streams to accomplish the task.
- Your input can be a file path or a string containing CSV content.
- Your output should be a stream so the pipe chain can continue, unless overridden by the user. See below.
- Use async iterators and generators wherever you can.
YOUR_API("path to localFile/remote files (if you can do)", { ...YOUR OPTIONS });
YOUR_API('JSON STRING') (Bruno)
//=> outputs CSV (via stream? or string?)
input -> processing the input -> output file(stream) or string
YOUR_API('CSV STRING') (Arpit)
//=> outputs JSON (via stream? or string?)
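For the parsing direction, here is a minimal sketch of the stream-plus-async-generator shape this could take. The name parseLines and the naive line splitting are illustrative assumptions, not requirements; it does not handle quoted fields, which need a stateful tokenizer.

const { Readable } = require('stream');

// Sketch only: an async generator that yields one record (array of fields)
// per line from any async-iterable chunk source.
async function* parseLines(source, { delimiter = ',' } = {}) {
  let buffered = '';
  for await (const chunk of source) {
    buffered += chunk.toString();
    const lines = buffered.split('\n');
    buffered = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) yield line.split(delimiter);
  }
  if (buffered !== '') yield buffered.split(delimiter); // last record may lack a line break
}

// Readable.from turns the generator back into an object-mode stream,
// so the pipe chain can continue.
const records = Readable.from(parseLines(Readable.from(['aaa,bbb\nxxx,yyy'])));

Because the result is itself a Readable, both records.pipe(...) and for await (const row of records) keep working.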
Note: Not every JSON document maps to valid CSV. Ask the developer how to fill the gaps.
- Ask yourself one question: Does the separator have to be only a comma? Can it be other characters?
- There may be an OPTIONAL header line appearing as the first line of the file with the same format as normal records. This header will contain names corresponding to the fields in the file, and MUST contain the same number of fields as the records in the rest of the file.
# INPUT
field_1,field_2,field_3¬
aaa,bbb,ccc¬
xxx,yyy,zzz¬
# OUTPUT (ignoring headers)
[ ["field_1", "field_2", "field_3"],
["aaa", "bbb", "ccc"],
["xxx", "yyy", "zzz"] ]
# OUTPUT (using headers)
[ {"field_1": "aaa", "field_2": "bbb", "field_3": "ccc"},
{"field_1": "xxx", "field_2": "yyy", "field_3": "zzz"} ]
- Give the user the power to transform the header if they don't like the existing header names.
YOUR_API(`
"key_1","key_2"
"value 1","value 2"
`, {
headers: header =>
header.map( column => column.toUpperCase() )
})
//=>
// [{
// KEY_1: 'value 1',
// KEY_2: 'value 2'
// }]
- How should the user get the complete result when they don't want to consume a stream? Perhaps by returning a promise that resolves with the entire output, or by taking a callback which is invoked when the task is complete.
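As a sketch, a promise-returning wrapper can sit on top of the streaming core, reusing the illustrative parseLines generator from earlier:

// Sketch: drain the record stream into an array and resolve with it.
async function parseToArray(source, options) {
  const records = [];
  for await (const record of parseLines(source, options)) {
    records.push(record);
  }
  return records;
}

parseToArray(['aaa,bbb\nxxx,yyy']).then(rows => console.log(rows));
//=> [ [ 'aaa', 'bbb' ], [ 'xxx', 'yyy' ] ]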
- Should be able to handle errors, continue parsing even for ambiguous CSVs, and return the errors for each row.
- The user should be able to choose whether to stop on the first error, skip the ill-formed line, get a collection of all ill-formed lines at the end, or something else you can think of.
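One possible surface for that choice; all option names and the error shape here are hypothetical:

YOUR_API(csvString, { onError: 'fail' });    // stop on the first ill-formed line
YOUR_API(csvString, { onError: 'skip' });    // drop ill-formed lines and continue
YOUR_API(csvString, { onError: 'collect' }); // keep going, then report, e.g.
//=> { records: [...], errors: [{ line: 7, reason: 'unterminated quote' }] }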
const data = `
# At the beginning of a record
"hello"
"world"# At the end of a record
`.trim()
YOUR_API(data)
// output
// [
// [ "hello" ],
// [ "world" ]
// ]
- Your API should be extensively tested using Jest.
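A Jest test might look like this, assuming a promise-returning parse export (both the module path and the function name are placeholders):

const { parse } = require('./csv'); // placeholder module path

test('parses comma-separated records into arrays', async () => {
  const rows = await parse('aaa,bbb,ccc\nxxx,yyy,zzz');
  expect(rows).toEqual([
    ['aaa', 'bbb', 'ccc'],
    ['xxx', 'yyy', 'zzz'],
  ]);
});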
Note: YOUR_API can be a single function like parse. It can also be replaced with something like CSV.parse. Feel free to invent your own API.
- Make your library Universal JavaScript (aka isomorphic JavaScript).
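One common pattern for this (a sketch, not a requirement) is to keep the parser core free of Node-only globals and gate filesystem access behind an environment check:

// Sketch: detect Node so file-path input can fail gracefully in browsers.
const isNode =
  typeof process !== 'undefined' && !!process.versions && !!process.versions.node;

function toChunkSource(input, { isFilePath = false } = {}) {
  if (isFilePath) {
    if (!isNode) throw new Error('File paths are only supported in Node');
    return require('fs').createReadStream(input, 'utf8');
  }
  return [input]; // any iterable of string chunks works for the parser core
}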
- Support worker threads via a config option. That is, the computation will happen not on the main thread but on worker threads.
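A sketch of how that could be wired up with Node's built-in worker_threads module; the worker file name and workerData shape are assumptions:

const { Worker } = require('worker_threads');

// Sketch: parser.worker.js would run the same parse logic on its own
// thread and postMessage the resulting records back.
function parseInWorker(csvString, options) {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./parser.worker.js', {
      workerData: { csvString, options },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}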
- Auto detect the delimiter
YOUR_API.detect(CSV_STRING)
//=> "\t"
- You should not use any extra libraries, apart from eslint, prettier, babel, or other helper utilities.
- If your library is not using streams, it won't be evaluated.
- If your library does not have tests, it won't be evaluated.
- Take into account that the comma (,) character is not the only character used as a field delimiter. Semi-colons (;), tabs (\t), and more are also popular field delimiter characters.
- Each record starts at the beginning of its own line, and ends with a line break (shown as ¬)
# INPUT
aaa,bbb,ccc¬
xxx,yyy,zzz¬
# OUTPUT
[ ["aaa", "bbb", "ccc"],
["xxx", "yyy", "zzz"] ]
- The last record in a file is not required to have an ending line break
- Spaces are considered part of a field and must not be ignored:
# INPUT
aaa , bbb , ccc¬
xxx, yyy ,zzz ¬
# OUTPUT
[ ["aaa ", " bbb ", " ccc"],
[" xxx", " yyy ", "zzz "] ]
- Support the reverse direction as well, turning arrays back into CSV:
// Two-line, comma-delimited file
const csv = myCSV.unparse([
  ["1-1", "1-2", "1-3"],
  ["2-1", "2-2", "2-3"]
]);
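A minimal unparse sketch that quotes a field only when needed (embedded delimiters, quotes, or newlines), following RFC 4180 conventions:

function unparse(rows, { delimiter = ',' } = {}) {
  const escape = field => {
    const s = String(field);
    // Quote the field and double any embedded quotes when necessary.
    return s.includes(delimiter) || /["\n]/.test(s)
      ? `"${s.replace(/"/g, '""')}"`
      : s;
  };
  return rows.map(row => row.map(escape).join(delimiter)).join('\n');
}

unparse([['1-1', '1-2'], ['2-1', '2-2']]); //=> "1-1,1-2\n2-1,2-2"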