
downstreamFormat - Last in array to contain no comma? #333

Open
CyrisXD opened this issue Aug 13, 2019 · 40 comments


CyrisXD commented Aug 13, 2019

Hi there, trying to read a CSV file and stream its output data to another JSON array file.

var Converter = require("csvtojson").Converter;

var csvConverter = new Converter({
  constructResult: false,
  downstreamFormat: "array"
});

var readStream = require("fs").createReadStream(req.file.path);
var writeStream = require("fs").createWriteStream("csvData.json");

readStream
  .pipe(csvConverter)
  .subscribe((jsonObj, index) => {
    jsonObj.myNewKey = "some value";
  })
  .pipe(writeStream);

But the last JSON object added to the array has an extra comma after it, creating an invalid array/JSON.

As such:

[
{Name: Bob},
{Name: Sarah},
{Name: James},
]

How could I make sure it doesn't add the comma on the last entry?
Cheers
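For reference, the JSON grammar rejects trailing commas, which is why the emitted file fails to parse. A minimal check (sample data only):

```javascript
// A trailing comma before the closing bracket is rejected by JSON.parse.
const valid = '[{"Name":"Bob"},{"Name":"Sarah"}]';
const invalid = '[{"Name":"Bob"},{"Name":"Sarah"},]';

console.log(JSON.parse(valid).length); // 2

try {
  JSON.parse(invalid);
} catch (e) {
  console.log('invalid JSON:', e instanceof SyntaxError); // invalid JSON: true
}
```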

@ushakov-ruslan

The same for me ^^^

@rami-res-zz

@Keyang The same for me too ^^^

@Keyang Keyang added the bug label Aug 27, 2019
@Keyang Keyang modified the milestones: 2.0.10, 2.0.11 Aug 27, 2019

ushakov-ruslan commented Aug 28, 2019

In case someone needs a quick workaround until it's fixed:

import { Transform } from 'stream';
import fs from 'fs';
import csvtojson from 'csvtojson';

const csv = csvtojson(csvOptions);

// you can use one more stream to transform the csv stream output
const transform = new Transform({
  transform (chunk, encoding, callback) {
    let string = chunk.toString('utf-8');

    if (['[\n', ']\n'].includes(string)) {
      if (string === '[\n') {
        this.theFirstEntity = true;
      }
      return callback(null, string);
    }

    string = string.replace(/,$/gm, '');

    if (this.theFirstEntity) {
      this.theFirstEntity = false;
    } else {
      string = ',' + string;
    }
    callback(null, string);
  },
});

readableStream.pipe(csv).pipe(transform).pipe(fs.createWriteStream('output.json'));
/*
output should be the following:
[
{"Name": "Bob"}
,{"Name": "Sarah"}
,{"Name": "James"}
]
*/

@ushakov-ruslan

@Keyang any updates on this?


oliverfoster commented Dec 3, 2019

I had more luck with this lineToArray transform:

const { Transform } = require('stream');
const csvtojson = require("csvtojson");

const lineToArray = new Transform({
  transform (chunk, encoding, cb) {
    // add [ to very front
    // add , between rows
    // remove crlf from row
    this.push((this.isNotAtFirstRow ? ',' : '[') + chunk.toString('utf-8').slice(0, -1));
    this.isNotAtFirstRow = true;
    cb();
  },
  flush(cb) {
    // add ] to very end or [] if no rows
    const isEmpty = (!this.isNotAtFirstRow);
    this.push(isEmpty ? '[]' : ']');
    cb();
  }
});

readableStream
  .pipe(csvtojson({
    checkType: true,
    downstreamFormat: 'line'
  }))
  .pipe(lineToArray)
  .pipe(writableStream);

Also #389

Update: modified to use slice(0, -1) as per the comments below: #333 (comment)

@mannyvergel

Is someone still maintaining this? I'm having the exact same issue.

@yuu2lee4

same issue

@ravibadoni

any update on this?

@ravibadoni

@Keyang


ravibadoni commented May 20, 2020

> (quoting @ushakov-ruslan's workaround from above)

Hi @ushakov-ruslan, can you please give me a working example? This is not working for me; I need help fixing the trailing comma.
Thanks

@oliverfoster

did you try my example?


ravibadoni commented May 21, 2020

Yes, @oliverfoster.
Please guide me if I am doing something wrong.

parse() {
  // you can use one more stream to transform csv stream output
  const transform = new Transform({
    transform (chunk, encoding, callback) {
      let string = chunk.toString('utf-8');

      if (['[\n', ']\n'].includes(string)) {
        if (string === '[\n') {
          this.theFirstEntity = true;
        }
        return callback(null, string);
      }

      string = string.replace(/,$/gm, '');

      if (this.theFirstEntity) {
        this.theFirstEntity = false;
      } else {
        string = ',' + string;
      }
      callback(null, string);
    },
  });

  const self = this;
  return new Promise(function (resolve, reject) {
    const filePath = require('path').resolve(__dirname + "/../../public/", self.fileName);
    const readStream = fs.createReadStream(filePath);
    var stream = fs.createWriteStream(filePath + ".json");
    readStream
      .pipe(csv({ downstreamFormat: 'array', flatKeys: true, delimiter: "auto", quote: '"', escape: '"', noheader: !self.isHeader }))
      .on('header', (header) => {
        header.map((header) => header.toString().trim().replace(/[.]/g, ""));
        resolve(header);
      })
      .pipe(transform)
      .pipe(stream);
  });
}

@olivermartinfoster
Copy link

olivermartinfoster commented May 21, 2020

You're not using my example. I also couldn't get @ushakov-ruslan's example to work, which is why I left my example. Perhaps the library code changed in between?

Mine explicitly switches to the line stream so that each write to the stream is a single JSON item or CSV row. It makes everything much simpler to handle.
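The transform's logic can be sketched on plain strings (assuming, as the lineToArray transform above does, that each 'line'-format chunk is one JSON row ending in a newline; the sample rows here are illustrative):

```javascript
// Joins line-format chunks into one JSON array string:
// '[' before the first row, ',' between rows, ']' (or '[]') at the end.
function lineToArrayJoin(chunks) {
  let out = '';
  let isNotAtFirstRow = false;
  for (const chunk of chunks) {
    out += (isNotAtFirstRow ? ',' : '[') + chunk.slice(0, -1); // drop trailing '\n'
    isNotAtFirstRow = true;
  }
  return out + (isNotAtFirstRow ? ']' : '[]');
}

console.log(lineToArrayJoin(['{"Name":"Bob"}\n', '{"Name":"Sarah"}\n']));
// [{"Name":"Bob"},{"Name":"Sarah"}]
console.log(lineToArrayJoin([])); // []
```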


ravibadoni commented May 21, 2020

const lineToArray = new Transform({
  transform (chunk, encoding, cb) {
    this.push((this.isNotAtFirstRow ? ',' : '[') + chunk.toString('utf-8').slice(0, -2));
    this.isNotAtFirstRow = true;
    cb();
  },
  flush(cb) {
    const isEmpty = (!this.isNotAtFirstRow);
    this.push(isEmpty ? '[]' : ']');
    cb();
  }
});
const self = this;
return new Promise(function (resolve, reject) {
  const filePath = require('path').resolve(__dirname + "/../../public/", self.fileName);
  const readStream = fs.createReadStream(filePath);
  var stream = fs.createWriteStream(filePath + ".json");
  readStream
    .pipe(csv({ downstreamFormat: 'array', flatKeys: true, delimiter: "auto", quote: '"', escape: '"', noheader: !self.isHeader }))
    .on('header', (header) => {
      header.map((header) => header.toString().trim().replace(/[.]/g, ""));
      resolve(header);
    })
    .pipe(lineToArray)
    .pipe(stream);
});

I used this. No JSON array of objects was created inside the file.

@ravibadoni

Will this work for downstreamFormat: 'array'?

@olivermartinfoster

downstreamFormat: 'line'. My transform is called lineToArray.

@ravibadoni

With downstreamFormat: 'line', no data is created in the file.

Also, I want to have an array of objects.


olivermartinfoster commented May 21, 2020

readStream
  .pipe(csv({
    downstreamFormat: 'line',
    checkType: true
  }))
  .pipe(lineToArray)
  .pipe(stream);

Are you debugging?
It looks like you're returning a promise from your parse function before creating the stream? It's hard to tell because there's no indentation.
https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#code

@ravibadoni

[screenshot]

@olivermartinfoster

You're also not calling resolve on your promise.

@ravibadoni

I am calling resolve on the header to get all header values.
The code is working with the array of objects; the only thing I am getting is a comma at the end.


olivermartinfoster commented May 21, 2020

Can you remove all the header stuff and just test it as I wrote it three messages ago? The line parser is not the array parser. Get the array of objects working first, then work out how to change the headers.

@ravibadoni

const lineToArray = new Transform({
  transform (chunk, encoding, cb) {
    this.push((this.isNotAtFirstRow ? ',' : '[') + chunk.toString('utf-8').slice(0, -2));
    this.isNotAtFirstRow = true;
    cb();
  },
  flush(cb) {
    const isEmpty = (!this.isNotAtFirstRow);
    this.push(isEmpty ? '[]' : ']');
    cb();
  }
});
const self = this;
const filePath = require('path').resolve(__dirname + "/../../public/", self.fileName);
const readStream = fs.createReadStream(filePath);
var stream = fs.createWriteStream(filePath + ".json");
readStream
  .pipe(csv({
    downstreamFormat: 'line',
    checkType: true
  }))
  .pipe(lineToArray)
  .pipe(stream);

I get a blank file without any data.

@olivermartinfoster

Your input filepath isn't right, the CSV isn't formatted correctly, you're not importing the library properly, or you're dropping errors in a try/catch block around your call to parse, or something along those lines. Are you using a debugger? If you're getting an empty file, the last line I know is running is the fs.createWriteStream.

@ravibadoni

[screenshot]

The input filepath is correct.
The CSV is formatted correctly; it works perfectly with the array of objects, only an extra comma appears at the end.
Please check the screenshot for the readStream data. It is not empty.

@olivermartinfoster

readStream.pipe(stream);

Does this make a copy of the file?

@ravibadoni

Yes, correct.
I am using streams.

readStream
  .pipe(csv({
    downstreamFormat: 'line',
    checkType: true
  }))
  .pipe(lineToArray)
  .pipe(stream);

@olivermartinfoster

What version of node are you using? I want to test it myself. It definitely looks like an issue with the way I'm using streams.

@ravibadoni

Node Version : v12.14.0

@oliverfoster

csvtojson.zip

Works for me.

@olivermartinfoster

Any luck @ravibadoni?

@ravibadoni

Any luck @ravibadoni?

Great, thanks @olivermartinfoster, it works.
Thank you for helping me out!

@olivermartinfoster

Awesome, that's good!! 👍 😅

@gaetansenn

Any news? The transformer is not working for my case.

@oliverfoster

What's your case? No news yet.

@gaetansenn

It worked for me after I replaced chunk.toString('utf-8').slice(0, -2) with chunk.toString('utf-8').slice(0, -1). I'm using .tsv with delimiter: '\t'.
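The -1 vs -2 difference comes down to the line terminator: slice must drop exactly the characters that end each row. A sketch (the sample rows are illustrative):

```javascript
// slice(0, -1) strips a Unix-style '\n'; slice(0, -2) strips a
// Windows-style '\r\n'. Using the wrong one either leaves a stray
// '\r' in the output or eats the last character of the row.
const lfRow = '{"Name":"Bob"}\n';     // 1 terminator char
const crlfRow = '{"Name":"Bob"}\r\n'; // 2 terminator chars

console.log(lfRow.slice(0, -1));   // {"Name":"Bob"}
console.log(crlfRow.slice(0, -2)); // {"Name":"Bob"}
```

A terminator-agnostic alternative would be chunk.toString('utf-8').replace(/\r?\n$/, ''), which strips either ending.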

@petermikitsh

@oliverfoster's solution #333 (comment) worked for me, with this small change:

-    this.push((this.isNotAtFirstRow ? ',' : '[') + chunk.toString('utf-8').slice(0,-2));
+    this.push((this.isNotAtFirstRow ? ',' : '[') + chunk.toString('utf-8').slice(0,-1));

@rafaelsorto

@petermikitsh's change to -1 helped me correctly parse a CSV file to JSON in the proper format. Thank you both! @oliverfoster

@oliverfoster

I've updated my comment to the -1 👍


sjxuereb commented Jun 9, 2023

import fs from 'fs';
import csv from 'csvtojson';
import replaceStream from 'replacestream';

// https://github.com/eugeneware/replacestream

async function convertToJson(inputcsvfilename, outputjsonfilename) {
  try {
    const converter = csv({
      checkType: true,
      noheader: false,
      trim: true,
      checkColumn: true,
      downstreamFormat: 'array',
    });

    const readStream = fs.createReadStream(inputcsvfilename);
    const writeStream = fs.createWriteStream(outputjsonfilename);

    readStream.pipe(converter)
      // Remove all carriage returns
      .pipe(replaceStream(/\r\n/g, ''))
      // Remove the trailing comma
      .pipe(replaceStream(/},]/g, '}]'))
      // Restore line breaks between records
      .pipe(replaceStream(/},/g, '},\n'))
      .pipe(writeStream);
      // OR: .pipe(process.stdout);

    writeStream.on('close', function () {
      console.log('JSON file closed');
    });

    console.log('Records Converted to JSON');
    return true;
  } catch (error) {
    console.log(error);
    return false;
  }
}
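The regex cleanup above can be checked on a plain string. The sample input below is an assumption about what the 'array' downstream format emits (bracket lines plus comma-terminated rows, as seen earlier in this thread):

```javascript
// Apply the same three replacements the replacestream pipes perform,
// but on an in-memory string instead of a stream.
const raw = '[\r\n{"Name":"Bob"},\r\n{"Name":"Sarah"},\r\n]';

const fixed = raw
  .replace(/\r\n/g, '')   // strip CRLFs
  .replace(/},]/g, '}]'); // drop the trailing comma

const pretty = fixed.replace(/},/g, '},\n'); // restore line breaks

console.log(fixed); // [{"Name":"Bob"},{"Name":"Sarah"}]
console.log(JSON.parse(pretty).length); // 2
```

Note that this string-level approach assumes patterns like "},]" never occur inside field values; the lineToArray transform earlier in the thread avoids that risk.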
