
.transform() async callback? #24

Closed
at0g opened this issue Mar 18, 2014 · 7 comments

at0g commented Mar 18, 2014

Is it possible to do an async operation in the transform method?

.transform(function(data, done){
    mongoose.model('user').findOne({ username: data.username }, function(err, result){
        if(err) throw err;
        return done({ userId: result._id });
    });
});
@doug-martin
Contributor

You could call .pause() on the stream, but this could lead to really long parse times, since each lookup requires a call to the database.

We do something similar, but we do it in the .on("record") event, batching records to reduce total IO, since a large number of requests could otherwise build up:

//up above
var users = [],
    FIND_EVERY = 1000;

csvStream.on("record", function(data){
    users.push(data);
    if(users.length === FIND_EVERY){
        csvStream.pause(); //no more record events will be produced until resume is called
        doSomethingWithUsers(users, function(err, res){
            if(err){
                console.log(err.stack);
            }else{
                users = []; //reset the batch so the next 1000 records trigger another flush
                csvStream.resume(); //ok, continue parsing
            }
        });
    }
});

Hope this helps.

@neverfox

neverfox commented May 6, 2014

I imagine you can use Promises to accomplish this without pausing.
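
Something along these lines, for instance (lookupUser is a hypothetical helper that returns a promise; note this buffers every pending lookup in memory, so it gives up backpressure):

var pending = [];

csvStream.on("record", function(data){
    pending.push(lookupUser(data)); //one in-flight promise per record
});

csvStream.on("end", function(){
    //wait for every lookup to settle once parsing has finished
    Promise.all(pending).then(function(users){
        console.log("looked up", users.length, "users");
    }, function(err){
        console.log(err.stack);
    });
});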

@bebraw

bebraw commented Jun 25, 2014

Thanks for highlighting the usage of pause. I have to load 400 MB of CSV data into a database, and this solution seems reliable.

Initially I used an async.queue for dealing with the incoming data, but it simply got swamped by the amount of data and didn't work. Lesson learned. :)

Wouldn't the ideal solution be to use pipe? I couldn't get this to work. As far as I understand, if it worked it would be compatible with async too and would deal with the exhaustion problem (backpressure) effectively. You would simply write a Writable stream that writes to the database and executes its callback when done (which resumes the stream).

It's possible I'm missing something obvious here, but that seems like the ideal solution to me, as you don't have to muck with pause and resume.

@doug-martin
Contributor

I'm not sure I understand. Can you please provide an example of what you are trying to do?

Thanks!

-Doug

@bebraw

bebraw commented Jun 26, 2014

@doug-martin Sure. I would expect the following to work more or less:

...

var stream = csv.fromPath(path, {
    headers: true
}).pipe(function(data, _, next) {
    // write to db now (async, resumes on next)
    db.getOrCreate(data.id, next);
});

Obviously this won't work in the current version. I understand I could add .pipe(csv.createWriteStream({headers: true})) in between, parse that in another step, and then write to the database, but that would be missing the point.

Allowing piping like this would make it possible to skip the pause/resume business.
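
For illustration, here is a sketch of the Writable-stream version of this idea, assuming the parse stream emits rows in object mode and reusing the hypothetical db.getOrCreate(id, cb) helper from above:

var stream = require("stream");

// An object-mode Writable that only calls done once the database write
// finishes, so pipe() applies backpressure instead of pause/resume.
var dbWriter = new stream.Writable({ objectMode: true });
dbWriter._write = function(data, encoding, done){
    db.getOrCreate(data.id, done); //parsing resumes only after done() fires
};

csv.fromPath(path, { headers: true }).pipe(dbWriter);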

@TimNZ

TimNZ commented Aug 16, 2014

I agree with bebraw. At a minimum, pass a next function to on('record', function(data, next){ }) for async support, and deprecate pause/resume, which is non-standard.

doug-martin mentioned this issue Aug 27, 2014
@doug-martin
Contributor

There is now async support for transform and validate. I decided not to go the on("record") route because I would have to override the EventEmitter, and you can implement your own stream to handle that case. Feedback is always welcome!
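
For example, revisiting the Mongoose lookup from the original question, the async forms look roughly like this (exact signatures may vary by version, so check the README; the validate step is illustrative):

csv.fromPath(path, { headers: true })
    .transform(function(data, next){
        //async transform: pass the transformed row to next when ready
        mongoose.model('user').findOne({ username: data.username }, function(err, result){
            if(err) return next(err);
            next(null, { userId: result._id });
        });
    })
    .validate(function(data, next){
        //async validate: drop rows that didn't resolve to a user
        next(null, !!data.userId);
    })
    .on("record", function(data){
        console.log(data);
    });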

-Doug
