This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Error: EMFILE, open - createReadStream crashes after many files have been opened - no cleanup is happening #6041

Closed
syberkitten opened this issue Aug 12, 2013 · 19 comments

Comments

@syberkitten

I'm creating a video server over HTTP:
http://localhost:8080/media/video/xxxx1.mp4

Files are read with a simple createReadStream and served via pipe(res).
All works well at first, but the more files are served, the more network and response timeout errors I get, until not even one file is served.

The strange thing is that I get no errors, there is no memory leak, and CPU usage is low (2%-6%); something inside simply stops reading the file and stops sending the stream of bytes to the client,
and no timeout expires...

I have been using ab (Apache Benchmark) and loader.io, with the same results.

It is as if something gets clogged and is not released between node and the file system, via createReadStream.

Since I serve 1-2 GB files, using createReadStream is the only logical solution.
Here is all the code; it should work if you change mediaPath to point to your media files.
P.S. I use graceful-fs to work around the known "EMFILE, open" error.

var fs = require('graceful-fs');
var http = require('http');
//var gfs = require('graceful-fs');

var version = "0.34";
var mediaPath = '/my_media_location';


function log(str,data){
    //console.log.apply(this,arguments);
    console.log("info:"+str);

}

function logError(str,data){
    //console.log.apply(this,arguments);
    console.log("error:"+str);

}

function logWarn(str,data){
    //console.log.apply(this,arguments);
    console.log("warn:"+str);

}


function contentTypeSelector(ctype){
    if (!contentTypes[ctype]) return contentTypes["default"]
    return contentTypes[ctype];
}


// todo: do authentication here
function checkHeadersAuth(req){
    var h = req.headers;

    if (req.url.indexOf("favicon")>-1) return false;
    return true;
    if (h["user-agent"].indexOf("Mozilla")==-1) { // invalid headers
        return true;
    }
    return true;
}

/*
returns a metadata object for logging each request
*/
function collectRequestMeta(req){
    var o = req.headers;
    o.url = req.url;
    o.ip = req.connection.remoteAddress;
    return o;
}

function responseError404(res) {
    res.writeHead(404,{'content-type':contentTypeSelector('html')});
    res.end();
}


function response200(res,str){
    res.writeHead(200,{'content-type':contentTypeSelector('html')});
    res.end(str || "");
}




var contentTypes = {
    "default":"text/html",
    "video":"video/mp4",
    "text":"text/json",
    "html":"text/html"
}



var content_type = 'video'
var pauseEnabled = false;


var server = require('http').createServer(function (req,res){

    var filename,readStream;

    log("Got Request from url:",req.url);
    // todo:
    // create mapping to files on netapp here - processing req.url


    //ignore favicon shit...
    if (!checkHeadersAuth(req)) {
        log("not authorized, exiting, ip:"+req.connection.remoteAddress);
        responseError404(res);
        return;
    }



    filename = mediaPath+req.url
    if (!fs.existsSync(filename)) {
        responseError404(res);
        // `err` is not defined in this scope; use the logError helper instead
        logError('File Not Found',{request:collectRequestMeta(req),filename:filename});
        return;
    } 

      var stat = fs.statSync(filename);
      var total = stat.size;
      if (req.headers['range']) {
        var range = req.headers.range;
        var parts = range.replace(/bytes=/, "").split("-");
        var partialstart = parts[0];
        var partialend = parts[1];

        var start = parseInt(partialstart, 10);
        var end = partialend ? parseInt(partialend, 10) : total-1;
        var chunksize = (end-start)+1;
        console.log('RANGE: ' + start + ' - ' + end + ' = ' + chunksize);

        var file = fs.createReadStream(filename, {start: start, end: end,autoClose:false});
        file.on('open',function(fd){
            console.log(fd);
            res.writeHead(206, { 'Content-Range': 'bytes ' + start + '-' + end + '/' + total, 'Accept-Ranges': 'bytes', 'Content-Length': chunksize, 'Content-Type': contentTypeSelector('video')});
            file.pipe(res); 
        })
        file.on('close',function(){
            file.destroy();
        });
        file.on('error',function(msg){
            logError('error on readstream:'+msg);
        });


      } else {
        console.log('ALL: ' + total);
        var file = fs.createReadStream(filename,{autoClose:false});
        file.on('open',function(){
            res.writeHead(200, { 'Content-Length': total, 'Content-Type': contentTypeSelector('video')});
            file.pipe(res)
        });
        file.on('close',function(){
            file.destroy();
        });
        file.on('error',function(msg){
            logError('error on readstream:'+msg);
        });     

      } 

});

process.on('uncaughtException', function(err) {
    // handle the error safely
    logError(err);
    server.listen(8080);
    logWarn("trying to restart server!!! after error")
});


server.listen(8080);
log('server started, scope:'+this,__filename+' version:'+version);  

Needless to say, when I kill the process and restart it, everything starts up fresh...
Is there a good way to see and provide logs on what is happening inside?
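
One way to get that kind of visibility is to count file streams that have opened a descriptor but not yet closed it. The following is only a minimal sketch: `openStreams` and `trackStream` are hypothetical names that do not appear in the original code, and the helper relies only on the 'open' and 'close' events that fs.ReadStream emits.

```javascript
// Hypothetical diagnostic helper (an assumption, not part of the original post):
// counts streams that emitted 'open' but have not yet emitted 'close'.
var openStreams = { count: 0 };

function trackStream(stream, label) {
  var opened = false;
  stream.once('open', function () {
    opened = true;
    openStreams.count += 1;
    console.log('open  ' + label + ' (in flight: ' + openStreams.count + ')');
  });
  stream.once('close', function () {
    // Ignore 'close' for streams that never managed to open a descriptor.
    if (!opened) return;
    openStreams.count -= 1;
    console.log('close ' + label + ' (in flight: ' + openStreams.count + ')');
  });
  return stream;
}

// Usage inside a request handler:
//   var file = trackStream(fs.createReadStream(filename), filename);
```

If the in-flight number keeps growing while clients disconnect, descriptors are being leaked rather than released.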

@bnoordhuis
Member

Can you post your question to the mailing list? The issue tracker is for when you have discovered flat out bugs in node.js core but that's still to be determined here.

If you are confident there is a bug in node.js somewhere, then please reduce your test case and make sure it only uses core modules (and mention what versions of node.js you have tested it with.)

As a general word of advice, don't use fs.*Sync() methods in your HTTP handler.
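
A rough sketch of what an asynchronous lookup could look like, for illustration only: `serveFile` is a hypothetical function name, `mediaPath` mirrors the variable in the code above, and the content types are copied from that code. Like the original, it does no sanitizing of req.url.

```javascript
var fs = require('fs');

// Hypothetical handler body (an assumption, not the poster's code): stat the
// file asynchronously so the event loop is not blocked, then stream it.
function serveFile(mediaPath, req, res) {
  var filename = mediaPath + req.url; // no path sanitizing, as in the original
  fs.stat(filename, function (err, stat) {
    if (err || !stat.isFile()) {
      res.writeHead(404, { 'Content-Type': 'text/html' });
      res.end();
      return;
    }
    res.writeHead(200, { 'Content-Length': stat.size, 'Content-Type': 'video/mp4' });
    var file = fs.createReadStream(filename);
    // Release the file descriptor if the client goes away early.
    res.on('close', function () { file.destroy(); });
    file.pipe(res);
  });
}

// Usage:
//   require('http').createServer(function (req, res) {
//     serveFile('/my_media_location', req, res);
//   }).listen(8080);
```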

@syberkitten
Author

As I'm pretty sure there is a bug, I will reduce the code to core modules as you suggest;
it should be even easier to reproduce without the monkey-patching "graceful-fs".

@syberkitten
Author

Should I open a new issue?
The error I get when using core modules only, and following your suggestion not to use sync calls,
is: error on readstream:Error: EMFILE, open '/media_iphone/015/1234.mp4'

This happens after running 3-4 ab tests:
ab -c 200 -n 200 http://localhost:8080/015/1234.mp4

One of the strange things is that when I close a browser window that is playing
a video (above), no events are emitted.

The end / close events are emitted only when the stream reaches its end...

I think this might explain some of the leaking of file handles that leads to EMFILE, or something
like that...?

var http = require('http');
//var gfs = require('graceful-fs');
var fs = require('fs');
var util = require('util');
var version = "0.56";
var mediaPath = 'G:/movies';
mediaPath = '/media_iphone';

var cache = {
    fileStat:{} 
}


function log(str,data){
    //console.log.apply(this,arguments);
    console.log("log:"+str);

}

function logError(str,data){
    //console.log.apply(this,arguments);
    console.log("error:"+str);

}

function logWarn(str,data){
    //console.log.apply(this,arguments);
    console.log("warn:"+str);
}



function contentTypeSelector(ctype){
    if (!contentTypes[ctype]) return contentTypes["default"];
    return contentTypes[ctype];
}


// todo: do authentication here
function checkHeadersAuth(req){
    var h = req.headers;

    if (req.url.indexOf("favicon")>-1) return false;
    return true;
    if (h["user-agent"].indexOf("Mozilla")==-1) { // invalid headers
        return true;
    }
    return true;
}

/*
returns a metadata object for logging each request
*/
function collectRequestMeta(req){
    var o = req.headers;
    o.url = req.url;
    o.ip = req.connection.remoteAddress;
    return o;
}

function responseError404(res) {
    res.writeHead(404,{'content-type':contentTypeSelector('html')});
    res.end();
}


function response200(res,str){
    res.writeHead(200,{'content-type':contentTypeSelector('html')});
    res.end(str || "");
}




var contentTypes = {
    "default":"text/html",
    "video":"video/mp4",
    "text":"text/json",
    "html":"text/html"
}



var server = require('http').createServer(function (req,res){

    var filename,readStream;

    log("Got Request from url:"+req.url);



    //readStream.setEncoding('utf-8');
    if (!checkHeadersAuth(req)) {
        log("not authorized, exiting, ip:"+req.connection.remoteAddress);
        responseError404(res);
        return;
    }



    filename = mediaPath+req.url


    log("accessing filename:"+filename)
    if (!cache.fileStat[filename])  { // caching the file stat for later use
        log("loading file");
        fs.stat(filename,function(err,stat){
            if (err) {
                responseError404(res);
                // `err` is the Error object here, not a function; log via logError
                logError('File Not Found',{request:collectRequestMeta(req),filename:filename});
                return;
            } 
            cache.fileStat[filename] = stat;
            serveStream(req,res,cache.fileStat[filename],filename);         
        });
        return;
    } 
    log("getting file from cache");
    serveStream(req,res,cache.fileStat[filename],filename);




});


/*
serves the stream when we have the proper file 
and have not failed
*/
function serveStream(req,res,stat,filename) {
    log("serveStream()")
    var total = stat.size;
    var file,responseType;
    var header = {};
    if (req.headers['range']) {
        var range = req.headers.range;
        var parts = range.replace(/bytes=/, "").split("-");
        var partialstart = parts[0];
        var partialend = parts[1];

        var start = parseInt(partialstart, 10);
        var end = partialend ? parseInt(partialend, 10) : total-1;
        var chunksize = (end-start)+1;
        console.log('RANGE: ' + start + ' - ' + end + ' = ' + chunksize);
        file = fs.createReadStream(filename, {start: start, end: end,autoClose:false});
        header = { 'Content-Range': 'bytes ' + start + '-' + end + '/' + total, 'Accept-Ranges': 'bytes', 'Content-Length': chunksize, 'Content-Type': contentTypeSelector('video')};
        responseType = 206;
    } else {
        responseType = 200;     
        file = fs.createReadStream(filename,{autoClose:false,bufferSize:16*1024});
        header = { 'Content-Length': total, 'Content-Type': contentTypeSelector('video')};
    }


    file.on('open',function(fd){
        //log("opening stream:"+fd);
        res.writeHead(responseType, header);
        ///file.pipe(res);  
    })
    file.on('close',function(){
        //log("closing stream");
        file.destroy();
        res.end();
    });
    file.on('end',function(){
        //log("ending stream");
        file.destroy();
    });
    file.on('error',function(msg){
        logError('error on readstream:'+msg);
    });

    file.pipe(res);



}

server.listen(8080);
//log('server started, scope:'+this,__filename+' version:'+version);    
log('server started, version:'+version);    


@bnoordhuis
Member

You're creating the file stream with fs.createReadStream(filename, { autoClose: false }) ... what do you expect?

@syberkitten
Author

I set the autoClose:false option on purpose :)
It was crashing the way I described with the default setting, and when digging in I learned it is possible
to disable the default and close the descriptors manually, so that's what I did,
and there was no difference...

@syberkitten
Author

Please note that I'm closing manually:

    file.on('open',function(fd){
        //log("opening stream:"+fd);
        res.writeHead(responseType, header);
        ///file.pipe(res);  
    })
    file.on('close',function(){
        //log("closing stream");
        file.destroy();
        res.end();
    });
    file.on('end',function(){
        //log("ending stream");
        file.destroy();
    });

@syberkitten
Author

I actually did not observe any stability improvement with autoClose:true / false;
it behaves the same. I can even give it another try for the sport.

But what I find peculiar is that no request emits close / end
when the 'consumer' terminates, such as when the browser window closes.

It only emits the close / end event when the stream reaches its end
normally (and not abruptly).

@syberkitten
Author

OK, I've run many tests again.

Eventually I get to: error on readstream:Error: EMFILE, open.

It seems to happen every time a specific file is requested more than 1000 times.

As much as I would like to use node.js, I guess for this project we'll focus
on nginx / Apache / PHP to do the job.. :(

@bnoordhuis
Member

I don't mind looking into this but let me repeat:

  • Please mention the version or versions of node.js that you have tested with.
  • Reduce your test case. A test should be succinct and to the point, preferably 20 lines or less. It emphatically should not do a million unrelated things. I simply don't have time to go through reams of someone else's code.

@syberkitten
Author

I will try to reproduce on other distros...
Below is a slim version of the running code.

Prepare an ab test with a filename to serve,
change mediaPath to the base folder and run:
ab -n 200 -c 200 http://localhost:8080/myfile.mp4

You don't have to wait for the whole video to stream; press Ctrl-C
after a few seconds and run it again. Repeat until you get
the EMFILE error.

var http = require('http');
//var fs = require('graceful-fs');
var fs = require('fs');
var version = "0.57";
var mediaPath = 'G:/movies';
mediaPath = '/media_iphone';

var cache = {
    fileStat:{} 
}

function responseError404(res) {res.writeHead(404,{'content-type':'text/html'});res.end();}
function response200(res,str){res.writeHead(200,{'content-type':'text/html'});res.end(str || "");}


var server = require('http').createServer(function (req,res){

    var filename,readStream;
    console.log("Got Request from url:"+req.url);
    filename = mediaPath+req.url
    console.log("trying to access filename:"+filename)

    // CHECK IF FILE FOR STREAMING EXISTS, IF NOT RETURN, OTHERWISE CACHE ITS STAT OBJECT FOR LATER USE
    if (!cache.fileStat[filename])  { // caching the file stat for later use
        fs.stat(filename,function(err,stat){
            if (err) {
                responseError404(res);
                return;
            } 
            cache.fileStat[filename] = stat;
            serveStream(req,res,cache.fileStat[filename],filename);         
        });
        return;
    } 

    // START SERVING THE REQUEST
    serveStream(req,res,cache.fileStat[filename],filename);
});


/*
serves the stream when we have the proper file 
and have not failed
*/
function serveStream(req,res,stat,filename) {
    console.log("serveStream()")
    var total = stat.size;
    var file,responseType;
    var header = {};
    responseType = 200;     
    file = fs.createReadStream(filename,{autoClose:true,bufferSize:16*1024});
    header = { 'Content-Length': total, 'Content-Type': 'video/mp4'};

    file.on('open',function(fd){
        res.writeHead(responseType, header);

    })
    file.pipe(res);

}

server.listen(8080);

@syberkitten
Author

I've also managed to reproduce it in Ubuntu (digital ocean):
Linux version 3.5.0-17-generic (buildd@allspice) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1) ) #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012

nodejs version: v0.11.5-pre

Let me know if I can assist further, and thanks for your attention so far :-)

@bnoordhuis
Member

Thanks, I can confirm the issue with v0.10 and master. Here is the test case stripped to its essentials:

var http = require('http');
var fs = require('fs');

var filename = process.argv[0];
var filesize = fs.statSync(filename).size;

http.createServer(function(req, res){
  var file = fs.createReadStream(filename);
  res.writeHead(200, { 'Content-Length': '' + filesize });
  //res.on('close', file.destroy.bind(file));
  file.pipe(res);
}).listen(8080);

Uncommenting the call to res.on(...) 'fixes' the problem.
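
As an illustration, the same workaround could be wrapped in a small reusable function. This is a sketch under assumptions, not an official fix: `pipeWithCleanup` and `createServer` are hypothetical names, and the idea is simply to destroy the source stream whenever the response emits 'close'.

```javascript
var fs = require('fs');
var http = require('http');

// Hypothetical helper (an assumption, not node core API): pipe `source` into
// `res` and release the source if the response closes before the source ends.
function pipeWithCleanup(source, res) {
  res.on('close', function () {
    // Calling destroy() on a stream that has already finished is harmless.
    source.destroy();
  });
  source.on('error', function (err) {
    console.error('read error: ' + err.message);
    res.destroy();
  });
  return source.pipe(res);
}

// Hypothetical factory around the test case above; nothing listens until
// the caller invokes .listen().
function createServer(filename) {
  var filesize = fs.statSync(filename).size; // done once, outside the handler
  return http.createServer(function (req, res) {
    res.writeHead(200, { 'Content-Length': filesize });
    pipeWithCleanup(fs.createReadStream(filename), res);
  });
}

// Usage:
//   createServer('/path/to/video.mp4').listen(8080);
```

Registering the cleanup on the response rather than on the file stream matters here, because the file stream itself emits nothing when the client disconnects.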

@isaacs The response object emits 'close' and 'unpipe' when the connection is aborted but the fs.ReadStream object doesn't seem to emit (or do!) anything. Reading lib/_stream_readable.js, it seems the cleanup code never actually signals the data source, hence it doesn't know to close. I'm assigning this to you.

@bnoordhuis bnoordhuis reopened this Aug 14, 2013
@ghost ghost assigned isaacs Aug 14, 2013
@jonathanong

#6220

I personally consider this a streams2 regression. Also, some streams, like crypto streams, don't have .destroy() methods, just .close().
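
A cleanup routine that has to cope with both kinds of stream could check which method is available. A minimal sketch, with `releaseStream` as a hypothetical name that is not part of node core:

```javascript
// Hypothetical helper (an assumption): release a stream's resources using
// whichever teardown method it provides. Returns the name of the method
// that was called, or 'none' if the stream offers neither.
function releaseStream(stream) {
  if (typeof stream.destroy === 'function') {
    stream.destroy();
    return 'destroy';
  }
  if (typeof stream.close === 'function') {
    stream.close();
    return 'close';
  }
  return 'none';
}

// Usage:
//   res.on('close', function () { releaseStream(file); });
```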

@srour

srour commented Jan 28, 2014

+1

@tjfontaine

see also #7065

@chrisdickinson

@tjfontaine, some questions:

  • Are we looking to split this out into a separate event (separate from close)?
  • Or a different named method (other than .close) to avoid accidentally triggering this for userland streams that had a .close method?
  • Is there a use-case for re-piping an unpiped, in-flight stream? Anecdotally, I've never used that feature, but I'm not prepared to say offhand that no-one is :)
  • Or adding more state to pipe that assumes, absent of subsequent pipes / unpipe listeners, that close should be fired on dst.emit('close')?

@zhangshuiyong

There is a solution that may help you: #8232

@jasnell
Member

jasnell commented May 28, 2015

@bnoordhuis @trevnorris ... still an issue?

@bnoordhuis
Member

Still an issue. I filed it over at io.js because I think it's a moderately serious bug: nodejs/node#1834

@jasnell jasnell added the P-2 label Jun 3, 2015