Return fully functional GridFSFile objects from getFileList methods #48

wants to merge 1 commit into



chacal commented Sep 23, 2011

At the moment, DBObjects returned by the GridFS.getFileList methods can be cast to GridFSFile, but input/output with them fails because the _fs instance variable has not been initialized via the _fix method in the GridFS class.

This effectively prevents one from running "cursor queries" (with limit, skip, etc.) and still using the returned objects for input/output. The only way around this is to use find() instead of getFileList(), but find() does not provide any API to, e.g., limit the result set, and thus in many cases ends up doing unnecessary I/O.

This commit wraps the DBCursor returned from getFileList in a GridFSCursor that uses _fix to inject the GridFS instance into each GridFSFile returned by next(). This allows one to cast the returned DBObjects to GridFSFile and perform I/O operations on them successfully.
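
The wrapping idea can be illustrated without the driver. The following is only a minimal sketch and not the code from this commit: SketchFs, SketchFile and InjectingCursor are hypothetical stand-ins for GridFS, GridFSFile and GridFSCursor, and setFs plays the role that _fix plays in the driver. It assumes that a file object cannot do I/O until its owner reference has been set.

```java
import java.util.Iterator;

// Hypothetical stand-in for GridFS: just enough state to show the injection.
class SketchFs {
    final String bucket;

    SketchFs(String bucket) {
        this.bucket = bucket;
    }
}

// Hypothetical stand-in for GridFSFile: reading fails until the owner is set.
class SketchFile {
    final String filename;
    private SketchFs fs; // plays the role of the _fs field

    SketchFile(String filename) {
        this.filename = filename;
    }

    // Plays the role of what _fix does for a single file.
    void setFs(SketchFs fs) {
        this.fs = fs;
    }

    String read() {
        if (fs == null) {
            throw new IllegalStateException("no file system set for " + filename);
        }
        return fs.bucket + "/" + filename;
    }
}

// Decorator in the spirit of GridFSCursor: delegates iteration to the wrapped
// cursor and injects the owner into every element handed out by next().
class InjectingCursor implements Iterator<SketchFile> {
    private final Iterator<SketchFile> delegate;
    private final SketchFs fs;

    InjectingCursor(Iterator<SketchFile> delegate, SketchFs fs) {
        this.delegate = delegate;
        this.fs = fs;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public SketchFile next() {
        SketchFile file = delegate.next();
        file.setFs(fs);
        return file;
    }
}
```

Because the injection happens lazily in next(), any limit or skip applied to the underlying cursor is preserved, and only the elements actually consumed are touched.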

I'm not sure if this kind of approach is an appropriate one and would very much like to hear any better suggestions to solve the same problem!

@chacal chacal Inject GridFS instance into GridFSFiles returned by getFileList methods.
Wraps returned DBCursor with GridFSCursor that uses _fix to inject GridFS instance into GridFSFiles returned by calling next().

trishagee commented Jul 2, 2013

Hi, I'm afraid we're going to close this pull request as it's very old.

trishagee closed this Jul 2, 2013

I need to go through the entire GridFS collection. I am currently using getFileList() but am running into the issue above.
I need to call an additional findOne to get it working.
My current working code is:

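            // Note: the second constructor argument was cut off in the original post.
            GridFS grid = new GridFS(mongo.getDB(definition.getMongoDb()) /* , ... */);
            DBCursor cursor = grid.getFileList();
            while (cursor.hasNext()) {
                DBObject object = cursor.next();
                if (object instanceof GridFSDBFile) {
                    // Re-fetch by _id so the returned file has its GridFS reference set.
                    GridFSDBFile file = grid.findOne(new ObjectId(object.get(MONGODB_ID_FIELD).toString()));
                    addToStream(OPLOG_INSERT_OPERATION, null, file);
                }
            }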

What would be the preferred way to walk through the entire GridFS collection?
Any reason this PR has not been merged?

richardwilly98 referenced this pull request in richardwilly98/elasticsearch-river-mongodb Sep 27, 2013


Optimize initial import with GridFS #147


trishagee commented Oct 8, 2013

The pull request did not work with the current code base. Sadly, we'd left it too long and the code had moved on. If someone wants to submit an updated one, we'll look again at merging it.

However, please be aware that the whole driver is undergoing a large change as we implement the 3.0 driver, and we may find a better way to do this later.
