
Plain Datumfile input format for minimum memory usage #2193

Closed
wants to merge 6 commits

Conversation


@immars commented Mar 25, 2015

When training with leveldb/lmdb, memory usage increases linearly with iterations, and even with the Next() calls made for DataLayer's rand_skip.

Here's a simple plain datum file format via std::fstream to address this issue.

Pros

  • RAM usage stays roughly constant (<1 GB on GoogLeNet with small batches) during training, as expected
  • datum file size falls between the leveldb and lmdb formats
  • no noticeable impact on training speed, thanks to prefetching
  • concurrent reads from multiple processes

Cons

  • no random reads. But Caffe does not need random key-value access anyway.
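
For illustration only (the patch's on-disk layout isn't shown in this thread): a minimal sketch of a length-prefixed record format written with std::fstream, where all names and the exact layout are assumptions rather than the patch's actual code, could look like this:

    // Hypothetical sketch: append one record as
    // [key_size][key bytes][data_size][serialized Datum bytes],
    // back-to-back, so a reader can scan the file sequentially.
    #include <cstdint>
    #include <fstream>
    #include <string>

    void AppendRecord(std::ofstream& out,
                      const std::string& key,
                      const std::string& serialized_datum) {
      uint32_t key_size = static_cast<uint32_t>(key.size());
      uint32_t data_size = static_cast<uint32_t>(serialized_datum.size());
      out.write(reinterpret_cast<const char*>(&key_size), sizeof(key_size));
      out.write(key.data(), key_size);
      out.write(reinterpret_cast<const char*>(&data_size), sizeof(data_size));
      out.write(serialized_datum.data(), data_size);
    }

Sequential, append-only records are consistent with the flat RAM usage reported above: a reader only ever needs the current record in memory.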

@@ -181,6 +181,87 @@ class LMDB : public DB {
MDB_dbi mdb_dbi_;
};


#define MAX_BUF 10485760 // max entry size
Contributor:

Why this value?

Author:

It prevents reading a too-large key_size or data_size from the file, e.g. due to file corruption.
I thought 10 MB per datum was large enough, or maybe it should be larger? 100 MB?
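
To make that guard concrete, here is a hypothetical reader sketch (again, not the patch's actual code) that uses a MAX_BUF-style cap to reject implausible size fields before allocating or reading them:

    #include <cstdint>
    #include <fstream>
    #include <string>

    const uint32_t kMaxBuf = 10485760;  // 10 MB cap per entry, mirroring MAX_BUF

    // Read one [key_size][key][data_size][data] record; refuse sizes above the
    // cap, which would most likely indicate a corrupted or misaligned file.
    bool ReadRecord(std::ifstream& in, std::string* key, std::string* value) {
      uint32_t key_size = 0, data_size = 0;
      if (!in.read(reinterpret_cast<char*>(&key_size), sizeof(key_size))) {
        return false;  // clean end of file or read error
      }
      if (key_size > kMaxBuf) return false;   // likely corruption
      key->resize(key_size);
      if (!in.read(&(*key)[0], key_size)) return false;
      if (!in.read(reinterpret_cast<char*>(&data_size), sizeof(data_size))) {
        return false;
      }
      if (data_size > kMaxBuf) return false;  // likely corruption
      value->resize(data_size);
      return static_cast<bool>(in.read(&(*value)[0], data_size));
    }

With the cap in place, a bad size field fails fast instead of triggering a multi-gigabyte allocation.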


immars commented Mar 27, 2015

Thanks for the review, @sguada!

weiliu89 added a commit to weiliu89/caffe that referenced this pull request Apr 14, 2015
Plain Datumfile input format for minimum memory usage
weiliu89 added a commit to weiliu89/caffe that referenced this pull request Apr 16, 2015
@weiliu89

@immars Thanks for the pull! I have been using it, and found that when I start N training jobs accessing the same datumfile, each one only uses 100/N % of the CPU. Is that normal? I am not sure whether it will make training slower or not.


immars commented Apr 18, 2015

@weiliu89 that should not be happening, at least not according to my tests. No locking is used, and the training process should not be I/O bound either. Are you running N processes? What is your iostat -kx 1 output? Or nvidia-smi?

@shelhamer
Member

Closing as better addressed by the Python layer. There are many types of data, and as long as it can be handled in Python it can be handled as a Python layer.

@shelhamer shelhamer closed this Apr 14, 2017