Errors when running the code provided in the paper #7

byronwwang · 2016-06-27T09:32:03Z

File.lua:141: Unwritable object at <?>.callback.closure.mnist.testdataset.createdataset.readlush.torch.cat

byronwwang · 2016-06-27T11:23:57Z

When I checked the torchnet code, I found that line 116 of listdataset.lua needs to treat the different types of list differently. When list is a table, #self.list is a number, but when list is a LongTensor, #self.list is a tensor. Am I right? Maybe this caused the problem.

lvdmaaten · 2016-06-27T13:02:20Z

No, this looks like a serialization issue that arises when you're creating the ParallelDatasetIterator. The iterator creates a number of threads, and upvalues are serialized at thread construction time. Can you post a repro? Also, please make sure your threads package is up-to-date.
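
To make the upvalue point concrete, here is a minimal sketch in plain Lua (no Torch required) that lists what a function captures, using the standard debug.getupvalue. The helper name upvaluenames and the stand-in table heavy are hypothetical and only for illustration; the assumption is that whatever a closure captures this way is what has to be serialized when the closure is handed to a thread.

```lua
-- Hypothetical helper: return the names of the upvalues captured by f.
local function upvaluenames(f)
  local names = {}
  local i = 1
  while true do
    local name = debug.getupvalue(f, i)
    if name == nil then break end
    names[#names + 1] = name
    i = i + 1
  end
  return names
end

-- Stand-in for a package table defined outside the closure.
local heavy = { payload = 'outside' }

-- References `heavy`, so `heavy` becomes an upvalue of this function.
local function captures()
  return heavy.payload
end

-- Builds what it needs locally, so it captures nothing from outside.
local function selfcontained()
  local pkg = { payload = 'inside' }
  return pkg.payload
end

print(table.concat(upvaluenames(captures), ', '))
print(#upvaluenames(selfcontained))
```

A closure shaped like selfcontained has nothing from the enclosing scope that would need to travel with it.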

ghost · 2016-06-27T14:15:56Z

Hello,
I have the same issue when running on a fresh install of Torch, Torchnet, and their dependencies.

Installed rocks:

argcheck: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
cwrap: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
dok: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
env: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
fftw3: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
gnuplot: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
graph: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
graphicsmagick: 1.scm-0 (installed) - /home/x/torch/install/lib/luarocks/rocks
image: 1.1.alpha-0 (installed) - /home/x/torch/install/lib/luarocks/rocks
lua-cjson: 2.1devel-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
luaffi: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
luafilesystem: 1.6.3-2 (installed) - /home/x/torch/install/lib/luarocks/rocks
luasocket: 2.0rc1-2 (installed) - /home/x/torch/install/lib/luarocks/rocks
md5: 1.2-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
mnist: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
nn: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
nngraph: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
nnx: 0.1-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
optim: 1.0.5-0 (installed) - /home/x/torch/install/lib/luarocks/rocks
paths: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
penlight: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
qtlua: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
qttorch: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
signal: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
sundown: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
sys: 1.1-0 (installed) - /home/x/torch/install/lib/luarocks/rocks
tds: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
threads: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
torch: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
torchnet: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
trepl: scm-1 (installed) - /home/x/torch/install/lib/luarocks/rocks
xlua: 1.0-0 (installed) - /home/x/torch/install/lib/luarocks/rocks

shreyshahi · 2016-06-27T14:16:13Z

Hi, I am facing the same problem with ParallelDatasetIterator. I did a clean new install of torch and installed threads. Can any other library cause the serialization problem?

Is there something similar to pip freeze in Lua? It would be great if we could post the packages and their versions here.

lvdmaaten · 2016-06-27T14:27:33Z

Apparently the mnist package is not serializable, perhaps because it contains objects that live in C-land, which is why you cannot do this. The solution is to avoid serializing the mnist object by moving the content of loadmnist into the closure function; that way, the mnist package is loaded inside each thread instead of being serialized.

geogeorgiev · 2016-06-27T14:38:07Z

As hinted by @byronwwang, there is an actual bug in listdataset.lua; however, it is on line 126. When a torch.LongTensor is provided, the assert statement fails with "both (null) and torch.LongTensor have no less-equal operator".
The assert statement should be modified for the LongTensor case to use :size(1) instead of #. The second option, which I am using, is to convert the list from a LongTensor to a Lua table with torch.totable().
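
A rough sketch of the first option, written so it runs without Torch: listsize, checkindex, and faketensor are hypothetical names for illustration and are not torchnet code. The assumption is only that a 1D tensor answers :size(1) with its number of entries, while a plain table uses #.

```lua
-- Hypothetical helper (not torchnet code): number of entries in `list`.
-- Plain Lua tables use the # operator; anything exposing a size method
-- (such as a 1D tensor) is asked for its first dimension instead.
local function listsize(list)
  if type(list) == 'table' and type(list.size) ~= 'function' then
    return #list
  end
  return list:size(1)
end

-- Stand-in for a 1D LongTensor with three entries.
local faketensor = { size = function(self, dim) return 3 end }

-- Bounds check in the spirit of the failing assert, using listsize.
local function checkindex(list, idx)
  assert(idx >= 1 and idx <= listsize(list), 'out of bound')
  return idx
end

print(listsize({ 'a.png', 'b.png' }))
print(listsize(faketensor))
print(checkindex(faketensor, 3))
```

With a real tensor one would more likely test torch.isTensor(list) than rely on duck typing; the duck-typed check is used here only to keep the sketch self-contained.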

lvdmaaten · 2016-06-27T14:39:20Z

@GooShan I think I fixed that bug a few minutes ago.

lvdmaaten · 2016-06-27T22:59:59Z

It appears the example in the paper was written for a different version of the mnist package. This code example should work fine:

require 'nn'  -- provides nn.Sequential, nn.Linear and nn.CrossEntropyCriterion used below
local tnt   = require 'torchnet'

local function getIterator(mode)
  return tnt.ParallelDatasetIterator{
    nthread = 1,
    init    = function() require 'torchnet' end,
    closure = function()
      local mnist = require 'mnist'
      local dataset = mnist[mode .. 'dataset']()
      dataset.data = dataset.data:reshape(
         dataset.data:size(1),
         dataset.data:size(2) *
         dataset.data:size(3)
      ):double()
      return tnt.BatchDataset{
         batchsize = 128,
         dataset = tnt.ListDataset{
           list = torch.range(
             1, dataset.data:size(1)
           ):long(),
           load = function(idx)
             return {
               input  = dataset.data[idx],
               target = torch.LongTensor{
                 dataset.label[idx] + 1
               },
             } -- sample contains input and target
           end,
        }
      }
    end,
  }
end

local net = nn.Sequential():add(nn.Linear(784,10))

local engine = tnt.SGDEngine()
local meter  = tnt.AverageValueMeter()
local clerr  = tnt.ClassErrorMeter{topk = {1}}
engine.hooks.onStartEpoch = function(state)
  meter:reset()
  clerr:reset()
end
engine.hooks.onForwardCriterion =
function(state)
  meter:add(state.criterion.output)
  clerr:add(
    state.network.output, state.sample.target)
  print(string.format(
    'avg. loss: %2.4f; avg. error: %2.4f',
    meter:value(), clerr:value{k = 1}))
end

local criterion = nn.CrossEntropyCriterion()

engine:train{
  network   = net,
  iterator  = getIterator('train'),
  criterion = criterion,
  lr        = 0.1,
  maxepoch  = 10,
}
engine:test{
  network   = net,
  iterator  = getIterator('test'),
  criterion = criterion,
}

I will add it to the codebase.

lvdmaaten · 2016-06-27T23:12:53Z

See https://github.com/torchnet/torchnet/blob/master/example/mnist.lua for a working example with comments.

lvdmaaten closed this as completed Jun 27, 2016
