This repository has been archived by the owner on Oct 30, 2019. It is now read-only.

Error training! #153

Open
ilichev-andrey opened this issue Jan 9, 2017 · 1 comment

ilichev-andrey commented Jan 9, 2017

Hello.
I am using the CIFAR-10 model to train on a custom dataset.
My data:

{
  train : 
    {
      data : FloatTensor - size: 3000x3x96x96
      labels : FloatTensor - size: 3000
    }
  val : 
    {
      data : FloatTensor - size: 1000x3x96x96
      labels : FloatTensor - size: 1000
    }
}

I made the following changes:

  1. 32 -> 96 https://github.com/facebook/fb.resnet.torch/blob/master/datasets/cifar10.lua#L48
  2. 32 -> 96 https://github.com/facebook/fb.resnet.torch/blob/master/models/init.lua#L44

I also changed the model's output to 7 classes (my dataset has 7 classes).
My model:

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> output]
  (1): cudnn.SpatialConvolution(3 -> 16, 3x3, 1,1, 1,1) without bias
  (2): nn.SpatialBatchNormalization (4D) (16)
  (3): cudnn.ReLU
  (4): nn.Sequential {
    [input -> (1) -> (2) -> (3) -> output]
    (1): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(16 -> 16, 3x3, 1,1, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (16)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(16 -> 16, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (16)
          |    }
           `-> (2): nn.Identity
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
    (2): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(16 -> 16, 3x3, 1,1, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (16)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(16 -> 16, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (16)
          |    }
           `-> (2): nn.Identity
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
    (3): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(16 -> 16, 3x3, 1,1, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (16)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(16 -> 16, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (16)
          |    }
           `-> (2): nn.Identity
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
  }
  (5): nn.Sequential {
    [input -> (1) -> (2) -> (3) -> output]
    (1): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(16 -> 32, 3x3, 2,2, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (32)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(32 -> 32, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (32)
          |    }
           `-> (2): nn.Sequential {
                 [input -> (1) -> (2) -> output]
                 (1): nn.SpatialAveragePooling(1x1, 2,2)
                 (2): nn.Concat {
                   input
                     |`-> (1): nn.Identity
                      `-> (2): nn.MulConstant
                      ... -> output
                 }
               }
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
    (2): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(32 -> 32, 3x3, 1,1, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (32)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(32 -> 32, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (32)
          |    }
           `-> (2): nn.Identity
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
    (3): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(32 -> 32, 3x3, 1,1, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (32)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(32 -> 32, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (32)
          |    }
           `-> (2): nn.Identity
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
  }
  (6): nn.Sequential {
    [input -> (1) -> (2) -> (3) -> output]
    (1): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(32 -> 64, 3x3, 2,2, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (64)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (64)
          |    }
           `-> (2): nn.Sequential {
                 [input -> (1) -> (2) -> output]
                 (1): nn.SpatialAveragePooling(1x1, 2,2)
                 (2): nn.Concat {
                   input
                     |`-> (1): nn.Identity
                      `-> (2): nn.MulConstant
                      ... -> output
                 }
               }
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
    (2): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (64)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (64)
          |    }
           `-> (2): nn.Identity
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
    (3): nn.Sequential {
      [input -> (1) -> (2) -> (3) -> output]
      (1): nn.ConcatTable {
        input
          |`-> (1): nn.Sequential {
          |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
          |      (1): cudnn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1) without bias
          |      (2): nn.SpatialBatchNormalization (4D) (64)
          |      (3): cudnn.ReLU
          |      (4): cudnn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1) without bias
          |      (5): nn.SpatialBatchNormalization (4D) (64)
          |    }
           `-> (2): nn.Identity
           ... -> output
      }
      (2): nn.CAddTable
      (3): cudnn.ReLU
    }
  }
  (7): cudnn.SpatialAveragePooling(8x8, 1,1)
  (8): nn.View(64)
  (9): nn.Linear(64 -> 7)
}
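
The spatial sizes implied by the printout above can be checked with a little arithmetic. The sketch below is only an illustration, not something confirmed in this thread: it assumes the usual floor-mode size rule `out = floor((in + 2*pad - kernel) / stride) + 1`, and the helper names `conv_out` and `trace` are hypothetical. It follows the stem convolution, the two stride-2 convolutions in stages (5) and (6), and the 8x8 average pooling with stride 1 in layer (7).

```python
def conv_out(size, kernel, stride, pad):
    """Floor-mode output size of a convolution or pooling window (assumed rule)."""
    return (size + 2 * pad - kernel) // stride + 1


def trace(input_size, channels=64):
    """Follow one spatial dimension through the layers printed above."""
    s = conv_out(input_size, 3, 1, 1)  # (1) 3x3 conv, stride 1, pad 1
    # Stage (4) uses only 3x3/stride-1/pad-1 convs, so the size is unchanged.
    s = conv_out(s, 3, 2, 1)           # stage (5): first conv has stride 2
    s = conv_out(s, 3, 2, 1)           # stage (6): first conv has stride 2
    pooled = conv_out(s, 8, 1, 0)      # (7) 8x8 average pooling, stride 1
    features = channels * pooled * pooled
    return s, pooled, features


for size in (32, 96):
    before_pool, pooled, features = trace(size)
    print(size, "->", before_pool, "->", pooled, "features per sample:", features)
```

Under these assumptions a 32x32 input reaches the pooling layer at 8x8 and leaves it at 1x1, which gives the 64 features that `nn.View(64)` and `nn.Linear(64 -> 7)` expect. A 96x96 input reaches it at 24x24 and leaves it at 17x17, so each sample would carry 64 * 17 * 17 = 18496 values. That suggests the pooling or view layer may also need adjusting for 96x96 images, separately from the cuDNN error reported below.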

How can I get training to run?

Error:

th main.lua -dataset cifar10 -batchSize 128 -depth 20 -shareGradInput true
=> Creating model from file: models/resnet.lua	
 | ResNet-20 CIFAR-10	
=> Training epoch # 1	

cudnnFindConvolutionForwardAlgorithm failed: 	2	 convDesc=[mode : CUDNN_CROSS_CORRELATION datatype : CUDNN_DATA_FLOAT] hash=-dimA128,32,48,48 -filtA64,32,3,3 128,64,24,24 -padA1,1 -convStrideA2,2 CUDNN_DATA_FLOAT	
/root/facedetect/torch/install/bin/luajit: .../facedetect/torch/install/share/lua/5.1/nn/Container.lua:66: 
In 6 module of nn.Sequential:
In 1 module of nn.Sequential:
In 1 module of nn.Sequential:
In 1 module of nn.ConcatTable:
In 1 module of nn.Sequential:
/root/facedetect/torch/install/share/lua/5.1/cudnn/find.lua:483: cudnnFindConvolutionForwardAlgorithm failed, sizes:  convDesc=[mode : CUDNN_CROSS_CORRELATION datatype : CUDNN_DATA_FLOAT] hash=-dimA128,32,48,48 -filtA64,32,3,3 128,64,24,24 -padA1,1 -convStrideA2,2 CUDNN_DATA_FLOAT
stack traceback:
	[C]: in function 'error'
	/root/facedetect/torch/install/share/lua/5.1/cudnn/find.lua:483: in function 'forwardAlgorithm'
	...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:190: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186>
	[C]: in function 'xpcall'
	.../facedetect/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...facedetect/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <...facedetect/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	.../facedetect/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...acedetect/torch/install/share/lua/5.1/nn/ConcatTable.lua:11: in function <...acedetect/torch/install/share/lua/5.1/nn/ConcatTable.lua:9>
	[C]: in function 'xpcall'
	...
	.../facedetect/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...facedetect/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <...facedetect/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	.../facedetect/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...facedetect/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./train.lua:56: in function 'train'
	main.lua:51: in main chunk
	[C]: in function 'dofile'
	...tect/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405e90

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	.../facedetect/torch/install/share/lua/5.1/nn/Container.lua:66: in function 'rethrowErrors'
	...facedetect/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./train.lua:56: in function 'train'
	main.lua:51: in main chunk
	[C]: in function 'dofile'
	...tect/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405e90
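
The descriptor string in the error message can be unpacked to see which convolution failed and how large its tensors are. The sketch below is an illustration under stated assumptions: the field meanings (`-dimA` as the input shape, the unlabelled four numbers as the output shape) are inferred from the layout of the message, and the helper names `ints`, `parse_descriptor` and `float_mib` are hypothetical. If the `2` printed after "failed:" is the raw `cudnnStatus_t` value, it corresponds to `CUDNN_STATUS_ALLOC_FAILED` in the cuDNN enum, which would point at GPU memory rather than at the layer parameters; the thread does not confirm this.

```python
import re

# Descriptor copied from the error message above.
DESC = ("-dimA128,32,48,48 -filtA64,32,3,3 128,64,24,24 "
        "-padA1,1 -convStrideA2,2 CUDNN_DATA_FLOAT")


def ints(pattern, text):
    """Return the comma-separated integers captured by `pattern`."""
    return [int(v) for v in re.search(pattern, text).group(1).split(",")]


def parse_descriptor(desc):
    """Split the descriptor into shapes (field meanings are an assumption)."""
    return {
        "input": ints(r"-dimA([\d,]+)", desc),
        "filter": ints(r"-filtA([\d,]+)", desc),
        "output": ints(r"-filtA[\d,]+ ([\d,]+)", desc),
        "pad": ints(r"-padA([\d,]+)", desc),
        "stride": ints(r"-convStrideA([\d,]+)", desc),
    }


def float_mib(shape):
    """Size in MiB of a float32 tensor with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * 4 / 2**20


info = parse_descriptor(DESC)
for name in ("input", "output"):
    print(name, info[name], f"{float_mib(info[name]):.1f} MiB")
```

The parsed shapes match the first convolution of stage (6) in the model printout: a batch of 128 with 32 channels at 48x48 going to 64 channels at 24x24 with stride 2. Every activation in the network is 9 times larger than with 32x32 CIFAR-10 images, since (96 / 32)^2 = 9, so trying a smaller `-batchSize` is a cheap experiment if memory is the cause.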

Help me please!
I really need help.
