hr_res101('train') error with gpu,cuda,ubuntu #22

Closed
niamul070 opened this issue Jun 5, 2017 · 16 comments

Comments

@niamul070

I am getting the following error. Can you help? I changed gpus=[1 2 3 4] to gpus=[1] in hr_res101('train') because it was saying that a device id from 1:1 is needed. Now I am getting the following error (at the bottom of the output below):

hr_res101('train');

ans =

models/widerface-resnet-101-simple-sample256-posfrac0.5-N25-bboxreg-cluster-scaled

Trying to initialize the structure of resnet-101-simple
Unknown model: cannot initialize.
Loading pretrained weights from ./trained_models/imagenet-resnet-101-dag.mat
Loaded imdb from data/widerface/imdb.mat
cluster path: data/widerface/RefBox_N25_scaled.mat

opts =

struct with fields:

  keepDilatedZeros: 0
         inputSize: [500 500]
      learningRate: [1×30 double]
           trainFn: '@cnn_train_dag_hardmine'
     batchGetterFn: '@cnn_get_batch_hardmine'
      freezeResNet: 0
               tag: ''
        clusterNum: 25
       clusterName: 'scaled'
           bboxReg: 1
        skipLRMult: [0 1 0.1000]
        sampleSize: 256
       posFraction: 0.5000
         posThresh: 0.7000
         negThresh: 0.3000
            border: [0 0]
 pretrainModelPath: './trained_models/imagenet-resnet-101-dag.mat'
           dataDir: 'data/widerface'
         modelType: 'resnet-101-simple'
       networkType: 'dagnn'
batchNormalization: 1
  weightInitMethod: 'gaussian'
    minClusterSize: [10 10]
    maxClusterSize: [Inf Inf]
            expDir: 'models/widerface-resnet-101-simple-sample256-posfrac0.5-N25-bboxreg-cluster…'
         batchSize: 12
     numSubBatches: 1
         numEpochs: 50
              gpus: 1
   numFetchThreads: 8
              lite: 0
          imdbPath: 'data/widerface/imdb.mat'
             train: [1×1 struct]

ans =

struct with fields:

            gpus: 1
       batchSize: 12
   numSubBatches: 1
       numEpochs: 50
    learningRate: [1×30 double]
keepDilatedZeros: 0

Start using dagnn.DetLoss for loss
cnn_train_dag_hardmine: resetting GPU
train: epoch 01: 1/1074:Invalid MEX-file '/purcell1/mbaqui/Documents/tiny/utils/compute_dense_overlap.mexa64': dlopen: cannot load any more object with static TLS.

Error in cnn_get_batch_hardmine (line 378)
iou = compute_dense_overlap(ofx,ofy,stx,sty,vsx,vsy,...

Error in cnn_widerface>getDagNNBatch (line 258)
[images, clsmaps, regmaps] = batchGetter(imagePaths, imageSizes, labelRects, ...

Error in cnn_widerface>@(x,y)getDagNNBatch(batchGetter,bopts,useGpu,x,y) (line 243)
fn = @(x,y) getDagNNBatch(batchGetter, bopts,useGpu,x,y) ;

Error in cnn_train_dag_hardmine>process_epoch (line 268)
inputs = state.getBatch(state.imdb, batch) ;

Error in cnn_train_dag_hardmine (line 148)
[stats.train(epoch),prof] = process_epoch(net, state, opts, 'train') ;

Error in cnn_widerface (line 212)
[net, info] = trainFn(net, imdb, getBatchFn(batchGetter, opts, net.meta), ...

Error in hr_res101 (line 42)
cnn_widerface('inputSize', inputSize, ...
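
On the gpus=[1 2 3 4] to gpus=[1] change described at the top of this comment: the following is a minimal sketch of trimming a requested GPU list to the devices that actually exist on the machine. The helper name `select_gpus` is hypothetical and not part of the repository, and the commented-out call assumes the Parallel Computing Toolbox function `gpuDeviceCount` is available.

```matlab
% Hypothetical helper (not from the repository): keep only the requested
% GPU indices that do not exceed the number of devices on this machine.
select_gpus = @(requested, available) requested(requested <= available);

% Example with the list mentioned above on a machine with a single GPU.
gpus_single = select_gpus([1 2 3 4], 1);
disp(gpus_single);   % displays 1

% On a real machine the device count could come from the Parallel
% Computing Toolbox (assumption: the toolbox is installed):
%   gpus = select_gpus([1 2 3 4], gpuDeviceCount);
```

This only avoids requesting device ids that do not exist; it is unrelated to the MEX loading error shown in the trace.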

@LeeRock

LeeRock commented Jun 6, 2017

I guess something went wrong in the previous step: "compile_mex" produced a wrong result.

@peiyunh
Owner

peiyunh commented Jun 7, 2017

I've never seen this before. Does running compile_mex give you any error?

@niamul070
Author

compile_mex did not give any error. I can also run the bbox function to see selfie.jpg. I only get this error when I run the training script (hr_res101.m).

@peiyunh
Owner

peiyunh commented Jun 7, 2017

compute_dense_overlap is only called during training. I'm not sure why MATLAB gives such an error.
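
One way to narrow this down might be to check whether MATLAB can see the compiled file at all before training calls it. The sketch below uses only the built-in functions `mexext`, `exist`, and `which`; it assumes the current folder or path includes `utils`. Since the message in the trace comes from `dlopen`, a file that is found here can still fail to load, so this check only separates "not compiled or not on the path" from "found but failed to load".

```matlab
% Name of the MEX function taken from the error trace above.
name = 'compute_dense_overlap';

% Extension MATLAB expects for MEX files on this platform
% (the trace shows mexa64, i.e. 64-bit Linux).
fprintf('Expected MEX extension: %s\n', mexext);

% exist(..., 'file') returns 3 when the name resolves to a MEX-file.
kind = exist(name, 'file');
if kind == 3
    fprintf('%s resolves to: %s\n', name, which(name));
else
    fprintf('%s is not visible as a MEX-file (exist returned %d).\n', name, kind);
end
```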

@peiyunh peiyunh closed this as completed Jun 9, 2017
@peiyunh peiyunh reopened this Jun 9, 2017
@peiyunh
Owner

peiyunh commented Jun 9, 2017

Let me know if you solved it.

@wolfworld6

Hello, where did you get the model file hr_res101.mat?

@peiyunh
Owner

peiyunh commented Jul 24, 2017

@wolfworld6 Please see the latest README or tiny_face_detector.m.

@peiyunh peiyunh closed this as completed Jul 24, 2017
@takecareofbigboss

@niamul070 Let me know if you have solved this problem; I have run into the same problem.
Or could you please contact me by email: tangxu@shanghaitech.edu.cn

@niamul070
Author

niamul070 commented Aug 31, 2017 via email

@takecareofbigboss

@peiyunh Hey, could you please tell us your MATLAB version so that we can follow your steps? @niamul070 and I ran into the same problem, and we think it is due to differences between our MATLAB versions.
Thanks.

@peiyunh
Owner

peiyunh commented Sep 1, 2017

Here is my MATLAB version: 9.1.0.441655 (R2016b). It is unlikely to be a MATLAB issue.

@takecareofbigboss and @niamul070, did you compile the MEX file on your system following the README.md?

@wolfworld6

wolfworld6 commented Sep 1, 2017 via email

@peiyunh
Owner

peiyunh commented Sep 1, 2017

Can you post your compilation error or runtime error here?

@peiyunh
Owner

peiyunh commented Sep 1, 2017

I added a test script for compute_dense_overlap. Try this in MATLAB:

>> cd utils;
>> compile_mex;
>> test_compute_dense_overlap;

to see if it works now.
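
If either step fails, the commands above can be wrapped so that the error text is captured for posting here. This is only a sketch: it assumes the current folder is already `utils`, and the variable name `status` is illustrative.

```matlab
% Assumes the current folder is utils/, as in the commands above.
status = 'ok';
try
    compile_mex;                  % compilation script named above
    test_compute_dense_overlap;   % test script named above
catch err
    % Keep the identifier and message so they can be pasted into the issue.
    status = sprintf('%s: %s', err.identifier, err.message);
end
disp(status);
```

When both scripts finish without error, `status` stays 'ok'; otherwise it holds the first error that was raised.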

@takecareofbigboss

Yep, it's solved! I appreciate your help, @peiyunh.

@takecareofbigboss

@niamul070 You can try it again; all the problems are solved for me.

@peiyunh peiyunh closed this as completed Sep 8, 2017