Test Error #160

woshigcy · 2017-11-16T03:13:34Z

Hi,
I have built the project, but I hit this error when testing the model.
.............
11600 videos parsed
11800 videos parsed
12000 videos parsed
12200 videos parsed
12400 videos parsed
12600 videos parsed
12800 videos parsed
13000 videos parsed
13200 videos parsed
frame folder analysis done
Setting device 0
Setting device 1
Setting device 2
Setting device 3
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:02.481065 13484 common.cpp:196] Check failed: error == cudaSuccess (10 vs. 0) invalid device ordinal
*** Check failure stack trace: ***
Setting device 4
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:03.024714 13507 common.cpp:196] Check failed: error == cudaSuccess (10 vs. 0) invalid device ordinal
*** Check failure stack trace: ***
Setting device 5
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:03.412492 13516 common.cpp:196] Check failed: error == cudaSuccess (10 vs. 0) invalid device ordinal
*** Check failure stack trace: ***
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:03.449525 13482 common.cpp:201] Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0) CUBLAS_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:03.471946 13483 common.cpp:201] Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0) CUBLAS_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
Setting device 6
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:03.512379 13481 common.cpp:201] Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0) CUBLAS_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
Setting device 7
Setting device 8
Setting device 9
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:04.135582 13529 common.cpp:196] Check failed: error == cudaSuccess (10 vs. 0) invalid device ordinal
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1116 11:08:04.135614 13521 common.cpp:196] Check failed: error == cudaSuccess (10 vs. 0) invalid device ordinal
*** Check failure stack trace: ***
*** Check failure stack trace: ***
Setting device 10
Setting device 11
......................

Does anyone have a suggestion?

yjxiong · 2017-11-28T20:23:06Z

This usually means that the network initialization has failed. Try running the test with one GPU and check the error log for more details.
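Before dropping to a single GPU, it can help to confirm which device ids the driver actually exposes, since "invalid device ordinal" is what CUDA reports when a requested index does not match a visible GPU. The following is a minimal sketch, not part of this project: it assumes `nvidia-smi -L` is on the PATH and prints lines of the form `GPU 0: <name> (UUID: ...)`, and the helper names `parse_gpu_ids`, `split_valid_ids`, and `visible_gpu_ids` are hypothetical.

```python
import re
import subprocess


def parse_gpu_ids(listing):
    """Extract GPU indices from `nvidia-smi -L` style output."""
    return [int(m.group(1)) for m in re.finditer(r"^GPU (\d+):", listing, re.MULTILINE)]


def split_valid_ids(requested, available):
    """Return (usable, invalid) id lists, preserving the requested order."""
    available = set(available)
    usable = [i for i in requested if i in available]
    invalid = [i for i in requested if i not in available]
    return usable, invalid


def visible_gpu_ids():
    """Ask the driver which GPUs it exposes; returns [] if nvidia-smi cannot be run."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_ids(result.stdout)


if __name__ == "__main__":
    # The log in this thread shows at least devices 0..11 being requested.
    requested = list(range(12))
    usable, invalid = split_valid_ids(requested, visible_gpu_ids())
    print("usable ids:", usable)
    print("invalid ids:", invalid)
```

If the list of visible GPUs comes back empty or shorter than expected, the driver installation itself is worth checking before anything else.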

woshigcy · 2017-11-29T01:57:36Z

Thanks Xiong, I have solved this problem. It was caused by the version of the NVIDIA driver.

woshigcy · 2017-11-29T04:36:55Z

Hi Xiong, I have a problem when testing the RGB model. I get very low accuracy, and the predicted labels are all the same.

TimidLion · 2018-01-23T06:40:15Z

@woshigcy Hi, I have the same problem as you. Could you share the details of how you fixed it?

This is the error I get when I run eval_net.py with num_worker set to 1:

I0123 06:41:43.439946   464 net.cpp:244] conv1/7x7_s2 does not need backward computation.
I0123 06:41:43.439951   464 net.cpp:285] This network produces output fc-action
I0123 06:41:43.440199   464 net.cpp:551] Collecting Learning Rate and Weight Decay.
I0123 06:41:43.440235   464 net.cpp:300] Network initialization done.
I0123 06:41:43.440241   464 net.cpp:301] Memory required for data: 74005652
F0123 06:41:43.527283   464 pooling_layer.cu:213] Check failed: error == cudaSuccess (8 vs. 0)  invalid device function
*** Check failure stack trace: ***
Aborted (core dumped)

Nothgard · 2018-01-23T17:51:12Z

@TimidLion I have the same problem. Is there any solution?

@woshigcy I have an NVIDIA GPU. What was your fix for the driver?

yjxiong · 2018-01-23T23:15:15Z

@Nothgard @TimidLion

Check your Caffe build. Make sure your driver is loaded correctly. Make sure you are using the correct version of Caffe (the one we provide, not the official version).

Nothgard · 2018-01-23T23:45:28Z

@yjxiong Now I have this problem:

frame folder analysis done
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 19:12: Message type "caffe.LayerParameter" has no field named "bn_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0123 15:42:48.810196 28312 upgrade_proto.cpp:79] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: models/ucf101/tsn_bn_inception_rgb_deploy.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

yjxiong · 2018-01-23T23:56:56Z

@Nothgard

This error indicates that you are running a different version of Caffe installed on your machine: that build does not know the "bn_param" field used in the deploy prototxt.
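A quick way to see which Caffe build Python would pick up is to ask the import system where the module resolves, without importing it. This is only a sketch: `module_origin` is a hypothetical helper, and the expectation that the path should point inside the project's own Caffe checkout (rather than a system-wide install) is an assumption based on this thread.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # "caffe" is the pycaffe package name. If the printed path is not inside
    # the Caffe tree built for this project, another install is shadowing it.
    print("caffe resolves to:", module_origin("caffe"))
```

If the path points somewhere unexpected, adjusting `PYTHONPATH` so the intended build comes first is the usual remedy.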

TimidLion · 2018-01-24T00:34:10Z

@yjxiong Actually, when building caffe-action, I got an
"unsupported gpu architecture 'compute_70'" error.
I use a Docker container started with nvidia-docker run, with CUDA 8.0 and cuDNN 5.1, and my graphics card is a Tesla V100. So I changed line 96 of Cuda.cmake (TSN/lib/caffe-action/cmake/Cuda.cmake) from
caffe_detect_installed_gpus(__cuda_arch_bin)
to
set(__cuda_arch_bin "61")
Is this the main problem? If you know how to solve the 'compute_70' error, I will rebuild it that way.

yjxiong · 2018-01-24T00:35:50Z

@TimidLion AFAIK, the V100 requires CUDA 9 or later.
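To make the version constraint concrete, here is a small lookup of the first CUDA toolkit release that can compile for a given compute capability. It is a sketch covering only a few architectures, and the names `MIN_CUDA_FOR_ARCH` and `toolkit_supports` are hypothetical. Forcing the arch to "61" lets a CUDA 8.0 build succeed, but code compiled only for 6.1 does not run on a 7.0 device, which is likely why the "invalid device function" error appears at run time.

```python
# First CUDA toolkit release whose nvcc can target each compute capability.
MIN_CUDA_FOR_ARCH = {
    "60": (8, 0),   # Pascal (e.g. Tesla P100)
    "61": (8, 0),   # Pascal (e.g. GTX 1080)
    "70": (9, 0),   # Volta (Tesla V100)
    "75": (10, 0),  # Turing
}


def toolkit_supports(arch, cuda_version):
    """True if a toolkit at `cuda_version` (major, minor) can compile for `arch`."""
    return cuda_version >= MIN_CUDA_FOR_ARCH[arch]


if __name__ == "__main__":
    # The setup described in this thread: CUDA 8.0 with a Tesla V100 (arch "70").
    print("CUDA 8.0 can target 70:", toolkit_supports("70", (8, 0)))
    print("CUDA 9.0 can target 70:", toolkit_supports("70", (9, 0)))
```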

TimidLion · 2018-01-24T00:43:40Z

@yjxiong Thanks, I will try with CUDA 9.0.

yjxiong · 2018-02-01T18:39:10Z

@TimidLion @woshigcy
Any update?

TimidLion · 2018-02-02T00:30:23Z

@yjxiong Actually, I gave up on this because of some errors and moved on to the anet2016 repository. It built completely on CUDA 9.0. Maybe I'll try this again later. Thanks!

xiaoyu5301 · 2018-09-01T07:19:48Z

@Nothgard Hi, I have the same problem as you. Could you share the details of how you fixed it? Which version of the NVIDIA driver did you use?

Jhajaykant · 2018-10-01T11:25:21Z

I want to use this with its pretrained model on CPU_only Caffe. I compiled Caffe for CPU, but when I run eval_net.py it reports that the GPU cannot be used in CPU_only Caffe. I have changed the solver mode to CPU, but the error is the same. Also, your description does not make clear how to feed image data into the program, that is, the directory structure of the frame path. Could you give more specific help on this?

Thanks in advance

yjxiong · 2018-10-01T17:16:24Z

@Jhajaykant

For custom datasets, please see the wiki:

https://github.com/yjxiong/temporal-segment-networks/wiki/Working-on-custom-datasets

In CPU_only mode, you need to make sure you are using the Caffe version we provided and set every command and config to use the CPU. Also, I would not suggest using the CPU for training; it could be too slow.

Jhajaykant · 2018-10-03T09:46:12Z

One more question, please. After looking at the code, I believe that prediction is called on each frame provided in the frame path. Am I correct?

woshigcy closed this as completed Nov 29, 2017

woshigcy reopened this Nov 29, 2017

yjxiong closed this as completed Feb 16, 2018

liuxiao214 mentioned this issue Dec 29, 2018

test error with multi gpu #249

Closed
