performance regression? caffe -> caffe2 #503
Comments
have you checked which engine you are using in Caffe2? You can verify that by looking into predict_net.Proto(); it should have engine="CUDNN" for every op that you expect to run on CUDA.
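One way to check (a minimal sketch; the file name is an assumption, substitute whatever you load your net from):

```python
from caffe2.proto import caffe2_pb2

# Parse the serialized predict net and print each op's engine.
predict_net = caffe2_pb2.NetDef()
with open('predict_net.pb', 'rb') as f:  # path is an assumption
    predict_net.ParseFromString(f.read())

for op in predict_net.op:
    print(op.type, 'engine=%s' % (op.engine or '<default>'))
```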
Thanks. So when I parse that string, I don't see engine="CUDNN" on any of the ops. Does this mean that in order to run on GPU, the engine has to be set explicitly on each op?
Hmm, this seems to be related to #323, which doesn't seem to be resolved yet.
@tdp2110 It seems your code, the way it's written, must run on CPU. You have to manually specify the appropriate device placement (at least for now).
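For reference, a minimal sketch of what manual placement could look like, assuming init_net and predict_net are NetDef protos loaded elsewhere:

```python
from caffe2.proto import caffe2_pb2

# Stamp GPU 0 onto the whole graph: the net-level default and every op.
device_opts = caffe2_pb2.DeviceOption()
device_opts.device_type = caffe2_pb2.CUDA
device_opts.cuda_gpu_id = 0

for net in (init_net, predict_net):  # NetDef protos, loaded elsewhere
    net.device_option.CopyFrom(device_opts)
    for op in net.op:
        op.device_option.CopyFrom(device_opts)
```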
Hi @sergey-serebryakov, thanks. I tried your suggestion of setting the device placement manually, but I'm still seeing the same slow timings. If I use the same device options on both nets, the run time doesn't change. I'll try to dig into the source to make sense of this, but maybe you already know what's going on.
@tdp2110 Do you have tensors that you feed into the workspace? You can pass device_opts to FeedBlob as well.
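Something along these lines (a sketch; the blob name 'data' and the input shape are assumptions):

```python
import numpy as np
from caffe2.proto import caffe2_pb2
from caffe2.python import core, workspace

# Feed the input tensor straight onto GPU 0 instead of the default CPU.
device_opts = core.DeviceOption(caffe2_pb2.CUDA, 0)
img = np.random.rand(1, 3, 227, 227).astype(np.float32)
workspace.FeedBlob('data', img, device_option=device_opts)
```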
I think these are the only places (loading predict and init nets and feeding blobs) where I use device opts.
I wasn't explicitly pushing anything into the workspace with FeedBlob; I've been going through the Predictor interface. I'll try looking into that.
@bwasti, did you have a chance to look into the Predictor interface and running on GPU? You were looking into this, if I am not mistaken.
@tdp2110, could you try the manual approach (without Predictor) for now?
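A minimal sketch of the manual path, reusing device_opts and img from the snippets above:

```python
from caffe2.python import workspace

# Run the init net once to populate weights, build the predict net once,
# then call it repeatedly; no workspace.Predictor involved.
workspace.RunNetOnce(init_net)
workspace.CreateNet(predict_net)
workspace.FeedBlob('data', img, device_option=device_opts)
workspace.RunNet(predict_net.name)
output = workspace.FetchBlob('softmaxout')  # output blob name is an assumption
```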
@salexspb thanks, I'll look into the lower-level approach for now.
@tdp2110 Hi, I ran into the same problem when I try to use the GPU.
@raininglixinyu Assign CPU and GPU separately for the different layers. Try something like what KeyKy implemented here: https://github.com/KeyKy/caffe2/blob/master/caffe2/python/tutorials/Run_Alexnet_in_CPU_and_GPU_mode.ipynb
Hi @raininglixinyu, I haven't looked at this in a few weeks, and caffe2 is moving pretty fast, but the last time I was working on this I believe my problem was basically that the Predictor interface was running everything on the CPU, regardless of the device options on the nets.
I have an install of caffe and caffe2 on my desktop Linux machine (specs at the end of this post). I have an NVIDIA GPU, and I believe both builds (caffe and caffe2) are using the GPU and CUDNN (both projects were built from source). I compared the performance of caffe and caffe2 running SqueezeNet and found significantly better performance in caffe (in GPU mode) than in the new caffe2. Here are my scripts.
Caffe on GPU
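A minimal sketch of the timing loop (the deploy/weights paths, input shape, and iteration count are assumptions):

```python
import time
import numpy as np
import caffe

caffe.set_mode_gpu()
caffe.set_device(0)
# Paths are assumptions; any SqueezeNet deploy/weights pair will do.
net = caffe.Net('deploy.prototxt', 'squeezenet_v1.1.caffemodel', caffe.TEST)

img = np.random.rand(1, 3, 227, 227).astype(np.float32)
times = []
for _ in range(1000):
    net.blobs['data'].data[...] = img
    start = time.time()
    net.forward()
    times.append(time.time() - start)
print('min time', min(times), 'mean time', np.mean(times),
      'max time', max(times), 'time stdev', np.std(times))
```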
and output:
min time 0.0024528503418, mean time 0.0026821231842, max time 0.0160629749298, time stdev 0.00134584111746
Caffe2
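The corresponding sketch on the caffe2 side, going through the Predictor interface (file paths, input shape, and iteration count are again assumptions):

```python
import time
import numpy as np
from caffe2.python import workspace

# Paths are assumptions; these are the serialized init and predict NetDefs.
with open('init_net.pb', 'rb') as f:
    init_net = f.read()
with open('predict_net.pb', 'rb') as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)
img = np.random.rand(1, 3, 227, 227).astype(np.float32)
times = []
for _ in range(1000):
    start = time.time()
    p.run([img])
    times.append(time.time() - start)
print('min time', min(times), 'mean time', np.mean(times),
      'max time', max(times), 'time stdev', np.std(times))
```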
and output:
min time 0.0587921142578, mean time 0.0777468800545, max time 0.131857872009, time stdev 0.0156148011639
(I saw similar performance between the caffe2 model zoo protocol buffers and those generated from a caffe model using caffe2.python.caffe_translator.) I'm pretty sure my caffe2 setup is using the GPU, because python -m caffe2.python.operator_test.relu_op_test runs OK and references "engine=CUDNN". So the question is: is this expected? Am I doing something wrong? I've tried a few other models, and caffe+GPU seems to beat caffe2 every time.
Here are my machine specs: 64-bit Ubuntu 16.04, Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz x 8, 16 GB RAM, and a GeForce GTX 1080 GPU (driver version 375.51).