Shorten training time for Fast/Faster R-CNN without any changes to the algorithms. #2354
Comments
We know Faster R-CNN's speed can be improved by writing custom C++ layers rather than Python layers, and by using a GPU implementation of non-max suppression. This is ongoing work, and we will integrate it gradually.
@cha-zhang Thanks. Can we use multi-GPU processing with a larger NCxx instance at present? Or should we wait for the above implementation?
Multi-GPU would certainly help if you need it immediately. NCCL 2 is integrated in v2.2 (releasing today), so multi-machine training should work well.
@cha-zhang Really? :) Once the release is completed, can you share the tutorial link here?
We will post release notes on the main page once it's out. Or follow us on Twitter @mscntk.
@cha-zhang @pkranen I tried to train Fast R-CNN as follows:
When NC24 is selected, two GPUs are used, but the processing time is almost the same as with normal processing:
This script is not ready for distributed learning. Check scripts like this one:
Following the comment above, I first tried running Fast R-CNN with distributed learning. The related script is this one. Conditions:
## original learner
learner = momentum_sgd(frcn_output.parameters, lr_schedule, mm_schedule, l2_regularization_weight=l2_reg_weight)
## preparation for distributed learning
from cntk import distributed
:
## the original learner is renamed to local_learner and passed to data_parallel_distributed_learner
local_learner = momentum_sgd(frcn_output.parameters, lr_schedule, mm_schedule, l2_regularization_weight=l2_reg_weight)
learner = distributed.data_parallel_distributed_learner(local_learner, num_quantization_bits=1, distributed_after=1)
My Questions:
Finished Epoch[1 of 20]: [Training] loss = 4711.033750 * 25, metric = 21.03% * 25 10.268s ( 2.4 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 401.341235 * 40, metric = 2.37% * 40 3.020s ( 13.2 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 232.146558 * 40, metric = 1.55% * 40 3.110s ( 12.9 samples/s);
C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript (master)
(ver2.2sgd) λ python A2_RunWithPyModel.py
--------------------------------------------------------------
2017-09-23 12:49:33
PARAMETERS: datasetName = Grocery
PARAMETERS: cntk_nrRois = 2000
Selected GPU[1] Tesla K80 as the process wide default device.
-------------------------------------------------------------------
Build info:
Built time: Sep 15 2017 07:42:32
Last modified date: Fri Sep 15 04:28:56 2017
Build type: Release
Build target: GPU
With 1bit-SGD: yes
With ASGD: yes
Math lib: mkl
CUDA version: 8.0.0
CUDNN version: 6.0.21
Build Branch: HEAD
Build SHA1: 23878e5d1f73180d6564b6f907b14fe5f53513bb
MPI distribution: Microsoft MPI
MPI version: 7.0.12437.6
-------------------------------------------------------------------
Training Fast R-CNN model for 20 epochs.
Training 54603793 parameters in 6 parameter tensors.
Learning rate per 1 samples: 1e-05
Momentum per 1 samples: 0.9048374180359595
Finished Epoch[1 of 20]: [Training] loss = 3153.350937 * 25, metric = 21.16% * 25 10.572s ( 2.4 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 251.568496 * 25, metric = 2.41% * 25 6.971s ( 3.6 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 147.409863 * 25, metric = 1.94% * 25 6.982s ( 3.6 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 101.552354 * 25, metric = 1.70% * 25 6.965s ( 3.6 samples/s);
Finished Epoch[5 of 20]: [Training] loss = 79.782490 * 25, metric = 1.37% * 25 6.994s ( 3.6 samples/s);
Finished Epoch[6 of 20]: [Training] loss = 68.687617 * 25, metric = 1.25% * 25 6.964s ( 3.6 samples/s);
Finished Epoch[7 of 20]: [Training] loss = 60.549863 * 25, metric = 1.11% * 25 6.986s ( 3.6 samples/s);
Finished Epoch[8 of 20]: [Training] loss = 54.716392 * 25, metric = 0.99% * 25 6.976s ( 3.6 samples/s);
Finished Epoch[9 of 20]: [Training] loss = 50.048423 * 25, metric = 0.97% * 25 7.013s ( 3.6 samples/s);
Finished Epoch[10 of 20]: [Training] loss = 40.542100 * 25, metric = 0.71% * 25 7.001s ( 3.6 samples/s);
Learning rate per 1 samples: 1e-06
Finished Epoch[11 of 20]: [Training] loss = 35.926621 * 25, metric = 0.60% * 25 6.995s ( 3.6 samples/s);
Finished Epoch[12 of 20]: [Training] loss = 34.639031 * 25, metric = 0.56% * 25 7.010s ( 3.6 samples/s);
Finished Epoch[13 of 20]: [Training] loss = 33.962507 * 25, metric = 0.54% * 25 6.990s ( 3.6 samples/s);
Finished Epoch[14 of 20]: [Training] loss = 33.616445 * 25, metric = 0.55% * 25 7.003s ( 3.6 samples/s);
Finished Epoch[15 of 20]: [Training] loss = 33.219561 * 25, metric = 0.53% * 25 7.016s ( 3.6 samples/s);
Learning rate per 1 samples: 1e-07
Finished Epoch[16 of 20]: [Training] loss = 32.881428 * 25, metric = 0.53% * 25 7.012s ( 3.6 samples/s);
Finished Epoch[17 of 20]: [Training] loss = 32.816619 * 25, metric = 0.53% * 25 7.009s ( 3.6 samples/s);
Finished Epoch[18 of 20]: [Training] loss = 32.781428 * 25, metric = 0.52% * 25 6.999s ( 3.6 samples/s);
Finished Epoch[19 of 20]: [Training] loss = 32.750801 * 25, metric = 0.52% * 25 7.003s ( 3.6 samples/s);
Finished Epoch[20 of 20]: [Training] loss = 32.719143 * 25, metric = 0.52% * 25 7.016s ( 3.6 samples/s);
Stored trained model at C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\Output\frcn_py.model
Evaluating Fast R-CNN model for 5 images.
C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript (master)
(ver2.2sgd) λ python A3_ParseAndEvaluateOutput.py
--------------------------------------------------------------
2017-09-23 12:52:19
PARAMETERS: datasetName = Grocery
PARAMETERS: cntk_nrRois = 2000
Parsing CNTK output for image set: test
Parsing cntk output file, image 0 of 5
Parsing cntk output file, image 1 of 5
Parsing cntk output file, image 2 of 5
Parsing cntk output file, image 3 of 5
Parsing cntk output file, image 4 of 5
test.cache ss roidb loaded from C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\proc\Grocery_2000\cntkFiles\test.cache_selective_search_roidb.pkl
Processing image 0 of 5..
C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\cntk_helpers.py:813: RuntimeWarning: overflow encountered in exp
e = np.exp(w)
C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\cntk_helpers.py:814: RuntimeWarning: invalid value encountered in true_divide
dist = e / np.sum(e, axis=1)[:, np.newaxis]
Number of rois before non-maxima surpression: 3183
Number of rois after non-maxima surpression: 461
Evaluating detections
AP for avocado = 0.5556
AP for orange = 1.0000
AP for butter = 1.0000
AP for champagne = 1.0000
AP for eggBox = 0.7500
AP for gerkin = 1.0000
AP for joghurt = 0.6667
AP for ketchup = 0.6667
AP for orangeJuice = 1.0000
AP for onion = 1.0000
AP for pepper = 1.0000
AP for tomato = 1.0000
AP for water = 0.5000
AP for milk = 1.0000
AP for tabasco = 0.5000
AP for mustard = 1.0000
Mean AP = 0.8524
DONE.
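The two `RuntimeWarning` lines in the evaluation log above come from the softmax at `cntk_helpers.py:813-814` (`e = np.exp(w)` followed by a division by the row sum): when a score in `w` is large, `exp` overflows to `inf` and the division yields `nan`. A common remedy is to subtract the row maximum before exponentiating, which leaves the result mathematically unchanged. The sketch below uses only the standard library so that it runs without NumPy; the function name `stable_softmax` is hypothetical and is not part of the CNTK example code. Whether this overflow affects the reported AP values is not established in this thread.

```python
import math


def stable_softmax(row):
    """Softmax of one row of scores, shifted by the row maximum.

    Subtracting the maximum keeps every exponent <= 0, so exp() cannot
    overflow; the shift cancels out in the ratio.
    """
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Scores this large would overflow a naive exp(); the shifted version is fine.
    print(stable_softmax([1000.0, 1000.0]))  # -> [0.5, 0.5]
```

With NumPy arrays, the same idea would be `e = np.exp(w - w.max(axis=1, keepdims=True))` before the existing division.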
C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript (master)
(ver2.2sgd) λ mpiexec -n 4 python A2_RunWithPyModel_distributed.py
Selected GPU[0] Tesla K80 as the process wide default device.
ping [requestnodes (before change)]: 4 nodes pinging each other
Selected GPU[2] Tesla K80 as the process wide default device.
ping [requestnodes (before change)]: 4 nodes pinging each other
Selected GPU[3] Tesla K80 as the process wide default device.
ping [requestnodes (before change)]: 4 nodes pinging each other
Selected GPU[1] Tesla K80 as the process wide default device.
ping [requestnodes (before change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
ping [requestnodes (after change)]: 4 nodes pinging each other
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (1) are in (participating)
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (0) are in (participating)
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (3) are in (participating)
requestnodes [MPIWrapperMpi]: using 4 out of 4 MPI nodes on a single host (4 requested); we (2) are in (participating)
ping [mpihelper]: 4 nodes pinging each other
ping [mpihelper]: 4 nodes pinging each other
ping [mpihelper]: 4 nodes pinging each other
ping [mpihelper]: 4 nodes pinging each other
-------------------------------------------------------------------
Build info:
Built time: Sep 15 2017 07:42:32
Last modified date: Fri Sep 15 04:28:56 2017
Build type: Release
Build target: GPU
With 1bit-SGD: yes
With ASGD: yes
Math lib: mkl
CUDA version: 8.0.0
CUDNN version: 6.0.21
Build Branch: HEAD
Build SHA1: 23878e5d1f73180d6564b6f907b14fe5f53513bb
MPI distribution: Microsoft MPI
MPI version: 7.0.12437.6
-------------------------------------------------------------------
-------------------------------------------------------------------
Build info:
Built time: Sep 15 2017 07:42:32
Last modified date: Fri Sep 15 04:28:56 2017
Build type: Release
Build target: GPU
With 1bit-SGD: yes
With ASGD: yes
Math lib: mkl
CUDA version: 8.0.0
CUDNN version: 6.0.21
Build Branch: HEAD
Build SHA1: 23878e5d1f73180d6564b6f907b14fe5f53513bb
MPI distribution: Microsoft MPI
MPI version: 7.0.12437.6
-------------------------------------------------------------------
-------------------------------------------------------------------
Build info:
Built time: Sep 15 2017 07:42:32
Last modified date: Fri Sep 15 04:28:56 2017
Build type: Release
Build target: GPU
With 1bit-SGD: yes
With ASGD: yes
Math lib: mkl
CUDA version: 8.0.0
CUDNN version: 6.0.21
Build Branch: HEAD
Build SHA1: 23878e5d1f73180d6564b6f907b14fe5f53513bb
MPI distribution: Microsoft MPI
MPI version: 7.0.12437.6
-------------------------------------------------------------------
-------------------------------------------------------------------
Build info:
Built time: Sep 15 2017 07:42:32
Last modified date: Fri Sep 15 04:28:56 2017
Build type: Release
Build target: GPU
With 1bit-SGD: yes
With ASGD: yes
Math lib: mkl
CUDA version: 8.0.0
CUDNN version: 6.0.21
Build Branch: HEAD
Build SHA1: 23878e5d1f73180d6564b6f907b14fe5f53513bb
MPI distribution: Microsoft MPI
MPI version: 7.0.12437.6
-------------------------------------------------------------------
--------------------------------------------------------------
2017-09-23 12:46:04
PARAMETERS: datasetName = Grocery
PARAMETERS: cntk_nrRois = 2000
Training Fast R-CNN model for 20 epochs.
Training 54603793 parameters in 6 parameter tensors.
Finished Epoch[1 of 20]: [Training] loss = 4711.033750 * 25, metric = 21.03% * 25 10.268s ( 2.4 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 401.341235 * 40, metric = 2.37% * 40 3.020s ( 13.2 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 232.146558 * 40, metric = 1.55% * 40 3.110s ( 12.9 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 201.895728 * 40, metric = 2.42% * 40 3.065s ( 13.1 samples/s);
Finished Epoch[5 of 20]: [Training] loss = 145.298535 * 40, metric = 1.85% * 40 3.004s ( 13.3 samples/s);
Finished Epoch[6 of 20]: [Training] loss = 140.297620 * 40, metric = 2.07% * 40 3.056s ( 13.1 samples/s);
Finished Epoch[7 of 20]: [Training] loss = 79.770465 * 40, metric = 1.12% * 40 3.000s ( 13.3 samples/s);
Finished Epoch[8 of 20]: [Training] loss = 92.765979 * 40, metric = 1.63% * 40 3.057s ( 13.1 samples/s);
Finished Epoch[9 of 20]: [Training] loss = 55.900171 * 40, metric = 1.08% * 40 2.992s ( 13.4 samples/s);
Finished Epoch[10 of 20]: [Training] loss = 72.423962 * 40, metric = 1.37% * 40 2.991s ( 13.4 samples/s);
Finished Epoch[11 of 20]: [Training] loss = 48.268195 * 40, metric = 0.89% * 40 3.035s ( 13.2 samples/s);
Finished Epoch[12 of 20]: [Training] loss = 34.338052 * 40, metric = 0.64% * 40 2.990s ( 13.4 samples/s);
Finished Epoch[13 of 20]: [Training] loss = 40.116125 * 40, metric = 0.73% * 40 3.052s ( 13.1 samples/s);
Finished Epoch[14 of 20]: [Training] loss = 38.849127 * 40, metric = 0.66% * 40 3.033s ( 13.2 samples/s);
Finished Epoch[15 of 20]: [Training] loss = 37.413214 * 40, metric = 0.77% * 40 3.042s ( 13.1 samples/s);
Finished Epoch[16 of 20]: [Training] loss = 40.390021 * 40, metric = 0.77% * 40 3.069s ( 13.0 samples/s);
Finished Epoch[17 of 20]: [Training] loss = 25.550015 * 40, metric = 0.48% * 40 3.008s ( 13.3 samples/s);
Finished Epoch[18 of 20]: [Training] loss = 21.753720 * 40, metric = 0.43% * 40 3.006s ( 13.3 samples/s);
Finished Epoch[19 of 20]: [Training] loss = 35.532837 * 40, metric = 0.62% * 40 3.172s ( 12.6 samples/s);
Finished Epoch[20 of 20]: [Training] loss = 26.353195 * 40, metric = 0.54% * 40 3.040s ( 13.2 samples/s);
Stored trained model at C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\Output\frcn_py.model
Evaluating Fast R-CNN model for 5 images.
--------------------------------------------------------------
2017-09-23 12:46:04
PARAMETERS: datasetName = Grocery
PARAMETERS: cntk_nrRois = 2000
Training Fast R-CNN model for 20 epochs.
Training 54603793 parameters in 6 parameter tensors.
Finished Epoch[1 of 20]: [Training] loss = 4711.033750 * 25, metric = 21.03% * 25 10.768s ( 2.3 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 401.341235 * 40, metric = 2.37% * 40 3.020s ( 13.2 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 232.146558 * 40, metric = 1.55% * 40 3.108s ( 12.9 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 201.895728 * 40, metric = 2.42% * 40 3.038s ( 13.2 samples/s);
Finished Epoch[5 of 20]: [Training] loss = 145.298535 * 40, metric = 1.85% * 40 3.033s ( 13.2 samples/s);
Finished Epoch[6 of 20]: [Training] loss = 140.297620 * 40, metric = 2.07% * 40 3.055s ( 13.1 samples/s);
Finished Epoch[7 of 20]: [Training] loss = 79.770465 * 40, metric = 1.12% * 40 3.001s ( 13.3 samples/s);
Finished Epoch[8 of 20]: [Training] loss = 92.765979 * 40, metric = 1.63% * 40 3.057s ( 13.1 samples/s);
Finished Epoch[9 of 20]: [Training] loss = 55.900171 * 40, metric = 1.08% * 40 2.991s ( 13.4 samples/s);
Finished Epoch[10 of 20]: [Training] loss = 72.423962 * 40, metric = 1.37% * 40 2.991s ( 13.4 samples/s);
Finished Epoch[11 of 20]: [Training] loss = 48.268195 * 40, metric = 0.89% * 40 3.036s ( 13.2 samples/s);
Finished Epoch[12 of 20]: [Training] loss = 34.338052 * 40, metric = 0.64% * 40 2.990s ( 13.4 samples/s);
Finished Epoch[13 of 20]: [Training] loss = 40.116125 * 40, metric = 0.73% * 40 3.024s ( 13.2 samples/s);
Finished Epoch[14 of 20]: [Training] loss = 38.849127 * 40, metric = 0.66% * 40 3.061s ( 13.1 samples/s);
Finished Epoch[15 of 20]: [Training] loss = 37.413214 * 40, metric = 0.77% * 40 3.043s ( 13.1 samples/s);
Finished Epoch[16 of 20]: [Training] loss = 40.390021 * 40, metric = 0.77% * 40 3.069s ( 13.0 samples/s);
Finished Epoch[17 of 20]: [Training] loss = 25.550015 * 40, metric = 0.48% * 40 3.007s ( 13.3 samples/s);
Finished Epoch[18 of 20]: [Training] loss = 21.753720 * 40, metric = 0.43% * 40 3.006s ( 13.3 samples/s);
Finished Epoch[19 of 20]: [Training] loss = 35.532837 * 40, metric = 0.62% * 40 3.172s ( 12.6 samples/s);
Finished Epoch[20 of 20]: [Training] loss = 26.353195 * 40, metric = 0.54% * 40 3.041s ( 13.2 samples/s);
Stored trained model at C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\Output\frcn_py.model
Evaluating Fast R-CNN model for 5 images.
--------------------------------------------------------------
2017-09-23 12:46:04
PARAMETERS: datasetName = Grocery
PARAMETERS: cntk_nrRois = 2000
Training Fast R-CNN model for 20 epochs.
Training 54603793 parameters in 6 parameter tensors.
Finished Epoch[1 of 20]: [Training] loss = 4711.033750 * 25, metric = 21.03% * 25 11.277s ( 2.2 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 401.341235 * 40, metric = 2.37% * 40 3.016s ( 13.3 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 232.146558 * 40, metric = 1.55% * 40 3.108s ( 12.9 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 201.895728 * 40, metric = 2.42% * 40 3.039s ( 13.2 samples/s);
Finished Epoch[5 of 20]: [Training] loss = 145.298535 * 40, metric = 1.85% * 40 3.032s ( 13.2 samples/s);
Finished Epoch[6 of 20]: [Training] loss = 140.297620 * 40, metric = 2.07% * 40 3.055s ( 13.1 samples/s);
Finished Epoch[7 of 20]: [Training] loss = 79.770465 * 40, metric = 1.12% * 40 3.001s ( 13.3 samples/s);
Finished Epoch[8 of 20]: [Training] loss = 92.765979 * 40, metric = 1.63% * 40 3.057s ( 13.1 samples/s);
Finished Epoch[9 of 20]: [Training] loss = 55.900171 * 40, metric = 1.08% * 40 2.992s ( 13.4 samples/s);
Finished Epoch[10 of 20]: [Training] loss = 72.423962 * 40, metric = 1.37% * 40 2.991s ( 13.4 samples/s);
Finished Epoch[11 of 20]: [Training] loss = 48.268195 * 40, metric = 0.89% * 40 3.036s ( 13.2 samples/s);
Finished Epoch[12 of 20]: [Training] loss = 34.338052 * 40, metric = 0.64% * 40 2.990s ( 13.4 samples/s);
Finished Epoch[13 of 20]: [Training] loss = 40.116125 * 40, metric = 0.73% * 40 3.024s ( 13.2 samples/s);
Finished Epoch[14 of 20]: [Training] loss = 38.849127 * 40, metric = 0.66% * 40 3.060s ( 13.1 samples/s);
Finished Epoch[15 of 20]: [Training] loss = 37.413214 * 40, metric = 0.77% * 40 3.047s ( 13.1 samples/s);
Finished Epoch[16 of 20]: [Training] loss = 40.390021 * 40, metric = 0.77% * 40 3.064s ( 13.1 samples/s);
Finished Epoch[17 of 20]: [Training] loss = 25.550015 * 40, metric = 0.48% * 40 3.008s ( 13.3 samples/s);
Finished Epoch[18 of 20]: [Training] loss = 21.753720 * 40, metric = 0.43% * 40 3.006s ( 13.3 samples/s);
Finished Epoch[19 of 20]: [Training] loss = 35.532837 * 40, metric = 0.62% * 40 3.172s ( 12.6 samples/s);
Finished Epoch[20 of 20]: [Training] loss = 26.353195 * 40, metric = 0.54% * 40 3.041s ( 13.2 samples/s);
Stored trained model at C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\Output\frcn_py.model
Evaluating Fast R-CNN model for 5 images.
--------------------------------------------------------------
2017-09-23 12:46:04
PARAMETERS: datasetName = Grocery
PARAMETERS: cntk_nrRois = 2000
Training Fast R-CNN model for 20 epochs.
Training 54603793 parameters in 6 parameter tensors.
Finished Epoch[1 of 20]: [Training] loss = 4711.033750 * 25, metric = 21.03% * 25 9.772s ( 2.6 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 401.341235 * 40, metric = 2.37% * 40 3.016s ( 13.3 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 232.146558 * 40, metric = 1.55% * 40 3.108s ( 12.9 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 201.895728 * 40, metric = 2.42% * 40 3.039s ( 13.2 samples/s);
Finished Epoch[5 of 20]: [Training] loss = 145.298535 * 40, metric = 1.85% * 40 3.032s ( 13.2 samples/s);
Finished Epoch[6 of 20]: [Training] loss = 140.297620 * 40, metric = 2.07% * 40 3.055s ( 13.1 samples/s);
Finished Epoch[7 of 20]: [Training] loss = 79.770465 * 40, metric = 1.12% * 40 3.001s ( 13.3 samples/s);
Finished Epoch[8 of 20]: [Training] loss = 92.765979 * 40, metric = 1.63% * 40 3.057s ( 13.1 samples/s);
Finished Epoch[9 of 20]: [Training] loss = 55.900171 * 40, metric = 1.08% * 40 2.991s ( 13.4 samples/s);
Finished Epoch[10 of 20]: [Training] loss = 72.423962 * 40, metric = 1.37% * 40 2.991s ( 13.4 samples/s);
Finished Epoch[11 of 20]: [Training] loss = 48.268195 * 40, metric = 0.89% * 40 3.036s ( 13.2 samples/s);
Finished Epoch[12 of 20]: [Training] loss = 34.338052 * 40, metric = 0.64% * 40 2.990s ( 13.4 samples/s);
Finished Epoch[13 of 20]: [Training] loss = 40.116125 * 40, metric = 0.73% * 40 3.024s ( 13.2 samples/s);
Finished Epoch[14 of 20]: [Training] loss = 38.849127 * 40, metric = 0.66% * 40 3.061s ( 13.1 samples/s);
Finished Epoch[15 of 20]: [Training] loss = 37.413214 * 40, metric = 0.77% * 40 3.042s ( 13.1 samples/s);
Finished Epoch[16 of 20]: [Training] loss = 40.390021 * 40, metric = 0.77% * 40 3.070s ( 13.0 samples/s);
Finished Epoch[17 of 20]: [Training] loss = 25.550015 * 40, metric = 0.48% * 40 3.008s ( 13.3 samples/s);
Finished Epoch[18 of 20]: [Training] loss = 21.753720 * 40, metric = 0.43% * 40 3.006s ( 13.3 samples/s);
Finished Epoch[19 of 20]: [Training] loss = 35.532837 * 40, metric = 0.62% * 40 3.172s ( 12.6 samples/s);
Finished Epoch[20 of 20]: [Training] loss = 26.353195 * 40, metric = 0.54% * 40 3.041s ( 13.2 samples/s);
Stored trained model at C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\Output\frcn_py.model
Evaluating Fast R-CNN model for 5 images.
C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript (master)
(ver2.2sgd) λ python A3_ParseAndEvaluateOutput.py
--------------------------------------------------------------
2017-09-23 12:48:18
PARAMETERS: datasetName = Grocery
PARAMETERS: cntk_nrRois = 2000
Parsing CNTK output for image set: test
Parsing cntk output file, image 0 of 5
Parsing cntk output file, image 1 of 5
Parsing cntk output file, image 2 of 5
Parsing cntk output file, image 3 of 5
Parsing cntk output file, image 4 of 5
test.cache ss roidb loaded from C:\git\ver2.2\CNTK\Examples\Image\Detection\FastRCNN\BrainScript\proc\Grocery_2000\cntkFiles\test.cache_selective_search_roidb.pkl
Processing image 0 of 5..
Number of rois before non-maxima surpression: 3184
Number of rois after non-maxima surpression: 487
Evaluating detections
AP for avocado = 0.5556
AP for orange = 1.0000
AP for butter = 1.0000
AP for champagne = 1.0000
AP for eggBox = 0.7500
AP for gerkin = 1.0000
AP for joghurt = 0.6667
AP for ketchup = 0.6667
AP for orangeJuice = 1.0000
AP for onion = 1.0000
AP for pepper = 1.0000
AP for tomato = 1.0000
AP for water = 0.5000
AP for milk = 1.0000
AP for tabasco = 1.0000
AP for mustard = 1.0000
Mean AP = 0.8837
DONE.
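To compare the two training runs above, the per-epoch wall-clock times can be read straight from the `Finished Epoch[...]` lines. The following is a minimal sketch, assuming the log format shown in this comment; the helper name `mean_epoch_seconds` and the regular expression are illustrative and not part of CNTK. Epoch 1 is skipped because it is much slower than the rest in both runs, and only the first few epochs of each run are pasted in to keep the example short.

```python
import re

# Captures: epoch index, samples in the epoch, seconds for the epoch.
EPOCH_RE = re.compile(
    r"Finished Epoch\[(\d+) of \d+\]: \[Training\] loss = [\d.]+ \* (\d+), "
    r"metric = [\d.]+% \* \d+\s+([\d.]+)s"
)

SINGLE_GPU_LOG = """\
Finished Epoch[1 of 20]: [Training] loss = 3153.350937 * 25, metric = 21.16% * 25 10.572s ( 2.4 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 251.568496 * 25, metric = 2.41% * 25 6.971s ( 3.6 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 147.409863 * 25, metric = 1.94% * 25 6.982s ( 3.6 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 101.552354 * 25, metric = 1.70% * 25 6.965s ( 3.6 samples/s);
"""

DISTRIBUTED_LOG = """\
Finished Epoch[1 of 20]: [Training] loss = 4711.033750 * 25, metric = 21.03% * 25 10.268s ( 2.4 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 401.341235 * 40, metric = 2.37% * 40 3.020s ( 13.2 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 232.146558 * 40, metric = 1.55% * 40 3.110s ( 12.9 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 201.895728 * 40, metric = 2.42% * 40 3.065s ( 13.1 samples/s);
"""


def mean_epoch_seconds(log_text, skip_first=1):
    """Average seconds per epoch, ignoring the first `skip_first` epochs."""
    times = []
    for line in log_text.splitlines():
        match = EPOCH_RE.search(line)
        if match and int(match.group(1)) > skip_first:
            times.append(float(match.group(3)))
    if not times:
        raise ValueError("no epoch lines found after the skipped epochs")
    return sum(times) / len(times)


if __name__ == "__main__":
    single = mean_epoch_seconds(SINGLE_GPU_LOG)
    distributed = mean_epoch_seconds(DISTRIBUTED_LOG)
    print(f"single: {single:.3f}s  distributed: {distributed:.3f}s  "
          f"ratio: {single / distributed:.2f}x")
```

On these excerpts the ratio comes out a little under 2.3x for four processes. The two runs report different sample counts per epoch (25 versus 40), so this compares time per epoch, not time per sample.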
@kyoro1 Thanks for the detailed info. To answer your questions:
As seen in the distributed log, there are four similar blocks. Is this the usual result? I imagined that data-parallel training divides the dataset across the GPUs and aggregates the results, so the log should be aggregated into one block, shouldn't it? I wonder if this log structure is correct.
The sample size was 40 in every epoch except the first (25). Normally it should be the number of training samples. What causes the difference between 25 and 40 (from the second epoch to the last)?
The Mean AP differed between the original run (Mean AP = 0.8524) and the distributed run (Mean AP = 0.8837). I assumed that the only difference is the distribution, so the Mean AP should be the same. Is this the correct setting?
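The first question above rests on the usual picture of data-parallel training: each worker processes its own shard of a minibatch, and the per-worker gradients are aggregated before the update. A toy sketch of that idea in plain Python follows. It is not CNTK's implementation; the one-parameter least-squares model and the function names are assumptions made for illustration. It only shows that summing the shard gradients and dividing by the total sample count reproduces the single-worker gradient.

```python
def full_gradient(w, samples):
    """Mean gradient of 0.5 * (w * x - y) ** 2 over all samples."""
    return sum((w * x - y) * x for x, y in samples) / len(samples)


def data_parallel_gradient(w, samples, num_workers):
    """Same gradient, computed as if each worker held one shard."""
    # Worker r takes every num_workers-th sample starting at index r.
    shards = [samples[r::num_workers] for r in range(num_workers)]
    # Each worker sums the per-sample gradients of its own shard.
    partial_sums = [sum((w * x - y) * x for x, y in shard) for shard in shards]
    # Aggregation step (an all-reduce in a real system): add the partial
    # sums and normalize by the total number of samples.
    return sum(partial_sums) / len(samples)


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1), (4.0, 8.2), (5.0, 9.8)]
    print(full_gradient(0.5, data))
    print(data_parallel_gradient(0.5, data, num_workers=4))
```

The two printed values agree up to floating-point rounding. In the distributed log above, all four processes print identical loss values per epoch, which is what one would see if each process reports the aggregated result; the thread itself does not confirm how CNTK produces those lines.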
@cha-zhang Thanks for your comment. So the trial above behaves mostly as expected, except for the log structure, doesn't it?
We will not work on this in the short term. It would be great if you could send us a PR. :)
Here is the first step toward Fast R-CNN with distributed learning: 1312bf8
We'd like to use Fast/Faster R-CNN, and training takes about 30 minutes for a bunch of images in the Azure NC6 environment. As far as I checked, changing only the environment (NCxx) brought no significant improvement in training time, i.e. the same procedure also took about 30 minutes under NC12 or NC24.
Questions:
With NC12 or NC24, what kind of parameter setting is needed?