
EfficientNet extractor slow for large number of points per image. #31

Closed
beijbom opened this issue Oct 11, 2020 · 25 comments


beijbom commented Oct 11, 2020

I ran some more thorough timing tests and found, to my surprise, that while the EfficientNet extractor is (as expected) faster for a small number of points per image, it's slower for a large number of points (>100).

This issue is reproducible both locally in Docker and in the cloud. To reproduce, run scripts/docker/runtimes.py inside the Docker container. Specifically:

docker run -v ${PWD}:/workspace/spacer/ -v /path/to/your/local/models:/workspace/models -it beijbom/pyspacer:v0.2.7 python3 scripts/docker/runtimes.py

Here is what I got in docker:
legend: image_size, nbr_points, extractor, runtime (s)

'1000, 10, efficientnet_b0_ver1: 20.01', '1000, 10, vgg16_coralnet_ver1: 32.42', 
'1000, 50, efficientnet_b0_ver1: 64.39', '1000, 50, vgg16_coralnet_ver1: 72.45', 
'1000, 100, efficientnet_b0_ver1: 113.83', '1000, 100, vgg16_coralnet_ver1: 132.77', 
'1000, 200, efficientnet_b0_ver1: 175.81', '1000, 200, vgg16_coralnet_ver1: 125.75'

Note that these measurements are quite noisy since they ran on my local system, but it's clear that vgg16 is faster for 200 points while EfficientNet is faster for fewer points. I'm also attaching some more robust stats from AWS.
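For reference, dividing the runtimes above by the point count makes the crossover explicit (a quick sketch using only the numbers printed in the Docker table):

```python
# Runtimes in seconds from the Docker run above, keyed by nbr_points
# (image size 1000 in all cases).
efficientnet = {10: 20.01, 50: 64.39, 100: 113.83, 200: 175.81}
vgg16 = {10: 32.42, 50: 72.45, 100: 132.77, 200: 125.75}

for n in sorted(efficientnet):
    print(f"{n:>3} points: efficientnet {efficientnet[n] / n:.2f} s/pt, "
          f"vgg16 {vgg16[n] / n:.2f} s/pt")
```

The per-point cost drops for both extractors as fixed overhead is amortized, but at 200 points vgg16 comes out cheaper both in total and per point.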

In the first plot, note how the EfficientNet extractors are all slower for nbr_points > 400 or so.
[Attachment: Screen Shot 2020-10-11 at 1.16.40 PM]

In this plot we see that (a) the effect of image size is minimal, (b) runtime is linear in the number of points, and (c) EfficientNet is actually slower on average (across all runs) since it's so slow for large numbers of points.
[Attachment: Screen Shot 2020-10-11 at 1.16.57 PM]


beijbom commented Oct 11, 2020

@qiminchen : I found some surprising runtime results for the EfficientNet extractor. Do you mind taking a look?

@qiminchen

Hmm, this is interesting. No problem, I will take a look at this.


beijbom commented Oct 12, 2020

Thanks. LMK if you are able to reproduce the problem.


qiminchen commented Oct 12, 2020

I pulled the boto3 branch and ran it locally in Docker, here is the detailed log:

(pyspacer) qimin@qimin-desktop:~/Projects/pyspacer_boto3$ sudo docker run -v ${PWD}:/workspace/spacer/ -v /home/qimin/Projects/pyspacer_boto3/spacer/models:/workspace/models -it beijbom/pyspacer:v0.2.7 python3 scripts/docker/runtimes.py
[sudo] password for qimin: 
2020-10-11 23:55:41,089 Entering: extract_features [regression_job]
2020-10-11 23:55:41,089 Initializing EfficientNetExtractor
2020-10-11 23:55:41,089 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-11 23:55:41,089 Entering: downloading from efficientnet_b0_ver1.pt
2020-10-11 23:55:45,092 Exiting: downloading from efficientnet_b0_ver1.pt after 4.002337 seconds.
2020-10-11 23:55:45,092 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 4.002535 seconds.
2020-10-11 23:55:45,094 Entering: actual extraction
2020-10-11 23:55:45,094 Entering: cropping 10 patches
2020-10-11 23:55:45,116 Exiting: cropping 10 patches after 0.021515 seconds.
2020-10-11 23:55:45,185 Entering: loading of state dict
2020-10-11 23:55:45,266 Exiting: loading of state dict after 0.081680 seconds.
2020-10-11 23:55:45,266 Entering: model initialization
2020-10-11 23:55:45,284 Exiting: model initialization after 0.017527 seconds.
2020-10-11 23:55:45,567 Exiting: actual extraction after 0.472969 seconds.
2020-10-11 23:55:45,567 Entering: storing features
2020-10-11 23:55:45,574 Exiting: storing features after 0.007057 seconds.
2020-10-11 23:55:45,574 Exiting: extract_features [regression_job] after 4.485260 seconds.
1000, 10, efficientnet_b0_ver1: 0.47
2020-10-11 23:55:46,081 Entering: extract_features [regression_job]
2020-10-11 23:55:46,081 Initializing VGG16CaffeExtractor
2020-10-11 23:55:46,081 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-11 23:55:46,081 Entering: downloading from vgg16_coralnet_ver1.deploy.prototxt
2020-10-11 23:55:46,319 Exiting: downloading from vgg16_coralnet_ver1.deploy.prototxt after 0.237684 seconds.
2020-10-11 23:55:46,319 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.237876 seconds.
2020-10-11 23:55:46,320 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-11 23:55:46,320 Entering: downloading from vgg16_coralnet_ver1.caffemodel
2020-10-11 23:55:56,828 Exiting: downloading from vgg16_coralnet_ver1.caffemodel after 10.508263 seconds.
2020-10-11 23:55:56,828 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 10.508506 seconds.
2020-10-11 23:55:56,831 Entering: actual extraction
2020-10-11 23:55:56,982 Generating new fontManager, this may take some time...
2020-10-11 23:55:57,319 Entering: cropping of 10 patches
2020-10-11 23:55:57,336 Exiting: cropping of 10 patches after 0.017110 seconds.
2020-10-11 23:55:57,336 Entering: load net, check SHA
2020-10-11 23:55:57,763 Exiting: load net, check SHA after 0.426871 seconds.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 552712164
2020-10-11 23:56:05,031 Exiting: actual extraction after 8.200101 seconds.
2020-10-11 23:56:05,031 Entering: storing features
2020-10-11 23:56:05,041 Exiting: storing features after 0.009868 seconds.
2020-10-11 23:56:05,041 Exiting: extract_features [regression_job] after 18.960158 seconds.
1000, 10, vgg16_coralnet_ver1: 7.71
2020-10-11 23:56:05,488 Entering: extract_features [regression_job]
2020-10-11 23:56:05,489 Initializing EfficientNetExtractor
2020-10-11 23:56:05,489 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-11 23:56:05,489 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 0.000045 seconds.
2020-10-11 23:56:05,489 Entering: actual extraction
2020-10-11 23:56:05,489 Entering: cropping 50 patches
2020-10-11 23:56:05,504 Exiting: cropping 50 patches after 0.015199 seconds.
2020-10-11 23:56:05,544 Entering: loading of state dict
2020-10-11 23:56:05,624 Exiting: loading of state dict after 0.079590 seconds.
2020-10-11 23:56:05,624 Entering: model initialization
2020-10-11 23:56:05,640 Exiting: model initialization after 0.016114 seconds.
2020-10-11 23:56:06,920 Exiting: actual extraction after 1.430882 seconds.
2020-10-11 23:56:06,920 Entering: storing features
2020-10-11 23:56:06,955 Exiting: storing features after 0.034690 seconds.
2020-10-11 23:56:06,955 Exiting: extract_features [regression_job] after 1.466515 seconds.
1000, 50, efficientnet_b0_ver1: 1.43
2020-10-11 23:56:07,378 Entering: extract_features [regression_job]
2020-10-11 23:56:07,378 Initializing VGG16CaffeExtractor
2020-10-11 23:56:07,378 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-11 23:56:07,378 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.000049 seconds.
2020-10-11 23:56:07,378 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-11 23:56:07,379 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 0.000031 seconds.
2020-10-11 23:56:07,379 Entering: actual extraction
2020-10-11 23:56:07,379 Entering: cropping of 50 patches
2020-10-11 23:56:07,393 Exiting: cropping of 50 patches after 0.014673 seconds.
2020-10-11 23:56:27,184 Exiting: actual extraction after 19.805501 seconds.
2020-10-11 23:56:27,184 Entering: storing features
2020-10-11 23:56:27,234 Exiting: storing features after 0.049812 seconds.
2020-10-11 23:56:27,235 Exiting: extract_features [regression_job] after 19.857114 seconds.
1000, 50, vgg16_coralnet_ver1: 19.81
2020-10-11 23:56:27,642 Entering: extract_features [regression_job]
2020-10-11 23:56:27,642 Initializing EfficientNetExtractor
2020-10-11 23:56:27,642 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-11 23:56:27,642 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 0.000051 seconds.
2020-10-11 23:56:27,642 Entering: actual extraction
2020-10-11 23:56:27,642 Entering: cropping 100 patches
2020-10-11 23:56:27,657 Exiting: cropping 100 patches after 0.014642 seconds.
2020-10-11 23:56:27,693 Entering: loading of state dict
2020-10-11 23:56:27,762 Exiting: loading of state dict after 0.068956 seconds.
2020-10-11 23:56:27,762 Entering: model initialization
2020-10-11 23:56:27,780 Exiting: model initialization after 0.017154 seconds.
2020-10-11 23:56:30,370 Exiting: actual extraction after 2.728266 seconds.
2020-10-11 23:56:30,370 Entering: storing features
2020-10-11 23:56:30,436 Exiting: storing features after 0.065761 seconds.
2020-10-11 23:56:30,437 Exiting: extract_features [regression_job] after 2.795361 seconds.
1000, 100, efficientnet_b0_ver1: 2.73
2020-10-11 23:56:30,858 Entering: extract_features [regression_job]
2020-10-11 23:56:30,858 Initializing VGG16CaffeExtractor
2020-10-11 23:56:30,858 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-11 23:56:30,858 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.000049 seconds.
2020-10-11 23:56:30,858 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-11 23:56:30,858 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 0.000030 seconds.
2020-10-11 23:56:30,858 Entering: actual extraction
2020-10-11 23:56:30,858 Entering: cropping of 100 patches
2020-10-11 23:56:30,873 Exiting: cropping of 100 patches after 0.014403 seconds.
2020-10-11 23:57:07,041 Exiting: actual extraction after 36.183207 seconds.
2020-10-11 23:57:07,041 Entering: storing features
2020-10-11 23:57:07,139 Exiting: storing features after 0.097209 seconds.
2020-10-11 23:57:07,141 Exiting: extract_features [regression_job] after 36.283386 seconds.
1000, 100, vgg16_coralnet_ver1: 36.18
2020-10-11 23:57:07,601 Entering: extract_features [regression_job]
2020-10-11 23:57:07,601 Initializing EfficientNetExtractor
2020-10-11 23:57:07,601 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-11 23:57:07,601 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 0.000049 seconds.
2020-10-11 23:57:07,601 Entering: actual extraction
2020-10-11 23:57:07,601 Entering: cropping 200 patches
2020-10-11 23:57:07,616 Exiting: cropping 200 patches after 0.014559 seconds.
2020-10-11 23:57:07,697 Entering: loading of state dict
2020-10-11 23:57:07,767 Exiting: loading of state dict after 0.070400 seconds.
2020-10-11 23:57:07,767 Entering: model initialization
2020-10-11 23:57:07,784 Exiting: model initialization after 0.016606 seconds.
2020-10-11 23:57:13,049 Exiting: actual extraction after 5.448241 seconds.
2020-10-11 23:57:13,050 Entering: storing features
2020-10-11 23:57:13,181 Exiting: storing features after 0.131054 seconds.
2020-10-11 23:57:13,182 Exiting: extract_features [regression_job] after 5.581481 seconds.
1000, 200, efficientnet_b0_ver1: 5.45
2020-10-11 23:57:13,690 Entering: extract_features [regression_job]
2020-10-11 23:57:13,690 Initializing VGG16CaffeExtractor
2020-10-11 23:57:13,690 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-11 23:57:13,690 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.000047 seconds.
2020-10-11 23:57:13,690 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-11 23:57:13,690 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 0.000028 seconds.
2020-10-11 23:57:13,690 Entering: actual extraction
2020-10-11 23:57:13,690 Entering: cropping of 200 patches
2020-10-11 23:57:13,705 Exiting: cropping of 200 patches after 0.014597 seconds.
2020-10-11 23:58:22,434 Exiting: actual extraction after 68.743641 seconds.
2020-10-11 23:58:22,434 Entering: storing features
2020-10-11 23:58:22,627 Exiting: storing features after 0.193378 seconds.
2020-10-11 23:58:22,632 Exiting: extract_features [regression_job] after 68.942363 seconds.
1000, 200, vgg16_coralnet_ver1: 68.74
'1000, 10, efficientnet_b0_ver1: 0.47', '1000, 10, vgg16_coralnet_ver1: 7.71', 
'1000, 50, efficientnet_b0_ver1: 1.43', '1000, 50, vgg16_coralnet_ver1: 19.81', 
'1000, 100, efficientnet_b0_ver1: 2.73', '1000, 100, vgg16_coralnet_ver1: 36.18', 
'1000, 200, efficientnet_b0_ver1: 5.45', '1000, 200, vgg16_coralnet_ver1: 68.74'

Seems to work fine? This is really confusing me.


beijbom commented Oct 12, 2020

Hmm. Yeah, those results look great. Indeed, they look amazing. 10 points in 0.47 seconds -- are you sure you are not using the GPU? How much RAM have you allocated to your Docker Engine, and how many CPU cores? I have mine capped at 4GB and 1 core. I tried upping it to 8GB and 2 CPU cores just now, but I'm not seeing much of a difference.


beijbom commented Oct 12, 2020

Here is the detailed log and summary using 2 CPU cores and 8GB RAM. A bit better, but nowhere near what you got. What host system are you using?

docker run -v ${PWD}:/workspace/spacer/ -v /Users/beijbom/spacermodels/:/workspace/models -it beijbom/pyspacer:v0.2.7 python3 scripts/docker/runtimes.py
2020-10-12 01:16:06,732 Entering: extract_features [regression_job]
2020-10-12 01:16:06,733 Entering: initializing EfficientNetExtractor
2020-10-12 01:16:06,733 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-12 01:16:06,736 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 0.002226 seconds.
2020-10-12 01:16:06,736 Exiting: initializing EfficientNetExtractor after 0.002635 seconds.
2020-10-12 01:16:06,736 Entering: actual extraction
2020-10-12 01:16:06,736 Entering: cropping 10 patches
2020-10-12 01:16:06,800 Exiting: cropping 10 patches after 0.063286 seconds.
2020-10-12 01:16:06,945 Entering: loading of state dict
2020-10-12 01:16:07,734 Exiting: loading of state dict after 0.789864 seconds.
2020-10-12 01:16:07,735 Entering: model initialization
2020-10-12 01:16:07,765 Exiting: model initialization after 0.030119 seconds.
2020-10-12 01:16:12,419 Exiting: actual extraction after 5.682878 seconds.
2020-10-12 01:16:12,419 Entering: storing features
2020-10-12 01:16:12,432 Exiting: storing features after 0.012887 seconds.
2020-10-12 01:16:12,433 Exiting: extract_features [regression_job] after 5.700227 seconds.
1000, 10, efficientnet_b0_ver1: 5.68
2020-10-12 01:16:12,957 Entering: extract_features [regression_job]
2020-10-12 01:16:12,957 Entering: initializing VGG16CaffeExtractor
2020-10-12 01:16:12,958 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-12 01:16:12,959 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.001851 seconds.
2020-10-12 01:16:12,960 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-12 01:16:12,962 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 0.001956 seconds.
2020-10-12 01:16:12,962 Exiting: initializing VGG16CaffeExtractor after 0.004308 seconds.
2020-10-12 01:16:12,962 Entering: actual extraction
2020-10-12 01:16:13,678 Generating new fontManager, this may take some time...
2020-10-12 01:16:14,837 Entering: cropping of 10 patches
2020-10-12 01:16:14,867 Exiting: cropping of 10 patches after 0.029503 seconds.
2020-10-12 01:16:14,867 Entering: load net, check SHA
2020-10-12 01:16:19,899 Exiting: load net, check SHA after 5.031558 seconds.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 552712164
2020-10-12 01:16:32,421 Exiting: actual extraction after 19.459403 seconds.
2020-10-12 01:16:32,422 Entering: storing features
2020-10-12 01:16:32,442 Exiting: storing features after 0.020557 seconds.
2020-10-12 01:16:32,443 Exiting: extract_features [regression_job] after 19.486018 seconds.
1000, 10, vgg16_coralnet_ver1: 17.58
2020-10-12 01:16:32,993 Entering: extract_features [regression_job]
2020-10-12 01:16:32,993 Entering: initializing EfficientNetExtractor
2020-10-12 01:16:32,994 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-12 01:16:32,997 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 0.003189 seconds.
2020-10-12 01:16:32,997 Exiting: initializing EfficientNetExtractor after 0.004260 seconds.
2020-10-12 01:16:32,998 Entering: actual extraction
2020-10-12 01:16:32,998 Entering: cropping 50 patches
2020-10-12 01:16:33,045 Exiting: cropping 50 patches after 0.046805 seconds.
2020-10-12 01:16:33,162 Entering: loading of state dict
2020-10-12 01:16:33,876 Exiting: loading of state dict after 0.713654 seconds.
2020-10-12 01:16:33,876 Entering: model initialization
2020-10-12 01:16:33,900 Exiting: model initialization after 0.023959 seconds.
2020-10-12 01:16:54,079 Exiting: actual extraction after 21.080544 seconds.
2020-10-12 01:16:54,079 Entering: storing features
2020-10-12 01:16:54,135 Exiting: storing features after 0.055750 seconds.
2020-10-12 01:16:54,137 Exiting: extract_features [regression_job] after 21.144073 seconds.
1000, 50, efficientnet_b0_ver1: 21.08
2020-10-12 01:16:54,679 Entering: extract_features [regression_job]
2020-10-12 01:16:54,680 Entering: initializing VGG16CaffeExtractor
2020-10-12 01:16:54,680 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-12 01:16:54,682 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.001316 seconds.
2020-10-12 01:16:54,682 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-12 01:16:54,683 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 0.001433 seconds.
2020-10-12 01:16:54,683 Exiting: initializing VGG16CaffeExtractor after 0.003597 seconds.
2020-10-12 01:16:54,684 Entering: actual extraction
2020-10-12 01:16:54,685 Entering: cropping of 50 patches
2020-10-12 01:16:54,706 Exiting: cropping of 50 patches after 0.021327 seconds.
2020-10-12 01:17:16,732 Exiting: actual extraction after 22.048003 seconds.
2020-10-12 01:17:16,733 Entering: storing features
2020-10-12 01:17:16,804 Exiting: storing features after 0.071336 seconds.
2020-10-12 01:17:16,807 Exiting: extract_features [regression_job] after 22.127240 seconds.
1000, 50, vgg16_coralnet_ver1: 22.05
2020-10-12 01:17:17,331 Entering: extract_features [regression_job]
2020-10-12 01:17:17,332 Entering: initializing EfficientNetExtractor
2020-10-12 01:17:17,332 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-12 01:17:17,334 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 0.001556 seconds.
2020-10-12 01:17:17,334 Exiting: initializing EfficientNetExtractor after 0.002424 seconds.
2020-10-12 01:17:17,335 Entering: actual extraction
2020-10-12 01:17:17,335 Entering: cropping 100 patches
2020-10-12 01:17:17,356 Exiting: cropping 100 patches after 0.021657 seconds.
2020-10-12 01:17:17,415 Entering: loading of state dict
2020-10-12 01:17:17,846 Exiting: loading of state dict after 0.431348 seconds.
2020-10-12 01:17:17,846 Entering: model initialization
2020-10-12 01:17:17,869 Exiting: model initialization after 0.022797 seconds.
2020-10-12 01:17:57,242 Exiting: actual extraction after 39.907069 seconds.
2020-10-12 01:17:57,242 Entering: storing features
2020-10-12 01:17:57,352 Exiting: storing features after 0.110384 seconds.
2020-10-12 01:17:57,355 Exiting: extract_features [regression_job] after 40.023672 seconds.
1000, 100, efficientnet_b0_ver1: 39.91
2020-10-12 01:17:57,915 Entering: extract_features [regression_job]
2020-10-12 01:17:57,915 Entering: initializing VGG16CaffeExtractor
2020-10-12 01:17:57,915 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-12 01:17:57,918 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.002661 seconds.
2020-10-12 01:17:57,918 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-12 01:17:57,920 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 0.002314 seconds.
2020-10-12 01:17:57,920 Exiting: initializing VGG16CaffeExtractor after 0.005639 seconds.
2020-10-12 01:17:57,921 Entering: actual extraction
2020-10-12 01:17:57,921 Entering: cropping of 100 patches
2020-10-12 01:17:57,947 Exiting: cropping of 100 patches after 0.025141 seconds.
2020-10-12 01:18:37,146 Exiting: actual extraction after 39.224730 seconds.
2020-10-12 01:18:37,146 Entering: storing features
2020-10-12 01:18:37,287 Exiting: storing features after 0.140645 seconds.
2020-10-12 01:18:37,295 Exiting: extract_features [regression_job] after 39.379993 seconds.
1000, 100, vgg16_coralnet_ver1: 39.22
2020-10-12 01:18:37,845 Entering: extract_features [regression_job]
2020-10-12 01:18:37,845 Entering: initializing EfficientNetExtractor
2020-10-12 01:18:37,846 Entering: fetching of model to /workspace/models/efficientnet_b0_ver1.pt
2020-10-12 01:18:37,848 Exiting: fetching of model to /workspace/models/efficientnet_b0_ver1.pt after 0.002127 seconds.
2020-10-12 01:18:37,849 Exiting: initializing EfficientNetExtractor after 0.003290 seconds.
2020-10-12 01:18:37,850 Entering: actual extraction
2020-10-12 01:18:37,851 Entering: cropping 200 patches
2020-10-12 01:18:37,875 Exiting: cropping 200 patches after 0.023983 seconds.
2020-10-12 01:18:37,937 Entering: loading of state dict
2020-10-12 01:18:38,375 Exiting: loading of state dict after 0.437415 seconds.
2020-10-12 01:18:38,375 Entering: model initialization
2020-10-12 01:18:38,401 Exiting: model initialization after 0.025561 seconds.
2020-10-12 01:20:02,139 Exiting: actual extraction after 84.289119 seconds.
2020-10-12 01:20:02,139 Entering: storing features
2020-10-12 01:20:02,399 Exiting: storing features after 0.259331 seconds.
2020-10-12 01:20:02,405 Exiting: extract_features [regression_job] after 84.559914 seconds.
1000, 200, efficientnet_b0_ver1: 84.29
2020-10-12 01:20:02,925 Entering: extract_features [regression_job]
2020-10-12 01:20:02,926 Entering: initializing VGG16CaffeExtractor
2020-10-12 01:20:02,926 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt
2020-10-12 01:20:02,928 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.deploy.prototxt after 0.001830 seconds.
2020-10-12 01:20:02,928 Entering: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel
2020-10-12 01:20:02,931 Exiting: fetching of model to /workspace/models/vgg16_coralnet_ver1.caffemodel after 0.002186 seconds.
2020-10-12 01:20:02,931 Exiting: initializing VGG16CaffeExtractor after 0.005246 seconds.
2020-10-12 01:20:02,932 Entering: actual extraction
2020-10-12 01:20:02,932 Entering: cropping of 200 patches
2020-10-12 01:20:02,956 Exiting: cropping of 200 patches after 0.024268 seconds.
2020-10-12 01:21:18,862 Exiting: actual extraction after 75.930817 seconds.
2020-10-12 01:21:18,863 Entering: storing features
2020-10-12 01:21:19,130 Exiting: storing features after 0.266736 seconds.
2020-10-12 01:21:19,140 Exiting: extract_features [regression_job] after 76.214743 seconds.
1000, 200, vgg16_coralnet_ver1: 75.93
['1000, 10, efficientnet_b0_ver1: 5.68', '1000, 10, vgg16_coralnet_ver1: 17.58', 
'1000, 50, efficientnet_b0_ver1: 21.08', '1000, 50, vgg16_coralnet_ver1: 22.05', 
'1000, 100, efficientnet_b0_ver1: 39.91', '1000, 100, vgg16_coralnet_ver1: 39.22', 
'1000, 200, efficientnet_b0_ver1: 84.29', '1000, 200, vgg16_coralnet_ver1: 75.93']

@qiminchen

No, I'm not using the GPU. Actually, our code doesn't support GPU so far, since we did not move any models or data to the GPU. How do you specify the resources, like RAM and number of cores, when running Docker? I am not very familiar with Docker.

@qiminchen

Ubuntu 18.04.5 with AMD® Ryzen 7 3700x 8-core processor × 16


beijbom commented Oct 12, 2020

Hey @qiminchen . Re. GPU: got it -- that is what I thought.

Re. setting resources, I haven't done it on Ubuntu, but there are plenty of resources online, like this one: https://hostadvice.com/how-to/how-to-limit-a-docker-containers-resources-on-ubuntu-18-04/.
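(For a CLI-only route on Linux -- a sketch, assuming a reasonably recent Docker; the mount paths are the same placeholders as in the command above -- the limits can be passed straight to docker run via the standard --cpus and --memory flags:)

```shell
# Cap the container at 1 CPU and 4 GB RAM; mount paths are placeholders
# copied from the repro command earlier in this issue.
docker run --cpus=1 --memory=4g \
    -v ${PWD}:/workspace/spacer/ \
    -v /path/to/your/local/models:/workspace/models \
    -it beijbom/pyspacer:v0.2.7 python3 scripts/docker/runtimes.py
```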

Just as a sanity check, can you try running again using the exact same image that I used? You can pull it down using docker pull beijbom/pyspacer:v0.2.7.


beijbom commented Oct 12, 2020

@StephenChan : would you mind running this also so we get a third data point?


beijbom commented Oct 12, 2020

I ran again with 4 cores and 16GB RAM -- basically my whole laptop. Slightly better yet again:

['1000, 10, efficientnet_b0_ver1: 5.78', '1000, 10, vgg16_coralnet_ver1: 18.91', 
'1000, 50, efficientnet_b0_ver1: 22.93', '1000, 50, vgg16_coralnet_ver1: 44.96', 
'1000, 100, efficientnet_b0_ver1: 32.40', '1000, 100, vgg16_coralnet_ver1: 52.44', 
'1000, 200, efficientnet_b0_ver1: 57.04', '1000, 200, vgg16_coralnet_ver1: 106.46']

@qiminchen

I pulled the exact same image and ran a couple of tests; here is the full comparison. I'm pretty sure I have the RAM limit correct, but not sure about the CPU cores, as the only option I found in the official documentation that might be related to CPU cores is --cpuset-cpus, so I basically changed it to --cpuset-cpus "0", --cpuset-cpus "0-1", --cpuset-cpus "0-3" etc. for 1, 2 and 4 cores.

sudo docker run --cpuset-cpus "0" --memory 4GB -v ${PWD}:/workspace/spacer/ -v /home/qimin/Projects/pyspacer_boto3/spacer/models:/workspace/models -it beijbom/pyspacer:v0.2.7 python3 scripts/docker/runtimes.py
  • System: Ubuntu 18.04.5
  • Processor: AMD® Ryzen 7 3700x 8-core processor × 16
  1. 4GB RAM + 1 CPU core
'1000, 10, efficientnet_b0_ver1: 0.65', '1000, 10, vgg16_coralnet_ver1: 22.75', 
'1000, 50, efficientnet_b0_ver1: 2.13', '1000, 50, vgg16_coralnet_ver1: 64.75', 
'1000, 100, efficientnet_b0_ver1: 4.13', '1000, 100, vgg16_coralnet_ver1: 118.90', 
'1000, 200, efficientnet_b0_ver1: 8.25', '1000, 200, vgg16_coralnet_ver1: 226.95'

$ sudo docker stats
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT   MEM %               NET I/O             BLOCK I/O           PIDS
240b2695cb44        relaxed_banzai      100.60%             1.953GiB / 4GiB     48.82%              2.37MB / 33.8kB     904MB / 0B          1
  2. 8GB RAM + 2 CPU cores
'1000, 10, efficientnet_b0_ver1: 0.48', '1000, 10, vgg16_coralnet_ver1: 13.31', 
'1000, 50, efficientnet_b0_ver1: 1.55', '1000, 50, vgg16_coralnet_ver1: 36.97', 
'1000, 100, efficientnet_b0_ver1: 2.98', '1000, 100, vgg16_coralnet_ver1: 67.81', 
'1000, 200, efficientnet_b0_ver1: 5.68', '1000, 200, vgg16_coralnet_ver1: 129.16'

$ sudo docker stats
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT   MEM %               NET I/O             BLOCK I/O           PIDS
f3327768d446        thirsty_hawking     195.57%             1.986GiB / 8GiB     24.82%              1.58MB / 24.5kB     242kB / 0B          2
  3. 16GB RAM + 4 CPU cores
'1000, 10, efficientnet_b0_ver1: 0.46', '1000, 10, vgg16_coralnet_ver1: 8.75', 
'1000, 50, efficientnet_b0_ver1: 1.42', '1000, 50, vgg16_coralnet_ver1: 23.31', 
'1000, 100, efficientnet_b0_ver1: 2.68', '1000, 100, vgg16_coralnet_ver1: 42.99', 
'1000, 200, efficientnet_b0_ver1: 5.22', '1000, 200, vgg16_coralnet_ver1: 81.68'

$ sudo docker stats
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT   MEM %               NET I/O             BLOCK I/O           PIDS
e4fafb4ea324        wonderful_keldysh   384.70%             1.974GiB / 16GiB    12.34%              1.58MB / 23.2kB     0B / 0B             4
  4. 32GB RAM + 8 CPU cores
'1000, 10, efficientnet_b0_ver1: 0.45', '1000, 10, vgg16_coralnet_ver1: 7.94', 
'1000, 50, efficientnet_b0_ver1: 1.43', '1000, 50, vgg16_coralnet_ver1: 19.82', 
'1000, 100, efficientnet_b0_ver1: 2.70', '1000, 100, vgg16_coralnet_ver1: 36.06', 
'1000, 200, efficientnet_b0_ver1: 5.44', '1000, 200, vgg16_coralnet_ver1: 69.12'

$ sudo docker stats
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
70b05e75175e        relaxed_hamilton    753.50%             2.059GiB / 31.29GiB   6.58%               1.97MB / 28.6kB     0B / 0B             8

Seems like VGG16 benefits much more from the increased resources than EfficientNet-b0. And I think the performance might be highly affected by the CPU type, and also by PyTorch vs. Caffe performance.


beijbom commented Oct 13, 2020

Ok. So to summarize: the runs below all use 1 CPU core and 4GB RAM.

@qiminchen 's results using AMD® Ryzen 7 3700x 8-core processor × 16 on ubuntu:

'1000, 10, efficientnet_b0_ver1: 0.65', '1000, 10, vgg16_coralnet_ver1: 22.75', 
'1000, 50, efficientnet_b0_ver1: 2.13', '1000, 50, vgg16_coralnet_ver1: 64.75', 
'1000, 100, efficientnet_b0_ver1: 4.13', '1000, 100, vgg16_coralnet_ver1: 118.90', 
'1000, 200, efficientnet_b0_ver1: 8.25', '1000, 200, vgg16_coralnet_ver1: 226.95'

My results using 2.4 GHz Dual-Core Intel Core i7 on OSX:

'1000, 10, efficientnet_b0_ver1: 20.01', '1000, 10, vgg16_coralnet_ver1: 32.42', 
'1000, 50, efficientnet_b0_ver1: 64.39', '1000, 50, vgg16_coralnet_ver1: 72.45', 
'1000, 100, efficientnet_b0_ver1: 113.83', '1000, 100, vgg16_coralnet_ver1: 132.77', 
'1000, 200, efficientnet_b0_ver1: 175.81', '1000, 200, vgg16_coralnet_ver1: 125.75'

My results using m4.large AWS instance with a 2.3 GHz Intel Xeon® E5-2686 v4 CPU:

10428793-d7e9-4e4a-9300-b6b24c148682, 10, efficientnet_b0_ver1, 1000, 11.451279640197754
800bc059-876f-46fd-af1a-fbecbb87bb1a, 50, efficientnet_b0_ver1, 1000, 53.84459185600281
fae35b67-5478-4084-a4e1-3b7bbdf46bda, 100, efficientnet_b0_ver1, 1000, 105.54940605163574
90eb5cea-60ad-4405-9ad7-9a87b624befa, 200, efficientnet_b0_ver1, 1000, 210.5010862350464
 
e63e2add-00af-4be9-814e-81c82925c737, 10, vgg16_coralnet_ver1, 1000, 22.87420344352722
8ea42e26-4d56-4caa-8121-a0aacdef2df3, 50, vgg16_coralnet_ver1, 1000, 58.998679399490356
bae75373-efb3-402b-be16-f99ada22b6c4, 100, vgg16_coralnet_ver1, 1000, 120.24310874938965
da0207ca-1417-4f60-b25d-7d2c7065aecf, 200, vgg16_coralnet_ver1, 1000, 223.89115595817566

A few notes.

  • The Caffe results are more or less the same across all three setups.
  • My laptop results and the m4.large results are also more or less the same.
  • The torch results from @qiminchen , on the other hand, are ~30x faster.

So there is a clear outlier in @qiminchen 's torch runtimes. I did some googling but didn't find any evidence that AMD would be so much faster than Intel for PyTorch. It just doesn't make sense to me. Let's keep digging.
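To keep digging, it might help if everyone posts the same environment fingerprint alongside their timings. A minimal stdlib-only sketch (the field names here are my own, not anything pyspacer defines):

```python
import multiprocessing
import platform

def env_summary():
    """Collect basic hardware/OS facts for comparing benchmark machines."""
    return {
        "processor": platform.processor(),      # CPU vendor/model string
        "machine": platform.machine(),          # e.g. 'x86_64'
        "system": platform.system(),            # e.g. 'Linux'
        "python": platform.python_version(),
        "logical_cpus": multiprocessing.cpu_count(),
    }

print(env_summary())
```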

@beijbom
Collaborator Author

beijbom commented Oct 13, 2020

@qiminchen : I think I found it. For some reason the timing measurements in your log don't add up. E.g.

2020-10-11 23:55:45,574 Exiting: extract_features [regression_job] after 4.485260 seconds.
1000, 10, efficientnet_b0_ver1: 0.47

I'll take a look and see what might have gone wrong there.

@qiminchen
Collaborator

Looks like that is the number from the actual extraction: 2020-10-11 23:55:45,567 Exiting: actual extraction after 0.472969 seconds.

@beijbom
Collaborator Author

beijbom commented Oct 13, 2020

Looks like this is the number from the actual extraction 2020-10-11 23:55:45,567 Exiting: actual extraction after 0.472969 seconds.

You are right. That first measurement is longer than the actual extraction because it includes downloading the model. The subsequent runs are timed correctly. I guess the mystery remains.
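In other words, to benchmark extraction alone, one-time costs like the model download need to be excluded, e.g. by discarding the first (cold) run and timing only warm runs. A generic sketch of that idea (not the actual `runtimes.py` code):

```python
import time

def time_warm(fn, warmup=1, repeats=3):
    """Time fn() while excluding cold-start runs (model download, cache fill)."""
    for _ in range(warmup):
        fn()  # discarded: may include one-time setup cost
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)  # min is less noisy than mean for CPU benchmarks

# Usage with a stand-in workload:
t = time_warm(lambda: sum(i * i for i in range(100_000)))
print(f"warm runtime: {t:.4f}s")
```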

@qiminchen
Collaborator

This one from your log seems weird: the actual extraction took 19.459403 seconds, but the summary line reports 1000, 10, vgg16_coralnet_ver1: 17.58

2020-10-12 01:16:14,867 Exiting: cropping of 10 patches after 0.029503 seconds.
2020-10-12 01:16:14,867 Entering: load net, check SHA
2020-10-12 01:16:19,899 Exiting: load net, check SHA after 5.031558 seconds.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 552712164
2020-10-12 01:16:32,421 Exiting: actual extraction after 19.459403 seconds.
2020-10-12 01:16:32,422 Entering: storing features
2020-10-12 01:16:32,442 Exiting: storing features after 0.020557 seconds.
2020-10-12 01:16:32,443 Exiting: extract_features [regression_job] after 19.486018 seconds.
1000, 10, vgg16_coralnet_ver1: 17.58

@StephenChan
Member

I didn't get around to testing this today. I can do it tomorrow if the problem hasn't been figured out by then.

@qiminchen
Collaborator

qiminchen commented Oct 14, 2020

Again, I pulled the exact same image and the boto3 branch and tested the runtime on my laptop (2.3 GHz Quad-Core Intel Core i5, macOS). This time I was able to reproduce the issue. Here are the results:

  1. 1 core + 4GB RAM
'1000, 10, efficientnet_b0_ver1: 8.29', '1000, 10, vgg16_coralnet_ver1: 17.51', 
'1000, 50, efficientnet_b0_ver1: 33.34', '1000, 50, vgg16_coralnet_ver1: 27.32', 
'1000, 100, efficientnet_b0_ver1: 64.95', '1000, 100, vgg16_coralnet_ver1: 49.93', 
'1000, 200, efficientnet_b0_ver1: 135.88', '1000, 200, vgg16_coralnet_ver1: 101.42'
  2. 2 cores + 8GB RAM
'1000, 10, efficientnet_b0_ver1: 4.54', '1000, 10, vgg16_coralnet_ver1: 13.04', 
'1000, 50, efficientnet_b0_ver1: 17.50', '1000, 50, vgg16_coralnet_ver1: 17.93', 
'1000, 100, efficientnet_b0_ver1: 33.82', '1000, 100, vgg16_coralnet_ver1: 33.23', 
'1000, 200, efficientnet_b0_ver1: 67.42', '1000, 200, vgg16_coralnet_ver1: 62.74'
  3. 4 cores + 16GB RAM
'1000, 10, efficientnet_b0_ver1: 2.93', '1000, 10, vgg16_coralnet_ver1: 11.38', 
'1000, 50, efficientnet_b0_ver1: 10.55', '1000, 50, vgg16_coralnet_ver1: 14.87', 
'1000, 100, efficientnet_b0_ver1: 19.42', '1000, 100, vgg16_coralnet_ver1: 27.51', 
'1000, 200, efficientnet_b0_ver1: 37.40', '1000, 200, vgg16_coralnet_ver1: 49.96'
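Reading the 200-point rows across the three configs, EfficientNet scales almost linearly with core count while VGG16 saturates sooner. A quick sketch computing the parallel speedups from the figures above:

```python
# Runtimes (seconds) for 200 points at image size 1000, keyed by core count,
# taken from the three configurations above.
efficientnet = {1: 135.88, 2: 67.42, 4: 37.40}
vgg16 = {1: 101.42, 2: 62.74, 4: 49.96}

def speedup(times):
    """Speedup relative to the single-core runtime."""
    base = times[1]
    return {cores: round(base / t, 2) for cores, t in times.items()}

print(speedup(efficientnet))  # → {1: 1.0, 2: 2.02, 4: 3.63}
print(speedup(vgg16))         # → {1: 1.0, 2: 1.62, 4: 2.03}
```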

I guess this might be down to AMD vs. Intel. Could you test the runtime on an AWS instance with an AMD CPU instead of an Intel one?

@beijbom
Collaborator Author

beijbom commented Oct 14, 2020 via email

@StephenChan
Member

StephenChan commented Oct 14, 2020

VirtualBox on my laptop with Intel CPU. The VM is Ubuntu Server 20.04, set to 1 core + 4 GB RAM:

'1000, 10, efficientnet_b0_ver1: 8.79', '1000, 10, vgg16_coralnet_ver1: 45.31',
'1000, 50, efficientnet_b0_ver1: 44.62', '1000, 50, vgg16_coralnet_ver1: 56.41',
'1000, 100, efficientnet_b0_ver1: 80.95', '1000, 100, vgg16_coralnet_ver1: 103.31',
'1000, 200, efficientnet_b0_ver1: 161.94', '1000, 200, vgg16_coralnet_ver1: 197.27'

@beijbom
Collaborator Author

beijbom commented Oct 15, 2020

@StephenChan @qiminchen . I set up a compute environment on Batch with only AMD instances. Lo and behold: it worked.

10 pts using efficientnet_b0_ver1: 0.80 seconds
50 pts using efficientnet_b0_ver1: 2.87 seconds
100 pts using efficientnet_b0_ver1: 4.52 seconds
200 pts using efficientnet_b0_ver1: 8.41 seconds
500 pts using efficientnet_b0_ver1: 20.84 seconds

10 pts using vgg16_coralnet_ver1: 29.97 seconds
50 pts using vgg16_coralnet_ver1: 65.96 seconds
100 pts using vgg16_coralnet_ver1: 111.88 seconds
200 pts using vgg16_coralnet_ver1: 179.14 seconds
500 pts using vgg16_coralnet_ver1: 568.88 seconds

@beijbom
Collaborator Author

beijbom commented Oct 15, 2020

It is still unclear to me why this happens, and I haven't found any clues online. But I'll consider this problem solved, for now. ;)

@beijbom beijbom closed this as completed Oct 15, 2020
@kriegman

kriegman commented Oct 15, 2020 via email

@beijbom
Collaborator Author

beijbom commented Oct 15, 2020 via email
