Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scripts for downloading models in DNN samples #18591

Merged
merged 21 commits into from
Dec 11, 2020

Conversation

sl-sergei
Copy link
Contributor

@sl-sergei sl-sergei commented Oct 15, 2020

Relates: #12186

force_builders=linux,docs,Custom,Custom Mac
build_image:Custom=ubuntu-openvino-2020.4.0:16.04
build_image:Custom Win=openvino-2020.4.0
build_image:Custom Mac=openvino-2020.4.0

test_modules:Custom=dnn,python2,python3,java,samples
test_modules:Custom Win=dnn,python2,python3,java,samples
test_modules:Custom Mac=dnn,python2,python3,java,samples

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
test_opencl:Custom=OFF
test_bigdata:Custom=1
test_filter:Custom=*

@sl-sergei sl-sergei requested a review from alalek October 15, 2020 07:55
@sl-sergei
Copy link
Contributor Author

@alalek Can you please review these changes? Please, let me know, if there is something else needed for this task.

@sl-sergei sl-sergei changed the title WIP: Download scripts for DNN samples WIP: Scripts for downloading models in DNN samples Oct 16, 2020
@asmorkalov asmorkalov added this to the 5.0 milestone Oct 16, 2020
@asmorkalov
Copy link
Contributor

@vpisarev please take a look.

@vpisarev
Copy link
Contributor

vpisarev commented Oct 20, 2020

@sl-sergei, thank you!

  1. can you please put force_builders=... comment into the pull request to run it one some build slaves?
  2. I would like to have some decentralized solution (i.e. the info about all downloadable items does not have to be put into one big file), and not necessarily limited to deep learning models. It can also be images, video clips and generally arbitrary data that is needed for a sample or an app. There should be documented program API as well, not just command-line API.

Approximate usage:

import cv2 as cv, sys

try:
   imgpath=cv.download(
       url="https://github.com/opencv/opencv/raw/master/samples/data/lena.jpg",
       sha="12345"
       )
except IOError:
   sys.exit("could not load lena image")

img = cv.imread(imgpath, -1)
... 
  • url and sha are required parameters
  • dir/save_dir is optional parameter. If it's not set, some environment variable should be used (like OPENCV_SAVE_DIR), and if it's empty, it could be opencv_data subdirectory inside $TEMP or something like that. Saving to current directory may not be very good idea, because sample can be executed from different directories. Saving to the same directory as sample is not a good idea either, because it can be in read-only directory (like /usr/share) or on a slow flash drive etc.
  • The file should be saved under <save_dir>/<sha_key>/<base_part_of_url> or maybe as <save_dir>/<sha_key>_<base_part_of_url>, like $TEMP/opencv_data/12345/lena.jpg or $TEMP/opencv_data/12345_lena.jpg in this case. Putting file into a dedicated subdirectory or adding sha as a prefix will help to avoid possible name conflicts
  • timeout could be another optional parameter

@sl-sergei
Copy link
Contributor Author

@vpisarev @alalek Can you please review the changes?

@alalek
Copy link
Member

alalek commented Oct 30, 2020

cv.download

This means that this functionality should be contributed into Python package with OpenCV itself (somewhere here)


subdirectory inside $TEMP

There are several variants which are not secure (due to be globally writable locations, like /tmp).
Consider checking (reusing / adopting) code for OpenCL binaries cache (review this code).

@sl-sergei
Copy link
Contributor Author

cv.download

This means that this functionality should be contributed into Python package with OpenCV itself (somewhere here)

subdirectory inside $TEMP

There are several variants which are not secure (due to be globally writable locations, like /tmp).
Consider checking (reusing / adopting) code for OpenCL binaries cache (review this code).

Okay, I will do it

@vpisarev
Copy link
Contributor

vpisarev commented Nov 9, 2020

@sl-sergei, thank you! I still do not see how this functionality can be used from another Python script, as in the sample that I put above. Can you, please, modify at least one Python sample to download some data automatically?

@sl-sergei
Copy link
Contributor Author

@alalek I updated script logic to make cache folder selection more secure. @vpisarev I added an example to README

@sl-sergei sl-sergei changed the title WIP: Scripts for downloading models in DNN samples Scripts for downloading models in DNN samples Nov 18, 2020
@vpisarev
Copy link
Contributor

@sl-sergei, looks good to me! 👍

@asmorkalov
Copy link
Contributor

@alalek Could we merge it?

Copy link
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see changes in current samples.
Is there some "single-click" sample? Which can download all necessary data without passing of extra parameters?

################################################################################
# Object detection models.
################################################################################

# OpenCV's face detection network
opencv_fd:
load_info:
url: "https://github.com/opencv/opencv_3rdparty/raw/dnn_samples_face_detector_20170830/res10_300x300_ssd_iter_140000.caffemodel"
sha: "15aa726b4d46d9f023526d85537db81cbc8dd566"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let use sha1 entry name (because there are several SHA implementations or user may want to pass md5 instead).

temp_env = os.environ.get("TMPDIR", None)
if temp_env is None or not os.path.isdir(temp_env):
temp_dir = Path("/tmp")
print("Using world accessible cache directory. This may be not secure: ", temp_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wrap and reuse C++ implementation instead of adding duplicated logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets rollback changes and add "TODO" here until we have resolved issues in #18931

@sl-sergei
Copy link
Contributor Author

I don't see changes in current samples.
Is there some "single-click" sample? Which can download all necessary data without passing of extra parameters?

python download_models.py command will parse models.yml and face_detector/weights.meta4 and download all models mentioned there

@alalek
Copy link
Member

alalek commented Nov 23, 2020

@vpisarev @asmorkalov @sl-sergei Please provide feedback on comments above.

@sl-sergei
Copy link
Contributor Author

sl-sergei commented Nov 23, 2020

@alalek I have already fixed naming for SHA1 key locally and plan to reuse getCacheDirectory, just in the middle of progress

@vpisarev
Copy link
Contributor

👍

Copy link
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

samples/dnn/download_models.py

Probably its name is very specific. We may want to download other files (like annotations, datasets) too.

Also we can keep this "as is" and then proxy into cv.downloader from package when it will be available.


By running the following commands, you will get **MobileNetSSD_deploy.caffemodel** file:
```bash
export OPENCV_SAVE_DIR=save_dir_2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets introduce more specific name here.

It is unclear what is going to be saved here. May be test results?

Something like:

  • OPENCV_DOWNLOAD_DATA_PATH
  • OPENCV_DOWNLOAD_CACHE_DATA_PATH

P.S. _DATA_PATH is aligned with OPENCV_TEST_DATA_PATH

temp_env = os.environ.get("TMPDIR", None)
if temp_env is None or not os.path.isdir(temp_env):
temp_dir = Path("/tmp")
print("Using world accessible cache directory. This may be not secure: ", temp_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets rollback changes and add "TODO" here until we have resolved issues in #18931

Copy link
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you 👍

@alalek alalek merged commit 1f3255d into opencv:3.4 Dec 11, 2020
@alalek alalek mentioned this pull request Dec 11, 2020
@alalek alalek mentioned this pull request Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants