Download model weights in parallel for prototype CI #4772
Conversation
💊 CI failures summary and remediations — as of commit 4f3ddb1 (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚 (This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the internal Dr. CI Users group.)
I know it's a WIP; I just added a couple of FYI comments. Feel free to ignore them if it's too early to address.
LGTM, thanks a lot @pmeier.
@NicolasHug could you also have a look, as you are more familiar with CircleCI?
```python
for line in file:
    model_urls.update(MODEL_URL_PATTERN.findall(line))

print("\n".join(sorted(model_urls)))
```
This approach is a bit hacky, though admittedly you can't do much else prior to compiling torchvision. If torchvision were compiled, we could rely on the upcoming registration mechanism to get all available models and weights and then fetch their URLs. Since this is not possible for speed reasons, we might be forced to do something like this. The good thing is that even if this fails, we will still download the weights properly one by one later.
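For context, a self-contained sketch of what such a pre-compile URL scrape could look like. The `MODEL_URL_PATTERN` regex and the demo tree below are illustrative assumptions, not the exact code from this PR:

```python
import pathlib
import re
import tempfile

# Hypothetical pattern; the exact regex used in the PR may differ.
MODEL_URL_PATTERN = re.compile(r"https://download\.pytorch\.org/models/[^\s'\"]+")

def collect_model_urls(root):
    """Scan a source tree for checkpoint URLs without importing torchvision."""
    model_urls = set()
    for path in pathlib.Path(root).rglob("*.py"):
        with open(path, encoding="utf-8") as file:
            for line in file:
                model_urls.update(MODEL_URL_PATTERN.findall(line))
    return model_urls

# Demo on a throwaway tree instead of a real torchvision checkout:
with tempfile.TemporaryDirectory() as root:
    sample = pathlib.Path(root) / "resnet.py"
    sample.write_text(
        'model_urls = {"resnet18": "https://download.pytorch.org/models/resnet18-f37072fd.pth"}\n'
    )
    print(sorted(collect_model_urls(root)))
    # -> ['https://download.pytorch.org/models/resnet18-f37072fd.pth']
```

Because it only greps source text, this works before `pip install -e .` finishes, which is the whole point of doing it in CI setup.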
Seeing how fast the download is, we could also do it in the foreground after the installation. Still much faster than downloading sequentially during the tests. Both variants are fine by me. You pick.
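For reference, the parallel download could look roughly like the sketch below. The function name, cache layout, and worker count are assumptions for illustration, not the actual CI script:

```python
import concurrent.futures
import os
import urllib.request

def download_weights(urls, cache_dir, max_workers=8):
    """Fetch checkpoint files in parallel, skipping files already cached."""
    os.makedirs(cache_dir, exist_ok=True)

    def fetch(url):
        # Key the file by URL basename, mirroring torch.hub's convention.
        target = os.path.join(cache_dir, os.path.basename(url))
        if not os.path.exists(target):  # already present from a cache restore
            urllib.request.urlretrieve(url, target)
        return target

    # The work is I/O-bound, so a thread pool is enough; no processes needed.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

Run in the background during installation, this overlaps the downloads with the build; run in the foreground afterwards, it is still far faster than downloading sequentially inside the tests.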
13 seconds sounds pretty good to me. It's your call. I don't mind either way.
LGTM, thanks for the awesome work @pmeier.
As discussed offline, this solution will be the basis for revamping our testing strategy for models. We will need to make things cross-platform, support CPU/GPU runs, etc., but that's something we can do in a follow-up PR.
Summary:
* enable caching of model weights for prototype CI
* syntax
* syntax
* make cache dir dynamic
* increase verbosity
* fix
* use larget CI machine
* revert debug output
* [DEBUG] test env var usage in save_cache
* retry
* use checksum for caching
* remove env vars because expansion is not working
* syntax
* cleanup
* base caching on model-urls
* relax regex
* cleanup skips
* cleanup
* fix skipping logic
* improve step name
* benchmark without caching
* benchmark with external download
* debug
* fix manual download location
* debug again
* download weights in the background
* try parallel download
* add missing import
* use correct decoractor
* up resource_class
* fix wording
* enable stdout passthrough to see download during test
* remove linebreak
* move checkout up
* cleanup
* debug failing test
* temp fix
* fix
* cleanup
* fix regex
* remove explicit install of numpy

Reviewed By: NicolasHug

Differential Revision: D32694305

fbshipit-source-id: 96a9ac5af170ca491edcedf0affdc338481befb8
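The "use checksum for caching" and "base caching on model-urls" steps in the summary boil down to deriving a stable cache key from the extracted URL list, so the CI cache invalidates exactly when a weight URL changes. A minimal sketch of that idea (the key format here is made up, not CircleCI's):

```python
import hashlib

def cache_key(model_urls):
    """Hash the sorted URL list: same URLs -> same key, any added, removed,
    or changed URL -> new key, forcing a fresh cache population."""
    digest = hashlib.sha256("\n".join(sorted(model_urls)).encode()).hexdigest()
    return f"model-weights-v1-{digest[:16]}"
```

Sorting first makes the key independent of the order in which URLs were scraped from the source tree.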
cc @datumbox @pmeier @seemethere @bjuncek