Missing s3 object (ignored for download): 1515/9670 #48

Closed
v-iashin opened this issue Feb 20, 2022 · 6 comments

Comments

@v-iashin

Running
python -m ego4d.cli.cli --aws_profile_name="ego4d" --output_directory="./ego4d_data" --datasets full_scale annotations --metadata
gives a bunch of warnings that some mp4s are missing on S3:

Boto 403 Exception For exists: a0b8bc0a-9ed2-489b-a31c-48e1afac1bc1 | a0b8bc0a-9ed2-489b-a31c-48e1afac1bc1.mp4
Missing s3 object (ignored for download): a0b8bc0a-9ed2-489b-a31c-48e1afac1bc1 | a0b8bc0a-9ed2-489b-a31c-48e1afac1bc1.mp4

Boto 403 Exception For exists: eaa560d5-6432-4030-a03a-b6ba512e621d | eaa560d5-6432-4030-a03a-b6ba512e621d.mp4
Missing s3 object (ignored for download): eaa560d5-6432-4030-a03a-b6ba512e621d | eaa560d5-6432-4030-a03a-b6ba512e621d.mp4

Boto 403 Exception For exists: 19a7cce8-5771-4ec9-bdd0-189cc34c082d | 19a7cce8-5771-4ec9-bdd0-189cc34c082d.mp4
Missing s3 object (ignored for download): 19a7cce8-5771-4ec9-bdd0-189cc34c082d | 19a7cce8-5771-4ec9-bdd0-189cc34c082d.mp4
...

No existing videos to filter.
ERROR:root:1515/9670 missing S3 downloads will be ignored
Downloading 8155/9670..
Expected size of downloaded files is 5448.6 GB. Do you want to start the download? (y/n)

Is this expected behavior?

conda env
name: ego4d
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=4.5=1_gnu
  - ca-certificates=2021.10.26=h06a4308_2
  - certifi=2021.10.8=py38h06a4308_2
  - ld_impl_linux-64=2.35.1=h7274673_9
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.3.0=h5101ec6_17
  - libgomp=9.3.0=h5101ec6_17
  - libstdcxx-ng=9.3.0=hd4cf53a_17
  - ncurses=6.3=h7f8727e_2
  - openssl=1.1.1m=h7f8727e_0
  - pip=21.2.4=py38h06a4308_0
  - python=3.8.12=h12debd9_0
  - readline=8.1.2=h7f8727e_1
  - setuptools=58.0.4=py38h06a4308_0
  - sqlite=3.37.2=hc218d9a_0
  - tk=8.6.11=h1ccaba5_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7f8727e_4
  - pip:
    - boto3==1.21.3
    - botocore==1.24.3
    - ego4d==1.0
    - jmespath==0.10.0
    - python-dateutil==2.8.2
    - s3transfer==0.5.1
    - six==1.16.0
    - tqdm==4.62.3
    - urllib3==1.26.8
@ebyrne
Contributor

ebyrne commented Feb 20, 2022

Please stand by. There's an error on our end for data access on a subset of the videos. We'll have it corrected shortly.

In the meantime, there shouldn't be any issues downloading the remainder and the annotations. Please proceed there!

Apologies for the rocky start!

@ebyrne
Contributor

ebyrne commented Feb 20, 2022

@v-iashin This should be resolved. Can you confirm?

@shubham-goel

shubham-goel commented Feb 21, 2022

Hi, I was running into the same issue while trying to download clips. It is almost resolved on my end: just 1 file still throws a Boto 403 Exception:

Boto 403 Exception For exists: 476870ad-d779-423d-86ec-b8c9c4c54df2 | 476870ad-d779-423d-86ec-b8c9c4c54df2.mp4
Missing s3 object (ignored for download): 476870ad-d779-423d-86ec-b8c9c4c54df2 | 476870ad-d779-423d-86ec-b8c9c4c54df2.mp4
ERROR:root:1/12285 missing S3 downloads will be ignored
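
For anyone who wants a list of the affected files rather than scrolling through the warnings, the log lines quoted in this thread follow a regular pattern and can be collected from a saved log with a short script. This is only a sketch: the regular expression is inferred from the lines pasted above and may not hold for other CLI versions, and `missing_objects` is a made-up helper name, not part of the ego4d CLI.

```python
import re

# Pattern inferred from the warnings quoted in this thread (an assumption,
# not a documented log format of the ego4d CLI).
MISSING_RE = re.compile(
    r"^Missing s3 object \(ignored for download\): (\S+) \| (\S+)$"
)


def missing_objects(log_text):
    """Return (uid, filename) pairs for every 'Missing s3 object' line."""
    found = []
    for line in log_text.splitlines():
        match = MISSING_RE.match(line.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Log excerpt copied from the comment above.
sample = """\
Boto 403 Exception For exists: 476870ad-d779-423d-86ec-b8c9c4c54df2 | 476870ad-d779-423d-86ec-b8c9c4c54df2.mp4
Missing s3 object (ignored for download): 476870ad-d779-423d-86ec-b8c9c4c54df2 | 476870ad-d779-423d-86ec-b8c9c4c54df2.mp4
ERROR:root:1/12285 missing S3 downloads will be ignored
"""

for uid, filename in missing_objects(sample):
    print(uid, filename)
```

Only the "Missing s3 object" lines are matched, so each affected file is reported once even though the "Boto 403 Exception" line repeats the same identifier.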

@v-iashin
Author

Yep, I think it is resolved. At least, it shows 9670/9670 as expected.

On a side note, I reran the same command, but it seems to start all over again. Would it be beneficial to add guidance on how to resume the download?

@ebyrne
Contributor

ebyrne commented Feb 21, 2022

@v-iashin Vladimir, had it finished the first time? It's on the list, but right now it only writes the version info once at the end, so if it fails or is killed for whatever reason, it won't have written it. (It should certainly be written incrementally, given how long the download takes!)

Is that consistent with what you're seeing?
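
To illustrate what "written incrementally" could look like, here is a minimal sketch of a resumable download loop that records each finished file as soon as it completes. It is not the ego4d CLI's implementation: the JSON-lines manifest, the function names `load_done` and `download_all`, and the `fetch` callback are all hypothetical and chosen only for this example.

```python
import json
import os
from pathlib import Path


def load_done(manifest):
    """Read the uids already recorded in the manifest (empty set if none)."""
    manifest = Path(manifest)
    if not manifest.exists():
        return set()
    done = set()
    with manifest.open() as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)["uid"])
            except (json.JSONDecodeError, KeyError):
                # A truncated record (e.g. the process was killed mid-write)
                # is skipped, so that file is simply downloaded again.
                continue
    return done


def download_all(uids, manifest, fetch):
    """Fetch every uid not yet in the manifest, recording each one on success.

    `fetch` is a hypothetical callable that downloads a single uid and
    raises on failure. Returns the uids fetched during this run.
    """
    done = load_done(manifest)
    fetched = []
    with Path(manifest).open("a") as f:
        for uid in uids:
            if uid in done:
                continue  # finished in a previous run
            fetch(uid)
            # Record progress right away instead of once at the very end.
            f.write(json.dumps({"uid": uid}) + "\n")
            f.flush()
            os.fsync(f.fileno())
            fetched.append(uid)
    return fetched
```

With this shape, rerunning the same command after a failure or a kill would skip everything already listed in the manifest and continue with the remaining files.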

@v-iashin
Author

v-iashin commented Feb 22, 2022

Yes, it finished the first time, but only for the videos that were not missing.

I tried to run the same command again in the same folder, and it seemed to start all over again. I did not check what it did specifically, e.g. whether the version info was written.

I cannot check it anymore as I removed the ego4d_data folder already and started again.
