hrsvrn
Contributor

hrsvrn commented Sep 4, 2025

Context

With reference to PR #9192

Issue

The Google Drive links are broken, so the datasets fail to download.

How to use?

import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)), # Resize to a common size for pre-trained models
    transforms.ToTensor(),         # Convert image to a tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) # Normalize
])


caltech101_dataset = torchvision.datasets.Caltech101(
    root='./data',        # Directory to store the dataset
    download=True,      # Set to True to download if not already present
    transform=transform # Apply the defined transformations
)

caltech256_dataset = torchvision.datasets.Caltech256(
    root='./data',        # Directory to store the dataset
    download=True,      # Set to True to download if not already present
    transform=transform # Apply the defined transformations
)

Fix

Replaced the dead links with downloads from the official Caltech repository site and updated the download logic to match.


pytorch-bot bot commented Sep 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9205

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 3bb37f4 with merge base 7bd8066:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.


meta-cla bot commented Sep 4, 2025

Hi @hrsvrn!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@meta-cla meta-cla bot added the cla signed label Sep 4, 2025

meta-cla bot commented Sep 4, 2025

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

Comment on lines 141 to 144
extracted_dir = os.path.join(self.root, "caltech-101")
extract_archive(os.path.join(extracted_dir, "101_ObjectCategories.tar.gz"), self.root)
extract_archive(os.path.join(extracted_dir, "Annotations.tar.gz"), self.root) # Note: Annotations is now also .tar.gz in the new archive
shutil.rmtree(extracted_dir)
Member

Should we reuse the code from other datasets? For instance, the loop in "mnist.py" extracts every archive it finds in a folder:

for gzip_file in os.listdir(gzip_folder):
    if gzip_file.endswith(".gz"):
        extract_archive(os.path.join(gzip_folder, gzip_file), self.raw_folder)
# Remove the folder only after every archive has been extracted
shutil.rmtree(gzip_folder)
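
For illustration only, here is a minimal standard-library sketch of that pattern applied to a folder of `.tar.gz` archives. It is not the code that was merged: the helper name `extract_all_archives` and its parameters are hypothetical, and `tarfile` stands in for torchvision's `extract_archive`.

```python
import os
import shutil
import tarfile
from typing import List


def extract_all_archives(archive_dir: str, target_dir: str) -> List[str]:
    """Extract every .tar.gz archive in archive_dir into target_dir.

    Hypothetical helper: it mirrors the loop quoted above, but uses the
    standard library instead of torchvision's extract_archive.
    Returns the names of the archives that were extracted.
    """
    extracted = []
    for name in sorted(os.listdir(archive_dir)):
        if name.endswith(".tar.gz"):
            with tarfile.open(os.path.join(archive_dir, name), "r:gz") as tar:
                tar.extractall(target_dir)
            extracted.append(name)
    # Delete the staging folder only once the loop has finished, so that
    # no archive is removed before it has been extracted.
    shutil.rmtree(archive_dir)
    return extracted
```

With this shape, the two hard-coded `extract_archive` calls in the diff would collapse into a single call such as `extract_all_archives(extracted_dir, self.root)`, and any additional archive shipped in the folder would be picked up without further changes.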

Contributor Author

Yes, that works as well. I will change this one :)

@AntoineSimoulin
Member

I realize we now have two issues and two proposed fixes for this problem: #9097 raises a similar issue and #9098 implements a similar fix.

@hrsvrn
Contributor Author

hrsvrn commented Sep 5, 2025

Should I still work on this or not?

@hrsvrn
Contributor Author

hrsvrn commented Sep 6, 2025

@AntoineSimoulin any updates on this one?

@JonasKlotz
Contributor

Hey @hrsvrn
@AntoineSimoulin asked whether it is OK with me if we use your code instead; see my PR #9098.
I am fine with it, your code looks good so far! I don't know how to officially review it!

@AntoineSimoulin
Member

@hrsvrn yes please let's make it to the finish line for this PR! I just adjusted the linting but I think we are pretty close now! @JonasKlotz I will credit you when merging this PR to acknowledge your suggested changes in #9098!

@hrsvrn
Contributor Author

hrsvrn commented Sep 8, 2025

Hey @AntoineSimoulin and @JonasKlotz

Sorry for the late reply.
It's totally okay to use my code.
You can merge the PR :)

@AntoineSimoulin AntoineSimoulin merged commit cdc1fee into pytorch:main Sep 9, 2025
59 of 60 checks passed

github-actions bot commented Sep 9, 2025

Hey @AntoineSimoulin!

You merged this PR, but no labels were added.
The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
