Some doubts #3

Closed
CSer-Tang-hao opened this issue Sep 20, 2019 · 6 comments

Comments

@CSer-Tang-hao

Hi, I really appreciate your great work! But I have some doubts about the details.

  1. For Mini-ImageNet, in torchmeta.datasets.helpers, I see that you use Resize(84). Why not use data augmentation such as RandomCrop(84)? In addition, the default transform is only Resize and ToTensor. Why is there no Normalize?
  2. When we run your code on Mini-ImageNet, I find that the GPUs take up a lot of memory compared to normal code with the same settings; a single 1080Ti is not enough. Why?
    Looking forward to your reply, thank you!
@tristandeleu
Owner

Thank you very much for the kind words!

  1. The default processing for Mini-ImageNet in helpers is based on how the dataset was processed in (Ravi et al., 2017), where the images are only scaled to 84x84. That's why there is no RandomCrop, nor Normalize: to keep it as close as possible to the original dataset. However, if normalization and/or random crops are used more often in recent papers, I'd be happy to change these defaults accordingly! Nonetheless, you can specify your own transform if you want to change the preprocessing, even in the helper function (just use the transform=Compose([...]) argument; see the sketch after this list).
  2. I haven't faced this issue while using the dataloaders on GPU, but I will investigate that! This looks like it might be a memory leak?
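
For reference, here is a minimal sketch (not part of the original comment; the normalization statistics are just the usual ImageNet values, used as an illustrative assumption) of passing a custom transform to the miniimagenet helper:

from torchvision.transforms import Compose, Normalize, Resize, ToTensor
from torchmeta.datasets.helpers import miniimagenet

# Custom preprocessing: resize to 84x84, convert to tensor, then normalize.
transform = Compose([
    Resize(84),
    ToTensor(),
    Normalize(mean=[0.485, 0.456, 0.406],   # illustrative ImageNet statistics
              std=[0.229, 0.224, 0.225]),
])

# The helper accepts a transform argument and forwards it to the dataset.
dataset = miniimagenet("data", shots=5, ways=5, shuffle=True,
                       meta_train=True, transform=transform, download=True)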

@CSer-Tang-hao
Author

I'm a newbie; maybe my code using Torchmeta is non-standard? Could you please test the GPU memory usage of the Prototypical Networks example on Mini-ImageNet? I wonder if it is a bug in my own code. Thank you!

@CSer-Tang-hao
Author

I can't see the length of BatchMetaDataLoader. I guess that args.num_batches in the example is equal to the number of episodes, where one episode contains batch_size tasks (K-way N-shot), is that right? I also want to know: is each iteration of BatchMetaDataLoader random? If so, can it traverse the entire training dataset?

@tristandeleu
Owner

I was indeed able to reproduce the memory issue. This is very likely due to the size of the tensors being stored in memory ((16, 25, 3, 84, 84) & (16, 75, 3, 84, 84), if this is 5-way 5-shot with 15 test images). Here are a couple of solutions:

  • Reduce the batch size. If I use batch_size=8 instead of 16, it fits into my GPU (with 12GB of memory).
  • Reduce the number of test images. test_shots=15 is used for Omniglot, but not necessarily for Mini-ImageNet. If I use test_shots=None (i.e. the same number as shots), then it also fits in memory.
  • Be sure to also update the model so that it outputs embeddings with the correct embedding size. I updated it with a simple mean pooling over the final representations, but adding extra convolutions might be better. If you don't do that, the embedding will have size 1600 instead of 64.

Here is the diff for my test on Prototypical Networks:

diff --git a/examples/protonet/model.py b/examples/protonet/model.py
index 8d64a83..a4ce1e6 100644
--- a/examples/protonet/model.py
+++ b/examples/protonet/model.py
@@ -24,4 +24,5 @@ class PrototypicalNetwork(nn.Module):

     def forward(self, inputs):
         embeddings = self.encoder(inputs.view(-1, *inputs.shape[2:]))
+        embeddings = embeddings.mean([2, 3])
         return embeddings.view(*inputs.shape[:2], -1)
diff --git a/examples/protonet/train.py b/examples/protonet/train.py
index 1f89913..8cc7cd3 100644
--- a/examples/protonet/train.py
+++ b/examples/protonet/train.py
@@ -2,19 +2,20 @@ import os
 import torch
 from tqdm import tqdm

-from torchmeta.datasets.helpers import omniglot
+from torchmeta.datasets.helpers import omniglot, miniimagenet
 from torchmeta.utils.data import BatchMetaDataLoader

 from model import PrototypicalNetwork
 from utils import get_prototypes, prototypical_loss, get_accuracy

 def train(args):
-    dataset = omniglot(args.folder, shots=args.num_shots, ways=args.num_ways,
-        shuffle=True, test_shots=15, meta_train=True, download=args.download)
+    dataset = miniimagenet(args.folder, shots=args.num_shots, ways=args.num_ways,
+                           shuffle=True, test_shots=None, meta_train=True,
+                           download=args.download)
     dataloader = BatchMetaDataLoader(dataset, batch_size=args.batch_size,
         shuffle=True, num_workers=args.num_workers)

-    model = PrototypicalNetwork(1, args.embedding_size,
+    model = PrototypicalNetwork(3, args.embedding_size,
         hidden_size=args.hidden_size)
     model.to(device=args.device)
     model.train()

Regarding your second question, you are right: in the protonet example, num_batches corresponds to the number of episodes, and each episode contains batch_size tasks. The iterations of BatchMetaDataLoader are not random by default, but you can use the argument shuffle=True to get random batches. It could traverse the entire dataset, but since the size of the dataset is combinatorial (all possible tuples of classes), that would be very impractical. One workaround is to stop iterating over BatchMetaDataLoader after a certain number of batches, as in the protonet example (see the sketch below).
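
To illustrate that workaround, here is a minimal sketch (assumed argument values, not the exact example script) that stops iterating over BatchMetaDataLoader after a fixed number of batches:

from torchmeta.datasets.helpers import miniimagenet
from torchmeta.utils.data import BatchMetaDataLoader

dataset = miniimagenet("data", shots=5, ways=5, shuffle=True,
                       meta_train=True, download=True)
# shuffle=True yields random batches of tasks; each batch contains batch_size tasks.
dataloader = BatchMetaDataLoader(dataset, batch_size=8, shuffle=True,
                                 num_workers=4)

num_batches = 100  # stop after this many batches instead of exhausting the loader
for batch_idx, batch in enumerate(dataloader):
    if batch_idx >= num_batches:
        break
    # Each batch is a dict with "train" and "test" splits, with inputs of shape
    # (batch_size, num_examples, channels, height, width).
    train_inputs, train_targets = batch["train"]
    test_inputs, test_targets = batch["test"]
    # ... one meta-training step on this batch of tasks ...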

@CSer-Tang-hao
Author

I have to admit that I really appreciate this project. I think that if you want to enable seamless and consistent evaluation of meta-learning algorithms, you should validate Torchmeta: show us the results of the examples for different algorithms so that a fair comparison can be made. I tested Torchmeta on Mini-ImageNet using your provided protonet example, and the results I obtained are much higher than the values reported in the original paper, which surprised me. I will continue to follow this project, and we look forward to your comprehensive experimental verification!

@tristandeleu
Owner

Thank you! Having the results for the examples in the repo would be great, I will definitely try to have them available.
