custom Data #44

Closed
Kunika05 opened this issue Jun 2, 2022 · 6 comments
Labels
question Further information is requested

Comments

@Kunika05

Kunika05 commented Jun 2, 2022

I want to train this model on custom data, but I did not understand the split for CUB, and I could not find any documentation on EasySet. Do you know where it is?
I just have 2 classes in my data, by the way.

@Kunika05 Kunika05 added the question Further information is requested label Jun 2, 2022
@ebennequin
Collaborator

Hi! Your question is not entirely clear to me. Do you mean to use CUB or custom data?

In any case, the EasySet class, along with its docstring, is here. Following this doc, if you mean to use custom data with two classes, you need to provide a JSON specification file like this:

        {
            "class_names": [
                "class_1_name",
                "class_2_name"
            ],
            "class_roots": [
                "path/to/class_1_folder",
                "path/to/class_2_folder"
            ]
        }

All images for the first class must be located in class_1_folder, and likewise for the second class in class_2_folder.
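As a minimal sketch of how such a specification file could be generated rather than written by hand: the helper name `write_specs_file`, the class names and the folder paths below are hypothetical and only illustrate the two keys shown above (`class_names` and `class_roots`).

```python
import json
from pathlib import Path


def write_specs_file(class_folders, output_path):
    """Write a JSON specification file with "class_names" and "class_roots".

    class_folders: dict mapping a class name to the folder holding its images.
    This helper is hypothetical; it is not part of the library.
    """
    specs = {
        "class_names": list(class_folders.keys()),
        "class_roots": [str(folder) for folder in class_folders.values()],
    }
    Path(output_path).write_text(json.dumps(specs, indent=4))
    return specs
```

For example, `write_specs_file({"class_1_name": "path/to/class_1_folder", "class_2_name": "path/to/class_2_folder"}, "train.json")` would produce a file with the same structure as the one above.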

Did I answer your question?

@Kunika05
Author

Kunika05 commented Jun 7, 2022

        Epoch 0
        Traceback (most recent call last):
          File "/mnt/Data/Kunika/easy-few-shot-learning/episodic_training1.py", line 187, in <module>
            average_loss = training_epoch(few_shot_classifier, train_loader, train_optimizer)
          File "/mnt/Data/Kunika/easy-few-shot-learning/episodic_training1.py", line 148, in training_epoch
            enumerate(data_loader), total=len(data_loader), desc="Training"
          File "/mnt/Data/Kunika/venv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 355, in __iter__
            return self._get_iterator()
          File "/mnt/Data/Kunika/venv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 301, in _get_iterator
            return _MultiProcessingDataLoaderIter(self)
          File "/mnt/Data/Kunika/venv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 940, in __init__
            self._reset(loader, first_iter=True)
          File "/mnt/Data/Kunika/venv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 971, in _reset
            self._try_put_index()
          File "/mnt/Data/Kunika/venv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1205, in _try_put_index
            index = self._next_index()
          File "/mnt/Data/Kunika/venv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 508, in _next_index
            return next(self._sampler_iter) # may raise StopIteration
          File "/mnt/Data/Kunika/easy-few-shot-learning/easyfsl/samplers/task_sampler.py", line 61, in __iter__
            for label in random.sample(self.items_per_label.keys(), self.n_way)
          File "/home/kunika/.conda/envs/kunika/lib/python3.9/random.py", line 449, in sample
            raise ValueError("Sample larger than population or is negative")
        ValueError: Sample larger than population or is negative

This error occurs on the CUB data only.
Will it work for custom data?
Any suggestions for solving this one?

@ebennequin
Collaborator

This error occurs when the number of items in a class is smaller than n_shot + n_query. It also occurs when your EasySet is empty, e.g. if it didn't find any files in the specified root directory (the error message is very unclear; this should be fixed by this commit from the latest merge).

CUB is built on top of EasySet; its only job is to locate the specification files in data/CUB (see here).
I suggest you check that CUB's images are indeed located in the folders listed in the specification files (e.g. train.json).
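One way to run this check is sketched below. It assumes the specification file has the `class_names` / `class_roots` structure shown earlier in this thread; the helper name `count_images_per_class` and the list of image extensions are assumptions for illustration, not part of the library. Relative paths in `class_roots` are resolved against the current working directory.

```python
import json
from pathlib import Path

# Assumption: these extensions are only a plausible set for illustration.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(specs_path):
    """Return {class_name: number of image files found in its root folder}.

    A missing folder is reported as 0 images, which is the situation
    that leads to an empty dataset.
    """
    specs = json.loads(Path(specs_path).read_text())
    counts = {}
    for name, root in zip(specs["class_names"], specs["class_roots"]):
        root_path = Path(root)
        if not root_path.is_dir():
            counts[name] = 0
            continue
        counts[name] = sum(
            1
            for file in root_path.iterdir()
            if file.is_file() and file.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Any class reported with 0 images points to a path in the specification file that does not match where the images actually are.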

@Kunika05
Author

Kunika05 commented Jun 8, 2022

I am now trying this on my own dataset. I originally had only 2 classes, but I split them into 6 classes: a train, a test and a val version of each class, with different data in each. I am still facing the same issue. In case the number of images per class matters: the first class had 144 images in total, divided into 28 for testing, 28 for validation and 140 for training.
The second class had 1976 images in total, divided into 197 for testing, 197 for validation and 1745 for training. Can you suggest how I could solve the error I pasted earlier?

@ebennequin
Collaborator

There are two possible causes for this error:

  • there is a class with a number of items smaller than n_shot + n_query
  • your number of classes is smaller than n_way

In your case, you'll have to specify n_way=2 when building the TaskSampler object. Is that the case?
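Both conditions can be checked before training starts. The sketch below is a standalone illustration that works on a plain list of integer labels; the function name `check_episode_feasibility` is hypothetical and not part of the library.

```python
from collections import Counter


def check_episode_feasibility(labels, n_way, n_shot, n_query):
    """Return a list of human-readable problems; an empty list means OK.

    labels: one integer label per item in the dataset.
    """
    counts = Counter(labels)
    problems = []
    # Cause 2: fewer classes than n_way, so n_way distinct labels
    # cannot be sampled for an episode.
    if len(counts) < n_way:
        problems.append(
            f"dataset has {len(counts)} classes but n_way={n_way}"
        )
    # Cause 1: a class too small to provide n_shot support items
    # plus n_query query items.
    for label, count in sorted(counts.items()):
        if count < n_shot + n_query:
            problems.append(
                f"class {label} has {count} items but "
                f"n_shot + n_query = {n_shot + n_query}"
            )
    return problems
```

With two classes, any n_way greater than 2 is reported as a problem, which matches the `random.sample` call in the traceback above failing because it is asked for more labels than exist.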

Also, it seems that your setting (2 classes, same classes for training and testing, many examples per class) is far from the standard Few-Shot Learning setting. May I ask why you're using FSL on this dataset?

@Kunika05
Author

Kunika05 commented Jun 8, 2022

The data for one of the classes is scarce, so I thought I should give few-shot learning a try. Can you suggest something that could help in my case?
Where in the episodic training notebook should I change n_way to 2?
And regarding the first point, where do I have fewer items than n_shot + n_query?
