How to split the dataset #4

Open
WXIAO-TJ opened this issue Nov 16, 2019 · 1 comment

Comments

@WXIAO-TJ

Hello, I would like to know how you split the "caltech-256" dataset into 16980 training images and 5120 testing images. Alternatively, where can I download the already split dataset directly? I would appreciate it if you could tell me.

@vietvo89

Hello, I have the same concern.

  1. After downloading caltech256, I see 257 sub-folders in total and one .txt file listing all the folder names. So I guess that to train a classifier on caltech256, I need to create the training and evaluation sets manually. Is that right? If so, how did you build separate training and validation sets from caltech256?

  2. In your GitHub repository, I could not find "train_no_resizing" and "val", but found "train_metadata.csv" and "**val_metadata.csv**" instead. Is it right to use these files and change these lines in the code?

train_folder = ImageFolder(data_dir + 'train_no_resizing', train_transform)
val_folder = ImageFolder(data_dir + 'val', val_transform)

  3. I ran the code, but it raised an error because data_dir is hard-coded as data_dir = '/home/ubuntu/data/'. If I need to change it, should it point to the folder containing caltech256, or somewhere else? That does not seem right, but I do not know.

Thanks
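
For anyone landing here with the same question: the thread does not say how the repository's 16980/5120 split was produced, so the following is only a minimal sketch of one common approach, not the author's method. It assumes the extracted dataset has one sub-folder per class, and it holds out a fixed number of images per class for validation. The default of 20 per class is a guess (5120 = 256 × 20); the function name `split_dataset` and its parameters are hypothetical. The output folder names match the ones used in the snippet above, so `data_dir` would then point at the destination folder.

```python
import random
import shutil
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def split_dataset(src_dir, dst_dir, val_per_class=20, seed=0,
                  train_name="train_no_resizing", val_name="val"):
    """Copy every class folder in src_dir into dst_dir/<train_name>
    and dst_dir/<val_name>. Returns (n_train, n_val).

    val_per_class=20 is an assumption, not a value confirmed by the repo.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_train = n_val = 0

    # Only directories count as classes; the stray .txt file is ignored.
    for class_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir()
                        if f.suffix.lower() in IMAGE_EXTENSIONS)
        rng.shuffle(images)
        val_images = images[:val_per_class]
        train_images = images[val_per_class:]

        for subset, files in ((train_name, train_images),
                              (val_name, val_images)):
            out_dir = dst_dir / subset / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out_dir / f.name)

        n_train += len(train_images)
        n_val += len(val_images)
    return n_train, n_val
```

Calling `split_dataset('/path/to/caltech256', '/home/ubuntu/data')` (both paths are placeholders) would create `/home/ubuntu/data/train_no_resizing/<class>/` and `/home/ubuntu/data/val/<class>/`, which is the layout `ImageFolder` expects. This sketch puts all remaining images into the training set, so the resulting counts will not necessarily match 16980; reproducing that number would need the repository's own split rule or its metadata CSV files.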
