Noisy Imagenette/Woof Proposal #40
Comments
As far as URLs are concerned I think @jph00 can upload this to the appropriate location https://github.com/tmabraham/noisy_imagenette/blob/main/noisy_imagenette.csv Tanishq, do you want to start with a proposal on which leaderboards you think might be best to keep or highlight? You would probably know better than me at this point. |
> As far as URLs are concerned I think @jph00 can upload this to the appropriate location https://github.com/tmabraham/noisy_imagenette/blob/main/noisy_imagenette.csv
I think it's more than just uploading the CSV - I think we should update the tgz to include the csv, and update the example training script to use the csv too. That way people can easily change to different labels by just changing the column they select in the csv.
I can provide the upload details when you're ready to upload.
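(For illustration, a minimal sketch of what that column switching could look like; the CSV name and column names follow tmabraham's repo linked above, but the training call itself is an assumption, not the official example script:)

```python
from fastai.vision.all import *

# Sketch: switching noise levels is just switching the label column.
path = untar_data(URLs.IMAGENETTE_160)
df = pd.read_csv(path/'noisy_imagenette.csv')

dls = ImageDataLoaders.from_df(
    df, path,
    fn_col='path',               # image path, relative to `path`
    label_col='noisy_labels_5',  # swap in 'noisy_labels_1', ..., 'noisy_labels_50'
    valid_col='is_valid',
    item_tfms=Resize(128))

learn = cnn_learner(dls, resnet18, metrics=accuracy)
```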
|
@hamelsmu The README has some leaderboards that we could use. I basically kept size 128, and tried both 20 epochs and 200 epochs for each noise level. My reasoning was that it was worth having a somewhat short training run (20 epochs) and a somewhat long training run (200 epochs) for each noise level. I kept size 128 because I assume any observations will transfer over to larger image sizes, and 128x128 images are easier/faster to work with. Of course, if you and Jeremy don't agree, I am willing to change the main leaderboards if necessary. @jph00 I have already provided a training example that works with the CSV and the only thing that would need to be changed is the path once the CSV is added to the tgz and uploaded. |
Great work @tmabraham! There are usually two types of noise benchmarked in the literature. What we have so far is:
|
Which is more common in the literature? I assume we should follow whatever is standard.
|
Usually papers report results on both, for example:
from SELF. Reviewers also care about real-world noise.
from ELR. As a practitioner, I would care whether the benchmarks transfer to the real world or not. Since it's probably not possible to have real-world noise for woof and nette, we can probably go with symmetric and asymmetric for now. Otherwise, we could create mini versions of Clothing1M and WebVision.
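(For readers new to these settings, a minimal sketch of the usual definitions from the literature; illustrative only, not code from this repo:)

```python
import numpy as np

def symmetric_noise(labels, p, n_classes, seed=0):
    """With probability p, replace each label with a uniformly random
    *different* class (uniform/symmetric noise)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    for i in np.flatnonzero(rng.random(len(labels)) < p):
        labels[i] = rng.choice([c for c in range(n_classes) if c != labels[i]])
    return labels

def asymmetric_noise(labels, p, confusion, seed=0):
    """With probability p, replace each label with a fixed plausible class
    from `confusion` (e.g. {cat: dog}), i.e. class-dependent noise."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    for i in np.flatnonzero(rng.random(len(labels)) < p):
        labels[i] = confusion[labels[i]]
    return labels
```
|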
I think it's up to @jph00 to decide if we should include asymmetric noise. My only concern is that it adds another set of leaderboards and it might be too much. But if we decide to include it, I can make a version of Imagenette with asymmetric noise and add it to my |
I'd suggest not including it.
|
Once the leaderboard is official, I can work with @tmabraham to test out a few papers that we implemented as callbacks and include them as examples/initial results. |
I'm starting to work on this now, sorry for the delay. I was a bit lost on some things and @jph00 clarified what I needed to do. I'm going to start by familiarizing myself with the leaderboard and the code that uses it so I understand what is going on, and then move forward from there. I am not available next week at all, so this may take me a bit. I'll post updates on this issue as I make progress. |
@hamelsmu Thanks for the update! Let me know if you have any questions. |
Just for my own background learning, can someone educate me why this is not allowed:
Also what is the last bit about the validation size > train size? Someone would make the validation size greater than the training set size? Why would they do that? |
@hamelsmu I think this is referring to the image size in the validation set being larger than for the training set. I think there are some papers suggesting there can be an improvement in this case, but I don't remember the details now. This is from the original Imagenette repo, so hopefully @jph00 can confirm. |
I think you are referring to the FixRes paper; they argue that training with a lower image size than the test size improves results. In general, excluding all the test-time tricks is meant to allow a fair comparison of different training approaches. Also, TTA has randomness in it, which adds extra noise to the comparison.
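(To make the TTA point concrete, a hedged fastai sketch, with `learn` standing in for any trained model from this thread:)

```python
# Plain validation: deterministic for fixed weights and data.
loss, acc = learn.validate()

# Test-time augmentation: predictions are averaged over n randomly
# augmented copies of each image, so the score varies between runs,
# which is part of why the leaderboards exclude it.
preds, targs = learn.tta(n=4)
print(accuracy(preds, targs))
```
|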
@tmabraham do you have a csv file for imagewoof as well? I only see one for imagenette. Thanks for your help. |
Sorry, I was ignoring your training script; that was my mistake! Also, I forgot Jeremy said to add only two new leaderboards, 10% and 50% noise for Imagenette, and no Imagewoof. |
@hamelsmu Okay thanks for letting me know. Do you need me to re-generate the CSV then? I don't have 10% in the CSV. Also, I would have to rerun the baselines. Are we sticking to size 128 and 20/200 epochs, giving a total of 4 leaderboards? |
Notes To Reproduce My Work

Taking some notes in this issue because I will probably be asked to tweak this one day and I don't want to forget what I did.

Step 1: Download all the files and pre-process the dataset

See this repo for the below code in this step:

```python
from fastai.basics import *
from fastai.vision.all import *
from fastai.callback.all import *
from fastai.distributed import *
from fastprogress import fastprogress
from torchvision.models import *
from fastai.vision.models.xresnet import *
from fastai.callback.mixup import *
from fastcore.script import *
_all_urls = [URLs.IMAGENETTE, URLs.IMAGENETTE_160, URLs.IMAGENETTE_320,
             URLs.IMAGEWOOF, URLs.IMAGEWOOF_160, URLs.IMAGEWOOF_320]
paths = parallel(untar_data, _all_urls, threadpool=True)
nurl = 'https://raw.githubusercontent.com/tmabraham/noisy_imagenette/main/noisy_imagenette.csv'
wurl = 'https://raw.githubusercontent.com/tmabraham/noisy_imagenette/main/noisy_imagewoof.csv'
npath = untar_data(URLs.IMAGENETTE_160)
wpath = untar_data(URLs.IMAGEWOOF_160)
wdf = pd.read_csv(wurl)
ndf = pd.read_csv(nurl)
def get_lbl(p, path=npath):
    pth = (path/p)
    assert pth.exists()
    return pth.parent.name
# Add column for zero noise
ndf['noisy_labels_0'] = ndf.path.apply(get_lbl)
wdf['noisy_labels_0'] = wdf.path.apply(partial(get_lbl, path=wpath))
cols = ['path', 'noisy_labels_0', 'noisy_labels_1', 'noisy_labels_5', 'noisy_labels_25','noisy_labels_50', 'is_valid']
ndf = ndf[cols].set_index('path')
wdf = wdf[cols].set_index('path')
# save noise csvs into appropriate folders
for p in paths:
    if 'woof' in p.name: wdf.to_csv(p/'noisy_imagewoof.csv')
    else: ndf.to_csv(p/'noisy_imagenette.csv')
```

This code ends up putting the appropriate CSV file in each directory like below. This will be the directory structure of the new files. Note: I know we are not having an Imagewoof leaderboard for noise, only Imagenette, but I'm including the Imagewoof CSV files for good measure in case people want to have a go at them for their own entertainment.
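(The resulting layout, assuming the standard fastai folder names; reconstructed from the code above, not copied from the original listing:)

```
imagenette2/    imagenette2-160/    imagenette2-320/
├── noisy_imagenette.csv
├── train/
└── val/

imagewoof2/     imagewoof2-160/     imagewoof2-320/
├── noisy_imagewoof.csv
├── train/
└── val/
```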
Step 2: I uploaded the files using the following bash script

```bash
for d in */ ; do
    tar czf "${d%?}".tgz $d
done

awscp ()
{
    aws s3 cp $1 s3://fast-ai-$2/ --grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers
}

for d in *.tgz; do
    awscp $d "imageclas"
done
```

Here are the logs from this (this took about 20 minutes):
Step 3: I confirmed that the files are on S3

Note: we are looking for the date.
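(One way to check, a sketch assuming the same bucket name as in the upload script above:)

```bash
# List the uploaded archives with their last-modified dates
aws s3 ls s3://fast-ai-imageclas/ | grep -E 'imagenette|imagewoof'
```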
Step 4: I made changes to the training script in
|
@tmabraham no, no need. I'm going with 5% instead of 10% (I talked to Jeremy about this). Not sure about the size and epochs part, but I will ask about that. Thanks for bringing that up. |
@tmabraham I'm thinking I'll just use the same size and epochs as the non-noisy LB. LMK if you think this sounds like a bad idea for some reason. |
@hamelsmu you'll need to update the hashes in fastai too. If you delete the imagenette tgzs from your .fastai/archive directory and your data directory, then call
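(A sketch of that cache-clearing step, assuming the default ~/.fastai locations and the standard archive names; the truncated sentence presumably ends with calling untar_data, which would then re-download and check against the new hashes:)

```bash
# Remove the cached archives and extracted data so fastai re-downloads them
rm ~/.fastai/archive/imagenette2*.tgz ~/.fastai/archive/imagewoof2*.tgz
rm -rf ~/.fastai/data/imagenette2* ~/.fastai/data/imagewoof2*
```
|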
Done in fastai/fastai#3210 |
Ok I believe this is done. Thanks to @tmabraham for the extensive testing. |
Noisy Imagenette/Woof
Introduction
Most of the time, dataset labels are actually quite noisy, as the humans generating those labels are error-prone. This is especially the case for labels generated through crowdsourcing. Recently, there has been significant research on dealing with noisy labels in datasets and training deep learning models that are robust to noise (e.g. here, here, here, here, here, and here). It would be great to be able to implement some of these techniques in fastai (some work has already been done on this front, e.g. here) and test them on a benchmarking dataset.
Proposal
I propose adding to this dataset/repository a corrupted version of Imagenette and ImageWoof in which the training labels are switched randomly with varying probabilities to simulate datasets with different levels of noise. This dataset is currently available here. The images themselves are the same as in Imagenette, but the labels are instead provided in a CSV file. There are four noise levels: 1%, 5%, 25%, and 50%. The code used to generate these labels is provided here. A baseline based on the original Imagenette baseline is provided as well.
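(As a quick sanity check on those noise levels, one can compare each noisy column against the label implied by the image's folder name; a sketch, with column names taken from the CSV in the repo:)

```python
from pathlib import Path
import pandas as pd

df = pd.read_csv('noisy_imagenette.csv')  # from the repo linked above
train = df[~df.is_valid]
true = train.path.map(lambda p: Path(p).parent.name)  # true class = folder name
for col in ['noisy_labels_1', 'noisy_labels_5', 'noisy_labels_25', 'noisy_labels_50']:
    print(col, f'{(train[col] != true).mean():.1%}')  # expect roughly 1%, 5%, 25%, 50%
```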
Note that in this imagenette repository there are currently 24 leaderboards (12 for Imagenette and 12 for ImageWoof). Since there are 4 noise levels, there is a total of 96 possible leaderboards. I have run my baseline for all 96 possibilities (done automatically on 4 TITAN RTXs with this bash script) and provided an extended leaderboard over here. I have also selected 16 leaderboards that I have kept on the README. It is up to the maintainers (Jeremy and Hamel) to decide which leaderboards they want to keep.

What needs to be done
These are the things I believe need to be done for this dataset to be added to Imagenette:
@jph00 and @hamelsmu Please review my repository and let me know if you think there are any additional tasks that I would need to do.