Plant Category Name #11

Open
nomoneyExpection opened this issue May 19, 2023 · 14 comments

Comments

@nomoneyExpection

Are you sure that the .json file is correct? When I ran the demo, the ID of the first image was shown as 4, but an error was reported saying that this key has no corresponding value. Also, if possible, could you upload a .txt version where each key is matched with its value? Additionally, from the code, there are over 1080 classes in total, right?

@maximiliense
Collaborator

There are 1081 classes in total. None of them has ID 4. The class IDs start at 1355868 and end at 1718287. However, note that deep learning frameworks label classes from 0 to the number of classes minus one (here 1080). I guess that 4 means it is the $4^{th}$ class in the deep learning framework's indexing. You should check the mapping from "PyTorch id" to "dataset id". It is given in the attributes returned by the get_data function.
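For reference, one way to recover that mapping is to rebuild it from the class folder names with torchvision's ImageFolder (a minimal sketch; the dataset path is an assumption and should be adapted):

```python
# Sketch: rebuild the "PyTorch index" -> "dataset/species id" mapping from folder names.
# The path below is an assumption; point it at your extracted Pl@ntNet-300K training folder.
from torchvision.datasets import ImageFolder

train_set = ImageFolder("plantnet_300K/images/train")
# class_to_idx maps each folder name (the species id, e.g. "1355868") to an index in 0..1080
idx_to_species_id = {idx: species_id for species_id, idx in train_set.class_to_idx.items()}
print(idx_to_species_id[4])  # the dataset id behind the framework's class index 4
```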

@Milk-Cool

Milk-Cool commented Aug 18, 2023

> It is given in the attributes returned by the get_data function.

It requires folders 'train', 'val' and 'test', which are not referenced by any other scripts. How do I get these?

@Milk-Cool

Alright so I did a bit of searching and found out that you'd have to split the dataset into these three folders. Do I need to use the splitfolders library here or should I use something else?

@Milk-Cool

Also, something's wrong with the pretrained model. Here's my code: https://pastebin.com/jvFjJsxc (don't look at how I tried mapping indices to ids, it probably won't work). It always returns index 204 for some reason. Or maybe it's just me not using the model properly. Should I create an issue on this?

@Milk-Cool

@maximiliense Sorry for pinging, but it's been three days and you still haven't answered, so I have no clue whether you've seen this or not.

@garcinc
Collaborator

garcinc commented Aug 29, 2023

> It requires folders 'train', 'val' and 'test', which are not referenced by any other scripts. How do I get these?

These are folders that already exist when you download the dataset, so you don't have to create them.
Make sure to download the latest version from Zenodo: https://zenodo.org/record/5645731
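As a quick sanity check after extracting the archive, something like this should list the split folders (a sketch; the images/ root is an assumption based on this thread):

```python
# Sketch: check that the train/val/test splits are present after extracting the Zenodo archive.
# The "plantnet_300K/images" root is an assumption; adjust it to your extraction path.
from pathlib import Path

root = Path("plantnet_300K/images")
for split in ("train", "val", "test"):
    n_classes = sum(1 for p in (root / split).iterdir() if p.is_dir())
    print(f"{split}: {n_classes} species folders")  # should be close to the 1081 species
```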

@garcinc
Collaborator

garcinc commented Aug 29, 2023

> Also, something's wrong with the pretrained model. Here's my code: https://pastebin.com/jvFjJsxc (don't look at how I tried mapping indices to ids, it probably won't work). It always returns index 204 for some reason. Or maybe it's just me not using the model properly. Should I create an issue on this?

Which images are you testing on? Are these images in the test set? How many images did you try? Thanks

@Milk-Cool

> These are folders that already exist when you download the dataset, so you don't have to create them.
> Make sure to download the latest version from Zenodo: https://zenodo.org/record/5645731

Thank you so much for replying! These folders did not show up in the archive preview, so I was a little confused there. I will try downloading the dataset.

@Milk-Cool

> Which images are you testing on? Are these images in the test set? How many images did you try? Thanks

I am testing on the images from the README of this repository, plus one image of my plant. I'll send it when I get home.

@Milk-Cool

[image attachment: the plant photo mentioned above]

@garcinc
Collaborator

garcinc commented Aug 30, 2023

Hi @Milk-Cool,

I have the same behaviour as you on these images.
However, when I check the performance of the pre-trained model (resnet18) on the test set of Pl@ntNet-300K, I get back the results reported in the paper (i.e., ~70% accuracy on the whole test set).
Keep in mind that the macro-average accuracy, which is approximately 30%, is much lower than the plain accuracy; this means that if you draw a class at random and then an image in that class, the model will be right only about 30% of the time. This is because the classes in the long tail are very hard to discriminate. This is explained more thoroughly in the paper, which I encourage you to read.
This could explain what you observe. If you look at the top-5 predictions for the images you are talking about, you will see that they are different. The model might predict class 204 often because it is an over-represented class in the training set.
To evaluate the model, I encourage you to test it on a large set of images belonging to the 1081 species covered by the dataset.
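If it helps, here is a minimal sketch for looking at the top-5 predictions on a single image (the checkpoint filename, its "model" key and the preprocessing are assumptions here, not necessarily the exact released setup):

```python
# Sketch: top-5 predictions of the pretrained resnet18 on one image.
# The checkpoint name, its "model" key and the normalization values are assumptions.
import torch
from PIL import Image
from torchvision import models, transforms

model = models.resnet18(num_classes=1081)
checkpoint = torch.load("resnet18_weights_best_acc.tar", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("my_plant.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = torch.softmax(model(img), dim=1)
top5_probs, top5_idx = probs.topk(5, dim=1)
print(top5_idx.squeeze().tolist(), top5_probs.squeeze().tolist())
# Map each index back to a species id with a class_to_idx-style mapping (see earlier in the thread).
```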
I hope this helps.

Bests

@Milk-Cool

Thanks for the answer @garcinc! That explains pretty much everything.

@dazmashaly

@Milk-Cool did you ever get better results?

@Milk-Cool

> @Milk-Cool did you ever get better results?

No, not really, I'm too lazy to train the model tbh.
