lq, hq data split #2

Closed

Nimbus1997 opened this issue Dec 6, 2022 · 17 comments
@Nimbus1997

Hello, I came again 😁

I was looking at your paper and have a question about how you split the LQ and HQ data. As mentioned in your paper and your GitHub repo, you split the HQ and LQ data by the EyeQ/MCF-Net quality grade: LQ for "usable" and HQ for "good".
But in the table in the paper, the Original FIQA is not zero, which means some pictures are graded as "good".
How is this possible?
Did you use the quality level from "EyeQ/data/Label_EyeQ_train.csv"? I did, and found that some images labeled "usable" are predicted as "good" or "reject" when I test them with MCF-Net.


@QtacierP
Owner

QtacierP commented Dec 6, 2022

Aha, there must be some problems/misunderstandings. Firstly, the data split follows the ground-truth labels (the CSV file you mentioned), not the MCF-Net predictions. Moreover, it would be surprising if no data were classified wrongly, since the accuracy of MCF-Net on the test set is not 100% (it is around 80%~90%, and "usable" is the most challenging grade). Make sure you are testing on the EyeQ test set, and then calculate the ACC on this dataset.
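A minimal sketch of that check, not the authors' actual script: the column names ("image", "quality"), the 0/1/2 grade encoding, and the predictions file name are assumptions, so adjust them to your setup.

```python
import pandas as pd

# Ground-truth grades from EyeQ (assumed encoding: 0 = Good, 1 = Usable, 2 = Reject).
labels = pd.read_csv("EyeQ/data/Label_EyeQ_test.csv")

# Hypothetical CSV holding your MCF-Net predictions, one row per image.
preds = pd.read_csv("mcfnet_predictions.csv")

merged = labels.merge(preds, on="image", suffixes=("_gt", "_pred"))
acc = (merged["quality_gt"] == merged["quality_pred"]).mean()
print(f"overall ACC: {acc:.3f}")

# Per-grade accuracy: "Usable" is expected to be the hardest grade.
for grade, name in [(0, "Good"), (1, "Usable"), (2, "Reject")]:
    subset = merged[merged["quality_gt"] == grade]
    print(name, (subset["quality_pred"] == grade).mean())
```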

@QtacierP
Owner

QtacierP commented Dec 6, 2022

[Screenshot: MCF-Net performance table]
The performance of MCF-Net is shown above. You can see that none of these models achieves 100% ACC. Moreover, classifying "Usable" is the most difficult case, since it is the most ambiguous level, so the performance on "usable" should be lower than the average performance.

@QtacierP
Owner

QtacierP commented Dec 6, 2022

Here are some suggestions for this problem:
(1) First, calculate the ACC on the EyeQ test dataset and compare the results with the original paper. I guess there may be a performance gap between your implementation and the official implementation.
(2) Second, check the weights in the network; make sure you have loaded the correct weights from the pre-trained checkpoint.
(3) Third, check the pre-processing (especially the normalization); make sure it matches the original MCF-Net.
(4) Lastly, FIQA only considers Good vs. Not Good, so the "Reject" grade is not involved. Use torch.argmax() to get the predicted label, count the number of samples graded as "Good" (call it X), and let Y be the total number of samples. FIQA is simply X/Y (see the sketch below).
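A minimal sketch of (4), assuming `logits` is the MCF-Net output of shape (N, 3) and that class index 0 corresponds to "Good" (adjust the index to your checkpoint's class ordering):

```python
import torch

def fiqa(logits: torch.Tensor, good_index: int = 0) -> float:
    """Fraction of images whose predicted grade is "Good" (X / Y)."""
    preds = torch.argmax(logits, dim=1)        # predicted quality grade per image
    x = (preds == good_index).sum().item()     # X: images predicted as "Good"
    y = logits.shape[0]                        # Y: total number of images
    return x / y
```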
I hope these suggestions can help you :)

@Nimbus1997
Author

Nimbus1997 commented Dec 6, 2022

Aha, so the quality labels in "EyeQ/data/Label_EyeQ_train.csv" and "EyeQ/data/Label_EyeQ_test.csv" are the ground truth, not the output of MCF-Net! Thanks a million.

How did you split your data into train/val/test? Your README says "Split the dataset into train/val/test according to the EyePACS challenge", but in the Kaggle challenge I only found a train/test separation 😔

@QtacierP
Owner

QtacierP commented Dec 7, 2022

We followed the dataset split used for the DR classification task in my teammate's work (https://arxiv.org/pdf/2110.14160.pdf). I think you can simply hold out 20% of the training data as the validation set, which should also be fine.
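One way to do such a hold-out (not the authors' exact split): keep 20% of the EyeQ training images as validation, stratified by the quality label so the good/usable ratio is preserved. The file path and the "quality" column name are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed file path and column name; adjust to the actual EyeQ CSV header.
df = pd.read_csv("EyeQ/data/Label_EyeQ_train.csv")

train_df, val_df = train_test_split(
    df, test_size=0.2, stratify=df["quality"], random_state=42)

train_df.to_csv("train_split.csv", index=False)
val_df.to_csv("val_split.csv", index=False)
```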

@Nimbus1997
Author

Aha I see
Thanks a million 😘

@Nimbus1997
Author

Hello again :)

I have more questions about the data split you made. I read the teammate's work you mentioned (https://arxiv.org/pdf/2110.14160.pdf), but found that the total number of images differs from the EyeQ data: 88,702 in EyePACS vs. 23,252 in EyeQ when using only "usable" images for low quality.

So could you share how you split your data (a file-name list for each of train/val/test)? If you can, I will give you my email!
Or could you please tell me how many images are in each train/val/test set (low and high quality separately)?

Thanks a lot as always, you are a huge help to me.

@Nimbus1997 Nimbus1997 reopened this Feb 24, 2023
@QtacierP
Owner

We followed the train/test split provided by the official EyeQ dataset, which can be found at https://github.com/HzFu/EyeQ/tree/master/data. As there was no validation set available, we created one by splitting the training set. We have now uploaded the data split to the repository, so you can access it. :)

@QtacierP
Owner

The label "1" is "good", while label "0" is "usable".
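A small sketch for reading the uploaded split files under that convention; the file path and the "image"/"label" column names are hypothetical, so check the actual CSV header.

```python
import pandas as pd

# Hypothetical path and column names for one of the uploaded split files.
split = pd.read_csv("data/train.csv")

hq = split[split["label"] == 1]["image"]   # label 1 -> "good"   (high quality)
lq = split[split["label"] == 0]["image"]   # label 0 -> "usable" (low quality)
print(len(hq), "HQ images,", len(lq), "LQ images")
```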

@Nimbus1997
Author

Nimbus1997 commented Feb 24, 2023

Oh wow..! 🥺🥺
Thank you so much again

@Nimbus1997
Author

Nimbus1997 commented Mar 4, 2023

Hello, I found that the labels in the CSV file you uploaded are slightly different from the EyeQ version (https://github.com/HzFu/EyeQ).
EyeQ version:
[Screenshot: EyeQ label counts]

Your version:
Good (HQ) = 12,905
Usable (LQ) = 10,347

But the total (good + usable) number is the same (23,252).

Could you please tell me how you got the labels?

@Nimbus1997 Nimbus1997 reopened this Mar 4, 2023
@QtacierP
Owner

QtacierP commented Mar 4, 2023

I used the labels from EyeQ V1, but EyeQ has been updated to V2 now. You can check this branch https://github.com/HzFu/EyeQ/tree/95c63a743a68b1665d7ecb1e050a2d5b4f0f3408 for more details on V1.
I think the V2 version may provide more accurate annotations for image quality assessment :)

@Nimbus1997
Author

Aha I see thanks!

@QtacierP
Owner

QtacierP commented Mar 4, 2023

I apologize for any confusion. Upon reviewing my workspace, I discovered that the number of "good" images is 16818, as compared to 16817 in EyeQ. Additionally, the number of "usable" images is 6436, versus 6435 in EyeQ. It appears that there is only a one-image discrepancy between versions 1 and 2. As the CSV file I uploaded is only utilized for the public split, I will investigate whether there are any issues with these files.
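A hedged sketch for comparing the two label sources when investigating such a discrepancy; the file names and column names below are assumptions, not the authors' actual scripts.

```python
import pandas as pd

# Grade counts in the EyeQ ground-truth files (assumed encoding: 0 = Good, 1 = Usable, 2 = Reject).
eyeq = pd.concat([pd.read_csv("EyeQ/data/Label_EyeQ_train.csv"),
                  pd.read_csv("EyeQ/data/Label_EyeQ_test.csv")])
print(eyeq["quality"].value_counts())

# Grade counts in the uploaded split files (1 = good, 0 = usable); file names are hypothetical.
split = pd.concat([pd.read_csv(p) for p in
                   ["data/train.csv", "data/val.csv", "data/test.csv"]])
print(split["label"].value_counts())
```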

@QtacierP
Owner

QtacierP commented Mar 4, 2023

[Screenshot: label counts printed in the terminal]
After reviewing the code and the counts in the uploaded CSV files, everything appears to be in order. However, it may be prudent to double-check the CSV files to ensure that they are accurate.

@QtacierP
Owner

QtacierP commented Mar 4, 2023

The "bad" label shown in the bash represents the "usable" grades in EyeQ.

@Nimbus1997
Author

I am sorry, it was my mistake.
I had miscounted your dataset; when I counted again, it is the same as the EyeQ set.
