
Request about Release the pre-processed Dataset #3

Closed
xxliang99 opened this issue Mar 29, 2021 · 3 comments

Comments

@xxliang99

Dear Author,
I ran into some problems at step 2, "Prepare the dataset": it is unclear what exactly the .npy files contain (e.g., raw images converted directly, images after some pre-processing, or something else), and there is no reference. Could you please release the re-organized dataset that can be applied directly in training? That is, the "/Dataset" directory with the clientxxx subfolders and the properly placed .npy files inside. The dataset for either task would help a lot. Thank you very much for your attention, and congratulations!

@lichen14

Sorry to say that I ran into the same question during reproduction.
Following references [52, 10, 40], I searched for the relevant raw datasets used in the optic segmentation task. The corresponding datasets are DRISHTI-GS1, RIM-ONE, and REFUGE.
However, the mapping between these datasets and the sites A–D used in the paper is unclear. For example, I can speculate that site A is DRISHTI-GS1, but RIM-ONE has released three versions, with 169, 455, and 159 images respectively. So which one is site B?
Similarly, I have not been able to determine the attribution of sites C and D yet.
Therefore, I have the same problem as @VivianLiang1108, and I hope the author can answer it or provide the correspondence.
If you could directly release the preprocessed dataset, we would be grateful!
@liuquande

@xxliang99
Author

@lichen14
Line 63 of train_ELCFS.py reads `slice_num = np.array([101, 159, 400, 400])`, and the code that follows shows that client_weight depends on the number of samples from each client. From this we can infer that site B is RIM-ONE v3 (159 images), while sites C and D remain unknown (but with 400 samples each).
The support team of the REFUGE dataset rejected my application, and I have not been able to find other sources online. If you have access to REFUGE and know how many images it contains, my guess might help. Hope it works :)
(And I would be extremely grateful for an accessible download link for REFUGE.)
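The weighting inference above can be sketched in a few lines. Note the normalization below is an assumption (a simple proportional weighting); the exact expression in train_ELCFS.py may differ, but the point stands that site B's count of 159 matches RIM-ONE-r3.

```python
import numpy as np

# Per-site sample counts, as quoted from line 63 of train_ELCFS.py
slice_num = np.array([101, 159, 400, 400])

# Assumption: client_weight is proportional to each site's sample count,
# normalized so the weights sum to 1 (the repo's exact formula may differ).
client_weight = slice_num / slice_num.sum()

# Sites C and D (indices 2 and 3) receive equal weight, since both have
# 400 samples; site B's 159 matches the RIM-ONE-r3 release.
print(client_weight)
```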

@liuquande
Owner

Hi Vivian and lichen,

For the fundus datasets, we directly downloaded the data from Fundus.
Detailed information on each dataset can be found in the Supplementary of our arXiv paper.
Among these data, samples of site A are from the Drishti-GS [52] dataset; samples of site B are from the RIM-ONE-r3 [10] dataset; samples of sites C and D are from the training and testing sets of the REFUGE [40] dataset.

Also, thanks @VivianLiang1108 for the clarification.

Moreover, the input ".npy" files in the data loader actually contain these data.
During preprocessing, we simply convert each image from ".png/.jpg" to ".npy" format, in order to speed up data loading and the federated training process.
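A minimal sketch of that conversion step, assuming the .npy files simply hold the raw RGB pixel arrays (the helper name `image_to_npy` and the synthetic image below are illustrative, not from the repo):

```python
import os
import tempfile

import numpy as np
from PIL import Image


def image_to_npy(img_path, npy_path):
    """Load a .png/.jpg image and save its raw RGB pixel array as .npy."""
    arr = np.asarray(Image.open(img_path).convert("RGB"))
    np.save(npy_path, arr)
    return arr


# Demo with a synthetic image standing in for a fundus photo.
tmp = tempfile.mkdtemp()
png_path = os.path.join(tmp, "sample.png")
npy_path = os.path.join(tmp, "sample.npy")
Image.fromarray(
    np.random.randint(0, 256, (384, 384, 3), dtype=np.uint8)
).save(png_path)

arr = image_to_npy(png_path, npy_path)
restored = np.load(npy_path)
assert np.array_equal(arr, restored)  # lossless round-trip
```

Loading a pre-decoded array with `np.load` skips image decoding in every epoch, which is why this speeds up the data pipeline.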
