Training dataset #8
Hello! We originally had our own complex normalization strategy, tailored for SIM, but we received many requests for raw data. Accordingly, we made the raw data public and used it with simpler normalizations. All the paper's results are based on these raw data, which we used for training all our models, so you can ignore the old data from earlier commits. For Widefield, we applied z-score normalization, with the z-score computed across all 360x400 captures. Regarding the preparation code, we have now updated the notebook accordingly. Thank you for your positive feedback. Please use only the raw data, and we hope it will be useful for your work.
Thanks for clarifying this!
Hello @majedelhelou, sorry for reopening the issue. For the LR (Widefield) images, it looks straightforward, since it's just a z-score standardization as you described.
Thanks again 👍
Hello @nmhkahn, yes, that is correct for the LR Widefield images. In the README we note the mean and standard deviation values we computed for the z-score across all 400 samples times 360 FOVs.
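To make the normalization described above concrete, here is a minimal sketch of dataset-wide z-score standardization over a stack of Widefield captures. The function name and the synthetic stack are illustrative; in practice you would plug in the mean and standard deviation values listed in the README rather than recomputing them on a subset.

```python
import numpy as np

def zscore_normalize(images, mean=None, std=None):
    """Z-score normalize a stack of widefield captures.

    images: array of shape (N, H, W). If mean/std are not given,
    they are computed over the entire stack, mirroring the
    dataset-wide statistics described in the README.
    """
    images = images.astype(np.float64)
    if mean is None:
        mean = images.mean()
    if std is None:
        std = images.std()
    return (images - mean) / std

# Synthetic data standing in for the real captures:
stack = np.random.default_rng(0).integers(0, 65535, size=(4, 64, 64))
norm = zscore_normalize(stack)
```

After normalization with stack-wide statistics, the output has (approximately) zero mean and unit standard deviation across the whole stack.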
These points [the effect of using a noisy or noise-free LR Widefield image for normalizing the HR SIM, and the choice of the downsampling kernel] should not matter too much, and you can also use your own strategies for normalizing the HR SIM. The goal is to get a somewhat more uniform intensity distribution for the networks, and small fluctuations in how we achieve that should not have a significant impact. Hope this helps!
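One possible strategy along the lines discussed above is to scale each HR SIM image by the statistics of its corresponding LR Widefield capture, so HR and LR intensities end up on comparable scales. This is only a hedged sketch of one option, not the paper's exact procedure; the function name and shapes are assumptions for illustration.

```python
import numpy as np

def normalize_hr_sim(hr_sim, lr_widefield):
    """Illustrative HR SIM normalization (assumption, not the paper's
    exact method): use the mean/std of the paired LR Widefield image
    so the HR target lives on a scale comparable to the LR input.
    As noted by the maintainer, small variations here should not
    matter much.
    """
    mu = lr_widefield.mean()
    sigma = lr_widefield.std()
    return (hr_sim.astype(np.float64) - mu) / sigma

# Synthetic paired images (hypothetical shapes):
rng = np.random.default_rng(1)
hr = rng.random((128, 128))
lr = rng.random((64, 64))
out = normalize_hr_sim(hr, lr)
```

Per the comment above, whether the LR image used here is noisy or denoised, and which downsampling kernel relates the pair, should have only a minor effect.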
Hi, thanks for sharing a great dataset!
In the paper, it says "The training set consists of 240 LR and HR image sets, and the test set consists of 120 sets of images".
But in the data/normalized directory, I can only get 120 images, which might be the test set only. I downloaded the raw dataset as well, but I think the dataset preparation code and the raw data do not match (the preparation code assumes the images are PNG files, but the raw data is NPY files).
I also saw that there was a training dataset in this repo in an early commit (c674e02), but I am not sure it is safe to use those old training images.
Please let me know 1) whether the training dataset from c674e02 is correct to use, and 2) if not, where I can get the training dataset.