Add the bioimageio resource description to deserialized stardist model #187

constantinpape · 2022-03-18T16:52:06Z

to keep the information stored in the bioimageio rdf.yaml available after loading the model.

uschmidt83 · 2022-03-18T17:36:55Z

Can you explain why this is useful? I would really help if I understood what this is supposed to be used for.

uschmidt83 · 2022-03-18T17:39:27Z

stardist/bioimageio_utils.py

+    # update the location of the stardist specific files, since they are the ones being used by the model
+    biomodel.config['stardist']['config'] = outpath / 'config.json'
+    biomodel.config['stardist']['thresholds'] = outpath / 'thresholds.json'
+    biomodel.config['stardist']['weights'] = outpath / weights


Why is this helpful?

E.g. to have the correct weight path that the stardist model is using, for example in the case where weights are updated.

Sorry, I don't get it. If you want the preserve the config, thresholds, and weights at the time when the bioimage.io model was imported, then why should biomodel.config['stardist']['config'] and biomodel.config['stardist']['thresholds'] be overwritten?

stardist/bioimageio_utils.py

constantinpape · 2022-03-18T18:10:06Z

Can you explain why this is useful? I would really help if I understood what this is supposed to be used for.

To access information stored in the rdf.yaml in a programatic way.
For the motivating use-case please see HenriquesLab/ZeroCostDL4Mic#181 (comment). And maybe @esgomezm can expand a bit on which properties she needs in the exact use-case.

uschmidt83 · 2022-03-18T20:26:50Z

To access information stored in the rdf.yaml in a programatic way. For the motivating use-case please see HenriquesLab/ZeroCostDL4Mic#181 (comment). And maybe @esgomezm can expand a bit on which properties she needs in the exact use-case.

Sorry, I really don't get it. Can you (or @esgomezm) please write the steps of a potential use case? E.g.

Import bioimage.io model
Fine tune weights on another dataset
...

constantinpape · 2022-03-21T08:45:02Z

Ok, I understand now why the changes proposed here don't make sense.
What do you think about updating import_bioimageio a bit so that it can be called several times and stores the rdf.yaml and other model related data.
This would also mean that import_bioimageio could be called multiple times.

This would be easy to change and I could propose these changes.

Sorry, I really don't get it. Can you (or @esgomezm) please write the steps of a potential use case? E.g.
1. Import bioimage.io model

2. Fine tune weights on another dataset

3. ...

Just very briefly, a use case would be:

1. import_bioimageio
2. access the `training_data` field from the rdf.yaml
3. download that training data for retraining

esgomezm · 2022-03-21T15:08:49Z

Hi @uschmidt83 and @constantinpape

Thank you for opening the discussion! The use case proposed by @constantinpape is actually very nice. The use-case with ZeroCostDL4Mic notebooks is as follows:

The user initialises the notebooks and sets up the architecture to be trained (or fine-tuned): the grid size and the number of convolutional layers.
The user specifies a pre-trained model from the BioImage Model Zoo.
The notebook tries to load the pre-trained weights. If the pretrained network does not match the specified parameters, the notebook outputs an error message informing the user about the discrepancy in the model architecture.

In step 3 we could suggest uploading the pre-trained model directly and keep training it, but then we would need to create an extra library or code to check that the architecture specified by the user, match the one of the pre-trained model.

Yet, some of the information contained in the model YAML is not contained in configuration or thresholds. I think (but maybe I'm wrong) giving the chance to access this info is not damaging and it can be helpful for the integration of StarDist in different workflows. Note also that the YAML file may contain a technical description about how to use the training data or some training parameters. Parsing this type of information and directly providing it makes the interaction smoother to building up self-contained workflows, in my opinion.

In any case, if this is a big issue, I can program the notebooks in a different manner.

constantinpape · 2022-03-21T15:45:41Z

Thanks for expanding on the use-cases, @esgomezm. To make my proposal from above more concrete, I would adapt the implementation of import_bioimageio as follows:

it downloads all the model data to output_folder, including the rdf.yaml (if it does not exist yet)
if the output folder with the correct files exists already it just loads the model
the deserialized bioimageio model description will be attached to the returned model (as is done here already)
calling import_bioimageio will still create the folder s.t. it can be called with the "normal" stardist model function afterwards.

@uschmidt83 let me know if this makes sense to you. If not we can work around it in zero-cost and other potential use-cases. (But I think the proposed solution would be more elegant, also because it allows calling import_bioimageio multiple times with the same argument)

uschmidt83 · 2022-03-21T21:58:39Z

What do you think about updating import_bioimageio a bit so that it can be called several times and stores the rdf.yaml and other model related data.

Sorry, I don't like the idea that import_bioimageio can be called multiple times and does different things depending on what files exist on disk.

I would adapt the implementation of import_bioimageio as follows:

it downloads all the model data to output_folder, including the rdf.yaml (if it does not exist yet)

This I can get behind, i.e. also saving the rdf.yaml file to the model folder.

if the output folder with the correct files exists already it just loads the model

How do I check whether the folder has the "correct files"? The files can change over time, e.g. when fine-tuning. New weights will then be added to the folder, which may be loaded instead of the weights from the BMZ. This makes it hard to reason in advance what will happen.

I propose the following:

>>> from stardist import import_bioimageio
>>> from pathlib import Path
>>> model = import_bioimageio("10.5281/zenodo.6348084", "./bio-model")
>>> print(Path(model.logdir / "rdf.yaml").exists())
True

Isn't it easy enough for you to then just load the rdf.yaml from disk?

The use-case with ZeroCostDL4Mic notebooks is as follows:

The user initialises the notebooks and sets up the architecture to be trained (or fine-tuned): the grid size and the number of convolutional layers.

The user specifies a pre-trained model from the BioImage Model Zoo.

The notebook tries to load the pre-trained weights. If the pretrained network does not match the specified parameters, the notebook outputs an error message informing the user about the discrepancy in the model architecture.

Wouldn't it be better to have the user select a pre-trained model and then just allow to change parameters that don't affect the architecture of the CNN?

…folder

constantinpape · 2022-03-22T08:31:25Z

I updated the PR to implement your suggestion @uschmidt83: import_bioimageio now downloads the rdf.yaml and the other relevant files to the output folder, but otherwise the behavior stays the same (i.e. it can still be called only once and I don't attach the bioimageio resource description to the stardist model object).

This solution would be totally fine for our use cases as well, since it's indeed very easy to now load the bioimageio resource description via bio_model = bioimageio.core.load_resource_description(outpat / "rdf.yaml").

Let me know if there are any more questions / issues with this or if we can merge it.

- save all files to separate sub-folder - rename weights to "weights_bioimageio.h5"

uschmidt83 · 2022-03-22T14:12:29Z

I've changed the code again. Ok like that?

constantinpape · 2022-03-22T14:51:31Z

I've changed the code again. Ok like that?

The result of your changes is good, i.e. having a subfolder bioimageio with the rdf.yaml and other files.
I think there is a simpler way of achieving this without using the tmpfile, see my comment in the code.
But maybe I am missing something there.

uschmidt83 · 2022-03-22T15:05:34Z

Which comment?

The result of your changes is good, i.e. having a subfolder bioimageio with the rdf.yaml and other files.

I think it's cleaner that way.

I think there is a simpler way of achieving this without using the tmpfile, see my comment in the code.

Which comment?

stardist/bioimageio_utils.py

constantinpape · 2022-03-22T15:09:07Z

Which comment?

I forgot to hit the review button, you should see it on line 436 now.

esgomezm · 2022-03-22T16:06:08Z

Hi @constantinpape and @uschmidt83, with this, should be enough! thank you

Wouldn't it be better to have the user select a pre-trained model and then just allow to change parameters that don't affect the architecture of the CNN?

That's another way of interacting with the models, which is also quite nice and can be motivated by the BMZ (without relying on other software). However, the logic behind the ZeroCostDL4Mic notebooks goes the other way around.

constantinpape · 2022-03-22T17:15:44Z

@uschmidt83 this is good to go from my end :).

constantinpape · 2022-03-22T22:26:42Z

Thanks @uschmidt83. I also tested it locally for our use-case data now and this works as expected.
Do you think you could make a small release soon so that @esgomezm could also use this in HenriquesLab/ZeroCostDL4Mic#181?

uschmidt83 · 2022-03-23T09:50:43Z

Do you think you could make a small release soon so that @esgomezm could also use this in HenriquesLab/ZeroCostDL4Mic#181?

I've just made a new release (0.8.1).

Add the bioimageio resource description to deserialized stardist model

f21515f

constantinpape mentioned this pull request Mar 18, 2022

BioImage Model Zoo model export HenriquesLab/ZeroCostDL4Mic#181

Merged

uschmidt83 reviewed Mar 18, 2022

View reviewed changes

stardist/bioimageio_utils.py Outdated Show resolved Hide resolved

Update import_bioimageio to save complete bioimageio model to output …

8608aec

…folder

Change import_bioimageio

0e531ea

- save all files to separate sub-folder - rename weights to "weights_bioimageio.h5"

constantinpape commented Mar 22, 2022

View reviewed changes

stardist/bioimageio_utils.py Show resolved Hide resolved

uschmidt83 changed the base branch from master to dev March 22, 2022 19:22

uschmidt83 merged commit bbdbf6c into stardist:dev Mar 22, 2022

constantinpape deleted the keep-bio-rdf branch March 22, 2022 22:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add the bioimageio resource description to deserialized stardist model #187

Add the bioimageio resource description to deserialized stardist model #187

constantinpape commented Mar 18, 2022

uschmidt83 commented Mar 18, 2022 •

edited

uschmidt83 Mar 18, 2022

constantinpape Mar 18, 2022

uschmidt83 Mar 18, 2022

constantinpape commented Mar 18, 2022

uschmidt83 commented Mar 18, 2022

constantinpape commented Mar 21, 2022 •

edited

esgomezm commented Mar 21, 2022

constantinpape commented Mar 21, 2022

uschmidt83 commented Mar 21, 2022

constantinpape commented Mar 22, 2022

uschmidt83 commented Mar 22, 2022

constantinpape commented Mar 22, 2022

uschmidt83 commented Mar 22, 2022

constantinpape commented Mar 22, 2022

esgomezm commented Mar 22, 2022

constantinpape commented Mar 22, 2022

constantinpape commented Mar 22, 2022

uschmidt83 commented Mar 23, 2022

Add the bioimageio resource description to deserialized stardist model #187

Add the bioimageio resource description to deserialized stardist model #187

Conversation

constantinpape commented Mar 18, 2022

uschmidt83 commented Mar 18, 2022 • edited

uschmidt83 Mar 18, 2022

Choose a reason for hiding this comment

constantinpape Mar 18, 2022

Choose a reason for hiding this comment

uschmidt83 Mar 18, 2022

Choose a reason for hiding this comment

constantinpape commented Mar 18, 2022

uschmidt83 commented Mar 18, 2022

constantinpape commented Mar 21, 2022 • edited

esgomezm commented Mar 21, 2022

constantinpape commented Mar 21, 2022

uschmidt83 commented Mar 21, 2022

constantinpape commented Mar 22, 2022

uschmidt83 commented Mar 22, 2022

constantinpape commented Mar 22, 2022

uschmidt83 commented Mar 22, 2022

constantinpape commented Mar 22, 2022

esgomezm commented Mar 22, 2022

constantinpape commented Mar 22, 2022

constantinpape commented Mar 22, 2022

uschmidt83 commented Mar 23, 2022

uschmidt83 commented Mar 18, 2022 •

edited

constantinpape commented Mar 21, 2022 •

edited