Usefulness of data_range #55
Comments
It's true that it might not be that useful. This is how I understood/see it: It's the range of values that the input image should have when entering the model (after preprocessing). Even if it is of type float32 and the network computations can be done with values in the range
I wouldn't refer to the batch information by calling it
It might be better to specify it when required (in the preprocessing or any other kind of transformation), IMO. The main reason is that when I read the input specification, I expect to have technical information about a "single" input patch/batch/image of the model that makes the inference possible. However, when talking about global parameters of the data I get confused because I'm not sure whether it refers to a whole set of images (from one experiment), or to a single image, or to the batch. If this parameter goes together with the preprocessing that uses it, I think it will be easier to understand its meaning. |
I think that's pretty much what we intended for the I also agree that having any of the more global options here is rather confusing. |
Interesting. I see inputs/outputs as the API description. IMHO the inputs should describe the data before preprocessing: we know how the preprocessing steps change it, and any consumer software would have to provide the data (as it is before preprocessing) to the runner. Otherwise inputs/outputs would not truly describe the inputs/outputs of the whole bioimage.io model.
Sorry for being imprecise; that is not what I meant. With data_range of a mini-batch, I meant the minimum/maximum of a given mini-batch, irrespective of the length of the mini-batch. However, with b>1 these values become meaningless for an independent sample in the mini-batch.
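To illustrate the point about b>1, here is a minimal sketch in plain Python. The helper names `batch_range` and `per_sample_ranges` are hypothetical and not part of the bioimage.io spec; samples are represented as flat lists of floats purely for brevity.

```python
from typing import List, Tuple

Sample = List[float]  # one flattened sample (assumption made for brevity)


def batch_range(batch: List[Sample]) -> Tuple[float, float]:
    """Minimum/maximum over every value in the mini-batch."""
    values = [v for sample in batch for v in sample]
    return min(values), max(values)


def per_sample_ranges(batch: List[Sample]) -> List[Tuple[float, float]]:
    """Minimum/maximum of each sample, computed independently."""
    return [(min(sample), max(sample)) for sample in batch]


# Two samples (b = 2) with clearly different value ranges.
batch = [[0.0, 0.2, 0.4], [0.5, 0.9, 1.0]]

whole_batch = batch_range(batch)        # one range for the whole mini-batch
individual = per_sample_ranges(batch)   # one range per sample
```

Here the batch-wide range spans both samples, so it does not match the range of either individual sample, which is why a mini-batch data_range says little about an independent sample once b>1.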
👍 |
Good point, but then what about 'halo' and 'offset'? Those values refer to the raw output of the model rather than to the post-processed output, no? |
I'd say they refer to the postprocessed output. As consumer software, I don't care whether the halo was cropped due to a valid convolution in the actual neural network or cropped away in a postprocessing step. Either way I get an output that has a certain shape relative to the reference input (as you wrote in your example: output_shape = input_shape * scale - 2 * halo). |
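The shape relation quoted above can be sketched per axis as follows. This is only an illustration of the formula as written in this thread; the function name `output_shape` is hypothetical, and 'offset' is left out because the quoted formula does not include it.

```python
from typing import Sequence, Tuple


def output_shape(
    input_shape: Sequence[int],
    scale: Sequence[float],
    halo: Sequence[int],
) -> Tuple[int, ...]:
    """Apply output_shape = input_shape * scale - 2 * halo on each axis."""
    if not (len(input_shape) == len(scale) == len(halo)):
        raise ValueError("input_shape, scale and halo must have the same length")
    # The halo is cropped from both sides of an axis, hence the factor 2.
    return tuple(
        int(round(n * s)) - 2 * h for n, s, h in zip(input_shape, scale, halo)
    )


# Same-resolution output with a halo of 32 pixels on the two spatial axes.
same_scale = output_shape((1, 256, 256), (1, 1, 1), (0, 32, 32))

# Output downscaled by 2 on the spatial axes, with a halo of 8 pixels.
downscaled = output_shape((1, 256, 256), (1, 0.5, 0.5), (0, 8, 8))
```

From the consumer's point of view the result is the same whether the crop happens inside the network or in a postprocessing step, which is the argument made in the comment above.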
For the data type and in general, the information about the
For the |
This came up in the last bioimage.io call and we didn't resolve it yet. To summarise, we have two different interpretations of
|
my problem with option 2.: |
A note on file inputs/outputs: I would prefer that we have in-memory inputs/outputs only. Writing tabular data to a CSV, an image to a specific file format, etc. should be left to the consumer software. The examples you mention here can all be represented by a tensor. |
I am thinking more of option 3: they describe the input to the preprocessing, and the software needs to take care of the input before preprocessing. There is also ambiguity for If that's the case then I would treat Why do we even need the |
what is the difference between option 1 and 3? |
Sorry, I misread; options 1 and 3 are the same. |
Fixed in #59. |
I found myself wondering what the actual usefulness of the data_range config field is, after commenting in https://github.com/bioimage-io/configuration/pull/54/files#r532405610
data_type does not seem all that useful. What am I missing? Or should we get rid of it in 0.3.1?