Clarify thinking about required and optional model tasks and output types #13
Comments
DECISION: Option 4 🎉
I've been working on implementing some of the decisions we made within the schema, in particular with respect to I've created two branches:
Let me know which implementation you prefer.
I don't have very strong feelings about this -- but maybe a weak preference for the first option because it's more consistent with the other specifications, and doesn't use the oneOf construction, which I guess maybe fewer people would be familiar with? But I would be happy to accept your recommendation for what you prefer, if you like the other one.
I have the same response as Evan.
Thanks both! So the reason I prefer the Having said that, if both are Now that I think of it though, is there a situation where in a given set of task_ids or rounds, one of the output types needs to be specified because it has been included in another round but should not be submitted, in which case
Thanks -- that makes sense. Is this situation of possibly-repeated values across the For your last point -- in that case, I think that each round (and each task group within that round) only needs to include the output types that are required for that round (or task group). The output type column will still be included, we're just specifying which values of output types are required or optional within that column. |
Currently, the required and optional values of output type ids can in effect also specify whether the corresponding output types as a whole are required: namely, a particular output type is required if it has at least one required type_id, and is optional otherwise. This may be confusing. Is there another way? See also the related discussion under issue #9.
Current proposed system
To explain the situation, we consider a series of examples of hubs with varying modeling task specifications.
Example 1:
For a hub with this specification, a valid submission must include at least the following rows, obtained via a kind of expand_grid action across the different combinations of required values for the task id variables and required type_ids within each output type. Note that in this process, you could imagine first concatenating the output_types with the options for type_id values within each output_type, so that they are treated as a "unit" when the expand_grid happens, and then splitting them back into two columns. This is necessary to track the nesting of type_id values within the specific output types.
Example 2
Example 2 is the same as example 1, but it has only one required quantile level:
For a hub with this specification, a valid submission must include at least the following rows:
Example 3
Example 3 is similar to examples 1 and 2, but now all of the quantile levels are specified as optional.
For a hub with this specification, a valid submission must include at least the following rows:
Example 4
Our final example is similar to example 1, but swaps the specification of ["NA"] and null values in the required and optional fields for the mean output type:
For a hub with this specification, a valid submission must include at least the following rows:
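As a rough illustration of the expand_grid-style procedure used in the examples above, here is a minimal Python sketch. The spec dictionary, its key layout, and the function name minimal_required_rows are all hypothetical stand-ins chosen for illustration (they are not the hub schema itself), and the sketch assumes every task id variable has at least one required value.

```python
from itertools import product

# Hypothetical hub specification (illustrative values, not taken from the issue).
# None plays the role of JSON null.
spec = {
    "task_ids": {
        "location": {"required": ["US"], "optional": ["01", "02"]},
        "horizon": {"required": [1, 2], "optional": None},
    },
    "output_types": {
        "mean": {"type_id": {"required": ["NA"], "optional": None}},
        "quantile": {"type_id": {"required": [0.25, 0.75], "optional": [0.5]}},
    },
}


def minimal_required_rows(spec):
    """Cross required task id values with (output_type, type_id) units."""
    names = list(spec["task_ids"])
    # Assumption: each task id has required values; a null list yields no rows.
    task_values = [spec["task_ids"][n]["required"] or [] for n in names]
    # Pair each required type_id with its output type so the nesting
    # survives the cross product (the "unit" described in the text).
    units = [
        (output_type, type_id)
        for output_type, props in spec["output_types"].items()
        for type_id in (props["type_id"]["required"] or [])
    ]
    rows = []
    for combo in product(*task_values, units):
        *task_combo, (output_type, type_id) = combo
        row = dict(zip(names, task_combo))
        # Split the unit back into two columns.
        row["output_type"] = output_type
        row["type_id"] = type_id
        rows.append(row)
    return rows


rows = minimal_required_rows(spec)
for row in rows:
    print(row)
```

With this hypothetical spec, swapping the mean entry to required None and optional ["NA"] (as in example 4) removes every mean row from the result, which is the implicit "optional output type" behaviour under discussion.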
Summary and question for discussion
Summary: Under the current system, the rows that a submission must minimally contain are obtained by applying an expand_grid type of action to the task id variables and combinations of output types and type ids. This means that if there are no required values under the type_ids for a particular output type, a minimal submission does not need to include any rows with that output type. Effectively, this means that that output type is optional. Saying this again in different words: in this setup, a particular output type is required only if there is at least one value specified as required in the type_ids under that output type. This is illustrated in examples 3 and 4 above.
Every time this has come up, this use of required/optional values of a type_id to implicitly set the status of an output type has been non-intuitive. How can we resolve this? Three ideas:
output column so that it has the required and optional properties similar to the other columns. We would then perhaps check that the names of any additional properties currently under "output_types" match the values that were specified as required or optional for the output column. We would need to think through and document how this interacts with the "implicit requirement" for output types that comes out of the current procedure as illustrated above.
output_type and type_id (and any restrictions on value) as being required or optional.