-
Notifications
You must be signed in to change notification settings - Fork 307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple identical Struct declarations #448
Comments
First pasting for reference the spec language:
There are two slightly different scenarios to think about. First is the "diamond import" pattern (which I think we do see in BioWDL among others), where W.wdl imports the same struct from two different task libraries T1.wdl and T2.wdl, but both T1.wdl and T2.wdl originally imported it from a common struct library S.wdl. I think this pattern is common (used in BioWDL I believe) and doesn't need a warning let alone an error. Longer paths are clearly possible too, but the point is there's only one "original" struct typedef in this case. The second scenario is where the source code of the struct typedef really appears in multiple source documents (with identical members in all respects). I do think the spec is ambiguous as to whether such struct types are conceptually identical, compatible, or incompatible. Miniwdl does consider them identical, i.e. the struct type is defined as the members, which may have one or more aliases (one being the name in the original declaration), and does not care which file they were originally declared in. I can see a warning possibly being justified in this case, but miniwdl does not generate one currently. Also, miniwdl does not care about the order of the members. One more subtlety to track (which I think argues for miniwdl's current approach) is that structs can contain other structs as members, possibly with aliasing, so you have to take care that |
Thank you for the reply. I was considering that the order of members could have an effect on struct IO, but it is true that this would not change the way the pipeline works so the order should be ignored. Your last point raises a issue that I hadn't considered and I will have to consider the best way to deal with this. Does the struct aliasing in an import statement affect only Struct declarations in the imported file or also Struct usages within the file? It is not clear (to me!) in the spec, but I think that both delarations and usages should be aliased. Last (for now) question on Struct aliasing: I was assuming that the aliasing only applies to the document being imported and not to daughter documents. i.e., if A.wdl imports B.wdl that imports C.wdl, any struct aliasing in the import statement in A.wdl would only apply to B.wdl and not to C.wdl. Is that correct? On another issue, is there a set of reference documents that can be used to test the conformance of a parser? Thanks a lot for your help. |
(To be clear, I agree the spec is ambiguous on some of these points -- I can describe what miniwdl does now, but I don't claim that's normative and I don't know how other WDL frontends handle these cases.)
If I
I was expecting a slightly different question like "any struct aliasing in the import statement in B.wdl would only apply to B.wdl and not to A.wdl`. Was that what you meant? If so, miniwdl does import the aliases defined in B.wdl into A.wdl, but I don't have a strong conviction that it's right to do so. |
I don't think these cases should occur very often, but in the interests of reproducibility there should be a clear 'right answer', so it would be nice if future versions of the spec could clarify this. For my questions I think some examples might make them clearer Scenario 1: If I have the following declarations in a single document B.wdl that is imported with the statement
I would then import Car as Car1 into the global namespace, and I should make sure that when we evaluate CarRental we will be able to find Car even though the name has changed. That is what I meant by 'both declarations and usages should be aliased' Scenario 2: We have 3 documents, A.wdl, B.wdl and C.wdl. A.wdl B.wdl C.wdl In this case would the importing of Struct Car from C.wdl be affected by the alias clause in A.wdl so that Car gets imported into the global namespace as Car1? I imagine not, but I could imagine scenarios where this might be wanted. Again, thank you for taking the time to answer my endless questions. I am working on implementing a frontend and I want to nail down as many of these ambiguous cases as possible. |
Maybe a bit of a side track this, but perhaps implementing an Personally, I feel the way struct imports work is one of the most (if not the most) convoluted parts of WDL, due to them getting added to a "global namespace" instead of being namespaced like everything else. Another bit of relevant spec:
Regarding scenario 1, I don't think technically aliasing the usage should be necessary. The structs, when imported, get added to the document's global scope/namespace under the aliased name, but the imported document shouldn't have access to that namespace, I don't think. It has it's own namespace within in which the struct still has the original name. The second And for scenario 2, I never even thought about imported structs getting propagated when they are imported in an imported file... I would actually say that in scenario 2 A.wdl should be invalid, since there is no |
Thanks for your comments - I think I am overcomplicating the matter unnecessarily. I agree that scenario 1 should work without explicitly requiring aliasing the usage, and for scenario 2 I will only apply the aliasing to the file that is imported (so B.wdl in this case and not C.wdl). I also agree that in the case where the aliased struct is not present in the imported document (as in scenario 2) this should probably get an error or at least a warning as the user's intention of aliasing can not be carried out if the struct is not present. |
@mlin FWIW wdlTools has exactly the same behavior for structs as you describe in miniwdl. According to the v1.1 spec: "When a Struct is imported with an alias, it is added to the global namespace only under that alias." So, when importing a document, all of its structs are imported, either under their original name or their alias. |
I think it remains ambiguous whether two struct types with distinct declarations but identical members are compatible or not, ie struct A {
Int x
String s
}
struct B {
Int x
String s
}
workflow w {
input {
A a
}
B b = a # allowed???
} This single-file case isn't very interesting, but consider if we have numerous struct types imported through a hierarchy of files, with different aliases in them, it can get very confusing to keep track of where the 'original' definitions were...giving a certain appeal to deciding just based on whether the struct members are the same. |
I honestly think that aliasing was one of my poorer decisions. If we could retcon the So this is probably not a helpful comment, but I would be happy to sponsor a change for 2.0 that brings namespacing to structs. IF we could do it in a graceful non breaking way that would be amazing too |
I propose to add an allowed
Will open a PR with this change for review. |
What is the correct behaviour for the parser if multiple identical Struct definitions with the same name are declared either from the same document or different documents? We can see this, for example, with several of the BioWDL workflows where some basic Structs are declared multiple times with identical declarations without using aliasing to disambiguate them. In these cases should the parser (a) raise an error, (b) raise an error only if the definitions are not identical (same member names & types in the same order) and issue a warning otherwise or (c) raise an error if the definitions are not identical and ignore the duplicate definition otherwise. I would favour (b) unless it is clear that this is allowed in which case (c).
The text was updated successfully, but these errors were encountered: