New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix param graph when mixing offset and conditional #3452
Conversation
Hi @hjoliver While reading The regex used is So the returned text is And the returned values in
Here's an example workflow I used to test it.
The reference graph for Cylc 8
And for this branch:
From what I understood, the But after the
Gets simplified into
While it doesn't print exactly that, I think after this change the user would get the same graph & dependencies? i.e., Added some unit tests to demonstrate the above. Tests (unit & functional) may fail, but I will deal with that later if you confirm this PR is going in the right direction 👍 Thanks! |
7ae4397
to
85a85d4
Compare
@kinow - thanks for the fix! From a quick look, you're right; but a proper review later.
I wouldn't say either one of those is simpler, they're just equivalent. And the "simplification" may have been done in my head, I wouldn't necessarily expect the graph parser to do that transformation. |
Ah! That makes sense! I thought for other graph expressions Cylc would have some sort of optimization/rewriting/etc happening somewhere for that sort of modification 😄 |
85a85d4
to
449c21e
Compare
cylc/flow/param_expand.py
Outdated
@@ -261,7 +261,7 @@ class GraphExpander(object): | |||
|
|||
_REMOVE = -32768 | |||
_REMOVE_REC = re.compile( | |||
r'(?:^|\s*=>).*' + str(_REMOVE) + r'.*?(?:$|=>\s*?)') | |||
r'(?:^|\s*=>).*' + str(_REMOVE) + r'.*?(?:$|=>\s*?|&\s*)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simpler fix. Just stop the greedy regex once it finds a) end of stream $, b) an arrow followed by spaces =>\s*, or c) ampersand symbol followed by spaces &\s*.
The last c) item covers the bug reported in the issue linked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would these solutions handle the other forms of the problem e.g:
baz & foo<m-1> => foo<m>
baz & foo<m-1> & pub => foo<m>
bar & foo<m-1> & pub<m-1> & qux => foo<m>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oliver-sanders not really good at parsing the graph expressions, but here's the output of the examples you provided.
With
params_map = {'m': ["cat", "dog"]}
templates = {'m': '_%(m)s'}
baz & foo<m-1> => foo<m>
- is expanded to
{'baz & foo_cat => foo_dog', ' foo_cat'}
- is expanded to
baz & foo<m-1> & pub => foo<m>
- is expanded to
{'baz & foo_cat & pub => foo_dog', 'pub => foo_cat'}
- is expanded to
bar & foo<m-1> & pub<m-1> & qux => foo<m>
- is expanded to
{'bar & foo_cat & pub_cat & qux => foo_dog', 'qux => foo_cat'}
- is expanded to
These look OK to me, but can't really say I'm sure these are really correct (@hjoliver do these look correct?)
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expansion of the parameterised items seems correct to me but there is a bit of a discrepancy with the non-parameterised items.
E.g in example (2) we get pub => foo_cat
but not baz => foo_cat
similar to example (3) but different to example (1) where we just get foo_cat
.
I'm not entirely sure how we should expand these things, here are my guesses:
baz & foo<m-1> => foo<m>
baz & foo_cat => foo_dog
[foo_cat]
(baz => foo_cat)
baz & foo<m-1> & pub => foo<m>
baz & foo_cat & pub => foo_dog
[foo_cat]
(baz => foo_cat)
(pub => foo_cat)
bar & foo<m-1> & pub<m-1> & qux => foo<m>
bar & foo_cat & pub_cat & qux => foo_dog
[foo_cat]
(bar => foo_cat)
(qux => foo_cat)
Where:
compulsory => dependency
- needed for graphing[optional => line]
- not needed as task is referenced elsewhere but not harmful(pre-initial-type => dependency)
- not sure if we should have these or not, but if so they should be consistent
6463397
to
1e19221
Compare
(rebased) |
CI test failure not related to this change I think
|
1e19221
to
9955260
Compare
Rebased |
9955260
to
58b3263
Compare
Rebased, and moved the change log entry from 8.0a2 to 8.0a3. |
And this time the functional tests failed (passed before rebasing). Maybe new tests caught something I had missed before 🤓 |
Kicked GH actions and now the builds passed 👀 |
Just came back to look at this, and it doesn't seem to be working correctly (although I haven't tried to understand why). To pick one example
This should expand as follows:
In terms of individual dependencies this is:
If I use
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not working correctly - see prev comment)
I'm actually not sure if the kind of examples above were considered when parameters where originally implemented. Do they make sense? Maybe we should omit the whole line if one or more parameters goes beyond its bounds?
This works for But for
Maybe the former choice is more sensible? I suspect we originally only considered examples like the following, where the two choices give exactly the same result:
If we just ignore individual tasks that are undefined, this gives:
Whereas if we only accept lines where all tasks are defined, we just get:
... which is the same result, because the m = 1, 2 lines in the first case are redundant with the m = 3 line. |
If doing parameterized tasks the old way, with Jinja2 loops, going beyond the list bounds is an error, so you can't loop over the whole range if there's an "index offset". Then you can treat the edge case separately if you need it.
The equivalent for parameterized tasks is:
However this still means this branch isn't working right.
this should generate only:
(and it should not generate |
In terms of implementation, at least it should be easier to ignore whole lines that contain invalid parameterized tasks than to cut out the individual invalid tasks. This also gets rid of the following problem:
if |
(So I seem to have just proven that my own original advice on this problem was wrong, sorry @kinow 😬 ) |
The issue with this is that you get inconsistent behaviour depending on the number of tasks in the parameterisation e.g:
Would give you:
However:
Would give you nothing, omitting
This is somewhat closer to Cylc's pre-initial logic e.g:
Would give you:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new logic doesn't quite work sorry! (although it should be much easier to modify, to make it work, I hope).
This:
foo => bar => baz
is equivalent to this:
foo => bar
bar => baz
so they must yield the same result after parsing.
In the parameter case though:
# m = cat
foo => bar<m-1> => baz
the split (two line) version gives a different result (via cylc graph
) because your logic treats the first token in the chain as special.
So close!!! 😄 |
I think this shows we can't treat out-of-bounds parameters like pre-initial cycle dependence, because of the fact that out-of-bounds parameters can appear in on the right side of a trigger arrow. So:
This means:
This should yield only
This should also yield only IMO what I'm arguing makes sense in spite of @oliver-sanders pre-initial-logic based argument above because:
|
More thoughts on the pre-initial analogy:
Here
In the parameter case, the same logic arguably makes sense here
but does it make sense here:?
(This is another difference between parameters and pre-initial tasks: not all tasks are parameterized, but all tasks do have a cycle point "parameter"). |
The good news is, maybe it only matters for the case of a single out-of-bounds parameter. This should expand to nothing, not
but with |
Upshot: this issue is a nightmare 😬 However, it seems to me that we have to go with what I've suggested above because of the fact that parameterized tasks can appear on the right side of a trigger, and we need consistency between chained and list-of-pairs graph strings. Would you agree @oliver-sanders ? (before @kinow bashes his head on this wall any further). |
Oh dammit, this one is the gift that just keeps giving. I'm going to have to go and work through some examples to suss out the implications. |
@kinow - I meant to make a PR to your branch, but I accidentally pushed the commit directly to it. Doh! (not a disaster, as it can be backed out if necessary, of course). Anyhow, it simplifies your new logic to make it do what I think is correct, which is: in a chain of dependencies stop at an out-of-bounds task (whether it is the first on in the chain or not).
If it passes all the test, we'll just wait to see if @oliver-sanders can convince himself that this is the correct way to go. Note:
|
52ff7da
to
72bac5a
Compare
Feel free to update this branch/PR as necessary :) Thanks @hjoliver ! |
Tests all pass 🎉 |
## the parameters (prefixes by ##)
non =
one = 1
two = 1, 2
# the results (prefixed by #)
## an empty parameter yields no tasks
<non>
## a non-empty parameter does yield tasks
# 1
<one>
## a dependency on an empty parameter under SoD is like a dependency on a task
## that is not in the graph, <one> will never run so we might as well ignore it here???
<non> => <one>
## ok but what about the other way around if:
## a => b
## is equivalent to
## a
## a => b
## then the same should be true for parameters?
## <one>
## <one> => <non>
## (I'm going to go with yes):
# 1
<one> => <non>
## ok, now for offsets...
## a parameter expands simply
# 1 => a
# 2 => a
<two> => a
## but how does a parameter offset like this expand?
## Is is equivalent to:
## <two>
## <two - 1> => a
## or:
## <two - 1>
## <two - 1> => a
## (I'm going to go with <two - 1> == [1])
# 1 => a
<two - 1> => a
## if you stick another task at the beginning of the chain, the offset
## should still behave the same, but should the head of the chain expand
## <two>
## <two> => <two -1> => a
# 1 # ???
# 2 =1 => a
<two> => <two - 1> => a
## If so
# 1
<one> => <one - 1> => a Resulting rules:
|
@oliver-sanders - in your example above, does "non" mean:
It seems to me you mean (i); however that fails validation so we don't need to worry about it (which I think is sensible). So we only need to address:
Maybe I didn't clearly state my rule above in one place:
I think that corresponds to your rule 1
but I'm not sure about your rule 2 ??
I would say don't add the first item either if it has an invalid parameter value. E.g. for
This becomes:
When
My rule says this should be interpreted as |
An empty parameter list (m = # nothing with foo)
What I'm asking is should this:
Result in this:
Or this:
Based on the logic that this:
Is equivalent to this:
I'm inclined to go with the first example. |
Ah, I think your example got munged. Presumably what you mean is, should this:
Result in this:
Or this:
If that's what you meant, then we're in agreement that Option 1 is correct. And that's what the current implementation does:
|
@oliver-sanders - requesting re-approval from you (it looks like you are happy with it, based on previous discussion, but just in case). |
Yep looks like we are in agreement, good, bye bye to one of our oldest PRs, sorry @kinow! |
Quick lets just get this in before we change our minds again! |
These changes close #2608
Requirements check-list
CONTRIBUTING.md
and added my name as a Code Contributor.