New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallel statement timing model #1555
Comments
A |
Thanks for the quick reply, then it was definitely a misunderstanding from my side. Though I do have to admit that in my opinion, this limits the power of the |
Where do you draw the line? |
Sounds fair, I understand the interpretation that is implemented. The model I had in mind was parallelizing every function call in the For example, the (incomplete) snippet below would do a parallel detection using an array of edge counters. Like I said, it might be that we have a different opinion on the timing model, though I would argue that drawing the line at function calls (hereby I am considering the from artiq.experiment import *
class ParallelLoopExperiment(EnvExperiment):
def build(self):
self.setattr_device('core')
self.pmt_array = [self.get_device(f'edge_counter{i}' for i in range(4))]
@kernel
def run(self):
self.core.reset()
with parallel:
for pmt in self.pmt_array:
pmt.gate_rising(100 * ms) |
Well, technically that can be done easily, the issue is breaking legacy code. Let's open this for discussion as we may want to introduce it with the other breaking changes in the upcoming compiler rewrite. |
I appreciate the open mindset and I am definitely willing to contribute to such discussions. Two things I wanted to add:
|
To me this indicates a misunderstanding of what There are sensible existing solutions for the kind of programmatic timing tasks you want to perform (which I can understand and appreciate the utility of, especially for larger systems). For example, using the from artiq.experiment import *
class ParallelLoopExperiment(EnvExperiment):
def build(self):
self.setattr_device('core')
self.pmt_array = [self.get_device(f'edge_counter{i}' for i in range(4))]
@kernel
def run(self):
self.core.reset()
t_start = self.core.now_mu()
for pmt in self.pmt_array:
self.core.at_mu(t_start)
pmt.gate_rising(100 * ms) I am not sure that adding compiler flags will help the situation, it's a lot of clutter. Part of the issue here is a more fundamental one, that people expect Python programming to be "pythonic", while ARTIQ kernel programming needs to be a little more spelled out in a lot of instances. There are tough choices to be made here. See for example the discussion in #1542. |
Yes, just calling And as @dhslichter points out, the current behaviour is desirable in a lot of cases, and very simple to explain (time is reset at the beginning of each top-level statement, i.e. usually line underneath the with statement). Another type of context along the lines of what you are suggesting could be added in addition to the current ones, but it would be more complex to specify, and more complex to implement in the compiler than the current types (with the exception of Clarifying the |
(Note aside: In many cases, it is preferable to add a coarse RTIO clock cycle (typically 8 ns) of delay between similar events submitted in a loop, as it is easy to exhaust the number of available RTIO lanes otherwise.) |
Thanks for contributing to the discussion. I agree with @dhslichter that breaking changes should not be taken lightly, but that does not mean that my suggestion is invalid. Especially when there is a plan to make breaking changes anyway, the interpretation of For the current compiler infrastructure, there are multiple options mentioned here to implement such a change without influencing existing code (i.e. a compiler flag added to the arguments of the core device, a separate context), and I think some of these options could be reasonable but I do not decide about that. Regarding the semantics, I still stand with my idea that the implicit """One or three parallel pulses.
For current semantics, we need multiple parallel contexts
"""
# current semantics
with parallel:
self.device_a.pulse(10*ms)
if self.device_b_enabled:
with parallel: # Have to enter parallel context again...
self.device_b0.pulse(10*ms)
self.device_b1.pulse(10*ms)
# alternative semantics
with parallel:
self.device_a.pulse(10*ms)
if self.device_b_enabled:
self.device_b0.pulse(10*ms)
self.device_b1.pulse(10*ms) """One or three pulses of which the two additional ones are sequential.
Current semantics is more compact due to the "implicit" sequential timing
"""
# current semantics
with parallel:
self.device_a.pulse(10*ms)
if self.device_b_enabled: # Implicit sequential
self.device_b0.pulse(10*ms)
self.device_b1.pulse(10*ms)
# alternative semantics
with parallel:
self.device_a.pulse(10*ms)
if self.device_b_enabled:
with sequential: # Explicit sequential, improves readability
self.device_b0.pulse(10*ms)
self.device_b1.pulse(10*ms) """Parallel detection.
For current semantics, the parallel context lacks power and can not be used
"""
# current semantics
t = now_mu()
for pmt in self.pmt_array:
at_mu(t)
pmt.gate_rising(10*ms)
# alternative semantics
with parallel:
for pmt in self.pmt_array:
pmt.gate_rising(10*ms) I would argue that the current Regarding explaining the semantics of the Finally I wanted to mention that the current ARTIQ kernel language is designed as a subset of Python, and staying as close as possible to the Python semantics would therefore make sense. The ARTIQ manual does not describe every aspect of the kernel language because the implemented language subset is supposed to work similar to Python. Besides from that, "pythonic" programming does not conflict with low-level programming in kernels and even encourages to be explicit. |
This argument doesn't support your suggested interpretation either, as Python In general, I'd avoid introducing multiple language variants (e.g. via compiler flags) as much as possible, since it carries a hefty cost, not only in terms of compiler maintenance (an exponential number of combinations now needs to be tested), but also in terms of language complexity affecting teachability (documentation needs to mention that there are two distinct meanings that look visually identical) and programmer workload (reading code, one needs to watch out for the different meanings), as well as shareability/reusability of code (one might be unable to use two libraries that require disparate language variants). In this particular instance, most of these issues (except for the increase in overall compiler/language complexity) could be addressed by simply making the distinction local, e.g. by locally specifying a language variant ( |
I've just had a quick look through part of our code base (some 60k LOC), and there was only a single case where the behaviour would actually change, as we tend to write out As such, I'm not necessarily opposed to this because of backwards compatibility (although other code bases might look different); rather, I'm mostly just not sure whether the added complexity is worth it. |
I think we interpret the application of the Regarding legacy code, I have the impression that the change is not as invasive as it sounds, though obviously other users will have to agree with that. As you mention, the semantics change would require to add some I do not think I am in a position to talk about implementation, though I do agree with the points you made about "multiple language variants". |
The time manager example, however, doesn't actually model the semantics as currently implemented (and neither as proposed here). Really, we should probably delete As for which interpretation is closer to regular Python, semantically, in both cases implicit What is, however, the case, is that the proposed semantics are more complex to implement in the compiler, and define in a hypothetical formal language specification, than the current ones. For the current "shallow" case, the implementation for That being said, "deep" |
It's probably not that horrid, but yes, nobody seems to care about |
Based on your explanation, it does seem that the timing model with a "timing stack" is not how it is currently implemented in the compiler. Though even if the The implicit adding of I do not know enough about the implementation of the ARTIQ compiler to argue if a deep parallel would be easy to implement, you would know better. Though I would guess it would not be too hard from a compiler perspective. I would guess that in the AST/IR, each function call under a Finally, I wanted to state that with either semantics for |
Yes, please post it. That helps us know which parts of ARTIQ are important or not... before your messages we thought nobody was using |
Or, even better, consider developing such general components with the idea to submit them back to upstream ARTIQ to begin with. :) As for implementability, I was referring to similar behaviour in regular host Python. If every function you call is annotated with a decorator ( That being said, if simulation in this fashion is actually useful to people for work on real experiments, easily being able to simulate "deep" parallel is indeed an argument in favour of the latter. For us, we'd typically need to mock out a considerable amount of extra hardware interfaces and complexity in order to be able to usefully simulate kernels, so we tend just to test on actual hardware. There is still the question of backwards-compatibility of course, which I can't answer for the wider ARTIQ community. For our code bases, quickly auditing every
Global state as in state kept while translating the AST into IR, yes. I wonder where the timeline resets/duration tracking would best be inserted in the deep model. Before/after each
(Just for the sake of completeness: There is an |
Changing to "deep" parallel necessarily requires the use of explicit sequential. Basically none of our code uses explicit I think if anything is to be considered, it should be the addition of a new timing context
In short, the fact that ARTIQ has been used in dozens of experimental setups in many different laboratories over 5+ years without issues being filed about how |
As mentioned before, I am ok with the current implementation of For the existing compiler, I understand we can not suddenly change the semantics of @dnadlinger regarding the implementation in the compiler, I also think the timeline "reset" would be inserted before each For the "deep" vs "shallow" parallel discussion:
@sbourdeauducq regarding Just as a bonus, I quickly wrote a flake8 plugin that can check ARTIQ code and can flag lines that are potentially relying on the "shallow" parallel semantics. It also checks for a couple of other potential bugs specific to ARTIQ code. See https://gitlab.com/duke-artiq/flake8-artiq . |
OK to delete them then?
This should not be connected to ARTIQ stable - in fact we do not accept new features in release branches, only bug fixes. |
@sbourdeauducq if it was up to me, feel free to delete Once we are ready, we can contribute components of our code to the master/development branch. Though in our lab we run on the ARTIQ (5) stable channel. Hence we decided to set up our project as an extension to ARTIQ. |
Hmm, the SAWG gateware simulation uses |
Looks good - I have added it to Nix, conda since that seems simple enough (but not tested, I'm just running the generic conda package builder in Nix), and https://m-labs.hk/experiment-control/resources/ |
If |
Simulator can probably use this: https://rushter.com/blog/python-bytecode-patch/ AST macros are perhaps unwieldy as we have to handle function calls in expressions e.g.
would have to be transformed to:
With lazy-evaluated boolean operators ( The debugger doesn't look like it can be used as I don't see an API for setting breakpoints to specific places in the AST or bytecode. |
Bug Report
One-Line Summary
@sbourdeauducq , maybe I am having a misunderstanding, but the timing model of the
parallel
statement might be fundamentally off.Issue Details
Steps to Reproduce
Expected Behavior
I expect both scenario's to be equivalent from a timeline perspective. Expected delay 0.1 s for both scenario's.
Actual (undesired) Behavior
the
for
loop in theparallel
statement is not parallel from a timeline perspective.Your System (omit irrelevant parts)
The text was updated successfully, but these errors were encountered: