Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vivado CDC constraints #227

Closed
nmigen-issue-migration opened this issue Sep 22, 2019 · 30 comments
Closed

Vivado CDC constraints #227

nmigen-issue-migration opened this issue Sep 22, 2019 · 30 comments

Comments

@nmigen-issue-migration
Copy link

Issue by dlharmon
Sunday Sep 22, 2019 at 20:56 GMT
Originally opened as m-labs/nmigen#227


Related: #212

This tags the first register in each MultiReg or ResetSynchronizer
with the attribute nmigen_async_ff and then applies a false path and
max delay constraint to all registers tagged with that attribute in
the .xdc file.

The max delay defaults to 5 ns and has an override, max_delay where
it can be changed for the > whole project. It's possible to make this
an argument to MultiReg instead, but is more complex. > git commit
-m "add clock domain crossing constraints on Vivado This tags the
first register in each MultiReg or ResetSynchronizer with the
attribute nmigen_async_ff and then applies a false path and max
delay constraint to all registers tagged with that attribute in the
.xdc file.

The max delay defaults to 5 ns and has an override, max_delay where
it can be changed for the whole project. It's possible to make this an
optional argument to MultiReg instead, but is more complex. It would
probably work to set nmigen_async_ff to the desired delay rather
than just TRUE. I'm not sure how hard it would be to extract that in
the TCL or if it would be easier to keep a dict of all used delay
values and put a line for each into the .xdc file.


dlharmon included the following code: https://github.com/m-labs/nmigen/pull/227/commits

@nmigen-issue-migration
Copy link
Author

Comment by codecov[bot]
Sunday Sep 22, 2019 at 20:58 GMT


Codecov Report

Merging #227 into master will decrease coverage by 0.05%.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
- Coverage   82.38%   82.32%   -0.06%     
==========================================
  Files          33       33              
  Lines        5483     5489       +6     
  Branches     1172     1174       +2     
==========================================
+ Hits         4517     4519       +2     
- Misses        834      836       +2     
- Partials      132      134       +2
Impacted Files Coverage Δ
nmigen/lib/cdc.py 84% <50%> (-6.91%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update da53048...b78ef18. Read the comment docs.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Sunday Sep 22, 2019 at 21:04 GMT


References:
https://forums.xilinx.com/t5/Timing-Analysis/how-to-constrain-a-CDC/td-p/748174

I expect a few more constraint requirements once #40 is merged due to a timing path between the write clock and async outputs on the LUT RAMs. It's a challenge to deal with: https://forums.xilinx.com/t5/Vivado-TCL-Community/set-disable-timing-for-hierarchical-cell/td-p/495808 Alternately, set the clocks to be asynchronous, but getting correct clock names is a challenge.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Sunday Sep 22, 2019 at 22:15 GMT


Xilinx UG903 is self contradictory and the set_false_path constraint appears to override the set_max_delay constraint. I'm looking into how to solve this.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 02:14 GMT


Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

I also noticed that there is a "critical warning" generated when the added lines are present in the .xdc file and there are no flip flops with the nmigen_async_ff attribute.

Direction would be appreciated.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 06:55 GMT


This tags the first register in each MultiReg or ResetSynchronizer
with the attribute nmigen_async_ff

This isn't directly related to the PR at hand (I'll look more into how Vivado works before commenting here), but I have a question you might answer regarding attributes. nMigen emits some internal attributes like nmigen.hierarchy, nmigen.decoding, src (which should really be nmigen.src), etc. Vivado complains about these. Is here a way to tell it "just ignore any attribute that starts with nmigen", other than the nmigen_async_ff and other semantically important ones?

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 07:07 GMT


Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

Can you elaborate on the exact nature of complication here?

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 07:48 GMT


Vivado complains about these

Actually, I was wrong. Vivado doesn't mind. ISE complains:

WARNING:Xst:37 - Detected unknown constraint/property "src". This constraint/property is not supported by the current software release and will be ignored.
WARNING:Xst:37 - Detected unknown constraint/property "src". This constraint/property is not supported by the current software release and will be ignored.
WARNING:Xst:37 - Detected unknown constraint/property "nmigen.hierarchy". This constraint/property is not supported by the current software release and will be ignored.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 15:18 GMT


Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

Can you elaborate on the exact nature of complication here?

We would either need an i_domain argument to MultiReg and to know the final Verilog name of the clock or some way to find the clock source of the i argument (may not exist if it came from a pin which is OK or could be combinatorial logic from multiple domains). I assumed the latter to be difficult and possibly would add much more complexity than required. If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I did find that instead of the set_min_delay -1000 -to <REG>, a better way to do it is to add set_false_path -hold -to <REG>. I'll push that change to this branch after testing it.

I think I'll need to add a dict keyed with (source_clk, max_delay) to platform for keeping track of which constraints need to be added to the .xdc file. Of course, if we don't do per CDC max_delay or set_max_delay -datapath_only -from <X> -to <Y> it could simply be a bool to enable adding to the constraints file.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 15:25 GMT


nMigen emits some internal attributes like nmigen.hierarchy, nmigen.decoding, src (which should really be nmigen.src), etc. Vivado complains about these. Is here a way to tell it "just ignore any attribute that starts with nmigen", other than the nmigen_async_ff and other semantically important ones?

As you found, Vivado does seem to ignore any attribute it doesn't use. It seems safe enough to assume it won't use anything starting with nmigen. Perhaps, I should swap my nmigen_async_ff to nmigen.async_ff for consistency.

I do see Vivado complaining about the init attribute on Verilog wires.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 15:30 GMT


We would either need an i_domain argument to MultiReg and to know the final Verilog name of the clock or some way to find the clock source of the i argument (may not exist if it came from a pin which is OK or could be combinatorial logic from multiple domains). I assumed the latter to be difficult and possibly would add much more complexity than required.

We'll have this information once #4 is implemented, but right now this isn't viable.

If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I don't understand this statement. We already have r_domain and w_domain in AsyncFIFO. Or do you mean some other kind of FIFO?

I think I'll need to add a dict keyed with (source_clk, max_delay) to platform for keeping track of which constraints need to be added to the .xdc file. Of course, if we don't do per CDC max_delay or set_max_delay -datapath_only -from <X> -to <Y> it could simply be a bool to enable adding to the constraints file.

I would personally opt for a simpler solution: grab the attribute with Tcl and set it, again, with Tcl. I'll throw together a snippet soon so you don't have to do it yourself.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 15:30 GMT


Perhaps, I should swap my nmigen_async_ff to nmigen.async_ff for consistency.

Sounds good.

I do see Vivado complaining about the init attribute on Verilog wires.

This is #220. Unfortunately an upstream issue.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 15:44 GMT


If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I don't understand this statement. We already have r_domain and w_domain in AsyncFIFO. Or do you mean some other kind of FIFO?

I was referring to MultiReg/FFSynchronizer which is used within AsyncFIFO and doesn't currently get passed the input clock. It's simple enough to add an argument i_domain = None to FFSynchronizer if it's not None add the -from and -datapath_only arguments to set_max_delay, omit set_false_path.

The LUTRAM on 7series has a timing path from the write clock to the async read data. I want to make that a false path / max delay while keeping the path from read address to read data. Having the from clock makes that possible.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 15:46 GMT


I would personally opt for a simpler solution: grab the attribute with Tcl and set it, again, with Tcl. I'll throw together a snippet soon so you don't have to do it yourself.

My knowledge of Tcl is quite limited, so I'll leave that to you.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 16:50 GMT


@dlharmon I think I understand how to do this in Tcl, but could you please explain what set_max_delay actually does here? I've been looking at various documentation for quite some time and I still do not understand why it is useful.

@nmigen-issue-migration
Copy link
Author

Comment by jordens
Monday Sep 23, 2019 at 17:05 GMT


This one is good and what I based the migen async reset synchronizer constraints on.

There are more good posts by that guy in the xilinx forums: http://www.colmryan.org/avrums-clock-domain-crossing-widsom.html

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 17:22 GMT


@dlharmon I think I understand how to do this in Tcl, but could you please explain what set_max_delay actually does here? I've been looking at various documentation for quite some time and I still do not understand why it is useful.

The first link mentioned by @jordens does a decent job explaining it.

Simply adding a false path to the flip flop is the easiest, but allows the place and route unlimited delay in that net which is problematic in some cases. For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 17:26 GMT


Screenshot from 2019-09-23 11-26-25

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 17:32 GMT


For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

Do I understand it correctly that set_max_delay reduces (such that it meets the constraint) the uncertainty in phase between different synchronizers?

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 17:39 GMT


Do I understand it correctly that set_max_delay reduces (such that it meets the constraint) the uncertainty in phase between different synchronizers?

Yes, that's a great way to put it.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 18:26 GMT


@dlharmon Here is my take on adding false path and max delay constraints for Vivado:

diff --git a/nmigen/vendor/xilinx_7series.py b/nmigen/vendor/xilinx_7series.py
index b25c4e9..30dfa64 100644
--- a/nmigen/vendor/xilinx_7series.py
+++ b/nmigen/vendor/xilinx_7series.py
@@ -78,6 +78,10 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
             {% endfor %}
             {{get_override("script_after_read")|default("# (script_after_read placeholder)")}}
             synth_design -top {{name}} -part {{platform.device}}{{platform.package}}-{{platform.speed}}
+            set_false_path -hold -to [get_cells -hier -filter {nmigen.vivado.false_path == TRUE}]
+            foreach cell [get_cells -hier -filter {nmigen.vivado.max_delay != ""}] {
+                set_max_delay [get_property nmigen.vivado.max_delay $cell] -to $cell
+            }
             {{get_override("script_after_synth")|default("# (script_after_synth placeholder)")}}
             report_timing_summary -file {{name}}_timing_synth.rpt
             report_utilization -hierarchical -file {{name}}_utilization_hierachical_synth.rpt
@@ -367,7 +371,26 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
                         reset=ff_sync._reset, reset_less=ff_sync._reset_less,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(ff_sync._stages)]
+        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
         for i, o in zip((ff_sync.i, *flops), flops):
             m.d[ff_sync._o_domain] += o.eq(i)
         m.d.comb += ff_sync.o.eq(flops[-1])
         return m
+
+    def get_reset_sync(self, reset_sync):
+        m = Module()
+        m.domains += ClockDomain("reset_sync", async_reset=True, local=True)
+        flops = [Signal(1, name="stage{}".format(index), reset=1,
+                        attrs={"ASYNC_REG": "TRUE"})
+                 for index in range(reset_sync._stages)]
+        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
+        for i, o in zip((0, *flops), flops):
+            m.d.reset_sync += o.eq(i)
+        m.d.comb += [
+            ClockSignal("reset_sync").eq(ClockSignal(reset_sync._domain)),
+            ResetSignal("reset_sync").eq(reset_sync.arst),
+            ResetSignal(reset_sync._domain).eq(flops[-1])
+        ]
+        return m

It looks like you can't use arbitrary Tcl commands in XDC files (AR#54842), so I moved it to the main script. Please let me know if this works properly.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 18:29 GMT


For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

So in the diff above I hardcoded max_delay to 5 ns. This of course should not be hardcoded nor use a global override; it should be set per-synchronizer. However, one concern I have is how it should be set. You are using 5 ns, but you are saying that you want the delay to be no longer than several cycles. So, should this perhaps be derived from the clock? Could set_multicycle_path actually be more appropriate for this case? Should we support both?

@nmigen-issue-migration
Copy link
Author

Comment by jordens
Monday Sep 23, 2019 at 18:31 GMT


Do I understand it correctly that set_max_delay reduces (such that it meets the constraint) the uncertainty in phase between different synchronizers?

It limits the skew and allows for metastability resolution. Between two flip flops in a synchronizer chain in the target clock domain, there is t_clk-t_delay time to resolve metastability. Ergo the delay must be as short as possible and can not be unlimited.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 18:34 GMT


It limits the skew and allows for metastability resolution. Between two flip flops in a synchronizer chain in the target clock domain, there is t_clk-t_delay time to resolve metastability. Ergo the delay must be as short as possible and can not be unlimited.

OK. Do I understand correctly that there are two independent purposes served by set_max_delay in this use case then? So even if you don't care about skew, you must limit the delay anyway, if you want the synchronizer to work.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 18:44 GMT


It limits the skew and allows for metastability resolution. Between two flip flops in a synchronizer chain in the target clock domain, there is t_clk-t_delay time to resolve metastability. Ergo the delay must be as short as possible and can not be unlimited.

OK. Do I understand correctly that there are two independent purposes served by set_max_delay in this use case then? So even if you don't care about skew, you must limit the delay anyway, if you want the synchronizer to work.

I have personally just used ASYNC_REG="TRUE" for constraining the latter flip flops in synchronizers. It's supposed to pack them all into a single CLB and minimize timing delays as much as possible. Some do suggest doing set_max_delay on these as well with a small time like 3 ns. Vivado seems to do the right thing without it. One thing to watch out for is that the time set must be sufficiently short for the fastest clock used.

Edit: The first flip flop certainly does need set_max_delay.

I'll try your patch shortly. It looks promising.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 19:00 GMT


It looks like you can't use arbitrary Tcl commands in XDC files (AR#54842), so I moved it to the main script. Please let me know if this works properly.

I merged this with my local copy and built a bitstream. The timing report and constraints look good. I like how this works. Thanks.

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 19:07 GMT


OK, so now the question is how to make the delay configurable. Here's my proposal:

  • max_input_delay becomes an argument to the constructor of FFSynchronizer/ResetSynchronizer;
  • It is None by default;
  • If it is not None and it is not supported by the toolchain, an exception is raised.

Opinion?

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 19:44 GMT


OK, so now the question is how to make the delay configurable. Here's my proposal:

* `max_input_delay` becomes an argument to the constructor of `FFSynchronizer`/`ResetSynchronizer`;

* It is `None` by default;

* If it is not `None` and it is not supported by the toolchain, an exception is raised.

Opinion?

That seems fine.

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 19:49 GMT


Allow the option of a full false path, not just hold only:

diff --git a/nmigen/vendor/xilinx_7series.py b/nmigen/vendor/xilinx_7series.py
index 953229a..62ab94a 100644
--- a/nmigen/vendor/xilinx_7series.py
+++ b/nmigen/vendor/xilinx_7series.py
@@ -79,7 +79,8 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
             {% endfor %}
             {{get_override("script_after_read")|default("# (script_after_read placeholder)")}}
             synth_design -top {{name}} -part {{platform.device}}{{platform.package}}-{{platform.speed}}
-            set_false_path -hold -to [get_cells -hier -filter {nmigen.vivado.false_path == TRUE}]
+            set_false_path -hold -to [get_cells -hier -filter {nmigen.vivado.false_path == HOLD_ONLY}]
+            set_false_path -to [get_cells -hier -filter {nmigen.vivado.false_path == TRUE}]
             foreach cell [get_cells -hier -filter {nmigen.vivado.max_delay != ""}] {
                 set_max_delay [get_property nmigen.vivado.max_delay $cell] -to $cell
             }
@@ -372,7 +373,7 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
                         reset=ff_sync._reset, reset_less=ff_sync._reset_less,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(ff_sync._stages)]
-        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.false_path"] = "HOLD_ONLY"
         flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
         for i, o in zip((ff_sync.i, *flops), flops):
             m.d[ff_sync._o_domain] += o.eq(i)
@@ -385,7 +386,7 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
         flops = [Signal(1, name="stage{}".format(index), reset=1,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(reset_sync._stages)]
-        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.false_path"] = "HOLD_ONLY"
         flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
         for i, o in zip((0, *flops), flops):
             m.d.reset_sync += o.eq(i)```

@nmigen-issue-migration
Copy link
Author

Comment by whitequark
Monday Sep 23, 2019 at 19:51 GMT


Allow the option of a full false path, not just hold only

The other way around?

@nmigen-issue-migration
Copy link
Author

Comment by dlharmon
Monday Sep 23, 2019 at 19:59 GMT


Allow the option of a full false path, not just hold only

The other way around?

My intent was to make it clear that it's a hold only false path in the FFSynchronizer and provide the possibility of making a setup + hold false path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant