Adjust VPR flags to reduce runtime #1735

Merged
HackerFoo merged 4 commits into f4pga:master on Nov 13, 2020

Conversation

@HackerFoo (Contributor) commented Oct 30, 2020

With these settings, baselitex on the 50t is 24% faster on pack, place and route, while ibex is 16% faster, with scalable_proc tests running faster as well.
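For illustration, a minimal sketch of how a router knob such as VPR's --astar_fac (the parameter discussed below) is passed on the command line; the architecture and circuit file names are placeholders, and this is not the exact flag set adopted by this PR:

    # Hypothetical sketch: run VPR's router with an explicit A* factor.
    # --astar_fac and --route are real VPR options; arch.xml, top.eblif,
    # and the value 1.8 are placeholders taken from the discussion below.
    import subprocess

    subprocess.run(
        ["vpr", "arch.xml", "top.eblif", "--route", "--astar_fac", "1.8"],
        check=True,
    )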

@litghost (Contributor)

CI is red; it looks like the flags were not deduplicated correctly.

Also, I'd like to see what the runtime / QoR trade-off looked like.

@HackerFoo (Contributor, Author) commented Oct 30, 2020

Comparing the last run to #1726 (old) for baselitex:

old runtime:
  pack:   147.20 seconds (max_rss 3498.1 MiB)
  place:  185.75 seconds (max_rss 3471.0 MiB)
  route:  488.92 seconds (max_rss 3471.1 MiB)

new runtime:
  pack:   152.99 seconds (max_rss 3499.8 MiB)
  place:  175.32 seconds (max_rss 3472.9 MiB)
  route:  293.65 seconds (max_rss 3472.3 MiB)

old CPD:
  sys_clk to sys_clk CPD:       18.2112  ns (54.9112 MHz)
  clk200_clk to clk200_clk CPD:  5.77223 ns (173.243 MHz)

new CPD:
  sys_clk to sys_clk CPD:       18.2289  ns (54.8581 MHz)
  clk200_clk to clk200_clk CPD:  5.66468 ns (176.533 MHz)

@litghost (Contributor)

Also, I'd like to see what the runtime / QoR trade-off looked like.

Let me be more specific. Much like inner num for the placer, A* is a runtime versus quality trade-off, and the lookahead quality determines the sharpness of that trade-off. I'd like to see how close to the edge we are with an A* of 1.8. Also, at some point as A* gets higher, the router may actually slow down again because it trusts the lookahead too much.

So I'm basically asking for two graphs, with A* on the x-axis of both: runtime on the y-axis of one, and CPD on the y-axis of the other. A third possible graph would put iterations to convergence on the y-axis.

How many circuits are you testing with?
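A minimal sketch of the kind of sweep plots being requested here, assuming the per-run A* factor, route time, and CPD have already been collected; the values below are placeholders, not measured data:

    # Hypothetical sketch: runtime and CPD versus the A* factor.
    # All values below are made up for illustration.
    import matplotlib.pyplot as plt

    astar      = [1.0, 1.2, 1.5, 1.8, 2.0]
    route_time = [520, 480, 390, 300, 310]       # seconds (placeholder)
    cpd        = [18.1, 18.2, 18.2, 18.3, 18.6]  # ns (placeholder)

    fig, (ax_rt, ax_cpd) = plt.subplots(1, 2, figsize=(10, 4))
    ax_rt.plot(astar, route_time, marker="o")
    ax_rt.set_xlabel("A* factor")
    ax_rt.set_ylabel("route time (s)")
    ax_cpd.plot(astar, cpd, marker="o")
    ax_cpd.set_xlabel("A* factor")
    ax_cpd.set_ylabel("CPD (ns)")
    fig.tight_layout()
    plt.show()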

@HackerFoo (Contributor, Author) commented Oct 30, 2020

I've analyzed the runtime trade-offs of five different parameters for baselitex and ibex on the Arty. Here is the Colab I've been working from.

@litghost (Contributor) commented Oct 30, 2020

Two circuits are likely too few to be confident that the new parameters are robust. At a minimum, I would add something like scalable_proc so you can increase the fabric usage pressure and make sure the new parameters are not too optimistic.

@litghost (Contributor) commented Oct 30, 2020

Did I miss it, or did you not test A* = 1.2 (the VPR default)? I also recommend at least one run at A* <= 1, as this will approach the best-case QoR (from a router standpoint) and give you a best-QoR / worst-runtime point to compare against the best-runtime / ??? QoR point.

@HackerFoo (Contributor, Author) commented Oct 30, 2020

I did run 1.2, but the data isn't there because I focused the matrix of parameters on what was working well. As you can see from the data above, though, there is little to no impact on CPD.

For scalable_proc:

old:
top_bram_n8: 164.11 seconds (max_rss 747.3 MiB)
top_bram36_n8: 166.70 seconds (max_rss 746.6 MiB)
top_dram_n3: 65.49 seconds (max_rss 706.4 MiB)

new:
top_bram_n8: 95.26 seconds (max_rss 746.9 MiB)
top_bram36_n8: 154.12 seconds (max_rss 746.3 MiB)
top_dram_n3: 48.55 seconds (max_rss 706.7 MiB)

@HackerFoo (Contributor, Author)

sqlite3.OperationalError: database or disk is full

@litghost (Contributor) commented Oct 30, 2020

sqlite3.OperationalError: database or disk is full

We've been seeing this, but it isn't clear why. df -h reports that the working disk is 4 TB, which feels like a network-backed disk. You can examine the logs from #1725 to see this behavior. My best guess is that even though df -h reports plenty of free space, there is an effective limit that is not obvious.

========================================
Disk usage
----------------------------------------
Filesystem      Size  Used Avail Use% Mounted on
udev             52G     0   52G   0% /dev
tmpfs            11G  8.6M   11G   1% /run
/dev/sda1        99G   69G   26G  73% /
tmpfs            52G     0   52G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            52G     0   52G   0% /sys/fs/cgroup
/dev/sdc1       246G   43G  203G  18% /opt/Xilinx
cgmfs           100K     0  100K   0% /run/cgmanager/fs
/dev/sdb1       4.0T  5.7G  4.0T   1% /tmpfs
tmpfs            11G     0   11G   0% /run/user/1000
----------------------------------------

The working directory is /tmpfs for Kokoro.
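A minimal sketch of one way to probe for such a non-obvious limit, checking inode headroom as well as byte headroom on the working disk (the /tmpfs path is taken from the log above; this is a generic check, not a diagnosis of the actual failure):

    # Check byte and inode headroom; "database or disk is full" can also
    # surface when the filesystem runs out of inodes even though df -h
    # still shows free space.
    import os

    st = os.statvfs("/tmpfs")
    print(f"free space : {st.f_bavail * st.f_frsize / 2**30:.1f} GiB")
    print(f"free inodes: {st.f_favail} of {st.f_files}")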

@acomodi (Contributor) commented Nov 3, 2020

I think that if we get good improvements in runtime at a slight CPD cost, it might be worth adding these flags.

One thing, though, would be to compare the xc7_qor results produced by this PR's Kokoro CI with those produced by current master.

@litghost (Contributor) commented Nov 3, 2020

One thing, though, would be to compare the xc7_qor results produced by this PR's Kokoro CI with those produced by current master.

xc7 QoR looks really good. The arch-defs results show this appears to be a solid point, at least for the circuits in arch-defs.

@HackerFoo can we add curves to the Colab page? I think we should be prepared to show these results this Thursday and see if we can get some insight from Vaughn.

@HackerFoo (Contributor, Author)

I'm running a sweep and also documenting how I do this here.

@HackerFoo (Contributor, Author)

I've added a scatter plot of runtime vs. max frequency for each of ibex, baselitex, and bram-n3.

@HackerFoo (Contributor, Author)

@litghost I've added more detailed instructions to the Colab.

@litghost (Contributor) commented Nov 11, 2020

@litghost I've added more detailed instructions to the Colab.

New instructions look good. Please update the Colab copy in git. The last thing (besides looking at CI results) is to change the references from https://github.com/HackerFoo/nix-symbiflow to https://github.com/Symbiflow/nix-symbiflow. You need to add DCO checks to that repo, and add DCO sign-offs to your commits.


@HackerFoo (Contributor, Author) commented Nov 12, 2020

New instructions look good. Please update the Colab copy in git. The last thing (besides looking at CI results) is to change the references from https://github.com/HackerFoo/nix-symbiflow to https://github.com/Symbiflow/nix-symbiflow. You need to add DCO checks to that repo, and add DCO sign-offs to your commits.

@litghost I've updated the Colab in git, and changed the references to https://github.com/SymbiFlow/nix-symbiflow, which has DCO checks.

I'm re-running the "Xilinx Series 7 - Install (Presubmit)" test, which failed due to an infrastructure failure.

Assuming there are no problems with that, is this PR okay to merge?


@HackerFoo (Contributor, Author)

Runtime is 24% faster for ibex (2% higher CPD) and 10% faster for litex (7% lower CPD).

@litghost (Contributor) commented Nov 12, 2020

Runtime is 24% faster for ibex (2% higher CPD) and 10% faster for litex (7% lower CPD).

So the previous settings from ead80ae were a Pareto improvement on geomean(route_time) and geomean(CPD) over master. That is not the case with the latest settings. I believe the settings from ead80ae were a better point?

@HackerFoo (Contributor, Author)

@litghost Which designs are worse? I can revert the settings to ead80ae.

@litghost (Contributor) commented Nov 13, 2020

ibex and ddr_uart_arty both show significant change, but a wide array of designs are worse at that design point. The mean percentage change is 4% worse, and the geomean CPD is 3% worse. The other point had a geomean CPD change of less than 0.2%. Given that ead80ae gets most of the performance gain with basically no geomean CPD change, it feels like the superior trade.
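For reference, the geomean CPD figure is the geometric mean of per-design CPD ratios (new over old); a minimal sketch with placeholder ratios, not the actual arch-defs results:

    # Hypothetical sketch: geometric mean of per-design CPD ratios.
    # The ratios are placeholders, not measured data.
    import math

    cpd_ratios = [1.04, 0.99, 1.06, 1.02]  # new CPD / old CPD per design
    geomean = math.exp(sum(math.log(r) for r in cpd_ratios) / len(cpd_ratios))
    print(f"geomean CPD change: {(geomean - 1) * 100:+.1f}%")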

@HackerFoo (Contributor, Author)

@litghost Okay, I've reverted the settings. Anything else before I merge this?

@litghost (Contributor)

@litghost Okay, I've reverted the settings. Anything else before I merge this?

Just waiting for green. Please rebase on master to grab the other fixes.

@litghost (Contributor) left a comment

LGTM, merge once green. Recommend rebasing on master.

Signed-off-by: Dusty DeWeese <dustin.deweese@gmail.com>
@HackerFoo (Contributor, Author)

Vendor tool tests are failing due to unrelated compilation errors.

@HackerFoo merged commit b175e3a into f4pga:master on Nov 13, 2020
@@ -1,6 +1,6 @@
 cairosvg
 gitpython
-hilbertcurve
+hilbertcurve==1.0.5
@litghost (Contributor) commented Nov 13, 2020

@HackerFoo Please file an issue with upstream hilbertcurve, and create an issue (and PR to add a TODO comment here) to remove the pin once upstream hilbertcurve is fixed.

@HackerFoo (Contributor, Author) commented Nov 13, 2020

I don't think the issue is upstream. The API changed, which is okay for a major version change (2.x).
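For context, a hedged sketch of the kind of rename involved; the 1.x names are recalled from hilbertcurve 1.0.5 and the 2.x names are assumed, so the upstream changelog should be checked before relying on them:

    from hilbertcurve.hilbertcurve import HilbertCurve

    hc = HilbertCurve(8, 2)  # 8 iterations, 2 dimensions

    # hilbertcurve 1.0.5 (the version pinned in this diff):
    # d  = hc.distance_from_coordinates([5, 10])
    # xy = hc.coordinates_from_distance(d)

    # hilbertcurve 2.x (assumed renamed equivalents):
    d = hc.distance_from_point([5, 10])
    xy = hc.point_from_distance(d)
    print(d, xy)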

@litghost (Contributor)

The PR comment in galtay/hilbertcurve#25 (comment) indicated that that specific PR was supposed to be backwards compatible. It is unclear whether the API was supposed to change here, or if it was an accident.

@litghost (Contributor)

My comment was basically to post an issue upstream showing that, from 1.0.5 to 2.0.x, this API no longer existed/worked, and to determine whether that break was intentional.

@litghost (Contributor)

Although the notes from 2.0.3 do say "New API", which is foreboding.

Regardless, we should likely create an issue to remove the pin in the future, to avoid ending up with a very stale dependency. This isn't equivalent, but a numpy pin on prjxray eventually resulted in the pip install for numpy requiring a build from source instead of using pip's binary caches. Ideally we'd like to avoid something like that.

@HackerFoo (Contributor, Author)

I propose removing the dependency and using VPR's RR node reordering option: #1773

@litghost (Contributor)

Sure. Can you please open a PR to that effect?

@HackerFoo (Contributor, Author)

Yeah, I'll assign the issue to myself.

litghost added a commit to litghost/symbiflow-arch-defs that referenced this pull request on Nov 16, 2020
…time"

This reverts commit b175e3a, reversing
changes made to f71a554.

Signed-off-by: Keith Rothman <537074+litghost@users.noreply.github.com>