Adjust VPR flags to reduce runtime #1735
Conversation
CI is red; it looks like the flags were not deduped correctly. Also, I'd like to see what the runtime/QoR trade-off looks like.
Force-pushed from 452b070 to ead80ae.
Comparing the last run to #1726 (old) for baselitex:
Let me be more specific. Much like --inner_num for the placer, A* is a runtime-versus-quality trade-off, and the lookahead quality determines the sharpness of that trade-off. I'd like to see how close to the edge we are with an A* of 1.8. Also, at some point as A* gets higher, the router may actually slow down again because it excessively trusts the lookahead. So I'm basically asking for two graphs, with A* on the x-axis of both: on the y-axis of one is runtime, and on the other is CPD. A third possible graph would put iterations to convergence on the y-axis. How many circuits are you testing with?
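A sweep like the one requested can be scripted. The sketch below builds a matrix of VPR invocations over a range of A* values; --astar_fac is VPR's router A* weighting option, but the circuit names, architecture file, and remaining arguments are illustrative placeholders, not this PR's actual setup:

```python
# Sketch: enumerate one VPR run per (circuit, A* factor) pair.
# --astar_fac is VPR's router A* option; everything else here
# (arch file, circuit names) is a placeholder for illustration.
astar_values = [1.0, 1.2, 1.5, 1.8, 2.0]
circuits = ["baselitex", "ibex"]  # placeholder benchmark names

def vpr_command(circuit, astar):
    """Build the argument list for one routing run."""
    return [
        "vpr", "arch.xml", circuit,
        "--route",
        "--astar_fac", str(astar),
    ]

runs = [vpr_command(c, a) for c in circuits for a in astar_values]
# One run per pair: 2 circuits x 5 A* values = 10 runs.
print(len(runs))  # → 10
```

Each run's wall-clock time, final CPD, and router iteration count would then feed the three requested plots.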
I've analyzed the runtime trade-offs of 5 different parameters for baselitex and ibex on the Arty. Here is the Colab I've been working from.
Two circuits is likely too small a sample to be confident that the new parameters are robust. At a minimum, I would add something like scalable_proc so you can increase the fabric usage pressure and make sure the new parameters are not too optimistic.
Did I miss it, or did you not test A* = 1.2 (the VPR default)? I also recommend at least one run at A* <= 1, as this will approach the best-case QoR (from a router standpoint) and give you a best-QoR / worst-runtime point to compare against a best-runtime / ??? QoR point.
I did run 1.2, but the data isn't there because I focused the matrix of parameters on what was working well. As you can see from the data above, though, there is little to no impact on CPD. For scalable_proc:
We've been seeing this, but it isn't clear why.
The working directory is
I think that if we get good improvements in runtime at a slight CPD cost, it might be worth adding these flags. One thing, though: compare the xc7_qor produced by this PR's Kokoro CI against the one produced by current master.
xc7 QoR looks really good. The arch-defs results show this appears to be a solid point, at least for the circuits in arch-defs. @HackerFoo, can we add curves to the Colab page? I think we should be prepared to show these results this Thursday and see if we can get some insight from Vaughn.
I'm running a sweep and also documenting how I do this here.
I've added a scatter plot of runtime vs. max frequency for each of ibex, baselitex, and bram-n3.
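For reference, maximum frequency is just the reciprocal of critical-path delay, so the scatter plot's y-axis can be derived directly from router CPD output. A minimal sketch of the conversion, with placeholder (runtime, CPD) pairs rather than this PR's measured data:

```python
# fmax [MHz] = 1000 / CPD [ns] -- the reciprocal relation used
# when plotting runtime vs. max frequency for each sweep point.
def fmax_mhz(cpd_ns):
    return 1000.0 / cpd_ns

# Placeholder (runtime seconds, CPD nanoseconds) pairs.
sweep = [(120.0, 10.0), (95.0, 10.5), (80.0, 11.1)]
points = [(runtime, fmax_mhz(cpd)) for runtime, cpd in sweep]
print(points[0])  # → (120.0, 100.0)
```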
Force-pushed from ead80ae to 8ed6589.
@litghost I've added more detailed instructions to the Colab.
New instructions look good. Please update the Colab copy in git. The last thing (besides looking at CI results) is to change the references from https://github.com/HackerFoo/nix-symbiflow to https://github.com/Symbiflow/nix-symbiflow. You also need to add DCO checks to that repo and DCO sign-offs on your commits.
Force-pushed from 5e93ce2 to 7b5826b.
@litghost I've updated the Colab in git, and changed the references to https://github.com/SymbiFlow/nix-symbiflow, which has DCO checks. I'm re-running the "Xilinx Series 7 - Install (Presubmit)" test which failed due to an infrastructure failure. Assuming there are no problems with that, is this PR okay to merge?
Runtime is 24% faster for ibex (2% higher CPD) and 10% faster for litex (7% lower CPD).
So the previous settings from ead80ae were a Pareto improvement on
ibex and ddr_uart_arty both show significant change, but a wide array of designs are worse at that design point. The mean percentage change is 4% worse, and geomean CPD is 3% worse. The other point had a geomean CPD change of less than 0.2%. Given that ead80ae has most of the performance gain with basically no geomean CPD change, it feels like the superior trade-off.
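The geomean CPD change cited here is computed over per-design ratios, so one outlier design cannot dominate the summary the way it can with an arithmetic mean. A minimal sketch with placeholder ratios (not this PR's numbers):

```python
import math

def geomean(ratios):
    """Geometric mean of per-design CPD ratios (new / baseline)."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Placeholder per-design CPD ratios; > 1.0 means CPD got worse.
ratios = [1.02, 0.98, 1.05, 1.00]
change_pct = (geomean(ratios) - 1.0) * 100.0  # positive => worse overall
```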
@litghost Okay, I've reverted the settings. Anything else before I merge this?
Just waiting for green. Please rebase on master to grab the other fixes.
LGTM, merge once green. Recommend rebasing on master.
Signed-off-by: Dusty DeWeese <dustin.deweese@gmail.com>
Force-pushed from 99d1d54 to f463d20.
Vendor tool tests are failing due to unrelated compilation errors.
@@ -1,6 +1,6 @@
 cairosvg
 gitpython
-hilbertcurve
+hilbertcurve==1.0.5
@HackerFoo Please file an issue with upstream hilbertcurve, and create an issue (and a PR to add a TODO comment here) to remove the pin once upstream hilbertcurve is fixed.
I don't think the issue is upstream. The API changed, which is okay for a major version change (2.x).
The PR comment in galtay/hilbertcurve#25 (comment) indicated that that specific PR was supposed to be backwards-compatible. It is unclear whether the API was supposed to change here or whether it was an accident.
My comment was basically: post an issue upstream showing that from 1.0.5 to 2.0.x this API no longer exists/works, and determine whether that break was intentional.
Although the notes from 2.0.3 do say "New API", which is foreboding. Regardless, we should likely create an issue to remove the pin in the future, to avoid ending up with a very stale dependency. This isn't equivalent, but a numpy pin on prjxray eventually forced pip install to build numpy from source instead of using pip's binary caches. Ideally we'd avoid something like that.
I propose removing the dependency and using VPR's RR node reordering option: #1773
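Switching to VPR's built-in reordering would drop the external hilbertcurve dependency entirely. A sketch of what the swap might look like on the command line; the flag name --reorder_rr_graph_nodes and the degree_bfs algorithm are assumptions about the VPR version in use, so check `vpr --help` for the exact spelling:

```python
# Sketch: extend a VPR argument list with built-in RR node
# reordering in place of the external hilbertcurve-based step.
# Flag name and value are assumptions about the VPR build in use.
base_args = ["vpr", "arch.xml", "circuit.eblif", "--route"]
reorder_args = base_args + ["--reorder_rr_graph_nodes", "degree_bfs"]
print(" ".join(reorder_args))
```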
Sure. Can you please open a PR to that effect?
Yeah, I'll assign the issue to me.
With these settings, baselitex on the 50t is 24% faster on pack, place, and route, while ibex is 16% faster, with scalable_proc tests running faster as well.