Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tight packing #47

Merged
merged 4 commits into from
Sep 22, 2021
Merged

Tight packing #47

merged 4 commits into from
Sep 22, 2021

Conversation

definelicht
Copy link
Owner

@definelicht definelicht commented Sep 16, 2021

Support for lower-than-1-byte types to be tightly packed in DataPack implemented by @derpda.

derpda and others added 3 commits September 16, 2021 14:03
* Initial try at WidthCalculator for DataPack

More template magic could make this nicer.

* Fix: Was using wrong type

* Small cleanup

* Specializations improved,ap_int & ap_ufixed added

* Catch update, test cases added for Xilinx types

* Bit width was too small for test value

* Prevent segfaults by explicit type usage

Function resolution seems to not work as intended for some reason...

* Fix: Replace reinterpret_cast for xilinx types

* Comment finished

* TypeHandler made struct, moved to namespace

* Removed C-style type conversions

* Typo fixed in DataPack.h

Co-authored-by: definelicht <johannes@musicmedia.dk>

* Code review: Pass ap_uint by refernce, not value

Co-authored-by: definelicht <johannes@musicmedia.dk>

Co-authored-by: definelicht <johannes@musicmedia.dk>
@definelicht
Copy link
Owner Author

@derpda This is what I was talking about, just adding a kernel that can is run through HLS to make sure that it synthesizes correctly.

@derpda
Copy link
Contributor

derpda commented Sep 17, 2021

I see, that was simpler than I thought... Thank you!
I didn't find the time to understand the test framework for HLS, but it looks like you have a great system set up here.

In other news, my code now works with the new tight DataPack! I don't quite understand what went wrong, but changing a part (seemingly) unrelated to the DataPack changes made it work again.

Should we add more tests to this regardless?

@definelicht
Copy link
Owner Author

It's okay like this unless you found something that could cause the issue in your code that we should check for

@derpda
Copy link
Contributor

derpda commented Sep 17, 2021

I do see the expected (in my case significant) decreases in BRAM and URAM usage, and the code works so that's great.
However one thing I am noticing is that (with the otherwise exact same code) my DSP usage goes way up (almost doubles).
There could probably be many reasons for this, but might there be something in the change we made here that could cause such an increase?

@derpda
Copy link
Contributor

derpda commented Sep 17, 2021

It must have something to do with the to_range and from_range functions, but I don't see how they would be using DSPs.
Could inlining help? I'm a bit out of my depth here I think.

@definelicht
Copy link
Owner Author

definelicht commented Sep 17, 2021

You can provide the exact minimal example where you see this happening

@definelicht
Copy link
Owner Author

Could you provide the code so we can take a look at what's going on?

@derpda
Copy link
Contributor

derpda commented Sep 21, 2021

Sorry, it was a holiday weekend over here!
I am having difficulty reproducing the issue in a minimal code.
Maybe HLS just failed at some optimization for my particular application code, leading to the difference?
Even in my full application I cannot tell for sure if it always causes that problem.
I doubled the size of my central systolic array and the number of DSPs used did not increase (which is strange, that array uses most of my resources).

Either way, you can find the (fairly) minimal example I was tinkering with here:
https://github.com/derpda/tight_pack_minimal
Configuration can be done as follows:
cmake -DTIGHT_PACKING=<ON|OFF> -DPLATFORM=<platform>..
This will switch the git branch used for the download of hlslib.
If you want to try it, make sure to use separate build folders since running cmake again with a changed TIGHT_PACKING doesn't re-download hlslib.

@definelicht
Copy link
Owner Author

definelicht commented Sep 21, 2021

Sorry, it was a holiday weekend over here!

Of course, no worries.

In the diff I get, the tightly packed version seems to use less LUTs, FFs, and BRAM, and does not change the DSP usage (left is master, right is tight-packing):

diff.txt

To me this looks exactly like what I'd expect. What was your concern?

@definelicht
Copy link
Owner Author

By the way, check out the new add_vitis_kernel and add_vitis_program, maybe they're useful to you

@derpda
Copy link
Contributor

derpda commented Sep 22, 2021

Well, like I said I had trouble reproducing the issue.

HLS compilation and linking is kind of an opaque process, so it's hard to say why it caused the increased usage in my code.
I cannot come up with a reason why it would have caused increased usage, but for some reason it did.
However with some recent changes to my code (that should not be related) usage miraculously went back down so I think we are good here.

@derpda
Copy link
Contributor

derpda commented Sep 22, 2021

And thank you for the tip about add_vitis_kernel and add_vitis_program! I'll check it out.

@definelicht definelicht merged commit 25add79 into master Sep 22, 2021
@definelicht definelicht deleted the tight-packing branch September 22, 2021 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants