Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tight packing for DataPack on hardware #43

Closed
derpda opened this issue Sep 2, 2021 · 2 comments
Closed

Tight packing for DataPack on hardware #43

derpda opened this issue Sep 2, 2021 · 2 comments

Comments

@derpda
Copy link
Contributor

derpda commented Sep 2, 2021

In DataPack, the data is stored in an ap_uint of bit width 8*<bytes in T>.

class DataPack {
static_assert(width > 0, "Width must be positive");
public:
static constexpr int kBits = 8 * sizeof(T);
static constexpr int kWidth = width;
using Pack_t = ap_uint<kBits>;
using Internal_t = ap_uint<width * kBits>;

This of course is necessary on a CPU since it can only access memory on a resolution of bytes.

However, on an FPGA this introduces significant storage inefficiency. I want to store an array of fixed-point data in the URAM of the FPGA. I have some liberty to decide the bit-width of a single data element. Considering that it is best to use power-of-two shaped accesses, we want to choose a bit width so that some power-of-two elements fit into 72 bits. This means we can choose 9, 18, 36 or 72 as the bit-width for best efficiency in the URAM. In our case, we were able to make do with 9bits.

My first try was thus doing something like the following:

hlslib::DataPack<ap_fixed<9, 1>, 8> array[N];
#pragma HLS bind_storage variable=array type=RAM_1P impl=URAM

Due to line 31 in the code I cited above however, the DataPack actually internally uses a uint<128> to store the 8 fixed-point numbers, not uint<72>.
This of course means I need twice as much URAM for this type of array.

To solve this for my case, I made my own tiny class (taking the important parts from your DataPack):

template<int width>
class TightPack {
 public:
  using Data_t = ap_fixed<9, 1>;
  static constexpr int kWidth = width;
  #if defined(HLSLIB_SYNTHESIS)
  static constexpr int kBits = Data_t::width;
  #else
  static constexpr int kBits = 8 * sizeof(Data_t);
  #endif
 private:
  ap_uint<width*kBits> data_;
  using Pack_t = ap_uint<kBits>;
 public:
  Data_t Get(int i) const {
    // Just like regular DataPack
  }

  void Set(int i, weight_t val) {
    // Just like regular DataPack
  }
};

The only difference (aside from a lot of missing functionality) is the calculation of kWidth, and I made that depend on whether we are in synthesis or not.
It is also specialized for ap_fixed.

Ideally, I would like to make a small template specialization of your DataPack class instead, as in

template<int width>
class DataPack<ap_fixed> {
  // Only change kWidth implementation
};

However, this would mean I would need to copy the entire DataPack class content in there, defeating the purpose of specialization.
Maybe putting a helper class like the following in between might improve this:

template<typename T>
WidthCalculator {
 public:
  static constexpr int value = 8  * sizeof(T);
};

DataPack could use this class to calculate kBits in line 31 like:

  static constexpr int kBits = WidthCalculator<T>::value;

Then I could subclass the WidthCalculator to my liking, for example

template<>
template<int W, int I>
WidthCalculator<ap_fixed<W, I> {
 public:
  static constexpr int value = W;
};

and use DataPack<ap_fixed<9, 1>, 8> for optimal storage in URAM (and BRAM for that matter).

Actually, might this specialization (and similar ones for the other fixed width types in HLS) be important enough to include in the library itself?

@definelicht
Copy link
Owner

I think we should go with the width calculator approach. I've commented on the PR.

@definelicht
Copy link
Owner

Solved with #47

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants