Tight packing for DataPack on hardware #43

derpda · 2021-09-02T04:10:12Z

In DataPack, the data is stored in an ap_uint of bit width 8*<bytes in T>.

Lines 25 to 34 in 54b23af

    
           class DataPack { 
        
             static_assert(width > 0, "Width must be positive"); 
        
            public: 
        
             static constexpr int kBits = 8 * sizeof(T); 
        
             static constexpr int kWidth = width; 
        
             using Pack_t = ap_uint<kBits>; 
        
             using Internal_t = ap_uint<width * kBits>;

This of course is necessary on a CPU since it can only access memory on a resolution of bytes.

However, on an FPGA this introduces significant storage inefficiency. I want to store an array of fixed-point data in the URAM of the FPGA. I have some liberty to decide the bit-width of a single data element. Considering that it is best to use power-of-two shaped accesses, we want to choose a bit width so that some power-of-two elements fit into 72 bits. This means we can choose 9, 18, 36 or 72 as the bit-width for best efficiency in the URAM. In our case, we were able to make do with 9bits.

My first try was thus doing something like the following:

hlslib::DataPack<ap_fixed<9, 1>, 8> array[N];
#pragma HLS bind_storage variable=array type=RAM_1P impl=URAM

Due to line 31 in the code I cited above however, the DataPack actually internally uses a uint<128> to store the 8 fixed-point numbers, not uint<72>.
This of course means I need twice as much URAM for this type of array.

To solve this for my case, I made my own tiny class (taking the important parts from your DataPack):

template<int width>
class TightPack {
 public:
  using Data_t = ap_fixed<9, 1>;
  static constexpr int kWidth = width;
  #if defined(HLSLIB_SYNTHESIS)
  static constexpr int kBits = Data_t::width;
  #else
  static constexpr int kBits = 8 * sizeof(Data_t);
  #endif
 private:
  ap_uint<width*kBits> data_;
  using Pack_t = ap_uint<kBits>;
 public:
  Data_t Get(int i) const {
    // Just like regular DataPack
  }

  void Set(int i, weight_t val) {
    // Just like regular DataPack
  }
};

The only difference (aside from a lot of missing functionality) is the calculation of kWidth, and I made that depend on whether we are in synthesis or not.
It is also specialized for ap_fixed.

Ideally, I would like to make a small template specialization of your DataPack class instead, as in

template<int width>
class DataPack<ap_fixed> {
  // Only change kWidth implementation
};

However, this would mean I would need to copy the entire DataPack class content in there, defeating the purpose of specialization.
Maybe putting a helper class like the following in between might improve this:

template<typename T>
WidthCalculator {
 public:
  static constexpr int value = 8  * sizeof(T);
};

DataPack could use this class to calculate kBits in line 31 like:

  static constexpr int kBits = WidthCalculator<T>::value;

Then I could subclass the WidthCalculator to my liking, for example

template<>
template<int W, int I>
WidthCalculator<ap_fixed<W, I> {
 public:
  static constexpr int value = W;
};

and use DataPack<ap_fixed<9, 1>, 8> for optimal storage in URAM (and BRAM for that matter).

Actually, might this specialization (and similar ones for the other fixed width types in HLS) be important enough to include in the library itself?

The text was updated successfully, but these errors were encountered:

definelicht · 2021-09-02T08:54:56Z

I think we should go with the width calculator approach. I've commented on the PR.

definelicht · 2021-09-22T08:56:35Z

Solved with #47

derpda mentioned this issue Sep 2, 2021

Tight packing on hardware for DataPack #44

Merged

definelicht closed this as completed Sep 22, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tight packing for DataPack on hardware #43

Tight packing for DataPack on hardware #43

derpda commented Sep 2, 2021 •

edited

Loading

definelicht commented Sep 2, 2021

definelicht commented Sep 22, 2021

Tight packing for DataPack on hardware #43

Tight packing for DataPack on hardware #43

Comments

derpda commented Sep 2, 2021 • edited Loading

definelicht commented Sep 2, 2021

definelicht commented Sep 22, 2021

derpda commented Sep 2, 2021 •

edited

Loading