## LUT Conversion Tool

Convert an approximate multiplier function into a pre-computed lookup table (LUT) stored in a C header file.

 * The example tool flow imports an axx_mult from the EvoApprox library (Step 1a), but any user-defined axx_mult can be used instead (Step 1b).
 * Start from "Step 1a" or "Step 1b", depending on your case.
 
 * **Important**: Only 8-bit signed multipliers are supported at the moment, so the output header file must contain a 256x256 C array.
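
The 256x256 layout can be illustrated with a small sketch. It assumes, as the generation loop in Step 5 suggests, that each LUT index is the unsigned bit pattern of a signed 8-bit operand: indices 0..127 stand for 0..127 and indices 128..255 stand for -128..-1. The helper names `index_to_signed` and `signed_to_index` are hypothetical and are not part of the tool.

```python
def index_to_signed(idx, nbits=8):
    """Interpret a LUT index (0 .. 2**nbits - 1) as a two's-complement value."""
    size = 1 << nbits
    if not 0 <= idx < size:
        raise ValueError("index out of range")
    # Upper half of the index range encodes the negative operands.
    return idx - size if idx >= size // 2 else idx


def signed_to_index(value, nbits=8):
    """Map a signed operand to its LUT index (its unsigned bit pattern)."""
    return value & ((1 << nbits) - 1)


# Illustrative values only.
print(index_to_signed(127), index_to_signed(128), index_to_signed(255))
print(signed_to_index(-1), signed_to_index(-128))
```

Under this assumption, the product of two signed operands `a` and `b` would be read as `lut[signed_to_index(a)][signed_to_index(b)]`.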

## Step 1a (case for EvoApprox multiplier)

Select an axx_mult from the EvoApprox library and run the script that EvoApprox provides (Python pyx method).

In [None]:
use_evo = True

# example used: mul8s_1L2H
################
!curl -s "https://ehw.fit.vutbr.cz/evoapproxlib/v1.0?folder=multiplers/8x8_signed/pareto_pwr_wce&file=mul8s_1L2H.c&pyx=bash" | bash
import pyximport
pyximport.install()
import mul8s_1L2H

def u2s(v): # 16b unsigned to 16b signed
    if v & 32768:
        return v - 65536
    return v
################


## Step 1b (case for user-defined multiplier)

Create a class for your custom multiplier. The method 'mul' inside it can be user-defined.


In [None]:
use_evo = False

class my_accurate_mult(object):
    def mul(self, a, b):
        return a * b
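
As an illustration of a non-exact user-defined multiplier, here is a minimal sketch of a truncation-based design that clears the `k` lowest bits of each operand before multiplying. The class name `my_truncated_mult` and the parameter `k` are hypothetical; they are chosen only for this example and do not correspond to any EvoApprox circuit.

```python
class my_truncated_mult(object):
    """Hypothetical approximate multiplier: drops the k lowest bits of each operand."""

    def __init__(self, k=2):
        self.k = k

    def mul(self, a, b):
        # Mask with the k least-significant bits cleared (e.g. k=2 -> ...11111100).
        mask = ~((1 << self.k) - 1)
        # For negative Python ints, '&' acts on the two's-complement pattern,
        # so negative operands are rounded towards minus infinity.
        return (a & mask) * (b & mask)


approx = my_truncated_mult(k=2)
print(approx.mul(5, 7), approx.mul(-5, 7))
```

To use such a class, Step 2 would assign `axx_mult = my_truncated_mult()`. Because this `mul` already returns signed Python integers, `use_signed_conversion` in Step 3 would be set to 'False'.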

## Step 2

Select the current multiplier


In [None]:
if use_evo:
    # for EvoApprox multiplier
    axx_mult = mul8s_1L2H # change name appropriately
else:
    # for user-defined multiplier
    axx_mult = my_accurate_mult()

## Step 3

### Unsigned-to-signed conversion for user-defined multipliers

If the user-defined multiplier outputs unsigned numbers, set the following flag to 'True' to convert them to signed; otherwise set it to 'False'.

Also leave the flag set to 'True' when using an EvoApprox multiplier.

**Important**: The conversion function is written for 8-bit multipliers (16-bit products). Other bit widths are not supported at the moment.


In [None]:
use_signed_conversion = True

In [None]:
###### DO NOT CHANGE ######

nbits = 8

if use_signed_conversion:
    #for the case of signed conversion
    def u2s(v): # 16b unsigned to 16b signed
        if v & 32768:
            return v - 65536
        return v
else:  
    #for the case of no conversion
    def u2s(v): 
        return v
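
A few worked values may make the conversion concrete. The sketch below repeats the signed-conversion variant of `u2s` from the cell above so that it runs on its own; the input values are illustrative.

```python
def u2s(v):  # 16b unsigned to 16b signed, same logic as the cell above
    if v & 32768:          # bit 15 set: negative in 16-bit two's complement
        return v - 65536
    return v


# Raw 16-bit patterns and the signed values they represent.
examples = {0: 0, 32767: 32767, 32768: -32768, 65526: -10, 65535: -1}
for raw, expected in examples.items():
    assert u2s(raw) == expected

# Pitfall: a product that is already signed must not be converted again.
# In Python, -10 also has bit 15 set, so the conversion shifts it by 65536.
already_signed = -10
double_converted = u2s(already_signed)
print(double_converted)
```

This is why the flag should be 'False' when the multiplier already returns signed values.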

## Step 4

Set the name of the header file to write to disk.


In [None]:
mult_name = 'mul8s_1L2H'

## Step 5

Write the header file to disk

This script writes the C header file (*.h) to the current folder.

You can then move it to 'ext_modules/include/nn/cuda/axx_mults' and use its file name as the argument to adapt layers.

In [None]:
with open(mult_name + '.h', 'w') as filehandle: 
    bits = int(pow(2,nbits))
    lut_size_str = str(bits)

    filehandle.write('#include <stdint.h>\n\n')
    filehandle.write('const __device__  int' + str(2*nbits) + '_t lut [' + lut_size_str + '][' + lut_size_str +'] = {')       
    
    for i in range (0,bits//2):
        filehandle.write('{')
        for j in range (0,bits//2):
            x = u2s(axx_mult.mul(i,j))
            filehandle.write('%s' % x)
            filehandle.write(', ')  
        for j in range (bits//2,bits):
            x = u2s(axx_mult.mul(i,(bits-j)*-1))
            filehandle.write('%s' % x)
            if j!=bits-1:
                filehandle.write(', ') 
        filehandle.write('},')  
        filehandle.write('\n')
        
    for i in range (bits//2,bits):
        filehandle.write('{')
        for j in range (0,bits//2):
            x = u2s(axx_mult.mul((bits-i)*-1,j))        
            filehandle.write('%s' % x)
            filehandle.write(', ')  
        for j in range (bits//2,bits):
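            # both operands are negative here; pass them as such, since an
            # approximate multiplier need not satisfy mul(-a,-b) == mul(a,b)
            x = u2s(axx_mult.mul((bits-i)*-1,(bits-j)*-1))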
            filehandle.write('%s' % x)
            if j!=bits-1:
                filehandle.write(', ')
        if(i!=bits-1):        
            filehandle.write('},')
            filehandle.write('\n')
    filehandle.write('}};\n')        
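
After writing the file, it can be useful to read it back and confirm that the table has the expected shape and contents. The following is a minimal sketch under stated assumptions: the header has the layout produced by the cell above (a single `= {{...},{...}};` initializer), and the helper names `parse_lut` and `find_mismatches` are hypothetical and are not part of the tool.

```python
import re


def parse_lut(header_text):
    """Return the LUT stored in a generated header as a list of rows of ints."""
    # Keep only the initializer, so the [256][256] in the declaration
    # is not mistaken for data.
    body = header_text.split('=', 1)[1]
    # Each innermost {...} group is one row of the table.
    rows = re.findall(r'\{([^{}]*)\}', body)
    return [[int(tok) for tok in row.split(',') if tok.strip()] for row in rows]


def find_mismatches(lut, mul, nbits=8):
    """Compare each entry with mul(a, b) on the signed operands.

    Returns a list of (row, col, stored, expected) tuples; an empty list
    means every entry matches.
    """
    size = 1 << nbits

    def signed(i):
        return i - size if i >= size // 2 else i

    mismatches = []
    for i in range(size):
        for j in range(size):
            expected = mul(signed(i), signed(j))
            if lut[i][j] != expected:
                mismatches.append((i, j, lut[i][j], expected))
    return mismatches
```

A possible use in this notebook would be to call `parse_lut(open(mult_name + '.h').read())`, check that the result has 256 rows of 256 entries, and then pass it to `find_mismatches` together with `lambda a, b: u2s(axx_mult.mul(a, b))`.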