# Transform the Keras model into an HLS project

Import libraries, including hls4ml

In [None]:
import numpy as np
from qkeras.utils import load_qmodel
import hls4ml
import json
from sklearn.metrics import mean_squared_error

from utils.utils import preproc

Define the project name

In [None]:
modelname = "QKeras_Model_60_50_30_40_15"
outputname = "QKeras_Model_60_50_30_40_15_HLS"

Load the model that was created in the previous notebook:

In [None]:
model = load_qmodel('model.h5')

X_test,Y_test = preproc(test=True)

Now we have to configure hls4ml with some initial options:
- **Granularity**: sets how fine-grained the configuration is. `name` generates a separate configuration entry for every layer, so that each layer can be tuned individually;
- **Reuse factor**: defines the level of parallelisation, i.e. how many times each multiplier is reused within a layer. A low reuse factor achieves lower latency and higher throughput, but uses more resources. A higher reuse factor saves resources at the expense of longer latency and lower throughput.
![Reuse factor](images/reuse_factor.png)
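
To make the trade-off concrete, the sketch below counts the multipliers a fully connected layer would need for a few reuse factors. The layer size (60 inputs, 50 outputs) and the helper name `multipliers_needed` are assumptions chosen for illustration. The resources actually reported after synthesis also depend on the precision and on optimisations applied by the tools.

```python
import math

def multipliers_needed(n_in, n_out, reuse_factor):
    """Approximate number of parallel multipliers for a dense layer.

    A dense layer performs n_in * n_out multiplications. With reuse factor R,
    each multiplier is used R times, so roughly n_in * n_out / R are needed.
    """
    return math.ceil(n_in * n_out / reuse_factor)

# Hypothetical 60 -> 50 dense layer, evaluated for several reuse factors
for rf in (1, 2, 10, 50):
    print(f"ReuseFactor={rf:>2}: ~{multipliers_needed(60, 50, rf)} multipliers")
```

With a reuse factor of 1 every multiplication gets its own multiplier, which is the fastest and most resource-hungry option.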

In [None]:
# Create the configuration dictionary
config = hls4ml.utils.config_from_keras_model(model, granularity='name', default_reuse_factor=1)

# Enable tracing, i.e. also save the outputs passed between hidden layers
for layer in config['LayerName'].keys():
    config['LayerName'][layer]['Trace'] = True

In [None]:
print(json.dumps(config, indent=4))
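
Because the configuration is a plain nested dictionary, individual layers can be tuned before the conversion. The sketch below uses a small hand-written dictionary that only mimics the `LayerName` layout printed above. The layer names `dense_1` and `dense_2` are hypothetical, and in the notebook the same assignment would be applied to `config` itself.

```python
# Hand-written stand-in for the dictionary returned by config_from_keras_model
# (layer names and values are hypothetical).
example_config = {
    "LayerName": {
        "dense_1": {"ReuseFactor": 1, "Trace": True},
        "dense_2": {"ReuseFactor": 1, "Trace": True},
    },
}

# Trade latency for resources on one layer only
example_config["LayerName"]["dense_1"]["ReuseFactor"] = 4

for name, settings in example_config["LayerName"].items():
    print(name, settings)
```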

Now that the configuration dictionary has been created, we can convert the QKeras model created earlier into an hls4ml model. It is important to note that we have to specify the **FPGA hardware**, so that the design is mapped correctly onto the target device:

In [None]:
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir=modelname + '/' + outputname + '/HLS_Project',
    fpga_part='xc7z020-1clg400c',  # FPGA part: PYNQ-Z2
)

Let’s visualise what we created. The model architecture is shown, annotated with the shapes and data types of each layer:

In [None]:
hls4ml.utils.plot_model(hls_model, show_shapes=True, show_precision=True, to_file=None)

## Compile the model with hls4ml

Now we need to check that the performance of the converted model is still good. We compile `hls_model` and then use `hls_model.predict` to run a bit-accurate emulation of the FPGA firmware on the CPU. The QKeras predictions, on the other hand, are computed with `model.predict()` as usual.

In [None]:
# Compile the C++ emulation of the firmware
hls_model.compile()

# hls_model.predict expects a C-contiguous array
X_test = np.ascontiguousarray(X_test)
Y_hls = hls_model.predict(X_test)

Y_keras = model.predict(X_test)

Now let’s see how the performance compares to QKeras, by computing the root mean squared error:

In [None]:
qkeras_rmse = np.sqrt(mean_squared_error(Y_test, Y_keras))
hls_rmse = np.sqrt(mean_squared_error(Y_test, Y_hls))

print("QKeras RMSE: {}".format(qkeras_rmse))
print("hls4ml RMSE: {}".format(hls_rmse))
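
Comparing both models against `Y_test` shows how well each one performs, but not how closely they agree with each other. A minimal sketch of a direct comparison follows, using small synthetic arrays in place of `Y_keras` and `Y_hls`. The values and the helper name `rmse` are made up for illustration.

```python
import numpy as np

def rmse(a, b):
    """Root mean squared error between two arrays with the same number of elements."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-ins for Y_keras and Y_hls (values are made up)
y_keras_demo = np.array([[0.50], [1.25], [2.00]])
y_hls_demo = np.array([[0.50], [1.00], [2.25]])

print("Keras vs hls4ml RMSE:", rmse(y_keras_demo, y_hls_demo))
print("Largest absolute difference:", np.max(np.abs(y_keras_demo - y_hls_demo)))
```

A small value here suggests that the fixed-point emulation tracks the QKeras model closely, independently of how well either of them fits the labels.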

## Synthesize the model 

The final step is the synthesis of the model, using the Vivado HLS tool from Xilinx (now part of AMD). The synthesis targets the specific hardware in use, in our case the **Xilinx PYNQ-Z2 board**:

![PYNQ-Z2 board](images/pynq-z2.png)

In [None]:
hls_model.build()

However, this step will fail here, because Vivado HLS is not installed on this system. This is expected: it requires licensed software that cannot be hosted on this platform. The synthesis has been done separately and, with the actual firmware created, we are ready to continue on the FPGA board...
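
A guard like the following avoids the failure on machines without the toolchain. It is only a sketch: it assumes the executable is called `vivado_hls` and is expected on the `PATH`, and the helper name `hls_tool_available` is hypothetical.

```python
import shutil

def hls_tool_available(executable="vivado_hls"):
    """Return True if the given HLS executable can be found on the PATH."""
    return shutil.which(executable) is not None

if hls_tool_available():
    print("vivado_hls found: hls_model.build() can be attempted.")
else:
    print("vivado_hls not found: skip hls_model.build() on this machine.")
```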