# OpenCV Overlay: Filter2D and Remap Example

This notebook shows what you can do with accelerated OpenCV cores built as a PYNQ overlay. The overlay contains a 2D filter and a remap function, and this example notebook does the following:
1. Sets up the HDMI drivers
2. Sets up widgets for controlling the position, size, and zoom of the "magnifying glass"
3. Runs a software-only remap on the HDMI input and sends the result to the HDMI output
4. Runs the hardware-accelerated remap function
5. Runs the hardware-accelerated filter2D + remap functions

NOTE: Rough FPS values are computed for each stage

Program the overlay and load the Python libraries for the memory manager and accelerator drivers.

NOTE: Load the overlay and all Python libraries before assigning the HDMI input and output. This ordering is currently required for correct operation and is expected to be relaxed in future releases. For now, please copy this block as is when using it in your own designs.

In [None]:
from pynq.lib.video import *
from pynq.overlays.bare_hdmi import BareHDMIOverlay
base = BareHDMIOverlay("/home/xilinx/pynq/overlays/computer_vision/xv2Filter2DRemap.bit")
from pynq import Xlnk
mem_manager = Xlnk()
import pynq.overlays.xv2Filter2DRemap as xv2

hdmi_in = base.video.hdmi_in
hdmi_out = base.video.hdmi_out

Set up and configure the HDMI drivers (~10 seconds to initialize HDMI input/output)

In [None]:
hdmi_in.configure(PIXEL_GRAY)
hdmi_out.configure(hdmi_in.mode)

hdmi_in.start()
hdmi_out.start()

Set up the HDMI input/output parameters and the remap demo parameters, which are referenced in later function calls

In [None]:
mymode = hdmi_in.mode
print("My mode: "+str(mymode))

height = hdmi_in.mode.height
width = hdmi_in.mode.width
bpp = hdmi_in.mode.bits_per_pixel

cx = 700
cy = 500
radius = 60
zoom = 2

Define helper functions

This function computes the map matrices used by the remap function. It is not part of the accelerated function and is only called when the "magnifying glass" is moved or its parameters are changed.

In [None]:
import time  # used for the FPS timing in the loops below
import numpy as np
import cv2

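def makeMapCircleZoom(width, height, cx, cy, radius, zoom):
    # Start from identity maps: mapX[j, i] = i and mapY[j, i] = j.
    mapY, mapX = np.indices((height, width), dtype=np.float32)

    # Visit only the bounding square of the circle; (j, i) are local indices.
    for (j, i), _ in np.ndenumerate(mapX[cy-radius:cy+radius, cx-radius:cx+radius]):
        x = i - radius    # offset from the circle centre
        y = j - radius
        i += cx - radius  # convert local indices to image coordinates
        j += cy - radius
        if np.sqrt(x*x + y*y) < radius:
            # Inside the circle, sample closer to the centre to magnify.
            mapX[(j, i)] = cx + x/zoom
            mapY[(j, i)] = cy + y/zoom

    return (mapX, mapY)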
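
To make the mapping rule concrete, the sketch below evaluates it for a single output pixel in plain Python. It follows the same rule as `makeMapCircleZoom`: inside the circle the output pixel samples from a point `1/zoom` as far from the centre, and outside the circle it samples from itself. The helper name `source_coord` is hypothetical and is not part of the overlay or of OpenCV.

```python
import math

def source_coord(i, j, cx, cy, radius, zoom):
    """Return the (x, y) input coordinate sampled for output pixel (i, j)."""
    dx = i - cx
    dy = j - cy
    if math.sqrt(dx * dx + dy * dy) < radius:
        # Inside the lens: sample closer to the centre, which magnifies.
        return (cx + dx / zoom, cy + dy / zoom)
    # Outside the lens: identity mapping.
    return (float(i), float(j))

# A pixel 20 columns right of the centre reads from 10 columns right of it.
inside = source_coord(720, 500, cx=700, cy=500, radius=60, zoom=2)
# A pixel 100 columns away lies outside the radius-60 lens and is unchanged.
outside = source_coord(800, 500, cx=700, cy=500, radius=60, zoom=2)
print(inside, outside)
```

With `zoom=2`, the 120-pixel-wide lens displays the central 60 pixels of the input, which is what produces the magnified look.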

Setup control widgets

This widget controls our "magnifying glass." The demo lets the user change the position (x, y), size, and zoom of the magnifying glass via slider controls.

In [None]:
from ipywidgets import interact, interactive, fixed, interact_manual, IntSlider, FloatSlider
import ipywidgets as widgets

var_changed_g = 0
cx_g = 700
cy_g = 500
radius_g = 100
zoom_g = 1.4

def makeMapCircleZoomAndUpdate(width, height, cx, cy, radius, zoom):
    global var_changed_g
    global cx_g
    global cy_g
    global radius_g
    global zoom_g
    #print(var_changed_g,cx,cy,radius,zoom)
    if var_changed_g == 0:
        cx_g = cx
        cy_g = cy
        radius_g = radius
        zoom_g = zoom
        var_changed_g = 1
    #print(cx_g,cy_g,radius_g,zoom_g)
    
width_widget = width
height_widget = height
cx_widget = IntSlider(min=0, max=width-1, step=1, value=width//2, continuous_update=False)
cy_widget = IntSlider(min=0, max=height-1, step=1, value=height//2, continuous_update=False)
radius_widget = IntSlider(min=1,max=100,step=1,value=60,continuous_update=False)
zoom_widget = FloatSlider(min=0.1,max=4.0,step=0.1,value=2.0,continuous_update=False)

interact(makeMapCircleZoomAndUpdate, width=fixed(width_widget), height=fixed(height_widget), cx=cx_widget,cy=cy_widget,radius=radius_widget,zoom=zoom_widget)
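
The callback above hands new settings to the frame loops through module-level globals and a changed flag. As a rough alternative sketch, the same handoff can be wrapped in a small lock-protected object. The class name `LensParams` and its methods are hypothetical, and unlike the callback above this version keeps the most recent update rather than ignoring updates while one is still pending.

```python
import threading

class LensParams:
    """Holds the latest lens settings plus a 'changed' flag, guarded by a lock."""

    def __init__(self, cx, cy, radius, zoom):
        self._lock = threading.Lock()
        self._values = (cx, cy, radius, zoom)
        self._changed = False

    def update(self, cx, cy, radius, zoom):
        # Intended to be called from the widget callback.
        with self._lock:
            self._values = (cx, cy, radius, zoom)
            self._changed = True

    def take_if_changed(self):
        # Intended to be called from the frame loop.
        # Returns the new values exactly once per update, otherwise None.
        with self._lock:
            if not self._changed:
                return None
            self._changed = False
            return self._values

params = LensParams(700, 500, 60, 2.0)
first = params.take_if_changed()    # nothing has changed yet
params.update(640, 360, 80, 1.5)
second = params.take_if_changed()   # picks up the update
third = params.take_if_changed()    # flag was cleared by the previous call
```

A frame loop would call `take_if_changed()` once per frame and recompute the maps only when it returns a tuple.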

Run SW remap (~20 seconds)

This runs the remap kernel, which places a "magnifying glass" circle over a section of the image. The circle is small, and its position, size, and zoom can be changed with the sliders above. Each adjustment calls the Python makeMapCircleZoom function, which takes a few seconds to recompute the maps. Leaving the sliders alone gives a more accurate FPS measurement of the remap function on its own.

NOTE: To allow the kernel to be redefined on the fly, the following function calls run as threads. This means the cell status on the left does not tell you whether the cell has finished. Be sure to wait until the FPS information is reported before running other cells. Also note that if you use the widget to change the kernel, the FPS information appears underneath the widget cell rather than under the function cell.
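
Because the cell status is not a reliable indicator, one way to check on a worker is to query the `Thread` object directly. The sketch below uses only the standard `threading` module; `short_job` is a hypothetical stand-in for a frame loop.

```python
import threading
import time

def short_job(results):
    time.sleep(0.05)           # stand-in for processing frames
    results.append("done")

results = []
worker = threading.Thread(target=short_job, args=(results,))
worker.start()

worker.join(timeout=5)         # wait at most 5 seconds for the worker
finished = not worker.is_alive()
print("finished:", finished)
```

In this notebook, calling `join()` without a timeout would block the kernel until the loop ends, and the sliders would probably stop responding in the meantime, so evaluating `t.is_alive()` from a separate cell is likely the gentler check.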

In [None]:
import numpy as np
import cv2

def loop_sw_app():
    global var_changed_g

    map1, map2 = makeMapCircleZoom(width,height,cx,cy,radius,zoom)

    var_changed_g = 0
    numframes = 20

    start = time.time()
    for _ in range(numframes):
        inframe = hdmi_in.readframe()
        outframe = hdmi_out.newframe()
        if var_changed_g == 1:
            map1, map2 = makeMapCircleZoom(width, height, cx_g, cy_g, radius_g, zoom_g)
            var_changed_g = 0
        cv2.remap(inframe, map1, map2, cv2.INTER_LINEAR, dst=outframe)
        inframe.freebuffer()
        hdmi_out.writeframe(outframe)
    end = time.time()
    print("Frames per second:  " + str(numframes / (end - start)))

from threading import Thread

t = Thread(target=loop_sw_app, )
t.start()
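
The timing pattern in the loop (record a start time, process a fixed number of frames, divide by the elapsed time) recurs in every stage of this notebook. A minimal sketch of it as a reusable helper follows; `measure_fps` is a hypothetical name, and the lambda stands in for the per-frame read, remap, and write steps.

```python
import time

def measure_fps(process_frame, num_frames):
    """Call process_frame() num_frames times and return frames per second."""
    start = time.perf_counter()
    for _ in range(num_frames):
        process_frame()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return num_frames / elapsed if elapsed > 0 else float("inf")

calls = []
fps = measure_fps(lambda: calls.append(1), num_frames=20)
print("Frames per second:", fps)
```

Any map recomputation triggered inside `process_frame` is included in the elapsed time, which is why moving a slider lowers the reported figure.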

Run HW remap (~12 seconds, with no parameter changes)

Based on a kernel frequency of 100 MHz, this block should run at ~40 fps. The same holds for the filter2D block. As in the software version, moving the sliders triggers a makeMapCircleZoom call, and that computation is included in the total time, which lowers the final FPS measurement. Letting this cell run to completion on its own gives a more accurate measurement of hardware remap performance.
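
As a back-of-the-envelope check on the ~40 fps figure, the sketch below computes an upper bound on the frame rate of a streaming kernel. Two inputs are assumptions that the notebook does not state: that the kernel consumes one pixel per clock cycle, and that the frame is 1920x1080. The real resolution is whatever `hdmi_in.mode` reports.

```python
def pixel_rate_fps(clock_hz, width, height, pixels_per_clock=1):
    """Upper bound on frames per second for a streaming pixel pipeline."""
    return clock_hz * pixels_per_clock / (width * height)

# Assumed values: 100 MHz clock (from the text), 1080p frame, 1 pixel per clock.
bound = pixel_rate_fps(100e6, 1920, 1080)
print(round(bound, 1))  # about 48.2
```

Under those assumptions the bound is about 48 fps, so a measured rate near 40 fps would be plausible once frame transfer and Python overhead are included.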

In [None]:
def loop_hw_app():
    global var_changed_g

    map1, map2 = makeMapCircleZoom(width,height,cx,cy,radius,zoom)

    bufferMap1 =  mem_manager.cma_alloc(length=width*height,data_type='float',cacheable=0)
    xFmap1 = np.reshape(np.frombuffer(mem_manager.cma_get_buffer(bufferMap1, length=width*height*4),dtype=np.float32),(height,width))
    xFmap1[:] = map1[:]

    bufferMap2 = mem_manager.cma_alloc(length=width*height,data_type='float',cacheable=0)
    xFmap2 = np.reshape(np.frombuffer(mem_manager.cma_get_buffer(bufferMap2, length=width*height*4),dtype=np.float32),(height,width))
    xFmap2[:] = map2[:]

    var_changed_g = 0
    numframes = 500

    start=time.time()
    for _ in range(numframes):
        inframe = hdmi_in.readframe()
        outframe = hdmi_out.newframe()
        if var_changed_g == 1:
            map1, map2 = makeMapCircleZoom(width, height, cx_g, cy_g, radius_g, zoom_g)
            xFmap1[:] = map1[:]
            xFmap2[:] = map2[:]
            var_changed_g = 0
        xv2.remap(inframe, xFmap1, xFmap2, cv2.INTER_LINEAR, dst=outframe)
        inframe.freebuffer()
        hdmi_out.writeframe(outframe)
    end=time.time()

    mem_manager.cma_free(bufferMap1)
    mem_manager.cma_free(bufferMap2)

    print("Frames per second:  " + str(numframes / (end - start)))

from threading import Thread

t = Thread(target=loop_hw_app, )
t.start()

Run HW filter2D + remap (~20 seconds, with no parameter changes)

Running both blocks in series means the effective performance is roughly halved, to ~20 fps.
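
The halving follows from adding the per-frame times of the two stages when they run one after the other. A minimal sketch of that arithmetic, with `serial_fps` as a hypothetical helper name:

```python
def serial_fps(*stage_fps):
    """Throughput when each frame passes through every stage in sequence."""
    time_per_frame = sum(1.0 / f for f in stage_fps)
    return 1.0 / time_per_frame

# Two stages at ~40 fps each take 25 ms + 25 ms per frame.
combined = serial_fps(40.0, 40.0)
print(combined)
```

This model assumes the stages do not overlap in time, which matches the back-to-back `filter2D` and `remap` calls in the loop below.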

In [None]:
def loop_hw2_app():
    global var_changed_g

    # 3x3 Laplacian kernel: responds where intensity changes (edges)
    kernel = np.array([[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]], np.float32)
    map1, map2 = makeMapCircleZoom(width,height,cx,cy,radius,zoom)

    buffer =  mem_manager.cma_alloc(length=width*height,data_type='unsigned char',cacheable=0)
    xFbuf = np.reshape(np.frombuffer(mem_manager.cma_get_buffer(buffer, length=width*height),dtype=np.uint8),(height,width))   

    bufferMap1 =  mem_manager.cma_alloc(length=width*height,data_type='float',cacheable=0)
    xFmap1 = np.reshape(np.frombuffer(mem_manager.cma_get_buffer(bufferMap1, length=width*height*4),dtype=np.float32),(height,width))
    xFmap1[:] = map1[:]

    bufferMap2 = mem_manager.cma_alloc(length=width*height,data_type='float',cacheable=0)
    xFmap2 = np.reshape(np.frombuffer(mem_manager.cma_get_buffer(bufferMap2, length=width*height*4),dtype=np.float32),(height,width))
    xFmap2[:] = map2[:]
   
    var_changed_g = 0
    numframes = 500

    start=time.time()
    for _ in range(numframes):
        inframe = hdmi_in.readframe()
        outframe = hdmi_out.newframe()
        if var_changed_g == 1:
            map1, map2 = makeMapCircleZoom(width, height, cx_g, cy_g, radius_g, zoom_g)
            xFmap1[:] = map1[:]
            xFmap2[:] = map2[:]
            var_changed_g = 0
        xv2.filter2D(inframe, -1, kernel, xFbuf, (-1,-1), 0.0, borderType=cv2.BORDER_CONSTANT)
        xv2.remap(xFbuf, xFmap1, xFmap2, cv2.INTER_LINEAR, dst=outframe)
        inframe.freebuffer()
        hdmi_out.writeframe(outframe)
    end=time.time()

    mem_manager.cma_free(bufferMap1)
    mem_manager.cma_free(bufferMap2)
    mem_manager.cma_free(buffer)

    print("Frames per second:  " + str(numframes / (end - start)))

from threading import Thread

t = Thread(target=loop_hw2_app, )
t.start()
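
The kernel passed to `filter2D` in `loop_hw2_app` is the 4-neighbour Laplacian, which outputs zero in flat regions and a nonzero value where intensity changes. To illustrate what it does to an image, here is a small pure-Python sketch with a zero-valued constant border and clamping to 0..255, mirroring how OpenCV's software `filter2D` saturates 8-bit results. That the hardware core clamps the same way is an assumption. `filter2d_constant` is a hypothetical helper written for this illustration only.

```python
def filter2d_constant(image, kernel, border_value=0):
    """3x3 correlation with a constant border, clamped to the 0..255 range."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    sy, sx = y + ky - 1, x + kx - 1
                    inside = 0 <= sy < h and 0 <= sx < w
                    pixel = image[sy][sx] if inside else border_value
                    acc += kernel[ky][kx] * pixel
            out[y][x] = max(0, min(255, int(round(acc))))
    return out

laplacian = [[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]]

# 5x5 test image: dark on the left, bright on the right (one vertical edge).
image = [[10, 10, 10, 200, 200] for _ in range(5)]
result = filter2d_constant(image, laplacian)
print(result[2])  # middle row: [0, 0, 190, 0, 0]
```

Only the dark pixel next to the edge produces a positive response. The bright side of the edge and the pixels touching the zero border yield negative sums, which the clamp turns into 0.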

Clean up HDMI drivers

In [None]:
hdmi_out.close()
hdmi_in.close()