<a href="https://colab.research.google.com/github/kjxlstad/gestures/blob/main/gestures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Setting up CUDA-compatible OpenCV**

To use GPU processing with OpenCV, we have to build the source code ourselves with custom options. The built version gets installed, but the installation does not persist into future runtimes. We therefore need to store the built OpenCV library externally and fetch it again later. Colab is well suited for this thanks to its Drive integration.

The following cells are defined below:

1.  Cloning the repos, then compiling and installing OpenCV
2.  Copying the installation to Drive
3.  Installing the compiled version from Drive

**1. Cloning the repos, then compiling and installing OpenCV**

In [None]:
%cd /content
!git clone https://github.com/opencv/opencv
!git clone https://github.com/opencv/opencv_contrib
!mkdir /content/build
%cd /content/build

!cmake -DOPENCV_EXTRA_MODULES_PATH=/content/opencv_contrib/modules -DBUILD_SHARED_LIBS=OFF -DBUILD_TESTS=OFF -DBUILD_PERF_TESTS=OFF -DBUILD_EXAMPLES=OFF -DWITH_OPENEXR=OFF -DWITH_CUDA=ON -DWITH_CUBLAS=ON -DWITH_CUDNN=ON -DOPENCV_DNN_CUDA=ON /content/opencv
!make -j8 install
import cv2
cv2.__version__

/content
fatal: destination path 'opencv' already exists and is not an empty directory.
fatal: destination path 'opencv_contrib' already exists and is not an empty directory.
mkdir: cannot create directory ‘/content/build’: File exists
/content/build
-- The CXX compiler identification is GNU 7.5.0
-- The C compiler identification is GNU 7.5.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Detected processor: x86_64
-- Found PythonInterp: /usr/bin/python2.7 (found suitable version "2.7.17", minimum required is "2.7") 
-- Found PythonLibs: 

'4.1.2'

**2. Copying the installation to Drive**

In [None]:
!mkdir  "/content/drive/My Drive/cv2_cuda"
!cp  /content/build/lib/python3/cv2.cpython-36m-x86_64-linux-gnu.so   "/content/drive/My Drive/cv2_cuda"

**3. Installing the compiled version from Drive**

In [2]:
!cp "/content/drive/My Drive/cv2_cuda/cv2.cpython-36m-x86_64-linux-gnu.so" .
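
Steps 2 and 3 amount to a small file cache. A minimal sketch of the restore step in plain Python follows; the function name `restore_cached_module` and the `cv2*.so` pattern are assumptions made for illustration and are not part of the notebook.

```python
import shutil
from pathlib import Path


def restore_cached_module(cache_dir, target_dir, pattern="cv2*.so"):
    """Copy cached build artifacts matching `pattern` into `target_dir`.

    Returns the list of copied paths. An empty list means nothing was cached,
    so the library has to be built first (step 1).
    """
    cache_dir, target_dir = Path(cache_dir), Path(target_dir)
    if not cache_dir.is_dir():
        return []
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(cache_dir.glob(pattern)):
        dst = target_dir / src.name
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copied.append(dst)
    return copied
```

In Colab this could be called as `restore_cached_module("/content/drive/My Drive/cv2_cuda", ".")`, assuming Drive is already mounted at `/content/drive`. An empty result signals that the build cell has to run first.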

## Fetching image data from the webcam
Since we want access to Colab's GPU, the code runs on their machine, so we need a small *hack* to reach our own camera. This can be done fairly simply with a small piece of JavaScript that regularly sends image data to Python, using a data type both sides understand: a Base64-encoded data URL.

Two functions are defined below: ***start_input*** and ***take_photo***

A call to ***start_input*** loads the JavaScript into the page. On the first call to ***take_photo*** the script opens the webcam (*note: make sure the camera is allowed for the site*) and creates a canvas with the accompanying DOM elements for displaying the images. ***take_photo*** returns a JavaScript object containing the image data to be processed.
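
The data exchanged between JavaScript and Python is a Base64 data URL, the form that `canvas.toDataURL()` produces. As a rough sketch of that format, the two helpers below (hypothetical names, not used elsewhere in the notebook) wrap raw bytes in a data URL and unwrap them again.

```python
import base64


def to_data_url(payload: bytes, mime: str = "image/jpeg") -> str:
    """Wrap raw bytes in a Base64 data URL."""
    encoded = base64.b64encode(payload).decode("ascii")
    return "data:{};base64,{}".format(mime, encoded)


def from_data_url(url: str) -> bytes:
    """Recover the raw bytes from a Base64 data URL."""
    header, _, body = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a Base64 data URL")
    return base64.b64decode(body)
```

Everything before the first comma is the header (MIME type and encoding) and everything after it is the payload, which is why splitting on the comma is enough to reach the image bytes.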

In [3]:
import base64
import html
import io
import time

from IPython.display import display, Javascript
from google.colab.output import eval_js
import numpy as np
from PIL import Image
import cv2

def start_input():
  js = Javascript('''
    var video;
    var div = null;
    var stream;
    var captureCanvas;
    var imgElement;
    var labelElement;
    
    var pendingResolve = null;
    var shutdown = false;
    
    function removeDom() {
       stream.getVideoTracks()[0].stop();
       video.remove();
       div.remove();
       video = null;
       div = null;
       stream = null;
       imgElement = null;
       captureCanvas = null;
       labelElement = null;
    }
    
    function onAnimationFrame() {
      if (!shutdown) {
        window.requestAnimationFrame(onAnimationFrame);
      }
      if (pendingResolve) {
        var result = "";
        if (!shutdown) {
          captureCanvas.getContext('2d').drawImage(video, 0, 0, 640, 640);
          result = captureCanvas.toDataURL('image/jpeg', 0.8)
        }
        var lp = pendingResolve;
        pendingResolve = null;
        lp(result);
      }
    }
    
    async function createDom() {
      if (div !== null) {
        return stream;
      }
      div = document.createElement('div');
      div.style.background = '#363537'
      div.style.padding = '3px';
      div.style.width = '100%';
      div.style.maxWidth = '600px';
      document.body.appendChild(div);
      
      const modelOut = document.createElement('div');
      modelOut.innerHTML = '<span style="color: #ED7D3A;">Status:</span>';
      labelElement = document.createElement('span');
      labelElement.innerText = 'No data';
      labelElement.style.color = '#ED7D3A'
      labelElement.style.fontWeight = 'bold';
      modelOut.appendChild(labelElement);
      div.appendChild(modelOut);
           
      video = document.createElement('video');
      video.style.display = 'block';
      video.width = div.clientWidth - 6;
      video.setAttribute('playsinline', '');
      video.onclick = () => { shutdown = true; };
      stream = await navigator.mediaDevices.getUserMedia(
          {video: { facingMode: "environment"}});
      div.appendChild(video);
      imgElement = document.createElement('img');
      imgElement.style.position = 'absolute';
      imgElement.style.zIndex = 1;
      imgElement.onclick = () => { shutdown = true; };
      div.appendChild(imgElement);
      
      const instruction = document.createElement('div');
      instruction.innerHTML = 
          '<span style="color: #2FBF71; font-weight: bold;">' +
          'Click here or on the video to stop</span>';
      div.appendChild(instruction);
      instruction.onclick = () => { shutdown = true; };
      
      video.srcObject = stream;
      await video.play();
      captureCanvas = document.createElement('canvas');
      captureCanvas.width = 640; //video.videoWidth;
      captureCanvas.height = 640; //video.videoHeight;
      window.requestAnimationFrame(onAnimationFrame);
      
      return stream;
    }
    async function takePhoto(label, imgData) {
      if (shutdown) {
        removeDom();
        shutdown = false;
        return '';
      }
      var preCreate = Date.now();
      stream = await createDom();
      
      var preShow = Date.now();
      if (label != "") {
        labelElement.innerHTML = label;
      }
            
      if (imgData != "") {
        var videoRect = video.getClientRects()[0];
        imgElement.style.top = videoRect.top + "px";
        imgElement.style.left = videoRect.left + "px";
        imgElement.style.width = videoRect.width + "px";
        imgElement.style.height = videoRect.height + "px";
        imgElement.src = imgData;
      }
      
      var preCapture = Date.now();
      var result = await new Promise(function(resolve, reject) { 
        pendingResolve = resolve;
      });
      shutdown = false;
      
      return {'create': preShow - preCreate, 
              'show': preCapture - preShow, 
              'capture': Date.now() - preCapture,
              'img': result};
    }
    ''')

  display(js)
  
def take_photo(label, img_data):
  data = eval_js('takePhoto("{}", "{}")'.format(label, img_data))
  return data
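
`take_photo` builds the JavaScript call with plain string formatting, so a label containing a double quote or a backslash would produce invalid JavaScript. One possible hardening, shown as a sketch with the hypothetical helper name `build_take_photo_call`, is to let `json.dumps` do the quoting and escaping.

```python
import json


def build_take_photo_call(label: str, img_data: str) -> str:
    """Build the JavaScript call; json.dumps quotes and escapes both arguments."""
    return "takePhoto({}, {})".format(json.dumps(label), json.dumps(img_data))
```

The resulting string could then be passed to `eval_js` in place of the hand-formatted one. For labels without special characters both variants produce the same call.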

The decoder takes a JavaScript object containing the image data as its parameter. It returns a 640x640x3 NumPy array representing the image with 3 color channels.

In [4]:
def decode(js_reply):
    """
    input: 
          js_reply: JavaScript object, contains the image from the webcam as a JPEG data URL
    output: 
          image_array: image array RGB size 640 x 640 from webcam
    """
    jpeg_bytes = base64.b64decode(js_reply['img'].split(',')[1])
    image_PIL = Image.open(io.BytesIO(jpeg_bytes))
    image_array = np.array(image_PIL)

    return image_array

The encoder takes a 4-channel image as a 640x640x4 array. It returns a Base64-encoded string that the JavaScript code can read.

In [5]:
def encode(overlay):
    """
    input: 
          overlay: image RGBA size 640 x 640 
                              contains the drawn annotations, 
                              channel A value = 255 if the pixel contains drawing properties (lines, text) 
                              else channel A value = 0
    output: 
          drawing_b64: string, PNG data URL encoded from overlay
    """

    drawing_PIL = Image.fromarray((overlay), 'RGBA')
    iobuf = io.BytesIO()
    drawing_PIL.save(iobuf, format='png')
    drawing_bytes = 'data:image/png;base64,{}'.format((str(base64.b64encode(iobuf.getvalue()), 'utf-8')))
    return drawing_bytes
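
The alpha convention in the docstring (255 where something was drawn, 0 elsewhere) is what lets the browser show the PNG on top of the video without hiding it. The blend itself happens in the browser, so the NumPy function below is only an illustration of the arithmetic; the name `composite` is an assumption.

```python
import numpy as np


def composite(frame_rgb, overlay_rgba):
    """Alpha-blend an RGBA overlay onto an RGB frame of the same height and width."""
    # Scale alpha to [0, 1] and keep a trailing axis so it broadcasts over RGB.
    alpha = overlay_rgba[..., 3:4].astype(np.float32) / 255.0
    blended = alpha * overlay_rgba[..., :3] + (1.0 - alpha) * frame_rgb
    return blended.astype(np.uint8)
```

With alpha restricted to 0 and 255, each output pixel is either the untouched frame pixel or the overlay pixel.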

## Computing the overlay
This cell contains the function definitions involved in computing the overlay.

In [6]:
def get_overlay(frame): 
    """
    input: 
          frame: image array RGB size 640 x 640 from webcam
    output: 
          overlay: image RGBA size 640 x 640 showing only the rectangle and keypoint markers, 
                              channel A value = 255 if the pixel contains drawing properties (lines, circles) 
                              else channel A value = 0
    """
    global threshold, inWidth, inHeight, frameWidth, frameHeight, protoFile, weightsFile, nPoints
    
    care = 320
    lower = (640 // 2) - (care // 2)
    upper = (640 // 2) + (care // 2)

    empty = np.zeros((640, 640, 4), dtype=np.uint8)

    cv2.rectangle(empty, (lower, lower), (upper, upper), (227, 181, 5, 255), thickness = 3)
    roi = frame[lower:upper,lower:upper]
    overlay = np.dstack((roi, np.zeros((care, care), dtype=np.uint8)))

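    # Load the network once and reuse it; reading the model from Drive on every frame is slow.
    if not hasattr(get_overlay, "net"):
        get_overlay.net = cv2.dnn.readNetFromCaffe(protoFile, weightsFile)
        get_overlay.net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        get_overlay.net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    net = get_overlay.net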
    
    inpBlob = cv2.dnn.blobFromImage(roi, 1.0 / 255, (care, care), (0, 0, 0), swapRB=True, crop=False)
    net.setInput(inpBlob)
    output = net.forward()
    
    points = []
    
    for i in range(nPoints):
        probMap = output[0, i, :, :]
        probMap = cv2.resize(probMap, (care, care))
        minVal, prob, minLoc, point = cv2.minMaxLoc(probMap)

        if prob > threshold:
            cv2.circle(overlay, (int(point[0]), int(point[1])), 8, (219, 80, 74, 255), thickness=-1, lineType=cv2.FILLED)
            #cv2.putText(overlay, "{}".format(i), (int(point[0]), int(point[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255, 255), 1, lineType=cv2.LINE_AA)
            points.append((int(point[0]), int(point[1])))
        else:
            points.append(None)

    """
    for pair in POSE_PAIRS:
        partA = pair[0]
        partB = pair[1]

        if points[partA] and points[partB]:
            cv2.line(overlay, points[partA], points[partB], (0, 97, 83, 204), 2, lineType=cv2.LINE_AA)
            cv2.circle(overlay, points[partA], 5, (54, 53, 55, 255), thickness=-1, lineType=cv2.FILLED)
            cv2.circle(overlay, points[partB], 5, (54, 53, 55, 255), thickness=-1, lineType=cv2.FILLED)
    """

    empty[lower:upper, lower:upper] = overlay
    return empty
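
The keypoint loop in `get_overlay` takes the strongest response in each probability map and keeps it only if it exceeds `threshold`. The sketch below reproduces that step with NumPy alone on synthetic maps. It leaves out the resize to the ROI size, and the function name `peak_keypoints` is hypothetical.

```python
import numpy as np


def peak_keypoints(prob_maps, threshold):
    """Return one (x, y) peak per probability map, or None where the peak is too weak.

    prob_maps: array of shape (n_points, height, width).
    """
    points = []
    for prob_map in prob_maps:
        row, col = np.unravel_index(np.argmax(prob_map), prob_map.shape)
        if prob_map[row, col] > threshold:
            points.append((int(col), int(row)))  # image convention: x = column, y = row
        else:
            points.append(None)
    return points
```

As in the notebook, the coordinates are relative to the region of interest. To place a point in the full 640x640 frame, the ROI offset (`lower`) has to be added to both coordinates.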

## Driver code for drawing
This displays the canvas defined in the JavaScript code and places the overlay on top of it.

In [8]:
import numpy as np
import cv2
from google.colab.patches import cv2_imshow
start_input()

label_html = 'Capturing...'
img_data = ''

threshold = 0.2
inWidth = 640
inHeight = 640
frameWidth = 640
frameHeight = 640
protoFile = "/content/drive/My Drive/cv2_cuda/pose_deploy.prototxt"
weightsFile = "/content/drive/My Drive/cv2_cuda/pose_iter_102000.caffemodel"
nPoints = 22

POSE_PAIRS = [ [0,1],[1,2],[2,3],[3,4],[0,5],[5,6],[6,7],[7,8],[0,9],[9,10],[10,11],[11,12],[0,13],[13,14],[14,15],[15,16],[0,17],[17,18],[18,19],[19,20] ]

while True:
    js_reply = take_photo(label_html, img_data)
    capture_end = time.time()
    
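    # Stop on an empty reply, or on an empty image (the user clicked while a capture was pending).
    if not js_reply or not js_reply['img']:
        break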

    webcam_frame = decode(js_reply)
    overlay = get_overlay(webcam_frame)
    drawing_bytes = encode(overlay)
    img_data = drawing_bytes

<IPython.core.display.Javascript object>
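
Each reply from `takePhoto` also carries three durations in milliseconds (`create`, `show` and `capture`) that the loop above does not use. If the replies were collected in a list, they could be averaged roughly as follows. This is a sketch under that assumption, and `summarize_timings` is a hypothetical helper.

```python
from statistics import mean


def summarize_timings(replies):
    """Average the 'create', 'show' and 'capture' durations (ms) over a list of replies."""
    keys = ("create", "show", "capture")
    if not replies:
        return {key: 0.0 for key in keys}
    return {key: mean(reply[key] for reply in replies) for key in keys}
```

Comparing the averages gives a first indication of whether DOM setup, overlay display or frame capture dominates the browser-side time per frame.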