# **<center><font style="color:rgb(100,109,254)">Module 6: AI Video Director For Automating Multi-Camera Setup</font> </center>**

<center>
    <img src='https://drive.google.com/uc?export=download&id=19tHZtvNS8ot5c9jvsbjPRk2SkI-8_1_Z' width=800> 
</center>
    

## **<font style="color:rgb(134,19,348)"> Module Outline </font>**

The module can be split into the following parts:

- *Lesson 1: Extract Eyes and Nose Keypoints*

- *Lesson 2: Create an AI Director for Automating a Multi-Camera Setup in OpenCV*

- ***Lesson 3:* Utilize the AI Director for Automating a Multi-Camera Setup in OBS** *(This Tutorial)* 


**Please note:** these Jupyter Notebooks are not for sharing. Please read the copyright message in the Code License Agreement section, which is in the last cell of this notebook.
-Taha Anwar

Alright, let's get started.

### **<font style="color:rgb(134,19,348)">Installation & OBS Websocket Setup</font>**

First, we need to install the [WebSocket API for OBS Studio](https://github.com/obsproject/obs-websocket). [OBS Studio](https://obsproject.com) must already be installed on your system. To install the WebSocket plugin, download and run the installer for your operating system from the list below.

- [obs-websocket-4.9.0-macOS.pkg](https://github.com/obsproject/obs-websocket/releases/download/4.9.0/obs-websocket-4.9.0-macOS.pkg) (for Mac)

- [obs-websocket-4.9.0-Windows-Installer.exe](https://github.com/obsproject/obs-websocket/releases/download/4.9.0/obs-websocket-4.9.0-Windows-Installer.exe) (for Windows)

- [obs-websocket_4.9.0-1_amd64.deb](https://github.com/obsproject/obs-websocket/releases/download/4.9.0/obs-websocket_4.9.0-1_amd64.deb) (for  Linux)

The WebSocket plugin can be used to remotely control OBS from a phone or tablet on the same local network, change your stream overlay/graphics based on the current scene, and automate scene switching with a third-party program.

After that, we will install the Python library ([obs-websocket-py](https://github.com/Elektordi/obs-websocket-py)) required to communicate with an obs-WebSocket server.

**Helpful:** [OBS Websocket Python Docs.](https://obs-ws-rc.readthedocs.io/en/latest/)

In [2]:
# Install the required library.
!pip install obs-websocket-py



### **<font style="color:rgb(134,19,348)"> Import the Libraries</font>**

After completing all the installations, we will import the required libraries.

In [7]:
import cv2
import sys
import mediapipe as mp
from collections import deque


from importlib.metadata import version
print(f"Mediapipe version: {version('mediapipe')}, it should be 0.8.9.1")

import logging
logging.basicConfig(level=logging.INFO)
sys.path.append('../')

from obswebsocket import obsws, requests
from previous_lesson import detectFacialLandmarks, getFaceKeypoints, getHeadScore

Mediapipe version: 0.8.9.1, it should be 0.8.9.1


## **<font style="color:rgb(134,19,348)">Connect to the Websocket Server</font>**

Now launch OBS Studio, go to **Tools**, and click **WebSockets Server Settings**. In the dialog box that pops up, change the password to "**`secret`**" and click OK. Do not close the OBS application; just minimize it and then run the cell below.

Now we will utilize the [obs-websocket-py](https://github.com/Elektordi/obs-websocket-py) library to establish a connection with the obs-WebSocket server. 

In [16]:
# Specify the host, port, and the password.
host = "localhost"
port = 4444
password = "secret"

# Connect to the websocket server.
ws = obsws(host, port, password)
ws.connect()

INFO:obswebsocket.core:Connecting...
INFO:obswebsocket.core:Connected!
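
If the connection fails instead of printing `Connected!`, a frequent cause is that OBS is not running or its WebSocket server is not listening on the expected port. The following is a minimal sketch, using only the Python standard library, of a check you could run before `ws.connect()`. The helper name `is_port_open` is hypothetical and not part of obs-websocket-py; it only tests whether something accepts TCP connections on the port, not whether the password is correct.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 4444 is the port used in the connection cell of this notebook.
if not is_port_open("localhost", 4444):
    print("Nothing is listening on localhost:4444. Is OBS running with the WebSocket server enabled?")
```

A check like this makes the failure mode explicit before the library's own connection attempt is made.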


## **<font style="color:rgb(134,19,348)">Initializations</font>**

In this step, we will perform all the required initializations. First, we will initialize a list containing the indexes of the cameras we want to use in this application. Then we will initialize the **`mp.solutions.face_mesh`** class and set up the **`mp.solutions.face_mesh.FaceMesh()`** function with appropriate arguments (once for each webcam), as we did in the previous lesson. We will also create a scene with an `av_capture_input` source for each webcam in OBS Studio using the [obs-websocket-py](https://github.com/Elektordi/obs-websocket-py) library.

In [9]:
# Initialize a list to store the indexes of the cameras.
CAMERAS_INDEXES = [0, 1, 2]

# Initialize the mediapipe face mesh class.
mp_face_mesh = mp.solutions.face_mesh

# Initialize a dictionary to store the facemesh functions for different webcam feeds.
facemesh_functions = {}

# Iterate over the indexes of the cameras.
for index in CAMERAS_INDEXES:
    
    # Make a call to the OBS server through the Websocket and ask to create a new scene.
    ws.call(requests.CreateScene('Scene '+ str(index)))
    
    # Make another call to the OBS server through the Websocket and ask to create a new video capture source.
    # NOTE: Use sourceKind='av_capture_input' (macOS) if 'dshow_input' (Windows) doesn't create actual camera sources.
    ws.call(requests.CreateSource(sourceName='Video Capture Device '+ str(index), sourceKind='av_capture_input', 
                                  sceneName='Scene '+ str(index), sourceSettings=None))
    
    # Setup the face landmarks function for the camera.
    facemesh_functions[index] = mp_face_mesh.FaceMesh(static_image_mode=False, max_num_faces=1, 
                                                      refine_landmarks=True, 
                                                      min_detection_confidence=0.5, min_tracking_confidence=0.3)
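
The `sourceKind` passed to `CreateSource()` depends on the operating system OBS runs on. As a rough sketch, the source kind could be chosen from `platform.system()` instead of being hard-coded. The mapping below reflects the video capture source IDs OBS commonly uses on each platform, but treat it as an assumption to verify against your own OBS version; `capture_source_kind` is a hypothetical helper, not part of obs-websocket-py.

```python
import platform

# Assumed OBS source kind IDs for video capture devices,
# keyed by the value returned by platform.system().
CAPTURE_SOURCE_KINDS = {
    "Darwin": "av_capture_input",  # macOS
    "Windows": "dshow_input",
    "Linux": "v4l2_input",
}


def capture_source_kind(system=None):
    """Return the assumed video capture source kind for the given (or current) platform."""
    system = system or platform.system()
    try:
        return CAPTURE_SOURCE_KINDS[system]
    except KeyError:
        raise ValueError(f"No known video capture source kind for platform {system!r}")


# With a connected `ws` object, the result could be passed on like this:
# ws.call(requests.CreateSource(sourceName='Video Capture Device 0',
#                               sourceKind=capture_source_kind(),
#                               sceneName='Scene 0', sourceSettings=None))
```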

Now we will simply iterate over the sources we have created and display their settings.

In [10]:
# Iterate over the indexes of the cameras.
for index in CAMERAS_INDEXES:
    
    # Make another call to the OBS server through the Websocket and get the settings of the 'Video Capture Device <index>' source.
    print(ws.call(requests.GetSourceSettings(sourceName='Video Capture Device '+ str(index))), end='\n\n')

<GetSourceSettings request ({'sourceName': 'Video Capture Device 0', 'sourceType': None}) called: success ({'sourceName': 'Video Capture Device 0', 'sourceSettings': {}, 'sourceType': 'av_capture_input'})>

<GetSourceSettings request ({'sourceName': 'Video Capture Device 1', 'sourceType': None}) called: success ({'sourceName': 'Video Capture Device 1', 'sourceSettings': {}, 'sourceType': 'av_capture_input'})>

<GetSourceSettings request ({'sourceName': 'Video Capture Device 2', 'sourceType': None}) called: success ({'sourceName': 'Video Capture Device 2', 'sourceSettings': {}, 'sourceType': 'av_capture_input'})>



You may have noticed that the sources we created don't have a camera (device) assigned to them: their `sourceSettings` are empty. To fix this, you have to manually go to OBS Studio, double-click the source in the sources list,

<center>
    <img src='https://drive.google.com/uc?export=download&id=1DHpN_aAZshHaCryPuOmKDH1La18mXhyP' width=600>
</center>
<br>
<br>

and then select a camera (device) from the drop-down menu in the window that pops up and click OK. Repeat this for each source.

<center>
    <img src='https://drive.google.com/uc?export=download&id=15xyd2WQXnDWzunN3VKrLrhIlySK4WBiB' width=600>
</center>


**Note that** you must select the camera for each source according to the indexes assigned to the cameras on your system. For example, for the source `Video Capture Device 0`, you must select the camera that has the index `0`. Now display the settings of the sources again by running the cell below.

In [11]:
# Iterate over the indexes of the cameras.
for index in CAMERAS_INDEXES:
    
    # Make another call to the OBS server through the Websocket and get the settings of the 'Video Capture Device <index>' source.
    print(ws.call(requests.GetSourceSettings(sourceName='Video Capture Device '+ str(index))), end='\n\n')

<GetSourceSettings request ({'sourceName': 'Video Capture Device 0', 'sourceType': None}) called: success ({'sourceName': 'Video Capture Device 0', 'sourceSettings': {'device': 'EAB7A68FEC2B4487AADFD8A91C1CB782', 'device_name': 'FaceTime HD Camera'}, 'sourceType': 'av_capture_input'})>

<GetSourceSettings request ({'sourceName': 'Video Capture Device 1', 'sourceType': None}) called: success ({'sourceName': 'Video Capture Device 1', 'sourceSettings': {'device': '0x11400017ef4831', 'device_name': 'Lenovo FHD Webcam'}, 'sourceType': 'av_capture_input'})>

<GetSourceSettings request ({'sourceName': 'Video Capture Device 2', 'sourceType': None}) called: success ({'sourceName': 'Video Capture Device 2', 'sourceSettings': {'device': '0x11300009da2692', 'device_name': 'A4tech FHD 1080P PC Camera'}, 'sourceType': 'av_capture_input'})>



You can see that the sources now have `device` and `device_name` properties in their `sourceSettings`. You can note these property values and pass them as `sourceSettings` to **`CreateSource()`** next time to automate the camera selection process.
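
As an illustration of that automation, here is a minimal sketch that stores the settings printed above in a dictionary and builds the keyword arguments for `CreateSource()` from it. The device identifiers are copied from the output in this notebook and are specific to the machine that produced it, so you would replace them with your own. `SOURCE_SETTINGS` and `create_source_kwargs` are hypothetical names introduced only for this example.

```python
# Device identifiers copied from the GetSourceSettings output in this notebook.
# They are machine-specific; replace them with the values printed on your system.
SOURCE_SETTINGS = {
    0: {"device": "EAB7A68FEC2B4487AADFD8A91C1CB782", "device_name": "FaceTime HD Camera"},
    1: {"device": "0x11400017ef4831", "device_name": "Lenovo FHD Webcam"},
    2: {"device": "0x11300009da2692", "device_name": "A4tech FHD 1080P PC Camera"},
}


def create_source_kwargs(index, source_kind="av_capture_input"):
    """Build the keyword arguments for requests.CreateSource() for one camera."""
    return {
        "sourceName": f"Video Capture Device {index}",
        "sourceKind": source_kind,
        "sceneName": f"Scene {index}",
        # Falls back to None (no device preselected) for unknown indexes.
        "sourceSettings": SOURCE_SETTINGS.get(index),
    }


# With a connected `ws` object, each source could then be created with:
# ws.call(requests.CreateSource(**create_source_kwargs(index)))
for index in SOURCE_SETTINGS:
    print(create_source_kwargs(index))
```

Keeping the settings in one dictionary keyed by camera index mirrors the naming scheme (`Scene <index>`, `Video Capture Device <index>`) used throughout the notebook.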


Now we will use the **`getHeadScore()`** function created in the previous lesson to get a score for each camera. The camera with the lowest score is treated as the one the person is looking at, and we use this to switch between scenes in OBS Studio in real time.

In [17]:
# Initialize a dictionary to store the VideoCapture objects of different webcams.
cameras_readers = {}
print('Initializing...')

# Iterate over the indexes of the cameras.
for camera_index in CAMERAS_INDEXES:
    
    # Add a VideoCapture object into the dictionary.
    cameras_readers[camera_index] = cv2.VideoCapture(camera_index)

    # Set the webcam feed width and height.
    # cameras_readers[camera_index].set(3,1280)
    # cameras_readers[camera_index].set(4,960)

# Make a call to the OBS server through the Websocket and 
# get the list of scenes in the currently active profile.  
scenes = ws.call(requests.GetSceneList()).getScenes()

# Initialize a variable to store the active scene name.
active_scene_name = scenes[0]['name']
print('Running...', active_scene_name)

# Initialize a buffer to store the scene name with the minimum score.
min_score_scene_buffer = deque([], maxlen=3)

# Wrap the loop in a try block to catch the KeyboardInterrupt exception
# that is raised when the kernel is interrupted.
# This lets the program release its resources and end properly.
try:

    # Iterate until a termination (break) statement is executed.
    while True:

        # Initialize a variable to store the minimum score across all the webcam feeds.
        min_score = 1000
        
        # Initialize a variable to store the scene name with the minimum score.
        min_score_scene = active_scene_name
        
        # Iterate over the VideoCapture objects. 
        for camera_index, camera_reader in cameras_readers.items():

            # Read a frame.
            ok, frame = camera_reader.read()

            # If the frame was not read properly,
            # skip this camera and continue with the next one.
            if not ok:
                print(f'Failed to read Frame from Camera {camera_index}')
                continue

            # Flip the frame horizontally for natural (selfie-view) visualization.
            frame = cv2.flip(frame, 1)

            # Perform Face landmarks detection.
            frame, face_landmarks = detectFacialLandmarks(frame, facemesh_functions[camera_index], 
                                                          draw=False, display=False)
            
            # Check if the Face landmarks in the frame are detected.
            if len(face_landmarks)>0:

                # Get the nose, left eye center, and right eye center landmarks.
                frame, keypoints = getFaceKeypoints(frame, face_landmarks, draw=False, display=False)

                # Calculate the difference between the nose tip and both eyes mid-point.
                score = getHeadScore(keypoints)
                # print(camera_index, score)

                # Check if the calculated score is less than the minimum score.
                if score < min_score:

                    # Update the minimum score and the scene name with the minimum score.
                    min_score = score
                    min_score_scene = 'Scene ' + str(camera_index)
                    
        # Append the scene name with the minimum score for this iteration to the buffer.
        min_score_scene_buffer.append(min_score_scene)
        
        # print(min_score_scene_buffer)
        
        # Check if the scene selected from the buffer is not the active scene.
        if max(min_score_scene_buffer) != active_scene_name:
            
            # Make a call to the OBS server through the Websocket and switch the scene.
            ws.call(requests.SetCurrentScene(max(min_score_scene_buffer)))
            print('switched')
            
            # Update the active scene name.
            active_scene_name = max(min_score_scene_buffer)

# Handle the KeyboardInterrupt exception, if it is raised.
except KeyboardInterrupt:
    pass
        
# Iterate over the VideoCapture objects. 
for camera_reader in cameras_readers.values():
    
    # Release the VideoCapture Object.                  
    camera_reader.release()

# Close the windows and disconnect from the websocket server.
cv2.destroyAllWindows()
ws.disconnect()

Initializing...
Running... Scene
switched
switched
switched
switched
switched
switched
switched


INFO:obswebsocket.core:Disconnecting...


Fascinating, right? Now you can move around in a multi-camera setup without having to manually switch to the camera you are looking at.
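
One detail of the loop is worth a closer look: the scene is chosen with `max(min_score_scene_buffer)`, and on strings `max()` returns the lexicographically largest name among the buffered frames. If the buffer is meant to smooth out single-frame flickers (an assumption about the intent, not something the notebook states), a majority vote over the buffer is one possible alternative. The sketch below compares the two; `most_frequent_scene` is a hypothetical helper.

```python
from collections import Counter, deque


def most_frequent_scene(buffer):
    """Return the scene name that occurs most often in a non-empty `buffer`.

    Ties go to the name that was seen first.
    """
    return Counter(buffer).most_common(1)[0][0]


# Two of the last three frames favored 'Scene 0'.
recent = deque(["Scene 0", "Scene 0", "Scene 2"], maxlen=3)

print(max(recent))                  # Scene 2 (largest by string comparison)
print(most_frequent_scene(recent))  # Scene 0 (majority of recent frames)
```

To try it in the loop, each `max(min_score_scene_buffer)` call would be replaced with `most_frequent_scene(min_score_scene_buffer)`; the buffer is never empty at that point because a scene name is appended in every iteration.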

**The Code Below Is Just for Debugging Purposes**

In [11]:
cameras_readers = {}
CAMERAS_INDEXES = [1, 0]

# Iterate over the indexes of the cameras.
for camera_index in CAMERAS_INDEXES:
    
    # Add a VideoCapture object into the dictionary.
    cameras_readers[camera_index] = cv2.VideoCapture(camera_index)    


In [6]:
# Try to read one frame from each camera and print whether it succeeded.
for camera_index, camera_reader in cameras_readers.items():
    ok, frame = camera_reader.read()
    print(ok)

False
False
False


In [70]:
# Iterate over the VideoCapture objects. 
for camera_reader in cameras_readers.values():
    
    # Release the VideoCapture Object.                  
    camera_reader.release()


### **<font style="color:rgb(255,140,0)"> Code License Agreement </font>**
```
Copyright (c) 2022 Bleedai.com

Feel free to use this code for your own projects commercial or noncommercial, these projects can be Research-based, just for fun, for-profit, or even Education with the exception that you’re not going to use it for developing a course, book, guide, or any other educational products.

Under *NO CONDITION OR CIRCUMSTANCE* you may use this code for your own paid educational or self-promotional ventures without written consent from Taha Anwar (BleedAI.com).

```

