# AI VISION WORKSHOP
Our first step, as in most Python programs, is to import the external Python modules that we will be using. Note that this step assumes your Python virtual environment is activated and that all of these packages have already been installed in it.

In [1]:
import os
import cv2
from PIL import Image
import webcolors
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential
from azure.storage.blob import BlobServiceClient
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials
import serial
import time

Next we need to set the credentials that we'll use to connect to the Azure AI services and get responses to our API calls. Note that these credentials are unique to each individual user. It is best practice to store them in a separate local file and then import them into your code. That local file is excluded from version control so that it is never uploaded to or shared through services such as GitHub.

Be sure to log in to your Azure account (portal.azure.com), create a 'Computer Vision' resource and a storage blob (details in workshop lab instructions), and note the unique endpoint, security key, and storage blob connection string.  Those values should be saved into a local file named config.py which we'll import below...
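
As a rough sketch, a config.py along the following lines would satisfy the code in this notebook. The three variable names are the ones the later cells read as config.az_endpoint, config.key, and config.storage_connection_string. Every value shown is a placeholder (an assumption for illustration, not a real credential) and must be replaced with the values from your own Azure portal.

```python
# config.py: keep this file out of version control.
# All values below are placeholders; substitute the ones shown in your Azure portal.

# Endpoint of your 'Computer Vision' resource
az_endpoint = "https://<your-resource-name>.cognitiveservices.azure.com/"

# One of the security keys of the same resource
key = "<your-computer-vision-key>"

# Connection string of the storage account that holds the blob container
storage_connection_string = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=<your-storage-account>;"
    "AccountKey=<your-storage-key>;"
    "EndpointSuffix=core.windows.net"
)
```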

In [2]:
# Import the credentials that we will use to connect to the various Azure services (stored in a separate local file outside of git)
import config

Now we'll set some static global variables that will be used later in the program. Note that you might need to change the COM port to match your system. To find the right port, plug in your Arduino board, open the Windows Device Manager, and look under 'Ports (COM & LPT)' to see which port has been configured as the USB-SERIAL connection.
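
If you would rather not open Device Manager, pyserial itself can list the ports it sees. The sketch below is an optional alternative and not part of the original workshop code. The helper name list_serial_ports is hypothetical, and the descriptions it prints come from your driver, so the Arduino may not be labelled exactly as it is in Device Manager.

```python
def list_serial_ports():
    """Return (device, description) pairs for the serial ports pyserial can see."""
    try:
        # list_ports ships with pyserial, the package that provides 'import serial'
        from serial.tools import list_ports
    except ImportError:
        print("pyserial is not installed in this environment")
        return []
    return [(port.device, port.description) for port in list_ports.comports()]


# Print one line per port, e.g. the device name followed by its description
for device, description in list_serial_ports():
    print(f"{device}: {description}")
```

Whichever device name matches your board is the value to assign to com_port.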

In [3]:
# Static global variables are defined here...
img_file = '.\\WEBCAM-IMAGES\\sort-object.jpg'  # The name (and file path) of the image file which will be captured and analyzed
com_port = 'COM6'  # COM port for Arduino board connection (USB)

Next we'll create the basic functions of our code that depend on your local computer:

1. Capturing an image from your webcam
2. Opening an image for display on the screen
3. Uploading an image into a storage blob container in Azure
4. Moving the servo motor

In [4]:
def get_image_frame_from_webcam():
    """Shows the webcam feed until 'c' is pressed, then saves a single frame to img_file (uses the global 'cap' object)"""
    print("Press c to capture the frame...")

    # Loop to continuously get frames from the webcam until the user presses 'c'
    while True:
        # Read a frame from the webcam
        ret, frame = cap.read()

        # If the frame was read correctly, show it
        if ret:
            cv2.imshow("Webcam Feed", frame)

        # Break the loop when the 'c' key is pressed
        if cv2.waitKey(1) & 0xFF == ord('c'):
            break

    # Capture a frame
    ret, frame = cap.read()

    # Save the captured image (cv2.imwrite returns False if the file could not be written)
    if ret and cv2.imwrite(img_file, frame):  # This overwrites the existing jpeg file named in the static global variable above
        print("\nWebcam image captured...\n")
    else:
        print("\nError capturing image\n")

    #Release the capture object and destroy all windows
    cap.release()
    cv2.destroyAllWindows()



def open_image_in_new_window(image_path):
    """Opens an image file in a new window using Pillow."""
    try:
        img = Image.open(image_path)
        img.show()
    except FileNotFoundError:
        print(f"Error: Image file not found at '{image_path}'")
    except Exception as e:
        print(f"An error occurred: {e}")



def upload_blob(img_file, container_name, blob_name):
    """Uploads a file to an Azure blob storage container."""
    # Create a BlobServiceClient
    blob_service_client = BlobServiceClient.from_connection_string(config.storage_connection_string)

    # Get a client to interact with the specified container
    container_client = blob_service_client.get_container_client(container_name)

    # Create the container if it does not exist
    try:
        container_client.create_container()
    except Exception as e:
        #print(f"Error detail: {e}")  # Uncomment this if more detail on the error message is needed
        print("Azure blob container already exists or could not be created")

    # Upload the file
    with open(img_file, "rb") as data:
        container_client.upload_blob(name=blob_name, data=data, overwrite=True)

    # Construct and return the URL of the uploaded file
    blob_url = f"https://{blob_service_client.account_name}.blob.core.windows.net/{container_name}/{blob_name}"
    return blob_url



def move_servo(angle):
    """Moves the servo motor connected to the Arduino to the specified angle."""
    try:
        arduino = serial.Serial(com_port, 9600, timeout=1)
        time.sleep(2)   # Give time for the serial port to initialize
        arduino.write(bytes([angle]))
        time.sleep(1)  # Give time for the servo to move

        arduino.close()  # Close the serial port connection
        
    except serial.SerialException as e:
        print(f"Error communicating with Arduino: {e}")


Now we'll use these functions to open the webcam and capture an image. Note that the camera index might differ on your system. Also, the folder in the 'img_file' path defined above has to already exist. If you get errors when trying to write the image, it's likely a path issue (check what your current working directory is when running this notebook/cell).
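
One way to rule out the path issue is to create the target folder before capturing. The following is a minimal sketch using only the standard library. The helper name ensure_parent_folder is hypothetical and is not part of the original workshop code.

```python
from pathlib import Path


def ensure_parent_folder(file_path):
    """Create the folder that will contain file_path if it is missing, and return it."""
    folder = Path(file_path).resolve().parent
    folder.mkdir(parents=True, exist_ok=True)
    return folder


# Relative paths are resolved against this directory
print("Current working directory:", Path.cwd())
```

Calling ensure_parent_folder(img_file) before the capture cell below would create the WEBCAM-IMAGES folder if needed. On Windows the backslash path stored in img_file works unchanged; on other systems backslashes are not treated as path separators, so the path would need forward slashes.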

In [5]:
# Initialize the webcam
cap = cv2.VideoCapture(0)  # Note we need to use the correct index of the usb attached camera 

# Check if the camera opened successfully
if not cap.isOpened():
    raise IOError("Cannot open webcam - make sure it is attached")

# Create a window named "Webcam Feed" to display what the webcam is seeing
cv2.namedWindow("Webcam Feed")

# Call the capture frame function here
get_image_frame_from_webcam()

# Call the function to open the captured webcam image in a new window on the screen
open_image_in_new_window(img_file)

Press c to capture the frame...

Webcam image captured...



Now we'll upload the image file into an Azure storage blob so that we can use it as input (via its public URL) to various Azure Vision API calls.

In [6]:
# Now write the image into the azure blob storage container so that we can run various Azure AI tools against it...
container_name = "ai-vision-test"  # This is the storage container created in the Azure environment
blob_name = "sort-object"  # The name to assign to the uploaded file

# Upload the file and print the URL to the screen
uploaded_file_url = upload_blob(img_file, container_name, blob_name)
print("\nUploaded file URL in Azure is: ", uploaded_file_url)

Azure blob container already exists or could not be created

Uploaded file URL in Azure is:  https://aistoragewkshp.blob.core.windows.net/ai-vision-test/sort-object
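
The URL printed above is assembled inside upload_blob by plain string formatting. The sketch below shows the same construction as a standalone function, with one addition that is not in the original code: the blob name is percent-encoded, so names containing spaces still yield a valid URL. The helper name build_blob_url is hypothetical, and like the original it assumes the default blob.core.windows.net endpoint.

```python
from urllib.parse import quote


def build_blob_url(account_name, container_name, blob_name):
    """Build the default-endpoint URL for a blob, percent-encoding the blob name."""
    # '/' is left as-is because blob names may contain virtual folder separators
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container_name}/{quote(blob_name, safe='/')}"
    )


# Reproduces the URL shown in the cell output above
print(build_blob_url("aistoragewkshp", "ai-vision-test", "sort-object"))
```

Building the URL does not make the blob readable: the Vision service can only fetch it if the URL is publicly accessible, as noted above.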


Now we use the Azure AI Vision APIs to analyze the image...

In [None]:
# Analyze the image...

# Create an Image Analysis client to talk to the Azure Vision API service
ia_client = ImageAnalysisClient(
    endpoint=config.az_endpoint,
    credential=AzureKeyCredential(config.key)
)

# Analyze the image.
result = ia_client.analyze_from_url(
    image_url=uploaded_file_url,
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.READ,VisualFeatures.TAGS,VisualFeatures.OBJECTS,VisualFeatures.PEOPLE],
    gender_neutral_caption=True,  # Optional (default is False)
)

# Print generated caption results to the console
print("\nImage Analysis result returned this caption: ")
if result.caption is not None:
    print(f"   '{result.caption.text.capitalize()}'")

# Print text (OCR) analysis results to the console
print("\nText found in the image (if any): ")
if result.read is not None:
    for block in result.read.blocks:
        for line in block.lines:
            print(f"   '{line.text}'")

# Print TAG analysis results to the console
print("\nTags for this image (if any): ")
if result.tags is not None:
    for tag in result.tags.list:
        print(f"   '{tag.name}', Confidence {tag.confidence:.4f}")

# Print OBJECT analysis results to the console
print("\nObjects found in this image (if any): ")
if result.objects is not None:
    for detected_object in result.objects.list:  # Renamed from 'object' to avoid shadowing the Python built-in
        print(f"   '{detected_object.tags[0].name}', {detected_object.bounding_box}, Confidence: {detected_object.tags[0].confidence:.4f}")

# Print PEOPLE analysis results to the console
print("\nPeople found in this image (if any): ")
if result.people is not None:
    for person in result.people.list:
        print(f"   {person.bounding_box}, Confidence {person.confidence:.4f}")


Next is the code for moving the servo. Note that you can edit the angle value to test this.

In [10]:
angle = 90  # Example angle for the servo - play around with changing it and rerunning this code block
move_servo(angle)
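
Because move_servo sends the angle as a single byte with bytes([angle]), the value has to be an integer from 0 to 255; anything else raises an error before it reaches the Arduino. The sketch below clamps the angle first. It assumes the servo accepts 0 to 180 degrees, which is typical for hobby servos but is not stated in the workshop, and the helper name angle_to_byte is hypothetical.

```python
def angle_to_byte(angle, min_angle=0, max_angle=180):
    """Clamp angle to [min_angle, max_angle] and return it as a single byte."""
    # Round first so that float inputs such as 45.4 are accepted
    clamped = max(min_angle, min(max_angle, int(round(angle))))
    return bytes([clamped])


print(angle_to_byte(90))    # b'Z', i.e. the single byte with value 90
print(angle_to_byte(270))   # clamped to 180
print(angle_to_byte(-15))   # clamped to 0
```

Inside move_servo, arduino.write(angle_to_byte(angle)) could then replace arduino.write(bytes([angle])).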

Finally - tie the two together and move the servo based on some output from the Azure Vision API calls.
Here is an example where the servo rotates right/left based on whether the image contains a phone or not:

In [9]:
if result.caption is not None and "phone" in result.caption.text:
    print("\nThere is a phone in the image so rotate the servo to 180 degrees...")
    move_servo(180)
else:
    print("\nNo phone detected in the image so rotate the servo to 0 degrees...")
    move_servo(0)


There is a phone in the image so rotate the servo to 180 degrees...
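
Matching a word in the caption works for a demo, but the caption wording can vary from image to image. As one possible alternative, the decision can be based on the tags and their confidence scores instead. The sketch below is an illustration under stated assumptions: the function name choose_servo_angle, the 0.8 threshold, and the sample tag data are all made up and are not real API output.

```python
def choose_servo_angle(tags, keyword="phone", min_confidence=0.8):
    """Return 180 if any tag containing keyword meets min_confidence, else 0.

    tags is an iterable of (name, confidence) pairs.
    """
    for name, confidence in tags:
        if keyword in name.lower() and confidence >= min_confidence:
            return 180
    return 0


# Made-up tag data for illustration only
sample_tags = [("indoor", 0.97), ("mobile phone", 0.91), ("table", 0.55)]
print(choose_servo_angle(sample_tags))                   # 180
print(choose_servo_angle(sample_tags, keyword="table"))  # 0, confidence is below the threshold
```

With a real analysis result, the pairs could be built as [(tag.name, tag.confidence) for tag in result.tags.list] and the returned angle passed to move_servo.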
