# OCI Vision Video Analysis (Beginner Notebook)

### What this file does:
Analyze video with Oracle Cloud Vision's pre-trained models for labels, objects, text, and faces, using the official Python SDK. This walkthrough notebook is adapted from the logic in `vision/oci_vision_video.py`.

**Documentation to reference:**
- OCI Vision Video Analysis: https://docs.oracle.com/en-us/iaas/Content/vision/using/stored_video_analysis.htm#pretrained_image_analysis_video
- OCI Python SDK: https://github.com/oracle/oci-python-sdk/tree/master/src/oci/ai_vision

**Relevant Slack channels:**
- #oci_ai_vision_support: *for OCI Vision API questions*
- #igiu-innovation-lab: *general discussions on your project*
- #igiu-ai-learning: *help with sandbox environment or help with running this code*

**Env setup:**
- sandbox.yaml: Contains OCI config, compartment, and bucket details.
- .env: Load environment variables if needed.
- Make the Jupyter working directory match your workspace root so that relative paths such as `sandbox.yaml` resolve:
    - In VS Code, open Settings > Extensions > Jupyter > Notebook File Root.
    - Change the value from `${fileDirname}` to `${workspaceFolder}`.


**How to run in notebook:**
- Make sure your runtime environment has all dependencies and access to required config files.
- Run the notebook cells in order.

---

## Step 1: Setup and Requirements

**Key Concepts:**
- **Environment Setup:** Before interacting with OCI services, you need to configure your environment with credentials, compartment IDs, and bucket details. This is typically done via a YAML config file (sandbox.yaml) and environment variables.
- **Dependencies:** Install necessary libraries like the OCI SDK and configuration loaders.
- **Libraries:** Use packages for config management, environment variables, and OCI interactions.
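
Before running the imports, it can help to confirm that the dependencies are actually installed. The following is a minimal sketch, not part of the original script: `REQUIRED_MODULES` and `find_missing_modules` are hypothetical names, and the pip package names are the ones that normally provide the `oci`, `dotenv`, and `envyaml` imports used in this notebook.

```python
import importlib.util

# Import names used by this notebook, mapped to the pip package that provides them.
# (Hypothetical helper table; adjust it if your environment uses other packages.)
REQUIRED_MODULES = {
    "oci": "oci",
    "dotenv": "python-dotenv",
    "envyaml": "envyaml",
}

def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the pip package names whose import name cannot be found."""
    missing = []
    for import_name, pip_name in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(import_name) is None:
            missing.append(pip_name)
    return missing

missing = find_missing_modules()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

Running this check first turns a later `ImportError` into a single actionable message.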

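In this step, we'll import the required libraries and load environment variables. Install the dependencies (`oci`, `python-dotenv`, and `envyaml`) in your environment before running the cell.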

In [None]:
# Import necessary libraries
import os
from dotenv import load_dotenv
from envyaml import EnvYAML
from pathlib import Path
import json
import time
import oci
from oci.object_storage import ObjectStorageClient

# Load environment variables
load_dotenv()

print("Libraries imported and environment loaded.")

## Step 2: Load OCI/Sandbox Configuration

**Key Concepts:**
- **Configuration Loading:** Securely load settings from a config file to authenticate and specify resources without hardcoding credentials.
- **OCI Config:** Includes paths to config files, profiles, compartments, and bucket details.
- **Error Handling:** Validate that the config is loaded correctly to avoid runtime errors.
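
The validation idea can be taken one step further by checking every setting the notebook reads, not just the two top-level sections. The sketch below assumes the config behaves like a nested dict with the `oci` and `bucket` keys used in this notebook; `REQUIRED_KEYS` and `find_missing_keys` are hypothetical names, and the example values are placeholders.

```python
# Settings this notebook reads from sandbox.yaml, grouped by section.
REQUIRED_KEYS = {
    "oci": ["configFile", "profile", "compartment"],
    "bucket": ["namespace", "bucketName", "prefix"],
}

def find_missing_keys(cfg, required=REQUIRED_KEYS):
    """Return dotted names of required settings that are absent or empty."""
    missing = []
    for section, keys in required.items():
        if section not in cfg or not isinstance(cfg[section], dict):
            # The whole section is missing or is not a mapping.
            missing.append(section)
            continue
        section_cfg = cfg[section]
        for key in keys:
            if key not in section_cfg or section_cfg[key] in (None, ""):
                missing.append(f"{section}.{key}")
    return missing

# Placeholder config for illustration only.
example_cfg = {
    "oci": {"configFile": "~/.oci/config", "profile": "DEFAULT"},
    "bucket": {"namespace": "my-namespace", "bucketName": "my-bucket", "prefix": ""},
}
print(find_missing_keys(example_cfg))  # ['oci.compartment', 'bucket.prefix']
```

Reporting every missing key at once avoids fixing the config one `KeyError` at a time.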

In this step, we'll load and validate the configuration.

In [None]:
# Define paths and load configuration
# Make sure your sandbox.yaml file is set up for your environment. You might have to specify the full path depending on your `cwd`.
# You can also try making your cwd for jupyter match your workspace python code:
# vscode menu -> Settings > Extensions > Jupyter > Notebook File Root
# change from ${fileDirname} to ${workspaceFolder}

SANDBOX_CONFIG_FILE = "sandbox.yaml"
VIDEO_PATH = Path("vision/mall.mp4")

def load_config(config_path):
    try:
        return EnvYAML(config_path)
    except FileNotFoundError:
        print(f"Error: Configuration file '{config_path}' not found.")
        return None
    except Exception as e:
        print(f"Error loading config: {e}")
        return None

scfg = load_config(SANDBOX_CONFIG_FILE)
assert scfg is not None and 'oci' in scfg and 'bucket' in scfg, "Check your sandbox.yaml config!"
oci_cfg = oci.config.from_file(os.path.expanduser(scfg["oci"]["configFile"]), scfg["oci"]["profile"])
bucket_cfg = scfg["bucket"]
compartment_id = scfg["oci"]["compartment"]

print("Configuration loaded successfully.")

## Step 3: Upload Video to Object Storage

**Key Concepts:**
- **Object Storage:** OCI's scalable storage for files. Videos must be uploaded here for analysis.
- **Bucket and Namespace:** Organize files in buckets within a namespace for access control.
- **Prefix:** Use prefixes to categorize objects, similar to folders.
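
Because a prefix is just the leading part of the object name, a stray or missing slash changes where the object lands. The sketch below shows one way to join a prefix and a file name safely; `build_object_name` is a hypothetical helper, not something the notebook or the OCI SDK provides.

```python
def build_object_name(prefix, file_name):
    """Join a prefix and a file name with exactly one slash between them."""
    clean_prefix = prefix.strip("/")
    if not clean_prefix:
        # No prefix: the object sits at the top level of the bucket.
        return file_name
    return f"{clean_prefix}/{file_name}"

print(build_object_name("videos/", "mall.mp4"))  # videos/mall.mp4
print(build_object_name("", "mall.mp4"))         # mall.mp4
```

If you adopt a helper like this, use it everywhere an object name is built (upload, input location, and result download) so that all three refer to the same object.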

In this step, we'll upload the selected video to Object Storage.

In [None]:
# Function to upload file to Object Storage
def upload(oci_cfg, bucket_cfg, file_path):
    if not file_path.exists():
        print(f"Error: File '{file_path}' not found.")
        return False
    object_storage_client = ObjectStorageClient(oci_cfg)
    print(f"Uploading file {file_path} ...")
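    # Open the file in a context manager so the handle is closed after the upload.
    with open(file_path, 'rb') as video_file:
        object_storage_client.put_object(
            bucket_cfg["namespace"],
            bucket_cfg["bucketName"],
            f"{bucket_cfg['prefix']}/{file_path.name}",
            video_file
        )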
    print("Upload completed!")
    return True

# Perform the upload
uploaded = upload(oci_cfg, bucket_cfg, VIDEO_PATH)
if not uploaded:
    raise ValueError("Upload failed. Check your video and configuration.")
else:
    print("Video uploaded successfully to Object Storage.")

## Step 4: Set Up Analysis Features

**Key Concepts:**
- **Video Analysis Features:** OCI Vision can analyze videos for labels (scenes), objects, text (OCR), and faces over time.
- **Feature Selection:** Choose analyses based on your needs to optimize cost and performance.
- **Input/Output Locations:** Specify where the video is stored and where results will be saved.
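
Feature selection can be driven by a short list of names rather than by editing the feature list by hand. The sketch below is only an illustration: the short keys (`"text"`, `"face"`, `"label"`, `"object"`) and the helper `select_feature_class_names` are hypothetical, while the class names are the ones this notebook imports from `oci.ai_vision.models`.

```python
# Hypothetical short keys mapped to the SDK class names used in this notebook.
FEATURE_CLASS_NAMES = {
    "text": "VideoTextDetectionFeature",
    "face": "VideoFaceDetectionFeature",
    "label": "VideoLabelDetectionFeature",
    "object": "VideoObjectDetectionFeature",
}

def select_feature_class_names(enabled):
    """Return SDK class names for the requested feature keys, in order."""
    unknown = [key for key in enabled if key not in FEATURE_CLASS_NAMES]
    if unknown:
        raise ValueError(f"Unknown feature keys: {unknown}")
    return [FEATURE_CLASS_NAMES[key] for key in enabled]

names = select_feature_class_names(["label", "object"])
print(names)  # ['VideoLabelDetectionFeature', 'VideoObjectDetectionFeature']

# With the OCI SDK installed, each name can be turned into a feature instance:
#   import oci.ai_vision.models as vision_models
#   features = [getattr(vision_models, name)() for name in names]
```

Requesting only the features you need keeps the job smaller, in line with the cost and performance note above.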

In this step, we'll define the features and prepare the analysis request.

In [None]:
# Import specific models
from oci.ai_vision.models import (
    ObjectLocation, ObjectListInlineInputLocation, OutputLocation,
    CreateVideoJobDetails, VideoObjectDetectionFeature,
    VideoFaceDetectionFeature, VideoLabelDetectionFeature, VideoTextDetectionFeature
)

# Define helper functions for input and output locations
def get_input_location(bucket_cfg, file_name):
    object_location = ObjectLocation(
        namespace_name=bucket_cfg["namespace"],
        bucket_name=bucket_cfg["bucketName"],
        object_name=f"{bucket_cfg['prefix']}/{file_name}",
    )
    return ObjectListInlineInputLocation(object_locations=[object_location])

def get_output_location(bucket_cfg):
    return OutputLocation(
        namespace_name=bucket_cfg["namespace"],
        bucket_name=bucket_cfg["bucketName"],
        prefix=bucket_cfg["prefix"],
    )

# Configure features: text detection, face detection, label detection, object detection
features = [
    VideoTextDetectionFeature(),
    VideoFaceDetectionFeature(),
    VideoLabelDetectionFeature(),
    VideoObjectDetectionFeature()
]

print("Analysis features configured.")

## Step 5: Submit Video Analysis Job

**Key Concepts:**
- **Asynchronous Processing:** Video analysis is asynchronous, meaning you submit a job and check its status later.
- **Job Creation:** Create a job with specified features, input location, output location, and compartment.
- **Job ID:** Use the returned job ID to track progress.

In this step, we'll submit the video for analysis.

In [None]:
# Initialize the Vision client
vision_client = oci.ai_vision.AIServiceVisionClient(config=oci_cfg)

# Create and submit the job
job_details = CreateVideoJobDetails(
    features=features,
    input_location=get_input_location(bucket_cfg, VIDEO_PATH.name),
    output_location=get_output_location(bucket_cfg),
    compartment_id=compartment_id,
)
res = vision_client.create_video_job(create_video_job_details=job_details)
job_id = None
if res is not None and hasattr(res, 'data') and hasattr(res.data, 'id'):
    job_id = res.data.id
    print(f"Job {job_id} created. State: {getattr(res.data,'lifecycle_state','?')}")
else:
    print(f"Error submitting video job: {res}")
    raise ValueError("Failed to create video job.")

## Step 6: Poll for Job Completion

**Key Concepts:**
- **Polling:** Regularly check the job's status until it completes.
- **Lifecycle States:** Jobs progress through states like 'ACCEPTED', 'IN_PROGRESS', 'SUCCEEDED', or 'FAILED'.
- **Progress Tracking:** Monitor percentage complete for user feedback.
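
A polling loop is easier to reason about when it has an upper bound on how long it waits. The following sketch separates the loop from the SDK call so it can be tried without a real job; `wait_for_job`, `ACTIVE_STATES`, and the default timeout are hypothetical choices, not values taken from the OCI documentation.

```python
import time

# States in which the job is still running, as used in this notebook.
ACTIVE_STATES = {"ACCEPTED", "IN_PROGRESS"}

def wait_for_job(get_state, interval_s=5, timeout_s=1800, sleep=time.sleep):
    """Call get_state() until it returns a non-active state or the timeout passes."""
    waited = 0
    while True:
        state = get_state()
        if state not in ACTIVE_STATES:
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"Job still {state} after {waited}s")
        sleep(interval_s)
        waited += interval_s

# Dry run with a fake sequence of states and no real sleeping.
fake_states = iter(["ACCEPTED", "IN_PROGRESS", "SUCCEEDED"])
final_state = wait_for_job(lambda: next(fake_states), sleep=lambda s: None)
print(final_state)  # SUCCEEDED

# With a real job, get_state could wrap the call used in this notebook:
#   wait_for_job(lambda: vision_client.get_video_job(video_job_id=job_id).data.lifecycle_state)
```

Passing the state lookup in as a function keeps the loop testable and lets the same helper serve other asynchronous jobs.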

In this step, we'll wait for the analysis job to finish.

In [None]:
# Poll for completion
poll_seconds = 0
state = None
if job_id is not None:
    while True:
        get_res = vision_client.get_video_job(video_job_id=job_id)
        state = getattr(get_res.data, 'lifecycle_state', None)
        percent = getattr(get_res.data, 'percent_complete', None)
        print(f"Status: {state} after {poll_seconds}s (progress={percent}%)")
        if state not in ["IN_PROGRESS", "ACCEPTED"]:
            break
        time.sleep(5)
        poll_seconds += 5
    print(f"Job finished with state: {state}")
else:
    raise ValueError("No job ID available.")

if state != "SUCCEEDED":
    raise ValueError(f"Job did not succeed. Final state: {state}")

## Step 7: Download and Parse Video Results

**Key Concepts:**
- **Result Retrieval:** Results are stored as JSON in Object Storage after job completion.
- **Result Structure:** Includes detections with timestamps, labels, confidence scores, and segments.
- **Parsing:** Extract and display insights in a readable format.
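
Once the JSON is loaded, it is often handy to flatten it into one list of rows that can be filtered and sorted. The sketch below assumes the result uses the same field names this notebook's parser reads (`videoLabels`, `segments`, `videoSegment`, and so on); `flatten_detections` is a hypothetical helper, and the sample data is invented for illustration, not real service output.

```python
# Result sections read by this notebook, mapped to a short kind label.
SECTIONS = {
    "videoLabels": "label",
    "videoObjects": "object",
    "videoTexts": "text",
    "videoFaces": "face",
}

def flatten_detections(data, min_confidence=0.0):
    """Return one row per detected segment, sorted by start time."""
    rows = []
    for section, kind in SECTIONS.items():
        for item in data.get(section, []):
            # Labels and objects carry "name", text lines carry "text".
            name = item.get("name") or item.get("text") or kind
            for seg in item.get("segments", []):
                conf = seg.get("confidence", 0)
                if conf < min_confidence:
                    continue
                span = seg.get("videoSegment", {})
                rows.append({
                    "kind": kind,
                    "name": name,
                    "start_ms": span.get("startTimeOffsetMs"),
                    "end_ms": span.get("endTimeOffsetMs"),
                    "confidence": conf,
                })
    # Rows without a start time sort last.
    rows.sort(key=lambda r: (r["start_ms"] is None, r["start_ms"] or 0))
    return rows

# Invented sample data for illustration only.
sample = {
    "videoLabels": [{"name": "Mall", "segments": [
        {"videoSegment": {"startTimeOffsetMs": 2000, "endTimeOffsetMs": 4000},
         "confidence": 0.91}]}],
    "videoTexts": [{"text": "EXIT", "segments": [
        {"videoSegment": {"startTimeOffsetMs": 500, "endTimeOffsetMs": 900},
         "confidence": 0.42}]}],
}
for row in flatten_detections(sample, min_confidence=0.5):
    print(row)
```

A flat list like this is a convenient starting point for the project ideas at the end of the notebook, for example counting detections per time window.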

In this step, we'll retrieve the results and display them.

In [None]:
# Function to pretty print the response
def pretty_print_response(data):
    if not isinstance(data, dict):
        print("Not a dict response!")
        return
    video_labels = data.get("videoLabels", [])
    if video_labels:
        print("\nLabels:")
        for lab in video_labels:
            print(f"  {lab.get('name', 'Unknown')}")
            for seg in lab.get("segments", []):
                span = seg.get("videoSegment", {})
                start = span.get("startTimeOffsetMs", "N/A")
                end = span.get("endTimeOffsetMs", "N/A")
                conf = seg.get("confidence", 0)
                print(f"    {start}-{end}ms  conf={conf:.2f}")
    video_objects = data.get("videoObjects", [])
    if video_objects:
        print("\nObjects:")
        for obj in video_objects:
            label = obj.get("name", "Unknown")
            for seg in obj.get("segments", []):
                span = seg.get("videoSegment", {})
                start = span.get("startTimeOffsetMs", "N/A")
                end = span.get("endTimeOffsetMs", "N/A")
                conf = seg.get("confidence", 0)
                print(f"    {start}-{end}ms  {label}  conf={conf:.2f}")
    video_texts = data.get("videoTexts", [])
    if video_texts:
        print("\nText Lines:")
        for txt in video_texts:
            content = txt.get("text", "")
            for seg in txt.get("segments", []):
                span = seg.get("videoSegment", {})
                start = span.get("startTimeOffsetMs", "N/A")
                end = span.get("endTimeOffsetMs", "N/A")
                conf = seg.get("confidence", 0)
                print(f"    {start}-{end}ms  \"{content}\"  conf={conf:.2f}")
    video_faces = data.get("videoFaces", [])
    if video_faces:
        print("\nFaces:")
        for face in video_faces:
            for seg in face.get("segments", []):
                span = seg.get("videoSegment", {})
                start = span.get("startTimeOffsetMs", "N/A")
                end = span.get("endTimeOffsetMs", "N/A")
                conf = seg.get("confidence", 0)
                print(f"    {start}-{end}ms  Face  conf={conf:.2f}")
    if not any([video_labels, video_objects, video_texts, video_faces]):
        print("No features detected.")

# Download and parse results
if job_id is not None and state == "SUCCEEDED":
    object_storage_client = ObjectStorageClient(oci_cfg)
    prefix = bucket_cfg['prefix']
    object_name = f"{prefix}/{job_id}/{prefix}/{VIDEO_PATH.name}.json"
    response = object_storage_client.get_object(
        bucket_cfg['namespace'], bucket_cfg['bucketName'], object_name)
    if hasattr(response, 'data') and hasattr(response.data, 'content'):
        json_data = json.loads(response.data.content.decode('utf-8'))
        pretty_print_response(json_data)
        print("\nResults parsed successfully.")
    else:
        print('Failed to download or parse job results!')
else:
    print("Job not successful, cannot retrieve results.")

## 🎉 Done! Next Steps

- Try different video files or formats.
- Experiment with enabling/disabling specific features.
- Monitor progress in real time for longer videos.

## üßë‚Äçüíª Project Ideas for Practice

Below are some practical project prompts. Try one (or all) after you run a basic video through the models!

1. **Security Monitoring:** Detect and report when a person enters a building without wearing a helmet or safety gear.
2. **Content Moderation:** Flag videos with inappropriate objects, text, or faces for automated review.
3. **Traffic Analysis:** Track vehicles and pedestrians in surveillance footage, counting entries/exits.
4. **Sports Highlight Detection:** Identify key moments in a game video based on object and label detection.
5. **Accessibility Tool:** Generate descriptions of video content for visually impaired users using detected labels and objects.

If you see errors, double-check your credentials and your `sandbox.yaml` settings first, then refer to the code comments and the documentation linked at the top of this notebook.

---
**Happy building!**