# DEEPX Tutorial 02 - Usage of DX_APP

In this second tutorial, we will introduce DX_APP and learn how to utilize a converted DXNN model in an AI application. Additionally, we will cover how to use a USB webcam as an input source.

This tutorial is based on dx-all-suite v2.0.0, released in September 2025.

## What is DX_APP?

**DX-APP** is a sample application that demonstrates how to run compiled models on an actual DEEPX NPU
using DX-RT. It includes ready-to-use code for common vision tasks such as object detection, face
recognition, and image classification. DX-APP helps developers quickly set up the runtime environment
and serves as a template for building and customizing their own AI applications.

For more details, download the DX_APP User Guide from 👉 [here](https://developer.deepx.ai/?files=MjUxOA==)!

Let's see the file structure of DX_APP:

In [None]:
%cd dx-all-suite

In [None]:
!tree -L 1 dx-runtime

In [None]:
#!tree -L 1 dx-runtime/dx_app
!tree -L 1 dx-runtime/dx_app/bin
!tree -L 1 dx-runtime/dx_app/demos

**DX-APP** demos are optimized to showcase pre-compiled models on DEEPX NPUs with minimal setup.
Each demo represents a common AI task and can be executed using images, videos, or live camera
input.

**Classification**
- Executes classification models with image inputs (e.g., 224x224).
- Outputs the Top-1 predicted class.
- Example: example/run_classifier/imagenet_example.json
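
To make "Top-1" concrete, here is a minimal sketch that picks the highest-scoring class from a list of scores. The `top1` helper, the scores, and the label names are hypothetical illustrations and are not part of DX_APP.

```python
def top1(scores, labels):
    """Return (label, score) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Index of the largest score
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]

# Hypothetical scores for three classes
label, score = top1([0.1, 0.7, 0.2], ["cat", "dog", "bird"])
print(label, score)
```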

**Object Detection**
- For image input, outputs result.jpg and prints detected objects to the terminal.
- For video input, displays bounding boxes on the output video.

**Pose Estimation**
- Detects people and estimates keypoints (joints) using image, video, or camera input.
- The output includes both bounding boxes and joint coordinates rendered on screen.

**Segmentation**
- For image input, saves results to result.jpg and prints info to the terminal.
- For video input, displays output with both detection boxes and segmentation masks. 

## Prerequisites

1. Move to `dx_app` directory:

In [None]:
%cd dx-runtime/dx_app

2. Download required models and sample videos by running the following command:

In [None]:
# Assets (models + videos) are downloaded and placed in the assets/ directory.
!./setup.sh

3. Verify that both models and videos are downloaded as expected:

In [None]:
# AI models converted to DXNN format
!tree assets/models

In [None]:
# video files for demo inputs
!tree assets/videos

## USB Webcam Basics

This hands-on section shows how to:
- Discover USB webcams and inspect capabilities with **V4L2** (`v4l2-ctl`).
- Configure formats (e.g., **MJPEG** or **YUYV**), resolution, FPS, and camera controls (exposure, focus, WB).
- Capture images and video with **OpenCV** (both windowed & headless modes).

### 0. Prerequisites (run once):

In [None]:
!sudo apt update
!sudo apt install -y v4l-utils
!pip install opencv-python

### 1. Environment Check

In [None]:
import sys, platform, subprocess, shutil, os, time, re, json, glob, pathlib
import cv2
from IPython.display import display, Markdown, clear_output

print("Python:", sys.version)
print("Platform:", platform.platform())
print("OpenCV:", cv2.__version__)

# Check v4l2-ctl availability
v4l2_path = shutil.which("v4l2-ctl")
print("v4l2-ctl:", v4l2_path if v4l2_path else "NOT FOUND - please `sudo apt install v4l-utils`")

### 2. Discover Video Devices

In [None]:
# List /dev/video* nodes
video_nodes = sorted(glob.glob("/dev/video*"))
print("Detected video nodes:", video_nodes)

# v4l2-ctl --list-devices gives a nice mapping (device -> /dev/videoX)
if v4l2_path:
    print("\n== v4l2-ctl --list-devices ==")
    print(subprocess.run(["v4l2-ctl", "--list-devices"], capture_output=True, text=True).stdout)
else:
    print("v4l2-ctl not available; skipping device listing via v4l2-ctl.")

In [None]:
#!ls /dev/video*
#!v4l2-ctl --list-devices
#!cat /sys/class/video4linux/video*/name

### 3. Choose Your Webcam Device
Set `DEVICE` to the `/dev/videoX` node of your webcam. If you're unsure, pick the first one that shows UVC capabilities in the previous step.
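
If several cameras are attached, it can help to turn the `v4l2-ctl --list-devices` text into a dictionary before choosing a node. The sketch below assumes the usual layout of that output (a header line ending in `:` followed by indented `/dev/...` nodes). `parse_list_devices` is a hypothetical helper, and the sample text is illustrative rather than captured from a real machine.

```python
def parse_list_devices(text):
    """Map each device name to its /dev nodes from `v4l2-ctl --list-devices` output."""
    devices = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented lines are nodes that belong to the most recent header
            if current is not None:
                devices[current].append(line.strip())
        else:
            # Header line, e.g. "HD Webcam (usb-0000:00:14.0-1):"
            current = line.rstrip().rstrip(":")
            devices[current] = []
    return devices

# Illustrative output; names and bus paths will differ on your machine
sample = (
    "HD Webcam (usb-0000:00:14.0-1):\n"
    "\t/dev/video0\n"
    "\t/dev/video1\n"
    "\n"
    "Integrated Camera (usb-0000:00:14.0-5):\n"
    "\t/dev/video2\n"
)
for name, nodes in parse_list_devices(sample).items():
    print(name, "->", nodes)
```

Many UVC webcams expose more than one node per camera, so a reasonable first guess is the first node listed under the camera's name. Confirm it with the capability check in the next step.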

In [None]:
# Change this to your webcam node if needed
DEVICE = "/dev/video0"
DEVICE

### 4. Inspect Capabilities, Formats, and Frame Sizes

In [None]:
if not os.path.exists(DEVICE):
    raise FileNotFoundError(f"{DEVICE} not found. Update DEVICE to a valid /dev/videoX.")

if v4l2_path:
    print("== v4l2-ctl --device --all ==")
    print(subprocess.run(["v4l2-ctl", f"--device={DEVICE}", "--all"], capture_output=True, text=True).stdout)

    print("\n== v4l2-ctl --device --list-formats-ext ==")
    print(subprocess.run(["v4l2-ctl", f"--device={DEVICE}", "--list-formats-ext"], capture_output=True, text=True).stdout)
else:
    print("v4l2-ctl not available; cannot show capabilities/formats.")

In [None]:
!v4l2-ctl --device {DEVICE} --all

In [None]:
!v4l2-ctl --device {DEVICE} --list-formats-ext

### 5. (Optional) Configure Format & FPS via V4L2

Two common pixel formats:
- **MJPG** (Motion JPEG): compressed frames, so lower USB bandwidth at the cost of CPU time for JPEG decoding → often best for 1080p+ over USB.
- **YUYV** (packed YUV 4:2:2): raw frames, higher bandwidth but low latency, no decode step, and no compression artifacts.

> We'll try setting **1920x1080 @ 30fps** with **MJPG**. Adjust if unsupported by your camera.
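
A rough calculation shows why raw frames are hard to carry at this resolution. The sketch below assumes a webcam on a USB 2.0 High Speed link (480 Mbit/s nominal signaling rate) and uses the fact that YUYV stores 2 bytes per pixel. `raw_yuyv_mbit` is a hypothetical helper, and real usable USB throughput is lower than the nominal figure.

```python
USB2_HIGH_SPEED_MBIT = 480  # nominal USB 2.0 signaling rate; usable throughput is lower

def raw_yuyv_mbit(width, height, fps):
    """Data rate of uncompressed YUYV video in Mbit/s (2 bytes per pixel)."""
    return width * height * 2 * fps * 8 / 1e6

for w, h in [(640, 480), (1280, 720), (1920, 1080)]:
    rate = raw_yuyv_mbit(w, h, 30)
    share = 100 * rate / USB2_HIGH_SPEED_MBIT
    print(f"{w}x{h} @ 30 fps: {rate:.1f} Mbit/s ({share:.0f}% of USB 2.0)")
```

Under these assumptions, raw 1080p at 30 fps needs roughly twice the nominal USB 2.0 rate, which is why cameras on such links typically offer that mode only as MJPG.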

In [None]:
PREFERRED_WIDTH, PREFERRED_HEIGHT, PREFERRED_FPS = 1920, 1080, 30
PREFERRED_FOURCC = "MJPG"  # or "YUYV"

if v4l2_path:
    print("Setting format via v4l2-ctl ...")
    cmds = [
        ["v4l2-ctl", f"--device={DEVICE}", f"--set-fmt-video=width={PREFERRED_WIDTH},height={PREFERRED_HEIGHT},pixelformat={PREFERRED_FOURCC}"],
        ["v4l2-ctl", f"--device={DEVICE}", f"--set-parm={PREFERRED_FPS}"]
    ]
    for c in cmds:
        res = subprocess.run(c, capture_output=True, text=True)
        print("$", " ".join(c))
        if res.stderr.strip():
            print("stderr:", res.stderr.strip())
        if res.stdout.strip():
            print(res.stdout.strip())
else:
    print("v4l2-ctl not available; skip format set.")

### 6. OpenCV Capture Basics

We'll show two patterns:

- **Headless (Notebook)**: display a few frames inline (no GUI windows).
- **Windowed (Desktop)**: show a live window. Use this on a local desktop with a display server.
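
One way to decide between the two patterns is to check whether the session advertises a display. The sketch below is a heuristic only: it looks at the `DISPLAY` (X11) and `WAYLAND_DISPLAY` environment variables, and `has_display` is a hypothetical helper, not an OpenCV function.

```python
import os

def has_display(env=None):
    """Heuristic: True if an X11 or Wayland display is advertised in the environment."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))

mode = "windowed" if has_display() else "headless"
print("Suggested preview mode:", mode)
```

A set variable does not guarantee that `cv2.imshow` will work (for example, headless OpenCV builds have no GUI support), so treat the result as a hint.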

In [None]:
def open_capture(dev="/dev/video0", width=1280, height=720, fps=30, fourcc="MJPG"):
    cap = cv2.VideoCapture(dev, cv2.CAP_V4L2)  # prefer V4L2 backend on Linux
    if fourcc:
        # set FOURCC before size/fps for reliability
        cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*fourcc))
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
    cap.set(cv2.CAP_PROP_FPS, fps)
    # Some drivers report after opening; query back
    actual = {
        "fourcc": int(cap.get(cv2.CAP_PROP_FOURCC)),
        "width": int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        "height": int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
        "fps": cap.get(cv2.CAP_PROP_FPS),
        "backend": int(cap.get(cv2.CAP_PROP_BACKEND))
    }
    return cap, actual

cap, actual = open_capture(DEVICE, PREFERRED_WIDTH, PREFERRED_HEIGHT, PREFERRED_FPS, PREFERRED_FOURCC)
print("Actual settings:", actual)
if not cap.isOpened():
    raise RuntimeError("Failed to open the camera. Check permissions and device node.")
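
The `fourcc` entry in `actual` is printed as an integer, which is hard to read. A FOURCC packs four ASCII characters into 32 bits, least significant byte first, so it can be decoded with plain bit operations. The helpers below are hypothetical names for illustration and do not require OpenCV.

```python
def fourcc_to_str(code):
    """Decode an integer FOURCC into its 4-character string."""
    code = int(code)
    return "".join(chr((code >> (8 * i)) & 0xFF) for i in range(4))

def str_to_fourcc(name):
    """Pack a 4-character string into an integer FOURCC (first character in the low byte)."""
    if len(name) != 4:
        raise ValueError("a FOURCC has exactly 4 characters")
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(name))

code = str_to_fourcc("MJPG")
print(code, "->", fourcc_to_str(code))
```

For example, `fourcc_to_str(actual["fourcc"])` would show whether the driver accepted the requested pixel format or fell back to another one.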

#### 6.1 Headless Preview (Inline Frames)

In [None]:
from IPython.display import display, Markdown
import ipywidgets as widgets

# Capture and display 2 snapshots inline
n_frames = 2
for i in range(n_frames):
    ok, frame = cap.read()
    if not ok:
        print("Failed to read frame")
        break
    # imencode expects BGR (the OpenCV default), so no color conversion is needed here
    ok, buf = cv2.imencode(".jpg", frame)
    if not ok:
        print("Failed to encode frame")
        break
    display(Markdown(f"**Frame {i+1}**"))
    display(widgets.Image(value=buf.tobytes(), format='jpg', width=512))

# Keep cap open for subsequent cells

#### 6.2 Windowed Live View

> Run this only on a local desktop with a display (won't work on headless servers). Press **q** to exit.

In [None]:
import cv2, time
win = "Live"
#cv2.namedWindow(win, cv2.WINDOW_NORMAL)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow(win, frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

### 7. Handling YUYV (Raw 4:2:2)
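
In packed YUYV, every 4 bytes describe 2 horizontally adjacent pixels: `Y0 U Y1 V`. Each pixel has its own luma (Y) value, while the pair shares one U and one V sample, which is what "4:2:2" refers to. The sketch below unpacks a single row in pure Python to show the layout. `unpack_yuyv` is a hypothetical helper for illustration, and in practice `cv2.cvtColor` does this work far faster.

```python
def unpack_yuyv(raw, width):
    """Expand one packed YUYV row into a list of per-pixel (Y, U, V) tuples."""
    if width % 2 or len(raw) != width * 2:
        raise ValueError("a YUYV row needs an even width and 2 bytes per pixel")
    pixels = []
    for i in range(0, len(raw), 4):
        y0, u, y1, v = raw[i:i + 4]
        # Both pixels of the pair reuse the same chroma samples
        pixels.append((y0, u, v))
        pixels.append((y1, u, v))
    return pixels

# One macropixel: a dark pixel and a bright pixel with neutral chroma
row = bytes([16, 128, 235, 128])
print(unpack_yuyv(row, 2))
```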

In [None]:
# If MJPG is unavailable or you prefer raw frames, try YUYV.
# We'll reopen with YUYV to demonstrate conversion.

cap.release()
cap, actual = open_capture(DEVICE, 640, 480, 30, "YUYV")
print("Reopened with YUYV. Actual:", actual)

ok, frame = cap.read()
if not ok:
    print("Failed to read YUYV frame; your camera/driver may not support raw at this size/fps.")
else:
    # Some backends already convert to BGR; if you get an unexpected shape (e.g., 2 channels), use cvtColor:
    # Example: frame = cv2.cvtColor(frame, cv2.COLOR_YUV2BGR_YUY2)
    # For demo, we will just show whatever we get:
    _, buf = cv2.imencode(".jpg", frame)
    display(Markdown("**YUYV frame**"))
    display(widgets.Image(value=buf.tobytes(), format='jpg', width=512))

# Release the camera so the demos below can open it
cap.release()

## Run Demos
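
The cells below call the demo binaries with `!` shell escapes. If you prefer to launch a demo from Python, a command list can be assembled first. This is a minimal sketch that uses only the flags appearing in the `yolo` commands of this notebook (`-m`, `-i`, `-v`, `-c`, `--camera_path`, `-p`). `build_yolo_cmd` is a hypothetical helper, and it assumes flag order does not matter. Run `./bin/yolo -h` for the authoritative option list.

```python
import shlex

def build_yolo_cmd(model, image=None, video=None, camera=None, param=1):
    """Assemble a ./bin/yolo command line for exactly one input source."""
    sources = [s for s in (image, video, camera) if s is not None]
    if len(sources) != 1:
        raise ValueError("choose exactly one of image, video, or camera")
    cmd = ["./bin/yolo", "-m", model]
    if image is not None:
        cmd += ["-i", image]
    elif video is not None:
        cmd += ["-v", video]
    else:
        cmd += ["-c", "--camera_path", camera]
    cmd += ["-p", str(param)]
    return cmd

cmd = build_yolo_cmd("assets/models/YOLOV5S_3.dxnn", camera="/dev/video0")
print(shlex.join(cmd))
# To execute from the dx_app directory: subprocess.run(cmd, check=True)
```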

In [None]:
#!ls bin
!./bin/classification -h
#!./bin/classification -m assets/models/EfficientNetB0_4.dxnn -i sample/ILSVRC2012/1.jpeg

In [None]:
#!./bin/yolo -h
!./bin/yolo -m assets/models/YOLOV5S_3.dxnn -i sample/face_sample.jpg -p 1
#!eog result.jpg
#!./bin/yolo -m assets/models/YOLOV5S_3.dxnn -v assets/videos/boat.mp4 -p 1
#!./bin/yolo -m assets/models/YOLOV5S_3.dxnn -p 1 -c
#!./bin/yolo -m assets/models/YOLOV5S_3.dxnn -p 1 -c --camera_path /dev/video0

In [None]:
!./bin/yolo_multi -h
#!cat example/yolo_multi/yolo_multi_demo.json
#!./bin/yolo_multi -c example/yolo_multi/yolo_multi_demo.json

In [None]:
!./bin/pose -h
#!./bin/pose -m assets/models/YOLOV5Pose640_1.dxnn -i sample/7.jpg -p 0
#!./bin/pose -m assets/models/YOLOV5Pose640_1.dxnn -v assets/videos/dance-solo.mov -p 0
#!./bin/pose -m assets/models/YOLOV5Pose640_1.dxnn -c -p 0

In [None]:
!./bin/segmentation -h
#!./bin/segmentation -m assets/models/DeepLabV3PlusMobileNetV2_2.dxnn -i sample/8.jpg
#!./bin/segmentation -m assets/models/DeepLabV3PlusMobileNetV2_2.dxnn -v assets/videos/blackbox-city-road.mp4
#!./bin/segmentation -m assets/models/DeepLabV3PlusMobileNetV2_2.dxnn -c

In [None]:
#!./bin/od_segmentation -h
#!./bin/od_segmentation -m0 assets/models/YoloV7.dxnn -p0 3 -m1 assets/models/DeepLabV3PlusMobileNetV2_2.dxnn -i sample/8.jpg
#!./bin/od_segmentation -m0 assets/models/YoloV7.dxnn -p0 3 -m1 assets/models/DeepLabV3PlusMobileNetV2_2.dxnn -v assets/videos/blackbox-city-road2.mov
!./bin/od_segmentation -m0 assets/models/YoloV7.dxnn -p0 3 -m1 assets/models/DeepLabV3PlusMobileNetV2_2.dxnn -c