# Part 1
Perform the camera calibration using built-in tools in OpenCV or MATLAB – use a smartphone camera for this assignment. 

In [1]:
PRE_CALIBRATED = True

## ChArUco Board + OpenCV Settings (Matches Printed Target)

This notebook uses a **ChArUco calibration target** to estimate the camera’s intrinsic parameters (camera matrix) and lens distortion coefficients using OpenCV.

### Printed board (calib.io default file)
The printed target corresponds to:
- **Rows × Columns:** 5 × 7 squares  
- **Checker (square) size:** 28 mm  
- **Marker size:** 21 mm  
- **Dictionary:** DICT_4X4  

These physical dimensions are used as **metric ground truth** during calibration.

### OpenCV parameters (must match the print)
```python
SQUARES_X = 7
SQUARES_Y = 5
SQUARE_LENGTH_M = 0.028   # 28 mm
MARKER_LENGTH_M = 0.021   # 21 mm
DICTIONARY_ID = cv2.aruco.DICT_4X4_50
```

In [2]:
import cv2
import numpy as np
import glob
from pathlib import Path

SQUARES_X = 7
SQUARES_Y = 5
SQUARE_LENGTH_M = 0.028   # 28 mm
MARKER_LENGTH_M = 0.021   # 21 mm
DICTIONARY_ID = cv2.aruco.DICT_4X4_50
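
As a quick sanity check on these constants, the sketch below derives what the board should contain: the number of interior chessboard corners, the number of ArUco markers, and the printed extent. `board_summary` is a hypothetical helper, not an OpenCV function, and the marker count assumes markers sit in every other square with a black square in the top-left position.

```python
def board_summary(squares_x, squares_y, square_len_m, marker_len_m):
    """Derive basic quantities of a ChArUco board from its parameters."""
    if marker_len_m >= square_len_m:
        raise ValueError("markers must be smaller than the squares that hold them")
    return {
        # Interior corners are where four squares meet.
        "inner_corners": (squares_x - 1) * (squares_y - 1),
        # Assumption: markers occupy the white squares, top-left square is black.
        "markers": (squares_x * squares_y) // 2,
        "width_mm": round(squares_x * square_len_m * 1000, 3),
        "height_mm": round(squares_y * square_len_m * 1000, 3),
    }


summary = board_summary(7, 5, 0.028, 0.021)
print(summary)
```

Under these assumptions the board needs fewer markers than the 50 available in `DICT_4X4_50`, and the printed extent can be compared against the sheet with a ruler before calibrating.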


## Calibration Frame Capture Step

This step collects raw calibration images from the camera and saves selected frames to disk.

- A directory named `calib_frames` is created to store captured images.
- The camera is opened using the DirectShow backend to ensure stable initialization on Windows.
- A capture resolution of 3840 × 2160 is requested and the resolution actually delivered is printed, so all images share known, identical dimensions.
- Frames are continuously read from the camera and displayed in a preview window.
- Pressing **SPACE** saves the current full-resolution frame as a JPEG file (quality 95) with a sequential name.
- Pressing **ESC** exits the capture loop.
- When finished, the camera is released and all OpenCV windows are closed.

The result is a set of static images that serve as input to the ChArUco detection and calibration steps.


In [3]:
if not PRE_CALIBRATED:
    out = Path("calib_frames")
    out.mkdir(parents=True, exist_ok=True)

    cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    # Check the device first; set()/get() on an unopened capture do not raise.
    if not cap.isOpened():
        cap.release()
        raise RuntimeError("Failed to open camera")

    cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 3840)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 2160)

    # The driver may ignore the request, so report what was actually granted.
    actual_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    actual_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    print(f"Capture resolution: {actual_w}x{actual_h}")

    idx = 0
    print("SPACE=save | ESC=quit")

    PREVIEW_MAX_W = 1600   # fits most monitors
    PREVIEW_MAX_H = 900

    while True:
        ok, frame = cap.read()
        if not ok:
            print("Frame read failed; exiting.")
            break

        # -------- preview only (does NOT affect saved image) --------
        preview = cv2.flip(frame, 1)  # mirror for usability

        h, w = preview.shape[:2]
        scale = min(PREVIEW_MAX_W / w, PREVIEW_MAX_H / h, 1.0)
        if scale < 1.0:
            preview = cv2.resize(
                preview,
                (int(w * scale), int(h * scale)),
                interpolation=cv2.INTER_AREA
            )
        # ------------------------------------------------------------

        cv2.imshow("SPACE=save | ESC=quit", preview)
        key = cv2.waitKey(1) & 0xFF

        if key == 27:
            print("ESC pressed; exiting.")
            break

        if key == 32:
            fname = out / f"frame_{idx:04d}.jpg"
            ok_write = cv2.imwrite(
                str(fname),
                frame,  # ORIGINAL 4K, NOT flipped, NOT resized
                [int(cv2.IMWRITE_JPEG_QUALITY), 95]
            )
            print(f"Saved {fname}" if ok_write else f"Failed to save {fname}")
            idx += 1

    cap.release()
    cv2.destroyAllWindows()
else:
    print("skipping")


skipping


## ChArUco Detection Step
This step processes the saved calibration images and extracts usable geometric measurements from them.

- The ChArUco board geometry and ArUco dictionary are defined so OpenCV knows the physical layout of the printed target.
- Each saved image is loaded and converted to grayscale.
- ArUco markers are detected in the image to identify which parts of the board are visible.
- Using the detected markers and the known board layout, OpenCV interpolates the ChArUco chessboard corner locations with subpixel accuracy.
- Images with no detected markers or fewer than 8 interpolated corners are discarded.
- Valid corner locations and their IDs are accumulated across all images.

The output of this step is a set of consistent 2D image points paired with known board points, which are then used by the camera calibration routine.
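
To make the "known board points" concrete, here is a minimal sketch of how a ChArUco corner ID could be turned into board-plane coordinates. It assumes the interior corners are numbered row by row, starting one square in from the board origin; `charuco_corner_xy` is a hypothetical helper. The notebook itself reads these coordinates from `board.getChessboardCorners()`, which remains the authoritative source.

```python
def charuco_corner_xy(corner_id, squares_x, squares_y, square_len_m):
    """Board-plane (x, y) in metres of an interior ChArUco corner.

    Assumption: IDs run row by row over the (squares_x - 1) x (squares_y - 1)
    grid of interior corners, starting one square in from the board origin.
    """
    cols = squares_x - 1
    rows = squares_y - 1
    if not 0 <= corner_id < cols * rows:
        raise ValueError(f"corner_id {corner_id} is outside 0..{cols * rows - 1}")
    col = corner_id % cols
    row = corner_id // cols
    return ((col + 1) * square_len_m, (row + 1) * square_len_m)


# First and last interior corners of the 7 x 5 board with 28 mm squares.
print(charuco_corner_xy(0, 7, 5, 0.028))
print(charuco_corner_xy(23, 7, 5, 0.028))
```

Because every corner carries its own ID, a partially visible board still yields valid 2D-to-board correspondences, which is why frames with only some corners detected can be used.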


In [4]:
aruco_dict = cv2.aruco.getPredefinedDictionary(DICTIONARY_ID)
board = cv2.aruco.CharucoBoard(
    (SQUARES_X, SQUARES_Y),
    SQUARE_LENGTH_M,
    MARKER_LENGTH_M,
    aruco_dict,
)

detector = cv2.aruco.ArucoDetector(
    aruco_dict, cv2.aruco.DetectorParameters()
)

charuco_corners = []
charuco_ids = []
image_size = None

for fname in sorted(glob.glob("calib_frames/*.jpg")):
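    img = cv2.imread(fname)
    if img is None:  # cv2.imread returns None for unreadable files
        print(f"Skipping unreadable image: {fname}")
        continue
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)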

    if image_size is None:
        image_size = (gray.shape[1], gray.shape[0])

    corners, ids, _ = detector.detectMarkers(gray)
    if ids is None:
        continue

    ok, c_corners, c_ids = cv2.aruco.interpolateCornersCharuco(
        corners, ids, gray, board
    )

    if ok is not None and ok >= 8:
        charuco_corners.append(c_corners)
        charuco_ids.append(c_ids)


## Camera Calibration Step

This step estimates the camera’s intrinsic parameters and lens distortion.

- `calibrateCameraCharuco` uses all detected ChArUco corner correspondences across images.
- The function fits a pinhole camera model with lens distortion by minimizing the reprojection error over all images.
- Outputs include the camera matrix (`K`), distortion coefficients (`dist`), and RMS reprojection error (`rms`).
- Per-image board poses (`rvecs`, `tvecs`) are also computed.

The results are saved to `camera_intrinsics.npz` so they can be reused for image undistortion and further processing without recalibrating.
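
For intuition about the `rms` value, the sketch below computes a root-mean-square reprojection error from matched point lists. It assumes the error is the RMS of the pixel distances between detected corners and the corners reprojected through the fitted model; `rms_reprojection_error` is a hypothetical helper and the points are made-up illustrations, not calibration results.

```python
import math


def rms_reprojection_error(observed, projected):
    """RMS pixel distance between observed and reprojected 2D points."""
    if not observed or len(observed) != len(projected):
        raise ValueError("need two non-empty point lists of equal length")
    squared = sum(
        (u - pu) ** 2 + (v - pv) ** 2
        for (u, v), (pu, pv) in zip(observed, projected)
    )
    return math.sqrt(squared / len(observed))


# Illustrative values only: one corner is off by (3, 4) px, the other is exact.
observed = [(0.0, 0.0), (10.0, 10.0)]
projected = [(3.0, 4.0), (10.0, 10.0)]
print(rms_reprojection_error(observed, projected))
```

A lower value means the fitted model explains the detections better; it says nothing about frames or image regions that were never sampled.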


In [5]:
n = len(charuco_ids)
m = len(charuco_corners)
print("charuco_ids:", n, "charuco_corners:", m, "image_size:", image_size)
if n == 0 or m == 0:
    raise RuntimeError("No usable ChArUco detections collected. Re-run detection, lower the minimum corner count, or capture clearer/closer board frames.")
if n != m:
    raise RuntimeError("Mismatch: charuco_ids and charuco_corners lengths differ. Re-run detection from scratch.")

rms, K, dist, rvecs, tvecs = cv2.aruco.calibrateCameraCharuco(
    charucoCorners=charuco_corners,
    charucoIds=charuco_ids,
    board=board,
    imageSize=image_size,
    cameraMatrix=None,
    distCoeffs=None,
)

np.savez(
    "camera_intrinsics.npz",
    rms=rms,
    camera_matrix=K,
    dist_coeffs=dist,
    image_size=image_size,
)


charuco_ids: 15 charuco_corners: 15 image_size: (3840, 2160)
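
Since the results are written to `camera_intrinsics.npz`, a later session could reload them instead of recalibrating. The sketch below is one way to do that, assuming the key names used in the `np.savez` call above; `load_intrinsics` is a hypothetical helper that is not part of the notebook.

```python
import numpy as np


def load_intrinsics(path="camera_intrinsics.npz"):
    """Load the camera matrix, distortion coefficients, image size and RMS."""
    with np.load(path) as data:
        K = data["camera_matrix"]
        dist = data["dist_coeffs"]
        # image_size was saved as a (width, height) tuple and comes back as an array.
        image_size = tuple(int(v) for v in data["image_size"])
        rms = float(data["rms"])
    return K, dist, image_size, rms
```

A call such as `K, dist, image_size, rms = load_intrinsics()` would restore the variables the later cells expect, provided frames are captured at the same resolution as `image_size`.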


## Live Undistortion Check

This step applies the calibrated camera parameters to a live video stream to visually verify the calibration.

- A new camera matrix is computed with `getOptimalNewCameraMatrix` (alpha = 0), which keeps only valid pixels in the undistorted view.
- Each captured frame is undistorted using the calibrated intrinsics.
- The undistorted image is displayed in real time.
- This provides a qualitative check that distortion has been corrected.
- No parameters are estimated or saved in this step.
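
To show what the distortion coefficients describe, here is a minimal sketch of the radial and tangential distortion model applied to a single normalized image point. It assumes the five-coefficient ordering `(k1, k2, p1, p2, k3)`; `distort_normalized` is a hypothetical helper, and the coefficient values in the example are illustrative, not this camera's.

```python
def distort_normalized(x, y, dist):
    """Apply radial + tangential distortion to a normalized image point.

    dist is assumed to be (k1, k2, p1, p2, k3).
    """
    k1, k2, p1, p2, k3 = dist
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_d, y_d


# Illustrative barrel distortion: a negative k1 pulls points toward the centre.
print(distort_normalized(1.0, 0.0, (-0.1, 0.0, 0.0, 0.0, 0.0)))
```

Undistortion inverts this mapping for every pixel, which is why straight edges that bow in the original window should appear straight in the undistorted one. With the notebook's array, the coefficients could be passed as `dist.ravel()[:5]`.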


In [6]:
if not PRE_CALIBRATED:
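    cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    if not cap.isOpened():
        raise RuntimeError("Camera failed to open")

    # K and dist are only valid at the calibration resolution, so request it here too.
    cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, image_size[0])
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, image_size[1])

    newK, _ = cv2.getOptimalNewCameraMatrix(K, dist, image_size, 0)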

    while True:
        ok, frame = cap.read()
        if not ok:
            break

        und = cv2.undistort(frame, K, dist, None, newK)

        cv2.imshow("original", frame)
        cv2.imshow("undistorted", und)

        if (cv2.waitKey(1) & 0xFF) == 27:
            break

    cap.release()
    cv2.destroyAllWindows()
else:
    print("skipping")

skipping


# Part 2
Implement, in Python or MATLAB, a script to find the real-world 2D dimensions of an object using perspective projection equations.

In [None]:

# ---- display scaling (does NOT affect math) ----
DISPLAY_MAX_W = 1600
DISPLAY_MAX_H = 900
disp_scale = 1.0

def compute_display_scale(h, w):
    return min(DISPLAY_MAX_W / w, DISPLAY_MAX_H / h, 1.0)

def to_full_res(x, y, scale):
    return int(x / scale), int(y / scale)

def to_display_res(x, y, scale):
    return int(x * scale), int(y * scale)

# ---- interactive points ----
pts = []          # 4 points in FULL-RES coordinates
drag_i = None

zoom_center = None
zoom_on = False
zoom_factor = 4
zoom_radius = 170

def ids_to_world_xy(ch_ids, board):
    ids = ch_ids.reshape(-1).astype(int)
    chess = board.getChessboardCorners()
    return chess[ids, :2].astype(np.float32)

def order_quad(pts4):
    pts4 = np.asarray(pts4, dtype=np.float32)
    c = pts4.mean(axis=0)
    ang = np.arctan2(pts4[:, 1] - c[1], pts4[:, 0] - c[0])
    pts4 = pts4[np.argsort(ang)]
    s = pts4.sum(axis=1)
    i0 = np.argmin(s)
    return np.roll(pts4, -i0, axis=0)

def quad_wh_m(world_xy4):
    p = np.asarray(world_xy4, dtype=np.float64)
    d01 = np.linalg.norm(p[1] - p[0])
    d12 = np.linalg.norm(p[2] - p[1])
    d23 = np.linalg.norm(p[3] - p[2])
    d30 = np.linalg.norm(p[0] - p[3])
    w = 0.5 * (d01 + d23)
    h = 0.5 * (d12 + d30)
    return w, h

def make_zoom_view(img_bgr, center_xy, radius, factor):
    h, w = img_bgr.shape[:2]
    cx, cy = center_xy
    x1 = max(0, cx - radius); x2 = min(w, cx + radius)
    y1 = max(0, cy - radius); y2 = min(h, cy + radius)
    crop = img_bgr[y1:y2, x1:x2].copy()
    if crop.size == 0:
        return None
    zoom = cv2.resize(crop, None, fx=factor, fy=factor, interpolation=cv2.INTER_NEAREST)
    zh, zw = zoom.shape[:2]
    cv2.line(zoom, (zw//2, 0), (zw//2, zh), (0, 255, 0), 1)
    cv2.line(zoom, (0, zh//2), (zw, zh//2), (0, 255, 0), 1)
    return zoom

def nearest_point_index(x, y, pts_list, r=25):
    if not pts_list:
        return None
    p = np.asarray(pts_list, dtype=np.float32)
    d2 = (p[:, 0] - x)**2 + (p[:, 1] - y)**2
    i = int(np.argmin(d2))
    return i if d2[i] <= r*r else None

def mouse_cb(event, x, y, flags, param):
    global pts, drag_i, zoom_center, zoom_on, disp_scale

    fx, fy = to_full_res(x, y, disp_scale)

    if event == cv2.EVENT_LBUTTONDOWN:
        i = nearest_point_index(fx, fy, pts, r=25)
        if i is not None:
            drag_i = i
        else:
            if len(pts) < 4:
                pts.append((float(fx), float(fy)))

    elif event == cv2.EVENT_MOUSEMOVE:
        if drag_i is not None:
            pts[drag_i] = (float(fx), float(fy))

    elif event == cv2.EVENT_LBUTTONUP:
        drag_i = None

    elif event == cv2.EVENT_RBUTTONDOWN:
        zoom_center = (fx, fy)
        zoom_on = True

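cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
if not cap.isOpened():
    raise RuntimeError("Failed to open camera")
# Request the calibration resolution; K and dist are only valid at that size.
cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 3840)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 2160)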

win = "Measure: LClick add/drag | RClick zoom | C clear | Z toggle zoom | ESC quit"
cv2.namedWindow(win)
cv2.setMouseCallback(win, mouse_cb)

last_dims_m = None

while True:
    ok, frame = cap.read()
    if not ok:
        break

    Hh, Ww = frame.shape[:2]
    newK, _ = cv2.getOptimalNewCameraMatrix(K, dist, (Ww, Hh), 0)
    und = cv2.undistort(frame, K, dist, None, newK)
    gray = cv2.cvtColor(und, cv2.COLOR_BGR2GRAY)

    marker_corners, marker_ids, _ = detector.detectMarkers(gray)
    have_H = False
    Hinv = None
    ch_corners = None
    ch_ids = None
    ok_i = None  # reset every frame; the status line reads it even when no markers are found

    if marker_ids is not None and len(marker_ids) > 0:
        ok_i, ch_corners, ch_ids = cv2.aruco.interpolateCornersCharuco(marker_corners, marker_ids, gray, board)
        if ok_i is not None and ch_ids is not None and ok_i >= 4:
            img_pts = ch_corners.reshape(-1, 2).astype(np.float32)
            world_xy = ids_to_world_xy(ch_ids, board)
            H_world_to_img, _ = cv2.findHomography(world_xy, img_pts, 0)
            if H_world_to_img is not None:
                Hinv = np.linalg.inv(H_world_to_img)
                have_H = True
    

    vis = und.copy()
    if ch_corners is not None and ch_ids is not None:
        cv2.aruco.drawDetectedCornersCharuco(vis, ch_corners, ch_ids)

    if len(pts) > 0:
        for i, (x, y) in enumerate(pts):
            dx, dy = to_display_res(x, y, 1.0)  # full-res draw on vis
            cv2.circle(vis, (int(dx), int(dy)), 10, (0, 255, 0), -1)
            cv2.putText(vis, str(i+1), (int(dx)+12, int(dy)-12),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0,255,0), 2)

        if len(pts) == 4:
            quad = order_quad(pts)
            cv2.polylines(vis, [quad.astype(np.int32).reshape(-1, 1, 2)], True, (0, 255, 0), 3)

            if have_H:
                pts_img = quad.astype(np.float32).reshape(-1, 1, 2)
                pts_world = cv2.perspectiveTransform(pts_img, Hinv).reshape(-1, 2)
                w_m, h_m = quad_wh_m(pts_world)
                last_dims_m = (w_m, h_m)
                cv2.putText(vis, f"W: {w_m*1000:.1f} mm", (30, 60),
                            cv2.FONT_HERSHEY_SIMPLEX, 1.4, (255,255,255), 3)
                cv2.putText(vis, f"H: {h_m*1000:.1f} mm", (30, 120),
                            cv2.FONT_HERSHEY_SIMPLEX, 1.4, (255,255,255), 3)
            else:
                cv2.putText(vis, "Need ChArUco in view for scale.", (30, 60),
                            cv2.FONT_HERSHEY_SIMPLEX, 1.2, (255,255,255), 3)

    # scale for display only
    disp_scale = compute_display_scale(vis.shape[0], vis.shape[1])
    disp = vis
    if disp_scale < 1.0:
        disp = cv2.resize(vis, (int(vis.shape[1]*disp_scale), int(vis.shape[0]*disp_scale)), interpolation=cv2.INTER_AREA)

    status = f"pts={len(pts)}  ok_i={int(ok_i) if ok_i is not None else -1}  have_H={have_H}"
    cv2.putText(disp, status, (20, 30),
            cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 255), 2)

    cv2.imshow(win, disp)

    if zoom_on and zoom_center is not None:
        zoom_img = make_zoom_view(vis, zoom_center, zoom_radius, zoom_factor)
        if zoom_img is not None:
            cv2.imshow("zoom", zoom_img)

    key = cv2.waitKey(1) & 0xFF
    if key == 27:
        break
    if key in (ord('c'), ord('C')):
        pts = []
        last_dims_m = None
    if key in (ord('z'), ord('Z')):
        zoom_on = not zoom_on
        if not zoom_on:
            cv2.destroyWindow("zoom")

cap.release()
cv2.destroyAllWindows()

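if last_dims_m is None:
    # Nothing was measured: fewer than four points, or the board was never in view.
    print("No measurement was taken (four points and a visible ChArUco board are required).")
else:
    w_mm = last_dims_m[0] * 1000
    h_mm = last_dims_m[1] * 1000

    print("Measured dimensions:")
    print(f"  Width : {w_mm:.1f} mm")
    print(f"  Height: {h_mm:.1f} mm")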



(np.float64(0.10022584880153934), np.float64(0.09538679635776284))
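
The measurement in the cell above rests on mapping clicked pixels into board coordinates through the inverse homography and then taking side lengths in metres. The sketch below reproduces that idea in plain Python so the arithmetic is visible. `apply_homography` and `quad_size` are hypothetical helpers, and `H_img_to_world` is a made-up matrix (0.5 mm per pixel with a shifted origin), not one estimated from a real frame.

```python
import math


def apply_homography(H, x, y):
    """Map (x, y) through a 3x3 homography given as nested lists."""
    X = H[0][0] * x + H[0][1] * y + H[0][2]
    Y = H[1][0] * x + H[1][1] * y + H[1][2]
    W = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(W) < 1e-12:
        raise ValueError("point maps to infinity")
    return X / W, Y / W  # divide out the homogeneous coordinate


def quad_size(points_world):
    """Mean length of opposite sides of a quadrilateral given in order."""
    side = [math.dist(points_world[i], points_world[(i + 1) % 4]) for i in range(4)]
    return 0.5 * (side[0] + side[2]), 0.5 * (side[1] + side[3])


# Illustrative image-to-board homography: 1 px = 0.5 mm, origin at pixel (100, 100).
H_img_to_world = [[0.0005, 0.0, -0.05],
                  [0.0, 0.0005, -0.05],
                  [0.0, 0.0, 1.0]]
corners_px = [(100, 100), (300, 100), (300, 200), (100, 200)]
corners_m = [apply_homography(H_img_to_world, u, v) for u, v in corners_px]
w_m, h_m = quad_size(corners_m)
print(f"{w_m * 1000:.1f} mm x {h_m * 1000:.1f} mm")
```

A homography relates two planes, so the result is only meaningful for points that lie in the same plane as the ChArUco board; an object raised above the board would be measured too large.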

In [8]:
cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)

cap.set(cv2.CAP_PROP_FRAME_WIDTH, 3840)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 2160)


True

In [9]:
actual_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
actual_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(actual_w, actual_h)


3840 2160


# Part 3


In [10]:
# assumes: K, dist, board, detector already exist

img = cv2.imread("photo.jpg")
h, w = img.shape[:2]
newK, _ = cv2.getOptimalNewCameraMatrix(K, dist, (w, h), 0)
und = cv2.undistort(img, K, dist, None, newK)
gray = cv2.cvtColor(und, cv2.COLOR_BGR2GRAY)

m_corners, m_ids, _ = detector.detectMarkers(gray)
ok_i, ch_corners, ch_ids = cv2.aruco.interpolateCornersCharuco(m_corners, m_ids, gray, board)

img_pts = ch_corners.reshape(-1, 2).astype(np.float32)
world_xy = board.getChessboardCorners()[ch_ids.reshape(-1).astype(int), :2].astype(np.float32)

H, _ = cv2.findHomography(world_xy, img_pts, 0)
Hinv = np.linalg.inv(H)



AttributeError: 'NoneType' object has no attribute 'shape'
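
The `AttributeError` above is consistent with `cv2.imread` returning `None`, which it does instead of raising when a file cannot be read. A minimal sketch of a guard that turns this into a clearer error follows; `require_image` is a hypothetical helper, not part of OpenCV.

```python
def require_image(img, path):
    """Return img unchanged, or raise a descriptive error if loading failed."""
    if img is None:
        raise FileNotFoundError(
            f"Could not read image {path!r}: check that the file exists, that the "
            "working directory is correct, and that the format is supported."
        )
    return img
```

In the cell above it could wrap the load as `img = require_image(cv2.imread("photo.jpg"), "photo.jpg")`, so a missing file is reported at the point of loading rather than at the first attribute access.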