## Part 1: Camera Calibration Using a Planar Checkerboard Target

Accurate metric reasoning from images requires a calibrated camera model that maps three-dimensional world points to two-dimensional image coordinates. In this work, we calibrate the camera using a planar checkerboard target and the standard pinhole camera model with lens distortion, following the forward imaging model and calibration framework described in the videos.

### Calibration Target and Data Acquisition

Calibration is performed using a planar checkerboard pattern displayed on a laptop screen. The checkerboard consists of **13 × 10 squares**, which gives **12 × 9 inner corner points**. OpenCV’s checkerboard detector expects the pattern size as a count of *inner corners*, not squares.
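The square-to-corner relationship can be checked with a short sketch. The helper name `inner_corners` is hypothetical and is not part of OpenCV; it only illustrates why a 13 × 10 board is passed to the detector as (12, 9).

```python
def inner_corners(squares_x, squares_y):
    """Return the inner-corner grid size for a board with the given square counts."""
    if squares_x < 2 or squares_y < 2:
        raise ValueError("a checkerboard needs at least 2 squares per side")
    # An inner corner sits where four squares meet, so each axis has
    # one fewer inner corner than it has squares.
    return squares_x - 1, squares_y - 1


pattern_size = inner_corners(13, 10)
print(pattern_size)  # (12, 9)
```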

A total of **five images** of the checkerboard are captured using a smartphone (iPhone 13 Pro) camera. The images are taken from different viewing angles and distances to ensure sufficient geometric diversity, which is necessary for stable estimation of both intrinsic and extrinsic parameters. All checkerboard corners are assumed to lie on a single plane with known relative geometry.

### World Coordinate Definition

The world coordinate system is defined on the checkerboard plane, with all points lying on Z_w = 0. The 3D coordinates of the checkerboard corner points are generated assuming a regular grid structure, with unit spacing between adjacent corners. Uniformly scaling the target coordinates only rescales the estimated translation vectors and leaves the intrinsic parameters unchanged, so absolute units are not required for intrinsic calibration at this stage.
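As a minimal sketch of the grid construction, the function below builds the planar world points for a given inner-corner pattern. The function name `make_object_points` and the `square_size` parameter are assumptions added for illustration; the notebook itself uses unit spacing, and the 0.02 value is a hypothetical square size, not a measurement from this experiment.

```python
import numpy as np


def make_object_points(pattern_size, square_size=1.0):
    """Build (N, 3) float32 world points for a planar grid with Z_w = 0."""
    cols, rows = pattern_size
    objp = np.zeros((cols * rows, 3), np.float32)
    # x varies fastest, so the points are ordered row by row across the board
    objp[:, :2] = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2)
    return objp * square_size


unit = make_object_points((12, 9))                       # unit spacing, as in the notebook
scaled = make_object_points((12, 9), square_size=0.02)   # hypothetical 2 cm squares

print(unit.shape)
print(unit[:3])
```

If a physical square size were supplied, the per-image translation vectors would be expressed in that unit, which is only needed when metric extrinsics matter.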

### Calibration Procedure

For each image:
- Checkerboard inner corners are detected using OpenCV’s built-in checkerboard corner detection function.
- Subpixel refinement is applied to improve localization accuracy.
- Corresponding 3D-2D point pairs are accumulated across all images.

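The accumulated correspondences are then passed to `cv2.calibrateCamera`, which jointly estimates the intrinsic matrix, the distortion coefficients, and the per-image extrinsics.

Calibration quality is evaluated using the reprojection error, computed as the root-mean-square (RMS) pixel distance between observed image points and the projected points obtained using the estimated camera parameters. The notebook output labels this value as the mean reprojection error, but it is an RMS quantity.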
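The error metric can be sketched independently of OpenCV as follows. The function name `rms_reprojection_error` and the sample points are illustrative assumptions; the notebook accumulates the same quantity across all images using `cv2.projectPoints` and `cv2.norm`.

```python
import numpy as np


def rms_reprojection_error(observed, projected):
    """RMS Euclidean distance (in pixels) between matched 2D point sets."""
    # reshape(-1, 2) also accepts OpenCV-style (N, 1, 2) arrays
    observed = np.asarray(observed, dtype=np.float64).reshape(-1, 2)
    projected = np.asarray(projected, dtype=np.float64).reshape(-1, 2)
    if observed.shape != projected.shape:
        raise ValueError("point sets must contain the same number of points")
    sq_dist = np.sum((observed - projected) ** 2, axis=1)  # squared distance per point
    return float(np.sqrt(sq_dist.mean()))


# Made-up points: one corner is off by (3, 4) px, the other matches exactly
obs = [[100.0, 200.0], [150.0, 250.0]]
proj = [[103.0, 204.0], [150.0, 250.0]]
print(rms_reprojection_error(obs, proj))
```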


In [1]:
import cv2
import numpy as np
import glob

In [2]:
# Pattern size counts inner corners only: a 13x10-square board has 12x9 inner corners
checkerboard = (12, 9)

# World points on planar calibration target (Z = 0)
objp = np.zeros((checkerboard[0] * checkerboard[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:checkerboard[0], 0:checkerboard[1]].T.reshape(-1, 2)

# Containers for 3D world points and corresponding 2D image points
objpoints = []
imgpoints = []

In [3]:
# Calibration images of checkerboard
images = sorted(glob.glob("calibration_img/image*.jpeg"))

for fname in images:
    img = cv2.imread(fname)
    if img is None:
        # cv2.imread returns None for unreadable files; skip them
        continue
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Detect checkerboard inner corners using built in opencv function
    found, corners = cv2.findChessboardCorners(
        gray,
        checkerboard,
        cv2.CALIB_CB_ADAPTIVE_THRESH +
        cv2.CALIB_CB_NORMALIZE_IMAGE
    )

    if found:
        # Subpixel refinement to reduce image point localization error
        corners = cv2.cornerSubPix(
            gray,
            corners,
            (11, 11),
            (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
        )
        objpoints.append(objp)
        imgpoints.append(corners)

# Estimate intrinsic matrix, distortion coefficients, and per-image extrinsics
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints,
    imgpoints,
    gray.shape[::-1],
    None,
    None
)

In [4]:
# Reprojection error computation to evaluate calibration accuracy

error = 0
points = 0

for i in range(len(objpoints)):
    projected, _ = cv2.projectPoints(
        objpoints[i],
        rvecs[i],
        tvecs[i],
        K,
        dist
    )
    error += cv2.norm(imgpoints[i], projected, cv2.NORM_L2) ** 2
    points += len(projected)

reprojection_error = np.sqrt(error / points)

print("Camera matrix (K):\n", K)
print("Distortion coefficients:\n", dist.ravel())
print("Mean reprojection error:", reprojection_error)


Camera matrix (K):
 [[3.14429337e+03 0.00000000e+00 1.44940526e+03]
 [0.00000000e+00 2.98493024e+03 1.96337804e+03]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
Distortion coefficients:
 [ 1.71646648e-01 -1.83120246e+00 -1.92813775e-02 -3.84685456e-03
  3.95203665e+00]
Mean reprojection error: 1.7112395295119254


## Part 2 and Part 3: Real-World 2D Measurement and Experimental Validation

In this section, we estimate the real-world two-dimensional dimensions of a planar object from a single calibrated image and validate the results using physical measurements.

### Measurement Setup and Assumptions

A book is used as the test object. The book is placed upright against a wall so that its surface is approximately parallel to the image plane. An image of the book is captured using the same calibrated smartphone camera from a measured distance of approximately 2.2 meters.

The following assumptions are made (as per the videos):
- All points on the book lie on a single planar surface.
- The depth of the object relative to the camera is approximately constant across the object.
- The object dimensions are small compared to the camera-to-object distance, allowing magnification to be treated as constant.
- Camera intrinsic parameters and lens distortion coefficients obtained in Part 1 are known and fixed.

### Real-World Dimension Estimation

The captured image is first undistorted using the calibrated camera parameters. Four corner points of the book are then manually selected in pixel coordinates, following the manual selection approach suggested in the videos.

Pixel distances between the selected corner points are computed along the width and height of the book. These pixel measurements are converted into real-world dimensions using the calibrated focal lengths and the known camera-to-object distance. This procedure directly follows the image magnification relationship described in the videos: the size of an object in the image is proportional to the focal length and inversely proportional to depth, so for a given pixel extent the recovered real-world size scales linearly with depth.
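Under the pinhole model, a length L at depth Z projects to l_px = f · L / Z pixels, so L = Z · l_px / f. The sketch below expresses that conversion; the function name `pixel_to_metric` and the sample numbers are assumptions chosen for illustration and are not values from this experiment.

```python
def pixel_to_metric(length_px, focal_px, depth_m):
    """Convert an image-plane length in pixels to meters at a known depth."""
    if focal_px <= 0 or depth_m <= 0:
        raise ValueError("focal length and depth must be positive")
    # Pinhole magnification: length_px = focal_px * length_m / depth_m
    return depth_m * length_px / focal_px


# Illustrative values: a 300 px extent, 3000 px focal length, 2 m depth
print(pixel_to_metric(300.0, 3000.0, 2.0))  # 0.2
```

Because the width and height use separate focal lengths (f_x and f_y), the conversion is applied once per axis.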

### Experimental Validation

To validate the estimated dimensions, the physical size of the book is measured using a ruler. The measured book dimensions are 7.6 inches (0.193 m) in width and 10 inches (0.254 m) in height. These measurements are converted to metric units and compared against the estimated dimensions obtained from the image.

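Absolute and percentage errors are computed for both width and height. The estimated width (0.174 m) and height (0.224 m) fall short of the ruler measurements by approximately 10 percent and 12 percent, respectively, which we consider reasonably accurate for a single-image estimate that relies on manually selected corners and an approximate distance measurement.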
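The error computation can be sketched as follows, using the ruler measurements stated above and the rounded estimates printed in the notebook output (0.174 m and 0.224 m). The helper name `measurement_error` is hypothetical, and because the inputs are rounded to three decimals, the resulting percentages are approximate.

```python
INCH_TO_M = 0.0254  # exact by definition of the inch


def measurement_error(estimated_m, true_m):
    """Return (absolute error in meters, percentage error) of an estimate."""
    abs_err = abs(estimated_m - true_m)
    return abs_err, 100.0 * abs_err / true_m


true_width_m = 7.6 * INCH_TO_M
true_height_m = 10.0 * INCH_TO_M

# Estimates copied from the notebook's printed output (rounded values)
results = {
    "width": measurement_error(0.174, true_width_m),
    "height": measurement_error(0.224, true_height_m),
}
for name, (abs_err, pct_err) in results.items():
    print(f"{name}: absolute error {abs_err:.3f} m, percentage error {pct_err:.1f} %")
```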


In [5]:
# Known camera-to-object distance (meters)
Z = 2.2

# Load image of object (a book for this notebook)
img = cv2.imread("detection_img/book.jpg")
h, w = img.shape[:2]

# Undistort image
new_K, _ = cv2.getOptimalNewCameraMatrix(K, dist, (w, h), 1)
img_ud = cv2.undistort(img, K, dist, None, new_K)

# Manually selected pixel coordinates of the book corners in the undistorted image
# Order: top-left, top-right, bottom-left, bottom-right
pts = np.array([
    [1404, 2364],
    [1648, 2352],
    [1410, 2682],
    [1664, 2682]
], dtype=np.float32)

# Pixel distances
width_px = np.linalg.norm(pts[1] - pts[0])
height_px = np.linalg.norm(pts[2] - pts[0])

# Focal lengths in pixels, taken from new_K to match the undistorted image
fx = new_K[0, 0]
fy = new_K[1, 1]

# Metric dimensions via perspective projection
width_m = (Z / fx) * width_px
height_m = (Z / fy) * height_px

print(f"Pixel width: {width_px:.2f} px")
print(f"Pixel height: {height_px:.2f} px")
print(f"Estimated real-world width: {width_m:.3f} m")
print(f"Estimated real-world height: {height_m:.3f} m")


Pixel width: 244.29 px
Pixel height: 318.06 px
Estimated real-world width: 0.174 m
Estimated real-world height: 0.224 m
