# Geometry Interpreter Module – Structural Geometry Pipeline
**Author**: Thaddeus da Silva Correa  
**Project**: Automated Extraction and Interpretation of Structural Geometry from CAD Drawings for BIM Integration

**Module**: 2 of 4 – Geometry Interpreter  
**Environment**: Google Colab  
**Last updated**: June 2025

---

This module reads structured geometry data (`*_geometry.json` files) and interprets it as 3D geometric primitives (extrusions and revolutions) grouped into labeled parts. It classifies parts such as channels, anchors, and profiles from their geometry and writes BIM-compatible output for structural engineering applications.

**Inputs**: Geometry JSON files (produced by the DXF Parser)  
**Outputs**: Part definitions in JSON, including labeled extrusions and revolutions with metadata
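
For orientation, here is a minimal sketch of the input this module appears to expect. Only the key names are taken from the interpreter functions defined later in this notebook (`chains` or `edge_chains`, `circles`, `dimensions`, `texts`, `arcs`, `features`); every coordinate, layer name, and text value below is made up for illustration, and the real DXF Parser output may carry additional fields.

```python
import json

# Hypothetical example input; only the key names come from the interpreter code.
example_geometry = {
    "chains": [
        {
            "type": "extrusion",   # declared type: "extrusion" or "revolution"
            "is_closed": True,
            "is_cutout": False,
            "layer": "channel",
            "points": [[0, 0], [40, 0], [40, 25], [0, 25]],
        }
    ],
    "circles": [
        {"center": [20, 12.5], "radius": 4.0, "is_cutout": True, "layer": "holes"}
    ],
    "dimensions": [
        {"type": "linear", "points": [[0, 0, 0], [0, 0, 3000]]}
    ],
    "texts": [
        {"value": "HTA-EXAMPLE", "position": [10, 10, 0]}  # placeholder code
    ],
    "arcs": [],
    "features": [],
}

# Round-trip through JSON to confirm the structure serializes unchanged.
restored = json.loads(json.dumps(example_geometry))
print(sorted(restored.keys()))
```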


## 1. Setup  
Import necessary libraries and define file system paths.


In [173]:
import json
import math
from typing import List, Dict, Any
from pathlib import Path
from collections import Counter

## 2. Core Parsing Functions  
This section defines the core functions that interpret and classify geometry from the parsed files. The functions are organized into the following categories:

**A. Geometry Utilities**  
**B. Profile Detection & Bounding Logic**  
**C. Part Labeling & Metadata Assignment**  
**D. Cutout Detection & Assignment**  
**E. Main Interpreter**  
**F. File I/O Utilities**



### A. Geometry Utilities  
Basic point math: distance, centroid calculation, and bounding box computation.
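
As a quick illustration of what these utilities compute, the sketch below works through distance, centroid, and bounding box for a single rectangle. It is written inline with the standard library rather than calling the notebook's functions, and the 40 x 25 rectangle is an illustrative value, not project data.

```python
import math

# Corner points of a 40 x 25 rectangle (illustrative values only).
rect = [[0.0, 0.0], [40.0, 0.0], [40.0, 25.0], [0.0, 25.0]]

# Euclidean distance between two opposite corners (the diagonal).
diagonal = math.dist(rect[0], rect[2])

# Centroid as the arithmetic mean of the vertices; 2D points get z = 0.
n = len(rect)
centroid = [sum(p[0] for p in rect) / n, sum(p[1] for p in rect) / n, 0.0]

# Axis-aligned 2D bounding box.
bbox = {
    "min": [min(p[0] for p in rect), min(p[1] for p in rect)],
    "max": [max(p[0] for p in rect), max(p[1] for p in rect)],
}

print(round(diagonal, 3), centroid, bbox)
```

Note that the centroid here is the mean of the vertices. It coincides with the area centroid for symmetric shapes such as this rectangle, but not for arbitrary polygons.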



In [174]:
def distance(p1: List[float], p2: List[float]) -> float:
    """
    Calculate the Euclidean distance between two 2D or 3D points.

    Args:
        p1: First point as a list of floats [x, y, (z)].
        p2: Second point as a list of floats [x, y, (z)].

    Returns:
        Euclidean distance as a float.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))


def compute_centroid(points: List[List[float]]) -> List[float]:
    """
    Compute the centroid (geometric center) of a set of 2D or 3D points.

    Args:
        points: List of points, each a list of coordinates [x, y, (z)].

    Returns:
        A list representing the centroid [x, y, z].
    """
    if not points:
        return [0.0, 0.0, 0.0]

    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    z = sum(p[2] if len(p) > 2 else 0.0 for p in points) / len(points)

    return [x, y, z]


def compute_combined_bounding_box(parts: List[Dict[str, Any]]) -> Dict[str, List[float]]:
    """
    Compute a combined 3D bounding box from multiple parts with individual bounding boxes.

    Args:
        parts: List of parts, each containing a 'boundingBox' dictionary with 'min' and 'max' 3D points.

    Returns:
        A dictionary with combined 'min' and 'max' bounding box coordinates.
    """
    all_coords = []
    for part in parts:
        bbox = part.get("boundingBox")
        if bbox:
            all_coords.extend([bbox["min"], bbox["max"]])

    if not all_coords:
        return {}

    min_x = min(c[0] for c in all_coords)
    min_y = min(c[1] for c in all_coords)
    min_z = min(c[2] for c in all_coords)
    max_x = max(c[0] for c in all_coords)
    max_y = max(c[1] for c in all_coords)
    max_z = max(c[2] for c in all_coords)

    return {"min": [min_x, min_y, min_z], "max": [max_x, max_y, max_z]}


def compute_2d_bbox(points: List[List[float]]) -> Dict[str, List[float]]:
    """
    Compute the 2D bounding box of a set of 2D points.

    Args:
        points: List of points, each a list [x, y].

    Returns:
        Dictionary with 'min' and 'max' 2D bounds.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return {
        "min": [min(xs), min(ys)],
        "max": [max(xs), max(ys)]
    }


### B. Profile Detection & Bounding Logic  
Detect profiles and part types, infer bounding boxes, estimate extrusion parameters, and classify structural features (e.g., U-channels, rebar, cuboids).
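
One of the heuristics in this group is a radial-symmetry test for round bars. The sketch below shows the idea in isolation, assuming a 10% tolerance on the spread of radii; the helper name `is_roughly_circular` and the sample shapes are hypothetical and exist only for this example.

```python
import math

def is_roughly_circular(points, tolerance=0.1):
    """Return True when all points sit at nearly the same distance from their mean."""
    if len(points) < 3:
        return False
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    radii = [math.hypot(p[0] - cx, p[1] - cy) for p in points]
    mean_radius = sum(radii) / len(radii)
    return (max(radii) - min(radii)) < tolerance * mean_radius

# A 12-sided polygon approximating a bar of radius 8 centered at (5, 5).
bar = [[5 + 8 * math.cos(math.radians(a)), 5 + 8 * math.sin(math.radians(a))]
       for a in range(0, 360, 30)]

# A thin triangle, which is clearly not round.
sliver = [[0, 0], [100, 0], [0, 10]]

print(is_roughly_circular(bar), is_roughly_circular(sliver))
```

One caveat of a purely radial test: the four corners of a rectangle are all equidistant from their mean, so a bare 4-point rectangle also passes. The order in which shape checks are applied therefore matters.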


In [175]:
def detect_revolution_axis(profile_points: List[List[float]], arcs: List[Dict] = [], features: List[Dict] = []) -> Dict[str, List[float]]:
    """
    Detect the likely axis of revolution for a given closed profile.

    Priority is given to dominant arc centers; if unavailable, centerline features or centroid fallback is used.

    Args:
        profile_points: List of profile boundary points.
        arcs: Optional list of arcs associated with the profile.
        features: Optional list of detected centerline/feature curves.

    Returns:
        A dictionary with 'origin' and 'direction' of the revolution axis.
    """
    def normalize(v):
        mag = math.sqrt(sum(x**2 for x in v))
        return [x / mag for x in v] if mag else [0, 0, 1]

    # Try using dominant arc centers
    arc_centers = [tuple(arc.get("center", [0, 0, 0])) for arc in arcs]
    if arc_centers:
        center_counts = {}
        for c in arc_centers:
            rounded = tuple(round(val, 2) for val in c)
            center_counts[rounded] = center_counts.get(rounded, 0) + 1
        dominant_center = max(center_counts.items(), key=lambda x: x[1])[0]
        for arc in arcs:
            if tuple(round(v, 2) for v in arc.get("center", [0, 0, 0])) == dominant_center:
                return {
                    "origin": list(dominant_center),
                    "direction": arc.get("normal", [0, 0, 1])
                }

    # Fallback: use centroid
    centroid = compute_centroid(profile_points)

    # Try to infer axis direction from features (e.g., centerlines)
    for feat in features:
        points = feat.get("points", [])
        if len(points) >= 2 and distance(centroid, points[0]) < 100:
            p1, p2 = points[0], points[-1]
            direction = [b - a for a, b in zip(p1, p2)]
            return {"origin": p1, "direction": normalize(direction)}

    # Final fallback
    return {"origin": centroid, "direction": [0, 0, 1]}


def detect_rotated_profile(profile: List[List[float]]) -> bool:
    """
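    Heuristically flag a 2D profile as rotated (non-orthogonal).

    The test only looks at the axis-aligned bounding box: any profile whose
    width/height ratio falls outside 0.9 to 1.1 is flagged.

    Args:
        profile: List of 2D points.

    Returns:
        True if the aspect ratio indicates rotation, False otherwise.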
    """
    bbox = compute_2d_bbox(profile)
    width = abs(bbox["max"][0] - bbox["min"][0])
    height = abs(bbox["max"][1] - bbox["min"][1])
    if height == 0:
        return False
    aspect_ratio = width / height
    return aspect_ratio < 0.9 or aspect_ratio > 1.1


def estimate_sweep_angle(profile_points: List[List[float]], arcs: List[Dict] = []) -> int:
    """
    Estimate the sweep angle of a profile based on arc definitions or angular gaps.

    Args:
        profile_points: Profile perimeter points.
        arcs: Optional list of arcs with start/end angles.

    Returns:
        Estimated sweep angle (0–360°).
    """
    # Priority: Arc angles
    total_sweep = 0
    for arc in arcs:
        if "start_angle" in arc and "end_angle" in arc:
            sweep = (arc["end_angle"] - arc["start_angle"]) % 360
            total_sweep += sweep
    if total_sweep > 0:
        return min(round(total_sweep), 360)

    # Fallback: compute from point angles around centroid.
    # Guard first: an empty profile would otherwise divide by zero below.
    if not profile_points:
        return 360

    center = [
        sum(p[0] for p in profile_points) / len(profile_points),
        sum(p[1] for p in profile_points) / len(profile_points)
    ]
    angles = []
    for p in profile_points:
        dx, dy = p[0] - center[0], p[1] - center[1]
        angle = math.degrees(math.atan2(dy, dx)) % 360
        angles.append(angle)

    angles.sort()
    max_gap = max((angles[(i + 1) % len(angles)] - a) % 360 for i, a in enumerate(angles))
    return min(round(360 - max_gap), 360)


def find_extrusion_path(profile_chain: Dict, features: List[Dict], edges: List[Dict]) -> List[List[float]] | None:
    """
    Identify the extrusion path for a profile, if a feature like a polyline path exists.

    Args:
        profile_chain: A closed chain dictionary with 'points'.
        features: List of feature geometries (e.g., polylines).
        edges: Original edge data (not always used).

    Returns:
        A list of points representing the extrusion path, or None if undetected.
    """
    profile_centroid = compute_centroid(profile_chain["points"])
    for feat in features:
        if feat.get("type") != "polyline":
            continue
        feat_pts = feat.get("points", [])
        if len(feat_pts) < 2:
            continue
        if len(feat_pts) == 2:
            return feat_pts  # Simple line = valid path

        # Check for curvature or non-orthogonal direction
        direction_changes = 0
        for i in range(1, len(feat_pts) - 1):
            p1, p2, p3 = feat_pts[i - 1], feat_pts[i], feat_pts[i + 1]
            angle1 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
            angle2 = math.atan2(p3[1] - p2[1], p3[0] - p2[0])
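            # Wrap the difference so headings on either side of ±π compare correctly
            angle_diff = abs(angle2 - angle1)
            angle_diff = min(angle_diff, 2 * math.pi - angle_diff)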
            if angle_diff > math.radians(10):
                direction_changes += 1
        if direction_changes > 1:
            return feat_pts

        # Fallback: check if it's non-cardinal angle
        dx = feat_pts[-1][0] - feat_pts[0][0]
        dy = feat_pts[-1][1] - feat_pts[0][1]
        angle = math.degrees(math.atan2(dy, dx)) % 360
        # Compare with a small tolerance; exact float equality rarely holds
        off_axis = min(angle % 90, 90 - angle % 90)
        if off_axis > 1e-6:
            return feat_pts
    return None

def compute_bounding_box_from_profile(profile_2d: List[List[float]], path_3d: List[List[float]]) -> Dict[str, List[float]]:
    """
    Compute the 3D bounding box from a 2D profile and its extrusion path.

    Args:
        profile_2d: List of [x, z] coordinates.
        path_3d: List of [x, y, z] path points.

    Returns:
        Dictionary with 'min' and 'max' 3D bounding box corners.
    """
    x_vals = [p[0] for p in profile_2d]
    z_vals = [p[1] for p in profile_2d]
    y_vals = [p[1] for p in path_3d]
    return {
        "min": [min(x_vals), min(y_vals), min(z_vals)],
        "max": [max(x_vals), max(y_vals), max(z_vals)]
    }

def detect_double_parallel(profile: List[List[float]]) -> bool:
    """
    Detect whether the profile has exactly two pairs of nearly parallel edges (e.g., a U-channel).

    Args:
        profile: List of 2D points forming a closed loop.

    Returns:
        True if exactly two edge pairs are parallel within 10°, otherwise False.
    """
    if len(profile) < 4:
        return False
    edges = [(profile[i], profile[(i + 1) % len(profile)]) for i in range(len(profile))]
    parallel_edges = []

    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            dx1 = edges[i][1][0] - edges[i][0][0]
            dy1 = edges[i][1][1] - edges[i][0][1]
            dx2 = edges[j][1][0] - edges[j][0][0]
            dy2 = edges[j][1][1] - edges[j][0][1]
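            angle_diff = abs(math.degrees(math.atan2(dy2, dx2) - math.atan2(dy1, dx1))) % 180
            # Fold into [0, 90] so near-antiparallel edges (e.g. 179°) also count as parallel
            angle_diff = min(angle_diff, 180 - angle_diff)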
            if angle_diff < 10:
                parallel_edges.append((i, j))

    return len(parallel_edges) == 2

def detect_cuboid(profile: List[List[float]]) -> bool:
    """
    Detect if the profile is a rectangular cuboid based on 4 points forming right angles.

    Args:
        profile: List of 2D points.

    Returns:
        True if it's a cuboid, otherwise False.
    """
    if len(profile) != 4:
        return False

    edge1 = [profile[1][0] - profile[0][0], profile[1][1] - profile[0][1]]
    edge2 = [profile[2][0] - profile[1][0], profile[2][1] - profile[1][1]]
    dot = edge1[0] * edge2[0] + edge1[1] * edge2[1]
    mag1 = math.hypot(*edge1)
    mag2 = math.hypot(*edge2)

    if mag1 == 0 or mag2 == 0:
        print(f"⚠️ Detected degenerate cuboid with zero-length edge: {profile}")
        return False

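    # Clamp to [-1, 1]: rounding can push the cosine just outside acos's domain
    cos_angle = max(-1.0, min(1.0, dot / (mag1 * mag2)))
    angle = math.acos(cos_angle)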
    return abs(math.degrees(angle) - 90) <= 10

def detect_rebar_or_wire(profile: List[List[float]]) -> bool:
    """
    Detect circular, uniform profiles (e.g., rebars or wires) based on radial symmetry.

    Args:
        profile: List of 2D or 3D points.

    Returns:
        True if the profile is approximately circular.
    """
    if len(profile) < 3:
        return False

    centroid = compute_centroid(profile)
    distances = [math.dist(centroid[:2], p[:2]) for p in profile]
    avg = sum(distances) / len(distances)
    return (max(distances) - min(distances)) < 0.1 * avg

def flatten_profile(points: List[List[float]]) -> List[Dict[str, float]]:
    """
    Flatten a list of points into 2.5D XY points with Z=0.

    Args:
        points: List of [x, y] or [x, y, z] points.

    Returns:
        List of dictionaries with x, y, and z=0.
    """
    return [{"x": p[0], "y": p[1], "z": 0.0} for p in points]

def extract_extrusion_length(part_bbox: Dict[str, List[float]], dimensions: List[Dict[str, Any]], min_length: float = 500) -> float | None:
    """
    Infer extrusion length from vertical dimensions if present.

    Args:
        part_bbox: Bounding box of the part.
        dimensions: List of dimension annotations.
        min_length: Minimum expected extrusion length.

    Returns:
        Extrusion length if found, otherwise None.
    """
    if not dimensions:
        return None

    for dim in dimensions:
        if dim.get("type") != "linear":
            continue
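        points = dim.get("points") or []
        if len(points) != 2:
            continue  # skip malformed dimensions instead of failing on unpacking
        pt1, pt2 = points
        if not pt1 or not pt2:
            continue
        # Treat 2D points as lying at z = 0
        z1 = pt1[2] if len(pt1) > 2 else 0.0
        z2 = pt2[2] if len(pt2) > 2 else 0.0
        dz = abs(z1 - z2)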
        dx = abs(pt1[0] - pt2[0])
        dy = abs(pt1[1] - pt2[1])
        if dz > dx and dz > dy and dz > min_length * 0.5:
            return dz
    return None

def compute_parameters_from_bbox(bbox: Dict[str, List[float]]) -> Dict[str, Any]:
    """
    Compute center and dimensions from a bounding box.

    Args:
        bbox: Dictionary with "min" and "max" points.

    Returns:
        Dictionary with center and dimensional parameters.
    """
    min_pt = bbox["min"]
    max_pt = bbox["max"]
    center = [(a + b) / 2 for a, b in zip(min_pt, max_pt)]
    dimensions = [abs(b - a) for a, b in zip(min_pt, max_pt)]
    return {
        "center": {"x": center[0], "y": center[1], "z": center[2]},
        "dimensions": {
            "length": dimensions[0],
            "width": dimensions[1],
            "height": dimensions[2]
        }
    }

def add_bounding_boxes(parts: List[Dict[str, Any]]):
    """
    Append bounding box data to the parts list, including:
    - One for all extrusions
    - One for all revolutions
    - One overall combined box

    Args:
        parts: List of part dictionaries, each with type and geometry info.
    """
    if not parts:
        return

    if any(p["type"] == "extrusion" for p in parts):
        extrusions = [p for p in parts if p["type"] == "extrusion"]
        bbox = compute_combined_bounding_box(extrusions)
        parts.append({
            "type": "bounding_box",
            "label": "channel_bounding_box",
            "boundingBox": bbox,
            "parameters": compute_parameters_from_bbox(bbox),
            "origin": [0, 0, 0]
        })

    if any(p["type"] == "revolution" for p in parts):
        revolutions = [p for p in parts if p["type"] == "revolution"]
        bbox = compute_combined_bounding_box(revolutions)
        parts.append({
            "type": "bounding_box",
            "label": "anchors_bounding_box",
            "boundingBox": bbox,
            "parameters": compute_parameters_from_bbox(bbox),
            "origin": [0, 0, 0]
        })

    if parts:
        overall_bbox = compute_combined_bounding_box(parts)
        parts.append({
            "type": "bounding_box",
            "label": "bounding_box",
            "boundingBox": overall_bbox,
            "parameters": compute_parameters_from_bbox(overall_bbox),
            "origin": [0, 0, 0]
        })

### C. Part Labeling & Metadata Assignment  
Assign part labels and associate text metadata.
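
Labels in this section take the form `<base>_<n>`, where the base comes from metadata, layer or product keywords, or an aspect-ratio fallback, and `n` counts how many parts have already received that base. The sketch below isolates only the counting step; `next_label` is a hypothetical helper for illustration, not a function from this module.

```python
from collections import Counter

def next_label(base: str, counter: Counter) -> str:
    """Increment the per-base counter and return a label such as 'channel_1'."""
    counter[base] += 1
    return f"{base}_{counter[base]}"

counter = Counter()
labels = [next_label(b, counter) for b in ["channel", "anchors", "channel"]]
print(labels)  # ['channel_1', 'anchors_1', 'channel_2']
```

Keeping one counter per interpretation run means labels restart at 1 for each input file.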



In [176]:
def label_part(part: Dict[str, Any], label_counter: Dict[str, int], context: Dict[str, Any]) -> str:
    """Dynamically generate the label for a part based on metadata, product code, layer name, or geometry."""
    shape_type = part.get("type", "").lower()
    dims = part.get("parameters", {}).get("dimensions", {})
    layer = part.get("metadata", {}).get("layer", "").lower() or part.get("reason", "").lower()
    product = context.get("product", "").lower()
    aspect_ratio = (dims.get("length", 1.0) / max(dims.get("width", 1.0), 1e-5)) if dims else 1.0
    label_base = "unknown"
    if "metadata" in part:
        meta = part["metadata"]
        if "type" in meta:
            label_base = meta["type"].lower()
        elif "product_code" in meta:
            label_base = meta["product_code"].lower()
    if label_base == "unknown":
        if shape_type == "revolution":
            if "anchor" in layer or "anchor" in product:
                label_base = "anchors"
            elif aspect_ratio < 1.5:
                label_base = "anchors"
            else:
                label_base = "round_anchor"
        elif shape_type in {"extrusion", "cuboid", "rotated_profile"}:
            if "channel" in layer or "channel" in product:
                label_base = "channel"
            elif aspect_ratio > 5:
                label_base = "flat_channel"
            elif "bracket" in layer:
                label_base = "bracket"
            else:
                label_base = "part"
        else:
            label_base = "part"
    count = label_counter.get(label_base, 0) + 1
    label_counter[label_base] = count
    return f"{label_base}_{count}"

def assign_text_metadata(part, texts):
    """
    Assign metadata to a part based on nearby text labels.

    Looks for product codes (prefixes "HTA"/"HZA") and part-type keywords.
    """
    center = part["parameters"]["center"]
    center_xyz = [center["x"], center["y"], center["z"]]
    for t in texts:
        value = t.get("value", "").strip()
        if "inspect" in value.lower() or "prüfen" in value.lower():
            part["inspect"] = True
        if value.startswith("HTA") or value.startswith("HZA"):
            # Pad 2D text positions with z = 0 so both points have three coordinates
            pos = list(t.get("position", [0, 0, 0]))[:3]
            pos += [0.0] * (3 - len(pos))
            dist = math.dist(center_xyz, pos)
            if dist < 500:
                part.setdefault("metadata", {})["product_code"] = value
            if "adapter" in value.lower():
                part.setdefault("metadata", {})["type"] = "adapter"
            elif "mount" in value.lower():
                part.setdefault("metadata", {})["type"] = "mount"
            elif "bracket" in value.lower():
                part.setdefault("metadata", {})["type"] = "bracket"

### D. Cutout Detection & Assignment  
Detect and assign circular or chain-based cutouts to their corresponding parts (extrusions, cuboids, etc.) based on spatial proximity and polygon containment.
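
The polygon containment test can be illustrated with a small ray-casting (even-odd) sketch. It assumes the polygon is given in the flattened form used in this notebook, a list of dicts with `x` and `y` keys; the function name `inside` and the 10 x 10 square are hypothetical choices for this example.

```python
def inside(point, polygon):
    """Even-odd ray casting: count edge crossings of a ray going in +x from the point."""
    x, y = point
    result = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]["x"], polygon[i]["y"]
        x2, y2 = polygon[(i + 1) % n]["x"], polygon[(i + 1) % n]["y"]
        # Only edges that straddle the horizontal line through the point can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result

square = [
    {"x": 0, "y": 0},
    {"x": 10, "y": 0},
    {"x": 10, "y": 10},
    {"x": 0, "y": 10},
]

print(inside([5, 5], square), inside([15, 5], square))
```

Because the straddle condition guarantees `y1 != y2`, the division is safe without an epsilon term. Points lying exactly on an edge are not handled consistently by this simple form.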


In [177]:
def assign_cutouts(parts: List[Dict], chains: List[Dict], circles: List[Dict], min_radius: float = 2.0):
    """
    Assign cutout geometry (chains or circles) to structural parts (e.g., extrusions).

    A cutout is linked to a part if it's within its profile or bounding box.

    Args:
        parts: List of interpreted structural parts.
        chains: Closed chain geometries from the DXF parser.
        circles: Circular geometries from the DXF parser.
        min_radius: Minimum circle radius to consider as a valid cutout.
    """
    VALID_TYPES = {"extrusion", "rotated_profile", "cuboid"}

    # Assign chain-based cutouts
    for chain in chains:
        if not chain.get("is_cutout"):
            continue
        points = chain.get("points", [])
        if not points:
            continue

        cutout_centroid = compute_centroid(points)
        best_part = None

        # Priority 1: polygon inside profile
        for part in parts:
            if part.get("type") not in VALID_TYPES:
                continue
            if "profile" in part and point_in_polygon(cutout_centroid[:2], part["profile"]):
                best_part = part
                break
            if cutout_within_bbox(points, part["boundingBox"], dynamic_margin=True, base_margin=10):
                best_part = part
                break

        # Priority 2: fallback to nearest part by centroid
        if not best_part:
            min_dist = float('inf')
            for part in parts:
                if part.get("type") not in VALID_TYPES:
                    continue
                pc = part["parameters"].get("center")
                if not pc:
                    continue
                dist = math.dist([pc["x"], pc["y"]], cutout_centroid[:2])
                if dist < min_dist:
                    min_dist = dist
                    best_part = part

        # Assign the cutout
        if best_part:
            best_part.setdefault("cutouts", []).append({
                "type": "chain",
                "points": points,
                "source": "geometry.chain",
                "layer": chain.get("layer", "")
            })

    # Assign circular cutouts
    for circle in circles:
        if not circle.get("is_cutout"):
            continue

        center = circle.get("center", [0, 0])
        radius = circle.get("radius", 0)
        if radius < min_radius:
            continue

        best_part = None

        # Priority 1: center inside profile or bounding box
        for part in parts:
            if part.get("type") not in VALID_TYPES:
                continue
            if "profile" in part and point_in_polygon(center[:2], part["profile"]):
                best_part = part
                break
            if cutout_within_bbox([center], part["boundingBox"], dynamic_margin=True, base_margin=20):
                best_part = part
                break

        # Priority 2: closest centroid fallback
        if not best_part:
            min_dist = float('inf')
            for part in parts:
                if part.get("type") not in VALID_TYPES:
                    continue
                pc = part["parameters"].get("center")
                if not pc:
                    continue
                dist = math.dist([pc["x"], pc["y"]], center[:2])
                if dist < min_dist:
                    min_dist = dist
                    best_part = part

        # Assign the cutout
        if best_part:
            best_part.setdefault("cutouts", []).append({
                "type": "circle",
                "center": center[:2],
                "radius": radius,
                "source": "geometry.circle",
                "layer": circle.get("layer", "")
            })


def cutout_within_bbox(cutout_points: List[List[float]], part_bbox: Dict[str, List[float]],
                       dynamic_margin: bool = True, base_margin: float = 10) -> bool:
    """
    Check whether a cutout is within (or nearly within) a bounding box.

    Args:
        cutout_points: List of 2D points forming the cutout.
        part_bbox: Bounding box of the candidate part.
        dynamic_margin: If True, margin scales with part size.
        base_margin: Minimum margin applied to bbox.

    Returns:
        True if every cutout point lies inside the margin-expanded bbox.
    """
    min_x, min_y = part_bbox["min"][:2]
    max_x, max_y = part_bbox["max"][:2]

    # Expand bounding box by margin
    if dynamic_margin:
        width = max_x - min_x
        margin = max(width * 0.1, base_margin)
    else:
        margin = base_margin
    min_x -= margin
    min_y -= margin
    max_x += margin
    max_y += margin

    # Check if all cutout points lie inside
    all_inside = all(min_x <= p[0] <= max_x and min_y <= p[1] <= max_y for p in cutout_points)
    if all_inside:
        return True

    # Fallback: check cutout bounding box containment
    cutout_x = [p[0] for p in cutout_points]
    cutout_y = [p[1] for p in cutout_points]
    if not cutout_x or not cutout_y:
        return False

    cutout_bbox = {
        "min": [min(cutout_x), min(cutout_y)],
        "max": [max(cutout_x), max(cutout_y)]
    }

    return (
        part_bbox["min"][0] <= cutout_bbox["min"][0] and
        part_bbox["min"][1] <= cutout_bbox["min"][1] and
        part_bbox["max"][0] >= cutout_bbox["max"][0] and
        part_bbox["max"][1] >= cutout_bbox["max"][1]
    )


def point_in_polygon(point: List[float], polygon: List[Dict[str, float]]) -> bool:
    """
    Ray-casting algorithm to test if a point is inside a polygon.

    Args:
        point: [x, y] point to test.
        polygon: List of points with 'x' and 'y' keys (flattened profile).

    Returns:
        True if the point is inside the polygon.
    """
    x, y = point
    inside = False
    polygon_2d = [(p["x"], p["y"]) for p in polygon]
    if not polygon_2d:
        return False  # an empty profile contains nothing
    if polygon_2d[0] != polygon_2d[-1]:
        polygon_2d.append(polygon_2d[0])  # ensure closed loop

    px1, py1 = polygon_2d[0]
    for i in range(1, len(polygon_2d)):
        px2, py2 = polygon_2d[i]
        if ((py1 > y) != (py2 > y)) and (x < (px2 - px1) * (y - py1) / (py2 - py1 + 1e-9) + px1):
            inside = not inside
        px1, py1 = px2, py2
    return inside


### E. Main Interpreter  
Classify profiles (chains) as structural parts based on geometric heuristics (extrusion, revolution, cuboid, etc.).  
Handles fallbacks using rules or BIM bounding boxes if detection fails.
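
The interpreter reads its thresholds from a `rule_config` dictionary inside the `context` argument, falling back to built-in defaults. The sketch below shows how such a configuration might resolve; the default values mirror those read in the code of this section, while `resolve_rules`, `DEFAULT_RULES`, and the product name are hypothetical and only serve the example.

```python
# Defaults mirror the fallback values read by the interpreter in this section.
DEFAULT_RULES = {
    "use_bim_fallback": False,
    "default_extrusion_length": 3000,
    "min_area": 10.0,
    "min_radius": 2.0,
    "aspect_ratio_min": 0.67,
    "aspect_ratio_max": 1.5,
    "allow_fallback_extrusions": False,
}

def resolve_rules(context: dict) -> dict:
    """Merge a context's 'rule_config' over the defaults (hypothetical helper)."""
    rules = {**DEFAULT_RULES, **context.get("rule_config", {})}
    # fallback_min_area falls back to min_area when it is not set explicitly
    rules.setdefault("fallback_min_area", rules["min_area"])
    return rules

context = {
    "product": "channel_demo",  # made-up product name
    "rule_config": {"min_area": 25.0, "allow_fallback_extrusions": True},
}
rules = resolve_rules(context)
print(rules["min_area"], rules["fallback_min_area"], rules["default_extrusion_length"])
```

Overriding only the keys that differ keeps per-project rule files short and leaves the remaining thresholds at their defaults.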


In [178]:
def interpret_geometry(
    geometry: Dict,
    min_radius: float = 2.0,
    include_bounding_boxes: bool = True,
    context: Dict[str, Any] = {}
) -> List[Dict[str, Any]]:
    """
    Interpret raw geometry into structured parts like extrusions, revolutions, etc.

    Args:
        geometry: Parsed DXF geometry.
        min_radius: Minimum radius for valid revolutions.
        include_bounding_boxes: Whether to add bounding boxes for context.
        context: Metadata and rules (e.g., product code, tolerances).

    Returns:
        List of interpreted part dictionaries.
    """
    parts = []
    used_chain_indices = set()
    label_counter = {}
    product_prefix = context.get("product", "").split("_")[0]
    rule_cfg = context.get("rule_config", {})

    # Rule config values
    use_bim_fallback = rule_cfg.get("use_bim_fallback", False)
    default_length = rule_cfg.get("default_extrusion_length", 3000)
    min_area = rule_cfg.get("min_area", 10.0)
    fallback_min_area = rule_cfg.get("fallback_min_area", min_area)
    min_radius = rule_cfg.get("min_radius", min_radius)
    aspect_min = rule_cfg.get("aspect_ratio_min", 0.67)
    aspect_max = rule_cfg.get("aspect_ratio_max", 1.5)
    allow_fallbacks = rule_cfg.get("allow_fallback_extrusions", False)

    chains = geometry.get("chains") or geometry.get("edge_chains", [])
    circles = geometry.get("circles", [])
    dimensions = geometry.get("dimensions", [])
    texts = geometry.get("texts", [])

    def is_chain_acceptable(chain):
        if chain.get("is_cutout"):
            return False
        points = chain.get("points", [])
        if not points or len(set(tuple(p[:2]) for p in points)) < 3:
            return False
        if not chain.get("is_closed") and distance(points[0], points[-1]) > 1.0:
            return False
        return True

    valid_profiles = []
    for idx, chain in enumerate(chains):
        if not is_chain_acceptable(chain):
            continue
        points = chain["points"]

        # Ensure profile is closed
        if points and distance(points[0], points[-1]) > 1e-4:
            points.append(points[0])

        # Heuristic classification
        shape_type = chain.get("type", "").lower().strip()
        if shape_type not in {"extrusion", "revolution"}:
            continue
        if detect_double_parallel(points):
            shape_type = "double_parallel_profile"
        elif detect_cuboid(points):
            shape_type = "cuboid"
        elif detect_rebar_or_wire(points):
            shape_type = "rebar"
        elif detect_rotated_profile(points):
            shape_type = "rotated_profile"

        # Bounding box check
        bbox = {
            "min": [min(p[0] for p in points), min(p[1] for p in points), 0],
            "max": [max(p[0] for p in points), max(p[1] for p in points), 3000]
        }
        dims = compute_parameters_from_bbox(bbox)["dimensions"]
        area_2d = dims["length"] * dims["width"]
        if area_2d < min_area:
            continue

        centroid = compute_centroid(points)

        # Revolution rule filtering
        if shape_type == "revolution":
            radius_est = sum(math.dist(p[:2], centroid[:2]) for p in points) / len(points)
            aspect = dims["length"] / dims["width"] if dims["width"] else 1
            if radius_est < min_radius or not (aspect_min <= aspect <= aspect_max):
                continue

        # Z-axis consistency check (read Z from the raw points; flatten_profile zeroes it)
        z_values = [p[2] if len(p) > 2 else 0.0 for p in points]
        if max(z_values) - min(z_values) > 1e-4:
            continue

        # Create part
        part = {
            "flattened_profile": flatten_profile(points),
            "profile": flatten_profile(points),
            "boundingBox": bbox,
            "parameters": compute_parameters_from_bbox(bbox),
            "origin": [0, 0, 0],
            "reason": f"from edge_chain[{idx}] layer={chain.get('layer', '')}",
            "centroid": centroid,
            "cutouts": [],
            "chain_index": idx,
            "type": shape_type,
            "metadata": {
                "layer": chain.get("layer", ""),
                "source": "edge_chain"
            }
        }
        part["label"] = label_part(part, label_counter, context)

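        # Add extrusion or revolution specifics. Branch on the chain's declared type,
        # because the heuristics above may have relabeled shape_type (e.g. "cuboid").
        declared_type = chain.get("type", "").lower().strip()
        if declared_type == "extrusion":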
            path = find_extrusion_path(chain, geometry.get("features", []), geometry.get("edges", []))
            if path:
                part["path"] = path
                part["length"] = sum(math.dist(path[i], path[i + 1]) for i in range(len(path) - 1))
            else:
                part["length"] = extract_extrusion_length(bbox, dimensions) or default_length
        else:
            arcs = geometry.get("arcs", [])
            part["axis"] = detect_revolution_axis(points, arcs, geometry.get("features", []))
            part["angle"] = estimate_sweep_angle(points, arcs)

        assign_text_metadata(part, texts)
        parts.append(part)
        used_chain_indices.add(idx)
        valid_profiles.append(part)

    # Fallback if no valid profiles
    if not valid_profiles and allow_fallbacks:
        fallback = [
            (idx, chain) for idx, chain in enumerate(chains) if is_chain_acceptable(chain)
        ]
        fallback.sort(key=lambda tup: len(tup[1]["points"]), reverse=True)
        for idx, chain in fallback:
            points = chain["points"]
            bbox = {
                "min": [min(p[0] for p in points), min(p[1] for p in points), 0],
                "max": [max(p[0] for p in points), max(p[1] for p in points), 3000]
            }
            dims = compute_parameters_from_bbox(bbox)["dimensions"]
            area = dims["length"] * dims["width"]
            if area < fallback_min_area:
                continue

            part = {
                "type": "extrusion",
                "flattened_profile": flatten_profile(points),
                "profile": flatten_profile(points),
                "boundingBox": bbox,
                "parameters": compute_parameters_from_bbox(bbox),
                "origin": [0, 0, 0],
                "reason": f"fallback from edge_chain[{idx}]",
                "centroid": compute_centroid(points),
                "cutouts": [],
                "chain_index": idx,
                "metadata": {
                    "layer": chain.get("layer", ""),
                    "source": "fallback_edge_chain"
                }
            }
            part["label"] = label_part(part, label_counter, context)
            part["length"] = extract_extrusion_length(bbox, dimensions) or default_length
            assign_text_metadata(part, texts)
            parts.append(part)
            used_chain_indices.add(idx)
            break

    # BIM fallback
    if not valid_profiles and use_bim_fallback and "bim" in geometry and "bounding_boxes" in geometry["bim"]:
        for idx, box in enumerate(geometry["bim"]["bounding_boxes"]):
            bbox = box.get("geometry", {})
            if "min" not in bbox or "max" not in bbox:
                continue
            part_bbox = {"min": bbox["min"], "max": bbox["max"]}
            part = {
                "type": "bounding_box",
                "boundingBox": part_bbox,
                "parameters": compute_parameters_from_bbox(part_bbox),
                "origin": [0, 0, 0],
                "reason": f"from bim.bounding_boxes[{idx}] layer={box.get('layer', '')}",
                "centroid": compute_centroid([bbox["min"], bbox["max"]]),
                "cutouts": [],
                "metadata": {
                    "layer": box.get("layer", ""),
                    "source": "bim_fallback"
                }
            }
            part["label"] = label_part(part, label_counter, context)
            assign_text_metadata(part, texts)
            parts.append(part)

    assign_cutouts(parts, chains, circles, min_radius)

    for p in parts:
        p.pop("centroid", None)
        p.pop("chain_index", None)

    if include_bounding_boxes:
        add_bounding_boxes(parts)

    return parts



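def interpret_folder(geometry_folder: str, output_folder: str, default_min_radius: float = 2.0) -> None:
    """
    Batch interpret all *_geometry.json files in a folder and write *_parts.json output.

    If the input folder contains a rules_config.json, the rule set for each file is
    looked up by the alphabetic prefix of its file name, falling back to the
    "default" entry.

    Args:
        geometry_folder: Path to the folder containing the *_geometry.json files.
        output_folder: Output directory for interpreted parts (created if missing).
        default_min_radius: Minimum radius for revolution detection, used when the
            matched rule set does not define "min_radius".
    """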
    input_path = Path(geometry_folder)
    output_path = Path(output_folder)
    output_path.mkdir(parents=True, exist_ok=True)

    rules = {}
    rules_path = input_path / "rules_config.json"
    if rules_path.exists():
        try:
            with open(rules_path) as f:
                rules = json.load(f)
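        except (OSError, json.JSONDecodeError) as e:
            # Continue with an empty rule set; per-file lookups then use the defaults
            print(f"Failed to load rules_config.json, using defaults: {e}")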

    total_parts = 0
    label_counter = Counter()
    type_counter = Counter()
    file_part_counts = {}

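    for json_file in sorted(input_path.glob("*_geometry.json")):
        geometry_data = load_geometry(str(json_file))
        # Drop the "_geometry" suffix so the prefix matches rule keys such as "HTA"
        product_code = json_file.stem[: -len("_geometry")]
        product_prefix = ''.join(filter(str.isalpha, product_code)).upper()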
        rule_config = rules.get(product_prefix, rules.get("default", {}))
        min_radius = rule_config.get("min_radius", default_min_radius)
        context = {"product": product_code, "rule_config": rule_config}

        parts = interpret_geometry(
            geometry_data.get("geometry", {}),
            min_radius=min_radius,
            include_bounding_boxes=True,
            context=context
        )

        output_file = output_path / json_file.name.replace("_geometry.json", "_parts.json")
        save_parts(parts, output_file)
        print(f"Saved {len(parts)} parts to {output_file}")

        total_parts += len(parts)
        file_part_counts[product_code] = len(parts)
        label_counter.update(part["label"].rsplit("_", 1)[0] for part in parts)
        type_counter.update(part.get("type", "unknown") for part in parts)

    # Summary
    print("\nSummary Report")
    print("-" * 40)
    print(f"Total parts parsed: {total_parts}")
    print(f"Files processed: {len(file_part_counts)}")
    print("\nParts per file:")
    for file, count in file_part_counts.items():
        print(f"  {file}: {count} parts")
    print("\nParts by label type:")
    for label, count in label_counter.items():
        print(f"  {label}: {count}")
    print("\nParts by shape type:")
    for typ, count in type_counter.items():
        print(f"  {typ}: {count}")
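
The edge-chain fallback inside `interpret_geometry` is easier to reason about in isolation. The following is a minimal sketch, not the module's code: the helper name `pick_fallback_chain` is hypothetical, each chain is assumed to be a dict with a `points` list of `[x, y]` pairs, and `length` and `width` are assumed to be the x and y extents of the bounding box. The real branch also filters chains with `is_chain_acceptable`, which is omitted here.

```python
from typing import Dict, List, Optional, Tuple


def pick_fallback_chain(
    chains: List[Dict], fallback_min_area: float
) -> Optional[Tuple[int, Dict]]:
    """Return (index, chain) of the first large-enough chain, or None (hypothetical helper)."""
    # Chains with the most points are tried first, mirroring the sort in the fallback branch.
    candidates = sorted(
        enumerate(chains), key=lambda tup: len(tup[1]["points"]), reverse=True
    )
    for idx, chain in candidates:
        xs = [p[0] for p in chain["points"]]
        ys = [p[1] for p in chain["points"]]
        # Bounding-box area in the drawing plane (assumed: length * width = x extent * y extent).
        area = (max(xs) - min(xs)) * (max(ys) - min(ys))
        if area >= fallback_min_area:
            return idx, chain
    return None


chains = [
    {"points": [[0, 0], [1, 0], [1, 1]]},                           # 3 points, area 1
    {"points": [[0, 0], [4, 0], [4, 3], [0, 3]]},                   # 4 points, area 12
    {"points": [[0, 0], [0.5, 0], [0.5, 0.5], [0, 0.5], [0, 0]]},   # 5 points, area 0.25
]
picked = pick_fallback_chain(chains, fallback_min_area=5.0)
print(picked[0] if picked else None)  # prints 1
```

The chain with the most points is checked first but rejected for its small area, so the second chain is selected. Only one fallback part is ever created, which matches the `break` at the end of the fallback loop.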

### F. File I/O Utilities  
Load parsed geometry input and save classified parts as JSON files.


In [179]:
def load_geometry(filepath: str) -> Dict:
    """
    Load geometry data from a JSON file.

    Args:
        filepath: Path to the *_geometry.json file.

    Returns:
        Parsed dictionary of geometry content.
    """
    with open(filepath, "r", encoding="utf-8") as f:
        return json.load(f)


def save_parts(parts: List[Dict], filepath: str):
    """
    Save interpreted parts to a JSON file.

    Args:
        parts: List of part dictionaries to save.
        filepath: Destination path for *_parts.json file.
    """
    with open(filepath, "w") as f:
        json.dump(parts, f, indent=2)
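
A quick way to check the two helpers is a round trip through a temporary file. The sketch below repeats the same `json.dump` / `json.load` calls rather than importing the functions, so it runs on its own; the sample part and the file name `HTA40_parts.json` are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample part; real parts carry more fields.
sample_parts = [{"type": "extrusion", "label": "profile_1", "length": 2500}]

with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "HTA40_parts.json"
    # Same call as in save_parts: json.dump with indent=2.
    with open(out_file, "w") as f:
        json.dump(sample_parts, f, indent=2)
    # Same call as in load_geometry: json.load.
    with open(out_file, "r") as f:
        reloaded = json.load(f)

print(reloaded == sample_parts)  # prints True
```

The round trip is exact only for JSON-native types. Tuples, for example, come back as lists, so points stored as tuples will not compare equal after reloading.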


## 3. File Paths  
Define where the interpreter will read parsed geometry files from, and where it will save the final output.


In [180]:
# Set input and output directories for the interpreter
# (an empty Path resolves to the current working directory)
geometry_folder = Path("")
output_folder = Path("")

## 4. Configuration Setup  
This section sets up the rule-based configuration for interpreting geometries.

It:
- Defines parameters for cutout and profile detection.
- Configures thresholds such as minimum area, profile radius, and aspect ratios.
- Supports product-specific overrides and fallback behavior.


In [181]:
# Define config path inside the geometry folder
config_path = Path(geometry_folder) / "rules_config.json"

# Rule-based interpreter configuration for different product types
rules_config = {
    "HTA": {
        "default_extrusion_length": 2500,
        "min_area": 5.0,
        "fallback_min_area": 1.0,
        "min_radius": 2.0,
        "aspect_ratio_min": 0.5,
        "aspect_ratio_max": 2.0,
        "allow_fallback_extrusions": True,
        "use_bim_fallback": True,
        "label_keywords": {
            "anchor": "anchor",
            "plate": "plate",
            "bracket": "bracket"
        }
    },
    "HZA": {
        "default_extrusion_length": 2000,
        "min_area": 15.0,
        "fallback_min_area": 5.0,
        "min_radius": 3.0,
        "aspect_ratio_min": 0.6,
        "aspect_ratio_max": 1.7,
        "allow_fallback_extrusions": True,
        "use_bim_fallback": True,
        "label_keywords": {
            "adapter": "adapter",
            "mount": "mount"
        }
    },
    "default": {
        "default_extrusion_length": 2500,
        "min_area": 10.0,
        "fallback_min_area": 5.0,
        "min_radius": 2.5,
        "aspect_ratio_min": 0.6,
        "aspect_ratio_max": 2.0,
        "allow_fallback_extrusions": False,
        "use_bim_fallback": False,
        "label_keywords": {}
    }
}

# Save the rule config to a file for later use by the interpreter
with open(config_path, "w") as f:
    json.dump(rules_config, f, indent=2)
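
To see which rule set a given file would receive, the lookup can be sketched on its own. This is a minimal sketch, not the module's code: `resolve_rule_config` is a hypothetical helper, the file stems are made-up examples, and it assumes the `_geometry` suffix is removed before the letters of the product code are extracted.

```python
from typing import Any, Dict


def resolve_rule_config(file_stem: str, rules: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Pick the rule set for a file stem such as 'HTA40_geometry' (hypothetical helper)."""
    product_code = file_stem
    if product_code.endswith("_geometry"):
        product_code = product_code[: -len("_geometry")]
    # Keep only the letters, e.g. 'HTA40' -> 'HTA'.
    prefix = "".join(ch for ch in product_code if ch.isalpha()).upper()
    # Whole-block fallback: an unknown prefix gets the 'default' entry, not a per-key merge.
    return rules.get(prefix, rules.get("default", {}))


rules = {
    "HTA": {"min_radius": 2.0},
    "default": {"min_radius": 2.5},
}
print(resolve_rule_config("HTA40_geometry", rules)["min_radius"])  # prints 2.0
print(resolve_rule_config("XYZ12_geometry", rules)["min_radius"])  # prints 2.5
```

As far as this lookup is concerned, a product entry replaces the `default` entry as a whole, so a key left out of a product block is not inherited from `default`. It is safest to list every threshold in each product block.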


## 5. Execute Interpreter  
Run the geometry interpreter on all parsed geometry files in the input folder.



In [None]:
# Run interpreter and generate classified parts
interpret_folder(geometry_folder, output_folder)

## Final Notes

This notebook is part of a **four-step modular pipeline** for extracting and validating BIM-ready geometry from structural engineering drawings.

### Output Location
- Interpreted parts are saved as `*_parts.json` files in the `output_folder` set in **Section 3**.

### How to Run
1. Set your `geometry_folder` and `output_folder` paths in **Section 3**.
2. (Optional) Customize rule-based interpretation in **Section 4** via `rules_config.json`.
3. Run all cells from top to bottom.
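
Before running all cells, it can help to confirm that the input folder actually contains files matching the `*_geometry.json` pattern. The check below is a sketch only: `find_geometry_files` is a hypothetical helper that is not part of the notebook, and the file names in the demo are invented.

```python
import tempfile
from pathlib import Path
from typing import List


def find_geometry_files(folder: str) -> List[Path]:
    """List *_geometry.json files in a folder, sorted by name (hypothetical helper)."""
    path = Path(folder)
    if not path.is_dir():
        raise FileNotFoundError(f"Geometry folder not found: {path}")
    return sorted(path.glob("*_geometry.json"))


# Demo on a temporary folder with two matching files and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("HTA40_geometry.json", "HZA29_geometry.json", "notes.txt"):
        (Path(tmp) / name).write_text("{}")
    found = [p.name for p in find_geometry_files(tmp)]

print(found)  # prints ['HTA40_geometry.json', 'HZA29_geometry.json']
```

An empty result means the interpreter would process no files and report zero parts, which usually points to a wrong `geometry_folder` path.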

### Next Step
- Continue to the next notebook: `[Json Builder]`

### Documentation
For full setup instructions and pipeline details, see the [README.md](https://github.com/ThadaMan/Thesis/blob/main/README.md) in the repository.
