# **1. Background and constraints**

* Every time ETABS saves a model, it writes:

  * A binary `.EDB` file (primary DB), and
  * A text backup `.$et` file with the same basename.
* `.$et` and `.e2k` are both *model text files* ETABS can import/export.
* The text files are organized in **sections** beginning with lines that start with `$` (e.g. `$ FRAME SECTIONS`, `$ LOAD COMBINATIONS`). Within each section, there are usually one-line records such as
  `FRAMESECTION "W5X16" MATERIAL "A36" SHAPE "W5X16"`
* CSI’s Knowledge Base suggests that the meaning of each text field aligns with the “Print Tables” headers (which can be exported as TXT/Excel).

So the basic idea is “parse in chunks per `$ HEADER`, diff per chunk, then summarize.”
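
The chunking step can be sketched in a few lines. This is a minimal illustration, assuming that a header is any line whose first non-blank character is `$`; the function name `split_sections` and the `PREAMBLE` bucket are hypothetical, and real files may need extra handling for comments or continuation lines.

```python
from typing import Dict, List


def split_sections(text: str) -> Dict[str, List[str]]:
    """Group the non-blank lines of a model text file under their `$` header."""
    sections: Dict[str, List[str]] = {}
    current = "PREAMBLE"  # hypothetical bucket for lines before the first header
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("$"):
            # "$ FRAME SECTIONS" -> "FRAME SECTIONS"
            current = line[1:].strip()
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(line)
    return sections


sample = '''
$ FRAME SECTIONS
  FRAMESECTION "W5X16" MATERIAL "A36" SHAPE "W5X16"

$ LOAD COMBINATIONS
'''
sections = split_sections(sample)
print(list(sections))
```

Each value is a plain list of record lines, so a per-section diff can work on one small list at a time.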


# **2. Concept details: what the system actually does**

## 2.1. User-level behavior

For a designer, the “automatic model log” should feel like:

> “Every time I save (or commit) the ETABS model, a log entry appears saying things like:
>
> * *‘Story L14: 26 columns changed from W14x90 → W14x211 (Grids A–D / 1–4).’*
> * *‘New load combo 1.2D+1.0Wx+1.0L created.’*
> * *‘Concrete f’c for C30 changed from 4.0 ksi → 4.5 ksi.’*”

No one ever sees raw `$et` diffs unless they go looking for them.

## 2.2. Core pipeline

For each new version of the model (two successive `.$et` files):

1. **Raw text snapshot**

   * Store/commit the `.$et` file (or `.e2k`, optional).
   * Optional git integration: each saved model version gets a commit or tag.

2. **Text → structured model**

   * Parse `.$et` into a structured in-memory object (or intermediate JSON) by:

     * Splitting into `$`-header sections.
     * Parsing each section’s lines into typed records (e.g. `FrameSection`, `FrameAssign`, `Joint`, `GridLine`, `Story`, `LoadCombo`).

3. **Semantic diff**

   * Compare previous vs current structured models.
   * Produce a **typed change list**, e.g.:

     * `FrameSectionChanged(name="W14X90", field="Area", old=XX, new=YY)`
     * `FrameAssignmentChanged(frame_id=2301, story="L14", old_section="W14X90", new_section="W14X211")`
     * `LoadComboAdded(name="1.4D")`, etc.

4. **Domain-aware aggregation**

   * Group low-level changes into patterns:

     * By story, by grid, by section type, etc.
     * Example: detect that “all columns on Story L14 that *were* W14x90 are now W14x211” instead of a hundred separate frame changes.

5. **LLM summarization & explanation**

   * LLM takes the *aggregated* change data (small JSON, not the whole file) and produces:

     * A concise summary for the log (“what changed?”).
     * Optionally, a richer explanation (“what does this mean structurally?”).
   * This is where statements like “column at grid A-1 changed from XXX to YYY” come from.

6. **Outputs**

   * **Machine-readable diff** (JSON) for tooling:

     * For MCP tools, UI dashboards, further processing.
   * **Human-readable log** (Markdown/HTML/Plain-text):

     * “ModelLog_2025-11-13.md” with sections like:

       * Geometry changes
       * Section/Material changes
       * Load pattern/case/combo changes
       * Analysis/design settings changes


# **3. Anticipated challenges & strategies**

## 3.1. `$et` format quirks

**Challenge:** `.$et` is a backup, not a formally documented API. The order of sections/records might change, and some fields are volatile (timestamps, version numbers, etc.).

**Strategies:**

* Prefer **section-level parsing**:

  * Use `$ HEADER` lines as natural boundaries.
* Within each section:

  * Parse each line into a **dict of fields** (using quoted-string tokenization, known keywords).
  * For diffing, **sort** records by a stable key (e.g., object name, ID), so we’re not sensitive to ETABS reordering.
* Maintain a list of **volatile fields to ignore** in diffs (e.g., program version, date/time, “last analysis date”).
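
A minimal sketch of these strategies follows. It assumes every record has the layout `TYPE "key" KEYWORD value KEYWORD value ...`, which matches the `FRAMESECTION` example above but is not guaranteed for every section. The names `tokenize`, `parse_record`, `normalize_section`, and the contents of `VOLATILE_FIELDS` are hypothetical, and records with fewer than two tokens are not handled.

```python
import re
from typing import Dict, List

# Either a double-quoted string (quotes dropped) or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

# Hypothetical names; the real volatile fields must be found by inspecting files.
VOLATILE_FIELDS = {"DATE", "TIME", "VERSION"}


def tokenize(line: str) -> List[str]:
    """Split a record line, keeping quoted strings such as "Linear Add" whole."""
    return [
        m.group(1) if m.group(1) is not None else m.group(2)
        for m in TOKEN.finditer(line)
    ]


def parse_record(line: str) -> Dict[str, str]:
    """Turn one record line into a dict, skipping volatile fields."""
    tokens = tokenize(line)
    record = {"_type": tokens[0], "_key": tokens[1]}
    rest = tokens[2:]
    # Pair up KEYWORD/value tokens; a trailing unpaired token is dropped.
    for keyword, value in zip(rest[0::2], rest[1::2]):
        if keyword not in VOLATILE_FIELDS:
            record[keyword] = value
    return record


def normalize_section(lines: List[str]) -> List[Dict[str, str]]:
    """Parse every record and sort by (type, key) so file order does not matter."""
    return sorted(
        (parse_record(line) for line in lines),
        key=lambda r: (r["_type"], r["_key"]),
    )
```

Comparing `normalize_section(old_lines)` with `normalize_section(new_lines)` then gives the same result no matter how ETABS ordered the records on disk.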

## 3.2. Stable identity for objects

**Challenge:** ETABS may renumber objects; textual IDs alone might not be stable (e.g., frame ID 2301 today could be 2310 tomorrow if we add/delete elements).

**Strategies:**

* Use **logical keys** wherever possible:

  * Frame/column/beam: often have a “Label” or “Unique Name” that is more stable than an internal ID. When available, that becomes the key.
  * Use (Story, EndJointI, EndJointJ) triple as a fallback identity, where joint coordinates haven’t changed.
* Establish **matching heuristics**:

  * For each “old” frame record, try:

    1. Match by Label/Name.
    2. If missing, match by joint pair (within coordinate tolerance).
  * If no match, treat as removed/added element instead of a modified one.
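
The matching heuristic can be sketched as a two-pass procedure: labels first, then geometry for whatever is left. The `Frame` record, the function names, and the default tolerance are assumptions made for illustration, and end coordinates are assumed to be already resolved from joint IDs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Frame:
    label: Optional[str]
    end_i: Point
    end_j: Point


def _close(a: Point, b: Point, tol: float) -> bool:
    return all(abs(x - y) <= tol for x, y in zip(a, b))


def same_span(a: Frame, b: Frame, tol: float) -> bool:
    """True if both frames connect the same two points, in either direction."""
    forward = _close(a.end_i, b.end_i, tol) and _close(a.end_j, b.end_j, tol)
    reverse = _close(a.end_i, b.end_j, tol) and _close(a.end_j, b.end_i, tol)
    return forward or reverse


def match_frames(old: List[Frame], new: List[Frame], tol: float = 1e-3):
    """Return (matched pairs, removed frames, added frames)."""
    remaining = list(new)
    pairs, leftovers, removed = [], [], []
    by_label = {n.label: n for n in new if n.label is not None}
    # Pass 1: match by label.
    for o in old:
        n = by_label.get(o.label) if o.label is not None else None
        if n is not None and n in remaining:
            pairs.append((o, n))
            remaining.remove(n)
        else:
            leftovers.append(o)
    # Pass 2: match what is left by end points within tolerance.
    for o in leftovers:
        n = next((c for c in remaining if same_span(o, c, tol)), None)
        if n is not None:
            pairs.append((o, n))
            remaining.remove(n)
        else:
            removed.append(o)
    return pairs, removed, remaining
```

Running the label pass over all frames before the geometric pass keeps a renumbered frame from claiming a new frame that has an exact label match elsewhere.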

## 3.3. Numeric noise & insignificant changes

**Challenge:** Small numeric differences (e.g., 12.0000 vs 12.0001) and internal defaults can create huge but meaningless diffs.

**Strategies:**

* Implement tolerance rules per field:

  * Example: coordinate differences < 1e-4 and stiffness modifier differences < 1e-3 → ignored.
* Distinguish:

  * **“Noisy” changes** (ignored or only show under “advanced details”).
  * **Material / geometric changes** (always reported).
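
One possible shape for the tolerance rules is a per-field lookup table. The field names and the helper names below are hypothetical; the thresholds are the example values given above, and fields without an entry fall back to exact comparison.

```python
from typing import Any, Dict, Tuple

# Hypothetical field names; absolute tolerances taken from the example above.
FIELD_TOLERANCES = {"x": 1e-4, "y": 1e-4, "z": 1e-4, "stiffness_modifier": 1e-3}


def is_significant(field: str, old: Any, new: Any) -> bool:
    """Numeric fields with a tolerance ignore differences below it."""
    tol = FIELD_TOLERANCES.get(field)
    numeric = isinstance(old, (int, float)) and isinstance(new, (int, float))
    if tol is not None and numeric:
        return abs(new - old) >= tol
    return old != new


def changed_fields(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each significantly changed field to its (old, new) pair."""
    return {
        f: (old.get(f), new.get(f))
        for f in sorted(old.keys() | new.keys())
        if is_significant(f, old.get(f), new.get(f))
    }


old = {"x": 12.0, "y": 0.0, "section": "W14X90"}
new = {"x": 12.00001, "y": 0.0, "section": "W14X211"}
print(changed_fields(old, new))
```

Differences that fall below the tolerance could still be collected separately if an “advanced details” view is wanted.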

## 3.4. Large, pattern-like changes

**Challenge:** Real-life operations (copying a story, changing a template, swapping all columns for a stronger size) can touch thousands of records.

**Strategies:**

* **Aggregate first, then summarize:**

  * Group changes by (story, section_old, section_new, object_type).
  * Represent as “# of objects changed” rather than individual items.
* Detect high-level operations heuristically:

  * Example: if 100% of beams on Story L15 appear as new, with the same pattern as the previous story L14 → summarize as “Story L15 added by copying L14 (N beams, M columns, etc.)”.
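
The grouping step can be sketched with a dictionary keyed on the tuple named above. `SectionChange` and `aggregate` are hypothetical names, and the sketch covers section swaps only, not story-copy detection.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class SectionChange(NamedTuple):
    frame: str
    story: str
    object_type: str
    old_section: str
    new_section: str


def aggregate(changes: List[SectionChange], max_examples: int = 3) -> List[Dict]:
    """Collapse per-frame changes into one cluster per (story, type, old, new)."""
    groups = defaultdict(list)
    for c in changes:
        groups[(c.story, c.object_type, c.old_section, c.new_section)].append(c.frame)
    clusters = [
        {
            "story": story,
            "object_type": object_type,
            "old_section": old,
            "new_section": new,
            "count": len(frames),
            "examples": frames[:max_examples],  # a few representatives only
        }
        for (story, object_type, old, new), frames in groups.items()
    ]
    # Largest clusters first, so the summary leads with the biggest change.
    return sorted(clusters, key=lambda c: c["count"], reverse=True)


changes = [SectionChange(f"C{i}", "L14", "column", "W14X90", "W14X211") for i in range(26)]
changes.append(SectionChange("B7", "L14", "beam", "W18X35", "W18X40"))
clusters = aggregate(changes)
```

Here 27 raw changes collapse into two clusters, one with `count` 26 and one with `count` 1.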

## 3.5. Story & grid-based language (“column on grid A-1”)

**Challenge:** `.$et` encodes coordinates, grid lines, and story elevations, but designers think in grids/stories.

**Strategies:**

* Parse these tables/sections:

  * **Story data** (Story name, elevation).
  * **Grid lines** and their coordinates. ([CSI Knowledge Base][5])
  * **Joint coordinates** and connectivity.
* For each frame object:

  * Lookup its end joints.
  * Map joint coordinates to nearest grid line in X and Y (within tolerance) → grid labels.
  * Map Z to story elevation → story name.
* Attach human-friendly tags:

  * e.g. `location = {"story": "L14", "grid_x": "A", "grid_y": "1"}`.
* In LLM prompt, instruct: “When available, refer to frames as columns/beams at grid A-1 on Story L14.”
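
The coordinate-to-label mapping can be sketched as a nearest-neighbor lookup with a tolerance. The function names, the default tolerance, and the use of plain `{label: coordinate}` dicts are assumptions; the sketch also assumes orthogonal grids aligned with the global X and Y axes.

```python
from typing import Dict, Optional, Tuple


def nearest_label(value: float, lines: Dict[str, float], tol: float) -> Optional[str]:
    """Return the label whose coordinate is closest to `value`, if within `tol`."""
    if not lines:
        return None
    label, coord = min(lines.items(), key=lambda item: abs(item[1] - value))
    return label if abs(coord - value) <= tol else None


def locate(
    point: Tuple[float, float, float],
    grids_x: Dict[str, float],
    grids_y: Dict[str, float],
    stories: Dict[str, float],
    tol: float = 0.01,
) -> Dict[str, Optional[str]]:
    """Map an (x, y, z) point to story and grid labels; None means off-grid."""
    x, y, z = point
    return {
        "story": nearest_label(z, stories, tol),
        "grid_x": nearest_label(x, grids_x, tol),
        "grid_y": nearest_label(y, grids_y, tol),
    }


grids_x = {"A": 0.0, "B": 6.0}
grids_y = {"1": 0.0, "2": 8.0}
stories = {"L13": 42.0, "L14": 45.5}
location = locate((6.0004, 0.0, 45.5), grids_x, grids_y, stories)
```

A joint that is not on any grid line gets `None` for that axis, which lets the summary fall back to raw coordinates instead of naming the wrong grid.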

## 3.6. Performance and scaling

**Challenge:** Big models → big text files (tens of MB). LLM context is limited and expensive.

**Strategies:**

* Keep LLM out of raw text:

  * All raw parsing and diffing is deterministic code.
  * LLM only sees compact JSON summaries (aggregations, counts, a handful of representative examples).
* Use streaming / incremental diff:

  * Only compute diff vs **previous version**, not vs entire history.
  * Optionally support “compare version X vs Y” with re-run of the same diff pipeline.


# **4. Data model: what to parse and how**

## 4.1. Focus for v1

Start with sections that have direct design meaning:

* `$ PROGRAM CONTROL` / meta (mostly ignored except version, analysis options).
* `$ MATERIAL PROPERTIES`
* `$ FRAME SECTIONS` (steel, concrete, etc.)
* `$ AREA SECTIONS` (slabs, walls).
* `$ POINT OBJECTS` / `$ JOINT COORDINATES`.
* `$ FRAME OBJECTS` / frame connectivity.
* `$ STORY DATA`.
* `$ GRID LINES`.
* `$ LOAD PATTERNS`, `$ LOAD CASES`, `$ LOAD COMBINATIONS`.
* `$ FRAME ASSIGNS` (sections, releases, modifiers, auto-mesh).
* (Optionally later) `$ AREA ASSIGNS`, diaphragm assignments, mass source, design preferences/overwrites.

For each section, define:

* **Record type** (e.g., `FrameSection`, `FrameAssignFrame`, `LoadCombo`).
* **Key fields** (for matching: name, label, ID).
* **Important attributes** vs **low-priority** ones.

## 4.2. Internal representation

Use a structured model, e.g. Python dataclasses or pydantic models:


```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FrameSection:
    name: str
    material: str
    shape: str
    # Optional numeric properties (area, Ix, Iy, etc.) if present

@dataclass
class FrameObject:
    name: str  # ETABS object label
    story: str
    joint_i: str
    joint_j: str
    section: str
    # Computed later by the location-tagging pass
    grid_x: Optional[str] = None
    grid_y: Optional[str] = None

@dataclass
class LoadCombo:
    name: str
    design_type: Optional[str] = None
    # Combination of patterns/cases with scale factors
    terms: List["LoadComboTerm"] = field(default_factory=list)
```


…and so on.

The **structured model** object then has collections:


```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class EtabsModel:
    program_info: 'ProgramInfo'
    stories: Dict[str, 'Story']
    grids: List['GridLine']
    joints: Dict[str, 'Joint']
    frames: Dict[str, 'FrameObject']
    frame_sections: Dict[str, 'FrameSection']
    load_combos: Dict[str, 'LoadCombo']
    # etc.
```




# **5. Diff engine design**

## 5.1. General pattern

For each collection (by type):

* Compute:

  * **Added** keys: present in new, not in old.
  * **Removed** keys: present in old, not in new.
  * **Possibly modified** keys: present in both → field-by-field comparison with tolerances.
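
The set-based comparison can be sketched generically, independent of the record type. The name `diff_collection` and the `differs` parameter are hypothetical; the default comparison is exact inequality, and a tolerance-aware comparison would be passed in its place.

```python
import operator
from typing import Callable, Dict, List, Tuple, TypeVar

T = TypeVar("T")


def diff_collection(
    old: Dict[str, T],
    new: Dict[str, T],
    differs: Callable[[T, T], bool] = operator.ne,
) -> Tuple[List[str], List[str], List[str]]:
    """Return sorted (added, removed, modified) keys for one collection."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # Keys present in both versions are modified only if `differs` says so.
    modified = sorted(k for k in old.keys() & new.keys() if differs(old[k], new[k]))
    return added, removed, modified


old_combos = {"1.4D": {"D": 1.4}, "1.2D+1.6L": {"D": 1.2, "L": 1.6}, "ENV": {}}
new_combos = {
    "1.4D": {"D": 1.4},
    "1.2D+1.6L": {"D": 1.2, "L": 1.0},
    "1.2D+1.0Wx+1.0L": {"D": 1.2, "Wx": 1.0, "L": 1.0},
}
added, removed, modified = diff_collection(old_combos, new_combos)
```

Sorting the key lists keeps the output deterministic, which makes the diff itself easy to test and to store.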

Produce **typed change objects**, e.g.:


```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class FrameSectionChanged:
    name: str
    changed_fields: Dict[str, 'FieldChange']

@dataclass
class FrameAssignmentChanged:
    frame_name: str
    story: str
    location: 'LocationInfo'
    old_section: str
    new_section: str
```


## 5.2. Aggregation pass

On top of the raw change list:

* Cluster changes by patterns, e.g.:


```python
from dataclasses import dataclass
from typing import Literal, Optional, List

@dataclass
class SectionSwapCluster:
    object_type: Literal["column", "beam", "brace", "frame"]
    story: Optional[str]  # e.g. "L14"
    grid_region: Optional['GridRegion']  # e.g. {"grid_x": ["A","B"], "grid_y": ["1","2"]}
    old_section: str
    new_section: str
    count: int
    example_objects: List[str]
```


* Similar clusters for:

  * Added/removed frames by region.
  * Material property changes.
  * Load combination changes (new combos, changed factors, etc.).

## 5.3. Change categories

Tag each cluster by high-level **category**:

* `"geometry"`: new/deleted frames, changes in joint coordinates, story heights.
* `"member_properties"`: section swaps, modifiers, material changes.
* `"loads"`: load patterns, cases, combos, assignments.
* `"analysis_settings"`: mass source, P-Δ, nonlinear options.
* `"design_settings"`: design codes, phi factors, overwrite changes.

These tags feed directly into LLM prompts (“group summaries by category”).
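
As a rough sketch of how the tags could shape the LLM input, clusters can be bucketed by category before serialization. The `change_type` values and the mapping below are hypothetical placeholders; only the five category names come from the list above.

```python
import json
from typing import Dict, List

# Hypothetical change types mapped onto the categories listed above.
CATEGORY_BY_CHANGE_TYPE = {
    "frame_added": "geometry",
    "frame_removed": "geometry",
    "section_swap": "member_properties",
    "material_changed": "member_properties",
    "load_combo_added": "loads",
    "mass_source_changed": "analysis_settings",
    "design_code_changed": "design_settings",
}


def group_by_category(clusters: List[Dict]) -> Dict[str, List[Dict]]:
    """Bucket clusters by category; unknown change types stay visible."""
    grouped: Dict[str, List[Dict]] = {}
    for cluster in clusters:
        category = CATEGORY_BY_CHANGE_TYPE.get(cluster["change_type"], "uncategorized")
        grouped.setdefault(category, []).append(cluster)
    return grouped


clusters = [
    {"change_type": "section_swap", "story": "L14", "count": 26},
    {"change_type": "load_combo_added", "name": "1.2D+1.0Wx+1.0L"},
    {"change_type": "frame_added", "story": "L15", "count": 40},
]
grouped = group_by_category(clusters)
payload = json.dumps(grouped, indent=2)  # compact text handed to the LLM
```

Keeping an explicit `"uncategorized"` bucket means a change type that was missed in the mapping still shows up in the log instead of being dropped.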


# **6. LLM + MCP integration**

## 6.1. Role of the LLM

LLM should **not** parse `$et` or raw diffs. Its jobs:

1. **Semantic compression**:

   * Input: JSON structure describing aggregated changes (clusters).
   * Output: succinct Markdown/HTML summary:

     * Bullet points per category.
     * Use human-friendly terminology (“columns on Story L14”, “grid A-1”, “gravity load combo”, etc.).

2. **Explanation** (optional mode):

   * Given a change cluster, explain structural implications:

     * “Increasing columns from W14x90 to W14x211 on L14 increases axial and flexural capacity; likely a response to higher demand or design criteria.”

3. **Q&A over diffs** (via MCP tools):

   * Tools to let the LLM answer follow-up queries:

     * “Which stories had section changes?”
     * “Which load combos were modified between v12 and v13?”

## 6.2. MCP server sketch

Implement an MCP server exposing tools like:

* `list_model_versions(model_path)`
  → returns known `$et` snapshots with timestamps and tags.

* `get_diff(old_version, new_version)`
  → returns:


```json
{
  "summary": "... short text ...",
  "categories": [...],
  "clusters": [...],
  "raw_changes": {...}
}
```

* `get_change_details(diff_id, cluster_id)`
  → returns detailed list of affected objects and raw field changes.

* `search_changes(filter)`
  Example filter:


```json
{
  "story": "L14",
  "object_type": "column",
  "change_type": "section_swap"
}
```

With this, a chat-based LLM agent can call tools to fetch diffs and then narrate them back to the user.
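
The logic behind `search_changes` can be sketched in plain Python, independent of any particular MCP SDK. The sketch assumes clusters are flat dicts and that a filter matches when every key/value pair in it equals the corresponding cluster field; both are assumptions, as is the parameter name `criteria`.

```python
from typing import Any, Dict, List


def search_changes(clusters: List[Dict[str, Any]], criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep clusters whose fields equal every key/value pair in `criteria`."""
    return [
        cluster
        for cluster in clusters
        if all(cluster.get(key) == value for key, value in criteria.items())
    ]


clusters = [
    {"story": "L14", "object_type": "column", "change_type": "section_swap", "count": 26},
    {"story": "L14", "object_type": "beam", "change_type": "section_swap", "count": 1},
    {"story": "L15", "object_type": "column", "change_type": "frame_added", "count": 40},
]
hits = search_changes(
    clusters,
    {"story": "L14", "object_type": "column", "change_type": "section_swap"},
)
```

An empty `criteria` dict matches every cluster, which gives the agent a simple way to list everything in a diff.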


# **7. Project spec: roles, phases, deliverables**

## 7.1. Roles

* **Engineers / Developers**

  * Bring deep ETABS understanding; generate test models and validate semantics.
  * Implement parsers, diff logic, aggregation, and tests.
  * Decide thresholds, classification rules, and naming conventions (“what counts as a column vs beam”).

* **Coding agent (LLM)**

  * Assists in:

    * Writing & refactoring parsing/diff/aggregation code.
    * Writing unit tests.
    * Drafting prompts for summarization.
    * Generating documentation, CLI, and example notebooks.
  * Later, serves as the summarization engine via MCP/API.

## 7.2. Phase 0 – Recon & sample collection

**Goals:**

* Create a small but diverse set of `$et` (and optionally `.e2k`) files:

  * Simple 2D frame.
  * Multi-story steel frame with grids.
  * RC building with walls & slabs.
* Create **controlled edits** between versions:

  * Change a column size on one story.
  * Change all columns on a story.
  * Add/delete frames.
  * Add/change load combos.
* Manually inspect text to identify:

  * Section names (`$ FRAME SECTIONS`, `$ JOINT COORDINATES`, `$ LOAD COMBINATIONS`, etc.).
  * Field patterns and stable keys.

**Deliverable:**
Short internal note describing observed patterns, section naming, and mapping from fields to ETABS tables (leveraging CSI docs & Print Tables outputs).

## 7.3. Phase 1 – `$et` parser library

**Tasks:**

* Implement a small **Python library** (e.g. `etabs_text_model`) with:

  * `parse_et_file(path: str) -> EtabsModel`
  * `serialize_model(model: EtabsModel) -> dict` (JSON-compatible)
* Write section-specific parsers:

  * For v1: program info, stories, grids, joints, frames, frame sections, load combos, basic assignments.
* Unit tests using hand-made fixtures and real `$et` snippets.
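
If the model is built from dataclasses as in section 4.2, `serialize_model` can lean on `dataclasses.asdict`, which recurses into nested dataclasses, dicts, and lists. The sketch below uses a cut-down `EtabsModel` with a single collection purely for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class FrameSection:
    name: str
    material: str
    shape: str


@dataclass
class EtabsModel:
    # Cut-down stand-in for the full model; only one collection is shown.
    frame_sections: Dict[str, FrameSection] = field(default_factory=dict)


def serialize_model(model: EtabsModel) -> dict:
    """Convert the model into plain dicts/lists that json.dumps accepts."""
    return asdict(model)


model = EtabsModel(frame_sections={"W5X16": FrameSection("W5X16", "A36", "W5X16")})
data = serialize_model(model)
text = json.dumps(data, indent=2, sort_keys=True)
```

Using `sort_keys=True` when writing snapshots keeps the JSON stable between runs, so the stored files can themselves be compared line by line.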

**Deliverables:**

* Library code + docs for the parser.
* Test suite demonstrating correct parsing for sample files.

## 7.4. Phase 2 – Semantic diff engine

**Tasks:**

* Implement a module `etabs_model_diff` with:

  * `diff_models(old: EtabsModel, new: EtabsModel) -> DiffResult`
* Implement:

  * Per-collection set-based comparison.
  * Field-by-field comparison with tolerance.
  * Typed change objects + categories.
* Implement aggregation logic to generate clusters:

  * Section swap clusters.
  * Added/removed object clusters.
  * Load combo changes.

**Deliverables:**

* `DiffResult` JSON schema.
* Tests:

  * “Single column size change on L14 → one SectionSwapCluster with count=1.”
  * “All columns on L14 changed → one cluster with count=N.”
  * “New load combo → LoadComboAdded.”

## 7.5. Phase 3 – Location tagging (story + grid)

**Tasks:**

* Parse story and grid definitions; build:

  * `story_by_elevation` map.
  * `grid_lines_x`, `grid_lines_y` sets with coordinate positions.
* For each joint:

  * Attach nearest `grid_x`, `grid_y`, `story`.
* For each frame:

  * Derive a “primary location” from its joints:

    * Column: vertical element spanning between stories (same X/Y; Z changes).
    * Beam: horizontal element at a story (Z roughly constant).
  * Tag object type: column / beam / brace (heuristic based on orientation and assigned section type, optional).
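
The orientation part of the heuristic can be sketched from the two end points alone. The function name and the default tolerance are assumptions, and the assigned section type is not consulted here; a degenerate zero-length member falls back to the generic `"frame"` tag.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def classify_frame(end_i: Point, end_j: Point, tol: float = 1e-3) -> str:
    """Classify a frame as column, beam, or brace from its orientation."""
    dx, dy, dz = (b - a for a, b in zip(end_i, end_j))
    plan_length = math.hypot(dx, dy)  # length of the projection on the XY plane
    rise = abs(dz)
    if plan_length <= tol and rise > tol:
        return "column"  # same X/Y, Z changes
    if rise <= tol and plan_length > tol:
        return "beam"  # Z roughly constant
    if plan_length > tol and rise > tol:
        return "brace"  # inclined member
    return "frame"  # zero-length or otherwise undetermined


print(classify_frame((0.0, 0.0, 42.0), (0.0, 0.0, 45.5)))
```

Engineers may want a looser rule for sloped beams or raked columns, so the tolerance and the brace branch are the parts most likely to need project-specific tuning.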

**Deliverables:**

* Functions to return location descriptors for any frame/joint.
* Tests to ensure correct mapping for sample grids.

## 7.6. Phase 4 – Summarization API + MCP server

**Tasks:**

* Define a compact JSON schema for `SummarizationInput` (aggregated clusters).
* Write prompts for LLM:

  * System prompt: domain context (“You are an ETABS model change explainer…”).
  * Example-based instructions (few-shot examples for section swaps, new stories, changed combos).
* Implement MCP server:

  * Expose `list_model_versions`, `get_diff`, `get_change_details`.
  * Use environment-variable configurable paths (project folder with `$et` snapshots).

**Deliverables:**

* Running MCP server with basic CLI frontend:

  * `etabs-log diff v10 v11` → prints Markdown summary.
* Sample conversation transcripts showing human designer queries answered via MCP tools.

## 7.7. Phase 5 – Integration & UX

**Potential integrations:**

* **Git hook**:

  * Pre-commit or post-commit hook that runs:

    * `parse_et_file` on both versions, then `diff_models` against the previous commit, then writes a `MODEL_LOG.md` entry.
* **VS Code / IDE extension (future)**:

  * Panel showing change summaries per commit.
* **Internal HTML dashboard**:

  * Displays model history, with filters by story, object type, etc.


# **8. Non-goals / later ideas**

To keep v1 sane:

* Do **not** try to cover *all* ETABS tables (foundations, staged construction, pushover hinges, etc.) initially.
* Do **not** attempt to reverse-engineer every field from CSI’s internal data dictionary; stick to the tables we actually care about for design review.
* Do **not** let the LLM mutate the model (no editing or `.$et` generation in v1) – read-only is safer.

Later expansions:

* Add coverage for:

  * Shell elements (walls, slabs) and their thickness / reinforcement changes.
  * Nonlinear hinge properties, spring links.
* Use ETABS API directly to supplement text parsing (e.g. to confirm object IDs, geometry, and results).


# References

- https://docs.csiamerica.com/help-files/etabs/Menus/File/Saving_Models.htm
- https://www.csiamerica.com/software/ETABS/21/ReleaseNotesETABSv2100.pdf
- https://www.eng-tips.com/threads/changing-material-of-line-members.236802/
- https://wiki.csiamerica.com/display/kb/Reporting%2BFAQ
- https://web.wiki.csiamerica.com/wiki/pages/viewpage.action?pageId=2168039&u
- https://www.engineeringskills.com/posts/an-introduction-to-the-etabs-python-api