Shapeのフローを追うために接続先/元を追加 #15

moonmile · 2025-12-20T09:11:14Z

フローチャートの矢印がうまく追えていなかったので、Shape の構造を修正しました。

name: str | null // Excel shape Name プロパティ
begin_connected_shape: str | null // コネクタ始点の接続先 Shape.Name
end_connected_shape: str | null // コネクタ終点の接続先 Shape.Name

を追加しています。

こんな感じに、フローチャートの矢印が mermaid 上でも繋がるようになります。

Summary by CodeRabbit

New Features
- Shapes now include a nullable sequential id and expose begin_id/end_id to link connector endpoints by id.
Documentation
- Added sample diagram and updated data model and changelog entries describing id-based connector references.
Tests
- Expanded tests to validate shape ids, connector begin_id/end_id wiring, and updated fixtures to include ids.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-20T09:11:29Z

Walkthrough

Adds per-sheet sequential shape identifiers and connector endpoint references (id, begin_id, end_id); extracts connector-linked shape names during parsing, maps names to ids, and emits begin_id/end_id on Shape objects. Schemas, docs, samples, and tests updated to reflect id-based connector wiring and version bumps.

Changes

Cohort / File(s)	Summary
Documentation & Changelog `docs/agents/DATA_MODEL.md`	Bumped versions (0.9 → 0.10 → 0.11) and documented new Shape fields `id`, `begin_id`, `end_id`; explained connector endpoint semantics and nullability.
Schemas `schemas/shape.json`, `schemas/sheet.json`, `schemas/workbook.json`	Added nullable integer properties `id`, `begin_id`, `end_id` to Shape schemas (default null) with titles/descriptions for connector mapping.
Core implementation `src/exstruct/core/shapes.py`	Collects excel shape names, assigns sequential per-sheet `id` for non-relationship shapes, captures connector Begin/End names, accumulates pending connections, then resolves and writes `begin_id`/`end_id` using name→id map.
Public models `src/exstruct/models/__init__.py`	Exposed new Shape attributes: `id`, `begin_id`, `end_id` on the public model.
Samples & Documentation example `sample/sample-shape-connector.json`, `sample/sample-shape-connector.md`	Added sample workbook JSON with shapes/connectors and a Mermaid diagram markdown describing nodes and directed connections.
Tests `tests/com/test_shapes_extraction.py`, `tests/test_engine.py`, `tests/test_export_requirements.py`, `tests/test_models_validation.py`, `tests/test_print_area_views.py`	Updated tests and fixtures to construct Shapes with `id`; added test coverage for connector wiring (begin_id/end_id), uniqueness/existence assertions for ids, and resilient behavior when ConnectorFormat properties are absent.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Review mapping/resolution logic in src/exstruct/core/shapes.py for edge cases (duplicate names, missing names, ordering of assignment).
Verify consistency between model fields (src/exstruct/models/__init__.py) and all updated JSON schema files.
Check tests that assert id uniqueness and connector wiring for flakiness on platforms/Excel variations.

Possibly related issues

フローチャートの Shape がうまく追えていない #16 — Adds id/begin_id/end_id fields and connector wiring logic matching the connector extraction and endpoint mapping described in that issue.

Poem

🐰 I hopped through sheets and gave each node a name,

I counted every shape and numbered them the same,
Connectors now point to ids, not just to text,
I stitched begin to end so diagrams link next,
Hop, skip, a tiny dance — the graph is less perplexed. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately describes the main change: adding connector source/destination fields to Shape model to enable flow tracking in flowcharts.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

src/exstruct/core/shapes.py (1)

221-242: Good defensive error handling for connector extraction.

The nested try/except structure correctly ensures:

Connector shapes that don't have ConnectorFormat are gracefully skipped
Partial extraction succeeds if only one end is connected

One minor readability nit: the variable name on lines 227 and 235 shadows the outer shape_name. Consider renaming to begin_name / end_name to avoid potential confusion.

🔎 Optional: Rename inner variables to avoid shadowing

                             try:
                                 begin_shape = connector.BeginConnectedShape
                                 if begin_shape is not None:
-                                    name = getattr(begin_shape, "Name", None)
-                                    if isinstance(name, str):
-                                        shape_obj.begin_connected_shape = name
+                                    begin_name = getattr(begin_shape, "Name", None)
+                                    if isinstance(begin_name, str):
+                                        shape_obj.begin_connected_shape = begin_name
                             except Exception:
                                 pass
                             try:
                                 end_shape = connector.EndConnectedShape
                                 if end_shape is not None:
-                                    name = getattr(end_shape, "Name", None)
-                                    if isinstance(name, str):
-                                        shape_obj.end_connected_shape = name
+                                    end_name = getattr(end_shape, "Name", None)
+                                    if isinstance(end_name, str):
+                                        shape_obj.end_connected_shape = end_name
                             except Exception:
                                 pass

tests/com/test_shapes_extraction.py (2)

48-54: Consider narrowing the exception handler scope.

The broad except Exception catch might mask unexpected errors during connector wiring. If there are known specific exceptions that can occur in some environments (e.g., COMError), consider catching those explicitly instead.

🔎 Example of more targeted exception handling

        try:
            connector.ConnectorFormat.BeginConnect(src_shape, 1)
            connector.ConnectorFormat.EndConnect(dst_shape, 1)
-       except Exception:
+       except (AttributeError, RuntimeError) as e:
            # In some environments connector wiring may fail; tests will
            # simply not find connected shapes in that case.
+           # Known issue: {type(e).__name__}
            pass

Note: Adjust exception types based on actual failures observed in different environments.

82-94: Consider extracting the filtering predicate for clarity.

The nested boolean logic spanning 11 lines makes the assertion harder to parse. Extracting it into a named helper function or variable would improve readability.

🔎 Example refactoring for improved readability

+   def is_non_relationship_autoshape(s):
+       has_no_text = s.text == "" or s.text is None
+       is_autoshape = (s.type or "").startswith("AutoShape")
+       has_relationship = (
+           s.direction
+           or s.begin_arrow_style is not None
+           or s.end_arrow_style is not None
+           or s.begin_connected_shape is not None
+           or s.end_connected_shape is not None
+       )
+       return has_no_text and is_autoshape and not has_relationship
+
    # Standard mode should not emit non-relationship AutoShapes without text.
-   assert not any(
-       (s.text == "" or s.text is None)
-       and (s.type or "").startswith("AutoShape")
-       and not (
-           s.direction
-           or s.begin_arrow_style is not None
-           or s.end_arrow_style is not None
-           or s.begin_connected_shape is not None
-           or s.end_connected_shape is not None
-       )
-       for s in shapes
-   )
+   assert not any(is_non_relationship_autoshape(s) for s in shapes)

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0f1fe9 and 6b8e594.

⛔ Files ignored due to path filters (1)

sample/sample-shape-connector.xlsx is excluded by !**/*.xlsx

📒 Files selected for processing (7)

docs/agents/DATA_MODEL.md (4 hunks)
sample/sample-shape-connector.json (1 hunks)
sample/sample-shape-connector.md (1 hunks)
schemas/shape.json (2 hunks)
src/exstruct/core/shapes.py (2 hunks)
src/exstruct/models/__init__.py (2 hunks)
tests/com/test_shapes_extraction.py (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/com/test_shapes_extraction.py (1)

src/exstruct/core/integrate.py (1)

extract_workbook (290-388)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Codacy Static Code Analysis

🔇 Additional comments (9)

sample/sample-shape-connector.md (1)

1-45: LGTM! Good sample documentation for connector flowcharts.

The Mermaid diagram clearly demonstrates the new connector functionality with proper node definitions and flow connections. The loop section effectively shows cyclic relationships.

Minor: Line 45 has trailing whitespace after the closing code fence.

src/exstruct/models/__init__.py (1)

15-17: LGTM! Well-structured model extensions.

The new optional fields are properly defined with:

Correct str | None type annotations

Appropriate default=None values

Clear descriptions referencing the Excel ConnectorFormat API

Field placement is logical, keeping connector-related fields together.

Also applies to: 32-43

sample/sample-shape-connector.json (1)

1-313: LGTM! Comprehensive sample data for connector shapes.

The sample effectively demonstrates:

Shape names following Excel conventions

Connector relationships via begin_connected_shape / end_connected_shape

Various connector types (straight arrows, curved connectors)

Direction and arrow style metadata

The data aligns well with the Mermaid diagram in sample-shape-connector.md.

src/exstruct/core/shapes.py (1)

173-173: LGTM!

The isinstance(shape_name, str) check properly handles cases where getattr returns None or non-string values.

docs/agents/DATA_MODEL.md (1)

3-3: LGTM! Documentation properly updated.

The data model documentation is correctly updated with:

Version bump to 0.9

All three new Shape fields documented with proper types

Clear explanatory notes linking to Excel ConnectorFormat API

Appropriate changelog entry

The documentation maintains consistency with the code implementation.

Also applies to: 21-21, 30-31, 39-40, 194-194

tests/com/test_shapes_extraction.py (2)

76-76: LGTM!

The name field assertions correctly verify that Excel shape names are being captured.

Also applies to: 81-81

113-138: Well-structured connector validation test.

The test properly handles environment variability with pytest.skip and validates that connector metadata is correctly extracted and references valid shape names.

schemas/shape.json (2)

5-30: LGTM! Connector endpoint properties are well-defined.

The schema correctly defines begin_connected_shape and end_connected_shape as optional string fields with appropriate descriptions referencing the Excel COM properties.

121-133: LGTM! Shape name property is correctly defined.

The schema properly adds the name field as an optional string property with a clear description.

harumiWeb · 2025-12-20T12:30:37Z

docs/agents/DATA_MODEL.md

 ```jsonc
 Shape {
  text: str
+  name: str | null // Excel shape Name プロパティ


id: int にしてコンテキスト量を削減できそうです。

harumiWeb · 2025-12-20T12:31:52Z

docs/agents/DATA_MODEL.md

  rotation: float | null
  begin_arrow_style: int | null
  end_arrow_style: int | null
+  begin_connected_shape: str | null // コネクタ始点の接続先 Shape.Name


LLMが推論できそうならbegin_idくらい削りたいです！

harumiWeb

PRありがとうございます！
矢印の前後関係を明示的に取得する方向性自体は、フローチャート認識精度の観点で非常に良いと思います。
実際にLLMに推論させた結果までて提示いただいて非常に助かります！

一方で、本ライブラリは RAG / LLM 投入前提の構造化データ生成を目的としているため、
出力コンテキスト量（トークン数）を強く意識しています。

現状の実装では、

矢印オブジェクト側
接続先図形側

の双方に name を持たせており、情報が重複しているように見えます。

以下のいずれかの形に整理できないか、ご検討いただけますでしょうか。

シートごとに図形に一意の整数IDを振り、name は冗長に持たない
もしくは name はオプション扱い（フラグ指定時のみ出力）

「フローチャート構造の推定に最低限必要な情報は何か」
という観点で、少し情報量を削減した形に調整いただけると助かります。

moonmile · 2025-12-21T04:03:15Z

シートごとに図形に一意の整数IDを振り、name は冗長に持たない

もしくは name はオプション扱い（フラグ指定時のみ出力）

「フローチャート構造の推定に最低限必要な情報は何か」という観点で、少し情報量を削減した形に調整いただけると助かります。

なるほど、了解です。
現状 max で設定を入れてしまったので、必要な分だけに絞ってみます。

name: string を id: int に変更
begin_connected_shape, end_connected_shape をbegin_id, end_id
"Arrow" っぽいものに id が不要

あたりで、修正してみます。

moonmile · 2025-12-21T04:51:21Z

ひとまず、

name: string を id: int に変更
begin_connected_shape, end_connected_shape をbegin_id, end_id
AutoShape-Mixed の矢印 Shape では id 付与しない。

のバージョンです。

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

tests/com/test_shapes_extraction.py (1)

85-96: Consider simplifying this complex assertion.

The nested condition with multiple or clauses is hard to parse. Consider extracting to a helper function for readability.

🔎 Suggested refactor

def _is_connector_like(s: Shape) -> bool:
    """Return True if shape has connector-related properties."""
    return bool(
        s.direction
        or s.begin_arrow_style is not None
        or s.end_arrow_style is not None
        or s.begin_id is not None
        or s.end_id is not None
    )

# Then use:
assert not any(
    (s.text == "" or s.text is None)
    and (s.type or "").startswith("AutoShape")
    and not _is_connector_like(s)
    for s in shapes
)

src/exstruct/core/shapes.py (1)

174-186: DRY violation: is_relationship_geom logic duplicates _should_include_shape.

The relationship detection logic here mirrors lines 91-103 in _should_include_shape. Consider extracting this into a shared helper to avoid divergence.

🔎 Proposed refactor

Extract a helper function:

def _is_relationship_shape(
    *,
    shape_type_num: int | None,
    shape_type_str: str | None,
    autoshape_type_str: str | None,
    shape_name: str | None,
) -> bool:
    """Return True if the shape represents a line/arrow/connector."""
    if shape_type_num in (3, 9):
        return True
    if autoshape_type_str and (
        "Arrow" in autoshape_type_str or "Connector" in autoshape_type_str
    ):
        return True
    if shape_type_str and (
        "Connector" in shape_type_str or shape_type_str in ("Line", "ConnectLine")
    ):
        return True
    if shape_name and ("Connector" in shape_name or "Line" in shape_name):
        return True
    return False

Then use it in both _should_include_shape and get_shapes_with_position.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b8e594 and 646d2b1.

📒 Files selected for processing (13)

docs/agents/DATA_MODEL.md (4 hunks)
sample/sample-shape-connector.json (1 hunks)
sample/sample-shape-connector.md (1 hunks)
schemas/shape.json (1 hunks)
schemas/sheet.json (3 hunks)
schemas/workbook.json (3 hunks)
src/exstruct/core/shapes.py (5 hunks)
src/exstruct/models/__init__.py (2 hunks)
tests/com/test_shapes_extraction.py (3 hunks)
tests/test_engine.py (1 hunks)
tests/test_export_requirements.py (1 hunks)
tests/test_models_validation.py (2 hunks)
tests/test_print_area_views.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

sample/sample-shape-connector.json
sample/sample-shape-connector.md

🧰 Additional context used

🧬 Code graph analysis (6)

tests/com/test_shapes_extraction.py (2)

tests/com/test_charts_extraction.py (1)

_ensure_excel (11-16)

src/exstruct/core/integrate.py (1)

extract_workbook (290-388)

tests/test_print_area_views.py (1)

src/exstruct/models/__init__.py (1)

Shape (11-46)

tests/test_engine.py (1)

src/exstruct/models/__init__.py (1)

Shape (11-46)

tests/test_models_validation.py (1)

src/exstruct/models/__init__.py (1)

Shape (11-46)

tests/test_export_requirements.py (1)

src/exstruct/models/__init__.py (1)

Shape (11-46)

src/exstruct/core/shapes.py (1)

src/exstruct/models/__init__.py (1)

Shape (11-46)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Codacy Static Code Analysis

🔇 Additional comments (14)

schemas/sheet.json (1)

251-263: Schema additions for shape identification and connector wiring look good.

The new id, begin_id, and end_id fields are properly defined as nullable integers with appropriate defaults and descriptions. The schema structure is consistent with the Python model definitions in src/exstruct/models/__init__.py.

Also applies to: 277-289, 313-325

tests/test_export_requirements.py (1)

65-65: LGTM!

The test correctly exercises Shape construction with the new id field, ensuring JSON roundtrip serialization works with the updated model.

tests/test_print_area_views.py (1)

9-10: LGTM!

The test correctly assigns distinct id values to the two shapes, ensuring proper testing of shape identification within print area views.

tests/test_models_validation.py (1)

15-15: LGTM!

The test updates correctly include the id field in Shape construction while continuing to validate model defaults and literal field validation.

Also applies to: 51-51

tests/test_engine.py (1)

39-39: LGTM!

The _sample_workbook helper correctly includes the id field, ensuring all dependent tests exercise the updated Shape model.

src/exstruct/models/__init__.py (1)

14-16: LGTM!

The new id, begin_id, and end_id fields are well-defined with appropriate types, defaults, and descriptions. The field ordering logically groups connector-related fields together.

Also applies to: 32-43

schemas/workbook.json (1)

251-263: LGTM!

The Shape definition in the workbook schema is consistent with schemas/sheet.json and the Python model, ensuring schema coherence across the project.

Also applies to: 277-289, 313-325

tests/com/test_shapes_extraction.py (3)

39-54: Good defensive coding for connector setup.

The try/except block appropriately handles environments where Excel cannot wire connectors, allowing tests to gracefully degrade rather than fail unexpectedly.

76-83: LGTM!

The assertions properly validate that:

Shape ids are positive integers

All shape ids within a sheet are unique

115-140: Thorough connector validation test.

The test properly:

Skips gracefully when Excel cannot populate ConnectorFormat properties

Validates that begin_id and end_id are distinct

Confirms referenced ids correspond to actual emitted shapes

docs/agents/DATA_MODEL.md (1)

16-41: Shape model documentation is clear and well-structured.

The documentation accurately reflects the new id, begin_id, and end_id fields with appropriate explanations of their purpose and nullability semantics. The notes about connectors referencing shape IDs via ConnectorFormat.BeginConnectedShape / EndConnectedShape provide helpful context.

src/exstruct/core/shapes.py (2)

232-253: Nested exception handling for COM API is appropriate.

The per-property try/except blocks for BeginConnectedShape and EndConnectedShape are well-structured. COM APIs can throw on unconnected connectors, so isolating each access prevents one failure from masking the other.

118-120: Deferred wiring approach is sound.

Using pending_connections to collect shapes first, then resolving begin_id/end_id after building the complete name_to_id map is a clean two-pass solution that handles forward references correctly.

Also applies to: 265-273

schemas/shape.json (1)

5-17: Schema definitions for new fields are correct.

The id field schema correctly uses anyOf with integer/null types, matching the Pydantic model definition in src/exstruct/models/__init__.py. The description aligns with the documented behavior.

coderabbitai · 2025-12-21T04:51:24Z

docs/agents/DATA_MODEL.md

 # ExStruct データモデル仕様

-**Version**: 0.8  
+**Version**: 0.10  


⚠️ Potential issue | 🟡 Minor

Version header inconsistent with changelog.

The document header states Version 0.10, but the changelog documents changes through 0.11 (connector field rename to begin_id / end_id). Consider updating the version to 0.11 for consistency.

🔎 Proposed fix

-**Version**: 0.10 +**Version**: 0.11

Also applies to: 194-196

🤖 Prompt for AI Agents

In docs/agents/DATA_MODEL.md around lines 3 and also lines 194-196, the Version header and corresponding mention still show 0.10 while the changelog documents changes up to 0.11; update the header and any version references at those locations to 0.11 so the document version matches the changelog (verify there are no other scattered 0.10 occurrences and update them to 0.11 as needed).

coderabbitai · 2025-12-21T04:51:24Z

schemas/shape.json

+        "begin_id": {
+          "anyOf": [
+            {
+              "type": "integer"
+            },
+            {
+              "type": "null"
+            }
+          ],
+          "default": null,
+          "description": "Shape id at the start of a connector (ConnectorFormat.BeginConnectedShape).",
+          "title": "Begin Id"
+        },
+        "end_id": {
+          "anyOf": [
+            {
+              "type": "integer"
+            },
+            {
+              "type": "null"
+            }
+          ],
+          "default": null,
+          "description": "Shape id at the end of a connector (ConnectorFormat.EndConnectedShape).",
+          "title": "End Id"
+        },


⚠️ Potential issue | 🟡 Minor

Inconsistent indentation for begin_id and end_id properties.

The begin_id and end_id property definitions use 8-space indentation while id and other properties use 4-space indentation. This inconsistency should be corrected for uniformity.

🔎 Proposed fix

- "begin_id": { - "anyOf": [ - { - "type": "integer" - }, - { - "type": "null" - } - ], - "default": null, - "description": "Shape id at the start of a connector (ConnectorFormat.BeginConnectedShape).", - "title": "Begin Id" - }, - "end_id": { - "anyOf": [ - { - "type": "integer" - }, - { - "type": "null" - } - ], - "default": null, - "description": "Shape id at the end of a connector (ConnectorFormat.EndConnectedShape).", - "title": "End Id" - }, + "begin_id": { + "anyOf": [ + { + "type": "integer" + }, + { + "type": "null" + } + ], + "default": null, + "description": "Shape id at the start of a connector (ConnectorFormat.BeginConnectedShape).", + "title": "Begin Id" + }, + "end_id": { + "anyOf": [ + { + "type": "integer" + }, + { + "type": "null" + } + ], + "default": null, + "description": "Shape id at the end of a connector (ConnectorFormat.EndConnectedShape).", + "title": "End Id" + },

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"begin_id": {

"anyOf": [

{

"type": "integer"

},

{

"type": "null"

}

],

"default": null,

"description": "Shape id at the start of a connector (ConnectorFormat.BeginConnectedShape).",

"title": "Begin Id"

},

"end_id": {

"anyOf": [

{

"type": "integer"

},

{

"type": "null"

}

],

"default": null,

"description": "Shape id at the end of a connector (ConnectorFormat.EndConnectedShape).",

"title": "End Id"

},

"begin_id": {

"anyOf": [

{

"type": "integer"

},

{

"type": "null"

}

],

"default": null,

"description": "Shape id at the start of a connector (ConnectorFormat.BeginConnectedShape).",

"title": "Begin Id"

},

"end_id": {

"anyOf": [

{

"type": "integer"

},

{

"type": "null"

}

],

"default": null,

"description": "Shape id at the end of a connector (ConnectorFormat.EndConnectedShape).",

"title": "End Id"

},

🤖 Prompt for AI Agents

In schemas/shape.json around lines 18 to 43 the begin_id and end_id property blocks are indented with 8 spaces while other properties use 4 spaces; fix by reformatting those blocks (their keys, inner fields like anyOf/type/default/description/title, and closing braces) to use the same 4-space indentation as surrounding properties so the file formatting is consistent.

coderabbitai · 2025-12-21T04:51:24Z

src/exstruct/core/shapes.py

+        if pending_connections:
+            name_to_id = {name: sid for name, sid in excel_names}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's see if we can find the shapes.py file and read it around lines 267-268 find . -name "shapes.py" -path "*/core/*" -type f 2>/dev/null | head -5

Repository: harumiWeb/exstruct

Length of output: 91

🏁 Script executed:

# Read the file with context around lines 267-268 (let's get lines 250-290) sed -n '250,290p' ./src/exstruct/core/shapes.py | cat -n

Repository: harumiWeb/exstruct

Length of output: 1450

🏁 Script executed:

# Get more context - let's find the function definition # First, get line 250 and trace back to find the function start head -n 270 ./src/exstruct/core/shapes.py | tail -n 100 | head -n 50

Repository: harumiWeb/exstruct

Length of output: 2193

🏁 Script executed:

# Let's find the function name and see the full function with line numbers grep -n "def " ./src/exstruct/core/shapes.py | grep -B5 -A5 "267\|268" || grep -n "excel_names" ./src/exstruct/core/shapes.py

Repository: harumiWeb/exstruct

Length of output: 254

🏁 Script executed:

# Get the function definition - find the function that contains line 118 sed -n '100,130p' ./src/exstruct/core/shapes.py

Repository: harumiWeb/exstruct

Length of output: 1203

🏁 Script executed:

# Let's see the code around line 210 where items are appended to excel_names sed -n '200,220p' ./src/exstruct/core/shapes.py

Repository: harumiWeb/exstruct

Length of output: 1019

🏁 Script executed:

# Now let's check if there are any tests for duplicate name handling find . -name "*test*.py" -o -name "test_*" -type f | xargs grep -l "duplicate.*name\|name.*duplicate" 2>/dev/null

Repository: harumiWeb/exstruct

Length of output: 46

🏁 Script executed:

# Let's search for test files in the repository find . -name "*test*.py" -type f 2>/dev/null | head -10

Repository: harumiWeb/exstruct

Length of output: 374

🏁 Script executed:

# Check the shapes extraction test for any duplicate name handling grep -n "duplicate\|name" ./tests/com/test_shapes_extraction.py | head -20

Repository: harumiWeb/exstruct

Length of output: 92

🏁 Script executed:

# Let's read the entire shapes extraction test file to understand how it works cat ./tests/com/test_shapes_extraction.py

Repository: harumiWeb/exstruct

Length of output: 4570

🏁 Script executed:

# Let's check if there are any comments or documentation about duplicate handling grep -n -A5 -B5 "name_to_id\|excel_names" ./src/exstruct/core/shapes.py | head -40

Repository: harumiWeb/exstruct

Length of output: 1742

🏁 Script executed:

# Let's check if there are any safeguards or validation for duplicate names elsewhere grep -n "shape_name\|excel_name" ./src/exstruct/core/shapes.py | grep -v "^Binary"

Repository: harumiWeb/exstruct

Length of output: 892

🏁 Script executed:

# Let's check the entire function to understand the flow better sed -n '111,275p' ./src/exstruct/core/shapes.py

Repository: harumiWeb/exstruct

Length of output: 7487

🏁 Script executed:

# Let's check if there are any related issues or documentation about this find . -name "*.md" -o -name "*.txt" -o -name "*.rst" | xargs grep -l "duplicate\|shape.*name" 2>/dev/null | head -5

Repository: harumiWeb/exstruct

Length of output: 44

Consider handling potential duplicate shape names in the dict comprehension.

The dict comprehension at line 268 silently overwrites duplicate shape names with the last occurrence. While Excel allows duplicate names, and duplicate shape names can be created through copy and paste operations, this silent overwrite could cause connectors to incorrectly resolve to the wrong shape IDs. Either validate that shape names are unique when building excel_names, log a warning on duplicates, or use the first occurrence instead of the last.

🤖 Prompt for AI Agents

In src/exstruct/core/shapes.py around lines 267-268, the dict comprehension name_to_id = {name: sid for name, sid in excel_names} silently overwrites duplicate shape names; update this code to detect and handle duplicates instead of always keeping the last occurrence: iterate through excel_names in order, if a name is not yet in name_to_id insert it (to keep first occurrence) or alternatively detect when a name is already present and log a warning (use the module logger) listing the duplicated name and both IDs; if you prefer strictness, raise a ValueError when duplicates are found. Ensure tests or callers still receive a deterministic mapping and that the chosen behavior (keep-first, warn, or error) is clearly implemented and logged.

harumiWeb · 2025-12-21T11:09:29Z

ローカルで挙動を確認しましたが、完璧にフローチャート推論ができており、
現状の実装で問題ありません。
フローチャート認識精度向上に寄与する非常に素晴らしい改善です！
PRありがとうございます。マージします。

Shapeのフローを追うために接続先/元を追加

6b8e594

coderabbitai bot reviewed Dec 20, 2025

View reviewed changes

moonmile mentioned this pull request Dec 20, 2025

フローチャートの Shape がうまく追えていない #16

Closed

harumiWeb reviewed Dec 20, 2025

View reviewed changes

Shape.name を Shape.id に変更等

646d2b1

coderabbitai bot reviewed Dec 21, 2025

View reviewed changes

trigger CodeQL

6ffde8c

harumiWeb merged commit c159b02 into harumiWeb:main Dec 21, 2025
9 checks passed

harumiWeb mentioned this pull request Dec 21, 2025

v0.2.80 #18

Merged

This was referenced Dec 24, 2025

#14 セル背景色情報構造化機能の追加 #21

Merged

SmartArt解析構造化機能追加 #30

Merged

v0.3.1 Release #32

Merged

		if pending_connections:
		name_to_id = {name: sid for name, sid in excel_names}

Shapeのフローを追うために接続先/元を追加 #15

Shapeのフローを追うために接続先/元を追加 #15

Uh oh!

Conversation

moonmile commented Dec 20, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

harumiWeb Dec 20, 2025

Choose a reason for hiding this comment

Uh oh!

harumiWeb Dec 20, 2025

Choose a reason for hiding this comment

Uh oh!

harumiWeb left a comment

Choose a reason for hiding this comment

Uh oh!

moonmile commented Dec 21, 2025

Uh oh!

moonmile commented Dec 21, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 21, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 21, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 21, 2025

Choose a reason for hiding this comment

Uh oh!

harumiWeb commented Dec 21, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

moonmile commented Dec 20, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 20, 2025 •

edited

Loading