Skip to content

fix sql defs lookup#9754

Merged
Light2Dark merged 3 commits into
mainfrom
sham/fix-sql-defs
Jun 2, 2026
Merged

fix sql defs lookup#9754
Light2Dark merged 3 commits into
mainfrom
sham/fix-sql-defs

Conversation

@Light2Dark
Copy link
Copy Markdown
Collaborator

@Light2Dark Light2Dark commented Jun 2, 2026

📝 Summary

  • First, I added tests to assert the failure, including same table names across different schemas and catalogs, true duplicate qualified names, unqualified duplicates, validation errors, and unregister/delete cleanup.
  • The fix keeps the existing definition indexes for graph lookup, but adds a separate conflict index for duplicate detection. Qualified SQL definitions, like CREATE TABLE schema.table, are keyed by their fully qualified name. Python variables and unqualified SQL definitions, like CREATE TABLE t1, keep the existing global-name behavior.
  • This fixes SQL definition collision tracking so schema/catalog-qualified tables with the same leaf name, like finance.datalake.xx and finance.information_layer.xx, do not trigger false multiple-definition errors.

Closes #9744

📋 Pre-Review Checklist

  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
  • Video or media evidence is provided for any visual changes (optional).

✅ Merge Checklist

  • I have read the contributor guidelines.
  • Documentation has been updated where applicable, including docstrings for API changes.
  • Tests have been added for the changes made.

Copilot AI review requested due to automatic review settings June 2, 2026 07:00
@vercel
Copy link
Copy Markdown

vercel Bot commented Jun 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
marimo-docs Ready Ready Preview, Comment Jun 2, 2026 8:05am

Request Review

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a false multiple-definition error for SQL cells that create tables with the same leaf name in different qualified schemas/catalogs (e.g. finance.datalake.xx vs finance.information_layer.xx). The DefinitionRegistry now decouples its name-lookup index from conflict detection, so qualified SQL definitions are grouped by their fully qualified name when checking for redefinitions, while still being discoverable by leaf name for SQL edge resolution.

Changes:

  • Add definition_conflicts / conflict_names keyed on ("sql", qualified_name) for SQL or ("global", name) otherwise; get_multiply_defined() now reports conflicts from this map.
  • Update unregister_definitions to take variable_data (so it can compute the conflict key and prune typed/conflict tables); delete_cell passes cell.variable_data accordingly.
  • Rewire validate_graph.check_for_multiple_definitions and graph.get_multiply_defined through a new get_multiply_defined_conflicts() API; add tests for catalog/schema/qualified/unqualified redefinition scenarios.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
marimo/_runtime/dataflow/definitions.py Replace definition_types with conflict-key tracking; update register/unregister and multiply-defined logic.
marimo/_runtime/dataflow/graph.py Pass variable_data to unregister; expose get_multiply_defined_conflicts().
marimo/_lint/validate_graph.py Use the new conflicts API to emit MultipleDefinitionErrors.
tests/_runtime/test_dataflow.py Add tests for same-name diff-schema/catalog, same qualified name, unqualified, and delete-clears-conflict.
tests/_lint/test_validate_graph.py Add lint-level test that qualified SQL names are used for multiple-definition errors.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 5 files

Architecture diagram
sequenceDiagram
    participant Cell as Cell Registration
    participant Registry as DefinitionRegistry
    participant DefIndex as definitions dict
    participant TypedIndex as typed_definitions dict
    participant ConflictIndex as definition_conflicts dict
    participant Graph as DirectedGraph
    participant Lint as check_for_multiple_definitions

    Note over Cell,Lint: SQL Definition Registration with Qualified Name Conflict Resolution

    Cell->>Registry: register_definition(name, variable_data, cell_id)
    Registry->>Registry: _conflict_key(name, variable)
    alt qualified SQL def (e.g. finance.datalake.xx)
        Registry->>Registry: conflict_key = ("sql", "finance.datalake.xx")
    else unqualified SQL or Python def
        Registry->>Registry: conflict_key = ("global", name)
    end

    Registry->>DefIndex: store name -> cell_id
    Registry->>TypedIndex: store (name, kind) -> cell_id
    Registry->>ConflictIndex: store conflict_key -> cell_id
    Registry->>ConflictIndex: check for duplicate conflict_key
    alt duplicate conflict_key found
        ConflictIndex-->>Registry: sibling cells with same conflict_key
        Registry-->>Cell: return siblings (conflicting cells)
    else no duplicate
        Registry-->>Cell: return empty set
    end

    Note over Graph,ConflictIndex: Checking for Multiple Definitions

    Graph->>Registry: get_multiply_defined()
    Registry->>ConflictIndex: iterate over definition_conflicts
    alt len(definers) > 1
        ConflictIndex-->>Registry: conflict_key + definers
        Registry->>Registry: map conflict_key to Name via conflict_names
        Registry-->>Graph: [(name, set of cell_ids)]
    else no conflicts
        Registry-->>Graph: []
    end

    Graph->>Lint: get_multiply_defined_conflicts()
    Lint->>Lint: build MultipleDefinitionError per conflicting cell

    Note over Cell,Registry: Cell Deletion and Cleanup

    Graph->>Registry: unregister_definitions(cell_id, variable_data)
    loop for each name in variable_data
        Registry->>DefIndex: discard cell_id
        Registry->>TypedIndex: discard cell_id
        Registry->>ConflictIndex: discard cell_id
        opt conflict_key has no remaining cells
            Registry->>ConflictIndex: delete conflict_key
            Registry->>Registry: pop conflict_names
        end
        opt name has no remaining cells
            Registry->>DefIndex: delete name
        end
    end
Loading

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread marimo/_runtime/dataflow/definitions.py
@Light2Dark Light2Dark added the bug Something isn't working label Jun 2, 2026
@Light2Dark Light2Dark merged commit 7745bbf into main Jun 2, 2026
43 checks passed
@Light2Dark Light2Dark deleted the sham/fix-sql-defs branch June 2, 2026 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

🐛 Bug ❓: Variable name collision for SQL cells using same table name in different schemas

3 participants