Skip to content

Fix: setting project on standalone audits#4411

Merged
georgesittas merged 5 commits intoSQLMesh:mainfrom
vholmer:fix_empty_project_standalone_audits
May 15, 2025
Merged

Fix: setting project on standalone audits#4411
georgesittas merged 5 commits intoSQLMesh:mainfrom
vholmer:fix_empty_project_standalone_audits

Conversation

@vholmer
Copy link
Contributor

@vholmer vholmer commented May 14, 2025

Closes #4410

@CLAassistant
Copy link

CLAassistant commented May 14, 2025

CLA assistant check
All committers have signed the CLA.

@georgesittas
Copy link
Contributor

Hey @vholmer, thank you for the contribution!

Can you add a test for this, or explain in a bit more detail the exact flow that leads to an error? I'd like to understand this a bit better by reproducing the issue locally, before merging.

@georgesittas
Copy link
Contributor

I saw that you just updated the issue's description, and with a quick change I managed to reproduce the issue you saw. Let me take a closer look and will get back to reviewing this.

@vholmer
Copy link
Contributor Author

vholmer commented May 14, 2025

Hi @georgesittas.

My colleague has previously been in contact with you and had a sitdown with one of your engineers regarding this. The problem here is that it's extremely hard to reproduce, and we could not get it to crash the same way in a completely fresh sqlmesh project.

So before writing a test, I think I need someone on your end to look at what I've done, as well as try to figure out why this even happened to us in the first place. Happy to provide you with any details necessary.

I'll see if I can hammer an example together tomorrow, but I think it will be difficult. If it helps, here's the slack thread: https://app.slack.com/client/T041MTMK2JC.

We had an audit that looked like this:

AUDIT (
    name checknum_unique_per_hour,
    standalone true,
    blocking false,
);

And then, after running:

sqlmesh plan
sqlmesh audit

Everything works fine. After running sqlmesh audit again, however, we're seeing this error:

"Duplicate key 'checknum_unique_per_hour' found in UniqueKeyDict<standaloneaudits>. Call dict.update(...) if this is intentional."

This crash occurs in context.py:633 on:

store[snapshot.name] = snapshot.node 

This PR fixes the issue.

@vholmer
Copy link
Contributor Author

vholmer commented May 14, 2025

A bit out of sync here with the comments, glad to hear that you could reproduce it! 😄

@georgesittas
Copy link
Contributor

Yep, thanks for providing that context!

Copy link
Contributor

@georgesittas georgesittas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this seems legit to me. I believe project should be set for standalone audits and the fact that it's not seems like a bug to me, cc @tobymao.

Here's a test you can include in test_model.py to ensure there won't be regressions:

def test_project_is_set_in_standalone_audit(tmp_path: Path) -> None:
    init_example_project(tmp_path, dialect="duckdb", template=ProjectTemplate.EMPTY)

    db_path = str(tmp_path / "db.db")
    db_connection = DuckDBConnectionConfig(database=db_path)

    config = Config(
        project="test",
        gateways={"gw": GatewayConfig(connection=db_connection)},
        model_defaults=ModelDefaultsConfig(dialect="duckdb"),
    )

    model = tmp_path / "models" / "some_model.sql"
    model.parent.mkdir(parents=True, exist_ok=True)
    model.write_text("MODEL (name m); SELECT 1 AS c")

    audit = tmp_path / "audits" / "a_standalone_audit.sql"
    audit.parent.mkdir(parents=True, exist_ok=True)
    audit.write_text("AUDIT (name a, standalone true); SELECT * FROM m WHERE c <= 0")

    context = Context(paths=tmp_path, config=config)
    context.plan(no_prompts=True, auto_apply=True)

    model = tmp_path / "models" / "some_model.sql"
    model.parent.mkdir(parents=True, exist_ok=True)
    model.write_text("MODEL (name m); SELECT 2 AS c")

    assert context.fetchdf(
        "select snapshot -> 'node' -> 'project' AS standalone_audit_project "
        """from sqlmesh._snapshots where (snapshot -> 'node' -> 'source_type')::text = '"audit"'"""
    ).to_dict()["standalone_audit_project"] == {0: '"test"'}
    assert context.load().standalone_audits["a"].project == "test"

@georgesittas georgesittas changed the title Setting project on standalone audits Fix: setting project on standalone audits May 14, 2025
@georgesittas georgesittas requested a review from a team May 14, 2025 18:45
@vholmer
Copy link
Contributor Author

vholmer commented May 14, 2025

@georgesittas Thanks! Added the test, hopefully to the right location.

@vholmer
Copy link
Contributor Author

vholmer commented May 14, 2025

Not sure whether the failed engine_trio CI is related to this PR or something else...

@georgesittas
Copy link
Contributor

georgesittas commented May 14, 2025

Not sure whether the failed engine_trio CI is related to this PR or something else...

It's an intermittent failure, no worries. Restarted the test.

@georgesittas georgesittas merged commit 43094a1 into SQLMesh:main May 15, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Setting project on standalone audits

4 participants