Fix: setting project on standalone audits#4411
Fix: setting project on standalone audits#4411georgesittas merged 5 commits intoSQLMesh:mainfrom vholmer:fix_empty_project_standalone_audits
Conversation
|
Hey @vholmer, thank you for the contribution! Can you add a test for this, or explain in a bit more detail the exact flow that leads to an error? I'd like to understand this a bit better by reproducing the issue locally, before merging. |
|
I saw that you just updated the issue's description, and with a quick change I managed to reproduce the issue you saw. Let me take a closer look and will get back to reviewing this. |
|
Hi @georgesittas. My colleague has previously been in contact with you and had a sitdown with one of your engineers regarding this. The problem here is that it's extremely hard to reproduce, and we could not get it to crash the same way in a completely fresh sqlmesh project. So before writing a test, I think I need someone on your end to look at what I've done, as well as try to figure out why this even happened to us in the first place. Happy to provide you with any details necessary. I'll see if I can hammer an example together tomorrow, but I think it will be difficult. If it helps, here's the slack thread: https://app.slack.com/client/T041MTMK2JC. We had an audit that looked like this: AUDIT (
name checknum_unique_per_hour,
standalone true,
blocking false,
);And then, after running: sqlmesh plan
sqlmesh auditEverything works fine. After running sqlmesh audit again, however, we're seeing this error: "Duplicate key 'checknum_unique_per_hour' found in UniqueKeyDict<standaloneaudits>. Call dict.update(...) if this is intentional."This crash occurs in context.py:633 on: store[snapshot.name] = snapshot.node This PR fixes the issue. |
|
A bit out of sync here with the comments, glad to hear that you could reproduce it! 😄 |
|
Yep, thanks for providing that context! |
georgesittas
left a comment
There was a problem hiding this comment.
Ok this seems legit to me. I believe project should be set for standalone audits and the fact that it's not seems like a bug to me, cc @tobymao.
Here's a test you can include in test_model.py to ensure there won't be regressions:
def test_project_is_set_in_standalone_audit(tmp_path: Path) -> None:
init_example_project(tmp_path, dialect="duckdb", template=ProjectTemplate.EMPTY)
db_path = str(tmp_path / "db.db")
db_connection = DuckDBConnectionConfig(database=db_path)
config = Config(
project="test",
gateways={"gw": GatewayConfig(connection=db_connection)},
model_defaults=ModelDefaultsConfig(dialect="duckdb"),
)
model = tmp_path / "models" / "some_model.sql"
model.parent.mkdir(parents=True, exist_ok=True)
model.write_text("MODEL (name m); SELECT 1 AS c")
audit = tmp_path / "audits" / "a_standalone_audit.sql"
audit.parent.mkdir(parents=True, exist_ok=True)
audit.write_text("AUDIT (name a, standalone true); SELECT * FROM m WHERE c <= 0")
context = Context(paths=tmp_path, config=config)
context.plan(no_prompts=True, auto_apply=True)
model = tmp_path / "models" / "some_model.sql"
model.parent.mkdir(parents=True, exist_ok=True)
model.write_text("MODEL (name m); SELECT 2 AS c")
assert context.fetchdf(
"select snapshot -> 'node' -> 'project' AS standalone_audit_project "
"""from sqlmesh._snapshots where (snapshot -> 'node' -> 'source_type')::text = '"audit"'"""
).to_dict()["standalone_audit_project"] == {0: '"test"'}
assert context.load().standalone_audits["a"].project == "test"|
@georgesittas Thanks! Added the test, hopefully to the right location. |
|
Not sure whether the failed engine_trio CI is related to this PR or something else... |
It's an intermittent failure, no worries. Restarted the test. |
Closes #4410