Problem with versioned models?! #48

Closed
smilingthax opened this issue May 6, 2024 · 7 comments · Fixed by #50
Labels
bug Something isn't working

Comments

@smilingthax

Describe the bug

dbt-loom/test_projects/customer_success# dbt build
09:32:20  Running with dbt=1.7.14
09:32:20  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
09:32:20  dbt-loom: Loading manifest for `revenue` from `file`
09:32:20  Registered adapter: duckdb=1.7.4
09:32:20  dbt-loom: Injecting nodes
09:32:20  [WARNING]: Model orders has passed its deprecation date of 2024-01-01T00:00:00+00:00. This model should be disabled or removed.            ## (removing deprecation_date does not change anything)
09:32:20  Encountered an error:
Compilation Error
  'model.revenue.not_null_orders_v1_order_id' depends on 'model.revenue.orders.v1' which is not in the graph!

To Reproduce

  1. Install and set up dbt-core, dbt-duckdb, dbt-loom, ...
  2. git clone https://github.com/nicholasyager/dbt-loom (to retrieve test_projects/)
  3. In dbt-loom/test_projects/revenue run dbt deps, dbt build, dbt run
  4. In dbt-loom/test_projects/customer_success run dbt deps, try dbt build or dbt run
  5. See error, above.

Expected behavior

The test project from the dbt-loom repository should compile without errors.
Other projects which use versioned models also compile without errors.

  • OS: python:3.12-bookworm-based container running on amd64 linux
  • dbt-loom Version 0.5.1
  • dbt-core Version 1.7.14, also 1.7.13

Additional context

This first happened in my own project, but just using the test_projects from the dbt-loom repository exhibits the same behaviour.

AFAICT the corresponding node name/id in revenue/target/manifest.json is "model.revenue.orders.v1", whereas in customer_success/target/manifest.json the injected(?) node seems to be called "model.revenue.orders.v1.0" – but some/all(?) references to it (depends_on, ...) still use the "original" "model.revenue.orders.v1" name/id, which then cannot be found, as said in the error message (... depends on 'model.revenue.orders.v1' which is not in the graph!)...
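
One way to confirm this kind of mismatch is to scan a manifest for depends_on entries that point at ids missing from the manifest itself. The following is a minimal sketch, assuming the usual manifest.json layout (top-level maps such as "nodes" and "sources" keyed by unique id, with each node carrying depends_on.nodes). The function name find_dangling_refs and the file path are illustrative and not part of dbt or dbt-loom.

```python
import json
from pathlib import Path

# Top-level manifest sections whose keys are treated as valid dependency targets.
# This list is an assumption; extend it if your manifest has other sections.
ID_SECTIONS = ("nodes", "sources", "metrics", "exposures", "semantic_models")


def find_dangling_refs(manifest: dict) -> dict[str, list[str]]:
    """Map each node id to the dependencies that are absent from the manifest."""
    known: set[str] = set()
    for section in ID_SECTIONS:
        known.update(manifest.get(section, {}))

    dangling: dict[str, list[str]] = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        deps = node.get("depends_on", {}).get("nodes", [])
        missing = [dep for dep in deps if dep not in known]
        if missing:
            dangling[unique_id] = missing
    return dangling


if __name__ == "__main__":
    # Hypothetical path: point it at the downstream project's manifest.
    path = Path("target/manifest.json")
    if path.exists():
        manifest = json.loads(path.read_text())
        for node_id, missing in find_dangling_refs(manifest).items():
            print(f"{node_id} -> missing {missing}")
```

If the diagnosis above is right, a test node should show up here with 'model.revenue.orders.v1' listed as missing while 'model.revenue.orders.v1.0' is present.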

Non-versioned models seem to be unaffected / work fine.

@smilingthax smilingthax added bug Something isn't working triage This issue is being investigated labels May 6, 2024
@nicholasyager
Owner

Thank you for taking the time to put this together, @smilingthax! I've attempted to replicate this result, but I've not been able to get dbt-loom/dbt-core to yield this same compilation error.

I suspect an incompatibility is being exposed here in how different versions of dbt-core represent node unique_id values for versioned models. If you're still receiving the error, can you please follow your same steps, but run a dbt clean prior to dbt deps in both projects? I suspect that this will clear out any lingering incompatible unique_ids generated between versions.

@theodotdot

I am not sure my issue is related but it seems like it is:

I am having similar errors like

Compilation Error
  'model.my_dbt_project.stg_my_seed_file' depends on 'seed.my_dbt_project.seed_my_seed_file' which is not in the graph!
(I replaced the names with generic ones, this is on a private dbt repo that I cannot share)

This error happens with seeds however. This error even happens when trying to run models unrelated to the seed file and model in the error. In the first project, there are no errors when running or compiling.

I tried running with --no-partial-parse as well as running dbt clean before the commands to no avail.
Deleting the seed and related model only gave the same error with another seed and model pair.
I have tried with both dbt=1.7.14/bigquery=1.7.7 and dbt=1.6.9/bigquery=1.6.9

I have looked at the manifest.json from project A and I can see the seed node. However, in the manifest generated from the run in project B, the seed node appears only as a dependency of the staging model; it doesn't seem to be injected as a node itself.

@smilingthax
Author

> I suspect an incompatibility is being exposed here in how different versions of dbt-core represent node unique_id values for versioned models. If you're still receiving the error, can you please follow your same steps, but run a dbt clean prior to dbt deps in both projects? I suspect that this will clear out any lingering incompatible unique_ids generated between versions.

It is not a transient error, I can reproduce it with a "clean install":

Dockerfile.bug:

FROM python:3.12-bookworm AS base

RUN apt-get update \
 && apt-get dist-upgrade -y \
 && apt-get install -y --no-install-recommends \
    git \
    less vim

ENV PYTHONIOENCODING=utf-8
ENV LANG=C.UTF-8

RUN python -m pip install --no-cache-dir "dbt-core"
RUN python -m pip install --no-cache-dir "dbt-duckdb"
RUN python -m pip install --no-cache-dir "dbt-loom"

Then:

$ docker build - < Dockerfile.bug
[...]
Successfully built 7fa7942df6c1

$ docker run --rm -ti 7fa7942df6c1 bash

root@bc9d87680530:/# cd /tmp

root@bc9d87680530:/tmp# git clone https://github.com/nicholasyager/dbt-loom
[...]

root@bc9d87680530:/tmp# cd dbt-loom/test_projects/revenue

root@bc9d87680530:/tmp/dbt-loom/test_projects/revenue# dbt clean
15:17:47  Running with dbt=1.7.14
15:17:48  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
15:17:48  Checking /tmp/dbt-loom/test_projects/revenue/target/*
15:17:48  Cleaned /tmp/dbt-loom/test_projects/revenue/target/*
15:17:48  Checking /tmp/dbt-loom/test_projects/revenue/dbt_packages/*
15:17:48  Cleaned /tmp/dbt-loom/test_projects/revenue/dbt_packages/*
15:17:48  Finished cleaning all paths.

root@bc9d87680530:/tmp/dbt-loom/test_projects/revenue# dbt deps
15:17:52  Running with dbt=1.7.14
15:17:52  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
15:17:52  Installing dbt-labs/dbt_utils
15:17:53  Installed from version 1.0.0
15:17:53  Updated version available: 1.1.1
15:17:53
15:17:53  Updates available for packages: ['dbt-labs/dbt_utils']
Update your versions in packages.yml, then run dbt deps

root@bc9d87680530:/tmp/dbt-loom/test_projects/revenue# dbt build
15:17:56  Running with dbt=1.7.14
15:17:56  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
15:17:56  Registered adapter: duckdb=1.7.4
15:17:56  Unable to do partial parsing because saved manifest not found. Starting full parse.
15:17:57  dbt-loom: Injecting nodes
15:17:57  [WARNING]: Model orders.v1 has passed its deprecation date of 2024-01-01T00:00:00+00:00. This model should be disabled or removed.
15:17:57  Found 7 models, 1 seed, 18 tests, 5 sources, 0 exposures, 0 metrics, 507 macros, 0 groups, 0 semantic models
[...]
15:17:58  Finished running 5 view models, 18 tests, 1 seed, 2 incremental models in 0 hours 0 minutes and 1.35 seconds (1.35s).
15:17:58
15:17:58  Completed successfully
15:17:58
15:17:58  Done. PASS=26 WARN=0 ERROR=0 SKIP=0 TOTAL=26

root@bc9d87680530:/tmp/dbt-loom/test_projects/revenue# dbt run
15:18:02  Running with dbt=1.7.14
15:18:02  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
15:18:02  Registered adapter: duckdb=1.7.4
15:18:02  dbt-loom: Injecting nodes
15:18:02  [WARNING]: Model orders.v1 has passed its deprecation date of 2024-01-01T00:00:00+00:00. This model should be disabled or removed.
15:18:02  Found 7 models, 1 seed, 18 tests, 5 sources, 0 exposures, 0 metrics, 507 macros, 0 groups, 0 semantic models
15:18:02
15:18:03  Concurrency: 4 threads (target='dev')
15:18:03
15:18:03  1 of 7 START sql view model main.stg_locations ................................. [RUN]
15:18:03  2 of 7 START sql view model main.stg_order_items ............................... [RUN]
15:18:03  3 of 7 START sql view model main.stg_orders .................................... [RUN]
15:18:03  4 of 7 START sql view model main.stg_products .................................. [RUN]
15:18:03  1 of 7 OK created sql view model main.stg_locations ............................ [OK in 0.09s]
15:18:03  5 of 7 START sql view model main.stg_supplies .................................. [RUN]
15:18:03  4 of 7 OK created sql view model main.stg_products ............................. [OK in 0.11s]
15:18:03  2 of 7 OK created sql view model main.stg_order_items .......................... [OK in 0.12s]
15:18:03  3 of 7 OK created sql view model main.stg_orders ............................... [OK in 0.14s]
15:18:03  5 of 7 OK created sql view model main.stg_supplies ............................. [OK in 0.08s]
15:18:03  6 of 7 START sql incremental model main.orders_v1 .............................. [RUN]
15:18:03  7 of 7 START sql incremental model main.orders_v2 .............................. [RUN]
15:18:03  7 of 7 OK created sql incremental model main.orders_v2 ......................... [OK in 0.27s]
15:18:03  6 of 7 OK created sql incremental model main.orders_v1 ......................... [OK in 0.56s]
15:18:03
15:18:03  Finished running 5 view models, 2 incremental models in 0 hours 0 minutes and 0.81 seconds (0.81s).
15:18:03
15:18:03  Completed successfully
15:18:03
15:18:03  Done. PASS=7 WARN=0 ERROR=0 SKIP=0 TOTAL=7

root@bc9d87680530:/tmp/dbt-loom/test_projects/revenue# cd ../customer_success/

root@bc9d87680530:/tmp/dbt-loom/test_projects/customer_success# dbt clean
15:18:16  Running with dbt=1.7.14
15:18:16  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
15:18:16  dbt-loom: Loading manifest for `revenue` from `file`
15:18:16  Checking /tmp/dbt-loom/test_projects/customer_success/dbt_packages/*
15:18:16  Cleaned /tmp/dbt-loom/test_projects/customer_success/dbt_packages/*
15:18:16  Checking /tmp/dbt-loom/test_projects/customer_success/target/*
15:18:16  Cleaned /tmp/dbt-loom/test_projects/customer_success/target/*
15:18:16  Finished cleaning all paths.

root@bc9d87680530:/tmp/dbt-loom/test_projects/customer_success# dbt deps
15:18:22  Running with dbt=1.7.14
15:18:22  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
15:18:22  dbt-loom: Loading manifest for `revenue` from `file`
15:18:22  Installing dbt-labs/dbt_utils
15:18:22  Installed from version 1.0.0
15:18:22  Updated version available: 1.1.1
15:18:22
15:18:22  Updates available for packages: ['dbt-labs/dbt_utils']
Update your versions in packages.yml, then run dbt deps

root@bc9d87680530:/tmp/dbt-loom/test_projects/customer_success# dbt build
15:18:27  Running with dbt=1.7.14
15:18:27  dbt-loom: Patching ref protection methods to support dbt-loom dependencies.
15:18:27  dbt-loom: Loading manifest for `revenue` from `file`
15:18:27  Registered adapter: duckdb=1.7.4
15:18:27  Unable to do partial parsing because saved manifest not found. Starting full parse.
15:18:28  [WARNING]: Did not find matching node for patch with name 'orders' in the 'models' section of file 'models/marts/__models.yml'
15:18:28  dbt-loom: Injecting nodes
15:18:28  [WARNING]: Model orders has passed its deprecation date of 2024-01-01T00:00:00+00:00. This model should be disabled or removed.
15:18:28  Encountered an error:
Compilation Error
  'model.revenue.not_null_orders_v1_order_id' depends on 'model.revenue.orders.v1' which is not in the graph!

root@bc9d87680530:/tmp/dbt-loom/test_projects/customer_success#

This could probably be minified even more... the Dockerfile is simply based on https://github.com/dbt-labs/dbt-core/blob/main/docker/Dockerfile;
I previously pip-installed the more specific "git+https://github.com/dbt-labs/dbt-core@v1.7.14#egg=dbt-core&subdirectory=core" (but not for dbt-duckdb / dbt-loom), with no difference.

I also suspect some incompatibility, but I have no clue what to try next...

@theodotdot : A - somewhat similar - problem with seeds has already been fixed in 0.5.1 (are you using the newest version of dbt-loom?): #47 , but it is AFAICT unrelated to this issue.

@nicholasyager
Owner

@smilingthax Thanks for the Dockerfile! This should make it much easier to replicate the issue. Also, kudos for confirming that this isn't transient. I'll dig into replicating the issue this afternoon 👍🏻 In the meantime, please let me know if you find any more leads.

@nicholasyager
Owner

nicholasyager commented May 6, 2024

@smilingthax Quick update on this: I've determined that the difference seems to be due to how ModelNodeArgs in dbt-core handles version values, where a version can be either a str or a float. I think what's happening is:

  • In the upstream project (revenue) this version is parsed as a string, so the unique id is model.revenue.orders.v1, and this is what is used for dependencies.
  • In the downstream project (customer_success) this version is being parsed as a float, so the unique id becomes model.revenue.orders.v1.0. This is fine for models ref-ing the upstream model, since they're generating the database Relation on the fly w/ a two-argument ref.
  • There are tests, however, being injected in customer_success, too. This is a defect. These tests have dependencies defined by the upstream project's unique id (the string-encoded version), which do not match the unique ids injected into the downstream project.
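
The mismatch described in these bullets can be reproduced in isolation. Here is a minimal sketch, assuming the unique id of a versioned model is formatted as model.<package>.<name>.v<version>, which is consistent with the ids quoted in this thread. versioned_unique_id is a hypothetical helper and not dbt-core's actual implementation.

```python
def versioned_unique_id(package: str, name: str, version) -> str:
    """Format an id the way the ids quoted in this thread appear to be built."""
    return f"model.{package}.{name}.v{version}"


# Upstream: the version is kept as a string.
upstream = versioned_unique_id("revenue", "orders", "1")
# Downstream: the same version ends up as a float before formatting.
downstream = versioned_unique_id("revenue", "orders", float("1"))

print(upstream)                # model.revenue.orders.v1
print(downstream)              # model.revenue.orders.v1.0
print(upstream == downstream)  # False
```

Any dependency list that still carries the string-based id then points at a node that does not exist under that name.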

The resolution, then, is twofold:

  1. Update ManifestNode to perform a type conversion on the model's version. This will prevent incorrect deserialization from occurring.
  2. Update identify_node_subgraph to use dbt-core's NodeType enums to prevent the injection of non-node resources.
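
The two steps could look roughly like the following. This is a sketch only and not the code from the PR: normalize_version, InjectableType, and select_injectable are hypothetical names, the enum is a local stand-in for dbt-core's NodeType, and rendering whole-number floats as their integer string is an assumption about the desired behavior.

```python
from enum import Enum
from typing import Optional, Union


def normalize_version(version: Union[str, int, float, None]) -> Optional[str]:
    """Coerce a model version to a string so that 1, 1.0, and "1" all agree."""
    if version is None:
        return None
    if isinstance(version, float) and version.is_integer():
        # 1.0 -> 1, so a formatted id ends in "v1" rather than "v1.0".
        version = int(version)
    return str(version)


class InjectableType(str, Enum):
    """Local stand-in for the resource types that should be injected (assumed)."""

    MODEL = "model"


def select_injectable(nodes: dict) -> dict:
    """Keep only nodes whose resource_type is in the allowed set, e.g. drop tests."""
    allowed = {t.value for t in InjectableType}
    return {
        unique_id: node
        for unique_id, node in nodes.items()
        if node.get("resource_type") in allowed
    }
```

With the version normalized before the id is built, upstream and downstream agree on the id; with tests filtered out, nothing injected carries a dependency list from the upstream project.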

@nicholasyager nicholasyager removed the triage This issue is being investigated label May 6, 2024
@nicholasyager
Owner

@smilingthax I have a PR (#50) that I believe resolves this defect. I'll leave this up for a day or so if you want to give that branch a test with your dbt implementation. Let me know if you have any feedback!

@smilingthax
Author

The string conversion from #50 in dbt_loom/__init__.py fixes the problem in my own project; the PR branch also installs and works fine. I did not test the other changes.

3 participants