Duplicate entries in the report. #1420
Comments
Hello, I can confirm this bug, which is a bit annoying. When we run models in parallel, we end up with duplicates in the tables.
create or replace table elementary.dbt_tests as (
select * from elementary.dbt_tests qualify row_number() over (partition by unique_id order by generated_at) = 1
);
create or replace table elementary.dbt_models as (
select * from elementary.dbt_models qualify row_number() over (partition by unique_id order by generated_at) = 1
);
create or replace table elementary.dbt_sources as (
select * from elementary.dbt_sources qualify row_number() over (partition by unique_id order by generated_at) = 1
);
create or replace table elementary.dbt_exposures as (
select * from elementary.dbt_exposures qualify row_number() over (partition by unique_id order by generated_at) = 1
);
create or replace table elementary.dbt_columns as (
select * from elementary.dbt_columns qualify row_number() over (partition by unique_id order by generated_at) = 1
);
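The statements above keep the earliest row per `unique_id`, ordered by `generated_at`. As a rough, runnable illustration of the same idea, here is a minimal sketch using Python's built-in `sqlite3` instead of the warehouse. SQLite has no `QUALIFY`, so the window function sits in a subquery. The helper name `dedupe_keep_first` and the sample rows are made up for the example, and window functions require SQLite 3.25 or newer.

```python
import sqlite3


def dedupe_keep_first(conn, table, key="unique_id", order_by="generated_at"):
    """Delete every row except the earliest one per `key`; return rows removed."""
    # Identifiers are interpolated into the SQL, so pass trusted names only.
    cur = conn.execute(
        f"""
        DELETE FROM {table}
        WHERE rowid NOT IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY {key} ORDER BY {order_by}
                       ) AS rn
                FROM {table}
            )
            WHERE rn = 1
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Made-up sample data: test.a was written twice by two parallel runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dbt_tests (unique_id TEXT, generated_at TEXT)")
conn.executemany(
    "INSERT INTO dbt_tests VALUES (?, ?)",
    [
        ("test.a", "2024-01-01 10:00:00"),
        ("test.a", "2024-01-01 10:00:05"),
        ("test.b", "2024-01-01 10:00:00"),
    ],
)
removed = dedupe_keep_first(conn, "dbt_tests")
remaining = conn.execute(
    "SELECT unique_id, generated_at FROM dbt_tests ORDER BY unique_id"
).fetchall()
```

This is a cleanup after the fact: duplicates can reappear on the next parallel run, so it would have to be rerun before each report.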
Hi @annav00 and @MICHM137 , I'll mark this as high priority on our end. In the meantime, a workaround you can consider is setting the var (when caching is enabled we only insert a diff, and I think there's probably a race there)
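To make the suspected race concrete, here is a minimal sketch of a generic check-then-insert race. It is not elementary's actual code. The function `compute_diff`, the in-memory `table`, and the artifact IDs are all hypothetical stand-ins, and the interleaving is written out by hand so the outcome is deterministic.

```python
def compute_diff(snapshot_ids, artifacts):
    """Return the artifacts whose unique_id is missing from the snapshot."""
    return [row for row in artifacts if row["unique_id"] not in snapshot_ids]


table = []  # stands in for an artifact table such as dbt_models
artifacts = [{"unique_id": "model.a"}, {"unique_id": "model.b"}]

# Two parallel runs both read the (still empty) table before either one writes.
snapshot_run_1 = {row["unique_id"] for row in table}
snapshot_run_2 = {row["unique_id"] for row in table}

# Each run then inserts what it believes is missing, based on its stale snapshot.
table.extend(compute_diff(snapshot_run_1, artifacts))
table.extend(compute_diff(snapshot_run_2, artifacts))

ids = [row["unique_id"] for row in table]
```

Both runs see an empty snapshot, so both insert the full set and every `unique_id` ends up in the table twice. If the real diff logic works along these lines, serializing the read and the write, or de-duplicating on `unique_id` at insert time, would close the window.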
Hey (it is MICHM137),
Hi @mattxxi , I think the duplicate entries when the cache is enabled result from a race in the
Do you have a timeline for the fix? Is the issue happening in the latest version of elementary as well?
Describe the bug
The on-run-end hook saves data to the artifact tables. When running tests/models in parallel, duplicate entries in the artifact tables sometimes occur. Because of this, duplicate records with information about tests appear when the report is generated.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The report contains one entry for each inspection.
Screenshots
Example: elementary_test_results 1 record * dbt_sources 2 records * dbt_tests 6 records -> there are 12 records in the report.
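The multiplication in this example (1 × 2 × 6 = 12) is what a join produces when the joined tables contain duplicate keys. The sketch below reproduces the count with Python's built-in `sqlite3`. The join columns `test_unique_id` and `source_unique_id` and the IDs are assumptions made for illustration; the actual report query is not shown in this issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE elementary_test_results (test_unique_id TEXT, source_unique_id TEXT);
    CREATE TABLE dbt_tests (unique_id TEXT);
    CREATE TABLE dbt_sources (unique_id TEXT);
    """
)
# One test result, six copies of the test row, two copies of the source row.
conn.execute("INSERT INTO elementary_test_results VALUES ('test.t1', 'source.s1')")
conn.executemany("INSERT INTO dbt_tests VALUES (?)", [("test.t1",)] * 6)
conn.executemany("INSERT INTO dbt_sources VALUES (?)", [("source.s1",)] * 2)

# Hypothetical join shape; each duplicate key multiplies the output rows.
JOIN_SQL = """
    FROM elementary_test_results r
    JOIN dbt_tests t ON t.unique_id = r.test_unique_id
    JOIN dbt_sources s ON s.unique_id = r.source_unique_id
"""
joined_rows = conn.execute("SELECT COUNT(*) " + JOIN_SQL).fetchone()[0]

# De-duplicating at read time collapses the fan-out back to one row.
distinct_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT r.test_unique_id, s.unique_id "
    + JOIN_SQL
    + ")"
).fetchone()[0]
```

Here `joined_rows` is 12 and `distinct_rows` is 1. The `DISTINCT` variant only hides the duplicates when the selected columns are identical across the copies; columns that differ between copies, such as `generated_at`, would keep the rows apart.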
Environment (please complete the following information):
Additional context
Perhaps the duplication in the artifact tables can be prevented when running in parallel. Alternatively, the records could be de-duplicated (for example with DISTINCT) when querying the data for the report.