Commit

deduplicate pull requests based on PR url + last update (#95)
* deduplicate pull requests based on PR url + last update

* Update stg_pull_requests.sql

* Update src_deduplicated_pull_requests.sql, stg_github_contributions.yml, and stg_pull_requests.sql

---------

Co-authored-by: Ramon <ramon.vermeulen@ing.com>
ramonvermeulen and Ramon authored Apr 11, 2024
1 parent 620b23a commit c7680ef
Showing 3 changed files with 8 additions and 2 deletions.
6 changes: 6 additions & 0 deletions models/staging/src_deduplicated_pull_requests.sql
@@ -0,0 +1,6 @@
+{{ dbt_utils.deduplicate(
+    relation=source("github_contributions", "src_pull_requests"),
+    partition_by="pull_request_url",
+    order_by="updated_at desc",
+   )
+}}
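
For readers unfamiliar with the macro, the sketch below illustrates the deduplication rule this model applies: keep one row per `pull_request_url`, choosing the one with the latest `updated_at`. It is an illustration only, not the SQL that `dbt_utils.deduplicate` actually generates (that varies by adapter). The sample rows, the `state` column, and the in-memory SQLite table are assumptions made for the example, and it needs a SQLite build with window-function support (3.25 or newer).

```python
import sqlite3

# Hypothetical sample data: the same pull request was loaded twice,
# once before and once after it was updated.
rows = [
    ("https://example.test/pr/1", "2024-04-01T10:00:00Z", "open"),
    ("https://example.test/pr/1", "2024-04-10T09:30:00Z", "closed"),
    ("https://example.test/pr/2", "2024-04-05T12:00:00Z", "open"),
]

# Rank rows within each pull_request_url by updated_at (newest first)
# and keep only the top-ranked row of each partition.
DEDUPLICATE_SQL = """
with ranked as (
    select
        *,
        row_number() over (
            partition by pull_request_url
            order by updated_at desc
        ) as rn
    from src_pull_requests
)
select pull_request_url, updated_at, state
from ranked
where rn = 1
order by pull_request_url
"""


def deduplicate(sample_rows):
    """Load the rows into an in-memory table and return the deduplicated rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table src_pull_requests "
            "(pull_request_url text, updated_at text, state text)"
        )
        conn.executemany(
            "insert into src_pull_requests values (?, ?, ?)", sample_rows
        )
        return conn.execute(DEDUPLICATE_SQL).fetchall()
    finally:
        conn.close()


deduplicated = deduplicate(rows)
for row in deduplicated:
    print(row)
```

With the sample rows above, only the later of the two rows for the first pull request survives, which is the behaviour `stg_pull_requests` relies on once it selects from `src_deduplicated_pull_requests`.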
1 change: 1 addition & 0 deletions models/staging/stg_github_contributions.yml
@@ -1,6 +1,7 @@
 version: 2

 models:
+  - name: src_deduplicated_pull_requests
   - name: stg_pull_requests
     columns:
       - name: url
3 changes: 1 addition & 2 deletions models/staging/stg_pull_requests.sql
@@ -156,5 +156,4 @@ select
     cast(performed_via_github_app as integer) as performed_via_github_app,
     cast(state_reason as integer) as state_reason,
     cast(score as double) as score,
-
-from {{ source("github_contributions", "src_pull_requests") }}
+from {{ ref("src_deduplicated_pull_requests") }}
