Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(): add ad_reporting passthrough metrics variable #84

Conversation

aleix-cd
Copy link
Contributor

@aleix-cd aleix-cd commented Mar 17, 2023

Are you a current Fivetran customer?

Yes, from a company called Capdesk.

What change(s) does this PR introduce?

Introducing the variable ad_reporting__passthrough_metrics, which will serve as a "common denominator" for those metrics that are present in all of our Ads platforms.

In our case, for example, we like to track conversions both in Linkedin, Google, Microsoft and Facebook. Given that ad_reporting was not providing this column, we had to specifically build some ad_reporting models to just add this column.

With this PR, we should be able to do so by just giving the ad_reporting__passthrough_metrics the names of the metrics we want to add into our final models.

On the other hand, this would require a place in the README.md, in order to let package users know that this is possible (and the caveats as well e.g. that this metric needs to be present, and called the same, in all the Ads platforms we are merging). Let me know if I can provide more help in this regard!

Did you update the CHANGELOG?

  • Yes

Does this PR introduce a breaking change?

  • Yes (please provide breaking change details below.)
  • No (please provide an explanation as to how the change is non-breaking below.)
    Shouldn't break anything unless users introduce a metric name that is not found in all platforms' models.

Did you update the dbt_project.yml files with the version upgrade (please leverage standard semantic versioning)? (In both your main project and integration_tests)

  • Yes

Is this PR in response to a previously created Bug or Feature Request

  • Yes, Issue/Feature [link bug/feature number here]
  • No

How did you test the PR changes?

  • Buildkite
  • Local (please provide additional testing details below)
    Tested my local branch of the package against a dbt project, everything builds correctly.

Select which warehouse(s) were used to test the PR

  • BigQuery
  • Redshift
  • Snowflake
  • Postgres
  • Databricks
  • Other (provide details below)

Provide an emoji that best describes your current mood

💃

Feedback

We are so excited you decided to contribute to the Fivetran community dbt package! We continue to work to improve the packages and would greatly appreciate your feedback on our existing dbt packages or what you'd like to see next.

@fivetran-joemarkiewicz fivetran-joemarkiewicz added the status:scoping Currently being scoped label Mar 17, 2023
@fivetran-joemarkiewicz
Copy link
Contributor

Hi @aleix-cd thanks so much for taking the time to open this PR!

After reviewing your write up and taking a look at your suggested updates, I agree that this is something that I feel would be incredibly valuable to add to the package! There are a few suggested changes I can foresee, but altogether I would like to collaborate with you to integrate this into a future release of our package!

I am going to add this review to my up coming sprint and will share more feedback and suggestions then. Thanks again for thinking through this update and working to contribute a valuable addition to the package!

Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aleix-cd I was able to review and feel this looks great, there are just a few changes I think we should consider before moving forward.

Since we have individual hierarchy types (account, campaign, ad group, ads) in the platform specific models, I think it would make the most sense to have the same hierarchy type variables for ad reporting. This would give the user control over each of the hierarchies and ensure that the metrics they are trying to passthrough are consistent (and maybe even renamed) across platforms to ensure the unioning works as intended. What are your thoughts?

So instead of just ad_reporting__passthrough_metrics we could have ad_reporting__campaign_passthrough_metrics. Or would you feel that this is duplicative? I just worry there may be some instances where metrics may not be consistent across hierarchies and I would want to isolate these pass through abilities. Thoughts?

Additionally, I have a few suggestions in the review as well. Let me know if you would prefer I elaborate on any of the suggestions.

Ultimately, I was able to get this to work with slight modifications using the following config:

vars:
  google_ads__campaign_stats_passthrough_metrics:
    - name: conversions
    - name: total_shares
      transform_sql: cast(total_shares as int)
  snapchat_ads__campaign_hourly_report_passthrough_metrics:
    - name: conversion_sign_ups
      alias: conversions
    - name: shares
      alias: total_shares
  tiktok_ads__campaign_hourly_passthrough_metrics:
    - name: conversion
      alias: conversions
    - name: shares
      alias: total_shares
  ad_reporting__campaign_passthrough_metrics: 
    - name: conversions
    - name: total_shares

models/ad_reporting__ad_group_report.sql Outdated Show resolved Hide resolved
@aleix-cd
Copy link
Contributor Author

aleix-cd commented Apr 4, 2023

Since we have individual hierarchy types (account, campaign, ad group, ads) in the platform specific models, I think it would make the most sense to have the same hierarchy type variables for ad reporting. This would give the user control over each of the hierarchies and ensure that the metrics they are trying to passthrough are consistent (and maybe even renamed) across platforms to ensure the unioning works as intended. What are your thoughts?

So instead of just ad_reporting__passthrough_metrics we could have ad_reporting__campaign_passthrough_metrics. Or would you feel that this is duplicative? I just worry there may be some instances where metrics may not be consistent across hierarchies and I would want to isolate these pass through abilities. Thoughts?

Additionally, I have a few suggestions in the review as well. Let me know if you would prefer I elaborate on any of the suggestions.

IMHO it would definitely feel repetitive, but agree that the flexibility it offers should be worth the extra lines. I will push the changes soon, let me know what you think :)

@aleix-cd
Copy link
Contributor Author

aleix-cd commented Apr 4, 2023

@fivetran-joemarkiewicz At this point, just need to test which is the best way to adapt the get_query() macro to account for the different variable names :) Will give it a thought later today!

@aleix-cd
Copy link
Contributor Author

aleix-cd commented Apr 5, 2023

@fivetran-joemarkiewicz OK, I got this up and running! Few comments:

  • metric.name did not work in my end, reverted back to metric. Not sure how you would use it, but strings don't have a name method afaik. Let me know if I am missing something!
  • Added a few more conditions into the get_query in order to add the relevant passthrough metics at each step. I am no longer adding these in the consistent_fields step, let me know if that looks good!
  • I am using an else at the end of the get_query macro to cast all passthrough_metrics as floats, so expecting that all metrics can be casted as such {{ dbt.type_float() }}

@fivetran-joemarkiewicz
Copy link
Contributor

Thanks @aleix-cd I will take a look later today!

Regarding your comment:

metric.name did not work in my end, reverted back to metric. Not sure how you would use it, but strings don't have a name method afaik. Let me know if I am missing something!

You are correct and I apologize for not mentioning this in my initial statement. All our other passthrough metrics for Ad Reporting use the dictionary structure as opposed to the list of strings. Therefore, I feel we should follow this method as well to be consistent across the ad platforms. For example it would be

vars:
  ad_reporting__campaign_passthrough_metrics: 
    - name: conversions
    - name: total_shares

As opposed to

vars:
  ad_reporting__campaign_passthrough_metrics: ["conversions","total_shares"]

That being said, I know we have made some enhancements in other packages that support both forms. I will take a look to see how this was done as we may be able to repurpose here as well.

@aleix-cd
Copy link
Contributor Author

aleix-cd commented Apr 6, 2023

Thanks @aleix-cd I will take a look later today!

Regarding your comment:

metric.name did not work in my end, reverted back to metric. Not sure how you would use it, but strings don't have a name method afaik. Let me know if I am missing something!

You are correct and I apologize for not mentioning this in my initial statement. All our other passthrough metrics for Ad Reporting use the dictionary structure as opposed to the list of strings. Therefore, I feel we should follow this method as well to be consistent across the ad platforms. For example it would be

vars:
  ad_reporting__campaign_passthrough_metrics: 
    - name: conversions
    - name: total_shares

As opposed to

vars:
  ad_reporting__campaign_passthrough_metrics: ["conversions","total_shares"]

That being said, I know we have made some enhancements in other packages that support both forms. I will take a look to see how this was done as we may be able to repurpose here as well.

Ah, my bad! I completely missed that the ad_reporting packages (e.g. google_ads) are using the following fivetran_utils macro to invoque the passthrough metrics:
{{ fivetran_utils.persist_pass_through_columns(pass_through_variable='var', transform = 'sum') }}

I could use this instead of the for loop I'm declaring in each model. What do you think?

@fivetran-joemarkiewicz
Copy link
Contributor

@aleix-cd yeah if we could get it to leverage an already existing macro then that would be ideal! Let me know if it doesn't work on your end. Maybe we can look to adjust the macro be more flexible.

@aleix-cd
Copy link
Contributor Author

aleix-cd commented Apr 14, 2023

@fivetran-joemarkiewicz Alright Joe, this should be ready to review now!
I now treat all the variables as Dicts, and use the fivetran_utils.persist_pass_through_columns() macro on all reports. Tested on my end and it worked with the following config:

vars:
  ad_reporting__url_passthrough_metrics: 
    - name: conversions

Let me know!

@fivetran-joemarkiewicz
Copy link
Contributor

@aleix-cd fantastic! Our next sprint starts Thursday. I will review and work to integrate this into a new release in the coming sprint! I will follow up if I have any other comments or suggestions 😄

@fivetran-joemarkiewicz fivetran-joemarkiewicz added status:in_review Currently in review and removed status:scoping Currently being scoped labels Apr 26, 2023
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aleix-cd thanks so much for working through this. I have had some time to test out your changes and I seem to be running into one issue on my end. When I add two or more metrics to be passed through the variables I am seeing that it looks like only one is being added to the final_fields_superset variable in the get_query macro. Do you know why only one would be getting passed through? Below is my environment:

Note: I am only using the campaign passthrough metric variable for testing reasons.

  • Variables
vars:
  google_ads__campaign_stats_passthrough_metrics:
    - name: conversions
    - name: total_shares
      transform_sql: cast(total_shares as int)
  snapchat_ads__campaign_hourly_report_passthrough_metrics:
    - name: conversion_sign_ups
      alias: conversions
    - name: shares
      alias: total_shares
  tiktok_ads__campaign_hourly_passthrough_metrics:
    - name: conversion
      alias: conversions
    - name: shares
      alias: total_shares
  reddit_ads__campaign_passthrough_metrics:
    - name: conversions
      transform_sql: "null"
    - name: total_shares
      transform_sql: "null"
  ad_reporting__campaign_passthrough_metrics: 
    - name: total_shares
    - name: conversions

Here is the error I am seeing
image

When looking into the compiled results I can see that in the platform specific cte's for the campaign_report I notice that only total_shares is included
image

However, in the final cte I see both total_shares and conversions which is then resulting in the error since conversions in this case is not included in the upstream ctes.
image

I will look into this some more, but curious if you have an idea why only one metrics is being included in the intermediate steps 🤔

@aleix-cd
Copy link
Contributor Author

aleix-cd commented Apr 27, 2023

transform_sql

Spotted the error:
I was defining campaign_passthrough_metric_dict as

{% set campaign_passthrough_metric_array_of_dicts = var('ad_reporting__campaign_passthrough_metrics')[0] %}

So only selecting the first element of the array -- in this case, conversions.

I am now iterating over that list of dictionaries:

{% set campaign_passthrough_metric_array_of_dicts = var('ad_reporting__campaign_passthrough_metrics') %}
    {%- for campaign_group_passthrough_metric_dict in campaign_passthrough_metric_array_of_dicts -%}
        {%- for campaign_passthrough_metric_value in campaign_passthrough_metric_dict.values() -%}
            {%- do final_fields_superset.update({campaign_passthrough_metric_value: campaign_passthrough_metric_value}) -%}
        {%- endfor -%}
    {%- endfor -%}

And this should be solved! All passthrough_metrics should now be populating the intermediate models :)

Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aleix-cd I just tested and ran into a few errors, but was easily able to resolve them and see the passthrough metrics working as intended! Would you be able to apply the below suggested edits? Once those are applied my team can take things going forward for documentation and prep for release!!

One additional question I wanted to ask was if you had any strong preference for keeping the url_report and ad_report passthrough metrics distinct? I noticed in your code you have a variable for ad_reporting__ad_passthrough_metrics and one for ad_reporting__url_passthrough_metrics; however, in our upstream packages we use the same passthrough variable for both reports since they are built from the same base table. I am considering doing the same here to be consistent across the packages.

Let me know your thoughts!

macros/get_query.sql Outdated Show resolved Hide resolved
macros/get_query.sql Outdated Show resolved Hide resolved
macros/get_query.sql Outdated Show resolved Hide resolved
@aleix-cd
Copy link
Contributor Author

aleix-cd commented May 3, 2023

Just applied the requested fixes, @fivetran-joemarkiewicz! Thanks a lot for the tips, appreciate these a lot!

One additional question I wanted to ask was if you had any strong preference for keeping the url_report and ad_report passthrough metrics distinct? I noticed in your code you have a variable for ad_reporting__ad_passthrough_metrics and one for ad_reporting__url_passthrough_metrics; however, in our upstream packages we use the same passthrough variable for both reports since they are built from the same base table. I am considering doing the same here to be consistent across the packages.

That was my bad, I did not realise that the upstream models for ad and url reports were sharing the same variable. I've added the required changes, so now both ad and url reports should be sharing the same passthrough metrics. 👍

Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aleix-cd thanks so much for applying the final updates! This is clear to approve following your recent changes! 🎉

All the updates look good on your end! Before moving forward I wanted to check if you are okay if I make some changes to your branch before merging and releasing? There is some documentation and under the hood (for testing and docs hosting) updates I will need to make. I likely won't edit the code, but wanted to make sure this was okay for me to edit in case you are pointing to this branch in a production environment.

If not, I can merge this into a branch on our repo and then make adjustments there before merging to main. Let me know! Thanks again for all your work on getting this feature included. This will undoubtedly be a crowd pleasure and I envision a lot of people using this extensively. We have had a lot of requests in the past for the ability to passthrough additional metrics, and this makes it possible ☺️

@aleix-cd
Copy link
Contributor Author

aleix-cd commented May 4, 2023

We are not using this in prod, so feel free to make any required changes! And thank you for the thorough review, it has been quite a good learning experience! 💯

@fivetran-joemarkiewicz
Copy link
Contributor

@aleix-cd just wanted to provide a quick update that there seem to be a few unrelated BigQuery JSON issues that are occurring in the upstream Ad packages. These just popped up following an update from BigQuery and we are currently triaging these issues.

Ultimately, we will be applying fixes to our upstream packages in the coming sprint. These changes will be breaking changes for the upstream packages. Therefore, we are planning to batch this PR with the next round of breaking changes so we don't cut breaking changes back to back.

I appreciate your patience!

@fivetran-joemarkiewicz
Copy link
Contributor

The upstream changes are in motion and we can move forward with merging this into the release branch! My plan is to release this early next week!

@fivetran-joemarkiewicz fivetran-joemarkiewicz changed the base branch from main to release/v1.4.0 June 8, 2023 19:14
@fivetran-joemarkiewicz fivetran-joemarkiewicz merged commit 3cdbfbe into fivetran:release/v1.4.0 Jun 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status:in_review Currently in review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants