Matomo Attribution Model Issue #21020

Open
jorgeuos opened this issue Jul 14, 2023 · 17 comments

Labels
Bug For errors / faults / flaws / inconsistencies etc.

Comments

@jorgeuos

jorgeuos commented Jul 14, 2023

Hi DevTeam!

I have gathered some questions and thoughts about the Matomo Attribution Model. I have tried to explain it as best as I can, but I’m not sure if I’m correct. So please correct me if I’m wrong.

First we have the questions from a client, then I have added my thoughts and findings below the Results headline.

From client:
 

MATOMO ATTRIBUTION RULES

 

  1. Visits are last touch
    https://matomo.org/faq/troubleshooting/faq_50/

Higher “Direct Entry” visits in Matomo When tracking the acquisition source of visitors, Google Analytics stores and uses campaign data for up to 6 months and attributes subsequent direct entry visits to the original campaign acquisition source. (So Google Analytics will report a higher number of visits attributed to “Campaigns” because these visits used a campaign in the previous 6 months.) Whereas Matomo tracks any new “Direct Entry” visits as direct entries and does not attribute these new “Direct Entry” visits to their original acquisition source. With Matomo’s Multi Channel Conversion Attribution plugin, you can apply different attribution models to your goal conversions.
 

  2. Conversions are last non-direct touch
    https://matomo.org/faq/general/what-is-the-default-attribution-model-used-in-matomo/

Data that supports/contradicts these rules

 

  1. Our Direct visits data is significantly higher than GA. All data that we have supports that this rule is correct
  2. Our conversion data does NOT support this

What we can see is that:

  • A. CHANNEL TYPES do report on a last non-direct touch basis
  • B. ANY CAMPAIGN PARAMETER (Source | Medium | Contents | Names) does not report on a last non-direct touch basis; these reports are LAST TOUCH
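
To make the two terms concrete, here is a minimal sketch of how the same visit history gives different answers under each rule. The function names and the channel labels are illustrative assumptions, not Matomo code.

```python
DIRECT = "direct"


def last_touch(visits: list[str]) -> str:
    """Return the channel of the most recent visit, whatever it is."""
    return visits[-1]


def last_non_direct_touch(visits: list[str]) -> str:
    """Return the channel of the most recent visit that was not a direct entry.

    Falls back to direct when every visit in the history was direct.
    """
    for channel in reversed(visits):
        if channel != DIRECT:
            return channel
    return DIRECT


# Hypothetical history: one campaign visit followed by two direct visits.
journey = ["campaign", "direct", "direct"]
print(last_touch(journey))             # direct
print(last_non_direct_touch(journey))  # campaign
```

If the order is placed in the final direct visit, a last-touch report credits Direct Entry, while a last non-direct touch report credits the campaign.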

Why? How?

In the Matomo reporting interface:

  • the Channel type report uses the API method "Referrers.getReferrerType".
  • The campaign parameter type reports use methods such as: "MarketingCampaignsReporting.getSource" & "MarketingCampaignsReporting.getMedium"

Having examined the contents of these respective files, specifically looking for fields related to campaign activity, I can observe the following:
 

  • Referrers.getReferrerType contains ONLY Label (which is the Campaign Name)
  • MarketingCampaignsReporting.getSource contains ONLY Source
  • MarketingCampaignsReporting.getMedium contains ONLY Medium

What I assume is happening is:

  • #A1: Referrers.getReferrerType is correct (it is last non-direct touch): it allows for a count of the total ecommerceOrders for each campaign. I assume that this data is written to Matomo using the values contained in the _pk_ref cookie because, having examined the contents of that cookie, the ONLY campaign information present is the Campaign NAME
  • #A2: The MarketingCampaignsReporting.* queries do not use the _pk_ref cookie because they can't: it does not contain the additional campaign parameters. These reports can only use what they have available, which is essentially data from the Visits Log, and that is LAST TOUCH (see #1 above)

Re-cap:

Our aim is to get the following campaign reports for orders on a last non-direct touch basis:

  • Campaign Names
  • Campaign Keywords
  • Campaign Sources
  • Campaign Mediums
  • Campaign Contents
  • Campaign Source - Medium
  • Campaign Ids
  • Campaign Groups
  • Campaign Placements

A potential solution would be to extend the contents of the _pk_ref cookie to include these parameters and write them to either the Referrers.getReferrerType data set (or a new data set) then subsequently update the respective MarketingCampaignsReporting.* data sets to refer to this data.

Tasks

  1. Conduct a step-through analysis of the Matomo SQL queries to confirm/deny #A1 and #A2
  2. Propose & build a solution that meets our aim

Results

After thorough analysis of the SQL queries, I have come to the following conclusions:

#A1

  • The API call Referrers.getReferrerType runs 30+ queries in the DB. (I have stored all query logs and attached them in a zip file.)
  • Most of the queries are pretty much standard, like security checks or lookups.
  • The interesting query fetches data from an archive blob data table.
    • This means the calculation has already been done by core:archive.
    • Following that trail takes me to the Referrers plugin, which I think is a default plugin.
    • The code base is found here:
      protected function aggregateVisitRow($row)
  • In the Referrers plugin I found the function which gives a goal its attribution.
      Attributing the correct Referrer to this conversion.
      Priority order is as follows:
      0) In some cases, the campaign is not passed from the JS so we look it up from the current visit
      1) Campaign name/kwd parsed in the JS
      2) Referrer URL stored in the _ref cookie
      3) If no info from the cookie, attribute to the current visit referrer
    
  • I also checked the JS tracker, and there is functionality there that tries to detect and attribute Referrer information, based on the _ref cookie, check it out:
    • https://github.com/matomo-org/matomo/blob/5.x-dev/js/piwik.js#L3798
      • Some interesting notes:
        cookie 'ses' was not found: we consider this the start of a 'session'
        Detect the campaign information from the current URL
        Only if campaign wasn't previously set
        Or if it was set but we must attribute to the most recent one
        Note: we are working on the currentUrl before purify() since we can parse the campaign parameters in the hash tag
      
    • It sets some values into the cookie, such as:
      • campaignNameDetected // Name if detected
      • campaignKeywordDetected // Keyword if detected
      • referralTs // Current time the user visits the site
      • referralUrl
      • And some interesting notes:
        Store the referrer URL and time in the cookie;
        referral URL depends on the first or last referrer attribution
      
    • The cookie is set for 6 months
    • So in theory, the cookie should be respected and, if I’m not mistaken, the conversion should be attributed to the first referrer. It also seems to depend on some kind of setting/config:
      • configConversionAttributionFirstReferrer
      • I currently can’t find where its value is set.
  • I also checked the Goals plugin, which generates that Channel Type view. See file Goals-Archiver.sql.
    • Ecommerce —> Sales —> Channel Type
    • The query that builds that report looks into the log_conversion table, meaning that it shouldn’t need to know anything about Multi Channel Attribution, because all the information is already in that table:
      • referer_keyword
      • referer_name
      • referer_type
      • campaign_content
      • campaign_group
      • campaign_id
      • campaign_keyword
      • campaign_medium
      • campaign_name
      • campaign_placement
      • campaign_source
    • And if you look at the Campaign Mediums - See file: MarketingCampaignsReporting-Archiver.sql
    • Ecommerce —> Sales —> Campaign Mediums
      • It aggregates its report from matomo_log_visit and log_conversion
      • So I’m not quite sure how it’s supposed to work.
      • But potentially these reports are not respecting the last non-direct touch approach.
    • I also added logs from MultiChannelConversionAttribution-Archiver.sql
      • It is joining multiple tables in the queries.
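
The priority order quoted from the Referrers plugin above can be sketched roughly as follows. This is an illustrative model only: the `Referrer` class, the function name, and its parameters are assumptions, and step 2 is simplified to "treat the stored URL's host as a website referrer" rather than running any real referrer detection.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class Referrer:
    type: str          # e.g. "campaign", "website", "direct"
    name: str = ""
    keyword: str = ""


def resolve_conversion_referrer(
    visit_referrer: Referrer,
    cookie_campaign_name: Optional[str] = None,
    cookie_campaign_keyword: Optional[str] = None,
    cookie_referrer_url: Optional[str] = None,
) -> Referrer:
    """Pick the referrer for a conversion, following the quoted priority order."""
    # 0) No campaign passed by the tracker, but the current visit is a campaign.
    if not cookie_campaign_name and visit_referrer.type == "campaign":
        return visit_referrer
    # 1) Campaign name/keyword parsed by the JS tracker.
    if cookie_campaign_name:
        return Referrer("campaign", cookie_campaign_name, cookie_campaign_keyword or "")
    # 2) Referrer URL stored in the attribution cookie (simplified to a website).
    if cookie_referrer_url:
        host = urlparse(cookie_referrer_url).hostname or cookie_referrer_url
        return Referrer("website", host)
    # 3) Nothing in the cookie: fall back to the current visit's referrer.
    return visit_referrer


# A direct visit that still carries campaign data in the cookie.
print(resolve_conversion_referrer(Referrer("direct"), cookie_campaign_name="summer_sale"))
```

In this model a conversion made during a direct visit is still credited to the campaign as long as the tracker sends the stored campaign name with the request.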

These queries are very complex and hard to fully understand if one isn’t familiar with the datasets, and it’s too soon for me to assess whether it’s supposed to be like this and whether the reports are correct. It could be related to configuration: as I’ve learned from reading the code, there is that configConversionAttributionFirstReferrer setting, for example.

In the file called customer-journey.sql

I reproduced the steps we did together.

  1. I enter the site from a campaign URL.
  2. I browse around.
  3. I delete the _ses cookie and close the tab.
  4. I enter directly to the site and finish an order, which should convert.

And I can see in the last query that it attributes to my campaign. Which is the correct behavior, right? Last non direct touch.
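
The reproduction steps can be modelled with a small sketch of an attribution cookie that survives between sessions. Everything here is an assumption made for illustration: the `RefCookie` class, the function names, the `first_referrer` flag (standing in for configConversionAttributionFirstReferrer), and the 182-day approximation of the 6-month lifetime.

```python
from dataclasses import dataclass
from typing import Optional

SIX_MONTHS = 182 * 24 * 3600  # approximate cookie lifetime in seconds (assumption)


@dataclass
class RefCookie:
    campaign: str
    set_at: int  # timestamp at which the referrer was stored


def on_session_start(
    cookie: Optional[RefCookie],
    campaign_in_url: Optional[str],
    now: int,
    first_referrer: bool = False,
) -> Optional[RefCookie]:
    """Return the attribution cookie state after a new session starts."""
    if cookie is not None and now - cookie.set_at > SIX_MONTHS:
        cookie = None  # the stored referrer has expired
    if campaign_in_url is None:
        return cookie  # direct entry: keep whatever is stored
    if cookie is None or not first_referrer:
        return RefCookie(campaign_in_url, now)  # store the most recent campaign
    return cookie  # first-referrer mode keeps the original campaign


def attributed_campaign(cookie: Optional[RefCookie]) -> str:
    return cookie.campaign if cookie else "direct"


# Steps 1-4: campaign entry, then a new direct session an hour later that converts.
cookie = on_session_start(None, "newsletter", now=0)
cookie = on_session_start(cookie, None, now=3600)
print(attributed_campaign(cookie))  # newsletter
```

Under these assumptions the order in step 4 is credited to the campaign, which matches the result seen in the last query of customer-journey.sql.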

#A2

The MarketingCampaignsReporting plugin uses the logs, but the logs are updated with the referral data, so it should show the same behavior. But that doesn’t exclude the possibility that the queries are wrongly constructed.
What I mean is that the referrer data is the same and it is there, but for some reason the report is not respecting the last non-direct touch approach.

Conclusion:

We need Matomo’s core team to help us fully understand which report is correct and how we can get the data we need.

Could it be that the campaign info is missing from the inserts in:

$this->aggregateFromConversions(array("referer_type", "referer_name", "referer_keyword"));

queries.zip

@jorgeuos jorgeuos added Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. To Triage An issue awaiting triage by a Matomo core team member labels Jul 14, 2023
@michalkleiner
Contributor

Hi @jorgeuos,
thank you for opening such an elaborate and descriptive issue, I see you gave it a lot of thought and time and did as much investigation yourself as you could.

However, we don't provide support through the issue tracker, so unless we can clearly establish a hypothesis that a certain replicable behaviour is a bug, and provide replication steps, we can't go on too much investigating this ourselves as it could be related to configuration, reports setup, goals setup etc.

Could you perhaps try and formulate what you think is the issue in one or two sentences? We'd appreciate that! And, once again, we do appreciate a detailed report and your time put into it.

@michalkleiner
Contributor

@sgiehl any insights welcome here, thanks!

@michalkleiner michalkleiner added Waiting for user feedback Indicates the Matomo team is waiting for feedback from the author or other users. and removed To Triage An issue awaiting triage by a Matomo core team member labels Jul 27, 2023
@jorgeuos
Author

jorgeuos commented Aug 3, 2023

So we have an update.

I'm short on time right now, so I will probably update this issue tomorrow. But basically, our findings are as follows: when a user visits the site, they get a visitorId and browse around the site. This visit is attributed to a campaign or whatever referrer it gets.
At a later point, when that same user visits the site again as a direct visit and logs in to make a purchase, they get assigned a userId.
Now in theory, the same visit should now be treated with the userId as described here:
https://matomo.org/faq/general/how-are-requests-with-a-user-id-tracked/

When a User Logs In During a Visit

When a visitor connects to your website but is not initially logged in, their visit is associated with a Visitor ID by default. This is a unique identifier for that specific visit that is not attributed to a specific user. However, once that user logs into their account and you set a User ID for this visitor, then all actions such as page views are linked to the User ID and not the visitor ID. Any previously tracked action for this visitor before the user was logged in is also associated with this User ID.

But instead, the user gets a new visitorId, and that is why the attribution isn't attributed to the first referrer. And the conversion is attributed to a Direct visit.

Br, Jorge

@jorgeuos
Author

jorgeuos commented Aug 4, 2023

What exactly is the problem?

Last non-direct attribution on ecommerce orders does not work as expected.

What is the possible cause?

Imagine this use case:
[Screenshot: Screen Shot 2023-08-04 at 15 57 23]

Visit 1

  1. A new visitor comes to the site via an email campaign and their activity is correctly attributed to the campaign.
  2. They remain anonymous during this visit.
  3. Matomo generates a unique Visitor ID

Visit 2

  1. The same visitor comes to the site “directly” 1 hour later
  2. Matomo recognizes them from the Visitor ID set from Visit 1
  3. Their visit is attributed to DIRECT
  4. We see two visits for this visitor in their Visitor profile
  5. The visitor then logs in and a User ID is sent to Matomo (setUserId).
  6. 🔴 Matomo generates a new Visitor ID
  7. The Visitor completes an Order
  8. The pre- and post-login activity is all contained within this visit (Visit 2); the USER ID binds this activity together (good)
  9. 🔴 The USER ID is not assigned to Visit 1
  10. The attribution of the order to EMAIL is lost because it belongs to Visit 1, which has no relationship to the USER ID that made the booking in Visit 2 (bad)
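
The scenario above can be expressed as a small model. It only illustrates the hypothesis described in this comment (that attribution is looked up among visits sharing the same Visitor ID); it is not taken from Matomo's code, and the record layout, IDs, and function name are invented for the example.

```python
def last_non_direct_referrer(visits: list[dict], visitor_id: str) -> str:
    """Most recent non-direct referrer among the visits of one Visitor ID."""
    own_visits = [v for v in visits if v["visitor_id"] == visitor_id]
    for visit in reversed(own_visits):
        if visit["referrer"] != "direct":
            return visit["referrer"]
    return "direct"


visit_1 = {"visitor_id": "aaa111", "user_id": None, "referrer": "email"}

# Expected: the Visitor ID is kept after login, so Visit 1 stays reachable.
expected = [visit_1, {"visitor_id": "aaa111", "user_id": "u42", "referrer": "direct"}]

# Reported: a new Visitor ID appears after setUserId, so Visit 1 is orphaned.
reported = [visit_1, {"visitor_id": "bbb222", "user_id": "u42", "referrer": "direct"}]

print(last_non_direct_referrer(expected, "aaa111"))  # email
print(last_non_direct_referrer(reported, "bbb222"))  # direct
```

In the second case nothing links the order to Visit 1, which is the loss of attribution described in step 10.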

Key questions

Is it expected behaviour that a new Visitor is generated once a UserId is provided to Matomo?

  1. If YES, surely Visit 1 (all previous visits) should be assigned to the USER?
  2. If NO, then what is causing it to be regenerated?

Regards, Me and my client

@MatomoForumNotifications

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/multi-channel-conversion-attribution-last-non-direct-doesnt-work/52828/6

@jorgeuos
Author

I think that post 👆 could help us with getting the reports that we need.

@sgiehl
Member

sgiehl commented Aug 22, 2023

@jorgeuos I don't have enough time to think that all through in detail, but to me it sounds more like this would be a problem around the user id tracking. If you wouldn't use user ids I guess the attribution would be correct all over, is that right?

If setting a user id later in a visit creates a new visit, that sounds incorrect to me and might be what we should investigate. Can you confirm that?

@michalkleiner
Contributor

@sgiehl perhaps somewhat related to #21156 where assigning user ID also creates a new visitor?

@jorgeuos
Author

@sgiehl

Yes, that is my understanding.

However, the documentation says:

However, once that user logs into their account and you set a User ID for this visitor, then all actions such as page views are linked to the User ID and not the visitor ID.

https://matomo.org/faq/general/how-are-requests-with-a-user-id-tracked/

When a user is trying to finish a purchase, they get logged in. Shouldn't the referrer and attribution also be updated and associated with the userID and not the visitorID?

I will consult with my client if it is an option for them to turn off the userID. At least until the issue is resolved or until we find a workaround.

@sgiehl
Member

sgiehl commented Aug 22, 2023

Thanks for the response. I guess we at least need to investigate why the documentation and the code's behavior differ. One of them seems to be wrong in that case.

@sgiehl sgiehl added Bug For errors / faults / flaws / inconsistencies etc. and removed Waiting for user feedback Indicates the Matomo team is waiting for feedback from the author or other users. Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. labels Aug 22, 2023
@sgiehl sgiehl added this to the For Prioritization milestone Aug 22, 2023
@mattab
Member

mattab commented Sep 5, 2023

Thanks for the detailed issue, very appreciated.

@sgiehl Q: Is this maybe the same issue as #19927 ?

@jorgeuos
Author

jorgeuos commented Sep 6, 2023

Hi @mattab

Yes, indeed, the two problems look related. I'm sorry that I didn’t catch that sooner.

However, there seem to be several issues in the forum related to this one, and I understand that you are looking into it as best you can.

Please let me know if there is something I can do to make it smoother for you to troubleshoot and pinpoint the error.

A further break down:

  • The Core Issue: Both the issues concern Matomo's incorrect handling of visits when a UserId is set during the visit. The expected behavior is that, when setting a UserId, all before and after actions during the visit should be attributed to that UserId.
  • Repercussions:
    • The issue's impact is more pronounced in webshops or sites where user behavior before logging in is crucial.
    • This means that key insights regarding user interactions, such as products viewed but not purchased, can be lost.
    • Major decisions by companies could be based on inaccurate data. We might lose crucial insights.
  • Documentation vs. Reality: The behavior is contrary to what's documented, which states that all actions, pre and post login, are counted as one visit when a user logs in.
  • Possible Solution: Matomo could prioritize continuity by merging visits based on both VisitorId and UserId to prevent session fragmentation upon user login.
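
As a rough sketch of what merging on both identifiers could look like, the snippet below groups tracking requests that share either a Visitor ID or a User ID using a union-find structure. The request layout and IDs are hypothetical, and the approach assumes at least one request carries both the old Visitor ID and the new User ID (for example the login request); without such a link there is nothing to merge on.

```python
from collections import defaultdict


def stitch(requests: list[dict]) -> list[list[dict]]:
    """Group requests connected through a shared Visitor ID or User ID."""
    parent: dict = {}

    def find(key):
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    def union(a, b):
        parent[find(a)] = find(b)

    for request in requests:
        visitor_key = ("visitor", request["visitor_id"])
        find(visitor_key)  # register the visitor even without a User ID
        if request.get("user_id"):
            union(visitor_key, ("user", request["user_id"]))

    groups = defaultdict(list)
    for request in requests:
        groups[find(("visitor", request["visitor_id"]))].append(request)
    return list(groups.values())


requests = [
    {"visitor_id": "aaa111", "user_id": None, "referrer": "email"},    # Visit 1
    {"visitor_id": "aaa111", "user_id": None, "referrer": "direct"},   # Visit 2, before login
    {"visitor_id": "aaa111", "user_id": "u42", "referrer": "direct"},  # login request (assumed)
    {"visitor_id": "bbb222", "user_id": "u42", "referrer": "direct"},  # after the ID changed
]
print(len(stitch(requests)))  # 1
```

With all four requests in one group, the email referrer from Visit 1 is again reachable from the order placed after login.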

@jorgeuos
Author

Update: Based on recent discussions and observations:

  1. Impact of Removing UserId: By removing the UserId, we've witnessed a big impact on data. Notably, metrics such as the bounce rate, unique pageviews, and others have shown marked improvements. This adds further weight to the core issue identified earlier – the problematic handling of visits when a UserId is set.

  2. Alignment of Data Post Removal: After removing the UserId, we've observed that certain data metrics (like the “CAMPAIGN” bookings and “SOURCE MEDIUM” bookings) align closely. This supports the view that Matomo isn't treating UserId in the manner it should, thereby causing discrepancies.

  3. Broader Implications: We are in the process of conducting comparisons across all of our sites to see if this pattern is consistent. If all sites show the same thing, it will confirm that the problem is not limited to one site.

  4. Next Steps: We have found part of the problem, which is good. The next challenge is how to use UserId effectively: we need to fix how UserId is handled so that we get the right data while still tracking users properly.

Given these updates and the previously discussed concerns, any insights or recommended action steps from your end would be valuable. We're committed to collaborating closely on this, ensuring we arrive at a comprehensive solution.

@MatomoForumNotifications

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/setconversion-attributionfirstreferrer-is-not-working/53798/2

@AdamMcAddEm

My organization selected Matomo to replace UA because of their default support for last-touch non-direct attribution. As we poked around in the platform, we realized that this isn't what was happening at all in the system and that it was more of a last touch attribution setup. Then after a bit of searching, I discovered this thread and see that this is an issue that's been around and known for OVER A YEAR. We're incredibly upset that a feature that Matomo is advertising hasn't been available for over a year as we sit on the eve of our transition away from UA. Can we please pick this up? Or at the very least can Matomo stop advertising features that they don't have @mattab ?

[Image: Matomo Last Touch Non Direct issue V2]

@mattab
Member

mattab commented Jul 1, 2024

@AdamMcAddEm As far as we know, attribution works perfectly well in Matomo. We could be wrong but we still haven't been able to reproduce an issue.

The issue you are referring to is only a cosmetic issue, but I agree it's very annoying. This will be addressed ASAP in #19328

@AdamMcAddEm

@AdamMcAddEm As far as we know, attribution works perfectly well in Matomo. We could be wrong but we still haven't been able to reproduce an issue.

The issue you are referring to is only a cosmetic issue, but I agree it's very annoying. This will be addressed ASAP in #19328

Yes, I investigated again, and it does look like the attribution of revenue/conversions is working. GA would just also attribute the sessions so that our conversion rates would be more accurate.

Example
Visitor 1

  • Visit 1 - Email
  • Visit 2 - Direct
  • Visit 3 - Direct - Convert (Purchase)

Visitor 2

  • Visit 1 - Email
  • Visit 2 - Direct
  • Visit 3 - Direct - Convert (Purchase)

GA
Email - 2 purchases, 33% conversion rate (6 visits, 2 purchases)

Matomo
Email - 2 purchases, more than 50% conversion rate (4 visits, 2 purchases)

Let me know if I'm misunderstanding something.
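
The difference in conversion rate comes from the denominator, which a short calculation can show. The sketch assumes the two rules described earlier in this thread: a GA-style count where a direct visit inherits the last non-direct channel, and a Matomo-style count where each visit keeps its own channel, with both purchases credited to Email in either case. The function and variable names are illustrative.

```python
from collections import Counter

# Two visitors, each with the journey Email -> Direct -> Direct (purchase).
journeys = [
    ["email", "direct", "direct"],
    ["email", "direct", "direct"],
]
purchases = 2  # both credited to Email on a last non-direct touch basis


def visits_per_channel(journeys: list[list[str]], carry_over: bool) -> Counter:
    """Count visits per channel; with carry_over, direct visits inherit the last non-direct channel."""
    counts: Counter = Counter()
    for journey in journeys:
        last_known = "direct"
        for channel in journey:
            if channel != "direct":
                last_known = channel
            counts[last_known if carry_over else channel] += 1
    return counts


ga_style = visits_per_channel(journeys, carry_over=True)
matomo_style = visits_per_channel(journeys, carry_over=False)

print(ga_style["email"], f"{purchases / ga_style['email']:.0%}")          # 6 33%
print(matomo_style["email"], f"{purchases / matomo_style['email']:.0%}")  # 2 100%
```

Under these assumptions the purchase counts agree and only the number of visits credited to Email changes, which is why the Matomo-style rate comes out well above the GA-style one.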
