Email Protection Systems Generate Invalid Traffic #9798

mabumusa1 · 2021-03-17T20:56:28Z

Q	A
Mautic version	ANY
PHP version	ANY
Browser	ANY

Bug Description

This is not a Mautic bug in itself, but it affects Mautic significantly. Many email protection systems, such as https://www.proofpoint.com/us/products/email-security-and-protection/email-protection, do the following to emails sent by Mautic:

They collect all the emails.
They parse the emails and then follow all the links, including the tracking pixels (so every email sent is marked as read).
They scramble the UTM codes and then follow the links with the scrambled UTM codes, which corrupts the analytics.
The traffic comes from different IPs with different headers, including the User-Agent, which makes the invalid traffic hard to block.
To make matters worse, they follow the unsubscribe link and unsubscribe the whole list you sent to. Without a preference center, your entire list is marked as DNC.

I opened this thread for discussion, as there is no clear way to solve this.


kuzmany · 2021-03-18T07:04:04Z

We have had a lot of discussion about this in our company. We've talked about two solutions:

Write an algorithm to detect bot clicks based on timing thresholds and the number of clicks. This requires a lot of changes (a new is_bot column in email_stats and channel_url_trackables, etc.), and the results would be questionable.
This solution is similar to what HubSpot has already done (https://community.hubspot.com/t5/Email-Marketing-Tool/Are-Bots-Affecting-Your-Email/td-p/302428).
Invisible reCAPTCHA v3

Add a page with reCAPTCHA before the redirect and decide based on the score:

good score - redirect to the page
bad score - show a link so the visitor can go to the page manually
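
The score-based decision described above can be sketched roughly as follows. This is only an illustration, not Mautic code: the function name, the `Decision` type, and the 0.5 cutoff are assumptions, and the score is taken as an input that has already been obtained from reCAPTCHA v3 verification (where scores run from 0.0, likely a bot, to 1.0, likely a human).

```python
from dataclasses import dataclass

# Hypothetical cutoff; tune it against your own traffic.
SCORE_THRESHOLD = 0.5


@dataclass
class Decision:
    action: str       # "redirect" or "show_link"
    target_url: str   # the real destination of the tracked link


def decide(score: float, target_url: str,
           threshold: float = SCORE_THRESHOLD) -> Decision:
    """Choose what the intermediate page does with a verified score."""
    if score >= threshold:
        # Good score: continue with the normal tracked redirect.
        return Decision("redirect", target_url)
    # Bad score: do not redirect automatically; show a plain link
    # so a real person can still reach the page manually.
    return Decision("show_link", target_url)


if __name__ == "__main__":
    print(decide(0.9, "https://example.com/offer").action)
    print(decide(0.1, "https://example.com/offer").action)
```

Whether a low-score visit should also be excluded from click statistics is a separate choice that this sketch leaves open.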

I like this idea.

@mabumusa1 do you have any opinion?

YosuCadilla · 2021-03-19T03:40:14Z

@mabumusa1 I developed a Mautic plugin for a client who had this problem.

This type of software is usually used by large companies running their own email servers.

We tried to find an elegant, scientific method to solve the problem, such as identifying the browser agents and IPs, and we looked at the data for other possible ways to isolate the bad clicks or their producers. We tested a few; none worked consistently.

Browser agents change a lot. We identified a few browser agents doing a lot of damage, but in the next round of emails those had changed. Also, when you check the data over time, browser agents that were clearly harmful in one email were part of what looked like legitimate clicks in other emails. There is usually a large number of fake clicks coming from a handful of "bad browser agents", but that is maybe 50% of the total; the remaining bad agents make just one or a few clicks each, so patterns are hard to define. Maybe a good job for an AI.

I was also unsuccessful at isolating IPs. These are corporate servers behind corporate networks with reverse proxies, reaching the internet over a pool of IPs. In one extreme case, we had the same person (contact) click on a link from 5 different locations across the US, coast to coast, within a 10-second window.
The issue is that both legitimate and fake clicks usually come from the very same IPs, so no joy...

What ended up working decently well for us was adding an invisible link to all outgoing emails. Once a click on the invisible link happens, we check all the clicks in a 10-second window and eliminate all of them (we copy them to a different table for further analysis).
This works under the assumption that security bots/scanners click on all the links in an email, and it is eliminating (probably) well over 85% of the bad clicks. However, when we look at the data there are a few SMALL inconsistencies here and there, so the method is not perfect. Tweaking the time window as well as the position of the invisible link in the email allowed us to increase the effectiveness by 10%, so it is well worth dedicating some time to this.
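
A minimal sketch of the invisible-link filter described above, for illustration only. It is not the plugin's actual code: the `Click` record, the `HONEYPOT_URL` value, and the choice to match clicks per contact and per email are assumptions. Quarantined clicks are returned separately, mirroring the idea of copying them to another table instead of deleting them outright.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical URL of the invisible link placed in every outgoing email.
HONEYPOT_URL = "https://example.com/hidden"


@dataclass(frozen=True)
class Click:
    contact_id: int
    email_id: int
    url: str
    at: datetime


def split_clicks(clicks, window_seconds=10, honeypot_url=HONEYPOT_URL):
    """Return (kept, quarantined) lists of clicks.

    A click is quarantined when the same contact hit the invisible link
    in the same email within +/- window_seconds of that click.
    """
    window = timedelta(seconds=window_seconds)

    # Collect the times of invisible-link hits per (contact, email).
    hits = defaultdict(list)
    for c in clicks:
        if c.url == honeypot_url:
            hits[(c.contact_id, c.email_id)].append(c.at)

    kept, quarantined = [], []
    for c in clicks:
        times = hits.get((c.contact_id, c.email_id), [])
        if any(abs(c.at - t) <= window for t in times):
            quarantined.append(c)  # includes the invisible-link click itself
        else:
            kept.append(c)
    return kept, quarantined
```

As the thread notes, a scanner that skips the invisible link goes undetected, and a real click that lands inside the window is discarded along with the fake ones.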

After a few adjustments to the scripts, the company's CMO decided this method was doing the job well enough and that there was no need to double the development cost to squeeze out an extra 5% of reliability, so no further development or research was deemed necessary.
It has been working for a few months already. We get some surprises now and then, but nothing big enough to make us consider more research or new development for now.

If you ask me, this is an excellent problem for an AI. Finding patterns is what these excel at, so if we ever decide to improve the current scripts, I will strongly recommend training an AI on the data from the Mautic database to see what comes out.

Another thing that might change is the point in time at which we run the filters. Right now we run the scripts from a cron job, so the data first reaches the database and is then evaluated and removed if deemed wrong.
The next iteration, if it ever happens, will be implemented as an external, real-time pre-filter (probably at the Apache level), so the bad clicks never reach the database in the first place.

Interesting possibility with the Recaptcha @kuzmany, so basically every link would point to or be intercepted by a "proxy page" where the Recaptcha lives and then redirected to the real page, right?

kuzmany · 2021-03-19T07:56:01Z

@YosuCadilla thank you for sharing your experience.

What ended up working decently well for us was adding an invisible link to all outgoing emails. Once a click on the invisible link happens, we check all the clicks in a 10-second window and eliminate all of them... Tweaking the time window as well as the position of the invisible link in the email allowed us to increase the effectiveness by 10%, so it is well worth dedicating some time to this.

Did you increase or decrease the time threshold?
Does that mean that, after tweaking, the invisible link caught at least 90% of bot clicks?

Interesting possibility with the Recaptcha @kuzmany, so basically every link would point to or be intercepted by a "proxy page" where the Recaptcha lives and then redirected to the real page, right?

Yes. All URLs are tracked, so it is easy to add a page before the redirection routine (stats, redirect) and continue with the standard redirection once verification has passed.

YosuCadilla · 2021-03-19T11:26:40Z

Did you increase or decrease the time threshold?
There isn't a perfect number. Each click seems to take about one second (but this can vary for each destination domain), so the final timings depend on the number of links in the email and on the position of the invisible link relative to the rest of them.
For example, we ended up using +/- 10 seconds, because the invisible link is in the middle and there are 7-8 links in each email.
I think you can increase the number up to 20, 30 or even more seconds. The risk is that if the final recipient (the real person) happens to open the email and click a link within this time window, that click would be discarded. So the shorter the window the better, as long as it gives enough time to catch all the fake clicks.
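
The reasoning above about link count and position can be turned into a rough starting estimate. This is a back-of-the-envelope sketch, not a measured rule: it assumes a scanner follows links one after another at a steady pace, and the function name, the one-second default, and the safety margin are all hypothetical.

```python
def estimate_window_seconds(total_links: int, honeypot_position: int,
                            seconds_per_click: float = 1.0,
                            margin_seconds: float = 2.0) -> float:
    """Estimate a +/- window that covers every link in the email.

    honeypot_position is the 1-based position of the invisible link
    among total_links (the invisible link itself is counted).
    """
    if not 1 <= honeypot_position <= total_links:
        raise ValueError("honeypot_position must be within 1..total_links")
    links_before = honeypot_position - 1
    links_after = total_links - honeypot_position
    # The window must reach the link farthest from the invisible one.
    farthest = max(links_before, links_after)
    return farthest * seconds_per_click + margin_seconds


if __name__ == "__main__":
    # 8 links with the invisible link in the middle.
    print(estimate_window_seconds(8, 4))
    # Same email with the invisible link placed first.
    print(estimate_window_seconds(8, 1))
```

Placing the invisible link near the middle keeps the estimate small, which matches the trade-off described above: a shorter window discards fewer genuine clicks. The 10 seconds reported in the thread is more generous than this estimate for a similar layout, so measured data should take precedence.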

Does that mean that, after tweaking, the invisible link caught at least 90% of bot clicks?
Clarification about the % of true clicks (effectiveness): what we measured is the % of true/failed detections among the positives (emails with clicks on the invisible link), meaning how good or bad the script is at catching all the fake clicks once the invisible link is clicked. If a bot clicks on just one, a few, or all the links except the invisible link, we don't see anything at all (and that's why it is good enough but far from perfect).

However, our level of detected bad clicks matches what others described on the HubSpot thread, and the click ratios are now much more aligned with industry standards.

stale · 2021-06-22T21:31:53Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2021-07-07T01:06:17Z

This issue has been automatically closed because it has not had recent activity. If the reported issue persists, please create a new issue and link back to this one for reference. Thank you for your contributions.

adiux · 2021-09-10T10:05:40Z

I think this issue is important and we should discuss and address it in the community.

mautibot · 2021-09-10T10:08:48Z

This issue has been mentioned on Mautic Community Forums. There might be relevant details there:

https://forum.mautic.org/t/possible-work-around-for-reporting-open-and-clicks-without-bot-data/16989/13

kuzmany · 2021-10-05T07:39:16Z

We've already worked on a solution that places a reCAPTCHA page before the destination page.
I will report data when we get it.
This PR is part of it: #10503

stale · 2022-01-03T07:53:18Z

This issue or PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you would like to keep it open please let us know by replying and confirming that this is still relevant to the latest version of Mautic and we will try to get to it as soon as we can. Thank you for your contributions.

stale · 2022-01-17T08:04:10Z

This issue or PR has been automatically closed because it has not had recent activity. In the case of issues, if it persists in the latest version of Mautic, please create a new issue and link back to this one for reference. With PRs if you wish to pick up the PR and update it so that it can be considered for a future release, please comment and we will re-open it. Thank you for your contributions.

github-actions · 2022-01-17T08:04:31Z

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If this issue is continuing with the latest stable version of Mautic, please open a new issue that references this one.

mabumusa1 added the needs-triage For new issues/PRs that need to be triaged label Mar 17, 2021

mabumusa1 mentioned this issue Mar 17, 2021

Rate opening superior 100% #9778

Closed

stale bot added the stale Issues which have not received an update within 90 days label Jun 22, 2021

stale bot closed this as completed Jul 7, 2021

RCheesley mentioned this issue Sep 6, 2021

tracking on GMAIL is pointless as of now (proxy) #10409

Closed

1 task

kuzmany mentioned this issue Sep 9, 2021

Missing clickthrough in the URL query (PublicController.php line 452) #7841

Closed

adiux reopened this Sep 10, 2021

stale bot removed the stale Issues which have not received an update within 90 days label Sep 10, 2021

kuzmany mentioned this issue Sep 10, 2021

Fix PHP Notice - Undefined index: ct #10418

Merged

RCheesley mentioned this issue Sep 22, 2021

Tracking issue: multi-clicks and/or multi-downloads occur at the same time #10461

Closed

1 task

stale bot added the stale Issues which have not received an update within 90 days label Jan 3, 2022

stale bot closed this as completed Jan 17, 2022
