Migrate Google Analytics to Matomo from the ASF #576

Merged: 3 commits, Apr 22, 2022

rossturk
Contributor

Hello! This replaces the Google Analytics tracking tag on the airflow.apache.org website with a Matomo one from the ASF. I propose this because the ASF has recently updated its privacy policy, which now prohibits the use of Google Analytics.

There are three changes in this PR:

  1. Disabling GA in the Hugo config
  2. Replacing the tag in the Sphinx theme
  3. Replacing the tag in all of the prebuilt docs-archive content (about 80k pages!)

For #3, I used this script:
https://gist.github.com/rossturk/1223ae5d57fbcbb4bc32da0d49137ef2
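
For readers who cannot open the gist, here is a rough sketch of how a bulk replacement like #3 could be done. This is not the gist's actual code. The names `replace_tag` and `migrate_tree`, the regular expression, and the assumption that a page embeds GA as an inline `application/javascript` script immediately followed by the `analytics.js` loader are all illustrative. The Matomo snippet is the tag used in this PR.

```python
import re
from pathlib import Path

# Matomo tag as used in this PR.
MATOMO_TAG = """<!-- Matomo -->
<script>
  var _paq = window._paq = window._paq || [];
  _paq.push(['disableCookies']);
  _paq.push(['trackPageView']);
  _paq.push(['enableLinkTracking']);
  (function() {
    var u="https://analytics.apache.org/";
    _paq.push(['setTrackerUrl', u+'matomo.php']);
    _paq.push(['setSiteId', '13']);
    var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
    g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
  })();
</script>
<!-- End Matomo -->"""

# Assumed shape of the old tag: one inline script block directly followed by
# the async analytics.js loader. Other GA variants would need their own patterns.
GA_TAG = re.compile(
    r'[ \t]*<script type="application/javascript">'
    r"(?:(?!</script>).)*"  # script body, never crossing a closing tag
    r"</script>\s*"
    r"<script async src='https://www\.google-analytics\.com/analytics\.js'></script>",
    re.DOTALL,
)


def replace_tag(html: str) -> tuple[str, int]:
    """Swap every matched GA tag for MATOMO_TAG; return (new_html, match_count)."""
    # A function replacement avoids escape processing of the replacement text.
    return GA_TAG.subn(lambda _match: MATOMO_TAG, html)


def migrate_tree(root: Path) -> int:
    """Rewrite every .html file under root in place; return how many files changed."""
    changed = 0
    for page in root.rglob("*.html"):
        updated, count = replace_tag(page.read_text(encoding="utf-8"))
        if count:
            page.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Returning the match count makes it easy to log pages where nothing matched, which is where unhandled tag variants would show up.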

Signed-off-by: Ross Turk <ross@rossturk.com>
@potiuk
Member

potiuk commented Apr 22, 2022

You might have beaten me @rossturk in "the biggest change size" :D

I usually get up to 3M lines of code changed when I release providers, but this one is way bigger (see the number of files at the top):

[Screenshot from 2022-04-22 20-14-01]

@potiuk
Member

potiuk commented Apr 22, 2022

And this one when you try to review the single "third" commit:

[image]

Member

@potiuk potiuk left a comment

Looks good. Thanks @rossturk

I reviewed the first two commits, and they look good.

Then I prepared this file:

-    <script type="application/javascript">
-        var doNotTrack = false;
-        window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;
-        ga('create', 'UA-140539454-1', 'auto');
-        ga('send', 'pageview');
-    </script>
-    <script async src='https://www.google-analytics.com/analytics.js'></script>
+<!-- Matomo -->
+<script>
+  var _paq = window._paq = window._paq || [];
+  _paq.push(['disableCookies']);
+  _paq.push(['trackPageView']);
+  _paq.push(['enableLinkTracking']);
+  (function() {
+    var u="https://analytics.apache.org/";
+    _paq.push(['setTrackerUrl', u+'matomo.php']);
+    _paq.push(['setSiteId', '13']);
+    var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+    g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+  })();

And ran this command on the last commit:

git diff -U0 HEAD^ | grep '^[+-]' | grep -vF -f /tmp/file.txt

This yielded no results.

This means that the only changes in the commit are the lines replacing GA with Matomo.

UPDATE: found a bug - running it again

@potiuk
Member

potiuk commented Apr 22, 2022

OK. This works :)

(and this time I tested if it actually works by removing some lines from the pattern file).

 git diff -U0 HEAD^ | grep '^[+-]' | grep -vF -x -f /tmp/file.txt | grep -v '^---' | grep -v '^+++' | wc
      0       0       0

The pattern file:

-    <script type="application/javascript">
-        var doNotTrack = false;
-        window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;
-        ga('create', 'UA-140539454-1', 'auto');
-        ga('send', 'pageview');
-    </script>
-    <script async src='https://www.google-analytics.com/analytics.js'></script>
+<!-- Matomo -->
+<script>
+  var _paq = window._paq = window._paq || [];
+  _paq.push(['disableCookies']);
+  _paq.push(['trackPageView']);
+  _paq.push(['enableLinkTracking']);
+  (function() {
+    var u="https://analytics.apache.org/";
+    _paq.push(['setTrackerUrl', u+'matomo.php']);
+    _paq.push(['setSiteId', '13']);
+    var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+    g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+  })();
+</script>
+<!-- End Matomo -->
-
-<script type="application/javascript">
-window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;
-ga("create", "UA-140539454-1", "auto");
-ga("send", "pageview");
-<script async src="https://www.google-analytics.com/analytics.js"></script>
-  <div class="footer">This page uses <a href="https://analytics.google.com/">
-    Google Analytics</a> to collect statistics. You can disable it by blocking
-    the JavaScript coming from www.google-analytics.com. Check our
-    <a href="privacy_notice.html">Privacy Policy</a>
-    for more details.
-    <script type="text/javascript">
-      (function() {
-        var ga = document.createElement('script');
-        ga.src = ('https:' == document.location.protocol ?
-          'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
-        ga.setAttribute('async', 'true');
-        var nodes = document.documentElement.childNodes;
-        var i = -1;
-        var node;
-        do {
-          i++;
-          node = nodes[i]
-        } while(node.nodeType !== Node.ELEMENT_NODE);
-        node.appendChild(ga);
-      })();
-  </div>
-    <script type="text/javascript">
-    var _gaq = _gaq || [];
-    _gaq.push(['_setAccount', 'UA-140539454-1']);
-    _gaq.push(['_trackPageview']);
-  </script>
-    <a href="../../../../../privacy_notice.html">Privacy Policy</a>
-    <a href="../../../../privacy_notice.html">Privacy Policy</a>
-    <a href="../../../privacy_notice.html">Privacy Policy</a>
-    <a href="../../privacy_notice.html">Privacy Policy</a>
-    <a href="../privacy_notice.html">Privacy Policy</a>
-    <a href="#">Privacy Policy</a>
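
The check above boils down to: take every added or removed line in the diff, drop the `---`/`+++` file headers, and confirm that each remaining line is exactly equal to a line in the pattern file. A rough Python equivalent of that idea is sketched below. The function name `unexpected_changes`, the sample diff, and the file path in it are made up for illustration; the comparison is meant to mirror `grep -F -x` (fixed strings, whole-line match).

```python
def unexpected_changes(diff_text: str, allowed_lines: set[str]) -> list[str]:
    """Return changed diff lines that are not exact matches for an allowed line."""
    leftovers = []
    for line in diff_text.splitlines():
        # Keep only added/removed lines, like grep '^[+-]'.
        if not line.startswith(("+", "-")):
            continue
        # Skip file headers, like grep -v '^---' | grep -v '^+++'.
        if line.startswith(("---", "+++")):
            continue
        # Whole-line comparison, like grep -vF -x -f pattern_file.
        if line not in allowed_lines:
            leftovers.append(line)
    return leftovers


# Made-up sample: two allowed lines and one line that only contains an allowed string.
allowed = {"-        ga('send', 'pageview');", "+<!-- Matomo -->"}
diff = "\n".join([
    "--- a/docs-archive/index.html",  # hypothetical path
    "+++ b/docs-archive/index.html",
    "@@ -10 +10 @@",
    "-        ga('send', 'pageview');",
    "+<!-- Matomo -->",
    "+<!-- Matomo --> plus something unexpected",
])
print(unexpected_changes(diff, allowed))
# prints ['+<!-- Matomo --> plus something unexpected']
```

With substring matching (fixed strings but no whole-line requirement), the last sample line would have been accepted, because it contains an allowed string. An empty result therefore only proves something when whole lines are compared.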

@potiuk potiuk merged commit b75fcd3 into apache:main Apr 22, 2022
@potiuk
Member

potiuk commented Apr 22, 2022

Can you please double-check that it works once the site is "built", @rossturk?

@rossturk
Contributor Author

@potiuk absolutely - I'll watch for it 👍

@potiuk
Member

potiuk commented Apr 22, 2022

Built!

@rossturk
Contributor Author

The docs look good to me.

But I seem to have missed something on the landing pages. The GA script is still there, though without the correct UA- tag, probably because that line in config.toml needs to be commented out. Also, the Sphinx changes don't seem to be taking effect.

I will look into it and do a bit more local testing, then submit another PR in a short while 👍
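
One way to make this kind of post-build check repeatable is to scan the generated HTML for leftover Google Analytics references and for the Matomo tracker URL. Below is a minimal sketch, assuming the built site is a directory of `.html` files. The names `page_problems` and `audit_site` and the choice of marker strings are illustrative and not part of the airflow-site tooling.

```python
from pathlib import Path

# Marker strings taken from the tags shown in this PR.
GA_MARKERS = ("google-analytics.com", "UA-140539454-1")
MATOMO_MARKER = "https://analytics.apache.org/"


def page_problems(html: str) -> list[str]:
    """List the analytics problems found in one page's HTML."""
    problems = []
    if any(marker in html for marker in GA_MARKERS):
        problems.append("ga_still_present")
    if MATOMO_MARKER not in html:
        problems.append("matomo_missing")
    return problems


def audit_site(root: Path) -> dict[str, list[str]]:
    """Map each problematic page (path relative to root) to its problems."""
    report = {}
    for page in sorted(root.rglob("*.html")):
        html = page.read_text(encoding="utf-8", errors="replace")
        problems = page_problems(html)
        if problems:
            report[page.relative_to(root).as_posix()] = problems
    return report
```

An empty report from `audit_site` would mean every scanned page carries the Matomo URL and none carries a GA marker. It says nothing about whether the tracker actually loads in a browser.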

@potiuk
Member

potiuk commented Apr 22, 2022

Better before the 2.3.0 release :D or the next providers release :)

@potiuk
Member

potiuk commented Apr 22, 2022

And HUUUGE thanks for it! You wrote on the devlist that it was a fun task; I hope it is :)
