Do Not Track support #4046

Merged
merged 16 commits into from May 30, 2018
Changes from 3 commits
53 changes: 49 additions & 4 deletions docs/advertising-details.rst
@@ -110,13 +110,50 @@ However, we always give advance notice in our issue tracker
and via email about showing ads where none were shown before.


.. _do-not-track:

Do Not Track Policy
Member

Not sure this is the right part of the docs for this. It feels like it should probably be its own thing, since it applies to RTD itself, and not just ads.

I realize there are ad-specific things, so maybe a DNT page in the docs, and then also a section here?

Member

Perhaps having it in the Privacy Policy is enough as an additional section. I guess it depends how heavily we want to promote the fact that we support it.

Contributor Author

I think a section in the privacy policy is good and the advertising details will link there.

Member

👍

-------------------

Read the Docs supports Do Not Track (DNT) and respects users' tracking preferences.
Specifically, we support the `W3C's tracking preference expression`_
and the `EFF's DNT Policy`_.

This means:

* We **do not** do behavioral ad targeting regardless of your DNT preference.
You probably already knew that from reading the rest of this document.
* When DNT is enabled, both logged-in and logged-out users
are considered opted-out of analytics.
* Regardless of DNT preference, our logs that contain IP addresses
and user agent strings are deleted after 10 days unless a DNT exception applies.
* Our full DNT policy is `available here`_.

For more details about DNT, visit `All About Do Not Track`_.

Our DNT policy applies without reservation to ``readthedocs.org``.
A best effort is made to apply this to documentation sites hosted for authors
(typically ``*.readthedocs.io``, but also other domains),
but we do not have complete control over the contents of these sites.

.. _W3C's tracking preference expression: https://www.w3.org/TR/tracking-dnt/
.. _EFF's DNT Policy: https://www.eff.org/issues/do-not-track
.. _available here: https://readthedocs.org/.well-known/dnt-policy.txt
.. _All About Do Not Track: http://www.allaboutdnt.com

.. important::

Due to the nature of our environment where documentation is built as necessary,
the analytics opt-out only applies to documentation sites built after May 1, 2018.


.. _advertising-analytics:

Analytics
---------

Analytics are a sensitive enough issue that they require their own section.
In the spirit of full transparency, Read the Docs currently uses Google Analytics (GA).
In the spirit of full transparency, Read the Docs uses Google Analytics (GA).

GA is a contentious issue inside Read the Docs and in our community.
Some users are very sensitive and privacy conscious to usage of GA.
@@ -125,14 +162,22 @@ The developers at Read the Docs understand that different users have different p
and we try to respect the different viewpoints as much as possible while also accomplishing
our own goals.

We have taken steps to address some of the privacy concerns surrounding GA.
These steps apply both to analytics collected by Read the Docs and when
:doc:`authors enable analytics on their docs <guides/google-analytics>`.

* Users can opt-out of analytics by using the Do Not Track feature of their browser.
* Read the Docs instructs Google to anonymize IPs sent to them before they are stored.
* The cookies set by GA expire more rapidly (30 days) than the default.
Member

"We configured the cookies set by GA to only last 30 days, instead of the default of 2 years" reads better.

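For reference, the two durations compared in the comment above can be checked with a little arithmetic. This is only a sketch: the variable names are illustrative, and the two-year figure assumes 365-day years.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Value passed as 'cookieExpires' in this PR (30 days, in seconds)
configured_expiry = 30 * SECONDS_PER_DAY

# Two-year default mentioned in the review comment (assuming 365-day years)
default_expiry = 2 * 365 * SECONDS_PER_DAY

print(configured_expiry)  # 2592000
print(default_expiry)     # 63072000
```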

Why we use analytics
~~~~~~~~~~~~~~~~~~~~

Advertisers ask us questions that are easily answered with an analytics solution like
"how many users do you have in Switzerland browsing Python docs?". We need to be able
to easily get this data. We also use data from GA for some development decisions such
as what browsers to support (or not) or how much usage a particular page or feature gets.

We have taken steps to address some of the privacy concerns.
Read the Docs instructs Google to anonymize IPs sent to them before they are stored.

Alternatives
~~~~~~~~~~~~

2 changes: 1 addition & 1 deletion docs/ethical-advertising.rst
@@ -96,7 +96,7 @@ Additional details

* We have additional documentation on the
:doc:`technical details of our advertising <advertising-details>`
including our use of analytics.
including our Do Not Track policy and our use of analytics.
* We have an `advertising FAQ`_ written for advertisers.
* We have gone into more detail about our views in our
`blog post <https://blog.readthedocs.com/ads-on-read-the-docs/>`_ about this topic.
10 changes: 9 additions & 1 deletion docs/guides/google-analytics.rst
@@ -9,4 +9,12 @@ You can enable it by:

Once your documentation rebuilds it will include your Analytics tracking code and start sending data.
Google Analytics usually takes 60 minutes,
and sometimes can take up to a day before it starts reporting data.
and sometimes can take up to a day before it starts reporting data.

.. note::

Read the Docs takes some extra precautions with analytics to protect user privacy.
As a result, users with Do Not Track enabled will not be counted
for the purpose of analytics.

For more details, see our :ref:`Do Not Track Policy <do-not-track>`.
59 changes: 33 additions & 26 deletions media/javascript/readthedocs-analytics.js
@@ -2,33 +2,40 @@
// https://docs.readthedocs.io/en/latest/advertising-details.html#analytics


// RTD Analytics Code
// Skip analytics for users with Do Not Track enabled
Member

you can use something like this script to check if do not track is enabled

Contributor Author

That entire script is basically to handle an IE10 bug. I'm not sure it's worth the effort.

Member

I think it should be in a script or a function that we can call from any script. Regarding IE, we may still support IE10.

Contributor Author

IE10 is not supported by Microsoft with a couple exceptions and it is a tiny fraction of our users (sub-0.1%). I don't think it is unreasonable to not support a privacy feature for users who are using a browser unsupported by the vendor. In addition, the "support" that the linked script offers is mostly just to mark IE10 as "unspecified" for DNT which for our purpose would be off.

I lean toward simplicity here.

Member

Understood. Then we can keep (window.doNotTrack === '1' || navigator.doNotTrack === '1') in a function and call it from everywhere!

Contributor Author

Actually, after testing this I'm reconsidering. It looks like IE11 on Windows 7 and Windows 8 set the DNT default to on.

Contributor Author

Considering that to use this script would mean marking IE11 and IE10 as having DNT as "unspecified" I'm leaning toward maybe just checking navigator.doNotTrack === '1' and that's it. This would mean that no versions of IE can opt-out of tracking. It would be supported in MS Edge, however.

Contributor Author

> we can keep (window.doNotTrack === '1' || navigator.doNotTrack === '1') in a function and call it from everywhere!

Not very easily actually. readthedocs-analytics.js is loaded on the docs pages and should not have any outside dependencies apart from READTHEDOCS_DATA.

Contributor Author

I updated the PR to just check navigator.doNotTrack === '1'.

if (window.doNotTrack === '1' || navigator.doNotTrack === '1') {
console.log('Respecting DNT with respect to analytics...');
} else {
// RTD Analytics Code
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
if (typeof READTHEDOCS_DATA !== 'undefined') {
if (READTHEDOCS_DATA.global_analytics_code) {
ga('create', READTHEDOCS_DATA.global_analytics_code, 'auto', 'rtfd', {
'cookieExpires': 30 * 24 * 60 * 60
});
ga('rtfd.set', 'dimension1', READTHEDOCS_DATA.project);
ga('rtfd.set', 'dimension2', READTHEDOCS_DATA.version);
ga('rtfd.set', 'dimension3', READTHEDOCS_DATA.language);
ga('rtfd.set', 'dimension4', READTHEDOCS_DATA.theme);
ga('rtfd.set', 'dimension5', READTHEDOCS_DATA.programming_language);
ga('rtfd.set', 'dimension6', READTHEDOCS_DATA.builder);
ga('rtfd.set', 'anonymizeIp', true);
ga('rtfd.send', 'pageview');
}

if (typeof READTHEDOCS_DATA !== 'undefined') {
if (READTHEDOCS_DATA.global_analytics_code) {
ga('create', READTHEDOCS_DATA.global_analytics_code, 'auto', 'rtfd');
ga('rtfd.set', 'dimension1', READTHEDOCS_DATA.project);
ga('rtfd.set', 'dimension2', READTHEDOCS_DATA.version);
ga('rtfd.set', 'dimension3', READTHEDOCS_DATA.language);
ga('rtfd.set', 'dimension4', READTHEDOCS_DATA.theme);
ga('rtfd.set', 'dimension5', READTHEDOCS_DATA.programming_language);
ga('rtfd.set', 'dimension6', READTHEDOCS_DATA.builder);
ga('rtfd.set', 'anonymizeIp', true);
ga('rtfd.send', 'pageview');
// User Analytics Code
if (READTHEDOCS_DATA.user_analytics_code) {
ga('create', READTHEDOCS_DATA.user_analytics_code, 'auto', 'user', {
'cookieExpires': 30 * 24 * 60 * 60
});
ga('user.set', 'anonymizeIp', true);
ga('user.send', 'pageview');
}
// End User Analytics Code
}

// User Analytics Code
if (READTHEDOCS_DATA.user_analytics_code) {
ga('create', READTHEDOCS_DATA.user_analytics_code, 'auto', 'user');
ga('user.set', 'anonymizeIp', true);
ga('user.send', 'pageview');
}
// End User Analytics Code
// end RTD Analytics Code
}

// end RTD Analytics Code
16 changes: 15 additions & 1 deletion readthedocs/core/views/__init__.py
@@ -12,7 +12,7 @@
import logging

from django.conf import settings
from django.http import HttpResponseRedirect, Http404
from django.http import HttpResponseRedirect, Http404, JsonResponse
from django.shortcuts import render, get_object_or_404, redirect
from django.views.decorators.csrf import csrf_exempt
from django.views.generic import TemplateView
@@ -115,3 +115,17 @@ def server_error_404(request, exception, template_name='404.html'): # pylint: d
r = render(request, template_name)
r.status_code = 404
return r


def do_not_track(request):
dnt_header = request.META.get('HTTP_DNT')

# https://w3c.github.io/dnt/drafts/tracking-dnt.html#status-representation
return JsonResponse({ # pylint: disable=redundant-content-type-for-json-response
'policy': 'https://docs.readthedocs.io/en/latest/privacy-policy.html',
'same-party': [
'readthedocs.org',
'readthedocs.io',
],
'tracking': 'N' if dnt_header == '1' else 'T',
}, content_type='application/tracking-status+json')
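
The status logic in the view above can be exercised without Django. The following is a minimal sketch, assuming a hypothetical helper named `tracking_status` (not part of this PR) that takes the raw value of the `DNT` request header and returns the same payload the view serializes.

```python
import json


def tracking_status(dnt_header):
    """Hypothetical helper mirroring the payload built by do_not_track()."""
    return {
        'policy': 'https://docs.readthedocs.io/en/latest/privacy-policy.html',
        'same-party': [
            'readthedocs.org',
            'readthedocs.io',
        ],
        # 'N' (not tracking) only when the header is exactly '1';
        # a missing header or any other value reports 'T' (tracking).
        'tracking': 'N' if dnt_header == '1' else 'T',
    }


# DNT: 1 sent by the browser
print(json.dumps(tracking_status('1'), indent=2))
# No DNT header sent (request.META.get() would return None)
print(json.dumps(tracking_status(None), indent=2))
```

Note that the comparison is strict, so a header value of `'0'` or an absent header are both treated as "tracking".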
2 changes: 2 additions & 0 deletions readthedocs/settings/base.py
@@ -59,6 +59,8 @@ class CommunityBaseSettings(Settings):
SESSION_COOKIE_DOMAIN = 'readthedocs.org'
SESSION_COOKIE_HTTPONLY = True
CSRF_COOKIE_HTTPONLY = True
# See: docs/advertising-details.rst
CSRF_COOKIE_AGE = None # session cookie (expires on browser quit)
Member

This will break user experience

Contributor Author

How so?

Member

If you close the browser and then restore the window, the page will be loaded from cache, but the CSRF cookie will not be there, so a form submission may fail with a CSRF error. Maybe you can try using Django session CSRF?

Contributor Author

This is potentially true and is warned about in the Django docs. I don't think it will be a big issue for us since it only affects form submissions of which the only ones on a non-authed page are the login/signup forms. However, we could make the CSRF cookie age match the logged in cookie age (~2 weeks) to mitigate it. Thoughts?

Contributor Author

I set it to 30 days so it matches the GA cookie. I think that's a reasonable balance so it's pretty obvious we aren't using it to track users.
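
The 30-day value described in the last comment might look roughly like this in a Django settings module. This is a sketch rather than the actual follow-up commit: the constant name `THIRTY_DAYS_IN_SECONDS` is hypothetical, and the hunk shown above still has the earlier `None` value.

```python
# Hypothetical constant; Django expects CSRF_COOKIE_AGE in seconds.
THIRTY_DAYS_IN_SECONDS = 30 * 24 * 60 * 60

# None would make this a session cookie (expires on browser quit),
# which is the behavior the reviewer warned about.
CSRF_COOKIE_AGE = THIRTY_DAYS_IN_SECONDS
```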


# Application classes
@property
42 changes: 26 additions & 16 deletions readthedocs/templates/base.html
@@ -17,22 +17,32 @@

<!-- Google Analytics -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', '{{ GLOBAL_ANALYTICS_CODE }}', 'auto', 'rtfd');
ga('rtfd.set', 'anonymizeIp', true);
ga('rtfd.send', 'pageview');

{% if DASHBOARD_ANALYTICS_CODE %}
// Dashboard Analytics Code
ga('create', '{{ DASHBOARD_ANALYTICS_CODE }}', 'auto', 'user');
ga('user.set', 'anonymizeIp', true);
ga('user.send', 'pageview');
// End Dashboard Analytics Code
{% endif %}
if (window.doNotTrack === '1' || navigator.doNotTrack === '1') {
console.log('Respecting DNT with respect to analytics...');
} else {
// For more details on analytics at Read the Docs, please see:
// https://docs.readthedocs.io/en/latest/advertising-details.html#analytics
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', '{{ GLOBAL_ANALYTICS_CODE }}', 'auto', 'rtfd', {
'cookieExpires': 30 * 24 * 60 * 60
});
ga('rtfd.set', 'anonymizeIp', true);
ga('rtfd.send', 'pageview');

{% if DASHBOARD_ANALYTICS_CODE %}
// Dashboard Analytics Code
ga('create', '{{ DASHBOARD_ANALYTICS_CODE }}', 'auto', 'user', {
'cookieExpires': 30 * 24 * 60 * 60
});
ga('user.set', 'anonymizeIp', true);
ga('user.send', 'pageview');
// End Dashboard Analytics Code
{% endif %}
}
</script>
<!-- End Google Analytics -->
