### E10S Experiment Aurora: Top extensions

[Bug 1224518](https://bugzilla.mozilla.org/show_bug.cgi?id=1224518)

This analysis lists the top extensions in the Telemetry pings and compares them to the [whitelisted e10s addon list](https://wiki.mozilla.org/Electrolysis/Addons).

In [1]:
from __future__ import division

import ujson as json
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import plotly.plotly as py
import IPython

from moztelemetry.spark import get_pings, get_one_ping_per_client, get_pings_properties
from montecarlino import grouped_permutation_test

%pylab inline
IPython.core.pylabtools.figsize(16, 7)

Unable to parse whitelist (/home/hadoop/anaconda/lib/python2.7/site-packages/moztelemetry/bucket-whitelist.json). Assuming all histograms are acceptable.
Populating the interactive namespace from numpy and matplotlib


In [2]:
sc.defaultParallelism

16

#### Whitelisted addons

In [3]:
# Fetched from https://wiki.mozilla.org/Electrolysis/Addons
# Some entries have been commented out because the addon ID could not be determined.
whitelisted_addons = {
    # Broken
    "YoutubeDownloader@PeterOlayev.com":                "1-Click YouTube Video Downloader",
    "wrc@avast.com":                                    "Avast Online Security",
    "abs@avira.com":                                    "Avira Browser Safety",
    "translator@zoli.bod":                              "Google Translator for Firefox",
    "firefox@ghostery.com":                             "Ghostery",
    "{3d7eb24f-2740-49df-8937-200b1cc08f8a}":           "Flashblock",
    "{635abd67-4fe9-1b23-4f01-e679fa7484c1}":           "Yahoo! Toolbar",
    "{7affbfae-c4e2-4915-8c0f-00fa3ec610a1}":           "Aol Toolbar",
    "jid1-F9UJ2thwoAm5gQ@jetpack":                      "Lightbeam",
    "{1BC9BA34-1EED-42ca-A505-6D2F1A935BBB}":           "IE Tab 2",

    # Somewhat working/uses CPOWs
    "{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}":           "Adblock Plus",
    "adblockpopups@jessehakanen.net":                   "Adblock Plus Pop-up Addon",
    "avg@toolbar":                                      "AVG SafeGuard toolbar",
    "elemhidehelper@adblockplus.org":                   "Element Hiding Helper for Adblock Plus",
    "{73a6fe31-595d-460b-a920-fcc0f8843232}":           "NoScript",
    "{19503e42-ca3c-4c27-b1e2-9cdb2170ee34}":           "FlashGot",
    "mozilla_cc2@internetdownloadmanager.com":          "IDM CC",
    "yasearch@yandex.ru":                               "Yandex Elements",
    "support@lastpass.com":                             "LastPass",
    "{a0d7ccb3-214d-498b-b4aa-0e8fda9a7bf7}":           "WOT",
    "artur.dubovoy@gmail.com":                          "Flash Video Downloader - YouTube HD Download [4K]",
    "onepassword4@agilebits.com":                       "1Password",

    # Totally working
    "cck2wizard@kaply.com":                             "CCK2",
    "firebug@software.joehewitt.com":                   "Firebug",
    "{2b10c1c8-a11f-4bad-fe9c-1c11e82cac42}":           "uBlock",
    "{46551EC9-40F0-4e47-8E18-8E5CF550CFB8}":           "Stylish",
    "{dc572301-7619-498c-a57d-39143191b318}":           "Tab Mix Plus",
    "jid1-YcMV6ngYmQRA2w@jetpack":                      "Pin It button",
    "{e4f94d1e-2f53-401e-8885-681602c0ddd8}":           "McAfee Security Scan Plus",
    "https-everywhere@eff.org":                         "HTTPS-Everywhere",
    "url_advisor@kaspersky.com":                        "Kaspersky URL Advisor",
    "abb@amazon.com":                                   "Amazon 1Button App for Firefox",
    "{a7c6cf7f-112c-4500-a7ea-39801a327e5f}":           "FireFTP",
    "personas@christopher.beard":                       "Personas Plus",
    "mozsocial.cliqz.com@services.mozilla.org":         "Cliqz",
    "{6c28e999-e900-4635-a39d-b1ec90ba0c0f}":           "Download Status Bar",
    "{e4a8a97b-f2ed-450b-b12d-ee082ba24781}":           "Greasemonkey",
    #                                                   "United Internet Addons",
    "verticaltabs@mozilla.com":                         "Vertical Tabs",
    "{b9db16a4-6edc-47ec-a1f4-b86292ed211d}":           "Video DownloadHelper",
    "foxmarks@kei.com":                                 "Xmarks",
    "{0545b830-f0aa-4d7e-8820-50a4629a56fe}":           "ColorfulTabs",
    "{b9bfaf1c-a63f-47cd-8b9a-29526ced9060}":           "Download YouTube Videos as MP4",
    "{DDC359D1-844A-42a7-9AA1-88A850A938A8}":           "DownThemAll!",
    "{1018e4d6-728f-4b20-ad56-37578a4de76b}":           "Flagfox",
    "extension@one-tab.com":                            "OneTab",
    "feca4b87-3be4-43da-a1b1-137c24220968@jetpack":     "YouTube Video and Audio Downloader",

    # Other
    "firefox@mega.co.nz":                               "MEGA",
    #                                                   "Norton Toolbar",
    "{4ED1F68A-5463-4931-9384-8FFF5ED91D92}":           "McAfee WebAdvisor",
    "vb@yandex.ru":                                     "Yandex Visual Bookmarks",
    "{ef4e370e-d9f0-4e00-b93e-a4f274cfdd5a}":           "FoxTab",
    #                                                   "LogMeIn Remote Access",
    "{195A3098-0BD5-4e90-AE22-BA1C540AFD1E}":           "Garmin Communicator",
    #                                                   "IBM CCK",
    "{fe272bd1-5f76-4ea4-8501-a05d35d823fc}":           "Adblock Edge",
    "{87F8774F-B485-47E2-A755-A40A8A5E8874}":           "GBBD Banco Santander (Brasil) S.A."
}

#### Get addons

In [5]:
dataset = sqlContext.load("s3://telemetry-parquet/e10s-experiment/generationDate=20151117", "parquet")

Transform the DataFrame into an RDD of pings.

In [6]:
def row_2_ping(row):
    ping = {"environment": {"addons": json.loads(row.addons)}}
    return ping
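
To make the shape of the resulting ping concrete, here is a minimal local sketch. It assumes the `addons` column holds a JSON string laid out like the `environment.addons` section of a Telemetry ping; the `FakeRow` namedtuple and the sample addon entry are hypothetical stand-ins for a real Spark `Row`, and the standard-library `json` module is used in place of `ujson`.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for a Spark Row that has a single `addons` column.
FakeRow = namedtuple("FakeRow", ["addons"])

def row_2_ping(row):
    # Decode the JSON string and nest it the way the later cells expect.
    return {"environment": {"addons": json.loads(row.addons)}}

# Made-up sample row, for illustration only.
sample_row = FakeRow(
    addons='{"activeAddons": {"firebug@software.joehewitt.com": {"name": "Firebug"}}}'
)
sample_ping = row_2_ping(sample_row)
active = sample_ping["environment"]["addons"]["activeAddons"]
```

Only the addons data is carried over, so the later cells can read `ping["environment"]["addons"]` without depending on the Parquet schema.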

In [7]:
subset = dataset.rdd.map(row_2_ping)
subset_count = subset.count()

In [8]:
subset_count

93465

In [9]:
def ping_has_addons(ping, check_id_func):
    # True only if the ping has at least one active addon and every
    # active addon ID passes check_id_func.
    activeAddons = ping["environment"]["addons"].get("activeAddons", {})
    if not activeAddons:
        return False
    return all(check_id_func(k) for k in activeAddons)
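
Despite its name, `ping_has_addons` checks that *every* active addon passes the predicate, and it returns `False` for a ping with no addons at all. The sketch below exercises that behaviour on plain dictionaries. The `make_ping` helper, the two-entry `toy_whitelist`, and the sample pings are made up for illustration; the function body is equivalent to the cell above.

```python
def ping_has_addons(ping, check_id_func):
    # Same logic as the cell above, written with all() for brevity.
    active_addons = ping["environment"]["addons"].get("activeAddons", {})
    if not active_addons:
        return False
    return all(check_id_func(addon_id) for addon_id in active_addons)

def make_ping(addon_ids):
    # Hypothetical helper: build a ping with the given active addon IDs.
    return {"environment": {"addons": {"activeAddons": {i: {} for i in addon_ids}}}}

# Tiny made-up whitelist and pings, for illustration only.
toy_whitelist = {"firebug@software.joehewitt.com", "https-everywhere@eff.org"}

def is_whitelisted(addon_id):
    return addon_id in toy_whitelist

toy_pings = {
    "no_addons": make_ping([]),
    "only_whitelisted": make_ping(["firebug@software.joehewitt.com"]),
    "mixed": make_ping(["firebug@software.joehewitt.com", "uBlock0@raymondhill.net"]),
}

# For each ping: (has any addon, has only whitelisted addons)
results = {
    label: (ping_has_addons(p, lambda k: True), ping_has_addons(p, is_whitelisted))
    for label, p in toy_pings.items()
}
```

With an always-true predicate the function answers "does this ping have any addon?"; with a whitelist predicate it answers "does this ping have addons, all of them whitelisted?". A single non-whitelisted addon is enough to make the second check fail.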

How many clients had at least one addon?

In [10]:
any_subset = subset.filter(lambda p: ping_has_addons(p, lambda k: True))
any_subset_count = any_subset.count()

In [11]:
print "{:.2f}%".format(100.0 * any_subset_count / subset_count)

54.53%


How many clients had only whitelisted addons?

In [12]:
whitelisted_subset = subset.filter(lambda p: ping_has_addons(p, lambda k: k in whitelisted_addons))
whitelisted_subset_count = whitelisted_subset.count()

In [13]:
print "{:.2f}%".format(100.0 * whitelisted_subset_count / subset_count)

8.43%


How many clients had at least one unwhitelisted addon?

In [14]:
print "{:.2f}%".format(100.0 * (any_subset_count - whitelisted_subset_count) / subset_count)

46.10%
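
The subtraction works because pings with only whitelisted addons are a subset of pings with at least one addon, so whatever remains must contain at least one non-whitelisted addon. As a quick check using only the percentages printed above:

```latex
\underbrace{54.53\%}_{\text{at least one addon}}
- \underbrace{8.43\%}_{\text{only whitelisted addons}}
= \underbrace{46.10\%}_{\text{at least one non-whitelisted addon}},
\qquad
100\% - 54.53\% = \underbrace{45.47\%}_{\text{no addons}}
```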


In [15]:
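def get_ping_addons(ping):
    # Yield (addon ID, ASCII-only name) for every active addon that reports a name;
    # addons with a missing or empty name are skipped.
    activeAddons = ping["environment"]["addons"].get("activeAddons", {})
    for k, v in activeAddons.iteritems():
        if v.get("name"):
            yield (k, v["name"].encode("ascii", "ignore"))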

addons = subset.flatMap(get_ping_addons)

In [16]:
addon_counts = addons.countByKey()

How many addons did the clients have installed in total?

In [17]:
total_addons = sum(addon_counts.values())
total_addons

177157
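
For readers without a Spark cluster at hand, the `flatMap` plus `countByKey` steps can be approximated locally. This is a sketch on made-up pings, not the notebook's actual data path: `toy_pings` and the `unnamed@example.com` ID are hypothetical, and the ASCII encoding step is left out to keep the example short.

```python
from collections import Counter

# Made-up pings for illustration; real pings carry many more fields.
toy_pings = [
    {"environment": {"addons": {"activeAddons": {
        "firebug@software.joehewitt.com": {"name": "Firebug"},
        "adbhelper@mozilla.org": {"name": "ADB Helper"},
    }}}},
    {"environment": {"addons": {"activeAddons": {
        "adbhelper@mozilla.org": {"name": "ADB Helper"},
        "unnamed@example.com": {},  # no name, so it is skipped
    }}}},
    {"environment": {"addons": {}}},  # no activeAddons section at all
]

def get_ping_addons(ping):
    active_addons = ping["environment"]["addons"].get("activeAddons", {})
    for addon_id, info in active_addons.items():
        if info.get("name"):
            yield (addon_id, info["name"])

# flatMap: concatenate the (ID, name) pairs produced by every ping.
pairs = [pair for ping in toy_pings for pair in get_ping_addons(ping)]

# countByKey: count how many pairs share each addon ID.
toy_counts = Counter(addon_id for addon_id, _ in pairs)

# Total number of installations, not the number of distinct addons.
toy_total = sum(toy_counts.values())
```

Note that the total counts installations: an addon present in two pings contributes two to the sum, and addons without a name contribute nothing.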

Which whitelisted addons did not appear in the pings?

In [18]:
for addon in whitelisted_addons:
    if addon not in addon_counts:
        print whitelisted_addons[addon]

Aol Toolbar
AVG SafeGuard toolbar


#### Top whitelisted addons

In [19]:
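from collections import Counter

# Each percentage is the addon's share of all addon installations (total_addons),
# not the share of clients that have it installed.
for addon, addon_count in Counter(addon_counts).most_common():
    if addon in whitelisted_addons:
        print "{:.3f}%: {}".format(100.0 * addon_count / total_addons, whitelisted_addons[addon])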

8.850%: Adblock Plus
1.767%: Video DownloadHelper
1.682%: Firebug
1.014%: DownThemAll!
0.928%: Greasemonkey
0.711%: Download YouTube Videos as MP4
0.672%: Ghostery
0.587%: Flash Video Downloader - YouTube HD Download [4K]
0.580%: NoScript
0.563%: LastPass
0.535%: Stylish
0.496%: YouTube Video and Audio Downloader
0.455%: Adblock Plus Pop-up Addon
0.451%: FireFTP
0.439%: Tab Mix Plus
0.438%: FlashGot
0.413%: Google Translator for Firefox
0.400%: WOT
0.397%: Flagfox
0.388%: 1-Click YouTube Video Downloader
0.360%: Adblock Edge
0.349%: uBlock
0.317%: Element Hiding Helper for Adblock Plus
0.308%: IDM CC
0.278%: Lightbeam
0.185%: Download Status Bar
0.177%: Kaspersky URL Advisor
0.163%: Xmarks
0.159%: Personas Plus
0.152%: HTTPS-Everywhere
0.145%: 1Password
0.125%: MEGA
0.125%: Yandex Elements
0.111%: IE Tab 2
0.096%: Pin It button
0.094%: ColorfulTabs
0.086%: Flashblock
0.062%: OneTab
0.040%: Garmin Communicator
0.020%: Amazon 1Button App for Firefox
0.019%: Yandex Visual Bookmarks
0.014%

#### Top unwhitelisted addons

In [20]:
# An addon ID might have multiple names. Pick the longer one because some addons appear to have
# invalid names (e.g. single space).
addon_names = addons.reduceByKey(lambda a, b: a if len(a) > len(b) else b).collectAsMap()
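
The name-selection rule can be tried out without Spark. The sketch below applies the same "keep the longer name" function per addon ID using `functools.reduce`; the `toy_pairs` list is made up, with a single-space name standing in for the invalid names mentioned in the comment above.

```python
from functools import reduce

# Made-up (addon ID, name) pairs, for illustration only.
toy_pairs = [
    ("adbhelper@mozilla.org", " "),
    ("adbhelper@mozilla.org", "ADB Helper"),
    ("jsonview@brh.numbera.com", "JSONView"),
]

def longer(a, b):
    # Same rule as the lambda above: keep a only if it is strictly longer.
    return a if len(a) > len(b) else b

# Group the names by addon ID, then reduce each group to a single name.
grouped = {}
for addon_id, name in toy_pairs:
    grouped.setdefault(addon_id, []).append(name)

toy_names = {addon_id: reduce(longer, names) for addon_id, names in grouped.items()}
```

One caveat, stated as an observation rather than a measured problem: when two names for the same ID have equal length, the function returns its second argument, so the winner depends on the order in which values are combined, and Spark does not guarantee that order.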

In [21]:
for addon, addon_count in Counter(addon_counts).most_common(100):
    if addon not in whitelisted_addons:
        print "{:.3f}%: {} ({})".format(100.0 * addon_count / total_addons, addon_names[addon], addon)

13.962%: ADB Helper (adbhelper@mozilla.org)
13.574%: Valence (fxdevtools-adapters@mozilla.org)
1.165%: Web Developer ({c45c406e-ab73-11d8-be73-000a95be3b12})
0.996%: uBlock Origin (uBlock0@raymondhill.net)
0.710%: ColorZilla ({6AC85730-7D0F-4de0-B3FA-21142DD85326})
0.512%: User Agent Switcher ({e968fc70-8f95-4ab9-9e79-304de2a71ee1})
0.401%: MeasureIt ({75CEEE46-9B64-46f8-94BF-54012DE155F0})
0.387%: JSONView (jsonview@brh.numbera.com)
0.382%: Reddit Enhancement Suite (jid1-xUfzOsOFlzSOXg@jetpack)
0.357%: S3.Google Translator (s3google@translator)
0.349%: Awesome screenshot: Capture and Annotate (jid0-GXjLLfbCoAx0LcltEdFrEkQdQPI@jetpack)
0.334%: ZenMate Security, Privacy & Unblock VPN (firefox@zenmate.com)
0.331%: FireGestures (firegestures@xuldev.org)
0.305%: Module de blocage des sites Internet dangereux (content_blocker@kaspersky.com)
0.299%: Flash and Video Download ({bee6eb20-01e0-ebd1-da83-080329fb9a3a})
0.292%: RESTClient ({ad0d925d-88f8-47f1-85ea-8463569e756e})
0.290%: Wappalyzer