### e10s-beta45-withaddons: Crash rate (with addons)

[Bug 1222890](https://bugzilla.mozilla.org/show_bug.cgi?id=1222890)

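This analysis compares crash rates, in crashes per 1000 usage hours, between e10s and non-e10s pings from the `e10s-beta45-withaddons` experiment on Firefox Beta 45, restricted to profiles with at least one non-system add-on.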

In [1]:
from __future__ import division  # true division; must come first if this cell is run as a script

import ujson as json
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import plotly.plotly as py
import IPython

from moztelemetry.spark import get_pings, get_one_ping_per_client, get_pings_properties
from montecarlino import grouped_permutation_test

%pylab inline
IPython.core.pylabtools.figsize(16, 7)

Unable to parse whitelist (/home/hadoop/anaconda2/lib/python2.7/site-packages/moztelemetry/bucket-whitelist.json). Assuming all histograms are acceptable.
Populating the interactive namespace from numpy and matplotlib


In [2]:
sc.defaultParallelism

160

In [3]:
def is_in_e10s_experiment(ping):
    # Keep only pings enrolled in either branch of this experiment.
    try:
        experiment = ping["environment"]["addons"]["activeExperiment"]
        return experiment["id"] == "e10s-beta45-withaddons@experiments.mozilla.org" and \
               experiment["branch"] in ("control", "experiment")
    except (KeyError, TypeError):
        # A missing or null experiment block means the ping is not enrolled.
        return False

In [4]:
def is_e10s_ping(ping):
    return ping["environment"]["settings"]["e10sEnabled"]

In [5]:
def is_with_user_addons(ping):
    # True if at least one active add-on is not one of the bundled system add-ons.
    system_addons = ["firefox@getpocket.com", "loop@mozilla.org"]
    addons = ping["environment"]["addons"]["activeAddons"]
    if not addons:
        return False
    for addon_id in addons:
        if addon_id not in system_addons:
            return True
    return False
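
To see how the experiment and add-on predicates above classify a ping, here is a minimal sketch on hand-built dictionaries. The sample pings, the `make_ping` helper, the add-on ID `some-addon@example.com`, and the branch name `some-other-branch` are made up for illustration; real pings carry many more fields. The two predicates are restated in a form that runs on Python 2.7 or 3 so that the block stands alone.

```python
EXPERIMENT_ID = "e10s-beta45-withaddons@experiments.mozilla.org"
SYSTEM_ADDONS = ["firefox@getpocket.com", "loop@mozilla.org"]


def is_in_e10s_experiment(ping):
    try:
        experiment = ping["environment"]["addons"]["activeExperiment"]
        return (experiment["id"] == EXPERIMENT_ID and
                experiment["branch"] in ("control", "experiment"))
    except (KeyError, TypeError):
        return False


def is_with_user_addons(ping):
    addons = ping["environment"]["addons"]["activeAddons"]
    return any(addon_id not in SYSTEM_ADDONS for addon_id in (addons or {}))


def make_ping(branch, addon_ids):
    """Hypothetical helper: build the smallest ping the predicates can read."""
    return {"environment": {"addons": {
        "activeExperiment": {"id": EXPERIMENT_ID, "branch": branch},
        "activeAddons": {addon_id: {} for addon_id in addon_ids},
    }}}


# Illustrative pings only; none of these come from the real dataset.
samples = {
    "user add-on": make_ping("control", ["some-addon@example.com", "loop@mozilla.org"]),
    "system add-ons only": make_ping("experiment", ["firefox@getpocket.com"]),
    "other branch": make_ping("some-other-branch", ["some-addon@example.com"]),
    "no experiment": {"environment": {"addons": {"activeAddons": {}}}},
}

# A ping survives the two chained .filter() calls only if both predicates hold.
kept = {name: is_in_e10s_experiment(ping) and is_with_user_addons(ping)
        for name, ping in samples.items()}

for name in sorted(kept):
    print("{}: kept={}".format(name, kept[name]))
```

Only the first sample passes both filters; the others are dropped because they have no user add-on, sit in an unrecognized branch, or have no experiment block at all.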

### Pings

In [6]:
PING_OPTIONS = { "app": "Firefox", "channel": "beta", "version": "45.0", "build_id": "20160204142810" }

In [7]:
main_pings = get_pings(sc, doc_type="main", **PING_OPTIONS).filter(is_in_e10s_experiment) \
                                                           .filter(is_with_user_addons).cache()

In [8]:
main_pings.map(lambda p: (p["environment"]["addons"]["activeExperiment"]["branch"], 0)).countByKey()

defaultdict(int, {u'control': 1019818, u'experiment': 890319})

In [9]:
crash_pings = get_pings(sc, doc_type="crash", **PING_OPTIONS).filter(is_in_e10s_experiment) \
                                                             .filter(is_with_user_addons).cache()

In [10]:
crash_pings.map(lambda p: (p["environment"]["addons"]["activeExperiment"]["branch"], 0)).countByKey()

defaultdict(int, {u'control': 23095, u'experiment': 12138})

What are the total subsession lengths per build ID?

In [14]:
def get_subsession_lengths_per_build_id(pings):
    return pings.map(lambda p: (p["application"]["buildId"], p["payload"]["info"].get("subsessionLength", 0))) \
                .reduceByKey(lambda a, b: a + b).collectAsMap()
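
The `map` plus `reduceByKey` pipeline above amounts to a grouped sum. As a rough sketch of the same aggregation without Spark, assuming pings are plain dictionaries: the function name `total_subsession_length_per_build_id`, the sample pings, and the second build ID `20160101000000` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def total_subsession_length_per_build_id(pings):
    """Plain-Python stand-in for map(...).reduceByKey(add).collectAsMap()."""
    totals = defaultdict(int)
    for ping in pings:
        build_id = ping["application"]["buildId"]
        # Pings without a subsessionLength contribute 0 seconds, as in the Spark version.
        totals[build_id] += ping["payload"]["info"].get("subsessionLength", 0)
    return dict(totals)


# Made-up pings; subsession lengths are in seconds.
sample_pings = [
    {"application": {"buildId": "20160204142810"}, "payload": {"info": {"subsessionLength": 3600}}},
    {"application": {"buildId": "20160204142810"}, "payload": {"info": {"subsessionLength": 1800}}},
    {"application": {"buildId": "20160204142810"}, "payload": {"info": {}}},
    {"application": {"buildId": "20160101000000"}, "payload": {"info": {"subsessionLength": 60}}},
]

totals = total_subsession_length_per_build_id(sample_pings)
for build_id in sorted(totals):
    print("{} {}".format(build_id, totals[build_id]))
```

Because `PING_OPTIONS` pins a single `build_id`, the real result contains just one key; grouping by build ID keeps the code reusable if that constraint is relaxed.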

In [15]:
e10s_main_pings = main_pings.filter(is_e10s_ping)
e10s_subsession_lengths = get_subsession_lengths_per_build_id(e10s_main_pings)
e10s_subsession_lengths

{u'20160204142810': 4447196997}

In [16]:
non_e10s_main_pings = main_pings.filter(lambda p: not is_e10s_ping(p))
non_e10s_subsession_lengths = get_subsession_lengths_per_build_id(non_e10s_main_pings)
non_e10s_subsession_lengths

{u'20160204142810': 5195545474}

What are the total (parent) crash counts per build ID?

In [17]:
def get_crash_counts_per_build_id(pings):
    return dict(pings.map(lambda p: (p["application"]["buildId"], 0)).countByKey())

In [18]:
e10s_crash_pings = crash_pings.filter(is_e10s_ping)
e10s_crash_counts = get_crash_counts_per_build_id(e10s_crash_pings)
e10s_crash_counts

{u'20160204142810': 12110}

In [19]:
non_e10s_crash_pings = crash_pings.filter(lambda p: not is_e10s_ping(p))
non_e10s_crash_counts = get_crash_counts_per_build_id(non_e10s_crash_pings)
non_e10s_crash_counts

{u'20160204142810': 23123}

What are the total content crash counts per build ID?

In [20]:
def get_content_abort_count(ping):
    # Sum of the keyed histogram's "content" entry; 0 if any level is missing.
    return ping["payload"].get("keyedHistograms", {}) \
                          .get("SUBPROCESS_ABNORMAL_ABORT", {}) \
                          .get("content", {}) \
                          .get("sum", 0)

def get_content_crash_count_per_build_id(pings):
    return pings.map(lambda p: (p["application"]["buildId"], get_content_abort_count(p))) \
                .reduceByKey(lambda a, b: a + b).collectAsMap()
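
The chained `.get` calls above fall back to an empty dictionary at each level, so a main ping that never recorded a content abort counts as zero. A small sketch of that behaviour on made-up payloads (the sample pings are hypothetical and contain only the fields the function reads):

```python
def get_content_abort_count(ping):
    histograms = ping["payload"].get("keyedHistograms", {})
    aborts = histograms.get("SUBPROCESS_ABNORMAL_ABORT", {})
    return aborts.get("content", {}).get("sum", 0)


# Illustrative payloads only.
ping_with_aborts = {"payload": {"keyedHistograms": {
    "SUBPROCESS_ABNORMAL_ABORT": {"content": {"sum": 2}, "plugin": {"sum": 5}},
}}}
ping_plugin_only = {"payload": {"keyedHistograms": {
    "SUBPROCESS_ABNORMAL_ABORT": {"plugin": {"sum": 1}},
}}}
ping_without_histograms = {"payload": {}}

counts = [get_content_abort_count(p)
          for p in (ping_with_aborts, ping_plugin_only, ping_without_histograms)]
print(counts)
```

Only the `content` key is read, so aborts recorded under other keys of the same histogram are ignored. Note that `dict.get` returns its default only when a key is absent; a key that is present with a `None` value would make the next `.get` raise `AttributeError`.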

In [21]:
e10s_content_crash_counts = get_content_crash_count_per_build_id(e10s_main_pings)
e10s_content_crash_counts

{u'20160204142810': 20874}

### Crashes per 1000 usage hours

In [22]:
SECS_PER_1000_HOURS = 1000 * 60 * 60
print "build ID             non-e10s    e10s-parent   e10s-content"
for build_id in sorted(set(e10s_crash_counts.keys()) & set(non_e10s_crash_counts.keys())):
    print "{} {:>14.3f} {:>14.3f} {:>14.3f}".format(
        build_id,
        non_e10s_crash_counts[build_id] / non_e10s_subsession_lengths[build_id] * SECS_PER_1000_HOURS,
        e10s_crash_counts[build_id] / e10s_subsession_lengths[build_id] * SECS_PER_1000_HOURS,
        e10s_content_crash_counts[build_id] / e10s_subsession_lengths[build_id] * SECS_PER_1000_HOURS)

build ID             non-e10s    e10s-parent   e10s-content
20160204142810         16.022          9.803         16.897
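
As a cross-check of the table, the rates can be recomputed by hand from the totals printed earlier: divide a crash count by the matching usage time in seconds, then scale by the number of seconds in 1000 hours. The sketch below hard-codes the counts and subsession lengths reported above for build `20160204142810`; the helper name `crashes_per_1000_hours` is introduced here for illustration.

```python
SECS_PER_1000_HOURS = 1000 * 60 * 60  # 3,600,000 seconds


def crashes_per_1000_hours(crash_count, usage_seconds):
    # float() keeps the division exact on Python 2 without the __future__ import.
    return crash_count / float(usage_seconds) * SECS_PER_1000_HOURS


# Totals copied from the cell outputs above.
non_e10s_rate = crashes_per_1000_hours(23123, 5195545474)
e10s_parent_rate = crashes_per_1000_hours(12110, 4447196997)
e10s_content_rate = crashes_per_1000_hours(20874, 4447196997)

print("{:.3f} {:.3f} {:.3f}".format(non_e10s_rate, e10s_parent_rate, e10s_content_rate))
```

The content rate uses the e10s usage time as its denominator because content crashes are counted from e10s main pings only.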
