## Evaluation of Adnet Client Requests

### Statistics of Interest
#### Setting
- we have tested for
  - 40 potential vulnerabilities/configurations
  - 7 different algorithms (required also for vulnerability assessments)
  - (explicit session finish)
  - ~legacy downgrade vulnerabilities~ (contained in the 40 vulnerabilities)

#### Statistics and Plots
- "impact" of the different vulnerabilities
  - as expressed by share of users vulnerable to them
  - represented by ???-plot (40 vulns to cover...)
- "competibility" of the different vulnerabilities
  - as expressed by their share among the total number of individual vulnerabilities found
- 

### Evaluations Sketch
- load request log file to DF
- create complete request matrix for each of the resolvers and domains (ip, d1, d2, d3, d4)
- determine from request matrix for each (resolver, user)-tuple
  - has session finised
  - for each algorithm: supports algorithm
  - validates dnssec ("broken"; combination of algorithm support)
  - for each vulnerability check: is vulnerable

In [1]:
import pandas as pd
import numpy as np
from urllib import parse
# import logging
import matplotlib as plt
import seaborn as sns
import matplotlib.ticker as mtick
import logging

# pd.options.display.max_rows = 2000
pd.options.display.max_columns = 2000
# logging.getLogger('matplotlib.font_manager').disabled = True

pd.options.plotting.backend = 'matplotlib'
plt.style.use('ggplot')

REPO_DIR = '../../dnssec-downgrade-data/'
DATA_DIR = REPO_DIR + '/2021-10-06_adnet-study/'  # location of input/raw and processed data
STATS_DIR = DATA_DIR + '/stats/' # output location fo tables and plots 

IP_VICTIM = '104.238.214.165'
IP_ATTACKER = '104.238.214.154'

LOGFILE_DEV = DATA_DIR + '/dev-adnet.json'  # proper subset of EU logfile
LOGFILE_EU = DATA_DIR + '/downg-EU.json'
LOGFILE_AF = DATA_DIR + '/downg-AF.json'
LOGFILE_SA = DATA_DIR + '/downg-SA.json'
LOGFILE_NA = DATA_DIR + '/downg-NA.json'
LOGFILE_OC = DATA_DIR + '/downg-OC.json'
LOGFILE_AS = DATA_DIR + '/downg-AS.json'

# Regions
R_DEV = "dev"
R_EU = "eu"  # Europe
R_AF = "af"  # Africa
R_SA = "sa"  # South America
R_NA = "na"  # North America
R_OC = "oc"  # Oceania
R_AS = "as"  # Asia
REGIONS = [
    # R_DEV,
    R_EU,
    R_AF,
    R_SA,
    R_NA,
    R_OC,
    R_AS,
]
REGIONS_TO_LOGFILES = {
    R_DEV: LOGFILE_DEV,
    R_EU: LOGFILE_EU,
    R_AF: LOGFILE_AF,
    R_SA: LOGFILE_SA,
    R_NA: LOGFILE_NA,
    R_OC: LOGFILE_OC,
    R_AS: LOGFILE_AS,
}


STUDY_DOMAINS = {  # Must not be FQDN!
    "resolver-downgrade-attack.dedyn.io",
    "downgrade.dedyn.io"
}

TEST_NAMES = {
    # Downgrade Vulnerabilities
    "mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16",
    "mitm-ra-ds8-ds13.ds8-ds15-dnskey15",
    "mitm-ra-ds8-ds13-ds15.ds13-ds16-dnskey13-dnskey16",
    "mitm-ra-ds8-ds13-ds15.ds16",
    "mitm-ra-ds8-ds13-ds15.ds8-ds15-dnskey15",
    "mitm-ra-ds8-ds13-ds15-ds16.ds8-ds16-dnskey16",
    "mitm-ra-ds8-ds13-ds15-ds16.ds16",
    "mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15",
    "mitm-rs13-ra.ds8-ds16-dnskey16",
    "mitm-rs13-ra.ds8-ds13-dnskey8",
    "mitm-rs13-ra.ds8-ds15-dnskey8-dnskey15",
    "mitm-rs13-ra.ds13-ds16-dnskey13-dnskey16",
    "mitm-rs13-ra.ds16",
    "mitm-rs15-ra.ds8-ds16-dnskey16",
    "mitm-rs15-ra.ds8-ds13-dnskey13",
    "mitm-rs15-ra.ds8-dnskey8",
    "mitm-rs15-ra.ds8-ds16-dnskey8-dnskey16",
    "mitm-rs15-ra.ds13-ds16-dnskey13-dnskey16",
    "mitm-rs15-ra.ds16",
    "mitm-rs15-ra.ds8-ds16-dnskey16",
    "mitm-rs15-ra.ds8-ds15-dnskey15",
    "mitm-rs16-ra.ds8-ds13-dnskey13",
    "mitm-rs16-ra.ds15-ds16-dnskey15",
    "mitm-rs16-ra.ds13-ds16-dnskey16",
    "mitm-rs16-ra.ds13-ds16-dnskey13-dnskey16",
    "mitm-rs16-ra.ds8-ds13-dnskey8",
    "mitm-rs16-ra.ds8-ds15-dnskey8-dnskey15",
    "mitm-rs16-ra.ds13-dnskey13",
    "mitm-rs16-ra.ds8-ds13-dnskey8-dnskey13",
    "mitm-rs16-ra.ds13-ds15-dnskey15",
    "mitm-rs8-ra.ds8-ds16-dnskey16",
    "mitm-rs8-ra.ds15-ds16-dnskey16",
    "mitm-rs8-ra.ds13-ds16-dnskey16",
    "mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16",
    "mitm-rs8-ra.ds13-dnskey13",
    "mitm-rs8-ra.ds13-ds15-dnskey13-dnskey15",
    "mitm-rs8-ra.ds8-ds13-dnskey8-dnskey13",
    "mitm-rs8-ra.ds8-dnskey8",
    "mitm-rs8-ra.ds16",
    "mitm-rs8-ra.ds8-ds16-dnskey8-dnskey16",
    
    # Algorithm Support
    "mitm-ra.ds5-dnskey5",
    "mitm-ra.ds8-dnskey8",
    "mitm-ra.ds10-dnskey10",
    "mitm-ra.ds13-dnskey13",
    "mitm-ra.ds14-dnskey14",
    "mitm-ra.ds15-dnskey15",
    "mitm-ra.ds16-dnskey16",
    
    # Legacy Downgrade Vulnerabilities (actually covered by non-legacy)
    # "ecdsap256sha256",
    # "onlyrsasha256",
    # "rsasha256",
    
    # Housekeeping
    "broken",
    "session-finish",  # substitute for empty child / parent domain
}

### Load Request Data

In [2]:
region_dfs = [pd.read_json(REGIONS_TO_LOGFILES[region], lines=True) for region in REGIONS]

df_req_raw = pd.concat(region_dfs, keys=REGIONS).reset_index(level=0).rename(mapper={"level_0": "region"}, axis=1)
df_req_raw

Unnamed: 0,region,time_epoch,time_human,ip_server,ip_client,request_method,protocol,host_header,server_alias,port_server,url_path,filename,query,time_served_ms,status,errlog_reqest_id,user_agent
0,eu,1633511250653,2021-10-06T09:07:30,104.238.214.154,194.230.144.141,GET,HTTP/1.1,mitm-rs8-ra.ds13-dnskey13.downgrade.dedyn.io,ds13-dnskey13.downgrade.dedyn.io,443,/img.png,/var/www/downgrade.dedyn.io/img.png,?test=mitm-rs8-ra.ds13-dnskey13&tok=730252720&...,271,200,-,Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like...
1,eu,1633511250746,2021-10-06T09:07:30,104.238.214.154,194.230.144.141,GET,HTTP/1.1,mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16.downgr...,ds13-ds16-dnskey13-dnskey16.downgrade.dedyn.io,443,/img.png,/var/www/downgrade.dedyn.io/img.png,?test=mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16&...,291,200,-,Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like...
2,eu,1633511250854,2021-10-06T09:07:30,104.238.214.154,194.230.144.141,GET,HTTP/1.1,mitm-ra-ds8-ds13-ds15.ds16.downgrade.dedyn.io,ds16.downgrade.dedyn.io,443,/img.png,/var/www/downgrade.dedyn.io/img.png,?test=mitm-ra-ds8-ds13-ds15.ds16&tok=730252720...,466,200,-,Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like...
3,eu,1633511257590,2021-10-06T09:07:37,104.238.214.154,188.216.95.240,GET,HTTP/1.1,mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16.d...,ds13-ds16-dnskey13-dnskey16.downgrade.dedyn.io,443,/img.png,/var/www/downgrade.dedyn.io/img.png,?test=mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnsk...,406,200,-,Mozilla/5.0 (Windows NT 10.0; Win64; x64) Appl...
4,eu,1633511257721,2021-10-06T09:07:37,104.238.214.154,188.216.95.240,GET,HTTP/1.1,mitm-ra-ds8-ds13.ds8-ds15-dnskey15.downgrade.d...,ds8-ds15-dnskey15.downgrade.dedyn.io,443,/img.png,/var/www/downgrade.dedyn.io/img.png,?test=mitm-ra-ds8-ds13.ds8-ds15-dnskey15&tok=2...,401,200,-,Mozilla/5.0 (Windows NT 10.0; Win64; x64) Appl...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
15378,as,1633694361938,2021-10-08T11:59:21,104.238.214.154,39.53.64.189,GET,HTTP/1.1,ecdsap256sha256.resolver-downgrade-attack.dedy...,ecdsap256sha256.resolver-downgrade-attack.dedy...,443,/img.png,/var/www/resolver-downgrade-attack.dedyn.io/im...,?test=ecdsap256sha256&tok=531293574&time=16336...,405,200,-,Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebK...
15379,as,1633694362100,2021-10-08T11:59:22,104.238.214.154,39.53.64.189,GET,HTTP/1.1,onlyrsasha256.resolver-downgrade-attack.dedyn.io,onlyrsasha256.resolver-downgrade-attack.dedyn.io,443,/img.png,/var/www/resolver-downgrade-attack.dedyn.io/im...,?test=onlyrsasha256&tok=531293574&time=1633694...,411,200,-,Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebK...
15380,as,1633694363289,2021-10-08T11:59:23,104.238.214.154,39.53.64.189,GET,HTTP/1.1,broken.resolver-downgrade-attack.dedyn.io,broken.resolver-downgrade-attack.dedyn.io,443,/img.png,/var/www/resolver-downgrade-attack.dedyn.io/im...,?test=broken&tok=531293574&time=1633694328835,392,200,-,Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebK...
15381,as,1633694363704,2021-10-08T11:59:23,104.238.214.154,39.53.64.189,GET,HTTP/1.1,rsasha256.resolver-downgrade-attack.dedyn.io,rsasha256.resolver-downgrade-attack.dedyn.io,443,/img.png,/var/www/resolver-downgrade-attack.dedyn.io/im...,?test=rsasha256&tok=531293574&time=1633694328835,398,200,-,Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebK...


#### Remove Irrelevant Columns

In [3]:
df_req_reduced = df_req_raw[['region', 'query', 'ip_server']]
df_req_reduced

Unnamed: 0,region,query,ip_server
0,eu,?test=mitm-rs8-ra.ds13-dnskey13&tok=730252720&...,104.238.214.154
1,eu,?test=mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16&...,104.238.214.154
2,eu,?test=mitm-ra-ds8-ds13-ds15.ds16&tok=730252720...,104.238.214.154
3,eu,?test=mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnsk...,104.238.214.154
4,eu,?test=mitm-ra-ds8-ds13.ds8-ds15-dnskey15&tok=2...,104.238.214.154
...,...,...,...
15378,as,?test=ecdsap256sha256&tok=531293574&time=16336...,104.238.214.154
15379,as,?test=onlyrsasha256&tok=531293574&time=1633694...,104.238.214.154
15380,as,?test=broken&tok=531293574&time=1633694328835,104.238.214.154
15381,as,?test=rsasha256&tok=531293574&time=1633694328835,104.238.214.154


#### Expand Data From Query Parameter

In [4]:
def splitq(df):
    test = np.nan
    token = np.nan
    time = np.nan
    query = df['query']
    if query is not None and len(query) > 1:
        params = dict(parse.parse_qsl(query[1:]))
        test = params.get('test')
        token = str(params.get('tok'))
        time = params.get('time')
    return test, token, time
        

df_req_qsplit = df_req_reduced.copy(deep=True)
df_req_qsplit[['test', 'token', 'time_client']] = df_req_qsplit.apply(axis=1, func=splitq, result_type='expand')
df_req_qsplit = df_req_qsplit.drop(columns=['query'])
df_req_qsplit

Unnamed: 0,region,ip_server,test,token,time_client
0,eu,104.238.214.154,mitm-rs8-ra.ds13-dnskey13,730252720,1633511249822
1,eu,104.238.214.154,mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16,730252720,1633511249822
2,eu,104.238.214.154,mitm-ra-ds8-ds13-ds15.ds16,730252720,1633511249820
3,eu,104.238.214.154,mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16,2977949033,1633511256649
4,eu,104.238.214.154,mitm-ra-ds8-ds13.ds8-ds15-dnskey15,2977949033,1633511256649
...,...,...,...,...,...
15378,as,104.238.214.154,ecdsap256sha256,531293574,1633694328835
15379,as,104.238.214.154,onlyrsasha256,531293574,1633694328835
15380,as,104.238.214.154,broken,531293574,1633694328835
15381,as,104.238.214.154,rsasha256,531293574,1633694328835


#### Remove Queries that Don't Belong to a Test

In [5]:
df_req_qsplit_clean = df_req_qsplit.dropna()
seen_tests = df_req_qsplit_clean['test'].unique()
print(f"Seen tests ({len(seen_tests)}):\n{sorted(seen_tests)}".replace(",", ",\n"))
# sorted()

Seen tests (51):
['broken',
 'ecdsap256sha256',
 'finish',
 'mitm-ra-ds8-ds13-ds15-ds16.ds16',
 'mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15',
 'mitm-ra-ds8-ds13-ds15-ds16.ds8-ds16-dnskey16',
 'mitm-ra-ds8-ds13-ds15.ds13-ds16-dnskey13-dnskey16',
 'mitm-ra-ds8-ds13-ds15.ds16',
 'mitm-ra-ds8-ds13-ds15.ds8-ds15-dnskey15',
 'mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16',
 'mitm-ra-ds8-ds13.ds8-ds15-dnskey15',
 'mitm-ra.ds10-dnskey10',
 'mitm-ra.ds13-dnskey13',
 'mitm-ra.ds14-dnskey14',
 'mitm-ra.ds15-dnskey15',
 'mitm-ra.ds16-dnskey16',
 'mitm-ra.ds5-dnskey5',
 'mitm-ra.ds8-dnskey8',
 'mitm-rs13-ra.ds13-ds16-dnskey13-dnskey16',
 'mitm-rs13-ra.ds16',
 'mitm-rs13-ra.ds8-ds13-dnskey8',
 'mitm-rs13-ra.ds8-ds15-dnskey8-dnskey15',
 'mitm-rs13-ra.ds8-ds16-dnskey16',
 'mitm-rs15-ra.ds13-ds16-dnskey13-dnskey16',
 'mitm-rs15-ra.ds16',
 'mitm-rs15-ra.ds8-dnskey8',
 'mitm-rs15-ra.ds8-ds13-dnskey13',
 'mitm-rs15-ra.ds8-ds15-dnskey15',
 'mitm-rs15-ra.ds8-ds16-dnskey16',
 'mitm-rs15-ra.ds8-ds16-dnskey8-

#### Aggregate over Clients the Tests that had Queries at the Web Server

In [6]:
df_req_clients = df_req_qsplit_clean[['token', 'test', 'ip_server']]
df_req_clients

# for name, group in request_df_clients:
#     print(f"{name}\n{group}\n")

Unnamed: 0,token,test,ip_server
0,730252720,mitm-rs8-ra.ds13-dnskey13,104.238.214.154
1,730252720,mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16,104.238.214.154
2,730252720,mitm-ra-ds8-ds13-ds15.ds16,104.238.214.154
3,2977949033,mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16,104.238.214.154
4,2977949033,mitm-ra-ds8-ds13.ds8-ds15-dnskey15,104.238.214.154
...,...,...,...
15378,531293574,ecdsap256sha256,104.238.214.154
15379,531293574,onlyrsasha256,104.238.214.154
15380,531293574,broken,104.238.214.154
15381,531293574,rsasha256,104.238.214.154


In [7]:
df_req_clients_queries_groupby = df_req_clients[['token', 'test', 'ip_server']].groupby('token')

# dirty but does the job...
newdf_v = []
for name, group in df_req_clients_queries_groupby:
    assert len(set(group['token'].values)) == 1
    token = group['token'].values[0]
    requests_present = dict((test_name, test_name in group['test'].values) for test_name in TEST_NAMES)
    newdf_v.append({'token': token, **requests_present})
df_request_presence = pd.DataFrame(newdf_v)
df_request_presence = df_request_presence.set_index('token')
df_request_presence

Unnamed: 0_level_0,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,mitm-rs15-ra.ds8-ds15-dnskey15,mitm-rs13-ra.ds8-ds13-dnskey8,mitm-rs8-ra.ds8-ds16-dnskey8-dnskey16,mitm-ra.ds10-dnskey10,mitm-rs15-ra.ds8-ds16-dnskey8-dnskey16,mitm-rs13-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra-ds8-ds13-ds15.ds8-ds15-dnskey15,mitm-rs8-ra.ds13-ds16-dnskey16,mitm-rs16-ra.ds8-ds13-dnskey8-dnskey13,mitm-ra.ds8-dnskey8,mitm-rs8-ra.ds15-ds16-dnskey16,mitm-rs15-ra.ds8-ds16-dnskey16,mitm-ra.ds16-dnskey16,mitm-rs8-ra.ds8-dnskey8,mitm-rs15-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16,mitm-rs8-ra.ds8-ds16-dnskey16,mitm-ra.ds15-dnskey15,mitm-ra-ds8-ds13.ds8-ds15-dnskey15,mitm-ra-ds8-ds13-ds15-ds16.ds16,mitm-ra.ds14-dnskey14,mitm-rs15-ra.ds8-dnskey8,mitm-ra-ds8-ds13-ds15.ds16,mitm-rs16-ra.ds13-dnskey13,mitm-rs15-ra.ds16,mitm-rs8-ra.ds13-dnskey13,mitm-rs8-ra.ds8-ds13-dnskey8-dnskey13,mitm-rs13-ra.ds16,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds16-dnskey16,mitm-rs13-ra.ds8-ds16-dnskey16,mitm-rs16-ra.ds13-ds15-dnskey15,mitm-rs8-ra.ds13-ds15-dnskey13-dnskey15,mitm-ra.ds5-dnskey5,mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra.ds13-dnskey13,mitm-rs16-ra.ds8-ds15-dnskey8-dnskey15,broken,mitm-rs16-ra.ds8-ds13-dnskey13,mitm-rs16-ra.ds15-ds16-dnskey15,mitm-rs8-ra.ds16,mitm-rs13-ra.ds8-ds15-dnskey8-dnskey15,mitm-ra-ds8-ds13-ds15.ds13-ds16-dnskey13-dnskey16,mitm-rs16-ra.ds13-ds16-dnskey16,mitm-rs15-ra.ds8-ds13-dnskey13,mitm-rs16-ra.ds13-ds16-dnskey13-dnskey16,session-finish,mitm-rs16-ra.ds8-ds13-dnskey8
token,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1
100066390,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False
100084605,True,True,True,True,True,True,True,True,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
1001178984,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
1001406422,True,True,True,True,True,True,True,False,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
1001661604,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
996220222,False,False,False,False,False,True,False,False,True,True,False,False,True,True,False,True,True,False,True,False,True,False,True,True,True,True,True,False,True,False,True,True,True,False,True,False,True,False,True,True,True,True,True,True,True,True,True,True
997138338,True,True,True,True,False,True,True,True,True,False,True,False,True,False,True,True,True,True,False,True,True,False,True,True,True,True,False,False,True,True,True,False,True,False,True,False,True,False,True,True,True,True,True,True,True,False,True,True
997306845,True,True,True,True,True,True,True,True,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
998068579,True,False,True,False,False,False,True,True,False,False,False,False,True,False,False,False,True,False,False,True,True,False,True,True,False,False,False,False,True,True,True,False,False,False,False,False,False,False,False,False,False,True,True,False,True,False,False,False


### Infer Resolver Properties

#### Filter for Finished Sessions

In [8]:
def single_value(s):
    if len(set(s)) > 1:
        logging.warning(f'different regions for the same token: {set(s)}')
    return next(iter(s))

df_token_region = df_req_qsplit.groupby('token').agg({'region': [single_value]}).reset_index()
df_token_region.columns = df_token_region.columns.droplevel(1)
df_token_region = df_token_region.set_index('token')



In [9]:
df_finished_sessions = df_request_presence.copy(deep=True)
df_finished_sessions = df_finished_sessions[df_finished_sessions['session-finish'] == True]  # keep only those with finished session
# request_df_rp_fin = request_df_rp_fin.drop(columns=['finish'])
df_finished_sessions

Unnamed: 0_level_0,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,mitm-rs15-ra.ds8-ds15-dnskey15,mitm-rs13-ra.ds8-ds13-dnskey8,mitm-rs8-ra.ds8-ds16-dnskey8-dnskey16,mitm-ra.ds10-dnskey10,mitm-rs15-ra.ds8-ds16-dnskey8-dnskey16,mitm-rs13-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra-ds8-ds13-ds15.ds8-ds15-dnskey15,mitm-rs8-ra.ds13-ds16-dnskey16,mitm-rs16-ra.ds8-ds13-dnskey8-dnskey13,mitm-ra.ds8-dnskey8,mitm-rs8-ra.ds15-ds16-dnskey16,mitm-rs15-ra.ds8-ds16-dnskey16,mitm-ra.ds16-dnskey16,mitm-rs8-ra.ds8-dnskey8,mitm-rs15-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16,mitm-rs8-ra.ds8-ds16-dnskey16,mitm-ra.ds15-dnskey15,mitm-ra-ds8-ds13.ds8-ds15-dnskey15,mitm-ra-ds8-ds13-ds15-ds16.ds16,mitm-ra.ds14-dnskey14,mitm-rs15-ra.ds8-dnskey8,mitm-ra-ds8-ds13-ds15.ds16,mitm-rs16-ra.ds13-dnskey13,mitm-rs15-ra.ds16,mitm-rs8-ra.ds13-dnskey13,mitm-rs8-ra.ds8-ds13-dnskey8-dnskey13,mitm-rs13-ra.ds16,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds16-dnskey16,mitm-rs13-ra.ds8-ds16-dnskey16,mitm-rs16-ra.ds13-ds15-dnskey15,mitm-rs8-ra.ds13-ds15-dnskey13-dnskey15,mitm-ra.ds5-dnskey5,mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra.ds13-dnskey13,mitm-rs16-ra.ds8-ds15-dnskey8-dnskey15,broken,mitm-rs16-ra.ds8-ds13-dnskey13,mitm-rs16-ra.ds15-ds16-dnskey15,mitm-rs8-ra.ds16,mitm-rs13-ra.ds8-ds15-dnskey8-dnskey15,mitm-ra-ds8-ds13-ds15.ds13-ds16-dnskey13-dnskey16,mitm-rs16-ra.ds13-ds16-dnskey16,mitm-rs15-ra.ds8-ds13-dnskey13,mitm-rs16-ra.ds13-ds16-dnskey13-dnskey16,session-finish,mitm-rs16-ra.ds8-ds13-dnskey8
token,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1
100084605,True,True,True,True,True,True,True,True,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
1001178984,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
1001406422,True,True,True,True,True,True,True,False,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
1002703840,True,True,True,True,True,True,True,True,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
1002877860,True,True,True,True,True,True,True,True,True,True,True,False,True,True,True,True,False,True,True,False,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,False,True,True,True,True,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
991581256,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
991936365,True,True,True,True,True,True,True,True,True,True,True,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True
996220222,False,False,False,False,False,True,False,False,True,True,False,False,True,True,False,True,True,False,True,False,True,False,True,True,True,True,True,False,True,False,True,True,True,False,True,False,True,False,True,True,True,True,True,True,True,True,True,True
997138338,True,True,True,True,False,True,True,True,True,False,True,False,True,False,True,True,True,True,False,True,True,False,True,True,True,True,False,False,True,True,True,False,True,False,True,False,True,False,True,True,True,True,True,True,True,False,True,True


#### Filter for DNSSEC Validation

See [RFC8624](https://datatracker.ietf.org/doc/html/rfc8624#section-3.1) for algo support specification.

In [10]:
print(f"For {(~df_finished_sessions['broken']).mean():.1%} of users, we did not see a request on the domain name with broken DNSSEC.")

For 35.9% of users, we did not see a request on the domain name with broken DNSSEC.


In [11]:
df_validating_resolvers = df_finished_sessions.join(df_token_region, on='token').groupby(['region']).agg({
    'broken': [lambda s: 1 - s.mean()],
})
df_validating_resolvers.columns = ['users using validating resolvers']
df_validating_resolvers.style.format(lambda v: f"{v:.1%}")

Unnamed: 0_level_0,users using validating resolvers
region,Unnamed: 1_level_1
af,29.3%
as,32.2%
eu,45.6%
na,27.9%
oc,45.4%
sa,38.9%


In [12]:
df_validators = df_finished_sessions[~df_finished_sessions['broken']]
df_validators
# request_df_rp_fin_do = request_df_rp_fin

Unnamed: 0_level_0,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,mitm-rs15-ra.ds8-ds15-dnskey15,mitm-rs13-ra.ds8-ds13-dnskey8,mitm-rs8-ra.ds8-ds16-dnskey8-dnskey16,mitm-ra.ds10-dnskey10,mitm-rs15-ra.ds8-ds16-dnskey8-dnskey16,mitm-rs13-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra-ds8-ds13-ds15.ds8-ds15-dnskey15,mitm-rs8-ra.ds13-ds16-dnskey16,mitm-rs16-ra.ds8-ds13-dnskey8-dnskey13,mitm-ra.ds8-dnskey8,mitm-rs8-ra.ds15-ds16-dnskey16,mitm-rs15-ra.ds8-ds16-dnskey16,mitm-ra.ds16-dnskey16,mitm-rs8-ra.ds8-dnskey8,mitm-rs15-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra-ds8-ds13.ds13-ds16-dnskey13-dnskey16,mitm-rs8-ra.ds8-ds16-dnskey16,mitm-ra.ds15-dnskey15,mitm-ra-ds8-ds13.ds8-ds15-dnskey15,mitm-ra-ds8-ds13-ds15-ds16.ds16,mitm-ra.ds14-dnskey14,mitm-rs15-ra.ds8-dnskey8,mitm-ra-ds8-ds13-ds15.ds16,mitm-rs16-ra.ds13-dnskey13,mitm-rs15-ra.ds16,mitm-rs8-ra.ds13-dnskey13,mitm-rs8-ra.ds8-ds13-dnskey8-dnskey13,mitm-rs13-ra.ds16,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds16-dnskey16,mitm-rs13-ra.ds8-ds16-dnskey16,mitm-rs16-ra.ds13-ds15-dnskey15,mitm-rs8-ra.ds13-ds15-dnskey13-dnskey15,mitm-ra.ds5-dnskey5,mitm-rs8-ra.ds13-ds16-dnskey13-dnskey16,mitm-ra.ds13-dnskey13,mitm-rs16-ra.ds8-ds15-dnskey8-dnskey15,broken,mitm-rs16-ra.ds8-ds13-dnskey13,mitm-rs16-ra.ds15-ds16-dnskey15,mitm-rs8-ra.ds16,mitm-rs13-ra.ds8-ds15-dnskey8-dnskey15,mitm-ra-ds8-ds13-ds15.ds13-ds16-dnskey13-dnskey16,mitm-rs16-ra.ds13-ds16-dnskey16,mitm-rs15-ra.ds8-ds13-dnskey13,mitm-rs16-ra.ds13-ds16-dnskey13-dnskey16,session-finish,mitm-rs16-ra.ds8-ds13-dnskey8
token,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1
1001178984,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
1004758708,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
1005158579,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
1008382496,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,True,False,False,True,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,True,False
1012713579,False,False,False,False,False,True,False,False,True,True,False,False,True,True,False,True,True,False,False,False,False,False,True,True,True,True,True,False,True,False,True,True,True,False,True,False,True,False,True,True,True,True,True,True,True,True,True,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
99005795,False,False,False,False,False,True,False,False,True,True,False,False,True,True,False,True,True,False,False,False,False,False,True,True,True,True,True,False,True,False,True,True,True,False,True,False,True,False,True,True,True,True,True,True,True,True,True,True
990452317,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
991581256,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False
996220222,False,False,False,False,False,True,False,False,True,True,False,False,True,True,False,True,True,False,True,False,True,False,True,True,True,True,True,False,True,False,True,True,True,False,True,False,True,False,True,True,True,True,True,True,True,True,True,True


### Determine and Plot Statistics of Interest

In [13]:
ALGORITHMS = [5, 8, 10, 13, 14, 15, 16]

In [14]:
for a in ALGORITHMS:
    df_validators[f'supports_{a}'] = ~df_validators[f'mitm-ra.ds{a}-dnskey{a}']

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_validators[f'supports_{a}'] = ~df_validators[f'mitm-ra.ds{a}-dnskey{a}']


In [15]:
del df_validators['session-finish']
del df_validators['broken']

In [16]:
df_requests = df_validators.reset_index().melt(id_vars=['token'] + [f'supports_{a}' for a in ALGORITHMS], var_name='zone', value_name='request')

In [17]:
df_requests.head(3)

Unnamed: 0,token,supports_5,supports_8,supports_10,supports_13,supports_14,supports_15,supports_16,zone,request
0,1001178984,True,True,True,True,True,True,True,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,False
1,1004758708,True,True,True,True,True,True,True,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,False
2,1005158579,True,True,True,True,True,True,True,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,False


In [18]:
df_requests['attack'] = df_requests.apply(lambda row: row['zone'].split('.')[0], axis=1)
df_requests['zone_prefix'] = df_requests.apply(lambda row: row['zone'].split('.')[1], axis=1)
df_requests['ds'] = df_requests.apply(lambda row: tuple(a for a in ALGORITHMS if f'ds{a}' in row['zone_prefix']), axis=1)
df_requests['dnskey'] = df_requests.apply(lambda row: tuple(a for a in ALGORITHMS if f'dnskey{a}' in row['zone_prefix']), axis=1)
df_requests['support'] = df_requests.apply(lambda row: tuple(a for a in ALGORITHMS if row[f'supports_{a}']), axis=1)
df_requests['validation_path'] = df_requests.apply(lambda row: tuple(set(row['ds']) & set(row['dnskey'])), axis=1)
df_requests['supported_validation_path'] = df_requests.apply(lambda row: tuple(set(row['validation_path']) & set(row['support'])), axis=1)
df_requests['supported_ds'] = df_requests.apply(lambda row: tuple(set(row['ds']) & set(row['support'])), axis=1)
df_requests['evil_content'] = df_requests.apply(lambda row: '-ra' in row['attack'] or '-at' in row['attck'], axis=1)

In [19]:
def behavior_correct(row):
    if row['supported_ds'] and row['evil_content']:
        # there are supported DS records and we delivered evil content, correct behavior is SERVFAIL
        return not row['request']
    
    if not row['supported_ds']:
        # there are no supported DS algorithms, hence this must be treated as insecure, correct behavior is NOERROR
        return row['request']
    
    return None

df_requests['behavior_correct'] = df_requests.apply(behavior_correct, axis=1)
df_requests[df_requests['behavior_correct'].isna()][['attack', 'ds', 'support', 'supported_validation_path', 'supported_ds', 'evil_content', 'request']]

Unnamed: 0,attack,ds,support,supported_validation_path,supported_ds,evil_content,request


In [20]:
df_requests = df_requests.join(df_token_region, on='token')

In [21]:
df_requests.head(3)

Unnamed: 0,token,supports_5,supports_8,supports_10,supports_13,supports_14,supports_15,supports_16,zone,request,attack,zone_prefix,ds,dnskey,support,validation_path,supported_validation_path,supported_ds,evil_content,behavior_correct,region
0,1001178984,True,True,True,True,True,True,True,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,False,mitm-ra-ds8-ds13-ds15-ds16,ds8-ds15-dnskey15,"(8, 15)","(15,)","(5, 8, 10, 13, 14, 15, 16)","(15,)","(15,)","(8, 15)",True,True,oc
1,1004758708,True,True,True,True,True,True,True,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,False,mitm-ra-ds8-ds13-ds15-ds16,ds8-ds15-dnskey15,"(8, 15)","(15,)","(5, 8, 10, 13, 14, 15, 16)","(15,)","(15,)","(8, 15)",True,True,oc
2,1005158579,True,True,True,True,True,True,True,mitm-ra-ds8-ds13-ds15-ds16.ds8-ds15-dnskey15,False,mitm-ra-ds8-ds13-ds15-ds16,ds8-ds15-dnskey15,"(8, 15)","(15,)","(5, 8, 10, 13, 14, 15, 16)","(15,)","(15,)","(8, 15)",True,True,sa


In [22]:
df_requests['num_ds'] = df_requests.apply(lambda row: len(row['ds']), axis=1)
df_requests['num_supported_ds'] = df_requests.apply(lambda row: len(row['supported_ds']), axis=1)
df_requests['num_unsupported_ds'] = df_requests.apply(lambda row: row['num_ds'] - row['num_supported_ds'], axis=1)

In [23]:
def rrsig(row):
    if '-rs' in row['attack'] and '-ds' in row['attack']:
        raise NotImplemented
    if '-rs' in row['attack']:
        for a in ALGORITHMS:
            if f'-rs{a}' in row['attack']:
                return tuple([a])
    if '-ds' in row['attack']:
        return tuple(set(row['ds']) - {a for a in ALGORITHMS if f'ds{a}' in row['attack']})
    return row['ds']

df_requests['rrsig'] = df_requests.apply(rrsig, axis=1)

In [24]:
df_requests['supported_rrsig'] = df_requests.apply(lambda row: tuple(set(row['rrsig']) & set(row['support'])), axis=1)
df_requests['num_rrsig'] = df_requests.apply(lambda row: len(row['rrsig']), axis=1)
df_requests['num_supported_rrsig'] = df_requests.apply(lambda row: len(row['supported_rrsig']), axis=1)
df_requests['num_unsupported_rrsig'] = df_requests.apply(lambda row: row['num_rrsig'] - row['num_supported_rrsig'], axis=1)
df_requests['has_supported'] = df_requests.apply(lambda row: row['num_rrsig'] - row['num_supported_rrsig'], axis=1)

In [25]:
df_requests['has_supported_ds'] = df_requests.apply(lambda row: bool(row['num_supported_ds']), axis=1)
df_requests['has_unsupported_ds'] = df_requests.apply(lambda row: bool(row['num_unsupported_ds']), axis=1)
df_requests['has_supported_rrsig'] = df_requests.apply(lambda row: bool(row['num_supported_rrsig']), axis=1)
df_requests['has_unsupported_rrsig'] = df_requests.apply(lambda row: bool(row['num_unsupported_rrsig']), axis=1)

In [26]:
df_affected_tokens = df_requests.groupby(['token', 'region']).agg({
    'behavior_correct': [min]
}).reset_index()
df_affected_tokens['has_any_vulnerability'] = ~df_affected_tokens[('behavior_correct', 'min')]
df_affected_tokens.columns = df_affected_tokens.columns.droplevel(1)
df_affected_tokens = df_affected_tokens.groupby(['region']).agg({
    'has_any_vulnerability': ['mean']
})
df_affected_tokens

Unnamed: 0_level_0,has_any_vulnerability
Unnamed: 0_level_1,mean
region,Unnamed: 1_level_2
af,0.876033
as,0.8
eu,0.538462
na,0.866667
oc,0.090659
sa,0.828452


In [27]:
def user_rel(s):
    return f"{s.mean():.1%} ({s.sum():n} of {len(s)})"

vuln = 'user vulnerable to the following attack in at least one DS configuration'
by = ['region', 'attack']
df_affected_tokens = df_requests.groupby(['token'] + by).agg({
    'behavior_correct': [min]
}).reset_index()
df_affected_tokens[vuln] = ~df_affected_tokens[('behavior_correct', 'min')]
df_affected_tokens.columns = df_affected_tokens.columns.droplevel(1)
df_affected_tokens = df_affected_tokens.groupby(by).agg({
    vuln: [user_rel]
}).reset_index()
df_affected_tokens.columns = df_affected_tokens.columns.droplevel(1)
df_affected_tokens = df_affected_tokens.pivot(index=by[0], columns=by[1:], values=[vuln])
df_affected_tokens#.style.format(lambda v: f'{v:.1%}')

Unnamed: 0_level_0,user vulnerable to the following attack in at least one DS configuration,user vulnerable to the following attack in at least one DS configuration,user vulnerable to the following attack in at least one DS configuration,user vulnerable to the following attack in at least one DS configuration,user vulnerable to the following attack in at least one DS configuration,user vulnerable to the following attack in at least one DS configuration,user vulnerable to the following attack in at least one DS configuration,user vulnerable to the following attack in at least one DS configuration
attack,mitm-ra,mitm-ra-ds8-ds13,mitm-ra-ds8-ds13-ds15,mitm-ra-ds8-ds13-ds15-ds16,mitm-rs13-ra,mitm-rs15-ra,mitm-rs16-ra,mitm-rs8-ra
region,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2
af,0.0% (0 of 121),45.5% (55 of 121),46.3% (56 of 121),34.7% (42 of 121),50.4% (61 of 121),47.1% (57 of 121),47.1% (57 of 121),83.5% (101 of 121)
as,0.0% (0 of 100),40.0% (40 of 100),49.0% (49 of 100),46.0% (46 of 100),45.0% (45 of 100),43.0% (43 of 100),47.0% (47 of 100),68.0% (68 of 100)
eu,0.0% (0 of 169),17.8% (30 of 169),21.3% (36 of 169),12.4% (21 of 169),15.4% (26 of 169),15.4% (26 of 169),19.5% (33 of 169),49.1% (83 of 169)
na,0.0% (0 of 315),33.3% (105 of 315),35.6% (112 of 315),30.8% (97 of 315),32.4% (102 of 315),32.4% (102 of 315),33.3% (105 of 315),84.1% (265 of 315)
oc,0.0% (0 of 364),2.5% (9 of 364),4.4% (16 of 364),1.9% (7 of 364),4.1% (15 of 364),3.0% (11 of 364),3.6% (13 of 364),6.9% (25 of 364)
sa,0.0% (0 of 239),51.9% (124 of 239),59.8% (143 of 239),48.1% (115 of 239),60.7% (145 of 239),60.3% (144 of 239),62.3% (149 of 239),79.1% (189 of 239)


In [182]:
def user_rel(s):
    return f"{s.mean():.1%} ({s.sum():n} of {len(s)})"

vuln = 'proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination'
by = ['region', 'has_supported_ds', 'has_unsupported_ds', 'has_supported_rrsig', 'has_unsupported_rrsig']
df_affected_tokens = df_requests.groupby(['token'] + by).agg({
    'behavior_correct': [min]
}).reset_index()
df_affected_tokens[vuln] = ~df_affected_tokens[('behavior_correct', 'min')]
df_affected_tokens.columns = df_affected_tokens.columns.droplevel(1)
df_affected_tokens = df_affected_tokens.groupby(by).agg({
    vuln: [user_rel]
}).reset_index()
df_affected_tokens.columns = df_affected_tokens.columns.droplevel(1)
df_affected_tokens = df_affected_tokens.pivot(index=by[0], columns=by[1:], values=[vuln])
df_affected_tokens#.style.format(lambda v: f'{v:.1%}')

Unnamed: 0_level_0,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination,proportion of users vulnerable in at least one configuration matching the specified DS/RRSIG combination
has_supported_ds,False,False,False,True,True,True,True,True,True
has_unsupported_ds,True,True,True,False,False,False,True,True,True
has_supported_rrsig,False,False,True,False,False,True,False,False,True
has_unsupported_rrsig,False,True,False,False,True,False,False,True,False
region,Unnamed: 1_level_5,Unnamed: 2_level_5,Unnamed: 3_level_5,Unnamed: 4_level_5,Unnamed: 5_level_5,Unnamed: 6_level_5,Unnamed: 7_level_5,Unnamed: 8_level_5,Unnamed: 9_level_5
af,36.8% (39 of 106),23.4% (25 of 107),46.8% (44 of 94),6.3% (4 of 63),51.6% (47 of 91),47.3% (52 of 110),3.2% (3 of 94),57.4% (54 of 94),54.3% (51 of 94)
as,53.2% (33 of 62),38.0% (27 of 71),56.7% (34 of 60),14.9% (10 of 67),57.4% (35 of 61),48.9% (45 of 92),13.3% (8 of 60),66.1% (41 of 62),61.3% (38 of 62)
eu,15.2% (15 of 99),31.4% (32 of 102),56.4% (44 of 78),5.7% (6 of 105),28.9% (22 of 76),15.6% (23 of 147),6.7% (5 of 75),39.7% (31 of 78),30.8% (24 of 78)
na,31.2% (83 of 266),16.0% (43 of 269),66.4% (162 of 244),6.5% (8 of 123),35.7% (85 of 238),30.2% (89 of 295),4.5% (11 of 244),41.0% (100 of 244),38.1% (93 of 244)
oc,23.1% (6 of 26),32.1% (9 of 28),80.8% (21 of 26),0.6% (2 of 350),10.7% (3 of 28),3.3% (12 of 364),3.6% (1 of 28),25.0% (7 of 28),14.3% (4 of 28)
sa,50.7% (103 of 203),15.4% (32 of 208),36.0% (71 of 197),6.1% (10 of 163),67.3% (132 of 196),59.0% (138 of 234),4.5% (9 of 198),71.9% (143 of 199),68.8% (137 of 199)


In [183]:
def vulnerable(row):
    return {
        True: False,
        False: True,
    }.get(row['behavior_correct'], None)

df_requests['vulnerable'] = df_requests.apply(vulnerable, axis=1)

In [184]:
# TODO replace with Elias' data
# values taken from Crawler Tranco
tranco_ds_distribution = {(1,): 4,
 (3,): 1,
 (5,): 882,
 (5, 7): 2,
 (5, 7, 8): 1,
 (5, 8): 20,
 (5, 10): 2,
 (5, 12): 1,
 (5, 13): 7,
 (7,): 1472,
 (7, 8): 8,
 (7, 8, 13, 14): 1,
 (7, 10): 1,
 (7, 13): 9,
 (8,): 21963,
 (8, 10): 5,
 (8, 13): 23,
 (8, 14): 1,
 (10,): 710,
 (10, 13): 2,
 (10, 14): 1,
 (12,): 2,
 (13,): 17862,
 (13, 15): 1,
 (14,): 267,
 (15,): 2}
tranco_ds_total = sum(c for c in tranco_ds_distribution.values())

# values taken from Crawler TLD
tld_ds_distribution = {(5,): 29, (7,): 34, (7, 8): 4, (8,): 1225, (10,): 33, (13,): 45}
tld_ds_total = sum(c for c in tld_ds_distribution.values())

In [198]:
_region_count_cache = {}

def count_region(r):
    if r not in _region_count_cache:
        _region_count_cache[r] = len(df_requests[df_requests['region'] == r]['token'].unique())
    return _region_count_cache[r]

def region(row):
    return {
        'af': 'Africa',
        'as': 'Asia',
        'eu': 'Europe',
        'na': 'North America',
        'oc': 'Oceania',
        'sa': 'South America',
    }.get(row['region'], row['region']) + f' (n={count_region(row["region"])})'

df_requests['Region'] = df_requests.apply(region, axis=1)

In [211]:
df_user_vuln = df_requests[df_requests['attack'] != 'mitm-ra'].groupby(['Region', 'ds', 'token']).agg(
    {'vulnerable': [any]}  # aggregation across attacks, hence using any
).groupby(['Region', 'ds']).agg({
    ('vulnerable', 'any'): 'mean'}  # aggregation across users (tokens), hence using user_rel (mean)
).reset_index().sort_values(['Region', 'ds'])
df_user_vuln.columns = df_user_vuln.columns.droplevel(1)#.droplevel(1)
df_user_vuln = df_user_vuln.pivot(index=['Region'], columns=['ds'], values=['vulnerable'])
df_user_vuln.columns = pd.MultiIndex.from_tuples(
    [(
        #'Prevalence of User Using Vulnerable Resolvers Conditioned on DS Algorithms and World Region', 
        ', '.join(str(int(a)) for a in x[1]),
        f"{tranco_ds_distribution.get(x[1], 0)/tranco_ds_total:.0%}", 
        f"{tld_ds_distribution.get(x[1], 0)/tld_ds_total:.0%}"
    ) for x in df_user_vuln.columns],
    names=['DS Algorithms', 'Prevalence in Tranco 1M', 'Prevalence in TLDs']
)
df_user_vuln.style.format(lambda v: f"{v:.1%}")

DS Algorithms,8,"8, 13","8, 15","8, 16",13,"13, 15","13, 16","15, 16",16
Prevalence in Tranco 1M,51%,0%,0%,0%,41%,0%,0%,0%,0%
Prevalence in TLDs,89%,0%,0%,0%,3%,0%,0%,0%,0%
Region,Unnamed: 1_level_3,Unnamed: 2_level_3,Unnamed: 3_level_3,Unnamed: 4_level_3,Unnamed: 5_level_3,Unnamed: 6_level_3,Unnamed: 7_level_3,Unnamed: 8_level_3,Unnamed: 9_level_3
Africa (n=121),37.2%,45.5%,46.3%,47.1%,41.3%,42.1%,47.9%,84.3%,40.5%
Asia (n=100),30.0%,39.0%,44.0%,41.0%,33.0%,36.0%,47.0%,62.0%,48.0%
Europe (n=169),13.0%,15.4%,17.2%,15.4%,14.2%,14.8%,22.5%,50.3%,13.0%
North America (n=315),26.0%,30.5%,32.7%,32.1%,25.4%,28.6%,32.4%,84.1%,29.8%
Oceania (n=364),1.6%,1.6%,1.9%,1.9%,1.6%,1.1%,3.0%,6.3%,5.5%
South America (n=239),56.5%,59.4%,60.7%,59.0%,55.2%,57.3%,61.5%,80.8%,48.1%


In [213]:
formatters = {
    k: lambda v: f"{v:.1%}"
    for k in df_user_vuln.keys()
}
print(df_user_vuln.to_latex(index=True, formatters=formatters, escape=True, na_rep='', column_format="l" + (len(df_user_vuln.keys())) * "r"))

\begin{tabular}{lrrrrrrrrr}
\toprule
DS Algorithms &     8 & 8, 13 & 8, 15 & 8, 16 &    13 & 13, 15 & 13, 16 & 15, 16 &    16 \\
Prevalence in Tranco 1M &   51\% &    0\% &    0\% &    0\% &   41\% &     0\% &     0\% &     0\% &    0\% \\
Prevalence in TLDs &   89\% &    0\% &    0\% &    0\% &    3\% &     0\% &     0\% &     0\% &    0\% \\
Region                &       &       &       &       &       &        &        &        &       \\
\midrule
Africa (n=121)        & 37.2\% & 45.5\% & 46.3\% & 47.1\% & 41.3\% &  42.1\% &  47.9\% &  84.3\% & 40.5\% \\
Asia (n=100)          & 30.0\% & 39.0\% & 44.0\% & 41.0\% & 33.0\% &  36.0\% &  47.0\% &  62.0\% & 48.0\% \\
Europe (n=169)        & 13.0\% & 15.4\% & 17.2\% & 15.4\% & 14.2\% &  14.8\% &  22.5\% &  50.3\% & 13.0\% \\
North America (n=315) & 26.0\% & 30.5\% & 32.7\% & 32.1\% & 25.4\% &  28.6\% &  32.4\% &  84.1\% & 29.8\% \\
Oceania (n=364)       &  1.6\% &  1.6\% &  1.9\% &  1.9\% &  1.6\% &   1.1\% &   3.0\% &   6.3\% &  5.5\% \\


In [230]:
df_user_vuln = df_requests[df_requests['attack'] != 'mitm-ra'].groupby(['Region', 'token', 'ds']).agg(
    {'vulnerable': [any]}  # aggregation across attacks, hence using any
).reset_index()
df_user_vuln.columns = df_user_vuln.columns.droplevel(1)
df_user_vuln = df_user_vuln.pivot(index=['Region', 'token'], columns=['ds'], values=['vulnerable'])
df_user_vuln.head(20).style.apply(lambda row: ['background-color: red;' if val else '' for val in row], axis=1)

  return array(a, dtype, copy=False, order=order)


Unnamed: 0_level_0,Unnamed: 1_level_0,vulnerable,vulnerable,vulnerable,vulnerable,vulnerable,vulnerable,vulnerable,vulnerable,vulnerable
Unnamed: 0_level_1,ds,"(8,)","(8, 13)","(8, 15)","(8, 16)","(13,)","(13, 15)","(13, 16)","(15, 16)","(16,)"
Region,token,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2
Africa (n=121),1029828803,True,True,True,True,True,True,True,True,True
Africa (n=121),105095885,True,True,True,True,True,True,True,True,False
Africa (n=121),1108442410,False,False,False,False,False,False,False,True,False
Africa (n=121),1226409206,False,False,False,False,False,False,False,True,False
Africa (n=121),1250680222,True,True,True,True,True,True,True,True,False
Africa (n=121),1332352601,True,True,True,True,True,True,True,True,True
Africa (n=121),136107605,True,True,True,True,True,True,True,True,True
Africa (n=121),1373886788,False,True,True,True,False,True,False,True,True
Africa (n=121),1377730578,False,False,False,False,False,False,False,True,True
Africa (n=121),1388563405,False,False,False,False,False,False,False,True,False


In [220]:
df_user_behaviors = df_user_vuln.reset_index().groupby(list(sorted(df_user_vuln.keys()))).agg({('token', ''): ['count']}).reset_index().sort_values(('token', '', 'count'), ascending=False)
df_user_behaviors.head(30).style.apply(lambda row: ['background-color: red;' if val < 1 else '' for val in row], axis=1).format(lambda v: f"{v:n}")

TypeError: '<' not supported between instances of 'str' and 'int'

<pandas.io.formats.style.Styler at 0x7fd39d98f550>

In [216]:
len(df_requests['token'].unique())

1308

In [None]:
df_user_behaviors = df_user_vuln.reset_index().groupby(list(sorted(df_user_vuln.keys()))).agg({('token', ''): ['count']}).reset_index().sort_values(('token', '', 'count'), ascending=False)
df_user_behaviors.head(30).style.apply(lambda row: ['background-color: red;' if val < 1 else '' for val in row], axis=1).format(lambda v: f"{v:n}")

In [59]:
behaviors = sorted()
df_user_vuln['behavior'] = df_user_vuln.apply(lambda row: ''.join(['0' if row[k] < 1 else '1' for k in behaviors]), axis=1)

In [72]:
df_user_vuln.reset_index().groupby(['behavior']).agg({('token', '', ''): ['count']}).reset_index().sort_values(('token', '', '', 'count'), ascending=False).head(30)

Unnamed: 0_level_0,behavior,token
"(attack, )",Unnamed: 1_level_1,Unnamed: 2_level_1
"(zone_prefix, )",Unnamed: 1_level_2,Unnamed: 2_level_2
Unnamed: 0_level_3,Unnamed: 1_level_3,count
206,1111111111111111111111111111111111111111111111,527
205,1111111111111111111111111111111111111111011111,299
50,1111111010110111110001001000000000000000111111,172
65,1111111010111111110001001000000100000000011111,27
73,1111111010111111111111111111100011111111111111,20
64,1111111010111111110001001000000000000000111111,19
141,1111111111010111110001001000000000000000111111,7
154,1111111111011111111111111111111111111111011111,5
75,1111111010111111111111111111111111111111011111,5
155,1111111111011111111111111111111111111111111111,3


In [73]:
behaviors[-6]

(('behavior_correct', 'mean'), 'mitm-rs8-ra', 'ds15-ds16-dnskey16')