Recorded ping tests: one per minute, a single ping each, to 8.8.8.8 (Google DNS).

Data quality notes: the timestamps jump at one point because the timezone was corrected on the measuring device. The timezone was recorded, but I chose to keep processing simple and ignore timezones as much as possible.

In [1]:
# Input file: contains output of 'date' and 'ping'
!head -n 20 googledns.log

Thu 25 Jun 2020 06:54:34 AM BST
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=19.1 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 19.094/19.094/19.094/0.000 ms
Thu 25 Jun 2020 06:56:01 AM BST
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=21.2 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 21.202/21.202/21.202/0.000 ms
Thu 25 Jun 2020 06:57:01 AM BST
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=24.4 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms


In [2]:
import re
from itertools import tee, filterfalse

In [3]:
# read the whole log, closing the file when done
with open('googledns.log') as f:
    lines = [l.strip() for l in f]
datepattern = re.compile('.*2020.*')
dates = list(filter(lambda line: datepattern.match(line), lines))
print("First and last timestamp:")
print(dates[0])
print(dates[-1])
print(len(dates))

# here you can see that TZ change mentioned at the top

First and last timestamp:
Thu 25 Jun 2020 06:54:34 AM BST
Mon 10 Aug 2020 04:52:01 PM CEST
64650


In [4]:
resultpattern = re.compile('1 packets transmitted')
results = list(filter(lambda line: resultpattern.match(line), lines))
print("Expected # ping tests:", 60*24*8 + 60*21) # about 8 full days + 21 hours on the last date
print("Actual # ping tests:", len(results))

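# the estimate above assumes about 8 days + 21 hours of logging, but the log
# spans 25 Jun to 10 Aug (about 46 days), so the two numbers do not match;
# the actual count does equal the number of timestamps found earlier (64650)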

Expected # ping tests: 12780
Actual # ping tests: 64650
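The expected count can also be derived from the first and last timestamps instead of being hard-coded. The following is a minimal sketch, not part of the original notebook: `parse_timestamp`, `elapsed_minutes` and the `TZ_OFFSET_HOURS` table are hypothetical helpers, the offsets (BST = UTC+1, CEST = UTC+2) are assumptions about the two abbreviations seen in the log, and the `%a`/`%b`/`%p` parsing assumes an English (or C) locale.

```python
from datetime import datetime, timedelta, timezone

# Assumed UTC offsets (hours) for the two abbreviations seen in the log.
TZ_OFFSET_HOURS = {"BST": 1, "CEST": 2}

def parse_timestamp(line):
    """Parse a `date` line such as 'Thu 25 Jun 2020 06:54:34 AM BST' into UTC."""
    stamp, tz_abbrev = line.strip().rsplit(" ", 1)
    naive = datetime.strptime(stamp, "%a %d %b %Y %I:%M:%S %p")
    local_tz = timezone(timedelta(hours=TZ_OFFSET_HOURS[tz_abbrev]))
    return naive.replace(tzinfo=local_tz).astimezone(timezone.utc)

def elapsed_minutes(first_line, last_line):
    """Whole minutes between two timestamp lines, independent of timezone."""
    delta = parse_timestamp(last_line) - parse_timestamp(first_line)
    return int(delta.total_seconds() // 60)

# First and last timestamps as printed by the notebook.
print(elapsed_minutes("Thu 25 Jun 2020 06:54:34 AM BST",
                      "Mon 10 Aug 2020 04:52:01 PM CEST"))  # 66777
```

At one ping per minute this gives an upper bound on the number of tests the log could contain; any shortfall against the actual count would be minutes in which nothing was logged at all.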


In [5]:
# Now let's check successful & failed pings
def is_good(result):
    return '1 received' in result
def is_bad(result):
    return '0 received' in result
    
success_count = len(list(filter(is_good,results)))
fail_count = len(list(filter(is_bad,results)))
print("Successful:", success_count)
print("Failures  :", fail_count)
print("Total     :", (success_count + fail_count))

Successful: 62894
Failures  : 1756
Total     : 64650
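The same counts can be turned into a loss rate, which is easier to compare across logs of different lengths. A small sketch, where `loss_percentage` is a hypothetical helper and the numbers are the ones printed above:

```python
def loss_percentage(success_count, fail_count):
    """Share of failed pings as a percentage of all pings."""
    total = success_count + fail_count
    if total == 0:
        return 0.0  # avoid dividing by zero on an empty log
    return 100.0 * fail_count / total

print(round(loss_percentage(62894, 1756), 2))  # 2.72
```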


In [6]:
# How many hours of failure is that?
print("Estimated hours of failure:", (fail_count / 60))

Estimated hours of failure: 29.266666666666666


In [7]:
# Let's find when things went wrong
combined = list(zip(dates,results))

# find points at which is_good(previous) XOR is_good(current)
changepoints = [(0, dates[0], results[0])]
for i, result in enumerate(results[:-1]):
    if is_good(result) ^ is_good(results[i+1]):
        changepoints.append((i+1, dates[i+1], results[i+1]))
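The XOR scan above marks every index where the state flips. An alternative way to get at the same information, shown here only as a sketch on toy data, is to group consecutive equal states with `itertools.groupby` and keep the failed runs. `outage_runs` is a hypothetical helper; it takes a list of booleans such as `[is_good(r) for r in results]`.

```python
from itertools import groupby

def outage_runs(flags):
    """Return (start_index, length) for each run of False values in flags."""
    runs = []
    index = 0
    for is_up, group in groupby(flags):
        length = sum(1 for _ in group)  # size of this run of equal values
        if not is_up:
            runs.append((index, length))
        index += length
    return runs

# Toy data: True = ping succeeded, False = ping failed.
sample = [True, True, False, False, False, True, False, True]
print(outage_runs(sample))  # [(2, 3), (6, 1)]
```

With one ping per minute, each run length is roughly the outage duration in minutes.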

In [8]:
changepoints

[(0,
  'Thu 25 Jun 2020 06:54:34 AM BST',
  '1 packets transmitted, 1 received, 0% packet loss, time 0ms'),
 (630,
  'Thu 25 Jun 2020 06:15:16 PM BST',
  '1 packets transmitted, 0 received, 100% packet loss, time 0ms'),
 (651,
  '\x00\x00\x00... [long run of NUL bytes; output truncated]
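The timestamp at this changepoint is preceded by a run of NUL bytes in the log, and `str.strip()` does not remove those. A minimal sketch of a cleaning step that could be applied when reading the file; `clean_line` is a hypothetical helper, and the corrupted sample is constructed here for illustration rather than copied from the log:

```python
def clean_line(raw):
    """Drop NUL bytes and surrounding whitespace from one log line."""
    return raw.replace("\x00", "").strip()

# Illustrative input: a timestamp line padded with NUL bytes.
corrupted = "\x00" * 8 + "Thu 25 Jun 2020 06:32:01 PM BST\n"
print(clean_line(corrupted))  # Thu 25 Jun 2020 06:32:01 PM BST
```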

In [14]:
for i, (offset, date, result) in enumerate(changepoints):
    if is_bad(result):
        if i + 1 < len(changepoints):
            next_offset, enddate, _ = changepoints[i + 1]
            # one sample per minute, so the sample count approximates minutes
            hours = '%2.2f hours' % ((next_offset - offset) / 60)
        else:
            enddate = 'now'
            hours = 'ongoing'
        print("Outage from %s to %s (%s)" % (date, enddate, hours))

Outage from Thu 25 Jun 2020 06:15:16 PM BST to Thu 25 Jun 2020 06:32:01 PM BST (0.35 hours)
Outage from Thu 02 Jul 2020 04:09:01 PM BST to Thu 02 Jul 2020 04:10:01 PM BST (0.02 hours)
Outage from Tue 28 Jul 2020 12:18:01 AM BST to Tue 28 Jul 2020 12:23:01 AM BST (0.08 hours)
Outage from Tue 28 Jul 2020 12:39:01 AM BST to Tue 28 Jul 2020 12:42:02 AM BST (0.05 hours)
Outage from Fri 31 Jul 2020 04:03:01 AM BST to Fri 31 Jul 2020 04:05:01 AM BST (0.03 hours)
Outage from Fri 31 Jul 2020 01:00:01 PM BST to Fri 31 Jul 2020 01:10:01 PM BST (0.17 hours)
Outage from Fri 31 Jul 2020 02:01:01 PM BST to Fri 31 Jul 2020 02:30:01 PM BST (0.48 hours)
Outage from Sat 01 Aug 2020 05:02:01 AM BST to Sat 01 Aug 2020 11:40:01 AM BST (6.62 hours)
Outage from Sat 01 Aug