# On the Suitability of RTT Measurements for Geolocation

rev. 0, 9 August 2017, [Brian Trammell](mailto:ietf@trammell.ch), Networked Systems Group, ETH Zürich

#### Abstract

It is widely understood in the network measurement community that round-trip time (RTT) measurements are not on their own useful as input to trilateration for geolocation, as they are dominated by error terms which are not easy to remove. This holds whether the RTT is passively derived by observation of transport-layer or application-layer signals linking packets in one direction to counterparts in the other, or actively measured using a facility such as ICMP Echo Request and Echo Reply. While several attempts to correct for this error in order to use RTT measurements for certain geolocation tasks appear in the literature, there has not to date been an attempt to provide a quantitative, empirical study of this error as guidance for applying RTT measurements to geolocation.

This work uses RTTs derived from RIPE Atlas Ping measurements to address this shortcoming. This dataset is especially useful in that we have presumptive ground truth for the locations of Atlas probes and of a set of dedicated measurement targets, called Atlas anchors. This allows us to compare our RTT-derived estimated distances to actual distances, as well as to derive empirical models for a function deriving distance from observed RTT.

_Conclusions from the study go here_. 

#### About this notebook

This is a Jupyter notebook containing the analysis code used to generate its visualizations; it is, in a sense, a "runnable paper". It depends on the [dataprep](dataprep.ipynb) notebook in this directory to download measurements from RIPE Atlas and place them in distilled form in an HDF5 datastore, from which this paper reads data and generates visualizations and tables. If you don't have an HDF5 datastore locally, go run the [dataprep](dataprep.ipynb) notebook now, then come back here.

Run the following cell to set up the environment, import useful packages, and load the dataframes from which this paper's analysis and visualization will follow:

In [None]:
%matplotlib inline

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats
import math

with pd.HDFStore('rtt.h5') as store:
    anchor_df = store['anchor_df']
    probe_df = store['probe_df']
    rtt_df = store['rtt_df']

## Introduction

_a paragraph or two about what we're trying to do here_

_a few words about related work_

## Problem Statement

In order to use RTT measurements as input to geolocation, we need to find a function of RTT that yields a distance:

$$ dist = f(RTT) $$

Observable RTT is given by the equation:

$$ RTT_{obs} = \sum_{n=0}^f(D_{prop_{n \rightarrow n+1}} + D_{queue_n} + D_{proc_n}) + \sum_{m=0}^r(D_{prop_{m \rightarrow m+1}} + D_{queue_m} + D_{proc_m}) + D_{app} $$

In other words, for _f_ hops in the forward direction and _r_ hops in the reverse direction, the observed RTT is the sum of propagation, queueing, and processing delay at each hop, plus any delay at the far endpoint (here labeled as application delay). The idealized RTT for geolocation purposes, however, would be:

$$ RTT_{ideal} = \sum_{n=0}^f D_{prop_{n \rightarrow n+1}} + \sum_{m=0}^r D_{prop_{m \rightarrow m+1}} $$

This ideal situation only holds in the unlikely circumstance that sources of error (i.e., variable queueing delay, unmeasurable processing delay, and application delay) are zero:

$$ RTT_{obs} = RTT_{ideal} \iff RTT_{error} = \sum_{n=0}^f(D_{queue_n} + D_{proc_n}) + \sum_{m=0}^r(D_{queue_m} + D_{proc_m}) + D_{app} = 0 $$
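
The decomposition above can be illustrated with a small sketch. The `HopDelay` class, the `rtt_components` function, and the delay values are hypothetical, chosen only to mirror the symbols in the equations; in practice the per-hop terms are not individually observable, which is the point of the error analysis.

```python
from dataclasses import dataclass


@dataclass
class HopDelay:
    """Hypothetical per-hop delays, all in milliseconds."""
    prop: float   # D_prop: propagation delay to the next hop
    queue: float  # D_queue: queueing delay at this hop
    proc: float   # D_proc: processing delay at this hop


def rtt_components(forward, reverse, d_app=0.0):
    """Return (RTT_obs, RTT_ideal, RTT_error) for the given hops."""
    hops = list(forward) + list(reverse)
    # RTT_ideal: propagation delay only, summed over both directions
    rtt_ideal = sum(h.prop for h in hops)
    # RTT_error: queueing and processing delay at every hop, plus D_app
    rtt_error = sum(h.queue + h.proc for h in hops) + d_app
    return rtt_ideal + rtt_error, rtt_ideal, rtt_error


# Made-up values for illustration only
forward = [HopDelay(5.0, 0.5, 0.1), HopDelay(10.0, 1.0, 0.1)]
reverse = [HopDelay(12.0, 0.0, 0.2)]
rtt_obs, rtt_ideal, rtt_error = rtt_components(forward, reverse, d_app=0.3)
print(rtt_obs, rtt_ideal, rtt_error)
```

By construction, `rtt_obs` equals `rtt_ideal` only when every queueing, processing, and application delay term is zero.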

Even this impossible, idealized situation would not yield us a perfect $dist = f(RTT)$, as the propagation delays themselves follow the paths of the optical fiber, copper wire, radio or optical wireless link carrying each hop, which do not follow straight lines, and each of which may have its own function determining distance given delay. Indeed, the only thing we can say with complete certainty is that the distance cannot equal or exceed the distance light would travel during the estimated one-way delay, taken here as half the observed RTT:

$$ dist < \frac{RTT_{obs}}{2} \times c $$
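
As a minimal sketch of this bound, the function below converts an observed RTT into the maximum distance it permits. The name `max_distance_km` and the choice of milliseconds and kilometres as units are assumptions made for illustration; the constant is the speed of light in vacuum.

```python
# Speed of light in vacuum: 299,792.458 km/s, expressed in km per millisecond
C_KM_PER_MS = 299.792458


def max_distance_km(rtt_ms):
    """Upper bound on endpoint distance (km) for an observed RTT (ms).

    Assumes the one-way delay is half the RTT, as in the inequality above.
    """
    if rtt_ms < 0:
        raise ValueError("RTT must be non-negative")
    return (rtt_ms / 2.0) * C_KM_PER_MS


# A 10 ms RTT bounds the distance at just under 1,500 km
print(max_distance_km(10.0))
```

This gives only a loose upper bound: a 10 ms RTT rules out endpoints more than roughly 1,500 km apart, but says nothing about where within that radius the endpoint lies.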


## Methodology

In [1]:
## TODO