# High temperature at Ruapehu Crater Lake does not imply there is going to be an eruption

## Introduction
How do we make a data visualization that helps explain that a high temperature at Ruapehu Crater Lake does not mean that it's going to erupt? Our previous attempts have focused on showing the temperature data themselves, but that may be too much data, and it doesn't directly address the point we want to make.


## Purpose:
To create and visualise data that directly support our message, for an audience that includes stakeholders and the general public.

## Data Set and Justification:
The data set is the temperature of Ruapehu Crater Lake, recorded by a data logger and by manual sampling. It is restricted to the period since the lake was re-established following the 1995-96 eruptions. This period is chosen because the previously recognised pattern of a high lake temperature being associated with small eruptions no longer seems to apply. In other words, the statistical analysis of eruptions and lake temperature that informs the Department of Conservation (DOC) response may no longer be correct.

Using the temperature data, we count the number of periods of high temperature. A period of high temperature is defined as one when:

- the temperature reaches the top 25% of values, above about 32 °C
- the temperature has risen to that level from below the median value, about 24 °C
- multiple peaks count as a single period, unless the temperature drops below the median between them

We then count the number of eruptions during the period covered by the data set, and the number of those that occurred while the lake temperature was high.
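The three rules above amount to a simple hysteresis count, which can be sketched as follows. This is a minimal illustration and not necessarily how the count in this notebook was obtained. The function name `count_high_periods` and the toy values are assumptions made for the example.

```python
def count_high_periods(values, median, upper):
    """Count rises from below `median` to at or above `upper`.

    Several peaks above `upper` count as one period unless the
    series drops below `median` in between.
    """
    count = 0
    armed = False  # True once the series has been below the median
    for v in values:
        if v < median:
            armed = True  # a later rise to `upper` starts a new period
        elif v >= upper and armed:
            count += 1
            armed = False  # ignore further peaks until we drop below the median
    return count


# Toy values (illustrative only, not lake data): two separate periods,
# the first of which contains two peaks.
toy = [20, 25, 33, 30, 34, 23, 28, 35, 33]
n_periods = count_high_periods(toy, median=24, upper=32)
print(n_periods)
```

With the daily series built later in the notebook, the thresholds could be taken from the data themselves, for example `dfd[' t (C)'].median()` and `dfd[' t (C)'].quantile(0.75)`. The result then depends on the data available when they are retrieved.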

## Key Message:
A derived data set may be the best to get your message across.

## Author:
Steven Sherburn

## Date:
July 2018

In [None]:
from IPython.display import HTML
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
from IPython.display import Image
%matplotlib inline

In [None]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}

In [None]:
# display and HTML live in IPython.display; importing them from IPython.core.display is deprecated
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

## Retrieve the data so we can count the high temperature periods

In [None]:
url = 'http://fits.geonet.org.nz/observation?siteID=RU001&typeID=t&days=6500'
df = pd.read_csv(url, parse_dates=['date-time'], index_col=['date-time'], usecols = ['date-time', ' t (C)'])

**Establish a daily sampling**

The early data are at roughly monthly intervals and the later data at 15-minute intervals. We resample to a daily mean and then linearly interpolate across the gaps. This gives evenly spaced values from which we can estimate the data distribution.

In [None]:
dfd = df.resample('D').mean().interpolate(method='linear')
dfd[' t (C)'].describe()

In [None]:
temp = dfd[' t (C)'].plot(figsize=(15,5))
temp.axhline(dfd[' t (C)'].quantile(q=0.75), color='red', linestyle='dashed', linewidth=1)
temp.axhline(dfd[' t (C)'].median(), color='green', linestyle='dashed', linewidth=1)

## Our dataset

Using our definition, there are 16 periods of high temperature.

We take the 'conservative' approach and count the 2006 event, as well as the 2007 event, as an eruption. The Volcanic Alert Bulletin (VAB) that followed the 2006 event describes it as a small hydrothermal eruption.
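One way to check how many events fall inside a high temperature period is sketched below. It makes two assumptions that the text does not state: a period starts when the temperature first reaches the upper threshold, and it ends when the temperature next drops below the median. The function names, the toy values and the event dates are all hypothetical and are not the real lake data or eruption dates.

```python
from datetime import date


def high_periods(times, values, median, upper):
    """Return (start, end) pairs for each high temperature period.

    Assumption: a period starts when the series first reaches `upper`
    after having been below `median`, and ends when it next drops
    below `median` (or at the last sample if it never does).
    """
    periods = []
    armed = False   # True once the series has been below the median
    start = None    # start time of the period currently open, if any
    last_time = None
    for t, v in zip(times, values):
        if v < median:
            if start is not None:
                periods.append((start, t))  # close the open period
                start = None
            armed = True
        elif v >= upper and armed:
            start = t
            armed = False
        last_time = t
    if start is not None:
        periods.append((start, last_time))  # period still open at end of data
    return periods


def count_events_during(periods, events):
    """Count events whose time lies inside any (start, end) period."""
    return sum(any(s <= e <= end for s, end in periods) for e in events)


# Toy daily series and hypothetical event dates (illustrative only)
times = [date(2000, 1, d) for d in range(1, 10)]
values = [20, 25, 33, 30, 34, 23, 28, 35, 33]
periods = high_periods(times, values, median=24, upper=32)
events = [date(2000, 1, 4), date(2000, 1, 7)]
n_during = count_events_during(periods, events)
print(periods, n_during)
```

A stricter reading of "high" would end each period as soon as the temperature falls back below the upper threshold. That would shorten the periods and could only lower the number of events counted.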

In [None]:
# Counts derived from the daily temperature series and the eruption record
d = [{'feature': 'periods of high temperature', 'value': 16},
     {'feature': 'eruptions', 'value': 2},
     {'feature': 'eruptions during high temperature periods', 'value': 0}]
data = pd.DataFrame.from_dict(d)
data.set_index('feature')

In [None]:
style.use('fivethirtyeight')

fig = plt.figure(figsize=(15,5))

ax1 = fig.add_subplot(1, 1, 1)
ax1.bar(data.feature, data.value, color='indianred')
fig.gca().xaxis.grid(False)
ax1.set_ylabel('Number of observations', color='gray')

#titles
ax1.text(x = 0.03, y = 1, transform=fig.transFigure, s = 'High temperature at Ruapehu Crater Lake does not mean there is going to be an eruption', fontsize=20, weight='semibold')
ax1.text(x = 0.03, y = 0.925, transform=fig.transFigure, s = 'High temperature periods and eruptions since Ruapehu Crater Lake reformed after the 1995-96 eruptions; data since 2001', fontsize=14, color='gray')

#text boxes
ax1.text(2, 0.5, 'NONE', color='indianred', ha='center', fontsize=20, weight='bold')

#signature bar
ax1.text(x = 0, y = -0.05, transform=fig.transFigure, s = 'GeoNet, www.geonet.org.nz', fontsize = 14, color = 'darkgray')
ax1.text(x = 0.95, y = -0.05, transform=fig.transFigure, s = 'SOURCE: GEONET, FITS AND ERUPTION DATABASES', horizontalalignment='right', fontsize = 14, color = 'darkgray');