In [1]:
import plotly as py
import pandas as pd
import datetime as dt
import cufflinks as cf
import pickle
from ftplib import FTP

# Initialise for offline plotting
py.offline.init_notebook_mode(connected=True)

# Online visualisation of logger data

Rolf has some loggers that automatically upload data to NIVA's FTP server. This code provides a starting point for automatically parsing this data and creating some simple online visualisations.

**Note:** The approach here is preliminary and can be substantially improved. At present, I am not using NIVA's new data platform at all, but instead using some older tools deployed in previous projects. With some help from e.g. Akos, we could make a much nicer interface, incorporating additional functionality derived from the new platform.

**Note 2:** This demonstration currently runs locally on my laptop, so the plots are only updated when my laptop is turned on and this code is run. To update the plots regularly, we need to transfer the code to a server and set up a "scheduled task" to e.g. run the code once per day. This is easy, but not necessary to begin with.

## 1. Python environment

For ease of deployment later, it is convenient to create a clean Python environment for the new application:

    conda create -n rolf_loggers python=3.6
    
    conda install -n rolf_loggers -c dhirschfeld pandas plotly cufflinks jupyter notebook
    
Note that the `jupyter` and `notebook` packages are only required here for development and testing.

## 2. Read data from FTP server

Rolf currently has two loggers (IDs 17113521 and 18033677) and we are only interested in the data from 03/05/2018 (3 May 2018) onwards. To avoid sharing the FTP login details via GitHub, I have stored them in a `.pickle` file on my local computer, which the code below reads in order to connect.
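
For reference, a credentials file of this kind could be created with something like the sketch below. The dictionary keys (`host`, `user`, `pw`) match those read in section 2.2, but the helper name `save_ftp_creds` and the values in the usage comment are placeholders I have made up for illustration.

```python
import pickle


def save_ftp_creds(host, user, pw, path='rolf_ftp_creds.pickle'):
    """Write FTP login details to a local pickle file.

    The file should stay on the local machine and be excluded from
    version control (e.g. via .gitignore).
    """
    ftp_creds = {'host': host, 'user': user, 'pw': pw}
    with open(path, 'wb') as handle:
        pickle.dump(ftp_creds, handle)
    return path


# Example usage (placeholder values, not the real login details):
# save_ftp_creds('ftp.example.com', 'my_username', 'my_password')
```

Note that a pickle file is not encrypted, so this only keeps the details out of the repository; it does not protect them on disk.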

**Note:** I'm not sure which parameters Rolf is interested in, so I have just picked four at random in the code below. I originally thought the main focus was on pH, but this variable is not present in the data from 03/05/2018 onwards. **Check with Rolf**.

### 2.1. User input

In [2]:
# Variables of interest
pars = ['Temperature(°C)[1:1]', 'Chloride (Cl-)(mg/L)[24:117]', 
        'Barometric Pressure(mmHg)[16:22]', 'External Voltage(V)[32:163]']

# Start date
st_date = '2018-05-03'

# Logger IDs to consider
loggers = ['17113521', '18033677']

### 2.2. Process FTP data

In [3]:
# Read FTP credentials
with open('rolf_ftp_creds.pickle', 'rb') as handle:
    ftp_creds = pickle.load(handle)
    
# Connect to FTP
ftp = FTP(host=ftp_creds['host'], 
          user=ftp_creds['user'], 
          passwd=ftp_creds['pw'])

# Empty dict for data
df_dict = {}
for logger in loggers:
    df_dict[logger] = []
    
# Parse user start date
st_date = dt.datetime.strptime(st_date, '%Y-%m-%d')

# Get list of files from server
flist = []
ftp.retrlines('MLSD', flist.append)

# Loop over files
for item in flist:
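    # Get file name and metadata (split only on the first '; ')
    facts, fname = item.split('; ', 1)
    
    # Get file date from metadata. Assumes 'modify=' is the third fact
    # returned by the server, followed by a 14-digit timestamp
    fdate = facts.split(';')[2][7:21]
    fdate = dt.datetime.strptime(fdate, '%Y%m%d%H%M%S')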

    # Get logger ID
    log_id = fname.split('__')[0]
    
    if ((log_id in df_dict.keys()) and 
        (fdate >= st_date)):
        # Read from server to local disk
        with open('temp.txt', 'wb') as tmp_file:
            ftp.retrbinary('RETR %s' % fname, tmp_file.write)

        # Read local file to df
        df = pd.read_csv('temp.txt', 
                         sep=';', 
                         header=6, 
                         skipfooter=24, 
                         encoding='windows-1252', 
                         engine='python', 
                         decimal=',')
        
        # Build datetime index
        df.index = pd.to_datetime(df['Date'] + ' ' + df['Time'],
                                  format='%d/%m/%Y %H:%M')
        
        # Get cols of interest
        df = df[pars]
                           
        # Append to dict
        df_dict[log_id].append(df)
    
ftp.quit()

'221 Goodbye, closing session.'
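
The loop above picks the modification date out of each `MLSD` line by position, which depends on the order in which this particular server lists its facts. A slightly more defensive alternative is to parse the facts by name, as in the sketch below. The function names and the example line in the usage comment are hypothetical; the line layout (facts separated by `;`, then a single space, then the file name) follows the `MLSD` format, and `ftplib.FTP.mlsd()` offers similar parsing out of the box.

```python
import datetime as dt


def parse_mlsd_line(line):
    """Split one MLSD response line into (facts dict, file name)."""
    # Facts come first, then a single space, then the file name
    facts_str, _, fname = line.partition(' ')

    facts = {}
    for fact in facts_str.split(';'):
        if fact:  # Skip the empty string after the trailing ';'
            key, _, value = fact.partition('=')
            facts[key.lower()] = value

    return facts, fname


def modified_time(facts):
    """Convert the 'modify' fact (YYYYMMDDHHMMSS) to a datetime."""
    # Ignore any fractional seconds after the first 14 characters
    return dt.datetime.strptime(facts['modify'][:14], '%Y%m%d%H%M%S')


# Example usage with a made-up line:
# facts, fname = parse_mlsd_line(
#     'type=file;size=1024;modify=20180503120000; 17113521__example.txt')
# fdate = modified_time(facts)
```

Looking facts up by name means the code keeps working if the server adds, removes or reorders facts, provided `modify` is still reported.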

## 3. Upload to Plotly

[Plotly](https://plot.ly/) provides tools for online data visualisation and analysis. It is not something we use much at NIVA, as we have decided to develop our own platform instead, but I have a personal Plotly account, which is convenient for developing simple applications like this one. This is not a long-term solution - it's just a useful starting point.

**Note:** For testing, change `py.plotly.plot` in the code below to `py.plotly.iplot` to render the result in the notebook as well as uploading it online.

In [4]:
# Loop over dict
for log_id in df_dict.keys():
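    # Skip loggers with no matching files (pd.concat raises on an empty list)
    if not df_dict[log_id]:
        continue

    # Concatenate all data to single df
    df = pd.concat(df_dict[log_id], axis=0)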
    
    # Create plot and upload
    fig = df.iplot(subplots=True, 
                   subplot_titles=True, 
                   legend=False, 
                   shared_xaxes=False, 
                   asFigure=True,
                   filename='logger_%s' % log_id)

    fig['layout'].update(height=600, width=1200, 
                         title='Logger ID: %s' % log_id)

    py.plotly.plot(fig, filename='logger_%s' % log_id)    
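
Before plotting, it may also be worth tidying the concatenated data. The sketch below assumes (I have not checked this against the logger files) that successive uploads can arrive out of order or repeat some timestamps; the helper name `combine_logger_frames` is my own. It returns `None` when there is nothing to combine, sorts by time and keeps the first record for each repeated timestamp.

```python
import pandas as pd


def combine_logger_frames(frames):
    """Concatenate per-file dataframes into one time-ordered dataframe."""
    # pd.concat raises a ValueError on an empty list, so handle that first
    if not frames:
        return None

    df = pd.concat(frames, axis=0)

    # Keep the first record for any timestamp that appears more than once
    df = df[~df.index.duplicated(keep='first')]

    # Order chronologically so lines are drawn left to right
    return df.sort_index()


# Example usage inside the plotting loop:
# df = combine_logger_frames(df_dict[log_id])
# if df is None:
#     continue
```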

## 4. View results

The online plots can be found here:

 * **[Logger 17113521](https://plot.ly/~James_Sample/14.embed)**
 
 * **[Logger 18033677](https://plot.ly/~James_Sample/12.embed)**