# Finding the BSP correction factors for all data

## Import important packages and set up file paths

In [None]:
import sys
import glob
import time
import os

import numpy as np
from scipy.io import readsav

from read import read_NX2
import plot

%matplotlib inline
from matplotlib import pyplot as plt

## Lusoria 2006

In [None]:
dat0651 = read_NX2('../data/2006/20060623fifth-day-no-sail.csv', 
                   origin = (49.0164, 12.0285))
dat0661 = read_NX2('../data/2006/20060624sixth-day-with-sail.csv', 
                   origin = (49.0164, 12.0285))

In [None]:
schwaller = readsav('../data/2006/stromgeschwindigkeit.sav')['strom']

In [None]:
temp = plt.quiver(schwaller['x'][0], schwaller['y'][0], schwaller['vx'][0], schwaller['vy'][0])

In [None]:
ax = plot.speeds(dat0661)

Seeing how well the blue and the orange lines agree in the plot above,
I think we can say that the BSP correction factor must be almost exactly one. In fact,
if I remember correctly, we accepted the correction value that the NX2 suggested,
thus making the correction factor 1 for the later days.

Below is a `plot.course` plot, but I don't see any anomalies there either:

In [None]:
ax = plot.course(dat0661, scale=10)

In [None]:
a, ind, ax = plot.fit_BSP(dat0661)
print(a)

So, that is basically all consistent with a correction factor of $0.94 \approx 1$ (within 10% - 
I don't think that we'll get much better than that).
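
The internals of `plot.fit_BSP` are not shown in this notebook, so the following is only a rough sketch of how such a slope could be obtained. It assumes a straight line through the origin, BSP $= \beta \cdot$ SOG, fitted by ordinary least squares; the function name `fit_slope_through_origin` and the synthetic data are made up for illustration, and the real routine additionally applies cuts on the data.

```python
import numpy as np

def fit_slope_through_origin(sog, bsp):
    """Least-squares slope beta for the model bsp = beta * sog (no intercept)."""
    sog = np.asarray(sog, dtype=float)
    bsp = np.asarray(bsp, dtype=float)
    # Ignore samples where either value is missing
    good = np.isfinite(sog) & np.isfinite(bsp)
    # Closed-form solution: beta = sum(x * y) / sum(x^2)
    return np.sum(sog[good] * bsp[good]) / np.sum(sog[good] ** 2)

# Synthetic example (not real NX2 data): a log that reads 6% low, plus noise
rng = np.random.default_rng(0)
sog = rng.uniform(1.0, 4.0, size=500)
bsp = 0.94 * sog + rng.normal(0.0, 0.05, size=500)
beta = fit_slope_through_origin(sog, bsp)
print(f"beta = {beta:5.3f}")
```

Forcing the line through the origin reflects the expectation that a boat at rest has both SOG and BSP equal to zero; a fit with a free intercept would be a different model.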

Now, let's do the same experiment on the other day with sailing data in 2006. All dates without
sailing need not be calibrated at all.

In [None]:
a, ind, ax = plot.fit_BSP(dat0651)
print(a)

Now, $\beta = 0.8$, which is significantly less than on the other day, but after the cuts there
is actually very little data left anyway. Here, I also show the data that was not used in the fit
because it was taken during a time of gusty wind, at unfavourable wind angles etc. (see the
documentation of the fitting procedure for details), and most of this data also indicates a
higher value for $\beta$ than the fit. Thus, I conclude that the value of $\beta$ might well
be compatible with what I have seen for the other day, and I use 0.95 as the correction factor for all
2006 data.

## Victoria: Data from 2008

In [None]:
filelist = glob.glob('../data/2008/08*csv')
filelist.sort()

dat08 = [read_NX2(f) for f in filelist]

In [None]:
ax = plot.course(dat08[5], scale=10)

Looks OK to me. At least, no errors are immediately obvious, i.e. the rowing times line up with the times of the trip, sailing is in between rowing sections etc.

In [None]:
fig = plt.figure(figsize = (20,10))
for i, data in enumerate(dat08):
    ax = fig.add_subplot(3,7,i+1)
    a, ind, ax = plot.fit_BSP(data, ax=ax)
    ax.set_title(f'{i}: {a:5.3f}')
    ax.plot(plt.xlim(), np.array([1.15]) * plt.xlim(), 'g', label = 'm = 1.15')

The plot above shows the SOG on the x-axis and the BSP on the y-axis for all days in 2008. Black dots are used in the fit; yellow dots are rejected, e.g. because they were recorded at times when the wind speed or the direction of the boat changed significantly, or when the course over ground (COG) and the compass course (HDC) disagree. If COG and HDC are parallel and there is no current (as expected on a lake), then SOG and BSP should have the same value. If there is an angle between COG and HDC, then the boat drifts sideways to some degree, so that the BSP measures only one component of the velocity vector. This difference is the drift, one of the parameters we seek to constrain with these measurements.
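
To make the geometry concrete, here is a small sketch of the two ideas in the paragraph above: comparing COG and HDC with proper wrap-around at 360°, and projecting the ground velocity onto the heading. The function names and the 10° threshold are assumptions chosen for illustration; the cuts actually applied by `plot.fit_BSP` are not shown in this notebook and may differ.

```python
import numpy as np

def angle_difference(cog_deg, hdc_deg):
    """Signed difference COG - HDC in degrees, wrapped to [-180, 180)."""
    diff = np.asarray(cog_deg, dtype=float) - np.asarray(hdc_deg, dtype=float)
    return (diff + 180.0) % 360.0 - 180.0

def along_heading_speed(sog, cog_deg, hdc_deg):
    """Component of the ground velocity along the heading (no current assumed)."""
    drift_angle = np.radians(angle_difference(cog_deg, hdc_deg))
    return np.asarray(sog, dtype=float) * np.cos(drift_angle)

def aligned_mask(cog_deg, hdc_deg, max_diff_deg=10.0):
    """True where COG and HDC agree to within max_diff_deg (hypothetical cut)."""
    return np.abs(angle_difference(cog_deg, hdc_deg)) <= max_diff_deg

# A course of 350 deg and a heading of 10 deg differ by 20 deg, not by 340 deg
print(angle_difference(350.0, 10.0))
# Only the first of these two samples would pass a 10 deg cut
print(aligned_mask([5.0, 40.0], [0.0, 0.0]))
```

The modulo step matters for northerly courses, where a naive subtraction would report a difference of several hundred degrees for two nearly parallel directions.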


The green line is the best fit to the dataset from each day; the numerical value for $\beta$, the slope of the line, is shown in the title of each panel. The red line plotted in all figures is a line with $\beta = 1.15$, which I propose to take for all measurements in 2008. In most plots, the fit value is virtually indistinguishable from 1.15; those with strong disagreements have either a very low number of data points (e.g. second and fourth plot in the last row) or a pattern of black dots that looks in some way inconsistent (first and third plot in the first row).

The cloud of black points is always much tighter

Below, I pick the dataset with index 3 (`dat08[3]`), which shows $\beta = 1.0\:$, and look at it in some more detail, because I suspect that this is a case where the log was not fully submerged, possibly one of those days with a lot of guests or a film crew.

In [None]:
ax = plot.course(dat08[3], scale=10)

In [None]:
ax = plot.speeds(dat08[3])

In the SOG-BSP plot there is this odd group of black dots that have SOG$ > 3.3$, but relatively low BSP. Let's find out what happened there.

In [None]:
# Select from the dataset under discussion, not the leftover loop variable `data`
ind33 = dat08[3].SOG > 3.3
ind33.sum()

The time period in question is only 70 s long, and the BSP here is lower than it should be. During this time, the ship moved into the wind, going in a very tight loop. While I cannot reproduce exactly what happened here, it is obvious from the SOG-BSP diagram that the conversion factor in this phase must be different from the usual values.
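
A boolean index such as the one above only gives the number of selected samples. To check whether they form one continuous stretch, one could locate the contiguous runs of `True` values, as in the sketch below. The helper `true_runs` is hypothetical and not part of the `plot` or `read` modules, and the example array is invented; turning a run length into seconds further assumes a known, constant sampling interval.

```python
import numpy as np

def true_runs(mask):
    """Return (start, stop) index pairs for each contiguous run of True.

    The stop index is exclusive, so stop - start is the run length in samples.
    """
    mask = np.asarray(mask, dtype=bool)
    # Pad with False on both sides so that runs touching the edges are closed
    padded = np.concatenate(([False], mask, [False]))
    # Positions where the value flips mark alternating starts and stops
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(a), int(b)) for a, b in zip(edges[::2], edges[1::2])]

# Invented SOG values, just to exercise the helper
sog = np.array([2.9, 3.4, 3.5, 3.1, 3.6])
for start, stop in true_runs(sog > 3.3):
    print(f"samples {start}..{stop - 1}: {stop - start} samples long")
```

A single long run points to one episode, such as the tight loop described above, whereas many short runs would rather suggest scattered outliers.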

## Lusoria 2011

In [None]:
filelist = glob.glob('../data/2011/2011*csv')
filelist.sort()

dat11 = [read_NX2(f) for f in filelist]

In [None]:
fig = plt.figure(figsize=(20, 10))
for i, data in enumerate(dat11):
    ax = fig.add_subplot(5, 7 , i+1)
    a, ind, ax = plot.fit_BSP(data, ax=ax)
    ax.set_title(f'{i}: {a:5.3f}')
    ax.plot(plt.xlim(), np.array([0.87]) * plt.xlim(), 'y', label = 'm = 0.87')

Except for the first dataset (number 0), all datasets are beautifully consistent with $\beta = 0.87\;$. In all cases, the cloud of black points is extremely narrow. Some of the datasets were taken with the mast set, some without, but apparently that makes no difference for the fitting of $\beta\;$.

## Victoria 2012

In [None]:
filelist = glob.glob('../data/2012/2012*csv')
filelist.sort()

dat12 = [read_NX2(f) for f in filelist]

In [None]:
fig = plt.figure(figsize=(20, 4))
for i, data in enumerate(dat12):
    ax = fig.add_subplot(1, 7 , i+1)
    a, ind, ax = plot.fit_BSP(data, ax=ax)
    ax.set_title(f'{i}: {a:5.3f}')
    ax.plot(plt.xlim(), np.array([0.87]) * plt.xlim(), 'y', label = 'm = 0.87')

The same $\beta = 0.87\:$ that worked well in 2011 is again a very good fit to the data in 2012. This is not surprising, since we used exactly the same method of fixing the log; in fact, we can take that as confirmation that the log is put in place in a very reproducible manner.

## Summary

So, from this analysis, we see that the following correction factors should be used for all subsequent analysis:

- Lusoria Regina 2006: $\beta = 0.95$
- Victoria 2008: $\beta = 1.15$
- Lusoria Rhenana 2011 and 2012: $\beta = 0.87$
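
How these factors enter the later analysis is not spelled out in this notebook. One plausible reading, given that the fits above model BSP $= \beta \cdot$ SOG in still water, is to divide the raw log reading by $\beta$. The sketch below rests on that assumption, and the function `correct_bsp` is a made-up name, not part of the existing code.

```python
import numpy as np

def correct_bsp(bsp_raw, beta):
    """Rescale raw log readings, assuming the fit model bsp_raw = beta * true speed."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    # Dividing by beta undoes the systematic over- or under-reading of the log
    return np.asarray(bsp_raw, dtype=float) / beta

# Invented raw readings, corrected with the 2011 value of beta
bsp_corrected = correct_bsp([0.87, 1.74, 2.61], beta=0.87)
print(bsp_corrected)
```

If the downstream code instead defines the correction factor as a multiplier, the division has to be replaced by a multiplication with $1/\beta$ taken into account accordingly.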