# Processing Echosounder Data using `echopype`

In this notebook we demonstrate how to process echosounder data using `echopype`.

We pick data from the Ocean Observatories Initiative (OOI) [Oregon Offshore Cabled Shallow Profiler Mooring](https://oceanobservatories.org/site/ce04osps/) collected on August 21, 2017. This was the day of the solar eclipse, during which the reduced sunlight affected the regular diel vertical migration (DVM) patterns of marine life. This change was directly observed using a moored echosounder that happened to be within the totality zone.

## Processing one file
Let's first test `echopype` by downloading and processing a single file.

**Install echopype**

In [None]:
!pip install echopype

In [None]:
# downloading the file
!wget https://rawdata.oceanobservatories.org/files/CE04OSPS/PC01B/ZPLSCB102_10.33.10.143/OOI-D20170821-T163049.raw 

In [None]:
filename = 'OOI-D20170821-T163049.raw'

**Converting from Raw to Standardized netCDF Format**

In [None]:
import os

In [None]:
# import as part of a submodule
from echopype.convert import Convert
data_tmp = Convert(filename)
data_tmp.raw2nc()
os.remove(filename)

**Calibrating, Denoising, Mean Volume Backscatter Strength**

In [None]:
from echopype.model import EchoData
data = EchoData(filename[:-4]+'.nc')
data.calibrate()     # calibration and echo-integration
data.remove_noise()  # noise removal from calibrated Sv
data.get_MVBS()      # obtain MVBS
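
Conceptually, mean volume backscattering strength is obtained by averaging Sv over tiles of pings and range bins in the linear domain and then converting the mean back to decibels. The sketch below illustrates that idea with NumPy on a synthetic array. The function name `mean_sv_db` and the tile sizes are hypothetical, and this is not `echopype`'s implementation.

```python
import numpy as np

def mean_sv_db(sv_db, ping_tile, range_tile):
    """Average Sv (dB) over non-overlapping tiles in the linear domain.

    sv_db has shape (n_pings, n_range_bins); trailing samples that do not
    fill a complete tile are dropped.
    """
    n_ping = (sv_db.shape[0] // ping_tile) * ping_tile
    n_range = (sv_db.shape[1] // range_tile) * range_tile
    # convert dB to linear units before averaging
    linear = 10.0 ** (sv_db[:n_ping, :n_range] / 10.0)
    tiles = linear.reshape(n_ping // ping_tile, ping_tile,
                           n_range // range_tile, range_tile)
    # average within each tile, then convert back to dB
    return 10.0 * np.log10(tiles.mean(axis=(1, 3)))

# synthetic Sv: uniform -70 dB with one strong scatterer
sv = np.full((4, 6), -70.0)
sv[0, 0] = -40.0
mvbs = mean_sv_db(sv, ping_tile=2, range_tile=3)
```

Averaging in the linear domain means a single strong echo raises the tile mean much more than a plain average of the dB values would.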

**Visualizing the Result**

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
data.MVBS.MVBS.sel(frequency=200000).plot(x='ping_time',cmap = 'jet')
plt.show()

## Processing Multiple Files

Now that we have verified that `echopype` works, let's proceed to process all sonar data from August 21, 2017.

To process multiple files from the OOI website, we need to scrape the names of the files available there. We will use the `Beautiful Soup` package for that.

In [None]:
!conda install --yes beautifulsoup4

In [None]:
from bs4 import BeautifulSoup
from urllib.request import urlopen

In [None]:
path = 'https://rawdata.oceanobservatories.org/files/CE04OSPS/PC01B/ZPLSCB102_10.33.10.143/'

In [None]:
response = urlopen(path)
soup = BeautifulSoup(response.read(), "html.parser")

In [None]:
urls = []
for item in soup.find_all(text=True):
    if '.raw' in item:
        urls.append(path+'/'+item)        

In [None]:
# the same list as above, built with a list comprehension
urls = [path+'/'+item for item in soup.find_all(text=True) if '.raw' in item]

In [None]:
urls
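
As an alternative sketch, the links can also be collected from the `href` attributes of the anchor tags using only the standard library, with `urljoin` building the full URL. The class name `RawLinkParser`, the sample HTML, and the `example.org` base URL are made up for illustration and do not reflect the actual layout of the OOI listing page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class RawLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points to a .raw file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href') or ''
        if href.endswith('.raw'):
            # urljoin avoids a doubled slash when base_url ends with '/'
            self.links.append(urljoin(self.base_url, href))

# made-up listing that mimics a simple directory index page
sample_html = """
<a href="../">Parent</a>
<a href="OOI-D20170821-T163049.raw">OOI-D20170821-T163049.raw</a>
<a href="OOI-D20170821-T163049.bot">OOI-D20170821-T163049.bot</a>
"""
parser = RawLinkParser('https://example.org/data/')
parser.feed(sample_html)
raw_links = parser.links
```

URLs built this way contain a single slash before the filename, so the filename is whatever follows the last `/`.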

In [None]:
from datetime import datetime

Specify the time range. Note that the data files are timestamped in UTC:

In [None]:
start_time = '20170821-T060000'
end_time = '20170822-T070000'

In [None]:
# convert the times to datetime format
start_datetime = datetime.strptime(start_time,'%Y%m%d-T%H%M%S')
end_datetime = datetime.strptime(end_time,'%Y%m%d-T%H%M%S')

In [None]:
# function to check if a date string falls within [start_time, end_time]
def in_range(date_str, start_time, end_time):
    fmt = '%Y%m%d-T%H%M%S'
    date = datetime.strptime(date_str, fmt)
    start = datetime.strptime(start_time, fmt)
    end = datetime.strptime(end_time, fmt)
    return start <= date <= end

In [None]:
# identify the list of urls in range
range_urls = []
for url in urls: 
    date_str = url[-20:-4]
    if in_range(date_str, start_time, end_time):
        range_urls.append(url)

In [None]:
range_urls
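
The fixed slice `url[-20:-4]` works only while every filename has exactly the same length. A slightly more defensive sketch, assuming names of the form `OOI-DYYYYMMDD-THHMMSS.raw`, extracts the timestamp with a regular expression instead. The helper names `file_time` and `select_in_range` and the `example.org` URLs are hypothetical.

```python
import re
from datetime import datetime

# matches the date and time fields in names like OOI-D20170821-T163049.raw
TIMESTAMP_RE = re.compile(r'D(\d{8})-T(\d{6})\.raw$')

def file_time(url):
    """Return the datetime encoded in a .raw filename, or None if absent."""
    match = TIMESTAMP_RE.search(url)
    if match is None:
        return None
    return datetime.strptime(''.join(match.groups()), '%Y%m%d%H%M%S')

def select_in_range(urls, start, end):
    """Keep the URLs whose encoded time falls within [start, end]."""
    selected = []
    for url in urls:
        t = file_time(url)
        if t is not None and start <= t <= end:
            selected.append(url)
    return selected

example_urls = [
    'https://example.org/data/OOI-D20170821-T053000.raw',
    'https://example.org/data/OOI-D20170821-T163049.raw',
    'https://example.org/data/index.html',
]
picked = select_in_range(example_urls,
                         datetime(2017, 8, 21, 6), datetime(2017, 8, 22, 7))
```

Entries that do not match the pattern are skipped rather than raising an error.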

In [None]:
# the filename is the last path component of each URL
rawnames = [url.split('/')[-1] for url in range_urls]

**Downloading the Files**

In [None]:
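import requests
rawnames = []
for url in range_urls:
    # the filename is the last path component of the URL
    rawname = url.split('/')[-1]
    r = requests.get(url, allow_redirects=True)
    r.raise_for_status()  # stop early if the download failed
    with open(rawname, 'wb') as f:
        f.write(r.content)
    rawnames.append(rawname)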
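
Raw echosounder files can be large, so it may be preferable to stream each download to disk rather than hold the whole response in memory. The following is a minimal sketch using only the standard library. The helper names `filename_from_url` and `download` are hypothetical, and skipping files that already exist is an assumption about the desired behavior.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit
from urllib.request import urlopen

def filename_from_url(url):
    """Return the last path component of a URL, ignoring repeated slashes."""
    return PurePosixPath(urlsplit(url).path).name

def download(url, dest_dir='.'):
    """Stream url into dest_dir and return the local path.

    Files that are already present are not downloaded again.
    """
    target = Path(dest_dir) / filename_from_url(url)
    if not target.exists():
        with urlopen(url) as response, open(target, 'wb') as out:
            # copy in chunks instead of reading the full body at once
            shutil.copyfileobj(response, out)
    return target
```

With these helpers, the loop above could be written as `rawnames = [download(url).name for url in range_urls]`.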

**Converting from Raw to Standardized netCDF Format**

In [None]:
# import as part of a submodule
from echopype.convert import Convert
for filename in rawnames:
    data_tmp = Convert(filename)
    data_tmp.raw2nc()
    os.remove(filename)

**Calibrating, Denoising, Mean Volume Backscatter Strength**

In [None]:
# calibrate and denoise
from echopype.model import EchoData

for filename in rawnames:
    data = EchoData(filename[:-4]+'.nc')
    data.calibrate()
    data.remove_noise()
    data.get_MVBS(save=True)
    os.remove(filename[:-4]+'.nc')

**Opening and Visualizing the Results in Parallel**

Now that all files are in an appropriate format, we can open them and visualize them in parallel. For that we will need to install the `dask` parallelization library.

In [None]:
!conda install --yes dask

In [None]:
import xarray as xr

In [None]:
res = xr.open_mfdataset('*MVBS.nc',
                        combine='by_coords',
                        data_vars='different')

In [None]:
depth = res.range.sel(frequency=200000).max() - \
        res.range.sel(frequency=200000)
MVBS_200k = res.MVBS.sel(frequency=200000).\
            sel(ping_time=slice('2017-08-21 07:00:00',
                                '2017-08-22 07:00:00'))
MVBS_200k.coords['depth'] = ('range_bin',depth)
MVBS_200k = MVBS_200k.swap_dims({'range_bin': 'depth'})
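
The cell above turns range from the transducer into depth by subtracting each range from the maximum range, then makes depth the indexing dimension. The same pattern can be tried on a small synthetic array, as sketched here. The array sizes and values are made up, and the variable names are illustrative only.

```python
import numpy as np
import xarray as xr

# synthetic ranges from the transducer, in metres
range_m = np.linspace(0.0, 200.0, 5)

# synthetic MVBS with dimensions (ping_time, range_bin)
mvbs = xr.DataArray(
    np.random.default_rng(0).normal(-70.0, 5.0, size=(3, 5)),
    dims=('ping_time', 'range_bin'),
    coords={'ping_time': np.arange(3), 'range_bin': np.arange(5)},
)

# the farthest range bin is treated as depth 0
depth = range_m.max() - range_m

# attach depth as a coordinate along range_bin, then index by depth
mvbs.coords['depth'] = ('range_bin', depth)
mvbs_depth = mvbs.swap_dims({'range_bin': 'depth'})
```

After the swap, selections such as `mvbs_depth.sel(depth=50.0)` work directly in metres instead of bin numbers.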

In [None]:
import matplotlib.pyplot as plt

In [None]:
MVBS_200k.plot(x='ping_time', y='depth',
               vmin=-80, vmax=-30, cmap='jet',
               size=5, aspect=3, yincrease=False)
plt.xlabel('Ping time', fontsize=14)
plt.ylabel('Depth (m)', fontsize=14)
plt.show()

## Check solar radiation measurements

Now that we've seen what the echogram looks like on the day of the eclipse, let's match the sonar observations with the solar radiation measurements.

From the National Data Buoy Center (http://www.ndbc.noaa.gov/) we see that there is a surface buoy with a pyranometer that measures shortwave radiation (SRAD1 on the NDBC website) at the EAO site (Station 46098: http://www.ndbc.noaa.gov/station_page.php?station=46098). Let's access the data and check it out!

In [None]:
import gzip
import requests
import urllib.request
from datetime import datetime
import numpy as np
import pandas as pd

First we need to construct the URL to the historical data.

In [None]:
srad_url = 'https://www.ndbc.noaa.gov/data/historical/srad/'
filename = '46098'+'r'+'2017'+'.txt.gz'
fileurl = srad_url+filename

Then we open up the file and read all measurements from 2017.

In [None]:
f = gzip.open(urllib.request.urlopen(fileurl))
lines = [line.decode().strip() for line in f.readlines()]

In [None]:
lines[:2]

This tells us that the first 5 columns hold the measurement timestamp and the 6th column is the shortwave radiation measurement. Let's now parse the entire 2017 data set.

In [None]:
srad1_time = []
srad1 = []
for line in lines[2:]:
    line = line.split()
    srad1_time.append(datetime.strptime(''.join(line[:5]), '%Y%m%d%H%M'))
    nn = 5  # the 6th column is SRAD1
    srad1.append(np.nan if line[nn] == '9999.0' else float(line[nn]))

Make it into a pandas DataFrame for convenience.

In [None]:
df_srad = pd.DataFrame(srad1, columns=['SRAD'], index=srad1_time)
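
The same parsing can also be done in one step with `pandas.read_csv`. The sketch below assumes the layout described above (two header lines, five timestamp columns, radiation in the sixth column, `9999.0` marking missing values). The sample lines are made up for illustration and are not actual buoy readings.

```python
import io
import pandas as pd

# made-up sample that mimics the layout of the text file
sample = """#YY  MM DD hh mm  SRAD1
#yr  mo dy hr mn   w/m2
2017 08 21 17 00  410.0
2017 08 21 17 10 9999.0
2017 08 21 17 20   12.5
"""

columns = ['year', 'month', 'day', 'hour', 'minute', 'SRAD']
df = pd.read_csv(io.StringIO(sample), sep=r'\s+', skiprows=2,
                 header=None, usecols=range(6), names=columns)

# assemble the timestamp from its component columns
df.index = pd.to_datetime(df[['year', 'month', 'day', 'hour', 'minute']])

# replace the missing-value marker with NaN and keep only the radiation
df['SRAD'] = df['SRAD'].mask(df['SRAD'] == 9999.0)
df_demo = df[['SRAD']]
```

For the real file, `io.StringIO(sample)` would be replaced by the decompressed file object.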

Now we can see how the solar radiation changed during the eclipse:

In [None]:
df_srad[np.logical_and(df_srad.index>=pd.to_datetime('2017-08-21 07:00:00'), 
                       df_srad.index<=pd.to_datetime('2017-08-22 07:00:00'))].\
        plot(logy=True, color='r', figsize=(12,3))
plt.xlabel('Time', fontsize=14)
plt.show()
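
Because the DataFrame is indexed by time, the same window can also be selected with label slicing, which is shorter than combining two comparisons with `np.logical_and`. A small sketch on synthetic hourly data (the values are placeholders, not measurements):

```python
import numpy as np
import pandas as pd

# synthetic hourly series spanning two days
index = pd.date_range('2017-08-21 00:00', '2017-08-23 00:00',
                      freq=pd.Timedelta(hours=1))
df_demo = pd.DataFrame({'SRAD': np.arange(len(index), dtype=float)},
                       index=index)

# label slicing on a sorted DatetimeIndex includes both endpoints
window = df_demo.loc['2017-08-21 07:00:00':'2017-08-22 07:00:00']
```

This relies on the index being sorted in time, which holds for data read in chronological order.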

There is indeed a sharp drop around the time of the eclipse.

## Combine sonar observation with solar radiation measurements

We can finally put everything together and see the effect of the eclipse on the marine animals!

In [None]:
import matplotlib.dates as mdates

In [None]:
fig = plt.figure(figsize=(12,8))
ax0 = plt.subplot2grid((3, 1), (0, 0))
ax1 = plt.subplot2grid((3, 1), (1, 0),rowspan=2)

df_srad[np.logical_and(df_srad.index>=pd.to_datetime('2017-08-21 07:00:00'), 
                       df_srad.index<=pd.to_datetime('2017-08-22 07:00:00'))].plot(ax=ax0, logy=True, color='r')
ax0.set_ylabel('Radiation (W/m^2)', fontsize=14)

MVBS_200k.plot(x='ping_time', y='depth',
               ax=ax1, vmin=-80, vmax=-30, cmap='jet',
               yincrease=False, add_colorbar=False)
plt.subplots_adjust(hspace = 0.14)
plt.xlabel('Ping time (UTC)', fontsize=14)
plt.ylabel('Depth (m)', fontsize=14)

plt.title('')
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%b-%d %H:%M'))
plt.show()

Notice how the dip in the solar radiation reading lines up with the upward-moving blip in the echogram at around 10:21 local time (17:21 UTC). The animals were fooled by the temporary masking of the sun and behaved as if it were getting dark at dusk!