# Exploring the Arctic Ocean through Ice-Tethered Profiles

*Written by Alexandra Rivera, Duke University, Sep 10, 2020*

> The Woods Hole Oceanographic Institution (WHOI) operates a series of Ice-Tethered Profilers (ITPs) that measure properties of the Arctic Ocean. Each day, the active machines send data via satellite to WHOI's server. For each machine, every data file records the time and geographic coordinates of the measurements, along with ocean temperature, salinity, and water pressure. **In this project, we will use these data points to visualize how ocean properties change with depth and through time.**

> Reference: https://www.whoi.edu/website/itp/overview

## Preliminary Steps

### (1) Understanding our data
> In order to work with data coming from an online server, it is crucial to understand how that data is organized. This knowledge will help when you are creating scripts to download the data to your own computer.

> WHOI stores the data from each ITP machine in its own folder. Each folder contains text files named in the format ```"itpgrd0001.dat"```, where the ITP number and data file number change accordingly. Each text file holds the data collected during one cycle down and up through the Arctic Ocean (cycles occur at roughly one-day intervals).

> The image below shows the first few lines of one of these text files. Some files have more information than others, but for this visualization we are only interested in two sections: (1) the second line, which gives the time and location, and (2) the data in the pressure, temperature, and salinity columns.


![](https://drive.google.com/uc?export=view&id=1BrnQb-q2HJSv39UzcL_iWHry6A7qv_Vb)
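
> As a rough illustration of the layout described above, the sketch below pulls the time and location from the second line and the first three data columns from a small sample. The sample text, the `parse_profile` name, and the assumption that pressure, temperature, and salinity are the first three columns are all hypothetical; real files may carry extra columns or header lines.

```python
# Hypothetical sample mimicking the described layout; the values are made up.
SAMPLE = """%ITP 1, profile 1: year day longitude(E+) latitude(N+) ndepths
2005 228.5 -150.13 78.83 3
%pressure(dbar) temperature(C) salinity
10 -1.50 28.10
12 -1.48 28.15
14 -1.45 28.22
%endofdat
"""


def parse_profile(text):
    """Return (info, rows) from the text of one profile."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # Second line: year, time of year, longitude, latitude, number of depths
    year, day, lon, lat, ndepths = lines[1].split()[:5]
    info = {
        'year': int(year),
        'day': float(day),
        'longitude': float(lon),
        'latitude': float(lat),
        'ndepths': int(ndepths),
    }

    # Remaining lines: keep numeric rows, skip anything starting with '%'
    rows = []
    for line in lines[2:]:
        if line.startswith('%'):
            continue
        pressure, temperature, salinity = (float(v) for v in line.split()[:3])
        rows.append((pressure, temperature, salinity))
    return info, rows


info, rows = parse_profile(SAMPLE)
print(info)
print(rows[0])
```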





### (2) Writing data into CSV files
> The next step is to create CSV files containing the data we are interested in. For this specific visualization, we want one CSV file for each text file. The CSVs are provided to you.
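
> A minimal sketch of this conversion step is shown below. It assumes a CSV layout inferred from how the files are read later in this notebook: the first row holds the time and location values, the second row holds the column names, and the remaining rows hold the measurements. The `profile_to_csv` helper and the sample values are hypothetical, not the script that produced the provided CSVs.

```python
import csv
import io

COLUMNS = ['%pressure(dbar)', 'temperature(C)', 'salinity']


def profile_to_csv(info, rows):
    """Build CSV text: info row, column-name row, then one row per measurement."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(info)
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up values: year, time of year, longitude, latitude, number of depths
info = [2005, 228.5, -150.13, 78.83, 3]
rows = [(10.0, -1.5, 28.1), (12.0, -1.48, 28.15), (14.0, -1.45, 28.22)]

csv_text = profile_to_csv(info, rows)
print(csv_text)
```

> To write to disk instead, pass a file opened with `newline=''` to `csv.writer` in place of the `StringIO` buffer.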



## Beginning the Profile Visualizations


> We will be plotting ocean profiles through the seasons. Since each text file corresponds to one cycle down and up through the ocean, we can observe how temperature and salinity vary with depth.



In [1]:
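# Imports modules
import io
import os

import numpy as np
import pandas as pd
from bokeh.io import output_notebook
from bokeh.layouts import gridplot
from bokeh.plotting import figure, show

# Bokeh 3.0 renamed Panel to TabPanel; fall back so the import works on either version
try:
  from bokeh.models import Label, Tabs, TabPanel as Panel
except ImportError:
  from bokeh.models import Label, Panel, Tabs

# Render Bokeh plots inline in the notebook
output_notebook()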

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

> We can pick whichever machine we want to visualize. **Choose the machine from the dropdown menu, and run the cell.**

In [None]:
machine = 'ITP1' #@param ["ITP1", "ITP2", "ITP6", "ITP8", "ITP41", "ITP48", "ITP49", "ITP86", "ITP91", "ITP92"]
itpnumber = machine

> We will plot our data using dataframes. To create these dataframes, the code reads each CSV file. Here we create ```csvfilelist```, a list containing all of the CSV file names. **Run the cell to view a portion of ```csvfilelist```.**

In [None]:
# Creates an empty list, to be filled later
csvfilelist = []

# Iterates over the files in the selected machine's folder and appends each CSV's filename to the list
csv_dir = '/content/drive/My Drive/MIT WHOI Ocean Data Project/Colab Notebooks/' + itpnumber.lower() + 'final_csvs.zip (Unzipped Files)/' + itpnumber.lower() + 'final_csvs/'
for path in os.listdir(csv_dir):
  if path.endswith('.csv'):
    csvfilelist.append(path)

print(csvfilelist[:5])
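
> Note that `os.listdir` returns names in arbitrary order. If the order of the profiles matters, a sorted listing is one option. The sketch below uses a temporary folder with made-up file names so it can run anywhere; the `list_csv_files` helper is hypothetical.

```python
import os
import tempfile


def list_csv_files(folder):
    """Return the CSV file names in `folder`, sorted alphabetically."""
    return sorted(name for name in os.listdir(folder) if name.endswith('.csv'))


# Build a throwaway folder with made-up file names to demonstrate the helper
with tempfile.TemporaryDirectory() as folder:
    for name in ['profile0002.csv', 'profile0001.csv', 'notes.txt']:
        open(os.path.join(folder, name), 'w').close()
    found = list_csv_files(folder)

print(found)
```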

> The first step is to create the graphs using Bokeh. For now we just want them to exist; they do not contain any data yet. **Run the cell to view the empty graphs we created with their proper axis labels.**

Note: Bokeh will print out warnings; these can be disregarded since they only mean the plot is empty.

In [None]:
# Cumulative (all seasons together)
p1c = figure(title= itpnumber + " Temperature Profiles by Season")
p2c = figure(title= itpnumber + " Salinity Profiles by Season")
# Winter graphs
p1w = figure(title= itpnumber + " Temperature Profiles, Jan-Mar")
p2w = figure(title= itpnumber + " Salinity Profiles, Jan-Mar")
# Spring graphs
p1sp = figure(title= itpnumber + " Temperature Profiles, Apr-Jun")
p2sp = figure(title= itpnumber + " Salinity Profiles, Apr-Jun")
# Summer Graphs
p1su = figure(title= itpnumber + " Temperature Profiles, Jul-Sep")
p2su = figure(title= itpnumber + " Salinity Profiles, Jul-Sep")
# Fall Graphs
p1f = figure(title= itpnumber + " Temperature Profiles, Oct-Dec")
p2f = figure(title= itpnumber + " Salinity Profiles, Oct-Dec")

# Creating axis labels
x_temp = 'Temperature, \N{DEGREE SIGN}C'
x_sal = 'Salinity'
y_all = 'Pressure, dbar'

# Grouping the graphs so the same settings can be applied in a loop
temp_plots = [p1w, p1sp, p1su, p1f, p1c]
sal_plots = [p2w, p2sp, p2su, p2f, p2c]

# Applying the axis labels to each graph
for plot in temp_plots:
  plot.xaxis.axis_label = x_temp
for plot in sal_plots:
  plot.xaxis.axis_label = x_sal

for plot in temp_plots + sal_plots:
  plot.yaxis.axis_label = y_all
  # Flipping the y range to view the ocean from top to bottom
  plot.y_range.flipped = True

show(p1c)



> Now we use Bokeh's `Tabs` and `Panel` models so we can click through the different graphs we made. The `gridplot` function places the temperature and salinity plots for each season side by side on one tab. **Run the cell to click through all the tabs of the empty graphs we have created.**

In [None]:
# Cumulative Tab
tab1c = Panel(child=p1c, title='All Seasons, Temperature')
tab2c = Panel(child=p2c, title='All Seasons, Salinity') 
# Winter Tab
gridw = gridplot([[p1w,p2w]], plot_width=400, plot_height=400)
tabw = Panel(child=gridw, title='Winter')
# Spring Tab
gridsp = gridplot([[p1sp,p2sp]], plot_width=400, plot_height=400)
tabsp = Panel(child=gridsp, title='Spring')
# Summer Tab
gridsu = gridplot([[p1su,p2su]], plot_width=400, plot_height=400)
tabsu = Panel(child=gridsu, title='Summer')
# Fall Tab
gridf = gridplot([[p1f,p2f]], plot_width=400, plot_height=400)
tabf = Panel(child=gridf, title='Fall')

tabs=Tabs(tabs=[ tab1c, tab2c, tabw, tabsp, tabsu, tabf ])
show(tabs)



> Now it is time to fill the graphs with data! To do this, we will be working with `pandas` DataFrames.

> A dataframe is made for each CSV file. Depending on the time of year that the data was collected, our code plots the line in a specific color, which helps visualize ocean temperature and salinity through the seasons. In addition, the code tallies how many data files fall in each season.

> **Run the cell to see how our graphs now have data in them. If you want to make them your own, you can change the color of each line in the code.**

In [None]:
# These will record the number of profiles in each season
wintertally = 0
springtally = 0
summertally = 0
falltally = 0

# Placeholders that will point to the plot each profile belongs in
specific_t_graph = None
specific_s_graph = None

# Choosing the columns from the CSV that we want to use
col_list = ['%pressure(dbar)', 'temperature(C)', 'salinity']

# Iterating through each CSV file, and creating 2 dataframes
for csvfilename in csvfilelist:
  # Reads from the same folder that csvfilelist was built from, for whichever machine was selected
  to_read = '/content/drive/My Drive/MIT WHOI Ocean Data Project/Colab Notebooks/' + itpnumber.lower() + 'final_csvs.zip (Unzipped Files)/' + itpnumber.lower() + 'final_csvs/' + csvfilename
  # Dataframe of pressure, temperature and salinity datapoints
  df = pd.read_csv(to_read, skiprows=1, usecols = col_list)
  df.columns = ['Pressure', 'Temperature', 'Salinity']
  # Dataframe of file's information (time and geographic coordinates)
  info = pd.read_csv(to_read, nrows=1, names=['Year', 'YearFrac', 'Long', 'Lat', 'Ndepths'])
  # Whole-number part of the YearFrac value, which the season ranges below treat as a day of the year
  yrfr = int(info['YearFrac'].iloc[0])

  # Depending on the time of the year the file was taken, assign the line a specific color and plot it in the right graph
  # Also adds a tally to the number of profiles in each season
  if yrfr in range(0, 92): 
    color = 'steelblue'
    wintertally += 1
    specific_t_graph = p1w
    specific_s_graph = p2w
  elif yrfr in range(92, 183): 
    color = 'palevioletred'
    springtally += 1
    specific_t_graph = p1sp
    specific_s_graph = p2sp
  elif yrfr in range(183, 275):
    color = 'lightgreen'
    summertally += 1
    specific_t_graph = p1su
    specific_s_graph = p2su
  elif yrfr in range(275, 367):
    color = 'skyblue'
    falltally += 1
    specific_t_graph = p1f
    specific_s_graph = p2f

  # For each csv file, plot its profile in its specific season graph, and in the cumulative graph    
  specific_t_graph.line(df['Temperature'], df['Pressure'], color = color, alpha=0.5)
  p1c.line(df['Temperature'], df['Pressure'], color = color, alpha=0.5)
  specific_s_graph.line(df['Salinity'], df['Pressure'], color = color, alpha=0.5)
  p2c.line(df['Salinity'], df['Pressure'], color = color, alpha=0.5)

tabs=Tabs(tabs=[ tab1c, tab2c, tabw, tabsp, tabsu, tabf ])
show(tabs)
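
> The season lookup in the cell above can also be written as a small standalone function, which makes the boundaries easy to check. This sketch mirrors the same ranges and colors; the `season_for_day` name and the sample day values are hypothetical.

```python
from collections import Counter

# (first day, last day + 1, season, line color), mirroring the ranges in the cell above
SEASONS = [
    (0, 92, 'winter', 'steelblue'),
    (92, 183, 'spring', 'palevioletred'),
    (183, 275, 'summer', 'lightgreen'),
    (275, 367, 'fall', 'skyblue'),
]


def season_for_day(day):
    """Return (season, color) for a day-of-year value, or None if out of range."""
    whole_day = int(day)
    for start, stop, season, color in SEASONS:
        if start <= whole_day < stop:
            return season, color
    return None


# Tally a few made-up day values by season
sample_days = [15.2, 91.9, 92.0, 228.5, 300.0, 310.7]
tally = Counter(season_for_day(day)[0] for day in sample_days)
print(tally)
```

> Returning `None` for out-of-range values makes unexpected inputs visible, rather than silently reusing the color and plot chosen for the previous profile.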

In [None]:
# Labeling each plot with the number of profiles in each one
citationw = Label(x=5, y=5, x_units='screen', y_units='screen',
                 text='Number of Profiles: ' + str(wintertally), text_font_size='10pt')
citationsp = Label(x=5, y=5, x_units='screen', y_units='screen',
                 text='Number of Profiles: ' + str(springtally), text_font_size='10pt')
citationsu = Label(x=5, y=5, x_units='screen', y_units='screen',
                 text='Number of Profiles: ' + str(summertally), text_font_size='10pt')
citationf = Label(x=5, y=5, x_units='screen', y_units='screen',
                 text='Number of Profiles: ' + str(falltally), text_font_size='10pt')
# Legend text matches the line colors assigned above (spring is palevioletred, summer is lightgreen)
citationc = Label(x=5, y=5, x_units='screen', y_units='screen',
                 text='Winter: Dark Blue' + ',   \n' +
                 'Spring: Pink' + ',   \n' +
                 'Summer: Green' + ',   \n' +
                 'Fall: Light Blue', text_font_size='10pt')

# Adding the citations to their corresponding graphs
p1w.add_layout(citationw)
p1sp.add_layout(citationsp)
p1su.add_layout(citationsu)
p1f.add_layout(citationf)
p1c.add_layout(citationc)
p2c.add_layout(citationc)

tabs=Tabs(tabs=[ tab1c, tab2c, tabw, tabsp, tabsu, tabf ])
show(tabs)

## A Second Visualization: Heatmaps



> Heatmaps are another way to visualize the ITP data, showing ocean temperature by depth *through time.* Click the link below to explore a second Colab notebook that explains the process behind making and understanding these heatmaps.

> [ITP Heatmap Visualization](https://colab.research.google.com/drive/1w_YFKmDuZFp3ZDv6keUQSN_ZP-8zx0QC?usp=sharing)

