# <center> Lab 6: Plotting and Analyzing CTD Data </center>

---
Welcome to the Python component of Lab 6! Now that we have gone out on the boat and collected a transect of data, we are going to analyze our own data!

We will go through some more plotting and manipulation of the data we actually collected.

Just as before, we would like you to download your work as both an HTML and an ipynb file, and submit both files along with **a saved figure of your choice** to Canvas.

---
#### Please acknowledge that you understand the instructions by copying and pasting each of the following into the next cells.

#I understand how to save my progress and reopen the notebook.

#I understand that I am being asked to save and submit copies of my notebook as well as a figure of my choice for the assignment.

#I understand that I need to comment my code.


## Let's get started!

First, import the packages that we will use throughout this assignment.

In [None]:
#Import Pandas and alias it as 'pd'
#for manipulating tables and timeseries
import pandas as pd

#Import MatPlotLib and alias it as 'plt'
#for making plots/graphs/figures
import matplotlib.pyplot as plt

#Import Numpy and alias it as 'np'
#for the arrays
import numpy as np

### Read in data

In [None]:
#Start by reading in the data files
#We'll help by loading station 4
#Keeping the variable names consistent with our recommendations will help later in the assignment

filename4 = "station4_CTD0_092422.xls" # name it filename4 for station 4, filename1 for station 1, etc.
df_4 = pd.read_excel(filename4) # name it df_4 for station 4, df_1 for station 1, etc.

In [None]:
# peek at the data
df_4.head()

Now it's your turn! Read in the remaining data files, take a peek at the data, and answer the questions from Exercise 1.
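If you are not sure how to look beyond the first few rows, here is a minimal sketch of some other ways to inspect a table. It uses a tiny made-up dataframe (`df_demo` is a hypothetical name and the values are not real CTD data), so swap in your own dataframes when you try it.

```python
import pandas as pd

# Tiny made-up table standing in for one CTD cast (values are NOT real data)
df_demo = pd.DataFrame({
    "Depth": [0.5, 1.0, 1.5],
    "TempC": [28.1, 27.9, 27.4],
    "Salinity": [20.2, 24.8, 30.1],
})

# List every column heading in the table
column_names = df_demo.columns.tolist()
print(column_names)

# Number of rows (samples) and columns (variables)
n_rows, n_cols = df_demo.shape
print(n_rows, n_cols)

# Quick summary statistics (count, mean, min, max, ...) for each numeric column
print(df_demo.describe())
```

The column list is a handy starting point for the table of abbreviations in Exercise 1.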

### Exercise 1: Answer some Qs about the data

1. How does the CTD calculate depth?

2. How does it measure salinity?

3. What is the importance of O2 measurements? What are some possible sources and sinks of oxygen in the water column?


4. Look at the headings of the columns. Make a table in markdown showing the abbreviation (in the raw data), measured variable, and unit for *any five variables*. See the syntax for a table in the next cell.


| Heading1 | Heading2 | Heading3 |
| :-: | :-: | :-: |
| cellinfo1 | cellinfo2 | cellinfo3 |

This is the corresponding code; do not execute this cell

| Heading1 | Heading2 | Heading3 | #You can have as many rows as you'd like, but each must be bookended by vertical lines
| :-: | :-: | :-: | #This line aligns the cells in the middle
| cellinfo1 | cellinfo2 | cellinfo3 | #Repeat this line for as many additional rows as you'd like

Make the table below. Keep the cell in markdown mode.

---
# Vertical Profiles

In [None]:
#Let's start with the temperature at station 4

fig1, (ax1) = plt.subplots() # There's only 1 plot, so subplot argument is empty
ax1.plot(df_4["TempC"],df_4["Depth"])

# Draw x label
ax1.set_xlabel('Temperature (°C)')
ax1.xaxis.set_label_position('top') # this moves the label to the top
ax1.xaxis.set_ticks_position('top') # this moves the ticks to the top

# Draw y label
ax1.set_ylabel('Depth (m)')
ax1.set_ylim(ax1.get_ylim()[::-1]) #this reverses the yaxis (i.e. deep at the bottom)


### Add another CTD profile to the plot

Let's add another variable to the graph to see two CTD profiles plotted together.

We can plot a second variable alongside temperature by creating another x-axis that shares the same y-axis, using
ax2 = ax1.twiny()

By default, ax1 will be on the bottom and ax2 will be on the top. Because we want temperature displayed on the top, we will plot it on ax2. This means we no longer need the commands that moved the temperature x-axis label and tick marks to the top of the graph.
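One detail worth seeing for yourself: the twin axis shares its y-axis with the original, so reversing the depth axis once is enough for both. The sketch below checks this with made-up numbers (the names `ax_bottom` and `ax_top` and all of the values are placeholders, not our CTD data).

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up profile values (NOT real data)
depth = np.array([0.0, 1.0, 2.0, 3.0])
temp = np.array([28.0, 27.5, 26.0, 25.5])
oxygen = np.array([6.5, 6.0, 4.0, 3.5])

fig_demo, ax_bottom = plt.subplots()
ax_bottom.plot(oxygen, depth)

# twiny() makes a second x-axis on top that shares the y-axis
ax_top = ax_bottom.twiny()
ax_top.plot(temp, depth)

# Reverse the y-axis on the first axis only
# (same visible effect as set_ylim(get_ylim()[::-1]))
ax_bottom.invert_yaxis()

# Both axes report an inverted y-axis because the y-axis is shared
print(ax_bottom.yaxis_inverted(), ax_top.yaxis_inverted())
```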

Let's add a plot of DO on ax1.

In [None]:
#plot DO and temp profiles
fig1, (ax1) = plt.subplots() # There's only 1 plot, so subplot argument is empty

ax1.plot(df_4["O2"],df_4["Depth"]) #plot DO
ax1.set_xlabel('DO (mg/l)')
ax1.set_ylabel('Depth (m)')
ax1.set_ylim(ax1.get_ylim()[::-1]) #this reverses the yaxis (i.e. deep at the bottom)

ax2 = ax1.twiny() #plot a second variable
ax2.plot(df_4["TempC"],df_4["Depth"]) # plot temp
ax2.set_xlabel('Temperature (°C)')

OK, this looks pretty good, except we can't tell which line is which variable. 
Let's change the colors for each variable and axis, so we can easily read the plot. 
Let's take a look at the base colors available for matplotlib.

| Syntax | Color |
| :-: | :-: |
| 'b' | blue |
| 'g' | green |
| 'r' | red |
| 'c' | cyan |
| 'm' | magenta |
| 'y' | yellow |
| 'k' | black |
| 'w' | white |

There are a lot more colors available, but these are your basic choices. 
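As a quick illustration of those other choices, matplotlib also understands full color names and hex codes. This small sketch uses `matplotlib.colors.to_rgba` to show what a few color strings turn into (red, green, blue, and opacity values between 0 and 1). The particular colors are arbitrary examples.

```python
from matplotlib.colors import to_rgba

# Single-letter codes from the table above
green_rgba = to_rgba("g")
black_rgba = to_rgba("k")

# The same function also accepts full color names and hex strings
navy_rgba = to_rgba("navy")
hex_rgba = to_rgba("#ff0000")  # pure red written as a hex code

print(green_rgba)
print(black_rgba)
print(navy_rgba)
print(hex_rgba)
```

Any string that works here can be used wherever a plotting command asks for a color.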


Now we can use these commands to change the colors of the plot.

| Syntax | Effect |
| :-: | :-: |
| ax.xaxis.label.set_color('y') | Set X-axis label color |
| ax.tick_params(axis='x', colors='y') | Set X-axis tick color |
| ax.spines['top'].set_color('y') | Set top spine color |

Let's start with DO.

In [None]:
#plot DO and temp profiles
fig1, (ax1) = plt.subplots() # <-- only 1 plot, so subplot argument is empty
ax1.plot(df_4["O2"],df_4["Depth"], "g") #plot DO in green
ax1.set_xlabel('DO (mg/l)')
ax1.set_ylabel('Depth (m)')
ax1.set_ylim(ax1.get_ylim()[::-1]) #this reverses the yaxis (i.e. deep at the bottom)
ax1.xaxis.label.set_color('g') #setting up X-axis label color to green
ax1.tick_params(axis='x', colors='g') #setting up X-axis tick color to green
ax1.spines['bottom'].set_color('g') #setting up bottom spine as green

ax2 = ax1.twiny()
ax2.plot(df_4["TempC"],df_4["Depth"]) # plot temp
ax2.set_xlabel('Temperature (°C)')


### Exercise 2

Recreate the plot above with the following change:

Make the x ticks, x label, top spine, and plotted line for temperature all the same color.
You can keep the default blue or select another color. 

---
# Vertical Profile Subplots

Instead of graphing the CTD profiles for variables on the same graph, let's try doing subplots. Sometimes a single graph for multiple variables is too messy, and having subplots can display the information more clearly.

We'll start with our original plot for temp, then add a plot next to it for salinity

In [None]:
# Create subplots of temp and salinity

#subplot now has ax1 and ax2
#Subplot arguments are (1, 2, sharey = True)
#(1,2) makes 1 row and 2 columns of subplots
#sharey=True lets the subplots share one y-axis with the same scaling and tick labels.
fig1, (ax1, ax2) = plt.subplots(1,2,sharey=True)

# Temperature
ax1.plot(df_4["TempC"],df_4["Depth"]) # plot
ax1.set_xlabel('Temperature (°C)') # x axis
ax1.xaxis.set_label_position('top') 
ax1.xaxis.set_ticks_position('top') 
ax1.set_ylim(ax1.get_ylim()[::-1])
ax1.set_ylabel('Depth (m)') # y axis

# Salinity
ax2.plot(df_4["Salinity"],df_4["Depth"], 'r') # plot
ax2.set_xlabel('Salinity (psu)') # x axis
ax2.xaxis.set_label_position('top') 
ax2.xaxis.set_ticks_position('top') 
ax2.set_ylabel('Depth (m)') # y axis
fig1.suptitle("Station 4, Sep 24, 2022") #gives a grand title that covers all subplots

How does the title look to you? 
We need to move it in order to read it! 

### Exercise 3

Add x and y coordinates to move the grand title up using:

fig1.suptitle("Station 4, Sep 24, 2022", x= , y= ) 

Try your own values for x and y until you find ones that make sense.
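It may help to know that x and y here are fractions of the whole figure, where (0, 0) is the bottom-left corner and (1, 1) is the top-right corner. Here is a minimal sketch on an empty demo figure. The numbers are arbitrary placeholders, not a recommended answer, and `fig_demo` and `title` are made-up names.

```python
import matplotlib.pyplot as plt

fig_demo, (ax_a, ax_b) = plt.subplots(1, 2, sharey=True)

# x and y are in figure coordinates: 0.5 is the horizontal center,
# and a y value above 1 sits above the top edge of the figure.
# These values are placeholders only.
title = fig_demo.suptitle("Demo title", x=0.5, y=1.05)

# Read the position back to confirm where the title was placed
print(title.get_position())
```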

Let's also get rid of the ylabel for the 2nd plot since we don't really need it. 

### Exercise 4:

Add 2 more subplots, each showing a variable of your choosing. Make each variable a different color or style of line. 

Put the new plots next to the first two.


### Exercise 5: Answer the Questions Below

Look at the variables you plotted. 

1. Describe the characteristics of each vertical profile. What, if any, are the differences between the surface, middle, and bottom water?

2. How many layers of water masses are indicated from the vertical profiles? 

3. Under what conditions would you expect the values for a given variable to be the same at the surface and bottom? If surface and bottom values were the same for salinity, would you expect them to be the same for other variables too?

4. What variable do you think is driving the differences between the water column layers?


---
# Vertical profiles of our Transect
This transect runs from inshore (station 1) to offshore (station 4). 

Let's plot a 2D contour map of a variable across the transect.

This will be similar to the colormap we made in Lab 3 of Python Intro I.

First we have to read in our location data and make corresponding arrays. Since this is a north-to-south transect, we will deal with latitude only. Not all cruise tracks run in a straight line; often you will have to deal with both longitude and latitude.

In [None]:
#Ensure that you have uploaded all your data files.

filename5 = "Lab6_lat.xlsx"
df_lat = pd.read_excel(filename5)

In [None]:
print(df_lat)

Our goal is to graph each variable using latitude as the x axis, depth as the y axis, and z, shown by the color, as the CTD variable we are interested in. We can also think of the x axis as the station locations, just represented by their latitude. 

First we need to create an array of each latitude for each station.

Let's start with station 1, the most inshore station.

In [None]:
#how long does the latitude array need to be?
#pick any variable from the station 1 dataframe and check its size

df_1["TempC"].size

In [None]:
#We need to make a 361x1 array for latitude at station 1
#(use the size returned by the cell above)
#Populate the array with the lat for the station.

MB_lat = np.full((361,1),30.16694) #MB_lat holds the station 1 latitude
print(MB_lat)

Looks good. Now we need to do the same thing for the rest of the stations. 
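If you would rather not type the array length by hand, one option is to pass the size straight into `np.full`. This is only a sketch with a made-up five-row dataframe and a placeholder latitude (`df_demo`, `demo_latitude`, and `lat_demo` are hypothetical names), not the values for any of our stations.

```python
import numpy as np
import pandas as pd

# Made-up cast with 5 samples (NOT real data)
df_demo = pd.DataFrame({"TempC": [28.0, 27.8, 27.1, 26.5, 26.0]})
demo_latitude = 30.0  # placeholder latitude, not a real station position

# Ask the dataframe how many samples it has instead of hard-coding the number
n_samples = df_demo["TempC"].size

# Build an (n_samples x 1) array where every entry is the same latitude
lat_demo = np.full((n_samples, 1), demo_latitude)
print(lat_demo.shape)
```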

### Exercise 6

Create latitude arrays for stations 2, 3, and 4.

Great! Now we have all the latitude arrays.

If we want to plot the CTD profile of a variable across an entire transect, we need to combine the data from each station. We use concatenate commands to do this. Concatenation basically just means "combining".

Let's start by doing that for the latitude arrays that we just created. We can combine them using

np.concatenate([array1, array2, etc.])
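Here is a small sketch of what that command does, using two short made-up arrays (`lat_a`, `lat_b`, and the latitudes in them are placeholders). The arrays are stacked end to end in the order you list them.

```python
import numpy as np

# Two made-up latitude arrays of different lengths (NOT real station data)
lat_a = np.full((3, 1), 30.0)
lat_b = np.full((2, 1), 30.1)

# Stack them end to end, in the order they appear in the list
lat_demo_all = np.concatenate([lat_a, lat_b])

print(lat_demo_all.shape)
print(lat_demo_all)
```

The combined array has as many rows as the two inputs added together.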

In [None]:
# combine the lat arrays

lat_all = np.concatenate([]) #fill in the arguments
print(lat_all)

Great. Now let's combine all our depths from each station.
We must concatenate in the same order as we did for the latitudes, so that all the values match up.

Since our depths are in a dataframe, we will use pd.concat() instead of np.concatenate()
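One thing to be aware of, shown in the sketch below with made-up depths: `pd.concat()` keeps each piece's original row labels, so the index repeats after combining. That is fine for the plotting we do here. If you ever want fresh row numbers, `ignore_index=True` is one option. The names `depth_a`, `depth_b`, `stacked`, and `renumbered` are placeholders.

```python
import pandas as pd

# Two made-up depth columns (NOT real data)
depth_a = pd.Series([0.5, 1.0, 1.5], name="Depth")
depth_b = pd.Series([0.5, 1.0], name="Depth")

# Default behavior: rows are stacked and the original row labels are kept
stacked = pd.concat([depth_a, depth_b])
print(stacked.index.tolist())

# Optional: renumber the rows from 0 after combining
renumbered = pd.concat([depth_a, depth_b], ignore_index=True)
print(renumbered.index.tolist())
```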

In [None]:
# vertically concatenate the depths
#make sure the dataframe names match yours.

depth_all = pd.concat([df_1["Depth"],df_2["Depth"], df_3["Depth"], df_4["Depth"]])
print(depth_all)

### Exercise 7

Pick a CTD variable that you want to plot across the transect.

Using pd.concat(), concatenate that variable just like you did for depth.

So at this point we should have nice concatenated arrays or dataframe columns for depth, latitude, and whichever CTD variable you chose.

To plot our transect contour, let's first reassign the names depth, latitude, and CTD variable so we don't get confused

Execute this code below:
```
    x = lat_all
    y = depth_all
    z = temp_all # <--- put in the concatenated CTD variable you chose
```

Now that we have our x, y, and z in order, we can plot the transect contour using this code:

```
colorby = z #color by the variable you chose
colormap = '' #pick a colormap scheme
plt.figure()
plt.scatter(x,y, c=colorby, cmap=colormap, alpha=1.0) # z sets the color through c, not the marker size
plt.gca().invert_yaxis() # put deep water at the bottom
plt.gca().invert_xaxis() # invert lats so the higher (more northern) latitudes are on the left
plt.colorbar()
```

Remember from our first Python lab:
```
x = data position
y = data position
c = sequence of numbers to be mapped to colors
cmap = colormap
alpha = transparency, where opaque==1
```
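To see those arguments working together before using your own data, here is a minimal sketch with two made-up stations. Every value and name in it (`x_demo`, `y_demo`, `z_demo`, the `viridis` colormap choice) is a placeholder for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Two made-up stations with three depths each (NOT real data)
x_demo = np.array([30.0, 30.0, 30.0, 30.1, 30.1, 30.1])   # latitude
y_demo = np.array([0.5, 1.0, 1.5, 0.5, 1.0, 1.5])         # depth (m)
z_demo = np.array([28.0, 27.5, 26.0, 27.0, 26.5, 25.0])   # e.g. temperature

fig_demo, ax_demo = plt.subplots()

# Each point sits at (latitude, depth) and is colored by its z value
points = ax_demo.scatter(x_demo, y_demo, c=z_demo, cmap="viridis", alpha=1.0)

# Put deep water at the bottom of the plot
ax_demo.invert_yaxis()

# The colorbar translates the colors back into z values
colorbar = fig_demo.colorbar(points, ax=ax_demo)
colorbar.set_label("Made-up temperature (°C)")
```

Each station shows up as a vertical column of points, which is why the stations need matching latitude and depth arrays.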

You can pick a different colormap scheme from here: https://matplotlib.org/stable/tutorials/colors/colormaps.html
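If you want to check a colormap name before using it, this short sketch lists the names your installed matplotlib knows about. The two names tested are just examples.

```python
import matplotlib.pyplot as plt

# Names of all colormaps registered in this matplotlib installation
available = plt.colormaps()
print(len(available))

# Check whether a specific name is valid before passing it to cmap=
print("viridis" in available)

# Adding "_r" to a name gives the reversed version of that colormap
print("viridis_r" in available)
```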

Now add some axis labels and a title to your plot.

### Exercise 8: Describe the plot

How does the variable change from inshore to offshore?

Do the values at the surface change from inshore to offshore?

What do you think is the main driver of the pattern you see?


### Exercise 9: Create 2 more CTD profile transects


Now let's examine the transect of 2 more CTD profiles.
Pick 2 other CTD variables and plot them in the same way you did above.


Change the colormap scheme for each of the plots.


Remember, you will have to concatenate the new variables for your new zs, but you can use the same x and y as we already defined. 


After you complete each plot, describe the patterns you see in the profiles as they move from the bay to offshore. 


Add as many cells below as you need to complete these two plots.