In [1]:
# Initialize Otter
import otter
grader = otter.Notebook("group_activity3.ipynb")

# In-class coding exercise #3
Objective: In this exercise you will practice numpy, datestrings, and control flow.

## Introduction to group coding exercises
Today you’ll work on this exercise in the same group of 3-4 as last week, submitting a single notebook file at the end of the class period. Decide amongst yourselves which member will upload the completed notebook to Gradescope this week. Make sure that everyone takes a turn being the “Uploader”. _You cannot upload the final code two weeks in a row._

### Workflow
Each question will be timed to ensure that everyone gets to work on at least a part of every question. Group activities are not graded by completeness or correctness, but by effort. We will be breaking down each question in the following order:  
1. Independent work 
2. Group work and discussion on coding question
3. Group work and discussion on reflection questions

You are welcome and encouraged to communicate with other groups and the teaching team when you feel stuck on a problem. 

As a reminder, we will be grading based on best practices in coding. These include: 
1) Variables are used to store objects

2) Code is commented adequately

3) Variables are named appropriately

4) Code is efficient with minimal unnecessary lines 

5) Documenting help from outside sources, such as from other groups or online documentation. 

6) Final notebook fully runs from start to finish. A good way to check this is restarting the kernel and fully running through all the cells to check for any errors.

## Note here **and in the Gradescope submission** each of your group members:
1. Liangtong(Sry) Wei
2. Jiaqi Li
3. Sofia Almeida

# Question 1

![picture](https://www.awi.de/fileadmin/_processed_/9/9/csm_12072012_Polarstern_FMehrtens_001_5b8b956779.jpg)
*Image: The research vessel R/V Polarstern.*

### **Thermosalinograph time series**
*Useful resources:* Prelectures on NumPy arrays and functions, multidimensional arrays, datetime objects, online documentation on `datetime` and `numpy`.

During an oceanographic cruise, research vessels will often have an instrument called a thermosalinograph taking temperature measurements throughout the duration of the cruise. To get the most accurate picture of the surrounding conditions, there are usually 2 different temperature sensors in the water, situated at different depths. For this problem you are provided two NumPy arrays of sample data from a cruise in 2003 aboard the [R/V Polarstern](https://mosaic-expedition.org/expedition/polarstern/).

<br>

**NumPy arrays (`ndarray`):**

* `T_data` is 2-D with 4 columns: latitude, longitude, temperature [˚C] at 5 meters, and temperature [˚C] at 11 meters

* `time_data` is 1-D and holds the date/time of each temperature measurement as a string in the format `%Y-%m-%d %H:%M:%S`

<br>

### Instructions:
1. Use the provided print statements to display your results. 
2. Store your answers in the provided answer variables.

In [2]:
# Do not alter the code in this cell
# you only need to import these one time in a notebook!
import numpy as np
from datetime import datetime

# Data in this array consists of 4 columns:
# Latitude, longitude, T at 5 m (˚C), T at 11 m (˚C)

T_data = np.array([[51.7439,2.4476,14.726,14.736],[51.7147,2.4071,14.746,14.756],[51.6851,2.3664,14.796,14.816],[51.6561,2.3254,14.856,14.866],
  [51.627,2.2854,14.866,14.876],[51.5981,2.2454,14.896,14.916],[51.5689,2.2055,14.936,14.946],[51.5404,2.1661,14.946,14.956],
  [51.5122,2.127,14.936,14.946],[51.4831,2.087,14.956,14.966],[51.4545,2.0478,15.016,15.026],[51.4271,2.01,15.106,15.116],
  [51.3959,1.9686,15.136,15.146],[51.3635,1.9252,15.086,15.086],[51.3304,1.8848,14.826,14.826],[51.2986,1.8437,14.616,14.626],
  [51.2679,1.8036,14.527,14.547],[51.2371,1.7642,14.636,14.646],[51.207,1.7255,14.666,14.686],[51.1782,1.6886,14.766,14.786],
  [51.1497,1.6519,14.736,14.756],[51.1215,1.6156,14.716,14.726],[51.0984,1.581,14.656,14.666],[51.077,1.5485,14.567,14.577],
  [51.0586,1.5198,14.467,14.477],[51.0354,1.4841,14.247,14.257],[51.0088,1.4431,14.117,14.147],[50.9829,1.4033,14.307,14.327],
  [50.957,1.3635,14.337,14.347],[50.9314,1.324,14.307,14.327],[50.9077,1.2801,14.327,14.337],[50.8867,1.2301,14.207,14.217],
  [50.8654,1.1789,14.157,14.177],[50.8436,1.1266,14.167,14.187],[50.8213,1.0736,14.137,14.157],[50.7988,1.0196,14.257,14.277],
  [50.776,0.9649,14.437,14.447],[50.7527,0.9096,14.626,14.646],[50.7295,0.8538,14.796,14.806],[50.7059,0.7976,14.836,14.846],
  [50.6826,0.7407,14.806,14.816],[50.6626,0.6806,14.806,14.816],[50.6388,0.6227,14.826,14.836],[50.615,0.5641,14.826,14.836],
  [50.6005,0.4986,14.786,14.796],[50.5881,0.4317,14.786,14.786],[50.5756,0.3649,14.756,14.766],[50.5632,0.2975,14.826,14.836],
  [50.5509,0.2306,14.886,14.896],[50.5386,0.1641,15.006,15.016],[50.5263,0.0974,15.176,15.186],[50.5138,0.0313,15.196,15.196],
  [50.5018,-0.0345,15.186,15.196],[50.4897,-0.0997,15.286,15.296],[50.4778,-0.1644,15.346,15.356],[50.466,-0.2284,15.386,15.396],
  [50.454,-0.2916,15.376,15.386],[50.4426,-0.3536,15.366,15.376],[50.4313,-0.4153,15.416,15.416],[50.4168,-0.4275,15.456,15.466],
  [50.409,-0.4882,15.436,15.446],[50.4017,-0.5474,15.466,15.476],[50.3933,-0.6047,15.426,15.426],[50.3796,-0.6583,15.396,15.406],
  [50.3668,-0.7114,15.396,15.406],[50.3524,-0.763,15.396,15.406],[50.3396,-0.8151,15.396,15.406],[50.3288,-0.8668,15.476,15.486],
  [50.3223,-0.9188,15.556,15.566],[50.316,-0.97,15.616,15.636],[50.3092,-1.0191,15.696,15.706],[50.3024,-1.0675,15.746,15.756]])


# Data in this array has only one dimension:
# Date/Time (Y-m-d H:M:S) string

time_data = np.array(['2003-10-23 06:25:00','2003-10-23 06:35:00',
  '2003-10-23 06:45:00','2003-10-23 06:55:00','2003-10-23 07:05:00',
  '2003-10-23 07:15:00','2003-10-23 07:25:00','2003-10-23 07:35:00',
  '2003-10-23 07:45:00','2003-10-23 07:55:00','2003-10-23 08:05:00',
  '2003-10-23 08:15:00','2003-10-23 08:25:00','2003-10-23 08:35:00',
  '2003-10-23 08:45:00','2003-10-23 08:55:00','2003-10-23 09:05:00',
  '2003-10-23 09:15:00','2003-10-23 09:25:00','2003-10-23 09:35:00',
  '2003-10-23 09:45:00','2003-10-23 09:55:00','2003-10-23 10:05:00',
  '2003-10-23 10:15:00','2003-10-23 10:25:00','2003-10-23 10:35:00',
  '2003-10-23 10:45:00','2003-10-23 10:55:00','2003-10-23 11:05:00',
  '2003-10-23 11:15:00','2003-10-23 11:25:00','2003-10-23 11:35:00',
  '2003-10-23 11:45:00','2003-10-23 11:55:00','2003-10-23 12:05:00',
  '2003-10-23 12:15:00','2003-10-23 12:25:00','2003-10-23 12:35:00',
  '2003-10-23 12:45:00','2003-10-23 12:55:00','2003-10-23 13:05:00',
  '2003-10-23 13:15:00','2003-10-23 13:25:00','2003-10-23 13:35:00',
  '2003-10-23 13:45:00','2003-10-23 13:55:00','2003-10-23 14:05:00',
  '2003-10-23 14:15:00','2003-10-23 14:25:00','2003-10-23 14:35:00',
  '2003-10-23 14:45:00','2003-10-23 14:55:00','2003-10-23 15:05:00',
  '2003-10-23 15:15:00','2003-10-23 15:25:00','2003-10-23 15:35:00',
  '2003-10-23 15:45:00','2003-10-23 15:55:00','2003-10-23 16:05:00',
  '2003-10-23 16:15:00','2003-10-23 16:25:00','2003-10-23 16:35:00',
  '2003-10-23 16:45:00','2003-10-23 16:55:00','2003-10-23 17:05:00',
  '2003-10-23 17:15:00','2003-10-23 17:25:00','2003-10-23 17:35:00',
  '2003-10-23 17:45:00','2003-10-23 17:55:00','2003-10-23 18:05:00',
  '2003-10-23 18:15:00','2003-10-23 18:25:00'])

## Part 1 (20 minutes)
Create a new 1-dimensional array of `datetime` objects by converting the date strings in the `time_data` array. Store your converted array in the `dstr_array` variable. Using your `dstr_array` variable, answer the following questions:
> **a.** How much time passed between the first and last measurement? Report this in hours and store your answer in the `timediff` variable.
>
> **b.** What is the frequency of these measurements (i.e., how often did measurements occur)? You can assume that the measurements were all collected at the same frequency. Report this in minutes and store your answer in the `samprate` variable.
>

_HINT_: You can leave your answers as `timedelta` objects.

In [15]:
# your code

## answer variables
# convert each date string to a datetime object and store them in a 1-D array
dstr_array = np.array([datetime.strptime(date, "%Y-%m-%d %H:%M:%S")
                       for date in time_data])
# part 1a
timediff = dstr_array[-1] - dstr_array[0]
# part 1b
samprate = dstr_array[1] - dstr_array[0]

# print statements 
print("part 1a) The time difference is", timediff, "hours")
print("part 1b) The sample rate is", samprate, "minutes")

part 1a) The time difference is 12:00:00 hours
part 1b) The sample rate is 0:10:00 minutes
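
The `timedelta` results can also be turned into plain numbers of hours or minutes. The block below is a minimal sketch of that conversion, assuming a hypothetical three-sample subset (`sample_times`) in the same string format as `time_data`; the names `sample_dt`, `elapsed_hours`, `interval_minutes`, and `is_uniform` are illustrative and not part of the exercise.

```python
from datetime import datetime

import numpy as np

# Hypothetical three-sample subset in the same format as time_data
sample_times = np.array(["2003-10-23 06:25:00",
                         "2003-10-23 06:35:00",
                         "2003-10-23 06:45:00"])

# Parse each string into a datetime object (the result is an object-dtype array)
sample_dt = np.array([datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
                      for s in sample_times])

# A timedelta becomes a float through total_seconds()
elapsed_hours = (sample_dt[-1] - sample_dt[0]).total_seconds() / 3600
interval_minutes = (sample_dt[1] - sample_dt[0]).total_seconds() / 60

# np.diff subtracts neighbouring elements, giving one timedelta per gap;
# a single unique value means the sampling interval is constant
intervals = np.diff(sample_dt)
is_uniform = len(set(intervals)) == 1

print(elapsed_hours, "hours;", interval_minutes, "minutes; uniform:", is_uniform)
```

Checking every gap with `np.diff` is one way to test the "same frequency" assumption rather than relying on the first two samples alone.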


## Part 2 (10 minutes)
Find the maximum and minimum values of latitude and longitude in the `T_data` array.

Store your answers for maximum/minimum latitude in the `max_lat` and `min_lat` variables, and maximum/minimum longitude in the `max_lon` and `min_lon` variables.

In [16]:
# your code
latitude = [data[0] 
            for data in T_data]
longitude = [data[1] 
             for data in T_data]

## answer variables
min_lat = min(latitude)
max_lat = max(latitude)
min_lon = min(longitude)
max_lon = max(longitude)
print("part 2) max lat =", max_lat, "and min lat =", min_lat)
print("max lon =", max_lon, "and min lon =", min_lon) 

part 2) max lat = 51.7439 and min lat = 50.3024
max lon = 2.4476 and min lon = -1.0675
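
Column slicing is one way to pull latitude and longitude out of a 2-D array without looping over rows. The sketch below assumes a hypothetical three-row subset (`sample`) with the same column order as `T_data`; all variable names are illustrative.

```python
import numpy as np

# Hypothetical three-row subset with the same column order as T_data:
# latitude, longitude, T at 5 m, T at 11 m
sample = np.array([[51.7439, 2.4476, 14.726, 14.736],
                   [51.7147, 2.4071, 14.746, 14.756],
                   [50.3024, -1.0675, 15.746, 15.756]])

# sample[:, k] selects every row of column k
lat_col = sample[:, 0]
lon_col = sample[:, 1]

lat_range = (lat_col.min(), lat_col.max())
lon_range = (lon_col.min(), lon_col.max())

# Alternative: take the extremes of both columns at once along axis 0
col_min = sample[:, :2].min(axis=0)
col_max = sample[:, :2].max(axis=0)

print("lat range:", lat_range)
print("lon range:", lon_range)
```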


## Part 3 (25 minutes)
Create and print a new 1-dimensional array containing the averaged temperatures between 5 meters and 11 meters for each measurement time. Store your new temperature array in the `t_mean` variable. 

If ocean temperatures change linearly between these two depths, what is the approximate _depth_ in the ocean that these average temperatures represent? Store your averaged depth in the `mean_depth` variable.

In [17]:
# your code
# example use of numpy.mean()
# np.mean([1, 2, 3, 4])

# answer variables
t_mean = [np.mean([data[2],data[3]]) 
          for data in T_data
         ]
mean_depth = (5 + 11) / 2

print("Part 3)", t_mean)
print("The mean depth is", mean_depth, "meters") #print mean depth

Part 3) [14.731000000000002, 14.751000000000001, 14.806000000000001, 14.861, 14.870999999999999, 14.906, 14.940999999999999, 14.951, 14.940999999999999, 14.960999999999999, 15.021, 15.111, 15.141, 15.086, 14.826, 14.620999999999999, 14.536999999999999, 14.641, 14.676, 14.776, 14.746, 14.721, 14.661000000000001, 14.572, 14.472000000000001, 14.251999999999999, 14.132000000000001, 14.317, 14.341999999999999, 14.317, 14.332, 14.212, 14.167, 14.177, 14.147, 14.267, 14.442, 14.636, 14.800999999999998, 14.841000000000001, 14.811, 14.811, 14.831, 14.831, 14.791, 14.786, 14.761, 14.831, 14.891, 15.011, 15.181000000000001, 15.196, 15.190999999999999, 15.291, 15.350999999999999, 15.391, 15.381, 15.370999999999999, 15.416, 15.460999999999999, 15.440999999999999, 15.471, 15.426, 15.401, 15.401, 15.401, 15.401, 15.481000000000002, 15.561, 15.626, 15.701, 15.751000000000001]
The mean depth is 8.0 meters
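
The row-wise average can also be computed in one call with `axis=1`, and the midpoint-depth reasoning can be checked numerically. This is a sketch on a hypothetical two-row subset (`sample`), not the exercise's required solution: if temperature varies linearly between 5 m and 11 m, interpolating at the midpoint depth should reproduce the average of the two sensors.

```python
import numpy as np

# Hypothetical two-row subset with the same column order as T_data
sample = np.array([[51.7439, 2.4476, 14.726, 14.736],
                   [51.7147, 2.4071, 14.746, 14.756]])

# Average the two temperature columns (indices 2 and 3) for each row
sample_mean = sample[:, 2:4].mean(axis=1)

# Sensor depths in meters and their midpoint
depths = np.array([5.0, 11.0])
midpoint_depth = depths.mean()

# Linear interpolation of each row's temperatures at the midpoint depth
interp_at_mid = np.array([np.interp(midpoint_depth, depths, row[2:4])
                          for row in sample])

print(sample_mean)
print(midpoint_depth, np.allclose(interp_at_mid, sample_mean))
```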


4. Make a new array with the following data as **rows** and print your new array*: (25 minutes)

      Latitude

      Longitude
      
      Averaged Temperature from part 3
    
    *Your final array should be a 2-dimensional array with a final shape of (3,72). HINT: what numpy array manipulation routine returns the dimensions (shape) of an array?
    

_**Store all your answers in the specified variables in the code below.**_

## Part 4 (30 minutes)

Make a new array with the following data as **rows**: (2 points)

      Latitude

      Longitude
      
      Averaged Temperature from part 3
    
    
    
Your final array should be a 2-dimensional array with a final shape of (3,72). Store your final array in the `new_array` variable.

_HINT_: Use the [NumPy documentation](https://numpy.org/doc/1.26/reference/routines.array-manipulation.html) to help you reshape your array.

In [20]:
# your code

# answer variable
new_array = np.array([latitude, longitude, t_mean])
print(new_array)
print(np.shape(new_array))

[[ 5.17439e+01  5.17147e+01  5.16851e+01  5.16561e+01  5.16270e+01
   5.15981e+01  5.15689e+01  5.15404e+01  5.15122e+01  5.14831e+01
   5.14545e+01  5.14271e+01  5.13959e+01  5.13635e+01  5.13304e+01
   5.12986e+01  5.12679e+01  5.12371e+01  5.12070e+01  5.11782e+01
   5.11497e+01  5.11215e+01  5.10984e+01  5.10770e+01  5.10586e+01
   5.10354e+01  5.10088e+01  5.09829e+01  5.09570e+01  5.09314e+01
   5.09077e+01  5.08867e+01  5.08654e+01  5.08436e+01  5.08213e+01
   5.07988e+01  5.07760e+01  5.07527e+01  5.07295e+01  5.07059e+01
   5.06826e+01  5.06626e+01  5.06388e+01  5.06150e+01  5.06005e+01
   5.05881e+01  5.05756e+01  5.05632e+01  5.05509e+01  5.05386e+01
   5.05263e+01  5.05138e+01  5.05018e+01  5.04897e+01  5.04778e+01
   5.04660e+01  5.04540e+01  5.04426e+01  5.04313e+01  5.04168e+01
   5.04090e+01  5.04017e+01  5.03933e+01  5.03796e+01  5.03668e+01
   5.03524e+01  5.03396e+01  5.03288e+01  5.03223e+01  5.03160e+01
   5.03092e+01  5.03024e+01]
 [ 2.44760e+00  2.40710e+00  2.36
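
Stacking 1-D arrays as rows can also be done with `np.vstack`, and `shape` and `ndim` confirm the layout of the result. The sketch below uses hypothetical three-element arrays (`lat`, `lon`, `temp`) in place of the full 72-element data, so its shape is (3, 3) rather than (3, 72).

```python
import numpy as np

# Hypothetical short arrays standing in for latitude, longitude, and mean temperature
lat = np.array([51.7439, 51.7147, 51.6851])
lon = np.array([2.4476, 2.4071, 2.3664])
temp = np.array([14.731, 14.751, 14.806])

# vstack places each 1-D array on its own row
stacked = np.vstack([lat, lon, temp])

# Equivalent route: build columns first, then transpose so they become rows
from_cols = np.column_stack([lat, lon, temp]).T

print(stacked.shape)  # number of rows, number of columns
print(stacked.ndim)   # number of dimensions
```

A shape with two entries, such as (3, 72), describes a 2-dimensional array: one axis for the three variables and one for the measurement times.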

<!-- BEGIN QUESTION -->

# Reflection questions (5 minutes)
The purpose of the reflection is to inform us as instructors about students' comfort level with course content. We use these answers to inform how we spend class time and design coursework in subsequent weeks. This question is graded for completeness, so please answer each question in the text box below. Be concise in your answers (max. 2 sentences). 

1) What do you feel you excelled at in this exercise? Why?

2) What did you struggle with most in the exercise? Why?

3) Is there any section of the question that you did not complete? If so, briefly describe why and the section you spent the most time on. 

4) Is there any topic you feel we need to revisit or review in class? Why?

1) Really good review of NumPy syntax, and got a lot of practice with the `datetime` module, which I had not practiced much before.
2) How to get the mean value in part 3. Struggled with the function and how it works. 
3) N/A
4) N/A

<!-- END QUESTION -->

