<a href="https://www.hydroffice.org/epom/"><img src="images/000_000_epom_logo.png" alt="ePOM" title="Open ePOM home page" align="center" width="12%"></a>

<a href="https://piazza.com/class/js5dnu0q39n6qe"><img src="images/help.png" alt="Piazza.com" title="Ask questions on Piazza.com" align="right" width="10%"></a>
# Summing-Up

This is the last notebook of this collection. We will not introduce significant new concepts, but we will apply what has been discussed in the previous notebooks.

We will do this by creating a class that holds the data and the functions for reading and writing data in a format that is more complex than previously encountered.

This is the text content of the `ctd.txt` file in the `data` folder:

![ctd_txt](images/010_000_ctd_txt.png)

As you can see in the above image, the first four rows contain some metadata describing when and where the data were collected.

Starting from the fifth row, the file has a structure of four columns, with observations of depth, sound speed, temperature, and salinity.

In ocean mapping, it is common to collect multiple oceanographic measurements using a [CTD instrument](https://en.wikipedia.org/wiki/CTD_(instrument)). 

## Data Class Creation

As done in the [A Class as a Data Container notebook](008_A_Class_as_a_Data_Container.ipynb), we will first create a class with the `__init__(self)` special method:

In [None]:
import os

class CTDData:
    """A class for CTD data"""
    
    def __init__(self):
        self.metadata = dict()        
        self.depth_values = list()
        self.ss_values = list()        
        self.temp_values = list()
        self.sal_values = list()

The above class is richer in attributes than the previous ones that we created. We need to accommodate the metadata and the four columns representing different types of observations. 
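
As a quick check, the sketch below repeats the class definition so that it runs on its own, and shows that each instance starts with its own empty containers. The names `first_cast` and `second_cast` and the sample values are made up for illustration.

```python
class CTDData:
    """A class for CTD data"""

    def __init__(self):
        self.metadata = dict()
        self.depth_values = list()
        self.ss_values = list()
        self.temp_values = list()
        self.sal_values = list()


first_cast = CTDData()
second_cast = CTDData()

# made-up values, only to show how the containers are filled
first_cast.metadata["station"] = "A1"
first_cast.depth_values.append(0.5)

print("first cast depths: %s" % (first_cast.depth_values, ))
print("second cast depths: %s" % (second_cast.depth_values, ))
```

Filling the first instance leaves the second one untouched, because the containers are created inside `__init__` for each new object.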

## Data Path Retrieval

We will retrieve the full path to the `ctd.txt` file:

In [None]:
def get_data_paths():
    data_paths = list()
    cur_folder = os.path.abspath(os.path.curdir)
    data_folder = os.path.join(cur_folder, "data")
    data_filenames = os.listdir(data_folder)
    
    for data_filename in data_filenames:
        data_path = os.path.join(data_folder, data_filename)
        data_paths.append(data_path)
    
    data_paths.sort()  # sort in alphabetical order
    
    return data_paths

retrieved_paths = get_data_paths()
input_path = retrieved_paths[0]  # 0 because it is the first file in our data folder
print("input path: " + input_path)
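
Relying on index `0` works only while `ctd.txt` sorts first in the folder. A minimal sketch of a more defensive lookup by file name follows; `find_path_by_name` is a hypothetical helper that is not part of this notebook, and the paths in the example are made up.

```python
import os


def find_path_by_name(paths, filename):
    """Return the first path whose file name matches, or None if absent."""
    for path in paths:
        if os.path.basename(path) == filename:
            return path
    return None


# made-up paths standing in for the output of get_data_paths()
example_paths = [
    os.path.join("data", "another_file.txt"),
    os.path.join("data", "ctd.txt"),
]

found_path = find_path_by_name(example_paths, "ctd.txt")
print("found path: %s" % (found_path, ))
```

Returning `None` for a missing file lets the caller decide how to react, for example by raising an error before trying to read.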

## Creating a Function to Read the CTD File

Similarly to what we did in the [Read a Text File section](006_Read_and_Write_Text_Files.ipynb#Read-a-Text-File), we will define a function to read the data:

In [None]:
def read_ctd_data(path, data):
    # check that the passed file exists
    if not os.path.exists(path):
        raise RuntimeError("Unable to locate %s" % (path, ))

    # read the file content
    ctd_file = open(path)
    ctd_content = ctd_file.read()
    ctd_file.close()

    ctd_lines = ctd_content.splitlines()
    count = 0  # initialize the counter for the number of rows read 
    for ctd_line in ctd_lines:

        if count < 4: # metadata 
            meta_pair = ctd_line.split()  # 'tokenize' the string
            data.metadata[meta_pair[0]] = meta_pair[1]  # use the tokens as keys and values

        else:  # measures
            measures = ctd_line.split()
            data.depth_values.append(float(measures[0]))
            data.ss_values.append(float(measures[1]))
            data.temp_values.append(float(measures[2]))
            data.sal_values.append(float(measures[3]))

        count += 1  # the result is equal to writing: count = count + 1
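
To try the parsing logic without the file on disk, the sketch below applies the same steps to a few in-memory lines. The sample content is invented (the real `ctd.txt` layout is only shown in the image above), and `parse_ctd_lines` is a hypothetical helper that uses the built-in `enumerate()` in place of the manual counter.

```python
def parse_ctd_lines(lines, nr_meta_rows=4):
    """Split lines into a metadata dict and four lists of floats."""
    metadata = dict()
    columns = ([], [], [], [])  # depth, sound speed, temperature, salinity

    for count, line in enumerate(lines):
        tokens = line.split()
        if count < nr_meta_rows:  # metadata
            metadata[tokens[0]] = tokens[1]
        else:  # measures
            for column, token in zip(columns, tokens):
                column.append(float(token))

    return metadata, columns


# invented content: 4 metadata rows followed by 2 rows of measures
sample_lines = [
    "date 2019-01-01",
    "time 14:02:39",
    "latitude 43.0",
    "longitude -70.0",
    "0.003  1501.09    3.7610       25.0900",
    "1.000  1501.20    3.7500       25.1000",
]

sample_metadata, sample_columns = parse_ctd_lines(sample_lines)
print("metadata: %s" % (sample_metadata, ))
print("depths: %s" % (sample_columns[0], ))
```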

In the above code we use the [`str.split()`](https://docs.python.org/3.9/library/stdtypes.html?#str.split) method. 

This method returns a list of the words in a string by splitting it using a delimiter (e.g., `":"`) passed as a parameter. 

In [None]:
time_str = "14:02:39"
time_list = time_str.split(":")
print("The resulting list after splitting time_str is: %s" % (time_list, ))

If a delimiter parameter is **not** specified (as in both the metadata and the measures branches of `read_ctd_data`), the following splitting algorithm is applied: *"runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace."* This sentence comes from the official Python documentation! We will leave the interpretation to you... To help you, look at the example below:

In [None]:
sample_str = "0.003  1501.09    3.7610       25.0900"
sample_list = sample_str.split()
print("The resulting list after splitting sample_str is: %s" % (sample_list, ))
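
The difference becomes visible when the same kind of string is split with an explicit single-space delimiter: every extra space then produces an empty string. The sketch below also shows the optional `maxsplit` parameter of `str.split()`, applied to a made-up metadata line whose value contains a space.

```python
sample_str = "0.003  1501.09    3.7610"

# no delimiter: runs of whitespace count as a single separator
print(sample_str.split())

# explicit delimiter: each space is a separator, so empty strings appear
print(sample_str.split(" "))

# maxsplit=1 splits only once, keeping the rest of the line together
meta_line = "date 2019-01-01 14:02:39"  # made-up line
print(meta_line.split(maxsplit=1))
```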

## Reading the Data

It is now time to create an instance of (**instantiate**) our `CTDData` class and to call our `read_ctd_data` function:

In [None]:
ctd_data = CTDData()
read_ctd_data(input_path, ctd_data)
print("The metadata are: %s" % (ctd_data.metadata, ))
print("Nr. of samples: %s" % (len(ctd_data.depth_values), ))

The data are now **loaded in memory**. 

We can check the success of this operation by printing the depth and sound speed values. We will do this by accessing the values by index with the help of the [`range()`](https://docs.python.org/3.9/library/stdtypes.html?#range) type.

A `range()` with an integer value as single parameter represents a sequence of numbers ranging from 0 up to (but not including) the value passed as a parameter. In the code below, we use `range` with `10`:

In [None]:
for value in range(10):
    print("Current range value: %s" % (value, ))
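
`range()` also accepts a start value and a step in addition to the stop value. The short sketch below is only an aside (the rest of this notebook uses the single-parameter form); the variable name `even_values` is chosen for illustration.

```python
# start at 2, stop before 10, advance by 2
even_values = list(range(2, 10, 2))
print("Even values: %s" % (even_values, ))

# with a single parameter, the sequence has exactly that many values
print("Nr. of values in range(10): %s" % (len(range(10)), ))
```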

Thus, we can use `range()` with the number of loaded samples to print all the values in the `depth_values` and `ss_values` lists preceded by the corresponding index:

In [None]:
nr_of_samples = len(ctd_data.depth_values)

for index in range(nr_of_samples):
    print("%s %.3f %.2f" % (index, ctd_data.depth_values[index], ctd_data.ss_values[index]))
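
Indexing through `range()` is one way to walk two lists in parallel. An alternative, sketched below with made-up values in place of the loaded lists, is to combine the built-in `enumerate()` and `zip()` functions so that the index and the paired values come directly from the loop.

```python
# made-up values standing in for ctd_data.depth_values and ctd_data.ss_values
depth_values = [0.003, 1.0, 2.0]
ss_values = [1501.09, 1501.2, 1501.35]

printed_rows = list()
for index, (depth, ss) in enumerate(zip(depth_values, ss_values)):
    row = "%s %.3f %.2f" % (index, depth, ss)
    printed_rows.append(row)
    print(row)
```

Note that `zip()` stops at the shorter of the two lists, so this form assumes that both columns hold the same number of samples.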

It worked! We have been able to read a complex file format.

***

<img align="left" width="6%" style="padding-right:10px; padding-top:10px;" src="images/refs.png">

## Useful References

* [The official Python 3.9 documentation](https://docs.python.org/3.9/index.html)
* [CTD instrument](https://en.wikipedia.org/wiki/CTD_(instrument))

<img align="left" width="5%" style="padding-right:10px;" src="images/email.png">

*For issues or suggestions related to this notebook, write to: epom@ccom.unh.edu*

<!--NAVIGATION-->
[< A Class as a Data Container](008_A_Class_as_a_Data_Container.ipynb) | [Contents](index.ipynb) | [Congrats >](congrats.ipynb)