# Exercise 4 - Hydro sections


**Aim:** The aim of this exercise is to get you comfortable plotting and working with matrix (2-dimensional) data.

**Learning outcomes:** At the end of this lab, you will be able to:

1. Plot 2-dimensional fields with a colorbar
   
2. Loop through a 2-dimensional matrix to perform a calculation on each profile

**Data:** You will work with hydrographic data from the Weddell Sea.  This is a large dataset, so first make a map of the station locations, then pick a single section to work with.
<hr>


## Handling HEX files

First, loop over the list of `*.hex` files and read each one as text to parse the station latitude and longitude.  These are stored in the header of each file, in the lines that begin with `* NMEA Latitude` and `* NMEA Longitude`.  An example header looks like this:

```
* Sea-Bird SBE 9 Data File:
* FileName = D:\Data\JR310\Raw data\JR310_068.hex
* Software Version Seasave V 7.22.3
* Temperature SN = 5043
* Conductivity SN = 3491
* Number of Bytes Per Scan = 37
* Number of Voltage Words = 4
* Number of Scans Averaged by the Deck Unit = 1
* System UpLoad Time = Apr 09 2015 19:55:54
* NMEA Latitude = 60 38.68 S
* NMEA Longitude = 042 05.50 W
* NMEA UTC (Time) = Apr 09 2015 19:55:52
* Store Lat/Lon Data = Append to Every Scan
** Vessel: James Clark Ross 
** Cruise: JR310 
** Station Number:  68
** Depth (EM122):  3711
** PSO: EPA 
** Operator: RDP
* System UTC = Apr 09 2015 19:55:54
*END*
```
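The position lines in the header above give degrees, decimal minutes and a hemisphere letter. A minimal sketch of converting them to signed decimal degrees follows. The names `NMEA_PATTERN` and `parse_nmea_line` are made up for this illustration, and the pattern assumes every file uses the same layout as the JR310 example.

```python
import re

# Matches header lines such as "* NMEA Latitude = 60 38.68 S".
# Assumption: degrees, then decimal minutes, then one hemisphere letter.
NMEA_PATTERN = re.compile(
    r"^\*\s*NMEA (Latitude|Longitude)\s*=\s*(\d+)\s+(\d+(?:\.\d+)?)\s+([NSEW])"
)


def parse_nmea_line(line):
    """Return (kind, decimal_degrees) for an NMEA position line, else None."""
    match = NMEA_PATTERN.match(line.strip())
    if match is None:
        return None
    kind, degrees, minutes, hemisphere = match.groups()
    value = int(degrees) + float(minutes) / 60.0
    # South and West are negative by convention.
    if hemisphere in "SW":
        value = -value
    return kind.lower(), value


print(parse_nmea_line("* NMEA Latitude = 60 38.68 S"))
print(parse_nmea_line("* NMEA Longitude = 042 05.50 W"))
```

Lines that are not position lines return `None`, so the function can be applied to every header line in turn.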


## Writing a function to parse header files.



```python
import pandas as pd
import xarray as xr


def read_hex_header(file_path: str, encoding: str = 'latin-1') -> xr.Dataset:
    ''' Reads a *.HEX file from Seabird into an xarray dataset.

    Assumes the line starting with '*END*' also lists the column names,
    the next line lists the units, and the data table begins three lines
    below the '*END*' line.
    '''

    # Read the file
    with open(file_path, 'r', encoding=encoding) as file:
        lines = file.readlines()

    # Find the end-of-header marker; this line is also read for column names
    header_line_index = next((i for i, line in enumerate(lines) if line.startswith('*END*')), None)

    if header_line_index is None:
        raise ValueError("Line with header end not found in the file.")

    # Extract column names (everything after the first token on the line)
    column_names = lines[header_line_index].strip().split()[1:]

    # Extract column units, with brackets removed
    units = [None] + lines[header_line_index + 1].replace('[','').replace(']','').strip().split()[1:]

    # Load data into pandas DataFrame
    data_start_index = header_line_index + 3
    data = pd.read_csv(
        file_path,
        skiprows=data_start_index,
        sep=r'\s+',  # whitespace-delimited; replaces the deprecated delim_whitespace=True
        names=column_names,
        parse_dates={'Timestamp': ['IntD', 'IntT']},
        encoding=encoding,
    )

    # Convert DataFrame to xarray dataset
    ds = xr.Dataset.from_dataframe(data.set_index('Timestamp'))

    # Assign units to data fields
    for index, name in enumerate(column_names):
        if name in ds and units[index]:
            ds[name].attrs['units'] = units[index]

    # Rename fields
    ds = ds.rename({
        'SALIN': 'Salinity',
        'Temp': 'Temperature',
        'Cond': 'Conductivity',
        'Press': 'Pressure',
        'SOUND': 'SoundVelocity',
        'SIGMA': 'Sigma',
        'Datasets': 'Sample',
    })

    # Ensure 'Timestamp' coordinate is datetime type
    ds['Timestamp'] = pd.to_datetime(ds['Timestamp'], errors='coerce')

    return ds
```
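
If only the header metadata is needed (for example, station positions for a map), one possible approach is to stop reading at `*END*` and collect the `key = value` pairs into a dictionary. The sketch below is an illustration under that assumption. `read_header_fields`, the file name `example_068.hex` and the placeholder data line `0A1B2C` are all invented for the demonstration and are not part of the cruise dataset.

```python
import tempfile
from pathlib import Path


def read_header_fields(file_path, encoding="latin-1"):
    """Collect 'key = value' or 'key: value' header pairs, stopping at '*END*'."""
    fields = {}
    with open(file_path, "r", encoding=encoding) as file:
        for line in file:
            if line.startswith("*END*"):
                break  # everything after this line is raw data, not header
            # Header lines begin with '*' (or '**' for operator-entered notes).
            text = line.lstrip("*").strip()
            # Try '=' first so that times such as 19:55:54 are not split on ':'.
            for separator in ("=", ":"):
                if separator in text:
                    key, value = text.split(separator, 1)
                    fields[key.strip()] = value.strip()
                    break
    return fields


# Stand-in file with a shortened header (placeholder content, not real data).
sample = (
    "* Sea-Bird SBE 9 Data File:\n"
    "* NMEA Latitude = 60 38.68 S\n"
    "* NMEA Longitude = 042 05.50 W\n"
    "** Station Number:  68\n"
    "*END*\n"
    "0A1B2C\n"
)

stations = {}
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "example_068.hex").write_text(sample, encoding="latin-1")
    # Loop over every *.hex file in the folder, as you would for the cruise data.
    for hex_file in sorted(Path(folder).glob("*.hex")):
        fields = read_header_fields(hex_file)
        stations[hex_file.name] = (
            fields.get("NMEA Latitude"),
            fields.get("NMEA Longitude"),
        )

print(stations)
```

For the real dataset, point `Path(...)` at the folder that holds the `*.hex` files instead of the temporary folder. The position strings still need converting to decimal degrees before plotting.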


# Super-brief explanation of git

Git keeps track of changes to your files within the local repository (your copy of the course folder on your computer).  When you change a file in that folder, git knows.  How do you know it knows?  Open GitHub Desktop (purple icon, desktop app) and choose the repository you're working in from the dropdown menu on the left.  If you have changed a file, its filename appears in a list, and to the right you can preview the changes that have been made (previous commit and current edited version).

After you've been working on a file in your repository for a while (a day, an hour, 5 minutes, etc.), e.g. a piece of code, and you've made some important changes, you will want to tell git to save a snapshot of these changes.  To do this, click **commit** in GitHub Desktop.  Git then stores a version of your files as they were at the moment you clicked commit.  *Advanced tip: in the commit message, you can write something useful to describe the changes, like "added a new function to load data from a microcat".*  So far, only your own computer knows about these changes.

When you want to share these changes with your group, i.e. have them show up in the repository on gitlab.rrz.uni-hamburg.de, you need to **push** the changes.  The push button is towards the top right of GitHub Desktop.  Clicking it makes the changes appear in the online copy of the repository.

When a group member has made changes, and you want these changes to appear in your copy of the code on your own computer, you need to **fetch** them from the version on gitlab.rrz.uni-hamburg.de and then pull them (in GitHub Desktop, the fetch button offers a pull once it finds new commits).  This will not overwrite any of your files.  It will simply add what they have done.

**Note:** For beginner usage, I recommend that you each keep a separate copy of your python code to start with.  If you work on the same piece of code, it's possible for you to delete something and commit it to your repository (on your own computer), and for someone else to edit the same thing without deleting it and commit it on their own computer.  If you both then push these changes to the repository, or you fetch your colleague's changes from the online repository onto your computer, you will get a **conflict** which needs to be resolved manually.

Read more: [https://git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F](https://git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F)

More advanced: It's best to work inside an environment.  This controls the version of each of the packages that you have installed, so that the code is more likely to run smoothly on another person's computer.  Some background on [conda environments](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).


# Check whether you have python installed

On Mac or Linux, at the command line, type
```
python --version
```

For example, mine says
```
(base) 9:34 ~ $ python --version
Python 3.9.5
```
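
The same check can also be made from inside Python, which is handy once you are working in a notebook. This short sketch uses only the standard library. The numbers it prints depend on your installation.

```python
import sys

# sys.version_info holds the version as integers (major, minor, micro, ...).
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")

# The full version string, including build details.
print(sys.version)
```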

## Using conda

Mac users will probably want `conda` and `pip` installed on their computer.

You can get conda here: [https://conda.io/projects/conda/en/latest/user-guide/install/index.html](https://conda.io/projects/conda/en/latest/user-guide/install/index.html).  I would recommend *miniconda* (Here is the direct link for [installing miniconda](https://docs.anaconda.com/free/miniconda/miniconda-install/)).  If you already have Anaconda, that is fine too.

We will also use jupyter notebooks in this course.

## Installing jupyterlab

You can install jupyterlab with conda

    conda install -c conda-forge jupyterlab=4.0.7 notebook=7.0.6

Since we recommend managing your environments, conda (as above) is the better way to install it.  Otherwise, you can use pip: [https://jupyter.org/install](https://jupyter.org/install).

If you already have Anaconda, then JupyterLab is included by default ([explained here](https://test-jupyter.readthedocs.io/en/latest/install.html)).
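
To confirm which versions ended up installed, one option is to ask Python for the package metadata. The helper `installed_version` below is a hypothetical name for this sketch, and it assumes the packages are registered under the distribution names `jupyterlab` and `notebook`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


for name in ("jupyterlab", "notebook"):
    print(name, installed_version(name) or "not installed")
```

Run this with the same Python interpreter (or in the same environment) that you use to start Jupyter. Otherwise it reports on a different set of packages.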

## Now, open this notebook in jupyter notebook.

Please record for yourself the steps you took to get jupyter running on your computer.

## (optional) setting up an environment

On a Mac, in a terminal window, you can create an environment with a specified version of Python and then activate it.

```
conda create --name seaocn_env python=3.8 -y
conda activate seaocn_env
```
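
Once the environment is activated, a quick way to check that Python is really running inside it is to look at the interpreter location. This is a sketch. It assumes conda has set the `CONDA_DEFAULT_ENV` environment variable, which may not be visible if Jupyter was started in some other way.

```python
import os
import sys

# Name of the active conda environment, or None if the variable is not set.
env_name = os.environ.get("CONDA_DEFAULT_ENV")
print("conda environment:", env_name)

# Folder of the running interpreter; for the environment created above
# it would normally end in "seaocn_env".
print("interpreter prefix:", sys.prefix)
```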
