<img align="center" width="12%" style="padding-right:10px;" src="../Images/Ccom.png">

# Lab A, Step 1: File Parsing

In [18]:
%load_ext autoreload
%autoreload 2

import sys
import os
import numpy as np

sys.path.append(os.getcwd())  # add the current folder to the list of paths where Python looks for modules 

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In this Notebook you will further develop classes for handling various ocean mapping data by adding a read method. 

To refresh your memory on reading **text files** please refer to the ePOM *Programming Basics with Python for Ocean Mapping* [Read and Write Text Files](../../python_basics/006_Read_and_Write_Text_Files.ipynb) notebook. Similarly for help with **classes** use the [A class as a Data Container](../../python_basics/008_A_Class_as_a_Data_Container.ipynb) notebook.

As you progress through this notebook you will create class definitions and keep adding **code** to them. Each class definition will be contained in a single code cell in a notebook that has the same name as the class. 

---
___

## 1.1 Time Series File Parsing


---
### 1.1.1 File Parsing Method Definition

The code cell below shows a method added to the `WaterLevel` class; this method provides the ability to parse a data file.


In [19]:
import os.path
from datetime import datetime, timezone

class WaterLevel:
    """A Class for Water Level Data"""

    def __init__(self):

        ...

    # The I/O methods:
            
            
    def read_jhc_file(self, fullpath):

        # Check the File's existence
        if os.path.exists(fullpath):
            self.metadata["Source File"] = fullpath
            print('Opening water level data file:' + fullpath)
        else:  # Raise a meaningful error
            raise RuntimeError('Unable to locate the input file ' + fullpath)

        # Open, read and close the file
        wl_file = open(fullpath)
        wl_content = wl_file.read()
        wl_file.close()

        # Tokenize the contents
        wl_lines = wl_content.splitlines()
        count = 0  # initialize the counter for the number of rows read
        for wl_line in wl_lines:
            observations = wl_line.split()  # Tokenize the string
            epoch=datetime.fromtimestamp(float(observations[5]), timezone.utc)
            self.times.append(epoch)
            self.water_levels.append(float(observations[6]))
            count += 1

___

So what happens here?

    def read_jhc_file(self, fullpath):
    
This line defines a method named **read_jhc_file** for the class **WaterLevel**. It takes the arguments *self*, a reference to the object itself, and *fullpath*, which we expect to be a `string` holding the full path and file name of the file containing the water level data.

    # Check the File's existence
    if os.path.exists(fullpath):
        print('Opening water level data file:' + fullpath)
    else:  # Raise a meaningful error
        raise RuntimeError('Unable to locate the input file ' + fullpath)

These lines ensure that a file exists at the location provided by *fullpath*; if it does not, a meaningful error message is produced. The amount of checking that you do depends on the purpose of the code. In this class we are writing development code, so we do relatively little checking, which leads to code that is not overly robust. Commercial software developers need to produce very robust code, i.e., they need to spend a lot of time evaluating the validity of the arguments passed to methods.
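If more robustness were wanted, the existence check could be extended. The sketch below is only an illustration under stated assumptions: the helper name `check_input_file` is hypothetical and not part of the lab code. It relies on the standard library functions `os.path.isfile` and `os.path.getsize` to also reject directories and empty files.

```python
import os


def check_input_file(fullpath):
    """Hypothetical helper: raise a meaningful error unless fullpath is a usable file."""
    if not os.path.exists(fullpath):
        raise RuntimeError('Unable to locate the input file: ' + fullpath)
    if not os.path.isfile(fullpath):  # e.g. a directory was passed by mistake
        raise RuntimeError('The input path is not a file: ' + fullpath)
    if os.path.getsize(fullpath) == 0:  # nothing to parse
        raise RuntimeError('The input file is empty: ' + fullpath)
    return fullpath
```

Whether such checks are worth the extra lines depends, as noted above, on the purpose of the code.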

    # Open, read and close the file
    wl_file = open(fullpath)
    wl_content = wl_file.read()
    wl_file.close()
    
In these lines the `file` located at *fullpath* is opened. The variable *wl_file* references this file. Again, this could be a place to test the validity of the file to make the code more robust, but we will not do so. We read the contents of the file into the variable *wl_content*. Note that the data is contained in a **text** (**ASCII**) file, so the variable is of type `str`. Once the file has been read, all of its contents are stored in *wl_content* and we may close the file by calling its `close()` method.

The contents of the file look as follows:
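An alternative to closing the file explicitly is Python's `with` statement, which closes the file automatically when the block ends. The following is a minimal sketch, not the lab's required approach; the function name `read_text_file` is hypothetical.

```python
def read_text_file(fullpath):
    """Hypothetical helper: return the whole content of a text file as one str."""
    # The with statement closes the file when the block ends, even if an error occurs
    with open(fullpath) as wl_file:
        wl_content = wl_file.read()
    return wl_content
```

With this pattern there is no `close()` call to forget.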

        2011 124 03 30 00.00 1304479800.000 0.137
        2011 124 03 36 00.00 1304480160.000 0.137
        2011 124 03 42 00.00 1304480520.000 0.127
        2011 124 03 48 00.00 1304480880.000 0.137
        2011 124 03 54 00.00 1304481240.000 0.127
        2011 124 04 00 00.00 1304481600.000 0.127
        2011 124 04 06 00.00 1304481960.000 0.117
        2011 124 04 12 00.00 1304482320.000 0.127
        
That is, the columns represent the year, day number (day of the year), hour, minute, second, UNIX time and water level, respectively. Each row represents a single record, so it makes sense to parse the file on a record-by-record, i.e., line-by-line, basis. However, the variable *wl_content* holds the contents of the file in a single string; let's break up the file contents into lines, i.e., records:

    # Tokenize the contents
    wl_lines = wl_content.splitlines()
    
After this, the records are contained as strings in the `list` *wl_lines*. This helps, as we can now address each record individually. However, each record is still represented by a single string.
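The effect of `splitlines()` can be tried on a small string. The sketch below copies two records from the sample shown earlier into a single string; it is only meant to illustrate the step, and the short string stands in for the real file contents.

```python
# Two records copied from the sample, held in a single string
wl_content = (
    "2011 124 03 30 00.00 1304479800.000 0.137\n"
    "2011 124 03 36 00.00 1304480160.000 0.137\n"
)

wl_lines = wl_content.splitlines()  # one list element per record

print(len(wl_lines))  # number of records
print(wl_lines[0])    # the first record, still a single string
```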
    
    count = 0  # initialize the counter for the number of rows read
    for wl_line in wl_lines:
        observations = wl_line.split()  # Tokenize the string
        epoch=datetime.fromtimestamp(float(observations[5]), timezone.utc)
        self.times.append(epoch)
        self.water_levels.append(float(observations[6]))
        count += 1

The code above breaks each string into tokens, that is, those parts of the string that are separated by whitespace. After this, the value in the 6th column (index 5, the UNIX time) is converted to an epoch by using:  
    
    datetime.fromtimestamp(float(observations[5]), timezone.utc)
    
The resulting epoch, which now conforms to the UTC time standard, is then appended to the list *times*, which is a member of the class. Finally:

    self.water_levels.append(float(observations[6]))
   
This adds the water level for the record to the list **self.water_levels**. If we assume that every record in the file is populated correctly, the result is that the object now has a list of epochs with a corresponding list of water levels. 
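The per-record steps can be tried outside the class on a single line of the sample. This is a sketch for experimentation only: it parses the first sample record and cross-checks the UNIX time column against the calendar columns of the same record. The cross-check is an added illustration and is not part of `read_jhc_file`.

```python
from datetime import datetime, timezone

wl_line = "2011 124 03 30 00.00 1304479800.000 0.137"  # first record of the sample
observations = wl_line.split()  # split on whitespace into 7 string tokens

epoch = datetime.fromtimestamp(float(observations[5]), timezone.utc)
water_level = float(observations[6])

# Cross-check the UNIX time against the calendar columns of the same record
assert epoch.year == int(observations[0])
assert epoch.timetuple().tm_yday == int(observations[1])  # day of the year
assert epoch.hour == int(observations[2])
assert epoch.minute == int(observations[3])
```

If any of the assertions failed, the record would be internally inconsistent, which is one kind of validity test a more robust parser might perform.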



<img align="left" width="6%" style="padding-right:10px;" src="../Images/test.png">

On Piazza, post a message in which you describe in your own words the result of the line below:
    
    datetime.fromtimestamp(float(observations[5]), timezone.utc)
    
    

<img align="left" width="6%" style="padding-right:10px;" src="../Images/test.png">

Also on Piazza discuss whether you think that having two separate lists for the water levels and epochs is better/worse/indifferent than using a single numpy array with the rows indicating the records and the columns the times and levels respectively.

---
### 1.1.2 Create a Class instance 

In [20]:
from mycode.waterlevel import WaterLevel
abs_path=os.path.abspath(os.path.curdir)+"/Data/"

water_levels = WaterLevel()
water_levels.metadata["datum_type"]="geoid"
water_levels.metadata["datum_name"]="EGM08"

print(water_levels)

Location name          : Unknown
Reference Surface Type : geoid
Reference Surafce Name : EGM08
Observation Time Basis : UTC
Observations Units     : m
No time data present
No water level data present



    abs_path=os.path.abspath(os.path.curdir)+"/Data/"
    
The code snippet above sets *abs_path* to the absolute path of the *Data* directory inside the current directory.
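The same location can also be built with `os.path.join`, which inserts the path separator for you. This is an optional sketch; the variable names `data_dir` and `tide_file` are illustrative and are not used elsewhere in this notebook.

```python
import os

# Same location as abs_path, built with os.path.join instead of "+"
data_dir = os.path.join(os.path.abspath(os.path.curdir), "Data")

# Full path to the tide file that is read later in this notebook
tide_file = os.path.join(data_dir, "Lab_A_TIDE.txt")
```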

<img align="left" width="6%" style="padding-right:10px;" src="../Images/test.png">

Also in the code cell above create an instance of the `WaterLevel` class called *water_levels*. For the `metadata` set the datum_type to "geoid" and the datum_name to "EGM08".

In [21]:
water_levels.read_jhc_file(abs_path+"Lab_A_TIDE.txt")

Opening water level data file:/home/jupyter-semme/ESCI_OE_774_874/Lab_A/Data/Lab_A_TIDE.txt


In the code cell above a file named **Lab_A_TIDE.txt** is opened and read.

___

### 1.1.3 Creating a Parser for Positioning data


In [None]:
<img align="left" width="5%" style="padding-right:10px;" src="../../python_basics/images/email.png">

*For issues or suggestions related to this notebook, write to: epom@ccom.unh.edu*