# Output files

[This note-book is in oceantracker/tutorials_how_to/]

After running OceanTacker, output files are in the files are the folder given by parameters ./"root_output_dir"/"output_file_base".
Note that the type of slashes is platform dependant (Linux & Mac: "/", Windows: "\").

Hint: Firefox will display json files in expandable strcture, other programs also can do this

The main files are:

   *   **users_params_*.json**, a copy of the parameters as supplied by the user, useful in debugging or re-running.
    
   * **_caseInfo.json** files, have all the output file names, plus information about the run and data useful in plotting. Eg. 

      * full set of working parameters with defaults used

      * timings of parts of code

      * output_files, the names of all output files generated by the run separated by type.

      * information about the hindcast, eg start date, end date, time step, ...
      
      * basic information from each class used in the computational pipeline

   * **_hindcast_info.json** has a catalog of information about al hindcast file variables (dims, sizes etc.) plus which files hold each variable


   * **_caseLog_log.txt** has a copy of what appeared on the screen during the run 

   * **_tracks000.nc** holds the particle tracks in a netcdf file, see below for code example on reading the tracks


   * **_grid.nc** a netcdf of the hydo-model's grid and other information, useful in plotting and analysis

   * **hindcast_variable_catalog.json**  holds info mapping file variables to internal variables 

   * **_events.nc** a netcdf output from events classes, which only writes output when events occur, eg. a particle entering or exiting given polygons.
   
   
Time variables in these file are in seconds since 1970-01-01
  
Below list the files after running the minimal example. 
    

In [None]:
# Notes for debugging if the scripts below fail:
# * These scripts assume that you already installed oceantracker. If you didn't take a look at https://oceantracker.github.io/oceantracker/_build/html/info/installing.html
# * Paths in this directory are relative to the location of the ipython notebook.
#   I.e. On Linux or Mac, running a cell with "!ls" should return a list containing the notebook you are running.

Hindcast data found locally at ./demo_hindcast


In [1]:
# show a list of output files after running  minimal_example

# Note:
# To make this tutorial run on all operating system we us the 'os' package to handle file paths
# If you don't care for that you can just use path/to/your/files as usual

import os
import glob

output_dir = os.path.join('output', 'minimal_example')
for f in glob.glob(os.path.join(output_dir, '*')):
    print(f)

output/minimal_example/completion_state.json
output/minimal_example/minimal_example.txt
output/minimal_example/minimal_example_caseInfo.json
output/minimal_example/minimal_example_grid000.nc
output/minimal_example/minimal_example_hindcast_info.json
output/minimal_example/minimal_example_raw_user_params.json
output/minimal_example/minimal_example_release_groups.nc
output/minimal_example/minimal_example_tracks_compact_000.nc
output/minimal_example/minimal_example_tracks_rectangular_000.nc


## Reading particle tracks

The below shows how to read the netcdf particle track output file into a python dictionary. The track file has a record of each of the particle properties, plus other useful information. The netcdf tracks file has a compact format, the below code reads this file into rectangular numpy  [ time, particle] arrays. Key variables are
:

* tracks['x']- the particle locations as a rectangular array of position vectors. With dimensions [ time, particle, vector component]. So that the 2D location is 

    ``x, y = tracks['x'][:,:,0], tracks['x'][:,:,1]``

and for a 3D model the vertical position is z= tracks['x'][:,:,2]

* tracks['time'] - array of time in seconds since 1970-1-1

* tracks['date'] - array of dates as numpy datetime64[s]

* tracks['status'] - the numerical codes of particle status, eg moving, dead etc as [ time, particle] array. The values of these status codes are also in the dictionary, eg tracks['status_stranded_by_tide'] = 3. 

* tracks['age'] - time series of each particles age, ie time since release in seconds. 

* tracks['IDrelease_group'] and tracks['IDpulse'] - id's of which release group particles where released from and which pulse within that group. These are  based indices and arrays of size particle, the total number of particles released during the run.

The index within the "particle" dimension, is the individual particleID, ordered from first to last release across the entire model run.

Note: For very large track files reading may fail, eg where variables exceed the 2-4Gb numpy array limit in Windows. To avoid this rerun using the tracks_writer setting "time_steps_per_per_file" to split the track file into files with given number of time steps.   Key variables in the dictionary are:
 

The below also shows how read the hydrodynamic grid. 

In [None]:
# example of reading tracks file

# read netcdf into dictionary
from oceantracker.read_output.python import read_ncdf_output_files
import os

output_dir = os.path.join('output', 'minimal_example')

tracks_file = os.path.join(output_dir, 'minimal_example_tracks_compact_000.nc')
tracks = read_ncdf_output_files.read_tracks_file(tracks_file)
print('Track data', tracks.keys())

# read the hydro-dynamic grid file, useful in plotting
grid_file = os.path.join(output_dir, 'minimal_example_tracks_rectangular_000.nc')
grid = read_ncdf_output_files.read_grid_file(grid_file)
print('Grid data', grid.keys())

loading oceantracker read files
prelim:     Starting package set up
Reading compact track file minimal_example_tracks_compact_000.nc
Track data dict_keys(['dimensions', 'status', 'IDpulse', 'dry_cell_index', 'x0', 'ID', 'water_depth', 'time', 'IDrelease_group', 'hydro_model_gridID', 'status_last_good', 'particle_ID', 'user_release_groupID', 'num_part_released_so_far', 'tide', 'particles_written_per_time_step', 'time_step_range', 'time_released', 'x', 'age', 'z', 'date'])
Grid data dict_keys(['file_created', 'first_ID_in_file', 'total_num_particles_released', 'time_steps_written', 'status_unknown', 'status_notReleased', 'status_dead', 'status_outside_domain', 'status_outside_open_boundary', 'status_stationary', 'status_stranded_by_tide', 'status_on_bottom', 'status_moving', 'particles_written_per_time_step', 'time_step_range', 'time', 'num_part_released_so_far', 'x', 'x0', 'status', 'status_last_good', 'age', 'ID', 'IDrelease_group', 'user_release_groupID', 'IDpulse', 'hydro_model_gridI

## Load data method

Load data method, reads the netcdf, grid, and other  information needed to plot into a dictionary. This is the  recommended method for reading track output.  It uses the case_info file to locate all these files associated with the case run.

In [None]:
# load netcdf with grid and other useful info for plotting
from oceantracker.read_output.python import load_output_files
import os

case_info_file = os.path.join('output', 'minimal_example', 'minimal_example_caseInfo.json')
tracks_plot = load_output_files.load_track_data(case_info_file)

print('tracks_plot data', tracks_plot.keys())

Merging rectangular track files
	 Reading rectangular track file "minimal_example_tracks_rectangular_000.nc"
tracks_plot data dict_keys(['particles_written_per_time_step', 'time_step_range', 'time', 'num_part_released_so_far', 'x', 'x0', 'status', 'status_last_good', 'age', 'ID', 'IDrelease_group', 'user_release_groupID', 'IDpulse', 'hydro_model_gridID', 'time_released', 'water_depth', 'tide', 'dry_cell_index', 'grid', 'particle_status_flags', 'particle_release_groups', 'axis_lim'])
