# Basic functionality of the octant package

Import the necessary modules

In [1]:
from pathlib import Path
from octant.core import TrackRun, OctantTrack, HOUR

Define the common data directory

In [2]:
sample_dir = Path(".") / "sample_data"

Data are usually organised in a hierarchical directory structure. Here, the parameters that define the directory levels are set.

In [3]:
dataset = "era5"
period = "test"
run_id = 0

Construct the full path

In [4]:
track_res_dir = sample_dir / dataset / f"run{run_id:03d}" / period
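As a quick check of the path construction, the sketch below rebuilds the same path for a few run numbers. The `:03d` format specifier zero-pads the run number to three digits. The helper name `build_track_dir` is hypothetical and not part of octant.

```python
from pathlib import Path


def build_track_dir(sample_dir, dataset, run_id, period):
    """Hypothetical helper: join the directory levels into one path."""
    # `:03d` zero-pads the integer to three digits, e.g. 0 -> "000"
    return Path(sample_dir) / dataset / f"run{run_id:03d}" / period


sample_dir = Path(".") / "sample_data"
for run_id in (0, 7, 42):
    # as_posix() prints forward slashes on every platform
    print(build_track_dir(sample_dir, "era5", run_id, "test").as_posix())
# sample_data/era5/run000/test
# sample_data/era5/run007/test
# sample_data/era5/run042/test
```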

## Load the data

By default, the data are loaded when `TrackRun` is initialised, by reading the text files (output from PMCTRACK) in the given directory.

In [5]:
tr = TrackRun(track_res_dir)

In [6]:
print(tr)

<octant.core.TrackRun>
[671 tracks]

Data columns:
lon | lat | vo | time | area | vortex_type

Sources:
sample_data/era5/run000/test


The `TrackRun` object also has an HTML view available in Jupyter Notebooks

In [7]:
tr

| Cyclone tracking results | |
|---|---|
| Number of tracks | 671 |
| Data columns | lon, lat, vo, time, area, vortex_type |
| Sources | sample_data/era5/run000/test |


In [8]:
type(tr)

octant.core.TrackRun

In [9]:
len(tr)

671

In [10]:
tr.size()

671

In [11]:
tr.tstep_h

1.0

### Main data container

The main attribute of `TrackRun` is `.data`, which has all the tracks stored in one DataFrame-like object.

In [12]:
tr.data.head()

| track_idx | row_idx | lon | lat | vo | time | area | vortex_type |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 9.3 | 79.2 | 0.000345 | 2011-01-01 00:00:00 | 10617.93652 | 0 |
| 0 | 1 | 9.3 | 79.2 | 0.000352 | 2011-01-01 01:00:00 | 10154.14648 | 0 |
| 0 | 2 | 9.3 | 79.2 | 0.000362 | 2011-01-01 02:00:00 | 10136.94629 | 0 |
| 1 | 0 | 20.4 | 70.5 | 0.000317 | 2011-01-01 00:00:00 | 34750.66016 | 1 |
| 1 | 1 | 20.7 | 70.5 | 0.00031 | 2011-01-01 01:00:00 | 36727.84375 | 1 |


It always has a `track_idx` index level, which is used to index individual tracks.

### Tracking run configuration

If a `.conf` file is present in the same directory, it is loaded into `TrackRun` and is available as the `.conf` attribute.

In [13]:
tr.conf

| Tracking algorithm settings (43) | |
|---|---|
| dt_start = | 201101010000 |
| dt_end = | 201101312300 |
| vor_lvl = | 950 |
| steer_lvl_btm = | 1000 |
| steer_lvl_top = | 700 |
| datadir = | ../../reanalysis/era5 |
| outdir = | ../results/test7 |
| vort_name = | vo |
| u_name = | u |
| v_name = | v |


## Other methods of initialisation

The `TrackRun` class can also be initialised empty:

In [14]:
tr_empty = TrackRun()

In [15]:
tr_empty

| Cyclone tracking results | |
|---|---|
| Number of tracks | 0 |
| Data columns | lon, lat, vo, time, area, vortex_type |


### Concatenate data from several directories

```python
TR = TrackRun(one_directory)
TR2 = TrackRun(another_directory)
TR.extend(TR2)
```

or

```python
TR += TrackRun(another_directory)
```

## Some attributes of TrackRun

A `TrackRun` object has a few useful properties:

* for example, it is possible to access the list of files that the data were loaded from

In [16]:
tr.filelist[:5]

[PosixPath('sample_data/era5/run000/test/vortrack_0001_0001.txt'),
 PosixPath('sample_data/era5/run000/test/vortrack_0002_0001.txt'),
 PosixPath('sample_data/era5/run000/test/vortrack_0003_0001.txt'),
 PosixPath('sample_data/era5/run000/test/vortrack_0004_0001.txt'),
 PosixPath('sample_data/era5/run000/test/vortrack_0005_0001.txt')]

* there is a shortcut to group tracks by their index

In [17]:
tr.gb

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fee3dd9f390>

It is equivalent to calling `tr.data.groupby("track_idx")`.

In [18]:
for idx, a_track in tr.gb:
    print(f"Track with index={idx}")
    print(a_track)
    break

Track with index=0
                   lon   lat        vo                time         area  \
track_idx row_idx                                                         
0         0        9.3  79.2  0.000345 2011-01-01 00:00:00  10617.93652   
          1        9.3  79.2  0.000352 2011-01-01 01:00:00  10154.14648   
          2        9.3  79.2  0.000362 2011-01-01 02:00:00  10136.94629   

                   vortex_type  
track_idx row_idx               
0         0                  0  
          1                  0  
          2                  0  


* the current `TrackRun` has not been categorised yet (see the "Categorisation" examples), so the `cat_labels` attribute is empty:

In [19]:
tr.cat_labels

[]

## Serialising TrackRun

`TrackRun` and all its metadata can be saved to and loaded from an HDF file.

In [20]:
tr.to_archive("test.h5")

In [21]:
new_tr = TrackRun.from_archive("test.h5")

In [22]:
new_tr

| Cyclone tracking results | |
|---|---|
| Number of tracks | 671 |
| Data columns | lon, lat, vo, time, area, vortex_type |
| Sources | sample_data/era5/run000/test |


## Units of TrackRun

In [23]:
import random

Each cyclone track stored in a `TrackRun` is represented as an `OctantTrack` instance. Here, one track is picked at random.

In [24]:
ot = random.choice([*tr[:].gb])[1]

It is essentially a subclass of `pandas.DataFrame`.

In [25]:
ot

| track_idx | row_idx | lon | lat | vo | time | area | vortex_type |
|---|---|---|---|---|---|---|---|
| 124 | 0 | 12.3 | 72.0 | 0.000423 | 2011-01-05 23:00:00 | 7615.17285 | 1 |
| 124 | 1 | 10.8 | 72.0 | 0.000393 | 2011-01-06 00:00:00 | 19943.6875 | 1 |
| 124 | 2 | 8.7 | 72.0 | 0.000349 | 2011-01-06 01:00:00 | 18887.55273 | 1 |
| 124 | 3 | -3.6 | 67.2 | 0.000666 | 2011-01-06 02:00:00 | 95968.20312 | 3 |


In [26]:
type(ot)

octant.core.OctantTrack

It has a few useful properties

In [27]:
ot.lifetime_h

3.0

In [28]:
ot.total_dist_km

836.7043121073503

including maximum vorticity in $s^{-1}$:

In [29]:
ot.max_vort

0.00066578

In [30]:
ot.gen_lys_dist_km

810.8722660749592

In [31]:
ot.average_speed

278.90143736911676

In [32]:
ot.lonlat

array([[12.3, 72. ],
       [10.8, 72. ],
       [ 8.7, 72. ],
       [-3.6, 67.2]])
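The values printed above are consistent with each other. The following is a minimal sketch that uses only the numbers shown in this notebook. It assumes that the lifetime is the time between the first and the last track point and that the average speed is the total distance divided by the lifetime (which would give km per hour). These are assumptions for illustration; octant's own implementation may differ.

```python
from datetime import datetime

# Time stamps of the track shown above (track_idx=124)
times = [
    datetime(2011, 1, 5, 23),
    datetime(2011, 1, 6, 0),
    datetime(2011, 1, 6, 1),
    datetime(2011, 1, 6, 2),
]
total_dist_km = 836.7043121073503  # value of ot.total_dist_km printed above

# Assumption: lifetime is the span between the first and last time stamps
lifetime_h = (times[-1] - times[0]).total_seconds() / 3600
# Assumption: average speed is total distance over lifetime, in km/h
average_speed = total_dist_km / lifetime_h

print(lifetime_h)  # 3.0
print(round(average_speed, 3))  # 278.901
```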

## Running octant with a progress bar

If either the `fastprogress` or the `tqdm` package is installed, some methods can display a progress bar while they run. To enable it, set the `enable_progress_bar` attribute of `octant.RUNTIME` to `True`:

In [33]:
import octant
octant.RUNTIME.enable_progress_bar = True

In [34]:
conditions = [
    ("long_lived", [lambda ot: ot.lifetime_h >= 6]),
    (
        "far_travelled_and_very_long_lived",
        [lambda ot: ot.lifetime_h >= 36, lambda ot: ot.gen_lys_dist_km > 300.0],
    ),
    ("strong", [lambda x: x.max_vort > 1e-3]),
]

In [35]:
tr.classify(conditions)
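To make the structure of `conditions` easier to read, here is a plain-Python sketch of how such a list could be evaluated for a single track. It assumes that a label applies when every function in its list returns `True`. `FakeTrack` and `matching_labels` are hypothetical names used only for this illustration; this is not how `classify()` is implemented in octant, and the way octant combines categories may differ.

```python
from collections import namedtuple

# Hypothetical stand-in for a track with only the attributes used below
FakeTrack = namedtuple("FakeTrack", ["lifetime_h", "gen_lys_dist_km", "max_vort"])

conditions = [
    ("long_lived", [lambda ot: ot.lifetime_h >= 6]),
    (
        "far_travelled_and_very_long_lived",
        [lambda ot: ot.lifetime_h >= 36, lambda ot: ot.gen_lys_dist_km > 300.0],
    ),
    ("strong", [lambda ot: ot.max_vort > 1e-3]),
]


def matching_labels(track, conditions):
    """Return the labels whose functions all return True for the track."""
    return [label for label, funcs in conditions if all(f(track) for f in funcs)]


# Values close to the 3-hour track shown earlier, plus an invented long track
short_track = FakeTrack(lifetime_h=3.0, gen_lys_dist_km=810.87, max_vort=6.7e-4)
long_track = FakeTrack(lifetime_h=48.0, gen_lys_dist_km=500.0, max_vort=2e-3)

print(matching_labels(short_track, conditions))  # []
print(matching_labels(long_track, conditions))
# ['long_lived', 'far_travelled_and_very_long_lived', 'strong']
```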

More `classify()` [examples](01_Categorisation.ipynb)

## octant's utilities

In [36]:
from octant.utils import great_circle

In [37]:
great_circle(lon1=9.6, lon2=10.2, lat1=76.9, lat2=78.9)

222826.50759451024
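The magnitude of the result suggests that the distance is returned in metres (about 222.8 km here). As a cross-check, the sketch below computes a great-circle distance with the haversine formula in plain Python. The formula and the mean Earth radius of 6371009 m are assumptions made for this illustration; they are not taken from the octant source, and `haversine` is a hypothetical name.

```python
import math

EARTH_RADIUS_M = 6_371_009.0  # assumed mean Earth radius in metres


def haversine(lon1, lat1, lon2, lat2, r=EARTH_RADIUS_M):
    """Great-circle distance in metres between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    # Haversine of the central angle between the two points
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * r * math.asin(math.sqrt(a))


dist_m = haversine(lon1=9.6, lat1=76.9, lon2=10.2, lat2=78.9)
print(f"{dist_m / 1e3:.1f} km")  # 222.8 km
```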