# OUTPUT TRACKING ALGORITHM


---
Authors: **Helvecio B. Leal Neto** & **Alan J. P. Calheiros**\
**National Institute for Space Research - Brazil - (2021)**



## About

This notebook displays the results of the beta version of the storm/precipitation tracking algorithm. The results presented here come from tracking clusters in radar data provided by the GoAmazon project for the following period:

**Start**: 2014-09-07 00:00:00

**End**: 2014-09-09 00:00:00

The reflectivity thresholds used for tracking (main threshold first, then the inner thresholds) are:

* **20** dBZ
* inner 1 - ***35*** dBZ
* inner 2 - ***40*** dBZ

Minimum size threshold per cluster:

* **30** pixels
* inner 1 - ***15*** pixels
* inner 2 - ***10*** pixels
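
As a rough illustration of how the two lists above relate, the sketch below applies each reflectivity threshold to a small synthetic field and compares the resulting pixel count with the matching minimum size. The names `THRESHOLDS_DBZ`, `MIN_SIZE_PX`, and `passes_size_filter` are hypothetical and are not part of the tracking algorithm. The sketch also assumes that a pixel equal to the threshold counts as above it, and it counts all pixels in the field without separating them into clusters.

```python
import numpy as np

# Values taken from the lists above; the variable names are hypothetical.
THRESHOLDS_DBZ = [20, 35, 40]   # main, inner 1, inner 2
MIN_SIZE_PX = [30, 15, 10]      # minimum pixels per cluster, same order

def passes_size_filter(field, level):
    """Return True if enough pixels reach the threshold of the given level.

    Simplification: every pixel at or above the threshold is counted,
    without grouping pixels into separate clusters.
    """
    n_pixels = int(np.count_nonzero(field >= THRESHOLDS_DBZ[level]))
    return n_pixels >= MIN_SIZE_PX[level]

# Synthetic 10x10 field: 25 dBZ everywhere, with a 4x4 core of 38 dBZ
field = np.full((10, 10), 25.0)
field[3:7, 3:7] = 38.0

results = [passes_size_filter(field, lvl) for lvl in range(3)]
print(results)
```

In this synthetic case the main and first inner levels pass their size limits, while no pixel reaches 40 dBZ, so the second inner level does not.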

## Dependencies

In [5]:
import sys
sys.path.append("../")

import stanalyzer as sta

frame = sta.read_file('../tracks/S201409070000_E201409100000_VDBZc_T20_L5.pkl')


# print(sta.life_cicle(frame))

In [None]:
help(sta.read_file)

In [None]:
import pandas as pd

In [None]:
### PATH
PATH_FILE = '../tracks/S201409070000_E201409100000_VDBZc_T20_L5.pkl'

In [None]:
### Read tracking file
df = sta.read_file(PATH_FILE)
df.head()

In [None]:
sta.life_cicle(df)

In [None]:
def life_cicle(dframe):
    """Summarize the life cycle of each tracked family (one row per uid)."""
    life_time,uid_,start_,end_ = [],[],[],[]

    ## Group by uid; each group holds every tracked time of one family
    for uid, group in dframe.groupby("uid"):
        uid_.append(uid)
        life_time.append(len(group))
        ## Initial and final timestamps (rows are assumed to be ordered in time)
        start_.append(group.timestamp.values[0])
        end_.append(group.timestamp.values[-1])

    ## Create life cycle dataframe
    cicle_life = pd.DataFrame(list(zip(uid_, life_time,start_,end_)),
                   columns =['uid', 'times','begin','end'])
    ## Calculate duration
    cicle_life['duration'] = pd.to_datetime(cicle_life['end']) - pd.to_datetime(cicle_life['begin'])
    return cicle_life

In [None]:
life_cicle(df)
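
The same kind of summary can also be sketched with a single `groupby` aggregation. The example below is an alternative under stated assumptions, not the `stanalyzer` implementation: the function name `life_cycle_summary` and the toy rows are hypothetical, and `begin`/`end` are taken as the minimum and maximum timestamp rather than the first and last row, so the result does not depend on row order.

```python
import pandas as pd

def life_cycle_summary(dframe):
    """One row per uid: number of tracked times, first/last timestamp, duration."""
    # Parse timestamps once so min/max and the subtraction work on datetimes
    ts = dframe.assign(timestamp=pd.to_datetime(dframe["timestamp"]))
    summary = (
        ts.groupby("uid")["timestamp"]
        .agg(times="count", begin="min", end="max")
        .reset_index()
    )
    summary["duration"] = summary["end"] - summary["begin"]
    return summary

# Synthetic tracking rows (only the two columns the summary needs)
toy = pd.DataFrame({
    "uid": [1, 1, 1, 2],
    "timestamp": ["2014-09-07 00:00:00", "2014-09-07 00:12:00",
                  "2014-09-07 00:24:00", "2014-09-07 00:12:00"],
})
summary = life_cycle_summary(toy)
print(summary)
```

A family that appears at a single time gets a duration of zero, which may or may not be the convention wanted for short-lived clusters.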

In [None]:
# Dataframe library
import pandas as pd
# Numerical Python library
import numpy as np
# netCDF4 library
import netCDF4
# Import gzip to open netCDF
import gzip
# Visualization library
import matplotlib.pyplot as plt

## Variables

**Fam_Nº**-> Refers to the number of the Tracked Family.
<br>

**timestamp** -> Digital record of the time at which a particular event occurred.
<br>
**time** -> Refers to the tracking time in the algorithm.
<br>
**uid** -> Unique IDentifier, it is used to generate the families.
<br>
**id_t** -> Cluster identifier at the time of the tracking occurrence, assigned by the DBSCAN clustering algorithm.
<br>
**lat** -> Latitude of the centroid, taken from the reference matrix of the original nc files.
<br>
**lon** -> Longitude of the centroid, taken from the reference matrix of the original nc files.
<br>
**p0** -> First coordinate of the centroid in the matrix (clusters or nc_file): (p0,p1)=(x,y)=(lon,lat).
<br>
**p1** -> Second coordinate of the centroid in the matrix (clusters or nc_file): (p0,p1)=(x,y)=(lon,lat).
<br>
**size_%THRESHOLD** -> Total number of pixels in the main cluster. The area covered by each pixel depends on the sensor's spatial resolution (pixel size); for this radar it is 2x2 km.
<br>
**mean_ref_%THRESHOLD** -> Averaged reflectivity of the cluster. Value in dBZ.
<br>
**max_ref_%THRESHOLD** -> Maximum reflectivity of the cluster. Value in dBZ.
<br>
**angle_%THRESHOLD_orig** -> Original displacement angle of the cluster at the current time.
<br>
**angle_%THRESHOLD_cor** -> Corrected displacement angle of the cluster at the current time. 
<br>
**vel_%THRESHOLD_orig** -> Original displacement velocity of the cluster at the current time in kilometers per hour (km/h).
<br>
**vel_%THRESHOLD_cor** -> Corrected displacement velocity of the cluster at the current time in kilometers per hour (km/h).
<br>
**mean_total_ref_%THRESHOLD** -> Average reflectivity of the inner clusters by threshold (Value in dBZ).
<br>
**total_size_%THRESHOLD** -> Total size of inner clusters by threshold (number of pixels).
<br>
**n_cluster_%THRESHOLD** -> Total number of inner clusters by Threshold.
<br>
**avg_angle_%THRESHOLD** -> Averaged angle for the inner cluster by threshold (Value in degree).
<br>
**avg_vel_%THRESHOLD** -> Averaged velocity for inner clusters by threshold (Value in km/h).
<br>
**status** -> Status of the occurrence. Types: NEW -> new cluster; CONT -> continuous cluster; SPLT -> split cluster; MERG -> merged cluster.
<br>
**delta_t** -> Time interval for cluster life cycle.
<br>
**nc_file** -> Path of netCDF file.
<br>
**cluster_file** -> Path of cluster file (From DBSCAN).
<br>
**dsize_%THRESHOLD** -> Difference between the sizes of two consecutive clusters (in pixels).
<br>
**dmean_ref_%THRESHOLD** -> Difference between the mean reflectivities of two consecutive clusters for main threshold (in dBZ).
<br>
**dmean_total_ref_%THRESHOLD** -> Difference between the mean reflectivities of all clusters between two consecutive times for an inner threshold (in dBZ).
<br>
**dtotal_size_%THRESHOLD** -> Difference between the total sizes of all clusters at two consecutive times for an inner threshold (in pixels).
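
To make the size variables more concrete, the sketch below converts a pixel count into an area using the 2x2 km radar pixel mentioned above, and recomputes a size difference between consecutive times. The rows are synthetic, and the column name `size_20` assumes that `%THRESHOLD` is replaced by the threshold value in dBZ; check the actual column names with `df.columns` before relying on it. The columns `area_km2_20` and `dsize_20_check` are hypothetical and exist only in this example.

```python
import pandas as pd

PIXEL_AREA_KM2 = 2 * 2  # one radar pixel covers 2 x 2 km

# Synthetic rows for a single family; 'size_20' is an assumed column name
toy = pd.DataFrame({
    "uid": [7, 7, 7],
    "size_20": [30, 45, 40],
    "status": ["NEW", "CONT", "CONT"],
})

# Cluster area in square kilometres
toy["area_km2_20"] = toy["size_20"] * PIXEL_AREA_KM2

# Size change between consecutive times of the same family,
# analogous to dsize_%THRESHOLD (the first time has no previous value)
toy["dsize_20_check"] = toy.groupby("uid")["size_20"].diff()
print(toy)
```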

## Read tracking file

Tracking DataFrame.

In [None]:
fam_tracking = pd.read_pickle("./S201409070000_E201409100000_VDBZc_T20_L5.pkl")
fam_tracking

### Example: how to select a family (FAM) by uid

In [None]:
uid = 97
selected_fam = fam_tracking.query('uid == @uid')
selected_fam

## Example: how to select a cluster in the family

In [None]:
line = 0 #first line 

selected_line = selected_fam.iloc[[line]]
selected_line

## Example: how to open the cluster file and the original data to extract reflectivity values

In [None]:
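## OPEN CLUSTERS
def open_cluster(path):
    try:
        ## Cast to float so that background pixels (0) can be replaced by NaN
        cluster = np.load(path['cluster_file'].values[0])['arr_0'].astype(float)
    except FileNotFoundError:
        print('File not found!')
        return None
    cluster[cluster == 0] = np.nan
    return cluster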

In [None]:
selected_line

In [None]:
cluster_matrix_all = open_cluster(selected_line)
print('Original dimensions of cluster->',cluster_matrix_all.shape)

THRESHOLD_LEVEL = 0 # threshold level to select; 0 is the main threshold (0 -> 20 dBZ, 1 -> 35 dBZ, 2 -> 40 dBZ)
cluster_matrix = cluster_matrix_all[:,:,THRESHOLD_LEVEL]
print('Selected dimensions of cluster->',cluster_matrix.shape)
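
The slicing above can be tried on a small synthetic array. This sketch assumes, as the cell above suggests, that the cluster array has shape (rows, columns, threshold levels) and that background pixels are NaN after loading; the array contents are invented for illustration.

```python
import numpy as np

# Synthetic stack with shape (rows, cols, levels); values are cluster ids, 0 = background
stack = np.zeros((4, 5, 3))
stack[0:3, 0:3, 0] = 1      # level 0 (main threshold): one 3x3 cluster
stack[1:3, 1:3, 1] = 1      # level 1 (inner 1): a smaller 2x2 core
stack[stack == 0] = np.nan  # mask the background with NaN

# Selecting one level keeps the two spatial dimensions
level0 = stack[:, :, 0]

# Number of labelled (non-NaN) pixels at each threshold level
pixels_per_level = [int(np.count_nonzero(~np.isnan(stack[:, :, k]))) for k in range(3)]
print(level0.shape, pixels_per_level)
```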

In [None]:
### OPEN NETCDF
def open_file(file_path):
    VAR_NAME = 'DBZc'
    LEVEL = 5 #2.5km height
    THRESHOLDS = [20,35,40] #dBZ
    ## The netCDF file is gzip-compressed, so it is read into memory first
    with gzip.open(file_path['nc_file'].values[0]) as gz:
        with netCDF4.Dataset('dummy', mode='r', memory=gz.read()) as nc:
            data = nc.variables[VAR_NAME][0][LEVEL][:].filled()
    ## Replace the missing-value flag, then mask everything below the main threshold
    data[data == -9999.] = np.nan
    data[data < THRESHOLDS[0]] = np.nan
    return data

In [None]:
nc_matrix = open_file(selected_line)
print('NetCDF Max/Min values (thresholded):\n',np.nanmax(nc_matrix),np.nanmin(nc_matrix))

In [None]:
fig, (ax,ax1) = plt.subplots(1,2, figsize=(15,6))

ax.imshow(nc_matrix)
ax1.imshow(cluster_matrix);
ax.set_title('Original file')
ax1.set_title('Cluster file');

## Extracting reflectivities from the selected cluster

To extract the reflectivity values of an individual cluster, first choose its tracking 'id_t', as shown below.

Visualization of individual line.

In [None]:
selected_line

In [None]:
### This line shows that id_t is equal to 20. 
selected_id_t = selected_line.id_t.values[0]

### Get the row and column indices of the cluster pixels from the cluster matrix
x,y = np.where(cluster_matrix == selected_id_t) # np.where returns (rows, columns)

### Get reflectivity values of the cluster from the nc_file matrix
dbz_list = nc_matrix[x,y]

In [None]:
print('List with reflectivity values of an individual cluster.\n',dbz_list)
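
Once the reflectivity values of a cluster are extracted, summary statistics follow directly. The sketch below repeats the extraction on two small synthetic matrices and computes the pixel count, mean, and maximum, which are the kinds of quantities described by `size_%THRESHOLD`, `mean_ref_%THRESHOLD`, and `max_ref_%THRESHOLD`. The matrices and the dictionary keys are invented for illustration, and whether the tracking algorithm computes its statistics exactly this way is an assumption.

```python
import numpy as np

# Synthetic label matrix (NaN = background) and reflectivity field of the same shape
labels = np.array([[np.nan, 20.0, 20.0],
                   [np.nan, 20.0, 3.0],
                   [np.nan, np.nan, 3.0]])
dbz = np.array([[np.nan, 22.0, 31.0],
                [np.nan, 25.0, 41.0],
                [np.nan, np.nan, 36.0]])

# Row and column indices of the pixels labelled with id 20
rows, cols = np.where(labels == 20.0)
values = dbz[rows, cols]  # reflectivities inside that cluster

stats = {"size": int(values.size),
         "mean_ref": float(np.nanmean(values)),
         "max_ref": float(np.nanmax(values))}
print(stats)
```

Comparing such values with the corresponding columns of the selected line is a quick consistency check between the cluster file and the original data.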

### Cluster location view

In [None]:
fig, (ax,ax1) = plt.subplots(1,2, figsize=(15,6))
ax.imshow(nc_matrix)
ax1.imshow(cluster_matrix);
ax.set_title('Original file')
ax1.set_title('Cluster file');

ax.scatter(selected_line.p0,selected_line.p1,marker='x',color='r',s=100)
ax1.scatter(selected_line.p0,selected_line.p1,marker='x',color='r',s=100)