<div align="center" style="color:#336699"><b><h2>pyForTraCC - Track Infrared (Real Time Data)</h2></b></div>
<hr style="border:2px solid #0077b9;">
<br/>
<div style="text-align: center;font-size: 90%;">
   <sup><a href="https://www.linkedin.com/in/helvecio-leal/"> Helvécio B. Leal Neto, <i class="fab fa-lg fa-orcid" style="color: #a6ce39"></i></a></sup><t>&nbsp;</t> 
    <sup><a href="https://www.linkedin.com/in/alan-calheiros-64a252160/">Alan J. P. Calheiros<i class="fab fa-lg fa-orcid" style="color: #a6ce39"></i></a></sup>
   <br/><br/>
    National Institute for Space Research (INPE)
    <br/>
    Avenida dos Astronautas, 1758, Jardim da Granja, São José dos Campos, SP 12227-010, Brazil
    <br/><br/>
    Contact: <a href="mailto:helvecio.neto@inpe.br">helvecio.neto@inpe.br</a>, <a href="mailto:alan.calheiros@inpe.br">alan.calheiros@inpe.br</a>
    <br/><br/>
    Last Update: Nov 6, 2024
</div>

<br/>

<div style="text-align: justify;  margin-left: 25%; margin-right: 25%;">
<b>Abstract.</b> This Jupyter Notebook shows how to use pyfortracc to track the latest GOES-16 satellite images processed by CPTEC/INPE.<br>
The algorithm uses brightness temperature data from the ABI sensor to identify and track precipitating systems over South America.<br>
The output is a tracking table that describes the life cycle of each system.
</div>    
<br/>
<div style="text-align: justify;  margin-left: 15%; margin-right: 15%;font-size: 75%; border-style: solid; border-color: #0077b9; border-width: 1px; padding: 5px;">
    <b>In this example, we use fortracc to compute the tracks of precipitating systems over South America and explore the output data produced by the algorithm workflow.
</b>
    <div style="margin-left: 10px; margin-right: 10px; margin-top:10px">
      <p> Leal Neto, H.B.; Calheiros, A.J.P.;  fortracc Algorithm. São José dos Campos, INPE, 2024. <a href="https://github.com/fortracc-project/" target="_blank"> Online </a>. </p>
    </div>
</div>

### Schedule
 [1. Installation](#install)<br>
 [2. Input Data](#input)<br>
 [3. Read Function](#data)<br>
 [4. Parameters (Name_list)](#namelist)<br>
 [5. Tracking Routine](#track)<br>
 [6. Tracking Visualization](#visualization)<br>

<a id='install'></a>
#### 1. Installation

The pyForTraCC package can be installed with the `pip install` command.

All dependencies will be installed in the current Python environment and the code will be ready to use.

In [None]:
# Install latest version of pyfortracc from github
!python -m pip install -q -U git+https://github.com/fortracc/pyfortracc.git@main#egg=pyfortracc
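After the install cell finishes, it can be useful to confirm that the package is visible in the current environment. The snippet below is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name, not part of pyfortracc.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string when the install step succeeded, otherwise None
print(installed_version("pyfortracc"))
```

If the result is `None`, restarting the kernel and re-running the install cell is a reasonable first step.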

<a id='input'></a>
#### 2. Data Input (Download Files)

The get_cptec script downloads the latest processed infrared images from the GOES-16 satellite, provided by CPTEC/INPE (National Institute for Space Research). These images, available for download at this [link](http://ftp.cptec.inpe.br/goes/goes16/retangular/), are from Channel 13, representing infrared data that have been reprojected onto a rectangular grid over South America.

In [None]:
# Install regex package to use with the script
!python -m pip install -q -U regex

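# Download the helper script from the pyfortracc repository
import requests
url = 'https://raw.githubusercontent.com/fortracc/pyfortracc/refs/heads/main/examples/03_Track-Infrared-Dataset/get_cptec.py'
response = requests.get(url, timeout=30)
response.raise_for_status()  # Stop here if the download failed
with open('get_cptec.py', 'wb') as f:
    f.write(response.content)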

# Set the number of images, channel and path to save the images
n_images = 5
channel = 'ch13'
path = 'input/'

# Run the script to download the images
!python get_cptec.py --n $n_images --c $channel --p $path
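Once the download finishes, the image times can be recovered from the file names. The sketch below assumes that the files follow the `S10635346_%Y%m%d%H%M.nc` pattern that this notebook passes to `timestamp_pattern` in the `name_list` section, and that the pattern uses `strptime`-style directives. `downloaded_timestamps` is a hypothetical helper and the sample file name is illustrative.

```python
from datetime import datetime
from pathlib import Path

# Assumed file-name pattern, copied from the name_list defined later in this notebook
PATTERN = 'S10635346_%Y%m%d%H%M.nc'


def downloaded_timestamps(path, pattern=PATTERN):
    """Return the sorted image times parsed from the NetCDF file names in `path`."""
    times = []
    for file in Path(path).glob('*.nc'):
        try:
            times.append(datetime.strptime(file.name, pattern))
        except ValueError:
            # Skip files whose names do not follow the pattern
            continue
    return sorted(times)


# Hypothetical file name, used only to show the parsing
print(datetime.strptime('S10635346_202411061200.nc', PATTERN))
```

Calling `downloaded_timestamps(path)` after the download cell should return one entry per image. The gaps between consecutive entries are expected to match the `delta_time` value set in the `name_list`.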

<a id='data'></a>
#### 3. Read Function

The `read_function` reads the data from a NetCDF file and returns it as a NumPy array.<br>
We select the `Band1` variable of the NetCDF file, which holds the infrared channel (Channel 13) of the GOES-16 satellite, and divide it by 100 to convert it to brightness temperature in Kelvin.

In [None]:
# The function below reads the data from the downloaded files
import xarray as xr
import glob
def read_function(path):
    ds = xr.open_dataset(path)
    return ds['Band1'].data / 100
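The division by 100 can be illustrated on a few numbers. This is a minimal sketch that assumes the stored values are hundredths of a kelvin, as the text above implies. Plain Python lists stand in for the NumPy array, and both `to_kelvin` and the sample values are hypothetical.

```python
def to_kelvin(raw_rows, scale=100):
    """Convert stored values (assumed hundredths of a kelvin) to kelvin."""
    return [[value / scale for value in row] for row in raw_rows]


# Hypothetical raw values, for illustration only
raw = [[23500, 21000], [29815, 20050]]
kelvin = to_kelvin(raw)
print(kelvin)
```

With this scaling, a stored value of 23500 corresponds to 235 K, the warmer of the two thresholds used later in the `name_list`.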

In [None]:
# Set the lon_min, lon_max, lat_min and lat_max of domain
files = glob.glob(path + '*.nc')
ds = xr.open_dataset(files[0])

lon_min = float(ds['lon'].min().values)
lon_max = float(ds['lon'].max().values)
lat_min = float(ds['lat'].min().values)
lat_max = float(ds['lat'].max().values)

<a id='namelist'></a>
#### 4. Parameters: name_list

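`name_list` is a dictionary that holds the tracking parameters: the input and output paths, the thresholds, the timestamp pattern of the file names, the time step between images, and the domain limits.<br>
Here convective systems are tracked with two brightness temperature thresholds, 235 K and 210 K, selecting pixels at or below each value (`operator` is `<=`), with minimum cluster sizes of 300 and 250, respectively.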

In [None]:
name_list = {} # Set name_list dict
name_list['input_path'] = 'input/'
name_list['output_path'] = 'output/'
name_list['thresholds'] = [235, 210]
name_list['min_cluster_size'] = [300, 250]
name_list['operator'] = '<='
name_list['timestamp_pattern'] = 'S10635346_%Y%m%d%H%M.nc'
name_list['delta_time'] = 10
name_list['cluster_method'] = 'ndimage'
name_list['lon_min'] = lon_min
name_list['lon_max'] = lon_max
name_list['lat_min'] = lat_min
name_list['lat_max'] = lat_max

# Add correction methods
name_list['spl_correction'] = True # It is used to perform the correction at Splitting events
name_list['mrg_correction'] = True # It is used to perform the correction at Merging events
name_list['inc_correction'] = True # It is used to perform the correction using Inner Core vectors
name_list['opt_correction'] = True # It is used to perform the correction using the Optical Flow method
name_list['elp_correction'] = True # It is used to perform the correction using the Ellipse method
name_list['validation'] = True # It is used to perform the validation of the correction methods
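The effect of `thresholds` combined with `operator` can be pictured as one boolean mask per threshold. The sketch below only illustrates that idea on a tiny hand-written field; it assumes each threshold is compared independently against every pixel and is not pyfortracc's internal code. `threshold_masks` and `OPERATORS` are hypothetical names.

```python
import operator

# Map the operator string to a comparison function
OPERATORS = {
    '<=': operator.le,
    '<': operator.lt,
    '>=': operator.ge,
    '>': operator.gt,
    '==': operator.eq,
}


def threshold_masks(field, thresholds, op):
    """Return one boolean mask per threshold; True marks pixels that satisfy `op`."""
    compare = OPERATORS[op]
    return [[[compare(value, t) for value in row] for row in field]
            for t in thresholds]


# Hypothetical brightness temperatures in kelvin
field = [[240.0, 230.0, 205.0],
         [236.0, 235.0, 210.0]]
masks = threshold_masks(field, [235, 210], '<=')
print(masks[0])  # pixels at or below 235 K
print(masks[1])  # pixels at or below 210 K
```

The colder threshold always selects a subset of the pixels selected by the warmer one, which is why the two levels can be read as nested regions of the same system.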

<a id='track'></a>
#### 5. Track Routine

The `track` function receives the `name_list` dictionary and the `read_function`, then identifies and tracks the convective systems.

In [None]:
# Import the library
import pyfortracc

In [None]:
# Track the clusters
pyfortracc.track(name_list, read_function)

<a id='visualization'></a>
#### 6. Visualize the Track Output

`tracking_table` is a pandas DataFrame containing the tracking information of the convective systems.<br>

In [None]:
import duckdb

# Connect to the database
con = duckdb.connect(database=':memory:', read_only=False)

# Read the tracking table from name_list['output_path'] + '/track/trackingtable/*.parquet'
con.execute("CREATE TABLE tracking_table AS SELECT * FROM parquet_scan('output/track/trackingtable/*.parquet')")

# Load the tracking table into a pandas DataFrame
tracking_table = con.execute("SELECT * FROM tracking_table").fetch_df()

# Display the tracking table
display(tracking_table.tail())
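A quick way to explore the table is to measure how long each system was observed. The sketch below works on plain dictionaries so that it runs without the tracking output. It assumes every row carries the `uid` and `timestamp` columns used elsewhere in this notebook; `system_durations` and the sample rows are hypothetical.

```python
from datetime import datetime


def system_durations(rows):
    """Return, for each uid, the time between its first and last record."""
    first_last = {}
    for row in rows:
        uid, time = row['uid'], row['timestamp']
        start, end = first_last.get(uid, (time, time))
        first_last[uid] = (min(start, time), max(end, time))
    return {uid: end - start for uid, (start, end) in first_last.items()}


# Hypothetical rows, for illustration only
rows = [
    {'uid': 1, 'timestamp': datetime(2024, 11, 6, 12, 0)},
    {'uid': 1, 'timestamp': datetime(2024, 11, 6, 12, 20)},
    {'uid': 2, 'timestamp': datetime(2024, 11, 6, 12, 10)},
]
durations = system_durations(rows)
print(durations)
```

With the DataFrame loaded above, the same helper can be called as `system_durations(tracking_table.to_dict('records'))`.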

The `plot_animation` function receives the read function and the `name_list`, and plots the data and the tracks on the same map.<br>
We need to set the dimensions of the plot, the projection, and the extent of the plot.

In [None]:
import numpy as np

# Plot the animation
pyfortracc.plot_animation(read_function=read_function, # Read function
                          figsize=(8,8), # Figure size
                          name_list=name_list, # Name list dictionary
                          start_timestamp = str(tracking_table['timestamp'].min()), # Start timestamp
                          end_timestamp= str(tracking_table['timestamp'].max()), # End timestamp
                          info_col_name=False, # Info column name
                          cbar_title='Temperature (K)', # Colorbar title
                          trajectory=True, # Plot the trajectory
                          smooth_trajectory=True, # Smooth the trajectory
                          cmap='turbo', # Colormap
                          min_val=200, # Min value
                          max_val=235, # Max value
                          nan_value=235, # NaN value
                          nan_operation=np.greater_equal, # NaN operation
                          bound_color='blue', # Bound color
                          info_cols=['uid'], # Info columns from tracking table
                          )

To zoom in on a region of interest, pass its extent to the `zoom_region` argument as `[lon_min, lon_max, lat_min, lat_max]`.

In [None]:
# Plot the tracking data for a specific region (Sao Paulo state)
sp_lat_min = -25
sp_lat_max = -20
sp_lon_min = -53
sp_lon_max = -45
zoom_region = [sp_lon_min, sp_lon_max, sp_lat_min, sp_lat_max]

# Note: the parallel option is not available on macOS inside a notebook; use parallel=False there
pyfortracc.plot_animation(read_function=read_function, # Read function
                          figsize=(8,8), # Figure size
                          name_list=name_list, # Name list dictionary
                          start_timestamp = str(tracking_table['timestamp'].min()), # Start timestamp
                          end_timestamp= str(tracking_table['timestamp'].max()), # End timestamp
                          info_cols=['uid','status','lifetime'],
                          cbar_title='Temperature (K)', # Colorbar title
                          trajectory=True, # Plot the trajectory
                          smooth_trajectory=True, # Smooth the trajectory
                          cmap='turbo', # Colormap
                          min_val=200, # Min value
                          max_val=235, # Max value
                          nan_value=235, # NaN value
                          nan_operation=np.greater_equal, # NaN operation
                          zoom_region=zoom_region, # Zoom region
                          bound_color='blue', # Bound color
                          background='google', # Background
                          )
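Before plotting a zoomed view, it can help to check that the requested region lies inside the tracked domain. The sketch below assumes both extents use the `[lon_min, lon_max, lat_min, lat_max]` order of the `zoom_region` list above. `region_inside` is a hypothetical helper and `example_domain` holds made-up limits.

```python
def region_inside(zoom_region, domain):
    """Return True when zoom_region is a valid box fully contained in domain."""
    z_lon_min, z_lon_max, z_lat_min, z_lat_max = zoom_region
    d_lon_min, d_lon_max, d_lat_min, d_lat_max = domain
    return (d_lon_min <= z_lon_min < z_lon_max <= d_lon_max
            and d_lat_min <= z_lat_min < z_lat_max <= d_lat_max)


sao_paulo = [-53, -45, -25, -20]
# Hypothetical domain limits, for illustration only
example_domain = [-100.0, -25.0, -56.0, 12.5]
print(region_inside(sao_paulo, example_domain))
```

In this notebook the check would read `region_inside(zoom_region, [lon_min, lon_max, lat_min, lat_max])`, using the limits taken from the first NetCDF file.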

#### 7. Convert the Parquet Files to a fortracc-like Tracking Family File and CSV

In [None]:
from pyfortracc.post_processing import convert_parquet_to_family
convert_parquet_to_family(name_list)