# File-based I/O

Typically, the velocity field is obtained from simulations or experiments and stored in files, and the solver reads the velocity data from disk. This notebook demonstrates this (most common) use case of **file-based I/O**.

In this notebook, we revisit the **double-gyre flow problem** as a toy example. The core idea is to show how the input files must be formatted for pyFTLE, and how to use the command-line interface (CLI) to process these files.

## Generating the input files and saving them on disk:

In a production run, these files would come from your simulations or experiments.
Here, though, we pay particular attention to the data format expected by pyFTLE, so that you can adapt your own data accordingly.

pyFTLE expects MATLAB `.mat` files containing velocity fields, spatial coordinates, and seed particle positions used in
FTLE computations.

### Supported data layouts:

Each `.mat` file must contain specific variable names:

- **Velocity data**:
  - 2D: `velocity_x`, `velocity_y`
  - 3D: `velocity_x`, `velocity_y`, `velocity_z` (optional)

- **Coordinate data**:
  - 2D: `coordinate_x`, `coordinate_y`
  - 3D: `coordinate_x`, `coordinate_y`, `coordinate_z` (optional)

- **Seed particle data**:
  - 2D: `left`, `right`, `top`, `bottom`
  - 3D: `left`, `right`, `top`, `bottom`, `front`, `back` (optional)
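
The naming rules above can be checked before running anything. The following is a minimal sketch, not part of pyFTLE: `REQUIRED_KEYS` and `missing_keys` are hypothetical names introduced here, and the check only looks at variable names, not at array shapes or contents.

```python
# Variable names listed above, ordered so that the 2D names come first.
REQUIRED_KEYS = {
    "velocity": ("velocity_x", "velocity_y", "velocity_z"),
    "coordinate": ("coordinate_x", "coordinate_y", "coordinate_z"),
    "particle": ("left", "right", "top", "bottom", "front", "back"),
}


def missing_keys(data, kind, ndim=2):
    """Return the required variable names that are absent from `data`.

    `data` is any mapping of variable names (for example, the dictionary
    passed to `scipy.io.savemat`), `kind` is "velocity", "coordinate" or
    "particle", and `ndim` is 2 or 3.
    """
    names = REQUIRED_KEYS[kind]
    # Particles need two neighbors per spatial dimension; the others need one
    # variable per dimension.
    required = names[: 2 * ndim] if kind == "particle" else names[:ndim]
    return [name for name in required if name not in data]


print(missing_keys({"velocity_x": None, "velocity_y": None}, "velocity"))
print(missing_keys({"left": None, "right": None}, "particle"))
```

An empty list means every expected name is present. The same function can be applied to the dictionary returned by `scipy.io.loadmat` to check a file that already exists on disk.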


In [1]:
# Define some containers to hold the data
velocity_data = {
    "velocity_x": None,
    "velocity_y": None
}

coordinate_data = {
    "coordinate_x": None,
    "coordinate_y": None
}

particles_data = {
    "left": None,
    "right": None,
    "top": None,
    "bottom": None
}

For this example, we assume that the velocity measurements are taken on a regular grid of coordinates with shape 100 x 50. Since the coordinates do not change over time, we only need to save them once. pyFTLE handles this situation efficiently, so the redundant data is not read from disk multiple times during processing.

> Note: pyFTLE also supports time-varying points and/or irregular grids. We use a regular grid here only for simplicity.

However, if the velocity measurements are taken at different locations for each snapshot, as in Lagrangian Particle Tracking (LPT) results, then a separate coordinate file must be saved for each velocity snapshot. In such cases the coordinate data varies with time, and pyFTLE needs one coordinate file aligned with each velocity file.

In [2]:
import numpy as np

# Domain size
nx = 100
ny = 50

# Domain limits
x_min, x_max = 0, 2
y_min, y_max = 0, 1

# Generate an example grid covering the domain
x = np.linspace(x_min, x_max, nx)
y = np.linspace(y_min, y_max, ny)
X, Y = np.meshgrid(x, y)

# Store the data (flattened)
coordinate_data["coordinate_x"] = X.ravel()
coordinate_data["coordinate_y"] = Y.ravel()
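
The flattening order matters when the data is later interpreted as a grid. The small sketch below uses only NumPy and a deliberately tiny grid (the sizes are illustrative, not the ones used in this notebook) to show what `np.meshgrid` followed by `ravel()` produces: the arrays have shape `(ny, nx)`, and in the flattened vectors `x` varies fastest.

```python
import numpy as np

nx, ny = 4, 3  # tiny grid, for illustration only
x = np.linspace(0.0, 2.0, nx)
y = np.linspace(0.0, 1.0, ny)

# Default indexing="xy": both arrays have shape (ny, nx)
X, Y = np.meshgrid(x, y)

# ravel() flattens in C (row-major) order, so x varies fastest
flat_x = X.ravel()
flat_y = Y.ravel()

print(X.shape)       # (3, 4)
print(flat_x[:nx])   # the first nx entries sweep all x values ...
print(flat_y[:nx])   # ... at the first y value

# The flattening is reversible as long as the (ny, nx) shape is known
assert np.array_equal(flat_x.reshape(ny, nx), X)
```

If your own data is stored in a different order (for example, with `y` varying fastest), reorder it before saving so that velocities and coordinates stay aligned entry by entry.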

Let's save this coordinate data to disk.
This file and all other files generated by this example will be placed in a folder called `example_generated_files`. **Feel free to delete this folder after you finish reading this notebook**.

In [3]:
import os
from pathlib import Path
from scipy.io import savemat

script_dir = Path(os.getcwd())
output_folder = script_dir / 'example_generated_files'
output_folder.mkdir(parents=True, exist_ok=True)

file_path = output_folder / 'coordinates.mat'

savemat(file_path, coordinate_data)
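
When adapting your own data, it can help to read a file back and inspect what was actually stored. The sketch below is independent of pyFTLE and uses a temporary file with a hypothetical name. It illustrates one SciPy detail worth knowing: with its default settings, `savemat` writes a 1-D array as a row vector, so `loadmat` returns it with shape `(1, N)` unless `squeeze_me=True` is passed.

```python
import tempfile
from pathlib import Path

import numpy as np
from scipy.io import loadmat, savemat

with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "coordinates_demo.mat")  # hypothetical file name
    savemat(path, {"coordinate_x": np.linspace(0.0, 2.0, 5)})

    raw = loadmat(path)                        # 1-D arrays come back as 2-D rows
    squeezed = loadmat(path, squeeze_me=True)  # singleton dimensions removed

raw_shape = raw["coordinate_x"].shape
squeezed_shape = squeezed["coordinate_x"].shape

# loadmat also returns metadata entries such as "__header__"; skip them
variable_names = [name for name in raw if not name.startswith("__")]

print(raw_shape, squeezed_shape, variable_names)
```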

Now, let's create the velocity files using the 2D double-gyre flow as an example. For this, we assume:

- The timestep between snapshots is `0.001`
- We have `1200` snapshots
- The velocity files will be saved in the same directory where we saved the grid (for simplicity).
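
For reference, the velocity field used in this example can be written in closed form. With the stream function

$$
\psi(x, y, t) = A \sin\big(\pi f(x, t)\big) \sin(\pi y), \qquad
f(x, t) = \epsilon \sin(\omega t)\, x^2 + \big(1 - 2 \epsilon \sin(\omega t)\big)\, x,
$$

the velocity components are

$$
u = -\frac{\partial \psi}{\partial y} = -\pi A \sin(\pi f) \cos(\pi y), \qquad
v = \frac{\partial \psi}{\partial x} = \pi A \cos(\pi f) \sin(\pi y)\, \frac{\partial f}{\partial x},
$$

$$
\frac{\partial f}{\partial x} = 2 \epsilon \sin(\omega t)\, x + 1 - 2 \epsilon \sin(\omega t).
$$

In this notebook the constants are $A = 0.1$, $\epsilon = 0.25$ and $\omega = 2.0$, on the domain $[0, 2] \times [0, 1]$.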

In [4]:
timestep = 0.001

nfiles = 1200  # Number of velocity files (snapshots) to be saved


times = np.linspace(0, (nfiles - 1) * timestep, nfiles)


# Define some constants for the double-gyre flow
A = 0.1
EPSILON = 0.25
OMEGA = 2.0
x = coordinate_data["coordinate_x"]
y = coordinate_data["coordinate_y"]

for i, t in enumerate(times):
    # Compute the velocity at the given time step `t`
    f = (
        EPSILON * np.sin(OMEGA * t) * x**2
        + (1.0 - 2.0 * EPSILON * np.sin(OMEGA * t)) * x
    )
    dfdx = 2 * EPSILON * np.sin(OMEGA * t) * x + (
        1.0 - 2.0 * EPSILON * np.sin(OMEGA * t)
    )

    velocity_data["velocity_x"] = -np.pi * A * np.sin(np.pi * f) * np.cos(np.pi * y)
    velocity_data["velocity_y"] = np.pi * A * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx

    # Zero-padded index so that the files sort correctly by name
    filename = f"velocity{i + 1:04d}.mat"
    file_path = output_folder / filename

    savemat(file_path, velocity_data)
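
A quick way to gain confidence in generated velocity data is to test a property the flow is known to have. As a sketch, the double-gyre field has no flow through the walls of the domain `[0, 2] x [0, 1]`, so `velocity_x` should vanish at `x = 0` and `x = 2`, and `velocity_y` at `y = 0` and `y = 1`. The helper name `double_gyre_velocity` and the sample time are introduced here for illustration; the formulas and constants are the ones used in this notebook.

```python
import numpy as np

A, EPSILON, OMEGA = 0.1, 0.25, 2.0  # same constants as in this notebook


def double_gyre_velocity(x, y, t):
    """Double-gyre velocity (u, v) at positions (x, y) and time t."""
    s = EPSILON * np.sin(OMEGA * t)
    f = s * x**2 + (1.0 - 2.0 * s) * x
    dfdx = 2.0 * s * x + (1.0 - 2.0 * s)
    u = -np.pi * A * np.sin(np.pi * f) * np.cos(np.pi * y)
    v = np.pi * A * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx
    return u, v


t = 0.3  # arbitrary sample time
xs = np.linspace(0.0, 2.0, 11)
ys = np.linspace(0.0, 1.0, 11)

# Horizontal velocity on the left and right walls
u_left, _ = double_gyre_velocity(np.zeros_like(ys), ys, t)
u_right, _ = double_gyre_velocity(np.full_like(ys, 2.0), ys, t)

# Vertical velocity on the bottom and top walls
_, v_bottom = double_gyre_velocity(xs, np.zeros_like(xs), t)
_, v_top = double_gyre_velocity(xs, np.ones_like(xs), t)

# All four should be zero up to floating-point round-off
walls_ok = all(
    np.allclose(values, 0.0, atol=1e-12)
    for values in (u_left, u_right, v_bottom, v_top)
)
print(walls_ok)
```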

To compute the FTLE field, we need to generate a group of particles that will be tracked over time. For all FTLE fields, we will use the same set of seed particles.

For the 2D case, we will consider four neighboring particles around each centroid: `top`, `bottom`, `left`, and `right`. These particles are placed sufficiently close to one another, and the FTLE value will be computed at the centroid of these neighboring particles.

To facilitate the creation of these particles, we will start by defining the desired centroid locations where we want to compute the FTLE. From each centroid, the neighboring particle positions will be determined by adding or subtracting a small gap.

Additionally, to avoid potential numerical issues during the integration process, we will ensure that these neighboring particles are not placed too close to the domain boundaries. If the particles are too near the edges, there's a risk they could move outside the domain during tracking, leading to erroneous FTLE calculations. Therefore, we will carefully place the centroids and neighboring particles within the interior of the domain to minimize the risk of such errors.
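
To see why four neighbors per centroid are enough in 2D, the sketch below applies the standard finite-difference definition of the FTLE: the advected neighbor positions give a central-difference estimate of the flow-map gradient, and the FTLE follows from the largest eigenvalue of the Cauchy-Green tensor. This illustrates the textbook formula and is not necessarily how pyFTLE implements it internally; `ftle_from_neighbors` and `advect` are hypothetical names, and the demo uses a simple linear saddle flow instead of the double gyre.

```python
import numpy as np


def ftle_from_neighbors(left, right, top, bottom, gap, period):
    """FTLE at each centroid from neighbor positions *after* advection.

    Each position array has shape (n, 2); `gap` is the initial offset of
    every neighbor from its centroid and `period` the integration time.
    """
    # Central differences of the flow map with respect to the initial x and y
    d_dx0 = (right - left) / (2.0 * gap)
    d_dy0 = (top - bottom) / (2.0 * gap)

    # F[k, i, j] = d(final coordinate i) / d(initial coordinate j)
    F = np.stack((d_dx0, d_dy0), axis=-1)

    # Cauchy-Green tensor C = F^T F and its largest eigenvalue
    C = np.swapaxes(F, 1, 2) @ F
    lambda_max = np.linalg.eigvalsh(C)[:, -1]

    return np.log(np.sqrt(lambda_max)) / abs(period)


# Demo: linear saddle flow dx/dt = x, dy/dt = -y, whose flow map is known
gap, period = 0.025, 0.5
centroids = np.array([[0.5, 0.5], [1.0, 0.25]])


def advect(points):
    return points * np.array([np.exp(period), np.exp(-period)])


ftle = ftle_from_neighbors(
    left=advect(centroids - [gap, 0.0]),
    right=advect(centroids + [gap, 0.0]),
    top=advect(centroids + [0.0, gap]),
    bottom=advect(centroids - [0.0, gap]),
    gap=gap,
    period=period,
)
print(ftle)  # the stretching rate of this saddle flow is 1 everywhere
```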

In [5]:
# Spacing for neighboring particles
gap = 0.025

# Define the valid region for central locations (avoiding boundaries)
margin = 2 * gap
x_valid_min, x_valid_max = x_min + margin, x_max - margin
y_valid_min, y_valid_max = y_min + margin, y_max - margin

# Create a grid of central locations
nx_centroids = 100
ny_centroids = 50
central_x = np.linspace(x_valid_min, x_valid_max, nx_centroids)
central_y = np.linspace(y_valid_min, y_valid_max, ny_centroids)
Xc, Yc = np.meshgrid(central_x, central_y)

# Now, define the neighboring locations
central_locations = np.column_stack((Xc.ravel(), Yc.ravel()))

coordinates_top = central_locations + np.array([0, gap])
coordinates_bottom = central_locations - np.array([0, gap])
coordinates_left = central_locations - np.array([gap, 0])
coordinates_right = central_locations + np.array([gap, 0])

particles_data = {
    "top": coordinates_top,
    "bottom": coordinates_bottom,
    "left": coordinates_left,
    "right": coordinates_right,
}

file_path = output_folder / "particles.mat"
savemat(file_path, particles_data)

We have almost everything we need: the **coordinate** file(s), the **velocity** files and the **particle** file(s).
However, pyFTLE does not take these files directly as input. Instead, it expects text files that list the paths to them.

That is, say we have the following list of files in a given folder:

```
velocity0001.mat
velocity0002.mat
...
velocity1200.mat
```

Now, we need to create a `.txt` file that holds this list. One can achieve this using the following Python code:

In [6]:
import os

# Specify the directory where the .mat files are located
directory = 'path/to/your/directory'  # Replace with your actual directory
directory = output_folder  # Comment out this line - it is just for this example

# Get the list of filenames that match the pattern (in this case, velocity)
filenames = [f for f in os.listdir(directory) if f.startswith('velocity') and f.endswith('.mat')]

# Sort the filenames
filenames.sort()

# Prefix the filenames with the directory name to get the full paths
full_paths = [os.path.join(directory, filename) for filename in filenames]

# Define the path for the output .txt file
output_txt_file = 'velocity_filenames.txt'

# Write the filenames to the .txt file
with open(output_txt_file, 'w') as file:
    for full_path in full_paths:
        file.write(f"{full_path}\n")

After creating the `.txt` file for the velocity files, repeat the same process for the coordinate and particle files. In this particular example, there is only one file each for the coordinates and particles. However, we still need to create separate `.txt` files that list these files, each containing a single entry.

To make it easier, let's move the previous code to a function that works for either velocity, coordinate or particle files:

In [7]:
def generate_list_of_filenames(pattern, directory, output_txt_file):
    # Get the list of .mat filenames that start with the given pattern
    filenames = [f for f in os.listdir(directory) if f.startswith(pattern) and f.endswith('.mat')]

    # Sort the filenames
    filenames.sort()

    # Prefix the filenames with the directory name to get the full paths
    full_paths = [os.path.join(directory, filename) for filename in filenames]

    # Write the filenames to the .txt file
    with open(output_txt_file, 'w') as file:
        for full_path in full_paths:
            file.write(f"{full_path}\n")


generate_list_of_filenames("coordinates", directory, "coordinate_filenames.txt")
generate_list_of_filenames("particles", directory, "particle_filenames.txt")

In the root directory of the project, you'll find a script called `create_list_of_input_files.py`. This script helps you generate the `.txt` files listing the velocity, coordinate, and particle files. It does essentially the same thing as the function defined above (`generate_list_of_filenames`), and you can modify it to suit your specific needs.
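
Plain alphabetical sorting gives the right order here only because the snapshot indices are zero-padded (`velocity0001.mat`, ...). If your own files are numbered without padding, `velocity10.mat` would sort before `velocity2.mat`. A minimal sketch of a variant that sorts by the embedded numbers is shown below; `natural_key` and `write_file_list` are hypothetical helpers, not part of pyFTLE, and the demo runs on empty placeholder files in a temporary folder.

```python
import re
import tempfile
from pathlib import Path


def natural_key(name):
    """Split a name into text and integer chunks so numbers sort numerically."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", name)]


def write_file_list(pattern, directory, output_txt_file):
    """Write the full paths of `<pattern>*.mat` files, in natural order."""
    paths = sorted(Path(directory).glob(f"{pattern}*.mat"),
                   key=lambda p: natural_key(p.name))
    with open(output_txt_file, "w") as file:
        for path in paths:
            file.write(f"{path.resolve()}\n")
    return paths


# Demo with empty placeholder files whose indices are NOT zero-padded
with tempfile.TemporaryDirectory() as tmp:
    for i in (1, 2, 10):
        (Path(tmp) / f"velocity{i}.mat").touch()

    list_file = Path(tmp) / "velocity_filenames.txt"
    listed = write_file_list("velocity", tmp, list_file)

    ordered_names = [p.name for p in listed]
    n_lines = len(list_file.read_text().splitlines())

print(ordered_names)  # ['velocity1.mat', 'velocity2.mat', 'velocity10.mat']
```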

### Running pyFTLE

Once the `.txt` files are created, you can run pyFTLE.

pyFTLE expects a series of command-line arguments, which you can view by running:

```
pyftle -h
```

For more detailed information about these arguments, refer to [pyFTLE's documentation](https://pyftle.readthedocs.io/en/latest/).

The required parameters can be passed directly through the terminal, or alternatively, using a more convenient `config.yaml` file.

Here, we'll use the following parameters:
```yaml
experiment_name: "example_file-based_IO"
list_velocity_files: "examples/velocity_filenames.txt"  # assuming you run pyftle from project rootdir
list_coordinate_files: "examples/coordinate_filenames.txt"  # assuming you run pyftle from project rootdir
list_particle_files: "examples/particle_filenames.txt"  # assuming you run pyftle from project rootdir
snapshot_timestep: 0.001  # must match your snapshot timestep
flow_map_period: 0.5
integrator: rk4
interpolator: grid
num_processes: 4
output_format: "vtk"
flow_grid_shape: 100, 50  # comment this line for unstructured data
particles_grid_shape: 100, 50  # comment this line for unstructured data
```
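
Before launching a run, a back-of-the-envelope check of the time parameters can catch mismatches early. The sketch below only does arithmetic on the values used in this example; it makes no assumption about how pyFTLE schedules its flow maps internally, beyond the idea that the velocity series should span at least one `flow_map_period`.

```python
snapshot_timestep = 0.001  # from config.yaml
flow_map_period = 0.5      # from config.yaml
n_snapshots = 1200         # number of velocity files generated in this notebook

# Number of snapshot intervals spanned by one flow map
steps_per_flow_map = round(flow_map_period / snapshot_timestep)

# Total time covered by the velocity series (first to last snapshot)
total_time = (n_snapshots - 1) * snapshot_timestep

print(steps_per_flow_map)             # 500
print(total_time >= flow_map_period)  # True
```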


Once you've specified the arguments, you can run pyFTLE with the following command:

```
pyftle -c config.yaml
```

The location of the `.txt` files is specified in `config.yaml`, so this command can be executed from any directory in which those paths resolve; the relative paths used above assume the project root directory. The code will automatically create an `output` folder in the directory from which the `pyftle` command is run and will populate it with the FTLE results, stored inside a subdirectory named after the `experiment_name` argument.