# FLEKS Python Visualization Toolkit: AMReX Data

flekspy is a Python package for processing FLEKS data. This notebook focuses on handling data in the AMReX format, which is used for both field and particle data. We will cover two main ways to load and analyze this data:

1.  **Using the `yt`-based loader**: This is the primary method, leveraging the power of the `yt` library for slicing, plotting, and analyzing both field and particle data.
2.  **Using the experimental native AMReX particle loader**: A faster, more direct way to load and plot particle data without the `yt` overhead.

## Setup: Imports and Data Downloads

### Imports

In [None]:
import flekspy
from flekspy.util import download_testfile, unit_one
from flekspy.util.transformations import create_field_transform
from flekspy.amrex import AMReXParticleData

import numpy as np
import matplotlib.pyplot as plt

### Downloading Demo Data

We'll download two different AMReX datasets to demonstrate various features.

In [None]:
# Dataset 1: General purpose field and particle data
url_1 = "https://raw.githubusercontent.com/henry2004y/batsrus_data/master/3d_particle.tar.gz"
download_testfile(url_1, "data")

In [None]:
# Dataset 2: Smaller particle dataset for more specific examples
url_2 = "https://raw.githubusercontent.com/henry2004y/batsrus_data/master/fleks_particle_small.tar.gz"
download_testfile(url_2, "data")

## Analyzing Field and PIC Particle Data with the `yt` Loader

The primary way to load AMReX data is with `flekspy.load`, which uses `yt` on the backend. This returns a `yt` dataset object, giving you access to the full power of the `yt` analysis framework.

In [None]:
ds = flekspy.load("data/3d*amrex", use_yt_loader=True)

### Slicing and Plotting Field Data

We can easily take a 2D slice of the 3D data.

In [None]:
dc = ds.get_slice("z", 0.5)
dc
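
To make the idea of a slice concrete, the sketch below picks the plane nearest to a requested coordinate from a 3D array with plain NumPy. The uniform grid, the synthetic field, and the `nearest_slice` helper are all hypothetical; this is not how `get_slice` is implemented, and it ignores AMR refinement levels.

```python
import numpy as np


def nearest_slice(field, coords, value):
    """Return the 2D plane of `field` (indexed [x, y, z]) nearest to z = value."""
    k = int(np.argmin(np.abs(coords - value)))
    return field[:, :, k], coords[k]


# Hypothetical uniform grid and synthetic field (not FLEKS output)
x = np.linspace(-1.0, 1.0, 8)
y = np.linspace(-1.0, 1.0, 6)
z = np.linspace(0.0, 1.0, 5)
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
by = X + 2.0 * Z

plane, z_used = nearest_slice(by, z, 0.5)
print(plane.shape, z_used)
```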

Flekspy provides convenient wrappers for creating plots from these slices.

In [None]:
f, axes = dc.plot("By", pcolor=True)
dc.add_stream(axes[0], "Bx", "By", color="w")
dc.add_contour(axes[0], "Bx")

For more control, you can also extract the data and plot it directly with Matplotlib.

In [None]:
f, axes = plt.subplots(
    1, 2, figsize=(12, 4), constrained_layout=True, sharex=True, sharey=True
)

fields = ["By", "Bz"]
for ivar in range(2):
    v = dc.evaluate_expression(fields[ivar])
    vmin = v.min().v
    vmax = v.max().v
    ax = axes[ivar]
    ax.set_title(fields[ivar], fontsize=16)
    ax.set_ylabel("Y", fontsize=16)
    ax.set_xlabel("X", fontsize=16)
    c = ax.pcolormesh(dc.x.value, dc.y.value, np.array(v.T), cmap="turbo")
    cb = f.colorbar(c, ax=ax, pad=0.01)

    ax.set_xlim(np.min(dc.x.value), np.max(dc.x.value))
    ax.set_ylim(np.min(dc.y.value), np.max(dc.y.value))
    dc.add_stream(ax, "Bx", "By", density=0.5, color="w")

plt.show()

### Visualizing Phase Space Distributions

We can also analyze the particle data. `yt` allows us to create derived fields, which are useful for weighting particles. Here, we create a `unit_one` field that simply assigns a value of 1 to each particle.

In [None]:
ds.add_field(
    ds.pvar("unit_one"),
    function=unit_one,
    sampling_type="particle",
    units="dimensionless",
)
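
As a rough illustration of what a field function like `unit_one` has to do, the sketch below follows the `(field, data)` signature that `yt` uses for derived fields and returns an array of ones shaped like the particle weights. The name `unit_one_sketch` and the dictionary standing in for a `yt` data container are assumptions for this example; the actual `flekspy.util.unit_one` may be written differently.

```python
import numpy as np


def unit_one_sketch(field, data):
    """Return 1 for every particle, shaped like the particle weight array."""
    return np.ones_like(data[("particles", "p_w")])


# Stand-in for a yt data container: a dict keyed by (field type, field name)
fake_data = {("particles", "p_w"): np.array([0.2, 0.5, 0.3])}
ones = unit_one_sketch(None, fake_data)
print(ones)
```

Summing such a field over a bin gives the number of particles in that bin rather than their total weight.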

Now we can create a phase space plot. We'll select a spatial region and plot the distribution of y- and z-velocities (`p_uy`, `p_uz`), weighted by the particle weight (`p_w`).

In [None]:
x_field = "p_uy"
y_field = "p_uz"
z_field = "p_w"
xleft = [-0.016, -0.01, 0.0]
xright = [0.016, 0.01, 1.0]

zmin, zmax = 1e-5, 2e-3

region = ds.box(xleft, xright)
pp = ds.plot_phase(
    x_field,
    y_field,
    z_field,
    region=region,
    unit_type="si",
    x_bins=100,
    y_bins=32,
    domain_size=(xleft[0], xright[0], xleft[1], xright[1]),
)

pp.set_cmap(pp.fields[0], "turbo")
pp.set_zlim(pp.fields[0], zmin=zmin, zmax=zmax)
pp.set_xlabel(r"$V_y$")
pp.set_ylabel(r"$V_z$")
pp.set_colorbar_label(pp.fields[0], "weight")
pp.set_log(pp.fields[0], True)
pp.show()
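
Conceptually, a weighted phase space plot is a 2D histogram in which each particle contributes its weight to the bin it falls into. A minimal NumPy sketch of that idea, using synthetic velocities and weights rather than FLEKS output (the variable names are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
uy = rng.normal(0.0, 1.0, n)  # synthetic y-velocities
uz = rng.normal(0.0, 0.5, n)  # synthetic z-velocities
w = rng.uniform(0.5, 1.5, n)  # synthetic particle weights

# Each bin holds the sum of the weights of the particles that fall into it
hist, uy_edges, uz_edges = np.histogram2d(uy, uz, bins=(100, 32), weights=w)

print(hist.shape)
print(np.isclose(hist.sum(), w.sum()))
```

Because the default bin range spans the full extent of the data, the histogram total equals the total weight.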

### Selecting and Plotting Particles in Geometric Regions

`yt` makes it easy to select particles within various geometric shapes.

#### Sphere
Plot the spatial location and velocity space of particles within a sphere.

In [None]:
sp = ds.sphere(center=[0, 0, 0], radius=1)

# Plot particle locations
pp_loc = ds.plot_particles(
    "p_x", "p_y", "p_w", region=sp, unit_type="si", x_bins=64, y_bins=64
)
pp_loc.show()

# Plot phase space
pp_phase = ds.plot_phase(
    "p_uy", "p_uz", "p_w", region=sp, unit_type="si", x_bins=64, y_bins=64
)
pp_phase.show()

#### Disk
Plot the spatial location and velocity space of particles within a disk.

In [None]:
disk = ds.disk(center=[0, 0, 0], normal=[0, 0, 1], radius=0.5, height=1.0)

# Plot particle locations
pp_loc = ds.plot_particles(
    "p_x", "p_y", "p_w", region=disk, unit_type="si", x_bins=64, y_bins=64
)
pp_loc.show()

# Plot phase space
pp_phase = ds.plot_phase(
    "p_uy", "p_uz", "p_w", region=disk, unit_type="si", x_bins=64, y_bins=64
)
pp_phase.show()
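
The sphere and disk selections above amount to boolean masks on the particle positions. The sketch below shows that geometry with NumPy on a few hand-picked points. The helper names are hypothetical, and the disk test assumes `height` is measured from the midplane to each face, which is how I read `yt`'s convention.

```python
import numpy as np


def in_sphere(pos, center, radius):
    """Boolean mask of points within `radius` of `center`."""
    return np.linalg.norm(pos - center, axis=1) <= radius


def in_disk(pos, center, normal, radius, height):
    """Boolean mask of points inside a cylinder extending +/- `height` along `normal`."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    rel = pos - center
    along = rel @ n  # signed distance along the disk axis
    radial = np.linalg.norm(rel - np.outer(along, n), axis=1)
    return (np.abs(along) <= height) & (radial <= radius)


pos = np.array(
    [[0.0, 0.0, 0.0], [0.4, 0.0, 0.9], [0.6, 0.0, 0.0], [0.0, 0.0, 1.5]]
)
center = np.zeros(3)

sphere_mask = in_sphere(pos, center, radius=1.0)
disk_mask = in_disk(pos, center, normal=[0, 0, 1], radius=0.5, height=1.0)
print(sphere_mask, disk_mask)
```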

### Advanced Example: Slicing with Range Limits

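Now we'll use the second dataset to show a more advanced slicing feature. Following the convention of the IDL plotting procedures, you can embed range limits in the string passed to `plot`: in the call below, `Bx>(2.2e5)<(3e5)` limits the plotted range of `Bx` to between 2.2e5 and 3e5, while `Ex` is plotted over its full range (note: this syntax is experimental).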

In [None]:
ds2 = flekspy.load("data/fleks_particle_small/3d*amrex")
dc2 = ds2.get_slice("z", 0.001)

f, axes = dc2.plot("Bx>(2.2e5)<(3e5) Ex", figsize=(12, 6))
dc2.add_stream(axes[0], "Bx", "By", color="w")
dc2.add_contour(axes[1], "Bx", color="k")
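
To clarify how such a string can be read, here is a small parser sketch that splits it into variable names with optional lower and upper limits. It assumes each variable is a whitespace-separated token with an optional `>(min)` followed by an optional `<(max)`; the function name and the exact grammar are assumptions, not flekspy's actual parser.

```python
import re

# name, then optional ">(min)", then optional "<(max)"
_SPEC = re.compile(
    r"^(?P<name>[^<>]+)(?:>\((?P<vmin>[^)]+)\))?(?:<\((?P<vmax>[^)]+)\))?$"
)


def parse_plot_spec(spec):
    """Split a spec such as 'Bx>(2.2e5)<(3e5) Ex' into (name, vmin, vmax) tuples."""
    parsed = []
    for token in spec.split():
        m = _SPEC.match(token)
        if m is None:
            raise ValueError(f"Cannot parse plot spec token: {token!r}")
        vmin = float(m["vmin"]) if m["vmin"] is not None else None
        vmax = float(m["vmax"]) if m["vmax"] is not None else None
        parsed.append((m["name"], vmin, vmax))
    return parsed


specs = parse_plot_spec("Bx>(2.2e5)<(3e5) Ex")
print(specs)
```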

### Advanced Example: Transforming Velocity Coordinates (WIP)

This section demonstrates a work-in-progress for transforming particle velocities into a new coordinate system (e.g., field-aligned coordinates).

In [None]:
# l = [1, 0, 0]
# m = [0, 1, 0]
# n = [0, 0, 1]

# def _vel_l(field, data):
#     return (l[0] * data[("particles", "p_ux")] + l[1] * data[("particles", "p_uy")] + l[2] * data[("particles", "p_uz")])

# def _vel_m(field, data):
#     return (m[0] * data[("particles", "p_ux")] + m[1] * data[("particles", "p_uy")] + m[2] * data[("particles", "p_uz")])

# ds2.add_field(ds2.pvar("vel_l"), units="code_velocity", function=_vel_l, sampling_type="particle")
# ds2.add_field(ds2.pvar("vel_m"), units="code_velocity", function=_vel_m, sampling_type="particle")

# sp2 = ds2.sphere(center=[0, 0, 0], radius=0.001)

# x_field = ds2.pvar("vel_l")
# y_field = ds2.pvar("vel_m")
# z_field = ds2.pvar("p_w")

# pp = yt.create_profile(data_source=sp2, bin_fields=[x_field, y_field], fields=z_field, n_bins=[64, 64]).plot()
# pp.set_unit(x_field, "km/s")
# pp.set_unit(y_field, "km/s")
# pp.show()
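
Independently of `yt`, the projection that the commented-out code is working toward can be sketched with NumPy. The example below splits velocities into components parallel and perpendicular to a given magnetic field direction; the function name and the sample vectors are hypothetical.

```python
import numpy as np


def field_aligned_components(v, b):
    """Split velocities `v` of shape (N, 3) into parts parallel and perpendicular to `b`."""
    b_hat = np.asarray(b, dtype=float)
    b_hat = b_hat / np.linalg.norm(b_hat)
    v_par = v @ b_hat  # signed parallel speed
    v_perp_vec = v - np.outer(v_par, b_hat)  # remove the parallel part
    v_perp = np.linalg.norm(v_perp_vec, axis=1)  # perpendicular speed, >= 0
    return v_par, v_perp


v = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]])
v_par, v_perp = field_aligned_components(v, [1.0, 1.0, 0.0])
print(v_par, v_perp)
```

A quick sanity check for any such transform is that `v_par**2 + v_perp**2` reproduces the squared speed of each particle.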

## Using the Native AMReX Particle Loader (Experimental)

For scenarios where you only need particle data and want to avoid the overhead of `yt`, `flekspy` provides a direct, experimental loader called `AMReXParticleData`. It offers a simple interface and loads particle data much faster than the `yt` loader.

In [None]:
data_file = "data/3d_particle_region0_1_t00000002_n00000007_amrex"
pd = AMReXParticleData(data_file)
pd

Particles within a given region can be extracted as follows:

In [None]:
x_range = (-0.008, 0.008)
y_range = (-0.005, 0.005)

rdata = pd.select_particles_in_region(x_range=x_range, y_range=y_range)

### Phase Plot

Phase space plotting is achieved via `plot_phase`:

In [None]:
fig, ax = pd.plot_phase("vy", "vz", bins=(50, 32))
plt.show()

Adding 1D distributions via `marginals=True`:

In [None]:
pd.plot_phase("velocity_x", "velocity_y", bins=(32, 32), marginals=True)
plt.show()
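
The 1D marginal distributions are the 2D histogram summed along one axis. A minimal sketch with synthetic data (not FLEKS output, and not necessarily how `plot_phase` computes them internally):

```python
import numpy as np

rng = np.random.default_rng(1)
vx = rng.normal(0.0, 1.0, 5000)  # synthetic x-velocities
vy = rng.normal(0.5, 0.3, 5000)  # synthetic y-velocities

hist, x_edges, y_edges = np.histogram2d(vx, vy, bins=(32, 32))

marginal_x = hist.sum(axis=1)  # distribution over vx, summed over all vy bins
marginal_y = hist.sum(axis=0)  # distribution over vy, summed over all vx bins

print(marginal_x.shape, marginal_y.shape)
```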

Customization:

In [None]:
# Define the magnetic field vector for the transformation
B = np.array([1.0, 1.0, 0.0])
# Define an electric field vector
E = np.array([0.0, 1.0, 0.0])

# Create the transformation function using the utility
# We pass the original component names from the data header
b_field_transform = create_field_transform(B, pd.header.real_component_names)
eb_field_transform = create_field_transform(
    B, pd.header.real_component_names, e_field=E
)

fig, ax = plt.subplots(2, 2, figsize=(8, 6), constrained_layout=True)

pd.plot_phase(
    "vx",
    "vy",
    ax=ax[0, 0],
    add_colorbar=False,
    use_kde=True,
    kde_grid_size=30,
    kde_bandwidth=0.0002,
    title="KDE with GMM Estimate",
)

pd.plot_phase(
    "x",
    "vz",
    ax=ax[0, 1],
    add_colorbar=False,
    x_range=(-0.004, 0.004),
    y_range=(-0.005, 0.005),
    plot_zero_lines=False,
    ylabel=r"$v_z [m/s]$",
    title="x-vz",
)

pd.plot_phase(
    "v_parallel",
    "v_perp",
    ax=ax[1, 0],
    transform=b_field_transform,
    add_colorbar=False,
    plot_zero_lines=False,
    x_range=(-0.004, 0.004),
    y_range=(-0.005, 0.005),
    xlabel=r"$v_{\parallel}$",
    ylabel=r"$v_{\perp}$",
    title="B-field Aligned Velocities",
)

pd.plot_phase(
    "v_B",
    "v_E",
    ax=ax[1, 1],
    transform=eb_field_transform,
    add_colorbar=False,
    plot_zero_lines=False,
    x_range=(-0.004, 0.004),
    y_range=(-0.005, 0.005),
    xlabel=r"$v_B$",
    ylabel=r"$v_{E, \perp}$",
    title="E-B Aligned Velocities",
)
# GMM fit
gmm = pd.fit_gmm(1, "velocity_x", "velocity_y")
pd.plot_gmm_fit(gmm, ax=ax[0, 0], scale=1.0)

gmm2 = pd.fit_gmm(1, "v_B", "v_E", transform=eb_field_transform)
pd.plot_gmm_fit(gmm2, ax=ax[1, 1], scale=1.0)

plt.show()

Pairplot:

In [None]:
pd.pairplot()
plt.show()

In [None]:
pd.pairplot(corner=True)
plt.show()

Orthogonal intersecting planes:

In [None]:
pd.plot_intersecting_planes("x", "vx", "vy")
plt.show()

3D scattering:

In [None]:
xmin, xmax, ymin, ymax, zmin, zmax = -0.001, 0.001, -0.001, 0.001, 0.0, 0.01
hist_range = [[xmin, xmax], [ymin, ymax], [zmin, zmax]]
pd.plot_phase_3d("vx", "vy", "vz", normalize=True, hist_range=hist_range)
plt.show()

Matplotlib has no true 3D rendering engine, which makes 3D phase space plots hard to read.

## GMM Fitting

To demonstrate the capability of the GMM fitting and parameter extraction, we can create a synthetic dataset of particles with known properties, fit a GMM to it, and verify that the extracted parameters match the input parameters.

In [None]:
from sklearn.mixture import GaussianMixture

# Define the parameters for two synthetic Maxwellian distributions
n_samples_1 = 1500
mean_1 = np.array([0.0, 0.0])  # Center of the first distribution
temp_1 = 0.5                   # Isotropic temperature
cov_1 = np.array([[temp_1, 0], [0, temp_1]]) # Covariance matrix

n_samples_2 = 1000
mean_2 = np.array([1.0, 2.0])  # Center of the second distribution
temp_2 = 0.2
cov_2 = np.array([[temp_2, 0], [0, temp_2]])

# Generate the synthetic data using numpy
rng = np.random.RandomState(0)
data_1 = rng.multivariate_normal(mean_1, cov_1, n_samples_1)
data_2 = rng.multivariate_normal(mean_2, cov_2, n_samples_2)
synthetic_data = np.vstack([data_1, data_2])

# Shuffle the data to mix the two populations (use the seeded generator so the result is reproducible)
rng.shuffle(synthetic_data)

print(f"Shape of the synthetic dataset: {synthetic_data.shape}")

Fit a GMM to the synthetic data with 2 components:

In [None]:
gmm_synthetic = GaussianMixture(n_components=2, random_state=0)
gmm_synthetic.fit(synthetic_data)

Extract the physical parameters from the fitted GMM:

In [None]:
# Since our synthetic data is isotropic, we set isotropic=True
extracted_params = AMReXParticleData.get_gmm_parameters(gmm_synthetic, isotropic=True)

Compare the results:

In [None]:
print("--- Original Synthetic Data Parameters ---")
print(f"Population 1: Center = {mean_1.tolist()}, Temperature = {temp_1}")
print(f"Population 2: Center = {mean_2.tolist()}, Temperature = {temp_2}")
print("\n--- GMM Extracted Parameters ---")
for i, params in enumerate(extracted_params):
    center = [round(c, 5) for c in params["center"]]
    temp = round(params["temperature"], 5)
    print(f"Component {i+1}: Center = {center}, Temperature = {temp}")
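
In this synthetic setup the "temperature" is simply the variance of each velocity component, in normalized units. One plausible way to reduce a fitted covariance matrix to a single isotropic temperature is to average its diagonal, as sketched below. The `isotropic_temperature` helper is hypothetical, and `get_gmm_parameters` may compute the value differently.

```python
import numpy as np


def isotropic_temperature(cov):
    """Average variance per dimension, in the same normalized units as `cov`."""
    cov = np.asarray(cov, dtype=float)
    return np.trace(cov) / cov.shape[0]


rng = np.random.default_rng(0)
samples = rng.multivariate_normal([0.0, 0.0], [[0.5, 0.0], [0.0, 0.5]], 20_000)
cov_est = np.cov(samples, rowvar=False)  # sample covariance, shape (2, 2)

temp_est = isotropic_temperature(cov_est)
print(temp_est)
```

With 20,000 samples the estimate should land close to the input variance of 0.5.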

Finally, we visualize the result. The plot below shows a 2D histogram of the synthetic particle data, with the fitted GMM components overlaid as red dashed ellipses. The ellipses represent the 1, 2, and 3-sigma contours of the Gaussian distributions, clearly showing how the GMM has identified the two distinct populations in the data.

In [None]:
from matplotlib.patches import Ellipse

fig, ax = plt.subplots(figsize=(8, 8))

# Plot the 2D histogram of the synthetic data
ax.hist2d(
    synthetic_data[:, 0], synthetic_data[:, 1], bins=50, cmap="turbo", density=True
)
ax.set_title("Synthetic Data with GMM Fit Overlay")
ax.set_xlabel("vx")
ax.set_ylabel("vy")

# Overlay the GMM ellipses
for i in range(gmm_synthetic.n_components):
    cov = gmm_synthetic.covariances_[i]
    mean = gmm_synthetic.means_[i]

    # Get eigenvalues and eigenvectors to determine ellipse orientation and size
    vals, vecs = np.linalg.eigh(cov)
    angle = np.degrees(np.arctan2(*vecs[:, 0][::-1]))

    # Plot ellipses for 1, 2, and 3 standard deviations
    for n_std in [1.0, 2.0, 3.0]:
        width, height = 2 * n_std * np.sqrt(vals)
        ellipse = Ellipse(
            xy=mean,
            width=width,
            height=height,
            angle=angle,
            edgecolor="red",
            facecolor="none",
            lw=1.5,
            ls="--",
        )
        ax.add_patch(ellipse)

ax.set_aspect("equal", "box")
plt.show()

### Anisotropic Example

Now, let's do the same for an anisotropic distribution, where the temperature is different in different directions. This is common in magnetized plasmas, where temperature is often described in terms of $T_\parallel$ (parallel to the magnetic field) and $T_\perp$ (perpendicular to the field).

In [None]:
# Define parameters for a synthetic anisotropic distribution in field-aligned coordinates
n_samples_aniso = 2500
mean_aniso = np.array([0.5, 0.5])  # Center in (v_parallel, v_perp)
temp_parallel = 0.8
temp_perp = 0.2
cov_aniso = np.array([[temp_parallel, 0], [0, temp_perp]]) # Covariance matrix is diagonal

# Generate the synthetic data
rng = np.random.RandomState(42)
anisotropic_data = rng.multivariate_normal(mean_aniso, cov_aniso, n_samples_aniso)

# Fit the GMM
gmm_anisotropic = GaussianMixture(n_components=1, random_state=0)
gmm_anisotropic.fit(anisotropic_data)

# Extract parameters, specifying isotropic=False
extracted_params_aniso = AMReXParticleData.get_gmm_parameters(gmm_anisotropic, isotropic=False)

# Print comparison
print("--- Original Anisotropic Synthetic Data Parameters ---")
print(f"Population 1: Center = {mean_aniso.tolist()}, Temp Parallel = {temp_parallel}, Temp Perp = {temp_perp}")
print("\n--- GMM Extracted Anisotropic Parameters ---")
for i, params in enumerate(extracted_params_aniso):
    center = [round(c, 5) for c in params["center"]]
    temp_par = round(params["temp_parallel"], 5)
    temp_per = round(params["temp_perp"], 5)
    print(f"Component {i+1}: Center = {center}, Temp Parallel = {temp_par}, Temp Perp = {temp_per}")

In [None]:
from matplotlib.patches import Ellipse

fig, ax = plt.subplots(figsize=(8, 8))

# Plot the 2D histogram of the synthetic anisotropic data
ax.hist2d(
    anisotropic_data[:, 0], anisotropic_data[:, 1], bins=50, cmap="turbo", density=True
)
ax.set_title("Anisotropic Synthetic Data with GMM Fit Overlay")
ax.set_xlabel(r"$v_{\parallel}$")
ax.set_ylabel(r"$v_{\perp}$")

# Overlay the GMM ellipses
for i in range(gmm_anisotropic.n_components):
    cov = gmm_anisotropic.covariances_[i]
    mean = gmm_anisotropic.means_[i]

    # Get eigenvalues and eigenvectors to determine ellipse orientation and size
    vals, vecs = np.linalg.eigh(cov)
    angle = np.degrees(np.arctan2(*vecs[:, 0][::-1]))

    # Plot ellipses for 1, 2, and 3 standard deviations
    for n_std in [1.0, 2.0, 3.0]:
        width, height = 2 * n_std * np.sqrt(vals)
        ellipse = Ellipse(
            xy=mean,
            width=width,
            height=height,
            angle=angle,
            edgecolor="red",
            facecolor="none",
            lw=1.5,
            ls="--",
        )
        ax.add_patch(ellipse)

ax.set_aspect("equal", "box")
plt.show()