# <center>Lesson 8: Time-series</center>
### <center>yt user/developer workshop, July 2025</center>

## With multiple datasets, we may want to:
* perform a uniform analysis
* take advantage of easy parallelism
* find a specific dataset or set of datasets

## The [DatasetSeries](https://yt-project.org/docs/dev/reference/api/yt.data_objects.time_series.html#yt.data_objects.time_series.DatasetSeries)
* an object that holds multiple datasets

In [None]:
import os
import yt

In [None]:
data_path = "/Users/britton/EnzoRuns/yt-workshop-2025/primordial_star"
fns = ["DD0096/DD0096", "DD0118/DD0118", "DD0130/DD0130", "DD0140/DD0140", "DD0157/DD0157"]

In [None]:
ts = yt.DatasetSeries([os.path.join(data_path, fn) for fn in fns])

In [None]:
for ds in ts:
    print(ds.current_time)

### Or with wildcards using `yt.load`

In [None]:
ts = yt.load(os.path.join(data_path, "DD????/DD????"))
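Each `?` in the pattern matches exactly one character, so `DD????/DD????` picks up only output names with four characters after `DD`. A quick way to check what a pattern like this would match is Python's standard `fnmatch` module. The path strings below are hypothetical and no files are touched; this illustrates the wildcard rule only, not how `yt.load` resolves patterns internally.

```python
from fnmatch import fnmatchcase

pattern = "DD????/DD????"

# Hypothetical relative paths, for illustration only.
candidates = [
    "DD0096/DD0096",
    "DD0118/DD0118",
    "DD096/DD096",              # too few characters after "DD"
    "DD0096/DD0096.hierarchy",  # extra suffix, so the whole string does not match
    "RD0001/RD0001",            # different prefix
]

# Keep only the paths that match the whole pattern.
matches = [path for path in candidates if fnmatchcase(path, pattern)]
print(matches)
```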

## Parallel Iteration
* if analysis for each dataset is independent, this should be embarrassingly parallel
* add `.piter()` to the loop construction to parallelize

### Parallelism is covered in Lesson 11

In [None]:
for ds in ts.piter():
    print(ds.current_time)
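Independent per-dataset work parallelizes trivially because each worker can take its own subset of the series and never needs results from the others. A minimal sketch of that idea in plain Python follows. The helper name `split_round_robin` and the round-robin assignment are assumptions for illustration; the way `piter()` actually distributes datasets may differ.

```python
def split_round_robin(items, n_workers):
    """Return one list of items per worker, dealt out in turn."""
    return [items[rank::n_workers] for rank in range(n_workers)]


# Output names taken from the list used earlier in this lesson.
names = ["DD0096", "DD0118", "DD0130", "DD0140", "DD0157"]

# With two workers, each one gets every second dataset.
chunks = split_round_robin(names, 2)
for rank, chunk in enumerate(chunks):
    print(rank, chunk)
```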

## Locating specific datasets
* [get_by_time](https://yt-project.org/docs/dev/reference/api/yt.data_objects.time_series.html#yt.data_objects.time_series.DatasetSeries.get_by_time)
* [get_by_redshift](https://yt-project.org/docs/dev/reference/api/yt.data_objects.time_series.html#yt.data_objects.time_series.DatasetSeries.get_by_redshift)

In [None]:
ds = ts.get_by_time((123.4, "Myr"))

In [None]:
ds.current_time.to("Myr")
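Conceptually, locating a dataset by time is a nearest-value search over the times in the series. The sketch below shows that search on a plain list of hypothetical times in Myr. The function `nearest_index` and its `tolerance` argument are illustrative assumptions, not yt's implementation.

```python
def nearest_index(times, target, tolerance=None):
    """Return the index of the entry in `times` closest to `target`.

    If `tolerance` is given and the closest entry is farther away
    than that, raise ValueError instead of returning a poor match.
    """
    index = min(range(len(times)), key=lambda i: abs(times[i] - target))
    if tolerance is not None and abs(times[index] - target) > tolerance:
        raise ValueError(f"no entry within {tolerance} of {target}")
    return index


# Hypothetical output times in Myr, for illustration only.
times_myr = [100.0, 115.0, 121.0, 127.0, 140.0]

i = nearest_index(times_myr, 123.4)
print(i, times_myr[i])
```

Passing a tolerance turns a silent "closest available" match into an explicit error when nothing in the series is close enough.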

## Loading an entire simulation
* the output schedule of a simulation is usually defined in advance in its parameter file
* [load_simulation](https://yt-project.org/docs/dev/reference/api/yt.loaders.html#yt.loaders.load_simulation) will calculate what outputs should exist based on the simulation parameter file
* the `all_outputs` attribute is a list of dicts, one per output, with filename, time, and redshift (if cosmological)
* currently supported for Enzo, Gadget, OWLS, and Exodus II frontends

In [None]:
my_sim = yt.load_simulation(os.path.join(data_path, "gas+dm-L3.enzo"), "Enzo")

In [None]:
my_times = my_sim.arr([output["time"] for output in my_sim.all_outputs])
my_fns = [output["filename"] for output in my_sim.all_outputs]

In [None]:
my_times.to("Myr")
# my_fns

### What datasets are actually available?
* the simulation may have written more or fewer outputs than the parameter file predicts
* not all datasets may be present
* add the `find_outputs=True` keyword to `load_simulation` to check the filesystem

In [None]:
my_sim = yt.load_simulation(os.path.join(data_path, "gas+dm-L3.enzo"), "Enzo", find_outputs=True)
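The difference between "outputs that should exist" and "outputs that are on disk" can be illustrated without yt. The sketch below builds a temporary directory containing only some of the expected files and keeps the ones that are present. The helper `existing_outputs` and the file names are hypothetical, and this is not how `find_outputs=True` is implemented.

```python
import os
import tempfile


def existing_outputs(expected, root):
    """Return the expected relative paths that exist under `root`."""
    return [fn for fn in expected if os.path.exists(os.path.join(root, fn))]


# Hypothetical list of outputs predicted from a parameter file.
expected = ["DD0000/DD0000", "DD0001/DD0001", "DD0002/DD0002"]

with tempfile.TemporaryDirectory() as root:
    # Create only the first two outputs as empty files.
    for fn in expected[:2]:
        path = os.path.join(root, fn)
        os.makedirs(os.path.dirname(path))
        open(path, "w").close()

    found = existing_outputs(expected, root)

print(found)
```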

### Get the exact time-series you want
* the [get_time_series](https://yt-project.org/docs/dev/reference/api/yt.frontends.enzo.simulation_handling.html#yt.frontends.enzo.simulation_handling.EnzoSimulation.get_time_series) method will turn the simulation object into a `DatasetSeries`
* it accepts a variety of keywords to specify times and redshifts of the datasets to be included

In [None]:
my_sim.get_time_series()

In [None]:
my_sim.get_time_series(initial_redshift=28)
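Because redshift decreases as the simulation advances, an initial redshift acts as an upper bound on the outputs that are kept. The sketch below applies that kind of window to a list of dicts shaped like `all_outputs`. The function `select_by_redshift`, the `final_redshift` argument as used here, and all values are assumptions for illustration; it does not try to reproduce how `get_time_series` treats boundaries or tolerances.

```python
def select_by_redshift(outputs, initial_redshift=None, final_redshift=None):
    """Keep outputs with final_redshift <= z <= initial_redshift."""
    selected = []
    for output in outputs:
        z = output["redshift"]
        if initial_redshift is not None and z > initial_redshift:
            continue  # earlier than the requested starting point
        if final_redshift is not None and z < final_redshift:
            continue  # later than the requested end point
        selected.append(output)
    return selected


# Hypothetical outputs, for illustration only.
outputs = [
    {"filename": "DD0000/DD0000", "redshift": 50.0},
    {"filename": "DD0001/DD0001", "redshift": 30.0},
    {"filename": "DD0002/DD0002", "redshift": 25.0},
    {"filename": "DD0003/DD0003", "redshift": 20.0},
]

kept = [o["filename"] for o in select_by_redshift(outputs, initial_redshift=28)]
print(kept)
```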

In [None]:
from matplotlib import pyplot as plt
%matplotlib inline

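# With no keywords, include every output known to the simulation object.
my_sim.get_time_series()
for ds in my_sim:
    # Value and location of the density maximum.
    v, c = ds.find_max(("gas", "density"))
    # 5 pc sphere centered on the densest point.
    sp = ds.sphere(c, (5, "pc"))

    # Radial density profile, weighted by cell mass.
    prof = yt.create_profile(sp, [("index", "radius")], [("gas", "density")],
                             weight_field=("gas", "cell_mass"))
    # Plot only the bins that contain data.
    plt.loglog(prof.x.to("pc")[prof.used],
               prof["gas", "density"][prof.used],
               label=f"z = {ds.current_redshift:.3f}")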

plt.xlabel("r [pc]")
plt.ylabel("$\\rho\\ [g\\ cm^{-3}]$")
plt.legend()
plt.show()