# Simulation

Each simulation is encapsulated as an object that is responsible for
finding the location of each dataset in the simulation, loading it —
which may involve downloading and caching the relevant file — and
returning an object that can be used to access that piece of data.


## Location

We begin by simply loading the simulation.  There are three pieces of
information needed to identify a simulation:

  1. SXS ID

     This identifies the simulation type and includes a number.  For
     example, "SXS:BBH:0123" identifies the simulation as a product of
     the SXS collaboration, specifies that it is a binary black hole
     simulation, and indicates that it has been assigned the number
     "0123" in that series.  Note that these numbers are not
     necessarily sequential, nor do they necessarily correspond to
     the relative age of the simulation; that information is contained
     in the metadata itself in the various `date_*` keys.

  2. Version
     
     This identifies the version of the data — like "v2.0".  Unlike
     the SXS ID, this is an optional specifier.  If not provided, the
     most recent version is used.  All versions refer to the same
     underlying simulation, but the raw data may have been processed
     differently, may be provided in incompatible formats, etc.  For
     exploratory work, it is often convenient to simply use the most
     recent version.  However, for reproducibility, it is important to
     specify the version of the data you use for a given analysis.

  3. Lev (resolution)

     This identifies the resolution of the simulation, like "Lev5".
     This is also optional.  If not provided, the highest resolution
     is used.  Note that there is no consistency in the "Lev"s
     provided for different simulations, nor is there necessarily
     even consistency in the meaning of the "Lev" between different
     simulations (they are not always directly comparable).  Again,
     for reproducibility, it is important to specify the resolution
     of the data you use for a given analysis.

These three pieces of information may be combined into a single string as in
any of the following examples of valid inputs:

    SXS:BBH:0123
    SXS:BBH:0123v2.0
    SXS:BBH:0123/Lev5
    SXS:BBH:0123v2.0/Lev5

The full specification including ID, version, and Lev is called the
"location", but any of these can be provided to load the simulation:

```python
sxs.load("SXS:BBH:0123")
sxs.load("SXS:BBH:0123v2.0")
sxs.load("SXS:BBH:0123/Lev5")
sxs.load("SXS:BBH:0123v2.0/Lev5")
```
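
As a rough illustration of how the three pieces fit together in a location string, the following sketch splits the example inputs above with a regular expression.  The pattern and the `parse_location` helper are hypothetical and written only for this tutorial; the `sxs` package may parse locations differently and may accept forms that this pattern rejects.

```python
import re

# Hypothetical pattern for illustration only; not taken from the sxs package.
LOCATION_PATTERN = re.compile(
    r"^(?P<sxs_id>SXS:[A-Za-z]+:\d+)"  # required, e.g. "SXS:BBH:0123"
    r"(?P<version>v\d+(?:\.\d+)*)?"    # optional, e.g. "v2.0"
    r"(?:/(?P<lev>Lev\d+))?$"          # optional, e.g. "/Lev5"
)


def parse_location(location):
    """Split a location string into (sxs_id, version, lev).

    Optional parts that are absent are returned as None.
    """
    match = LOCATION_PATTERN.match(location)
    if match is None:
        raise ValueError(f"Not a recognized location string: {location!r}")
    return match["sxs_id"], match["version"], match["lev"]


for example in (
    "SXS:BBH:0123",
    "SXS:BBH:0123v2.0",
    "SXS:BBH:0123/Lev5",
    "SXS:BBH:0123v2.0/Lev5",
):
    print(example, "->", parse_location(example))
```

A `None` in the second or third position corresponds to the cases where the most recent version or the highest resolution would be chosen for you.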

## Deprecated or superseded

Many simulations are now quite old, and do not have the benefit of
years of refinements to the simulation code.  As a result, the SXS
collaboration has deprecated many of them, and replaced them with
newer simulations of (nearly) the same physical parameters.

By default, a deprecated simulation raises an error if you attempt to
load it.  However, you can still load it, or its replacement, by
choosing one of the following options:

1. Pass `ignore_deprecation=True` to completely bypass even checking
   for deprecation or supersession.  No warnings or errors will be
   issued.
2. Include an explicit version number in the `location` string, as in
   "SXS:BBH:0123v2.0".  A warning will be issued that the simulation
   is deprecated, but it will be loaded anyway.
3. Pass `auto_supersede=True` to automatically load the superseding
   simulation, if there is only one.  Because no superseding
   simulation can be *precisely* the same as the deprecated one, there
   may be multiple superseding simulations that have very similar
   parameters, in which case an error will be raised and you must
   explicitly choose one.  If there is only one, a warning will be
   issued, but the superseding simulation will be loaded.
4. Configure `sxs` to automatically load superseding simulations by
   default with `sxs.write_config(auto_supersede=True)`.  This has the
   same effect as passing `auto_supersede=True` to every call to
   `sxs.load`.

Otherwise, a `ValueError` will be raised, with an explanation and
suggestions on what you might want to do.
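
One way to combine these behaviors is to try a plain load first and fall back to the superseding simulation only if that fails.  The sketch below assumes only what is described above: that the default load raises `ValueError` for a deprecated simulation and that `auto_supersede=True` is accepted as a keyword argument.  The helper name `load_preferring_original` is hypothetical, and the loader is passed in as an argument so that the sketch does not depend on network access.

```python
def load_preferring_original(location, load):
    """Try a plain load; fall back to the superseding simulation.

    `load` is any callable that accepts the same arguments as
    `sxs.load`.  Passing it in is a choice made for this sketch.
    """
    try:
        return load(location)
    except ValueError as error:
        # Per the text above, deprecated simulations raise ValueError.
        # Other problems may raise ValueError too, so report the message.
        print(f"Plain load failed: {error}")
        # This call may itself raise if several simulations supersede
        # the requested one; that error is deliberately not caught.
        return load(location, auto_supersede=True)
```

With the real package, this would be called as `load_preferring_original("SXS:BBH:0123", sxs.load)`.  Printing the error message before retrying keeps the deprecation visible rather than silently swapping simulations.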

In this case, "SXS:BBH:0123" is deprecated, and has been superseded by
the much newer simulation "SXS:BBH:2394".  We can load the superseding
simulation as

```python
import sxs

sxs_bbh = sxs.load("SXS:BBH:0123", auto_supersede=True)
```

We can see that, even though we requested "SXS:BBH:0123", the location of the output
simulation object is "SXS:BBH:2394":

```python
sxs_bbh.location
```

Note that the version "v2.0" and the resolution "Lev3" were chosen automatically, as the most recent version and the highest resolution, respectively.

## Metadata

At this point, only the metadata (mentioned in the previous notebook) has been loaded, which we can access naturally:

```python
sxs_bbh.metadata
```
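
The `date_*` keys mentioned earlier can be picked out of the metadata by their prefix.  The sketch below assumes the metadata behaves like a dictionary and uses a stand-in `dict` so that it runs without downloading anything.  The key names and values in the stand-in are placeholders rather than real metadata, and `select_by_prefix` is a hypothetical helper.

```python
# Stand-in for `sxs_bbh.metadata`; the keys and values are placeholders.
metadata = {
    "date_run_earliest": "<placeholder>",
    "date_run_latest": "<placeholder>",
    "object_types": "<placeholder>",
}


def select_by_prefix(mapping, prefix):
    """Return the entries of `mapping` whose keys start with `prefix`."""
    return {
        key: value
        for key, value in mapping.items()
        if key.startswith(prefix)
    }


dates = select_by_prefix(metadata, "date_")
print(sorted(dates))
```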

Just as `simulations.dataframe` allows us to extract the metadata for all simulations as a uniform `pandas` table, we can extract the metadata for this one simulation in a format consistent with other simulations as a `pandas` series:

```python
import pandas as pd

# The next line is just to ensure the output doesn't look ugly in the docs
with pd.option_context("max_colwidth", 46):
    display(sxs_bbh.series)
```

Various relevant pieces of information about the simulation are also
available as attributes of the simulation object.  For example, we may
wish to know which versions are available:

```python
sxs_bbh.versions
```

In this case, only the generic version, `""`, and a version from the second catalog, `"v2.0"`, are available, because SXS:BBH:2394 is a new simulation.

These versions track modifications to the files representing the data and, together with the SXS ID, establish the unique identifier for the data set.  This unique identifier is also published as a DOI.  The DOI prefix for SXS data is 10.26138, and the full DOI for any simulation is given by combining that prefix with the unique identifier:

```python
sxs_bbh.url
```

These DOIs are permanent and can be used to refer to the data in
publications, in the same way that DOIs for journal articles are used.
They point to deposits of the data in the Zenodo repository, which is
a long-term, open-access archive.  This is also where the data are
automatically obtained when you load a particular data set for the
first time.
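
For citing a data set without loading it, the DOI link can be assembled by hand.  The sketch below assumes that the DOI is the prefix, a slash, and then the SXS ID immediately followed by the version string, and that it resolves through the standard `https://doi.org/` resolver.  That layout is an assumption made for illustration, and `doi_url` is a hypothetical helper; the `url` attribute shown above remains the authoritative value.

```python
DOI_PREFIX = "10.26138"  # DOI prefix for SXS data, as stated above


def doi_url(sxs_id, version=""):
    """Build a DOI link from an SXS ID and an optional version string.

    Assumed layout: prefix, slash, SXS ID, then the version.
    """
    return f"https://doi.org/{DOI_PREFIX}/{sxs_id}{version}"


print(doi_url("SXS:BBH:2394", "v2.0"))
```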

## Data

Besides the metadata, the remaining data sets are loaded lazily.  This
means that we can access the data as needed, but the cost in time and
resources is not paid unless and until the data is actually accessed.
Specifically, the time to download the data if needed, the disk space
required to cache it if desired, the time to load the data from disk,
and the memory required to store it are all deferred to the point of
use.
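
The general idea of lazy loading can be sketched with `functools.cached_property` from the standard library.  The `LazySimulation` class below is a toy stand-in written for this tutorial, not the class used by `sxs`, and its `_fetch` method only pretends to do the expensive work.

```python
from functools import cached_property


class LazySimulation:
    """Toy stand-in that defers loading until an attribute is used."""

    def __init__(self, location):
        self.location = location
        self.load_count = 0  # how many times the expensive step has run

    def _fetch(self, name):
        # Placeholder for downloading, caching, and reading from disk.
        self.load_count += 1
        return f"<{name} data for {self.location}>"

    @cached_property
    def horizons(self):
        # Runs on first access only; the result is then stored on the
        # instance and reused.
        return self._fetch("horizons")


sim = LazySimulation("SXS:BBH:2394")
print(sim.load_count)  # 0: nothing is loaded when the object is created
sim.horizons
sim.horizons
print(sim.load_count)  # 1: the second access reuses the stored result
```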

For example, we can access data describing the horizons as

```python
sxs_bbh.horizons
```

And data describing the waveform as

```python
sxs_bbh.h
```

The objects returned will be the subject of the next two notebooks in this series:

- [`Horizons`](/tutorials/03-Horizons)
- [`WaveformModes`](/tutorials/04-Waveforms)