# Callbacks

PassengerSim includes a variety of optimized data collection processes
that run automatically during a simulation, but these pre-selected data
may not be sufficient for every analysis.  To supplement them, users can
collect additional data while running a simulation by writing a "callback"
function.  Such a function is invoked regularly while the simulation is
running, and can inspect and store almost anything from the Simulation object.



In [None]:
import passengersim as pax

pax.versions()

Here, we'll run a quick demo using the "3MKT" example model.  We'll
give AL1 the 'P' RM system to make it interesting.

In [None]:
cfg = pax.Config.from_yaml(pax.demo_network("3MKT"))

cfg.simulation_controls.num_samples = 100
cfg.simulation_controls.burn_samples = 50
cfg.simulation_controls.num_trials = 1
cfg.db = None
cfg.outputs.reports.clear()

cfg.carriers.AL1.rm_system = "P"

sim = pax.Simulation(cfg)

## Types of Callback Functions

To collect data, we can write a function that will interrogate the simulation and 
grab whatever info we are looking for.  There are three different points
where we can attach data collection callback functions:

- `begin_sample`, which will trigger data collection at the beginning of each
    sample, after the RM systems for each carrier are initialized (e.g. with
    forecasts) but before any customers can arrive.
- `end_sample`, which will trigger data collection at the end of each
    sample, after customers have arrived and all bookings have been finalized.
- `daily`, which will trigger data collection once per day during every sample,
    just after any DCP or daily RM system updates are run.
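
The relative ordering of these three hooks can be pictured with a toy driver loop. This is only an illustrative sketch, not PassengerSim's actual implementation: `run_toy_sample` and the range of days are hypothetical, and the real simulation does far more between events.

```python
def run_toy_sample(sample, begin_sample, daily, end_sample, max_days_prior=3):
    """Hypothetical stand-in that only mimics the order in which hooks fire."""
    events = []
    # RM systems would be initialized before this point; no customers yet.
    events.append(begin_sample(sample))
    # Days count down toward departure; the daily hook fires once per day.
    for days_prior in range(max_days_prior, -1, -1):
        events.append(daily(sample, days_prior))
    # By now all bookings for the sample are final.
    events.append(end_sample(sample))
    return events


events = run_toy_sample(
    0,
    begin_sample=lambda s: "begin_sample",
    daily=lambda s, d: f"daily@{d}",
    end_sample=lambda s: "end_sample",
)
print(events)
```

Note that the daily hook receives an extra value (the days remaining before departure), which is why its signature differs from the other two.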

The first two callbacks (`begin_sample` and `end_sample`) are each written as a function that
accepts one argument (the `Simulation` object), and either returns nothing (to ignore that event)
or returns a dictionary of values to store, where the keys are strings
naming what's being stored and the values can be whatever is of interest.
We can attach each callback to the Simulation by using a Python decorator.

## Example Callback Functions

For example, here we create a callback to collect carrier revenue at the end 
of every sample. Note that we skip the burn period by returning nothing for those
samples; this is not required by the callback algorithm but is good practice for
analysis.

In [None]:
@sim.end_sample_callback
def collect_carrier_revenue(sim):
    # Skip the burn period: returning nothing stores no data for this sample.
    if sim.sim.sample < sim.sim.burn_samples:
        return
    return {c.name: c.revenue for c in sim.sim.carriers}

The daily callback operates similarly, except it accepts a second argument that gives the
number of days prior to departure for this day.  You don't need to *use* the second argument
in the callback function, but you need to include it in the function signature (and you can
use it if desired, e.g. to collect data only at DCPs instead of every day).  In the example
here, we collect daily carrier revenue, but only every 7th sample, which is a good way
to reduce the overhead from collecting detailed data.

In [None]:
@sim.daily_callback
def collect_carrier_revenue_detail(sim, days_prior):
    # Skip the burn period.
    if sim.sim.sample < sim.sim.burn_samples:
        return
    # Collect on every 7th sample only, to limit overhead.
    if sim.sim.sample % 7 == 0:
        return {c.name: c.revenue for c in sim.sim.carriers}
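
The `days_prior` argument can also be used to restrict collection to particular days. The following sketch shows that filtering logic in isolation, using a `SimpleNamespace` stand-in rather than a real `Simulation`, so it runs without PassengerSim. The `DCPS` set is a hypothetical schedule chosen for illustration, not the one used by the 3MKT model.

```python
from types import SimpleNamespace

# Hypothetical DCP schedule (days before departure); an assumption for illustration.
DCPS = {63, 35, 21, 7, 1}


def collect_revenue_at_dcps(sim, days_prior):
    # Skip the burn period, as in the callbacks above.
    if sim.sim.sample < sim.sim.burn_samples:
        return
    # Ignore any day that is not in the assumed DCP schedule.
    if days_prior not in DCPS:
        return
    return {c.name: c.revenue for c in sim.sim.carriers}


# Stand-in object that mimics only the attributes the callback reads.
carriers = [
    SimpleNamespace(name="AL1", revenue=1200.0),
    SimpleNamespace(name="AL2", revenue=950.0),
]
fake = SimpleNamespace(
    sim=SimpleNamespace(sample=60, burn_samples=50, carriers=carriers)
)

print(collect_revenue_at_dcps(fake, 21))  # a DCP day: returns a dict
print(collect_revenue_at_dcps(fake, 20))  # not a DCP day: returns None
```

With a real simulation, a function like this would be attached with the `@sim.daily_callback` decorator in place of the callback above.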

Multiple callbacks of the same kind can be attached (i.e. there can be two
`end_sample` callbacks).  The only limitation is that the keys in the
dictionaries returned by the callback functions must be unique across
callbacks, or else they will overwrite one another.
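
One simple way to keep names unique is to prefix each key with something specific to the callback. The sketch below uses plain dictionaries to show the overwriting problem and the prefixed alternative. Merging with `dict.update` is an assumption used here to mimic the overwriting behavior, not a statement about PassengerSim's internals, and the values are made up.

```python
# Return values from two hypothetical end_sample callbacks using the same keys.
revenue = {"AL1": 1200.0, "AL2": 950.0}
sold = {"AL1": 85, "AL2": 70}

# Same key names: the second set of values silently replaces the first.
clobbered = {}
clobbered.update(revenue)
clobbered.update(sold)
print(clobbered)

# Prefixed key names: both sets of values survive side by side.
combined = {}
combined.update({f"revenue_{k}": v for k, v in revenue.items()})
combined.update({f"sold_{k}": v for k, v in sold.items()})
print(combined)
```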

Once we have attached all desired callbacks, we can run the simulation as normal.

In [None]:
summary = sim.run()

All the usual summary data remains available for review and analysis.

In [None]:
summary.fig_carrier_revenues()

## Callback Data

In addition to the usual suspects, the summary object includes the collected callback data from
our callback functions.

In [None]:
summary.callback_data

Because we connected a "daily" callback, the data we collected is available under the 
`callback_data.daily` accessor.

In [None]:
summary.callback_data.daily[:5]

As you might expect, the "begin_sample" or "end_sample"
callbacks are available under `callback_data.begin_sample` or `callback_data.end_sample`, 
respectively.

In [None]:
summary.callback_data.end_sample[:5]

The callback data can include pretty much anything, so it is stored in a
very flexible (but inefficient) format: a list of dicts.  If the content
of the dicts is fairly simple (numbers, tuples, lists, or nested dictionaries thereof),
it can be converted into a pandas DataFrame using the `to_dataframe` method
on the `callback_data` attribute.  This may make subsequent analysis easier.

In [None]:
summary.callback_data.to_dataframe("daily")
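
When the stored values are too irregular for a DataFrame, the raw list of dictionaries can still be processed with ordinary Python. The sketch below averages revenue by carrier over a small hand-written list. The records and their `sample` key are made-up stand-ins for what an accessor such as `summary.callback_data.end_sample` might hold, not real simulation output.

```python
from statistics import mean

# Made-up records; real callback data may carry different bookkeeping keys.
records = [
    {"sample": 50, "AL1": 1200.0, "AL2": 950.0},
    {"sample": 51, "AL1": 1100.0, "AL2": 1000.0},
    {"sample": 52, "AL1": 1300.0, "AL2": 1050.0},
]

# Average each carrier's revenue across all records.
carrier_names = ["AL1", "AL2"]
avg_revenue = {c: mean(r[c] for r in records) for c in carrier_names}
print(avg_revenue)
```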

Users are free to process this callback data now however they like, with typical
Python tools: analyze, visualize, interpret, etc. 

In [None]:
import altair as alt

alt.Chart(
    summary.callback_data.to_dataframe("daily").eval("DIFF = AL1 - AL2")
).mark_line().encode(
    x=alt.X("days_prior", scale=alt.Scale(reverse=True)),
    y="DIFF",
    color="sample:N",
)