# Running a scheduler simulation and adding it to an archive

## Development aid

The following cell loads development aids (code formatting and automatic module reloading); it is needed only when developing the notebook:

In [1]:
%load_ext lab_black
%load_ext autoreload
%autoreload 1

## Imports

In [2]:
import os
import re
import yaml
from pathlib import Path
from tempfile import TemporaryDirectory

from astropy.time import Time
import pandas as pd

from lsst.resources import ResourcePath

from rubin_scheduler.scheduler.example import example_scheduler
from rubin_scheduler.scheduler.model_observatory import ModelObservatory
from rubin_scheduler.sim_archive import drive_sim
from rubin_scheduler.scheduler.utils import SchemaConverter
from rubin_scheduler.utils import survey_start_mjd

## Configuring a simulation

### Set archive parameters

To use a local directory as the root of your archive, set `archive_uri` to a `file://` URI (replace the path with whatever directory you want to use):

In [3]:
archive_uri = "file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/"
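If the archive directory is already held as a filesystem path, the URI can be derived instead of typed by hand. The following is a minimal sketch using only the standard library. `directory_uri` is a hypothetical helper, not part of `rubin_scheduler` or `lsst.resources`, and the trailing slash is added on the assumption that the archive root should be read as a directory.

```python
from pathlib import PurePosixPath


def directory_uri(directory: str) -> str:
    """Return a file:// URI with a trailing slash for an absolute POSIX directory.

    Hypothetical helper for illustration only.
    """
    # as_uri() raises ValueError for relative paths, so a mistyped
    # relative directory fails here rather than later in the simulation.
    uri = PurePosixPath(directory).as_uri()
    return uri if uri.endswith("/") else uri + "/"


example_uri = directory_uri("/sdf/data/rubin/user/neilsen/data/test_sim_archive")
print(example_uri)
```

The result has the same form as the `archive_uri` set above.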

If you want to use the pre-night archive at the USDF, make sure you have the needed credentials in `~/.lsst/aws-credentials.ini` in a `prenight` section, then change the cell type of the following cell from `Raw` to `Code`:

### Set simulation parameters

In [4]:
sim_mjd_start = survey_start_mjd()
sim_length = 2
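To see which calendar dates a given start MJD and length cover, the MJD can be converted to a UTC datetime. The sketch below uses only the standard library and ignores leap seconds; `astropy.time.Time` with `format="mjd"` is the more rigorous tool. The value 60796.0 is the `mjd_start` recorded in the metadata shown later in this notebook, and treating the length as a number of days is an assumption.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjd_to_datetime(mjd: float) -> datetime:
    """Convert a Modified Julian Date to a UTC datetime (leap seconds ignored)."""
    return MJD_EPOCH + timedelta(days=mjd)


example_mjd_start = 60796.0  # illustrative value, not read from survey_start_mjd()
example_length_days = 2  # assumption: the simulation length is counted in days

example_start = mjd_to_datetime(example_mjd_start)
example_end = mjd_to_datetime(example_mjd_start + example_length_days)
print(example_start.isoformat(), example_end.isoformat())
```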

### Create and configure the observatory and scheduler

In [5]:
observatory = ModelObservatory()
scheduler = example_scheduler()
scheduler.keep_rewards = True

INFO:healpy:Sigma is 254.796540 arcmin (0.074117 rad) 
INFO:healpy:-> fwhm is 600.000000 arcmin
INFO:healpy:Sigma is 0.000000 arcmin (0.000000 rad) 
INFO:healpy:-> fwhm is 0.000000 arcmin


Optimizing ELAISS1
Optimizing XMM_LSS
Optimizing ECDFS
Optimizing COSMOS
Optimizing EDFS_a


## Save cells run in this kernel to a notebook

Record the cells run in this kernel in a notebook file so that the provenance of the simulation can be stored in the archive (optional).

In [6]:
scratch_dir = TemporaryDirectory()
scratch_path = Path(scratch_dir.name)
notebook_fname = scratch_path.joinpath("notebook.ipynb").as_posix()

In [7]:
%notebook $notebook_fname
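The scratch directory above lives only as long as its `TemporaryDirectory` object, so the reference to that object should be kept until the simulation has been archived. The sketch below illustrates the lifetime with separate, hypothetical `demo_*` names so that running it does not disturb the real scratch directory.

```python
from pathlib import Path
from tempfile import TemporaryDirectory

# Separate demo names: this does not touch the notebook's scratch_dir.
demo_dir = TemporaryDirectory()
demo_path = Path(demo_dir.name)
demo_notebook_fname = demo_path.joinpath("notebook.ipynb").as_posix()

# The directory exists until it is cleaned up.
existed_before_cleanup = demo_path.is_dir()

# Explicit cleanup removes the directory and everything inside it.
demo_dir.cleanup()
exists_after_cleanup = demo_path.exists()

print(existed_before_cleanup, exists_after_cleanup)
```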

## Actually run the simulation

In [8]:
(
    observatory,
    scheduler,
    observations,
    reward_df,
    obs_rewards_series,
    archive_resource_path,
) = drive_sim(
    observatory=observatory,
    scheduler=scheduler,
    archive_uri=archive_uri,
    label=f"Example simulation started at {Time.now().iso}.",
    notebook=notebook_fname,
    tags=["example"],
    mjd_start=sim_mjd_start,
    survey_length=sim_length,
    record_rewards=True,
)

progress = 100.00%
Skipped 0 observations
Flushed 76 observations from queue for being stale
Completed 2036 observations
ran in 0 min = 0.0 hours


your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed,key->block3_values] [items->Index(['basis_function', 'basis_function_class', 'tier_label', 'survey_label',
       'survey_class'],
      dtype='object')]

  reward_df.to_hdf(rewards_fname, "reward_df")
INFO:rubin_scheduler.sim_archive.sim_archive:Copied /tmp/tmp_y902i0i/environment.txt to file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/environment.txt
INFO:rubin_scheduler.sim_archive.sim_archive:Copied /tmp/tmp_y902i0i/notebook.ipynb to file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/notebook.ipynb
INFO:rubin_scheduler.sim_archive.sim_archive:Copied /tmp/tmp_y902i0i/opsim.db to file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/opsim.db
INFO:rubin_scheduler.sim_archive.sim_archive:Copied /tmp/tmp_y902i0i/pypi.json to file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/pypi.json


So, where did we put the archive?

In [9]:
archive_resource_path

ResourcePath("file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/")

## Examine the archive

See what simulations are available in the archive. (We do not need to do this here, because we already have `archive_resource_path` from above, but this is how to find a simulation when its location is not already known.)

In [10]:
base_archive_resource_path = ResourcePath(archive_uri)
for dirpath, dirnames, filenames in base_archive_resource_path.walk():
    # ResourcePath.walk has a file_filter argument, but it filters only
    # files, not directories, so it cannot select the directories we want here.
    for dirname in dirnames:
        full_url = dirpath.join(dirname).geturl()
        if re.search(r"2024-01-23/[0-9]+$", full_url):
            print(full_url)

file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/1
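The search above hard-codes the date `2024-01-23`. A more general pattern can match any date directory and pull out the date and the index. This is a minimal sketch that assumes the `<root>/<YYYY-MM-DD>/<index>` layout seen in the output above; `SIM_DIR_PATTERN` and `parse_sim_url` are hypothetical names, not part of `rubin_scheduler`.

```python
import re

# Assumed layout: <archive root>/<YYYY-MM-DD>/<integer index>
SIM_DIR_PATTERN = re.compile(r"/(\d{4}-\d{2}-\d{2})/(\d+)/?$")


def parse_sim_url(url: str):
    """Return (date string, index) for a simulation directory URL, else None."""
    match = SIM_DIR_PATTERN.search(url)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


example_urls = [
    "file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2",
    "file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/1",
    # A date directory on its own is not a simulation directory.
    "file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23",
]

# Sorting the (date, index) tuples orders simulations by date, then index.
parsed = sorted(p for p in map(parse_sim_url, example_urls) if p is not None)
print(parsed)
```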


What files did we store?

In [11]:
for dirpath, dirnames, filenames in archive_resource_path.walk():
    for filename in filenames:
        print(dirpath.join(filename))

file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/opsim.db
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/pypi.json
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/environment.txt
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/obs_stats.txt
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/notebook.ipynb
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/sim_metadata.yaml
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/rewards.h5
file:///sdf/data/rubin/user/neilsen/data/test_sim_archive/2024-01-23/2/scheduler.pickle.xz


What metadata did we store?

In [12]:
metadata_resource_path = archive_resource_path.join("sim_metadata.yaml")

In [13]:
print(metadata_resource_path.read().decode())

files:
    environment:
        md5: 33f94ddf8975f9641a1f524fd22e362e
        name: environment.txt
    notebook:
        md5: 731f1ae2fb68e98b33c9b2667fdf39f3
        name: notebook.ipynb
    observations:
        md5: e65ea1226f435f113903b70378c8cdef
        name: opsim.db
    pypi:
        md5: 9c86ea9b4e7aa40d3e206fad1a59ea31
        name: pypi.json
    rewards:
        md5: 0818417ef31b6566fc6a30cb196dc166
        name: rewards.h5
    scheduler:
        md5: 14a41065a262e5c7ecd03c6d8747ee53
        name: scheduler.pickle.xz
    statistics:
        md5: 669ee31b361043abc9bb34a6d4a6f51e
        name: obs_stats.txt
host: neilsen-nb
label: Example simulation started at 2024-01-23 16:51:22.683.
scheduler_version: 1.0.1.dev25+gba1ca4d.d20240102
sim_runner_kwargs:
    mjd_start: 60796.0
    record_rewards: true
    survey_length: 2
simulated_dates:
    first: '2025-04-30'
    last: '2025-05-01'
tags:
- example
username: neilsen




Load up the metadata for convenient access:

In [14]:
sim_metadata = yaml.safe_load(metadata_resource_path.read().decode())
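Each entry under `files` carries an `md5` value, which suggests a way to check that a retrieved file is intact. The sketch below assumes the stored value is the hexadecimal MD5 digest of the file contents; `md5_matches` is a hypothetical helper, and the demonstration uses a throwaway file rather than the archive.

```python
import hashlib
from pathlib import Path
from tempfile import TemporaryDirectory


def md5_matches(path, expected_md5: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's MD5 hex digest equals expected_md5."""
    digest = hashlib.md5()
    with open(path, "rb") as stream:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()


with TemporaryDirectory() as demo_dir:
    demo_file = Path(demo_dir) / "demo.txt"
    demo_file.write_bytes(b"example contents")
    expected = hashlib.md5(b"example contents").hexdigest()
    check_passes = md5_matches(demo_file, expected)
    check_fails = md5_matches(demo_file, "0" * 32)

print(check_passes, check_fails)
```

With the archive, one might pass the local path of a file obtained through `as_local()` together with, for example, `sim_metadata["files"]["observations"]["md5"]`.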

Read visit statistics from the archive:

In [15]:
# Join file names to the archive directory rather than to the metadata file.
statistics_url = archive_resource_path.join(
    sim_metadata["files"]["statistics"]["name"]
).geturl()

pd.read_csv(statistics_url, sep="\t", index_col=0)

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| observationId | 2036.0 | 1017.5 | 587.886894 | 0.0 | 508.75 | 1017.5 | 1526.25 | 2035.0 |
| fieldRA | 2036.0 | 185.725387 | 39.964681 | 0.564476 | 166.142302 | 178.504752 | 197.166779 | 358.489606 |
| fieldDec | 2036.0 | -38.963969 | 26.412814 | -89.677611 | -59.876307 | -41.925437 | -16.865586 | 19.22998 |
| observationStartMJD | 2036.0 | 60796.76625 | 0.555865 | 60796.001439 | 60796.233521 | 60796.98774 | 60797.230212 | 60797.999983 |
| flush_by_mjd | 2036.0 | 58169.038223 | 12366.46075 | 0.0 | 60796.219107 | 60796.989094 | 60797.245734 | 60798.023375 |
| visitExposureTime | 2036.0 | 29.616896 | 2.36697 | 15.0 | 30.0 | 30.0 | 30.0 | 30.0 |
| rotSkyPos | 2036.0 | 217.409374 | 96.268294 | 0.391245 | 147.555804 | 247.523411 | 295.100362 | 359.756191 |
| rotSkyPos_desired | 2036.0 | 206.543403 | 110.191236 | 0.0 | 131.863537 | 246.680509 | 294.6204 | 359.756191 |
| numExposures | 2036.0 | 1.793713 | 0.404738 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 |
| airmass | 2036.0 | 1.525252 | 0.459431 | 1.013159 | 1.130718 | 1.365415 | 1.868428 | 2.91401 |

Get the visits themselves from the archive, both as a `numpy` array and as a `pandas.DataFrame`:

In [16]:
opsim_visits_resource_path = archive_resource_path.join(
    sim_metadata["files"]["observations"]["name"]
)

schema_converter = SchemaConverter()

with opsim_visits_resource_path.as_local() as opsim_visits_local_resource_path:
    observations = schema_converter.opsim2obs(opsim_visits_local_resource_path.ospath)

opsim_visits = schema_converter.obs2opsim(observations)
opsim_visits

| | observationId | fieldRA | fieldDec | observationStartMJD | flush_by_mjd | visitExposureTime | filter | rotSkyPos | rotSkyPos_desired | numExposures | ... | sunAz | sunRA | sunDec | moonRA | moonDec | moonDistance | solarElong | moonPhase | cummTelAz | scripted_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 200.782211 | -50.116498 | 60796.001439 | 60796.043692 | 30.0 | r | 127.004982 | 0.000000 | 2 | ... | 273.457997 | 38.109263 | 14.977460 | 83.711608 | 29.111132 | 128.922776 | 142.152585 | 24.587034 | 132.523734 | 0 |
| 1 | 1 | 205.280457 | -48.985871 | 60796.001888 | 60796.043692 | 30.0 | r | 131.267789 | 0.000000 | 2 | ... | 273.380529 | 38.109692 | 14.977597 | 83.717850 | 29.110354 | 131.850674 | 144.402975 | 24.589727 | 130.557654 | 0 |
| 2 | 2 | 208.768832 | -50.721324 | 60796.002335 | 60796.043692 | 30.0 | r | 132.267931 | 0.000000 | 2 | ... | 273.303332 | 38.110119 | 14.977733 | 83.724072 | 29.109578 | 133.962569 | 143.468442 | 24.592412 | 132.814350 | 0 |
| 3 | 3 | 212.491654 | -52.367602 | 60796.002782 | 60796.043692 | 30.0 | r | 133.668631 | 0.000000 | 2 | ... | 273.226054 | 38.110546 | 14.977868 | 83.730302 | 29.108800 | 135.992842 | 142.343662 | 24.595101 | 134.937940 | 0 |
| 4 | 4 | 209.531249 | -47.656241 | 60796.003233 | 60796.043692 | 30.0 | r | 135.334507 | 0.000000 | 2 | ... | 273.148211 | 38.110975 | 14.978005 | 83.736579 | 29.108015 | 134.729166 | 146.556968 | 24.597810 | 128.702991 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2031 | 2031 | 172.497004 | -27.689426 | 60797.998183 | 60798.023375 | 30.0 | g | 160.168151 | 160.168151 | 2 | ... | 274.576175 | 40.020184 | 15.575931 | 115.378901 | 26.701070 | 77.251374 | 134.488863 | 39.019113 | 106.706846 | 0 |
| 2032 | 2032 | 175.300205 | -29.396924 | 60797.998632 | 60798.023375 | 30.0 | g | 157.242978 | 157.242978 | 2 | ... | 274.499723 | 40.020615 | 15.576063 | 115.383392 | 26.699177 | 80.235058 | 136.729629 | 39.021226 | 112.672214 | 0 |
| 2033 | 2033 | 178.708344 | -28.309793 | 60797.999082 | 60798.023375 | 30.0 | g | 159.756757 | 159.756757 | 2 | ... | 274.423127 | 40.021047 | 15.576196 | 115.387893 | 26.697280 | 81.946828 | 139.848103 | 39.023343 | 111.284480 | 0 |
| 2034 | 2034 | 178.221156 | -31.080350 | 60797.999533 | 60798.023375 | 30.0 | g | 155.019839 | 155.019839 | 2 | ... | 274.346410 | 40.021479 | 15.576329 | 115.392401 | 26.695378 | 83.252761 | 138.905704 | 39.025464 | 117.759084 | 0 |
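
The visits can also be summarized directly with SQL. This sketch assumes that `opsim.db` is a SQLite database with an `observations` table containing a `filter` column, which matches the columns shown above but is not confirmed by this notebook. `visits_per_filter` is a hypothetical helper, and the demonstration runs against a small in-memory table rather than the archived file.

```python
import sqlite3


def visits_per_filter(connection: sqlite3.Connection) -> dict:
    """Count rows per filter in an assumed `observations` table."""
    # "filter" is quoted because FILTER is also an SQL keyword.
    rows = connection.execute(
        'SELECT "filter", COUNT(*) FROM observations GROUP BY "filter"'
    ).fetchall()
    return dict(rows)


# Toy stand-in for opsim.db, with only the columns needed here.
demo_connection = sqlite3.connect(":memory:")
demo_connection.execute(
    'CREATE TABLE observations (observationId INTEGER, "filter" TEXT)'
)
demo_connection.executemany(
    "INSERT INTO observations VALUES (?, ?)", [(0, "r"), (1, "r"), (2, "g")]
)
demo_counts = visits_per_filter(demo_connection)
demo_connection.close()

print(demo_counts)
```

For the archived file, a connection could be opened with `sqlite3.connect(opsim_visits_local_resource_path.ospath)` inside the `as_local()` block, assuming the layout described above.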
