This is a minimal example illustrating what `saltax` can give you. 

Lanqing & Dacheng, Jun 29 2025

There will be 3 datasets in the end:
- `data`: Exactly the same as offline real data.
- `simulation`: Events reconstructed from the simulation instruction only; nothing else enters the reconstruction process.
- `sprinkled`: Events reconstructed from a mixture of simulation and data. This dataset is sometimes also called `salt`; the two names mean the same thing.

In [1]:
import matplotlib.pyplot as plt
import strax
import straxen
import saltax
from saltax.utils import straxen_version

straxen.print_versions(("strax", "straxen", "cutax", "saltax"))

*** Detector definition message ***
You are currently using the default XENON10 template detector.

Note: You have installed the 'manylinux2014' variant of XGBoost. Certain features such as GPU algorithms or federated learning are not available. To use these features, please upgrade to a recent Linux distro with glibc 2.28+, and install the 'manylinux_2_28' variant.


| module | version | path | git |
| --- | --- | --- | --- |
| python | 3.11.0 | /opt/XENONnT/anaconda/envs/XENONnT_el7.2025.07... | |
| strax | 2.2.1 | /home/xudc/strax/strax | branch:bump_v221 \| 3345a1b |
| straxen | 3.2.2 | /home/xudc/straxen/straxen | branch:master \| b5235cc |
| cutax | 2.3.0 | /home/xudc/cutax/cutax | branch:master \| e809b3f |
| saltax | 0.2.2 | /home/xudc/saltax/saltax | branch:write_run_id_to_config \| fd68486 |


Now let's define contexts in the `saltax` fashion.

Once you define the contexts below, `saltax` will:
- Try to fetch the simulation instruction specified by the context.
- Generate the simulation instruction if none is found.
- Register the `saltax` plugins, as well as the standard `cutax` and `straxen` plugins that are not replaced.
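The fetch-or-generate behaviour in the first two steps follows a common caching pattern. Below is a minimal sketch of that pattern only, not of the `saltax` implementation: the helper `fetch_or_generate`, the toy generator, the column names, and the file name are all hypothetical.

```python
import csv
import os
import tempfile


def fetch_or_generate(path, generate_rows):
    """Return instruction rows from `path`, generating and saving them first if missing."""
    if not os.path.exists(path):
        rows = generate_rows()
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    # Always read back from disk so both code paths return the same thing
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


calls = []  # records how often the generator actually runs


def toy_generator():
    # Hypothetical columns; real simulation instructions carry more fields
    calls.append(1)
    return [{"t": 10, "e": 1.5}, {"t": 30, "e": 2.0}]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "instructions.csv")
    first = fetch_or_generate(path, toy_generator)   # file missing: generate and save
    second = fetch_or_generate(path, toy_generator)  # file present: fetch only
```

The second call reuses the saved file, so the generator runs only once and both calls return the same rows.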

In [2]:
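if straxen_version() == 2:
    run_id = "037119"
    xedocs_version = "global_v14"
elif straxen_version() == 3:
    run_id = "066016"
    xedocs_version = "global_v18"
else:
    # Fail early instead of hitting a NameError on run_id later
    raise ValueError(f"Unsupported straxen major version: {straxen_version()}")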

In [3]:
# You only need run_id in context when you need to compute raw_records_simu
# salt mode: reconstruction from a mixture of data and simulation
st_salt = saltax.contexts.sxenonnt(
    run_id=run_id,
    corrections_version=xedocs_version,
    saltax_mode="salt",
)
# simu mode: reconstruction from simulation only
st_simu = saltax.contexts.sxenonnt(
    run_id=run_id,
    corrections_version=xedocs_version,
    saltax_mode="simu",
)

INFO:fuse.context:Using simulation config file: fuse_config_nt_sr1_dev.json
INFO:fuse.context:Using clustering method: dbscan
INFO:fuse.context:Overriding processing plugins:
INFO:fuse.context:Registering <class 'fuse.plugins.processing.corrected_areas.CorrectedAreasMC'>
INFO:fuse.context:[legacy] Using fdc_map_mc: XnT_3D_FDC_xyz_SR1_15_Mar_2024_MC.json.gz
INFO:fuse.context:Using simulation config file: fuse_config_nt_sr1_dev.json
INFO:fuse.context:Using clustering method: dbscan
INFO:fuse.context:Overriding processing plugins:
INFO:fuse.context:Registering <class 'fuse.plugins.processing.corrected_areas.CorrectedAreasMC'>
INFO:fuse.context:[legacy] Using fdc_map_mc: XnT_3D_FDC_xyz_SR1_15_Mar_2024_MC.json.gz


By default, the contexts above will simulate a flat beta ER band at 50 Hz.
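To get a feel for what 50 Hz means in terms of event counts, here is a small sketch. It assumes evenly spaced injection times, which is an illustrative assumption and not necessarily how `saltax` places events; the helper name `instruction_times` is hypothetical.

```python
RATE_HZ = 50
NS_PER_S = 1_000_000_000


def instruction_times(start_ns, end_ns, rate_hz=RATE_HZ):
    """Evenly spaced injection times in ns within [start_ns, end_ns)."""
    spacing = NS_PER_S // rate_hz  # 20 ms between injections at 50 Hz
    # Offset by half a spacing so no injection sits exactly on the run edge
    return list(range(start_ns + spacing // 2, end_ns, spacing))


one_second = instruction_times(0, NS_PER_S)
one_minute = instruction_times(0, 60 * NS_PER_S)
print(len(one_second), len(one_minute))
```

Under this assumption the number of simulated events scales linearly with the run duration, which is why a short run is enough for a quick test.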

In [4]:
# Just to bind the storage so we have access to the raw_records of a small run
st_salt.storage.append(strax.DataDirectory("/project2/lgrandi/tutorial_data", readonly=True))
st_simu.storage.append(strax.DataDirectory("/project2/lgrandi/tutorial_data", readonly=True))

You can check that some plugins are replaced by their `saltax` counterparts while others are not.

In [5]:
st_simu._plugin_class_registry["peaklets"]

saltax.plugins.peaklets.SPeaklets

In [6]:
st_simu._plugin_class_registry["microphysics_summary"]

saltax.plugins.csv_input.SChunkCsvInput

In [7]:
st_simu._plugin_class_registry["event_info"]

straxen.plugins.events.event_info.EventInfo

In [8]:
st_simu._plugin_class_registry["cuts_basic"]

cutax.cut_lists.basic.BasicCuts
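Rather than checking data types one by one, the registry can be grouped by the package that provides each plugin. The sketch below uses a toy registry built from stand-in classes so that it runs without `saltax` installed; the helper `group_by_package` is hypothetical, and the toy entries only mirror the four lookups above.

```python
from collections import defaultdict


def group_by_package(registry):
    """Group data types by the top-level package of their plugin class."""
    groups = defaultdict(list)
    for data_type, plugin_class in registry.items():
        package = plugin_class.__module__.split(".")[0]
        groups[package].append(data_type)
    return {package: sorted(names) for package, names in groups.items()}


def fake_plugin(name, module):
    # Stand-in class carrying only the module path, for illustration
    return type(name, (), {"__module__": module})


toy_registry = {
    "peaklets": fake_plugin("SPeaklets", "saltax.plugins.peaklets"),
    "microphysics_summary": fake_plugin("SChunkCsvInput", "saltax.plugins.csv_input"),
    "event_info": fake_plugin("EventInfo", "straxen.plugins.events.event_info"),
    "cuts_basic": fake_plugin("BasicCuts", "cutax.cut_lists.basic"),
}

grouped = group_by_package(toy_registry)
print(grouped)
```

With a real context, the same helper could be called as `group_by_package(st_simu._plugin_class_registry)`, assuming the registry maps data type names to plugin classes as the lookups above suggest.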

Now let's make some data! Note that both contexts have the same hashes until `peaklets`, where the merging happens.

In [9]:
st_simu.key_for(run_id, "peaklets")

066016-peaklets-sjukmjuzur

In [10]:
st_salt.key_for(run_id, "records")

066016-records-xq4nrzt4g3

In [11]:
st_simu.key_for(run_id, "records")

066016-records-xq4nrzt4g3
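Why can `records` share a hash while `peaklets` does not? A key is derived from the plugin that produces a data type together with everything upstream of it, so two contexts agree on a key until the first plugin whose settings differ. The toy model below illustrates that idea with plain `hashlib`; it is not the `strax` hashing code, and the plugin table, option names, and helper `lineage_hash` are made up for illustration.

```python
import hashlib
import json


def lineage_hash(data_type, plugins):
    """Hash a data type together with the settings of all its upstream plugins."""
    lineage = {}

    def collect(name):
        plugin = plugins[name]
        lineage[name] = (plugin["version"], plugin["config"])
        for dependency in plugin["depends_on"]:
            collect(dependency)

    collect(data_type)
    blob = json.dumps(lineage, sort_keys=True).encode()
    return hashlib.sha1(blob).hexdigest()[:10]


def toy_context(saltax_mode):
    # Only the toy peaklets plugin sees the (hypothetical) mode option
    return {
        "raw_records": {"version": "0.0.0", "config": {}, "depends_on": []},
        "records": {"version": "0.1.0", "config": {}, "depends_on": ["raw_records"]},
        "peaklets": {
            "version": "0.2.0",
            "config": {"saltax_mode": saltax_mode},
            "depends_on": ["records"],
        },
    }


salt, simu = toy_context("salt"), toy_context("simu")
same_records = lineage_hash("records", salt) == lineage_hash("records", simu)
same_peaklets = lineage_hash("peaklets", salt) == lineage_hash("peaklets", simu)
print(same_records, same_peaklets)
```

In this toy model the `records` hashes agree and the `peaklets` hashes differ, which is the same pattern as the `records` keys shown above.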

In [12]:
dtypes = [
    "microphysics_summary",
    "raw_records_simu",
    "records",
    "peaklets",
    "peak_basics",
    "events",
    "event_basics",
    "event_info",
    "cuts_basic",
]
for dt in dtypes:
    st_salt.make(run_id, dt, save=dt)
for dt in dtypes:
    st_simu.make(run_id, dt, save=dt)

ValueError: The global version is set to be global_v18. But InverseS2WidthCutLowER is still using ONLINE version config diffusion_constant, which is xedocs://electron_diffusion_ctes?attr=value&run_id=plugin.run_id&version=ONLINE.

Let's take a quick look.

In [None]:
events_simu = st_simu.get_array(run_id, "event_info")
events_salt = st_salt.get_array(run_id, "event_info")

In [None]:
plt.figure(dpi=150)
plt.scatter(events_salt["cs1"], events_salt["cs2"], alpha=0.5, label="Sprinkled Dataset")
plt.scatter(events_simu["cs1"], events_simu["cs2"], alpha=0.5, label="Simulated Dataset")
plt.legend()
plt.xlim(0, 100)
plt.ylim(0, 6000)
plt.xlabel("cS1 [PE]")
plt.ylabel("cS2 [PE]")
plt.show()

In an ideal world without ambience interference, every orange dot would fully overlap with a blue dot. However, the plot suggests that this is not the case. You are now starting to see what ambience interference is. See [here](https://xe1t-wiki.lngs.infn.it/doku.php?id=lanqing:ambience_interference_and_sprinkling) for details.
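One way to put a number on the overlap is to match each simulated event to a sprinkled event by time. The sketch below is a simplified stand-in for a proper matching routine: the helper `match_by_time` is hypothetical, the intervals are made up, and it assumes the sprinkled events are sorted by start time and do not overlap each other.

```python
from bisect import bisect_left


def match_by_time(simu_start, simu_end, salt_start, salt_end):
    """For each simulated event, return the index of an overlapping sprinkled event, or -1.

    Assumes sprinkled events are sorted by start time and do not overlap each other.
    If several sprinkled events overlap one simulated event, the last one is returned.
    """
    matches = []
    for start, end in zip(simu_start, simu_end):
        # Last sprinkled event that starts before this simulated event ends
        i = bisect_left(salt_start, end) - 1
        if i >= 0 and salt_end[i] > start:
            matches.append(i)
        else:
            matches.append(-1)
    return matches


# Toy intervals in ns (made up for illustration)
simu_start, simu_end = [100, 500, 900], [200, 600, 1000]
salt_start, salt_end = [90, 300, 950], [210, 400, 1100]

matches = match_by_time(simu_start, simu_end, salt_start, salt_end)
matched_fraction = sum(m >= 0 for m in matches) / len(matches)
print(matches, matched_fraction)
```

With real data, the `time` and `endtime` columns of `events_simu` and `events_salt` could be passed in the same way. Comparing `cs1` and `cs2` of the matched pairs then shows how far the sprinkled reconstruction drifts from the simulation-only one.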