# Xedocs user guide

This tutorial is a basic introduction to xedocs. Xedocs replaces CMT (the Correction Management System) and will be used to store and access correction data, as well as other metadata needed for the XENONnT experiment. The main goal of xedocs was a system that meets the following requirements: the data is versioned and flexible enough to adapt to future changes, insertions follow well-defined rules, and time-dependence requirements are respected.

In [1]:
import strax
import straxen
import xedocs as xd
import numpy as np
import rframe
import pymongo
import matplotlib.pyplot as plt

In [2]:
straxen.print_versions('strax straxen rframe xedocs'.split())

,module,version,path,git
0,python,3.9.18,/opt/XENONnT/anaconda/envs/XENONnT_2024.02.1/bin/python,
1,strax,1.6.1,/opt/XENONnT/anaconda/envs/XENONnT_2024.02.1/lib/python3.9/site-packages/strax,
2,straxen,2.2.1,/opt/XENONnT/anaconda/envs/XENONnT_2024.02.1/lib/python3.9/site-packages/straxen,
3,rframe,0.2.20,/opt/XENONnT/anaconda/envs/XENONnT_2024.02.1/lib/python3.9/site-packages/rframe,
4,xedocs,0.2.26,/home/gvolta/XENONnT/xedocs/xedocs,branch:ONLINE_corrections | 68c15d3


With xedocs we can use schemas to get different corrections from multiple sources, such as bodega (referred to as 'detector numbers'), the xedocs database, or even your own! Schemas are Python classes with properties and functions that handle the backend of storing data, and that enforce rules such as preventing the deletion and overwriting of existing data.
<br>
Let's look at some of the schemas that are available!

In [3]:
xd.list_schemas()

['detector_numbers',
 'context_configs',
 'plugin_lineages',
 'context_lineages',
 'historian_tags',
 'fax_configs',
 'electron_drift_velocities',
 'electron_drift_time_gates',
 'electron_lifetimes',
 'rel_extraction_effs',
 'fdc_maps',
 'hit_thresholds',
 'pmt_area_to_pes',
 'posrec_models',
 's1_aft_xyz_maps',
 's1_xyz_maps',
 's2_xy_maps',
 'se_gains',
 'electron_diffusion_ctes',
 'baseline_samples_nv',
 'relative_light_yield',
 'avg_se_gains',
 'bayes_models',
 'som_classifiers',
 'hotspot_veto_thresholds',
 'photoionization_strengths',
 's2_pattern_maps',
 's1_pattern_maps',
 'z_bias_maps',
 'cs2_bottom_top_ratios',
 'utube_calibrations',
 'diffused_calibrations',
 'ibelt_calibrations',
 'calibration_sources',
 'hotspot_reports',
 'anode_ramps',
 'anode_washes',
 'abnormal_daq_rates',
 'processing_requests',
 'pmt_gain_calculations',
 'pmt_voltage_changes',
 'pmt_installs',
 'pmt_voltage_settings']
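
To make the idea of "a schema with rules" more concrete, here is a minimal sketch in plain Python. It is not how xedocs is implemented: the names `ToyCorrection` and `ToyStore` are made up for this example, and the index is reduced to a version and a run_id. It only illustrates the kind of rule described above, namely that stored data can be neither overwritten nor deleted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCorrection:
    """Hypothetical stand-in for a schema: two index fields and one value."""
    version: str
    run_id: str
    value: float


class ToyStore:
    """Hypothetical append-only storage that enforces insertion rules."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        key = (doc.version, doc.run_id)
        if key in self._docs:
            # Rule: existing data may never be overwritten.
            raise ValueError(f"document with index {key} already exists")
        self._docs[key] = doc

    def delete(self, version, run_id):
        # Rule: stored data may never be deleted.
        raise PermissionError("deleting documents is not allowed")

    def find(self, **labels):
        # Return every document whose fields match all given labels.
        return [d for d in self._docs.values()
                if all(getattr(d, k) == v for k, v in labels.items())]


store = ToyStore()
store.insert(ToyCorrection(version="test*", run_id="047493", value=1.0))
try:
    # Same index, different value: the store refuses to overwrite.
    store.insert(ToyCorrection(version="test*", run_id="047493", value=2.0))
except ValueError as err:
    print("rejected:", err)
```

The first value stays in place after the rejected second insert. A real schema additionally deals with validation and the database backend, which this sketch leaves out.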

### Accessing Data

The development database is one in which everyone can enter their own corrections; it also holds all the values for each correction. To avoid confusion, label your version as something that cannot possibly be mistaken for a real correction. Here I will use 'test*'.

In [4]:
xd_db_dev = xd.development_db()

The straxen database, on the other hand, holds the real values of all corrections. Data can only be inserted into it after other members of the collaboration have agreed that the value should be inserted.

In [5]:
xd_db_stx = xd.straxen_db()

The data can be returned to the user in a variety of formats, for example as a pandas dataframe:

#### Dataframe format

In [6]:
xd_db_stx.electron_lifetimes.find_df(version='ONLINE')

version,time,created_date,comments,value
ONLINE,2017-01-01 00:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,2.000000e+05
ONLINE,2020-10-14 00:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,7.125580e+04
ONLINE,2020-10-14 06:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,6.927400e+04
ONLINE,2020-10-14 12:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,6.903980e+04
ONLINE,2020-10-14 18:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,6.819510e+04
ONLINE,...,...,...,...
ONLINE,2022-09-30 03:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06
ONLINE,2022-09-30 09:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06
ONLINE,2022-09-30 15:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06
ONLINE,2022-09-30 21:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06


In [7]:
elife = xd_db_stx.electron_lifetimes.find_df(version='ONLINE')

In [8]:
elife

version,time,created_date,comments,value
ONLINE,2017-01-01 00:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,2.000000e+05
ONLINE,2020-10-14 00:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,7.125580e+04
ONLINE,2020-10-14 06:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,6.927400e+04
ONLINE,2020-10-14 12:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,6.903980e+04
ONLINE,2020-10-14 18:00:00+00:00,2023-02-09 15:17:26.958000+00:00,,6.819510e+04
ONLINE,...,...,...,...
ONLINE,2022-09-30 03:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06
ONLINE,2022-09-30 09:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06
ONLINE,2022-09-30 15:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06
ONLINE,2022-09-30 21:06:03+00:00,2023-02-09 15:17:26.958000+00:00,,8.180421e+06


#### List format

In [None]:
elife = xd_db_stx.electron_lifetimes.find(version='ONLINE')

In [None]:
list(elife)[:3]

#### Dict format

In [None]:
elife = xd_db_stx.electron_lifetimes.find_dicts(version='ONLINE')

In [None]:
elife[:3]

#### "Docs" format

In [None]:
elife = xd_db_stx.electron_lifetimes.find_docs(version='ONLINE')

In [None]:
elife[:3]

#### You can just get one data point if that is all you want!

In [None]:
elife = xd_db_stx.electron_lifetimes.find_one(version='ONLINE')

In [None]:
elife

##### You can pass many different arguments as search parameters to make your scope as narrow or as wide as you want. You can also use the _sort option if you want the data returned in a particular order.

In [None]:
rel_ext_eff = xd_db_stx.rel_extraction_effs.find_docs(version='v3')

In [None]:
rel_ext_eff[:3]

In [None]:
# Let's sort by time and only get the ab partition
rel_ext_eff = xd_db_stx.rel_extraction_effs.find_docs(version='v3', partition='ab', _sort='time')

In [None]:
rel_ext_eff[-3:]

In [None]:
rel_ext_eff = xd_db_stx.rel_extraction_effs.find_docs(version='v3', 
                                                  run_id = '027434', 
                                                  partition = 'ab')

In [None]:
rel_ext_eff
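
Corrections are indexed by time, so a query by run_id first has to be translated into a time. The sketch below shows one simple way such a lookup could work. It rests on two assumptions made purely for illustration: that the time of the run is already known, and that the value valid at a given time is the most recent entry at or before it. Xedocs resolves run times itself, and some schemas may interpolate between entries instead. The three sample values are copied from the 'ONLINE' electron lifetime table shown earlier; `value_at` and `run_time` are hypothetical names.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# (time, value) pairs sorted by time, copied from the ONLINE table above.
samples = [
    (datetime(2020, 10, 14, 0, tzinfo=timezone.utc), 71255.80),
    (datetime(2020, 10, 14, 6, tzinfo=timezone.utc), 69274.00),
    (datetime(2020, 10, 14, 12, tzinfo=timezone.utc), 69039.80),
]


def value_at(samples, when):
    """Return the value of the latest sample at or before `when`."""
    times = [t for t, _ in samples]
    pos = bisect_right(times, when)
    if pos == 0:
        raise LookupError("no sample at or before the requested time")
    return samples[pos - 1][1]


# Hypothetical run time; in practice this comes from the run database.
run_time = datetime(2020, 10, 14, 7, 30, tzinfo=timezone.utc)
print(value_at(samples, run_time))
```

With these inputs the lookup picks the 06:00 entry, since it is the last one before 07:30.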

##### Data stored in xedocs has multiple indices, and the combination of all indices must be unique. You can therefore only save new data when its combination of index values does not already exist.
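
Before uploading a batch of new values, it can help to check locally that no two of them share the same combination of index values. The snippet below is only a sketch of such a check in plain Python. The documents are made-up dictionaries, `duplicated_indices` is a hypothetical helper, and the index is assumed to consist of `version` and `run_id` only.

```python
from collections import Counter

# Hypothetical candidate documents; "version" and "run_id" play the role of the index.
candidates = [
    {"version": "test*", "run_id": "047493", "value": 1.0},
    {"version": "test*", "run_id": "047494", "value": 1.1},
    {"version": "test*", "run_id": "047493", "value": 1.2},  # same index as the first
]


def duplicated_indices(docs, index_fields=("version", "run_id")):
    """Return the index tuples that occur more than once in `docs`."""
    counts = Counter(tuple(d[f] for f in index_fields) for d in docs)
    return [key for key, n in counts.items() if n > 1]


print(duplicated_indices(candidates))
```

Any tuple reported here would collide on upload, so the batch should be cleaned up first.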

We can also access the Bodega data through schemas using 'DetectorNumber'

In [None]:
xd_db_stx.detector_numbers.find_one(field='g1')

In [None]:
se_gain = xd_db_stx.detector_numbers.find(field = 'se_gain')

In [None]:
list(se_gain)

### Saving data using xedocs

To save new data using xedocs we need to know which indices that particular schema has. By default, all schemas require a version and a run_id or time as indices, plus a value to actually upload. However, other schemas can have additional indices that you need to specify, such as the algorithm for machine-learning-related corrections ('mlp', 'cnn' or 'gcn') or the format of the data, among others. So we first need to find out which indices are required, and then we can upload the data!
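
As a rough illustration of these requirements, the following sketch checks a candidate document, written as a plain dictionary, for the default fields named above plus any extra indices a schema might need. `missing_fields` is a hypothetical helper and not part of xedocs, which performs its own validation when a document is created.

```python
def missing_fields(doc, extra_indices=()):
    """List what a candidate document still lacks before it can be uploaded."""
    missing = []
    if "version" not in doc:
        missing.append("version")
    # Either a run_id or a time is enough to place the document in time.
    if "run_id" not in doc and "time" not in doc:
        missing.append("run_id or time")
    if "value" not in doc:
        missing.append("value")
    # Schema-specific indices, e.g. "algorithm" for machine learning corrections.
    missing.extend(name for name in extra_indices if name not in doc)
    return missing


print(missing_fields({"version": "test*", "value": 1.0}))
print(missing_fields({"version": "test*", "run_id": "047493", "value": 1.0},
                     extra_indices=("algorithm",)))
```

An empty list means the document carries everything this simplified check looks for.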

For this set of corrections we will save the data to the development db.

In [9]:
run_id = '047493'

In [10]:
# Get the ONLINE elife for our run_id
elife_online = xd_db_stx.electron_lifetimes.find_docs(version = 'ONLINE', run_id = run_id)

In [11]:
list(xd_db_dev.electron_lifetimes.find(run_id = run_id))

[{'version': 'v6',
  'created_date': datetime.datetime(2023, 5, 19, 14, 59, 55, 878000, tzinfo=<UTC>),
  'comments': '',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 26141032.007023267},
 {'version': 'v7',
  'created_date': datetime.datetime(2023, 2, 9, 15, 17, 26, 958000, tzinfo=<UTC>),
  'comments': '',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 23933929.779056724},
 {'version': 'v8',
  'created_date': datetime.datetime(2023, 5, 19, 14, 59, 56, 820000, tzinfo=<UTC>),
  'comments': '',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 23933929.779056724},
 {'version': 'v9',
  'created_date': datetime.datetime(2023, 12, 21, 17, 6, 54, 933000, tzinfo=<UTC>),
  'comments': 'v9 EL version',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 24203724.610794935},
 {'version': 'test*',
  'created_date': datetime.datetime(2023, 1, 24, 1

In [12]:
elife_new = xd.schemas.ElectronLifetime(value=elife_online[0].value*0.90, version='test3*', 
                                        run_id=run_id, datasource = 'development_db',
                                        comments='giving a 10% decrease to Electron lifetime')

In [17]:
elife_new


        Xenon ElectronLifetime Document
        -------------------------------
 
        Category:      corrections
        Alias:         electron_lifetimes
        Index:         version=test*3, time=2022-09-13 11:14:46.741000+00:00
        Values:        created_date=2024-04-05 14:38:27.452000+00:00, comments=giving a 10% decrease to Electron lifetime, value=6544337.031248
        

In [13]:
xd_db_dev.electron_lifetimes.insert(elife_new)

ServerSelectionTimeoutError: No primary available for writes, Timeout: 30s, Topology Description: <TopologyDescription id: 66100caed949d86b2a548aa7, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('fried.rice.edu', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('fried.rice.edu:27017: timed out')>, <ServerDescription ('xenon-rundb.grid.uchicago.edu', 27017) server_type: RSSecondary, rtt: 0.013266996592283248>, <ServerDescription ('xenon1t-daq.lngs.infn.it', 27015) server_type: Unknown, rtt: None, error=NetworkTimeout('xenon1t-daq.lngs.infn.it:27015: timed out')>, <ServerDescription ('xenon1t-daq.lngs.infn.it', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('xenon1t-daq.lngs.infn.it:27017: timed out')>, <ServerDescription ('xenon1t-daq.lngs.infn.it', 27018) server_type: Unknown, rtt: None, error=NetworkTimeout('xenon1t-daq.lngs.infn.it:27018: timed out')>]>

In [14]:
list(xd_db_dev.electron_lifetimes.find(run_id = run_id))

[{'version': 'v6',
  'created_date': datetime.datetime(2023, 5, 19, 14, 59, 55, 878000, tzinfo=<UTC>),
  'comments': '',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 26141032.007023267},
 {'version': 'v7',
  'created_date': datetime.datetime(2023, 2, 9, 15, 17, 26, 958000, tzinfo=<UTC>),
  'comments': '',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 23933929.779056724},
 {'version': 'v8',
  'created_date': datetime.datetime(2023, 5, 19, 14, 59, 56, 820000, tzinfo=<UTC>),
  'comments': '',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 23933929.779056724},
 {'version': 'v9',
  'created_date': datetime.datetime(2023, 12, 21, 17, 6, 54, 933000, tzinfo=<UTC>),
  'comments': 'v9 EL version',
  'time': datetime.datetime(2022, 9, 13, 11, 14, 46, 741000, tzinfo=<UTC>),
  'value': 24203724.610794935},
 {'version': 'test*',
  'created_date': datetime.datetime(2023, 1, 24, 1

In [15]:
elife_new = xd.schemas.ElectronLifetime(value=elife_online[0].value*0.80, version='test*3', 
                                        run_id=run_id, datasource = 'development_db',
                                        comments='giving a 10% decrease to Electron lifetime')

In [16]:
# there is another way to save this data
xd.insert_docs('electron_lifetimes', elife_new, 'development_db')

ServerSelectionTimeoutError: No primary available for writes, Timeout: 30s, Topology Description: <TopologyDescription id: 66100ce4d949d86b2a548aaa, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('fried.rice.edu', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('fried.rice.edu:27017: timed out')>, <ServerDescription ('xenon-rundb.grid.uchicago.edu', 27017) server_type: RSSecondary, rtt: 0.013740354090929033>, <ServerDescription ('xenon1t-daq.lngs.infn.it', 27015) server_type: Unknown, rtt: None, error=NetworkTimeout('xenon1t-daq.lngs.infn.it:27015: timed out')>, <ServerDescription ('xenon1t-daq.lngs.infn.it', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('xenon1t-daq.lngs.infn.it:27017: timed out')>, <ServerDescription ('xenon1t-daq.lngs.infn.it', 27018) server_type: Unknown, rtt: None, error=NetworkTimeout('xenon1t-daq.lngs.infn.it:27018: timed out')>]>

In [None]:
list(xd_db_dev.electron_lifetimes.find(run_id = run_id))

#### Now to save data in our own database!

You will not have access to the config file below. However, if you have MongoDB installed locally, you can omit the host, username and password information and simply connect to your local MongoDB. You can also call the database whatever you want.

In [None]:
import config

host = config.mongo_rdb_url
username = config.mongo_rdb_username
password = config.mongo_rdb_password

In [None]:
db = pymongo.MongoClient(host = host,
                        username = username,
                        password = password)['correctionsSandbox']
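
If you do not have the config file, one possible way to keep the same code working is to read the connection settings from environment variables and leave out whatever is not set. The sketch below assumes hypothetical variable names (`MONGO_HOST`, `MONGO_USER`, `MONGO_PASSWORD`) and two made-up helpers, `connection_kwargs` and `sandbox_db`. When no settings are given, `pymongo.MongoClient()` falls back to its default of a local server on port 27017.

```python
import os


def connection_kwargs(env=None):
    """Collect only the connection settings that are actually set."""
    env = os.environ if env is None else env
    # Hypothetical environment variable names, chosen for this example.
    names = {"host": "MONGO_HOST",
             "username": "MONGO_USER",
             "password": "MONGO_PASSWORD"}
    return {arg: env[var] for arg, var in names.items() if env.get(var)}


def sandbox_db(env=None, name="correctionsSandbox"):
    """Open the sandbox database; with no settings pymongo uses localhost."""
    import pymongo  # imported here so connection_kwargs works without pymongo
    return pymongo.MongoClient(**connection_kwargs(env))[name]


# Nothing set: an empty dict, so MongoClient() would use its local defaults.
print(connection_kwargs({}))
```

Calling `sandbox_db()` would then take the place of the `db = ...` cell above.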

In [None]:
xd.schemas.ElectronLifetime.find(version = 'test*', datasource = db['electron_lifetimes'])

In [None]:
new_elife = xd.schemas.ElectronLifetime(value = 123456, version = 'test*', run_id = run_id)

In [None]:
new_elife.save(db['electron_lifetimes'])

In [None]:
xd.schemas.ElectronLifetime.find(version = 'test*', datasource = db['electron_lifetimes'])

Success! We have inserted data into our own Mongo database!