# Pipeline Context and Domain Objects

## Overview

The **Pipeline Context** maintains the current state of a data processing session. It also serves as a container for metadata describing the observing project and individual datasets. Additionally, it encapsulates logical representations of abstract or tangible concepts from the instrument, operations, and generic astronomy/physics domains, using Pipeline Python **Domain Objects**. The Pipeline Context and Domain Object classes form the backbone of the pipeline infrastructure. Combined with PipelineTask concepts, interfaces, and weblog generation, they provide the foundation for the Pipeline Package.

## Use Cases

### Life Cycle

[context]: ../_autosummary/pipeline.infrastructure.launcher.Context.rst#pipeline.infrastructure.launcher.Context
[hifa_importdata]: ../_autosummary/pipeline.hifa.cli.hifa_importdata.rst#pipeline.hifa.cli.hifa_importdata
[results]: apisummary.html#results-classes
[h_init]: ../_autosummary/pipeline.h.cli.h_init.rst#pipeline.h.cli.h_init
[hifa_flagdata]: ../_autosummary/pipeline.hifa.cli.hifa_flagdata.html#pipeline.hifa.cli.hifa_flagdata
[h_save]: ../_autosummary/pipeline.h.cli.h_save.html#pipeline.h.cli.h_save
[h_resume]: ../_autosummary/pipeline.h.cli.h_resume.html#pipeline.h.cli.h_resume
[observingrun]: ../_automodapi/pipeline.domain.ObservingRun.rst#pipeline.domain.ObservingRun
[ms]: ../_automodapi/pipeline.domain.ObservingRun.rst#pipeline.domain.ObservingRun
[obsrun]: ../_automodapi/pipeline.domain.ObservingRun.rst#pipeline.domain.ObservingRun
[projsummary]: ../_autosummary/pipeline.infrastructure.project.ProjectSummary.rst#pipeline.infrastructure.project.ProjectSummary
[projstruct]: ../_autosummary/pipeline.infrastructure.project.ProjectStructure.rst#pipeline-infrastructure-project-projectstructure

From a technical perspective, the [Context][context] of a pipeline processing session (PPS) is a Python class instance whose attributes link to pipeline domain class objects.

The [context][context] instance is created when a processing session is initialized. Project-level and Execution Block (EB)-level metadata are extracted from the on-disk datasets during the pipeline [h*_importdata][hifa_importdata] stage, and pipeline domain objects are constructed as Pythonic representations of real-world concepts in the observations (e.g. scans, spectral windows, antennas). The context also initializes various attributes/properties and begins bookkeeping the processing state of the session, including:

- The pipeline calibration state
- A tree of [Results][results] class objects summarizing data processing and quality assurance results for each pipeline stage
- Various internal pipeline variables and objects for configurations, etc.

The pipeline stage/context interaction overview (created by V. Geers at the 2022 Pipeline F2F meeting) illustrates the stage-by-stage context update process:

![context and stages](context_stages.png)

* A new context object is created with the Pipeline Task [h_init()][h_init]
* During the execution of [hifa_importdata()][hifa_importdata], measurements are attached to the context
* After accepting the results, the pipeline moves to the next stage, [hifa_flagdata()][hifa_flagdata]

The Pipeline context can be saved (serialized) to disk using [h_save()][h_save] and restored (deserialized) using [h_resume()][h_resume], although in most cases tasks interact directly with the in-memory instance. In a standard workflow, the end-state context is always present in the working directory and is updated each time a stage completes.

<div class="alert alert-info">
stage: an execution of a pipeline task under a specific user input, including data processing control parameters and the current Pipeline context.
</div>
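
The save/resume cycle described above can be illustrated with a minimal, self-contained sketch. `ToyContext`, `save_context`, `resume_context`, and `run_stage` are hypothetical stand-ins, not pipeline classes or tasks; the only idea carried over from the text is that the context is a Python object that is pickled to disk each time a stage completes.

```python
import pickle
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ToyContext:
    """Hypothetical stand-in for a pipeline context (not the real class)."""
    name: str
    stage: int = 0
    results: list = field(default_factory=list)


def save_context(ctx: ToyContext, directory: Path) -> Path:
    """Serialize the context to '<name>.context' inside `directory`."""
    path = directory / f"{ctx.name}.context"
    with open(path, "wb") as fd:
        pickle.dump(ctx, fd)
    return path


def resume_context(path: Path) -> ToyContext:
    """Deserialize a previously saved context."""
    with open(path, "rb") as fd:
        return pickle.load(fd)


def run_stage(ctx: ToyContext, taskname: str, directory: Path) -> Path:
    """Execute one toy 'stage': record its result, then persist the end state."""
    ctx.stage += 1
    ctx.results.append((ctx.stage, taskname))
    return save_context(ctx, directory)


workdir = Path(tempfile.mkdtemp())
toy_ctx = ToyContext(name="toy-session")
for task in ("hifa_importdata", "hifa_flagdata"):
    saved_path = run_stage(toy_ctx, task, workdir)

restored = resume_context(saved_path)
print(restored.stage, restored.results)
```

The restored object is an independent copy of the in-memory one, which is why tasks normally keep working on the live instance and treat the on-disk file as a checkpoint.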

### Contents

In general, pipeline `Context` and `Domain` objects serve the following major purposes (note that we use `ctx` to denote an example `Context` object, see the code snippet at the end of this documentation):

* As the container for multi-level hierarchical metadata of observing projects, observing sessions, and execution blocks from online systems, e.g.
    * [ctx.observing_run][observingrun]: Metadata about the observing run being processed
    * [ctx.observing_run.measurement_sets][ms]: A list of pipeline `MeasurementSet` domain objects
    * [ctx.observing_run.virtual_science_spw_ids|names|shortnames][obsrun]: Information on virtual spectral windows across the observing run
    * `ctx.observing_run.start|end_time|datetime`: Start/end time (and date) of the run (based on the first/last MS)
    * `ctx.observing_run.project_ids|schedblock_ids|execblock_ids|observers`: Sets of unique project / scheduling block / execution block IDs or observers for the run
    * [ctx.project_summary][projsummary], [ctx.project_structure][projstruct], [ctx.project_performance_parameters](../_autosummary/pipeline.infrastructure.project.PerformanceParameters.rst#pipeline.infrastructure.project.PerformanceParameters): Information about the observing project and imaging performance goals.

  These are either ingested from the pipeline processing request (PPR, the official interface for pipeline execution) or harvested from the on-disk datasets in ASDM or MeasurementSet format.
  They are typically read-only and populated upon the ingestion of new data/metadata (e.g. [h*_importdata][hifa_importdata]), but can be updated when the on-disk dataset changes. For example, if a new CASA MS data column is created to store self-calibrated visibility data for field A and spw B, the pipeline updates the Python domain objects to reflect the latest datatype information, keeping the bookkeeping up to date without writing that information back into the CASA data itself. Essentially, the Python-layer domain objects serve as an add-on holding complementary data processing metadata outside of the CASA table format.

* To provide Pythonic methods (sometimes with in-memory caching) for fast access to the metadata of individual CASA MeasurementSets (within the scope of the pipeline domain objects), avoiding repeated queries through the `casatools` msmd tool, e.g. `ms.get_scans()` for an `ms` in `ctx.observing_run.measurement_sets`. Parts of this design may predate the introduction of `casatools.msmetadata`.

* For processing session configuration and state tracking, transferring state from stage to stage:

    * [ctx.calimlist](../_automodapi/pipeline.infrastructure.launcher.Context.rst#pipeline.infrastructure.launcher.Context.calimlist), [ctx.sciimlist](../_automodapi/pipeline.infrastructure.launcher.Context.rst#pipeline.infrastructure.launcher.Context.sciimlist), `ctx.rmsimlist`, `ctx.subimlist`: Lists of images that have been computed in the (calibrator, science, RMS, cutout) imaging stages.
    * [ctx.callibrary](../_autosummary/pipeline.infrastructure.callibrary.IntervalCalLibrary.rst#pipeline.infrastructure.callibrary.IntervalCalLibrary): Pipeline calibration state
    * [ctx.products_dir](../_autosummary/pipeline.infrastructure.launcher.Context.rst#pipeline.infrastructure.launcher.Context.output_dir), [ctx.report_dir](../_autosummary/pipeline.infrastructure.launcher.Context.html#pipeline.infrastructure.launcher.Context.report_dir), [ctx.output_dir](../_autosummary/pipeline.infrastructure.launcher.Context.html#pipeline.infrastructure.launcher.Context.output_dir): Directories to store Pipeline output
    * `ctx.clean_targets`: List of imaging targets (the images to be produced)
    * `ctx.selfcal_targets`: Self-calibration targets and outcomes
    * Recipes, processing job names, and the stage currently being executed
    * Configuration and state of the pipeline processing job and workflow instructions, e.g. what has been done so far
    * `ctx.stage`: Current stage / last stage
    * Information left for cross-stage communication

* To aggregate results from individual pipeline processing stages, each being an execution of a pipeline task:
    * [ctx.results](../_autosummary/pipeline.infrastructure.Context.rst#pipeline.infrastructure.Context.results): List of references to task results (stored on disk). These include the outcome and necessary information of the actual heuristics-driven data processing (e.g. statistics) and the quality assurance (QA) scoring used for weblog reporting.
    * Results are Python class objects: a mixture of what we traditionally call "metadata" and various live objects (classes/functions/arrays).

* To cache certain data properties derived by computationally intensive tasks, avoiding repeated passes over the data, e.g. `ctx.per_spw_cont_sensitivities_all_chan`, `ctx.synthesized_beams`, and spectral window maps over the observing campaign.

* As a channel for inter-stage communication (i.e. stage B needs certain information from stage A). As this use case has grown, private attributes have been added on demand as a backdoor channel inside the context, e.g. `ctx.selfcal_targets`, or `ctx.vla_skip_mfs_and_cube_imaging` for late cross-stage decision making. In some instances a large array (e.g. statistics or even "images") is attached to the context, which causes trouble: np.arrays of order GiB end up under the Python object and are serialized to disk (see PIPE-1698). A ticket exists to document the current uses and misuses of the context (PIPE-2160); there may be some movement on it before this year's PL F2F meeting.

* To provide Pythonic methods/properties for accessing and manipulating the context directly, and for driving logic decisions from the current processing state, e.g.
  * [ctx.get_oussid()](../_autosummary/pipeline.infrastructure.launcher.Context.rst#pipeline.infrastructure.launcher.Context.get_oussid): Returns the parent OUS “ousstatus” name
  * [ctx.get_recipe_name()](../_autosummary/pipeline.infrastructure.launcher.Context.rst#pipeline.infrastructure.launcher.Context.get_recipe_name): Returns the recipe name from the project structure
  * `ctx.save()`: Saves a copy of the context to disk
  * [ctx.vla_skip_mfs_and_cube_imaging()](../_automodapi/pipeline.infrastructure.launcher.Context.html#pipeline.infrastructure.launcher.Context.vla_skip_mfs_and_cube_imaging): Checks stage skipping condition for VLA specmode=mfs/cube imaging workflow
  * [ctx.observing_run.add_measurement_set()](../_automodapi/pipeline.domain.ObservingRun.html#pipeline.domain.ObservingRun.add_measurement_set): Register MS object with run
  * [ctx.observing_run.get_ms()](../_automodapi/pipeline.domain.ObservingRun.html#pipeline.domain.ObservingRun.get_ms): Retrieve MS by name (or intent)
  * `ctx.observing_run.get_measurement_sets()`: Retrieve MSes filtered by names, fields, intents
  * `ctx.observing_run.get_measurement_sets_of_type()`: Retrieve MSes filtered by name, data type, source name, spw
  * [ctx.observing_run.get_real_spw_id_by_name(spw_name, target_ms)](../_automodapi/pipeline.domain.ObservingRun.html#pipeline.domain.ObservingRun.get_real_spw_id_by_name): Translate spw name to real spw ID for given MS
  * `ctx.observing_run.get_virtual_spw_id_by_name(spw_name)`: Translate spw name to virtual spw ID for run
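
The in-memory caching idea mentioned in the list above can be sketched as follows. This is an illustration only: `ToyMeasurementSet` and `fake_table_reader` are hypothetical names, and the real domain classes may cache metadata differently. The point is that the expensive read happens once, and later filtered queries reuse the cached Python objects.

```python
from functools import cached_property

read_count = 0  # counts how often the "expensive" read is performed


def fake_table_reader(ms_name):
    """Stands in for a slow metadata query against the MS on disk."""
    global read_count
    read_count += 1
    return [
        {"id": 1, "intents": {"BANDPASS"}},
        {"id": 2, "intents": {"PHASE"}},
        {"id": 3, "intents": {"TARGET"}},
    ]


class ToyMeasurementSet:
    """Hypothetical domain object; not the pipeline's MeasurementSet."""

    def __init__(self, name, table_reader):
        self.name = name
        self._table_reader = table_reader

    @cached_property
    def scans(self):
        # Evaluated on first access only; the result is stored on the instance.
        return self._table_reader(self.name)

    def get_scans(self, scan_intent=None):
        """Return all scans, or only those covering `scan_intent`."""
        if scan_intent is None:
            return list(self.scans)
        return [s for s in self.scans if scan_intent in s["intents"]]


toy_ms = ToyMeasurementSet("toy.ms", fake_table_reader)
phase_ids = [s["id"] for s in toy_ms.get_scans(scan_intent="PHASE")]
all_ids = [s["id"] for s in toy_ms.get_scans()]
print(phase_ids, all_ids, read_count)
```

The trade-off of this pattern is that the cached values must be refreshed explicitly whenever the on-disk dataset changes, which is the bookkeeping burden described earlier in this section.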

The full description of the key properties/methods of `Context`, `ObservingRun`, and `MeasurementSet` can be found below:

* [Context](../_automodapi/pipeline.domain.ObservingRun.rst)
* [Domain object classes](../apisummary.rst#module-pipeline.domain)
  * [ObservingRun](../_automodapi/pipeline.domain.ObservingRun.rst)
  * [AntennaArray](../_automodapi/pipeline.domain.AntennaArray.rst)
  * [Scan](../_autosummary/pipeline.domain.Scan.rst) 
  * [SpectralWindow](../_automodapi/pipeline.domain.SpectralWindow.rst)
  * [Source](../_automodapi/pipeline.domain.Source.rst)

<p></p>


## Limitations

* Lack of official API status and backward compatibility
    
    The context class and domain object classes are not officially designated or maintained as public APIs. Consequently, their interfaces and definitions are subject to frequent changes across releases or even within a single development cycle, with no assurance of backward compatibility.

    These changes to the underlying implementation can cause issues with serialization and deserialization. However, as the context primarily serves as a transient runtime or session object and data pool, such disruptions generally result in only minor inconveniences during the development process.

  * Serialization/Deserialization Challenges

      The pipeline context can be serialized and deserialized with pickle for specific use cases, such as saving and resuming sessions, debugging, and MPI process communication. In the last case, shared storage is used as a workaround for certain limitations of our use of the OpenMPI messaging protocol via [casampi](https://casadocs.readthedocs.io/en/stable/notebooks/parallel-processing.html).
      Deserialization of an on-disk serialized context is inherently unreliable due to potential changes in the context or domain classes. Currently, serialization is implemented using Python’s pickle, which has well-known limitations:
        
        * Objects that are not compatible with pickle cannot be serialized.
        * Deserialization may fail over time as the structure of the context or domain classes evolves.
        * The context object itself is not network-friendly and cross-node communication relies on a shared filesystem, which lacks support for concurrent writes and introduces additional dependencies.
    
* Scalability Concerns

    As mentioned above, the current Context and domain object implementation is designed for internal use only and is neither a stable interface nor an official API. It is nevertheless a fundamental bottleneck for Pipeline parallelization and workload orchestration.
    The current implementation is not designed for long-term persistence or scalability.

* Documentation Gaps

    The use cases for pipeline context and domain objects have grown significantly over time, but the documentation has not evolved accordingly. This has led to inconsistencies and a lack of systematic clarity, particularly for software developers and contributors involved in heuristics development.

    More critically, the inadequate documentation has discouraged external developers/heuristics contributors from leveraging the existing pipeline context and domain objects for metadata access. Instead, heuristics development has often relied on typical CASA environments, leading to inefficiencies. Substantial duplicate effort has been required to translate these heuristics into the pipeline infrastructure, which could have been avoided with earlier integration.

    This underscores the need for future systems to include well-designed, multi-level public APIs tailored to diverse user groups, including operational scientists, data analysts, developers, domain experts, and general users. Such an approach would ensure clearer guidance, reduce redundancy, and streamline development efforts.
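
The first two pickle limitations listed under "Serialization/Deserialization Challenges" can be reproduced with the standard library alone. `ToyDomainObject` is a hypothetical class used only for this demonstration; the "older release" is emulated by deleting an attribute before pickling.

```python
import pickle
import threading


class ToyDomainObject:
    """Hypothetical class standing in for a context/domain class."""

    def __init__(self, name):
        self.name = name
        self.band = "unknown"  # pretend this attribute was added in a newer release


# 1) An object holding a non-picklable member cannot be serialized.
obj = ToyDomainObject("spw16")
obj.lock = threading.Lock()
try:
    pickle.dumps(obj)
    lock_pickled = True
except (TypeError, pickle.PicklingError):
    lock_pickled = False

# 2) Emulate a pickle written before `band` existed: pickle stores the
#    instance __dict__ and does not call __init__ when loading, so the
#    restored object silently lacks the newer attribute.
old = ToyDomainObject("spw16")
del old.band
payload = pickle.dumps(old)

restored = pickle.loads(payload)
has_band = hasattr(restored, "band")
print(lock_pickled, restored.name, has_band)
```

The second case is the more insidious one: loading succeeds, and the failure only appears later as an `AttributeError` in whichever code path first touches the missing attribute.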

[casatools.measures]: https://casadocs.readthedocs.io/en/stable/api/tt/casatools.measures.html
[astropy.wcs]: https://docs.astropy.org/en/stable/wcs/index.html
[astropy.units.Quantity]: https://docs.astropy.org/en/stable/units/index.html
[casatools.coordsys]: https://casadocs.readthedocs.io/en/stable/api/tt/casatools.coordsys.html
[casatools.quanta]: https://casadocs.readthedocs.io/en/latest/api/tt/casatools.quanta.html
[pipe.measures]: ../_autosummary/pipeline.domain.measures.rst

* Duplication and Compatibility Issues from Legacy Classes and Tools

    Over the years, technical debates and the coexistence of dated standards have led to significant maintenance costs and duplication within the pipeline codebase. This duplication stems from the use of overlapping or redundant "in-house" solutions for managing similar concepts that may be directly offered by off-the-shelf software tools or libraries.

    For instance, in handling physical quantities, the pipeline uses an internal set of [domain objects][pipe.measures], but also uses [CASA Measures][casatools.measures] (via [casatools.quanta][casatools.quanta]) and [astropy.units.Quantity][astropy.units.Quantity]. This often requires translation between these systems, resulting in complex, error-prone code patterns. Similarly, for World Coordinate Systems (WCS), the pipeline mixes the use of CASA coordinate systems ([casatools.coordsys][casatools.coordsys]) with Astropy’s [astropy.wcs][astropy.wcs].

    The maintenance burden of managing these conversions, along with the inconsistencies arising from multiple systems, increases development overhead and the risk of errors. Development heuristics often rely on accessing metadata in a typical CASA environment, bypassing the pipeline’s context and domain object infrastructure, further compounding the redundancy.


   | **Domain** | **Redundant Systems** | **Issues** |
   |-------------------------|---------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
   | **Physical Quantities** | - [pipeline.domain.measures][pipe.measures] <br> - [casatools.measures][casatools.measures] <br> - [astropy.units.Quantity][astropy.units.Quantity] | Redundant definitions require translation between systems, increasing maintenance complexity.     |
   | **World Coordinates** | - CASA Coordinate Systems via ([casatools.coordsys][casatools.coordsys]) <br> - [astropy.wcs][astropy.wcs] | Mixed usage leads to inconsistencies and additional effort for conversions and compatibility. Multiple sets of IERS data must be maintained. |
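
To make the translation burden concrete, the sketch below converts between a CASA-style quantity record (a dict with `unit` and `value` keys) and a toy in-house class. `ToyFrequency`, `from_casa_quantity`, and `to_casa_quantity` are hypothetical and do not reproduce the pipeline's measures classes; only a handful of frequency units are handled, which is exactly the kind of partial, duplicated logic the table above warns about.

```python
from dataclasses import dataclass

# Conversion factors to Hz for the few units used in this sketch.
_TO_HZ = {"Hz": 1.0, "kHz": 1e3, "MHz": 1e6, "GHz": 1e9}


@dataclass(frozen=True)
class ToyFrequency:
    """Hypothetical in-house quantity class: the value is stored in Hz."""
    hertz: float


def from_casa_quantity(q: dict) -> ToyFrequency:
    """Translate a CASA-style quantity dict into the in-house representation."""
    unit = q["unit"]
    if unit not in _TO_HZ:
        raise ValueError(f"unsupported frequency unit: {unit!r}")
    return ToyFrequency(q["value"] * _TO_HZ[unit])


def to_casa_quantity(freq: ToyFrequency, unit: str = "GHz") -> dict:
    """Translate back, expressing the value in the requested unit."""
    if unit not in _TO_HZ:
        raise ValueError(f"unsupported frequency unit: {unit!r}")
    return {"unit": unit, "value": freq.hertz / _TO_HZ[unit]}


centre = from_casa_quantity({"unit": "GHz", "value": 100.81})
round_trip = to_casa_quantity(centre, unit="MHz")
print(centre, round_trip)
```

Every additional representation multiplies the number of such converter pairs, and each pair needs its own unit table and error handling.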


## Future

* Standardizing Context's metadata tracking as a Database with documented API (potentially supporting multiple languages)
 
    With the increasing complexity of asynchronous workflows and longer processing intervals, it is time to redesign the context system. Key steps include:
    
    * Formalizing the schema and separating methods, class methods, attributes, and properties from the actual data used for reporting and statistical aggregation.
    * Introducing a version-controlled database and REST API to manage metadata and persist data efficiently.
    * This transition would also enable asynchronous access from multiple instances, improving scalability and flexibility.

* Transitioning the Context's computational functionality into local libraries/modules, with a separation of metadata and "methods"
    
    Moving to a database system could address current limitations by eliminating loosely defined class instances and methods, and focusing on recording structured data.
    
* Cloud-Friendly Context Format
 
    A cloud-optimized version of the context format could enhance compatibility with cloud storage and distributed systems, facilitating more efficient processing and data access across environments.
    * Additionally, a future state-tracking system would need to separate methods and abstractions to improve serialization compatibility, ensuring long-term storage and processing reliability.

* Compatibility with Legacy Pipeline

    The transition to a database-driven context system would likely require a partial or full rewrite of the Pipeline tasks, as they heavily rely on “context” objects. However, one-to-one translation of legacy pipeline heuristics to a database-based context codebase will be a lengthy and costly process. A possible transitional approach could involve adopting a database backend without fully replacing the existing context system initially:
 
    * Develop a lightweight translation layer that mimics the behavior of the context object and allows legacy pipeline tasks to interact with the new database system.
    * Gradually implement changes to streamline the transition in the context access inside the Pipeline task itself.
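
One possible shape for such a translation layer is sketched below, using SQLite from the standard library as a stand-in for whatever database would actually be chosen. `ContextFacade` and the single key/value table are assumptions made for illustration, not a proposed schema: attribute reads and writes on the facade are turned into database queries, so task code that does `ctx.stage` keeps working.

```python
import json
import sqlite3


class ContextFacade:
    """Hypothetical shim: context-like attribute access backed by a database."""

    def __init__(self, connection):
        # Bypass our own __setattr__ for the one real instance attribute.
        object.__setattr__(self, "_conn", connection)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS context (key TEXT PRIMARY KEY, value TEXT)"
        )

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        if key.startswith("_"):
            raise AttributeError(key)
        row = self._conn.execute(
            "SELECT value FROM context WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise AttributeError(key)
        return json.loads(row[0])

    def __setattr__(self, key, value):
        # Values must be JSON-serializable: plain data, no live objects.
        self._conn.execute(
            "INSERT OR REPLACE INTO context (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self._conn.commit()


conn = sqlite3.connect(":memory:")
facade = ContextFacade(conn)
facade.stage = 14
facade.clean_targets = [{"field": "J1857-0048", "spw": "16"}]

# A second facade on the same connection sees the same state.
other = ContextFacade(conn)
print(other.stage, other.clean_targets)
```

Restricting values to JSON forces the separation of data from methods that the items above call for; live objects such as results classes would need an explicit schema before they could be stored this way.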

<p></p>

## A live example of the pipeline `Context` object

Below, we use [Rich](https://rich.readthedocs.io/en/latest/introduction.html), a Python library, to inspect a live example of a pipeline `Context` instance.
We first read a `Context` object from an on-disk pickle file, then examine the object structure and various attributes.

```python
import os
import pickle

import pipeline


def read_context(context_name='pipeline-procedure_vlassCCIP', stage=None, reaccept=False, result_from_context=True):
    """Load a pipeline context, optionally the saved state after a given stage.

    With reaccept=True, the result of `stage` is re-accepted into the
    context saved at `stage - 1`.
    """
    loglevel = 'info'  # or 'debug'
    plotlevel = 'default'

    def load(path):
        return pipeline.Pipeline(context=path, loglevel=loglevel, plotlevel=plotlevel).context

    saved_state = context_name + '/saved_state'
    if stage is None:
        return load(context_name)
    if not reaccept:
        return load(f'{saved_state}/context-stage{stage}.pickle')

    # QA is performed when a result is accepted into the context;
    # the weblog renderer then steps in.
    if result_from_context:
        result = load(f'{saved_state}/context-stage{stage}.pickle').results[-1].read()
    else:
        with open(f'{saved_state}/result-stage{stage}.pickle', 'rb') as fd:
            result = pickle.load(fd)
    context = load(f'{saved_state}/context-stage{stage - 1}.pickle')
    result.accept(context)
    return context


cwd = '/home/rxue/Workspace/zfs/nrao/tests/projs/csv-3899-small/pipe1669/working'
os.chdir(cwd)

ctx = read_context('pipeline-procedure_hifa_calimage.context')
```

```text
usage: python -m pipeline | paris [-h] [--vlappr VLAPPR] [--ppr PPR]
                                  [--config SESSION] [--logfile LOGFILE]
                                  [--dryrun] [--local]
python -m pipeline | paris: error: unrecognized arguments: --f=/home/rxue/.local/share/jupyter/runtime/kernel-v3be2f44a2e049c1e027adaa730b99a389176adfc7.json


CLI option(s) is not recognizable by the Pipeline package; try "--help"
measurespath = /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata
2024-12-18 22:35:35 INFO: Environment is not MPI enabled. Pipeline operating in single host mode
2024-12-18 22:35:36 INFO: Environment variable FLUX_SERVICE_URL not defined.  Switching to backup url.
2024-12-18 22:35:36 INFO: Environment variable FLUX_SERVICE_URL_BACKUP not defined.
2024-12-18 22:35:36 INFO: Pipeline version 2024.2.0.4+182-g50192f15f2-dirty-PIPE-1669-run-dev-pipeline-with-modular-casa6 running on xenon
2024-12-18 22:35:36 INFO: Host environment:
	CPU: Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (physical cores: 18, logical cores: 36)
	Memory: 251.4 GiB RAM, 16.0 GiB swap
	OS: Ubuntu 22.04.5 LTS (Jammy Jellyfish)
	cgroup limits: 100% of 36 CPU cores, memory limits=N/A
	ulimit limits: CPU time=N/A, memory=N/A, files=1048576
2024-12-18 22:35:36 INFO: Environment as detected by CASA:
	CPUs reported by CASA: 36 cores, max 36 OpenMP threa

2024-12-18 22:35:37	SEVERE	pipeline::pipeline.infrastructure.launcher::casa	Error creating hard link to CASA log
2024-12-18 22:35:37	WARN	pipeline::pipeline.infrastructure.launcher::casa	Reverting to symbolic link to CASA log. This is unsupported!
```


### Context/domain Object Query – Task Results per Stage

```python
import pprint

ctx.get_recipe_name()
ctx.get_oussid()

# Get the stage number and task name for all results
for rp in ctx.results:
    result = rp.read()
    print(f'{result.stage_number}, {result.taskname}')

# Show a specific result and the corresponding task inputs
task_results = ctx.results[14].read()
print(task_results[0])

pprint.pprint(task_results)
```

```text
1, hifa_importdata
2, hifa_flagdata
3, hifa_fluxcalflag
4, hif_rawflagchans
5, hif_refant
6, h_tsyscal
7, hifa_tsysflag
8, hifa_tsysflagcontamination
9, hifa_antpos
10, hifa_wvrgcalflag
11, hif_lowgainflag
12, hif_setmodels
13, hifa_bandpassflag
14, hifa_bandpass
15, hifa_spwphaseup
16, hifa_gfluxscaleflag
17, hifa_gfluxscale
18, hifa_timegaincal
19, hifa_renorm
20, hifa_targetflag
21, hif_applycal
22, hif_makeimlist
23, hif_makeimages
24, hif_makeimlist
25, hif_makeimages
26, hifa_imageprecheck
27, hif_checkproductsize
28, hifa_exportdata
29, hif_mstransform
30, hifa_flagtargets
31, hif_makeimlist
32, hif_findcont
33, hif_uvcontsub
34, hif_makeimages
35, hif_makeimlist
36, hif_makeimages
37, hif_makeimlist
38, hif_makeimages
39, hif_selfcal
40, hif_makeimlist
41, hif_makeimages
42, hif_makeimlist
43, hif_makeimages
44, hif_makeimlist
45, hif_makeimages
46, hif_makeimlist
47, hif_makeimages
48, hifa_exportdata
SpwPhaseupResults:
vis=uid___A002_X1181695_X1c6a4_8ant.ms
	CHECK, J1857-0048
```
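
The `rp.read()` calls in the query above reflect that the context holds lightweight references to results stored on disk rather than the results themselves. A minimal sketch of that pattern follows; `ToyResultsProxy` is a hypothetical class, and plain dicts stand in for the pipeline's results objects.

```python
import pickle
import tempfile
from pathlib import Path


class ToyResultsProxy:
    """Hypothetical lightweight reference to a result pickled on disk."""

    def __init__(self, result: dict, directory: Path):
        # Write the full result to disk and remember only its location.
        self.path = directory / f"result-stage{result['stage_number']}.pickle"
        with open(self.path, "wb") as fd:
            pickle.dump(result, fd)

    def read(self) -> dict:
        """Load the full result from disk on demand."""
        with open(self.path, "rb") as fd:
            return pickle.load(fd)


workdir = Path(tempfile.mkdtemp())
stages = ["hifa_importdata", "hifa_flagdata", "hifa_fluxcalflag"]
proxies = [
    ToyResultsProxy({"stage_number": i, "taskname": name}, workdir)
    for i, name in enumerate(stages, start=1)
]

listing = []
for rp in proxies:
    result = rp.read()
    listing.append(f"{result['stage_number']}, {result['taskname']}")
print("\n".join(listing))
```

Keeping only paths in memory keeps the context itself small, at the cost of a disk read for every `read()` call.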

### Context queries – observing run info

```python
import pprint

attributes = [
    'project_ids',
    'schedblock_ids',
    'execblock_ids',
    'observers',
    'start_time',
    'start_datetime',
    'end_time',
    'end_datetime',
]

for attr in attributes:
    print(f'{attr:15} : {getattr(ctx.observing_run, attr)}')
```

```text
project_ids     : {'uid://A001/X3645/X2c8'}
schedblock_ids  : {'uid://A001/X3734/X1'}
execblock_ids   : {'uid://A002/X1181695/X1c6a4'}
observers       : {'dmuders'}
start_time      : {'m0': {'unit': 'd', 'value': 60462.404386666654}, 'refer': 'UTC', 'type': 'epoch'}
start_datetime  : 2024-06-01 09:42:19.007999
end_time        : {'m0': {'unit': 'd', 'value': 60462.41706111111}, 'refer': 'UTC', 'type': 'epoch'}
end_datetime    : 2024-06-01 10:00:34.080000
```
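
The `start_time`/`end_time` records and the corresponding `*_datetime` values above are related by a simple conversion, assuming the epoch value is a Modified Julian Date in days (MJD 0 is 1858-11-17 00:00 UTC), which is consistent with the printed output. The helper `mjd_to_datetime` below is a hypothetical illustration using only the standard library, not the pipeline's own conversion routine.

```python
from datetime import datetime, timedelta, timezone

MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)  # MJD 0


def mjd_to_datetime(mjd_days: float) -> datetime:
    """Convert a Modified Julian Date (in days, UTC) to a datetime."""
    return MJD_EPOCH + timedelta(days=mjd_days)


# Epoch records copied from the output above.
start_time = {'m0': {'unit': 'd', 'value': 60462.404386666654}, 'refer': 'UTC', 'type': 'epoch'}
end_time = {'m0': {'unit': 'd', 'value': 60462.41706111111}, 'refer': 'UTC', 'type': 'epoch'}

start = mjd_to_datetime(start_time['m0']['value'])
end = mjd_to_datetime(end_time['m0']['value'])
print(start.replace(tzinfo=None))
print(end.replace(tzinfo=None))
print(end - start)
```

As a check by hand: the fractional day 0.404386666654 times 86400 s is about 34939.008 s, i.e. 09:42:19.008, matching `start_datetime` to within floating-point rounding.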


### Context/domain Object Query – measurement sets

```python
from pipeline.domain.datatype import DataType

# Get names of MeasurementSets in current run
msnames = [ms.name for ms in ctx.observing_run.get_measurement_sets()]
print(msnames)

# Get MeasurementSet by name
ms = ctx.observing_run.get_ms(name=msnames[0])
print(ms)

# Get MeasurementSets filtered by given intents
mslist = ctx.observing_run.get_measurement_sets(intents='BANDPASS,PHASE')
print(mslist)

# Get MeasurementSets matching given DataType
mslist = ctx.observing_run.get_measurement_sets_of_type(
    dtypes=[DataType.REGCAL_CONTLINE_ALL, DataType.REGCAL_CONTLINE_SCIENCE]
)
print(mslist)
for ms in mslist:
    print(ms, ' - basename - ', ms.basename)
```

```text
['uid___A002_X1181695_X1c6a4_8ant.ms', 'uid___A002_X1181695_X1c6a4_8ant_targets.ms', 'uid___A002_X1181695_X1c6a4_8ant_targets_line.ms']
MeasurementSet(uid___A002_X1181695_X1c6a4_8ant.ms)
[<pipeline.domain.measurementset.MeasurementSet object at 0x7a8455f86ec0>]
[<pipeline.domain.measurementset.MeasurementSet object at 0x7a8455f86ec0>]
MeasurementSet(uid___A002_X1181695_X1c6a4_8ant.ms)  - basename -  uid___A002_X1181695_X1c6a4_8ant.ms
```


### Context/domain Object Query – virtual spws

```python
# Full/short names for all virtual SpWs in the observing run:
print(ctx.observing_run.virtual_science_spw_names)
print(ctx.observing_run.virtual_science_spw_shortnames)

# Convert virtual SpW ID to real one for given MS
ms = ctx.observing_run.get_measurement_sets()[0]
virt_spwid = 16
real_spwid = ctx.observing_run.virtual2real_spw_id(virt_spwid, ms)

# Convert real SpW ID for given MS to virtual SpW ID
virtual_spwid = ctx.observing_run.real2virtual_spw_id(24, ms)
print(virtual_spwid)
```

```text
{'X803018835#ALMA_RB_03#BB_1#SW-01#FULL_RES': 16, 'X803018835#ALMA_RB_03#BB_3#SW-01#FULL_RES': 22, 'X803018835#ALMA_RB_03#BB_4#SW-01#FULL_RES': 24}
{'X803018835#ALMA_RB_03#BB_1#SW-01#FULL_RES': 'X803018835#ALMA_RB_03#BB_1#SW-01', 'X803018835#ALMA_RB_03#BB_3#SW-01#FULL_RES': 'X803018835#ALMA_RB_03#BB_3#SW-01', 'X803018835#ALMA_RB_03#BB_4#SW-01#FULL_RES': 'X803018835#ALMA_RB_03#BB_4#SW-01'}
24
```
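
The output above shows spw names mapped to virtual IDs, which suggests that the spw name acts as the join key between the run-wide virtual IDs and the per-MS real IDs. The toy below illustrates that idea under stated assumptions: the MS names, the ID values for the second MS, and the rule that virtual IDs are taken from the first MS are all hypothetical, and the functions are not the pipeline's implementation.

```python
# Hypothetical per-MS spw tables: real spw ID -> spw name.
real_spw_names = {
    "eb1.ms": {16: "BB_1#SW-01", 22: "BB_3#SW-01", 24: "BB_4#SW-01"},
    "eb2.ms": {18: "BB_1#SW-01", 20: "BB_3#SW-01", 26: "BB_4#SW-01"},
}

# Assumption for this sketch: virtual IDs are the real IDs of the first MS.
virtual_spw_ids = {name: spw_id for spw_id, name in real_spw_names["eb1.ms"].items()}
virtual_spw_names = {spw_id: name for name, spw_id in virtual_spw_ids.items()}


def virtual2real(virt_id, ms_name):
    """Translate a virtual spw ID to the real ID in `ms_name`, via the spw name."""
    name = virtual_spw_names.get(virt_id)
    for real_id, real_name in real_spw_names[ms_name].items():
        if real_name == name:
            return real_id
    return None


def real2virtual(real_id, ms_name):
    """Translate a real spw ID in `ms_name` to the run-wide virtual ID."""
    name = real_spw_names[ms_name].get(real_id)
    return virtual_spw_ids.get(name)


print(virtual2real(16, "eb2.ms"), real2virtual(26, "eb2.ms"))
```

With this indirection, heuristics can refer to one spw ID for the whole run even when individual execution blocks number their spectral windows differently.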


### Context/domain Object Query -- spectral windows

```python
msname = 'uid___A002_X1181695_X1c6a4_8ant_targets.ms'
ms = ctx.observing_run.get_ms(name=msname)

# Get all spectral windows
all_spws = ms.get_spectral_windows(science_windows_only=False)

# Get frame for specific SpW
frame = ms.get_spectral_window(0).frame

# Get spectral windows filtered by a requested spw ID selection string
dgref_spws = ms.get_spectral_windows('0')

# Get SpW IDs for science SpWs (default), filtered by band and number of channels
scispw_ids_sel = [spw.id for spw in ms.get_spectral_windows()
                  if spw.band == 'ALMA Band 3' and spw.num_channels > 4]
print(scispw_ids_sel)
```

```text
[16, 22, 24]
```


### Context/domain Object Query – fields

```python
msname = 'uid___A002_X1181695_X1c6a4_8ant.ms'
ms = ctx.observing_run.get_ms(name=msname)

# Get names for all fields in the MS
field_names = [field.name for field in ms.get_fields()]

# Get all intents covered by fields for given field argument
field_intents = {intent for field in ms.get_fields('*')
                 for intent in field.intents}

# Get field ID for all fields matching science target intent
fieldlist = [field.id for field in ms.get_fields(intent='TARGET')]
print(fieldlist)
```

```text
[2, 4, 5]
```


### Context/domain Object Query -- scans

```python
# Get IDs of all scans
msname = 'uid___A002_X1181695_X1c6a4_8ant_targets.ms'
ms = ctx.observing_run.get_ms(name=msname)
scan_ids = [scan.id for scan in ms.get_scans()]

# Get time on source for scans with PHASE intent for selected fields
times = [scan.time_on_source
         for scan in ms.get_scans(field='*', scan_intent='PHASE')]
print(times)
```

```text
[]
```


### Context/domain Object Query -- data descriptions

```python
# Get polarization ID for given MS, data description ID, and correlation type.
msname = 'uid___A002_X1181695_X1c6a4_8ant_targets.ms'
ms = ctx.observing_run.get_ms(name=msname)
datadesc = ms.get_data_description(id=0)
pol_id = datadesc.get_polarization_id('XY')
print(pol_id)
```

```text
2
```


### Context/domain Object Query -- spectral window properties

```python
msname = "uid___A002_X1181695_X1c6a4_8ant_targets.ms"
ms = ctx.observing_run.get_ms(name=msname)
sciencespws = [spw.id for spw in ms.get_spectral_windows(science_windows_only=True)]
print(sciencespws)
for spwid in sciencespws:
    spw = ms.get_spectral_window(spwid)
    print(
        spw.id,
        spw.band,
        spw.baseband,
        spw.centre_frequency,
        spw.correlation_bits,
        spw.frame,
        spw.min_frequency,
        spw.max_frequency,
    )
```

```text
[16, 22, 24]
16 ALMA Band 3 1 100.810 GHz BITS_2x2 TOPO 99.810 GHz 101.810 GHz
22 ALMA Band 3 3 112.337 GHz BITS_2x2 TOPO 112.102 GHz 112.571 GHz
24 ALMA Band 3 4 115.248 GHz BITS_2x2 TOPO 115.014 GHz 115.482 GHz
```



### A live example of the pipeline `context` object

```python
import io

from rich import inspect
from rich.console import Console


def crop_output(text: str, height: int) -> str:
    """Keep the first `height` lines of `text`, marking any truncation."""
    lines = text.split("\n")
    out = "\n".join(lines[:height])
    if len(lines) > height:
        # Two dotted lines signal that the output was cropped.
        out += "\n" + "." * 8
        out += "\n" + "." * 8
    return out


def rinspect(obj):
    """Print a cropped Rich inspection report for `obj`."""
    console = Console(
        color_system='standard',
        soft_wrap=True,
        width=160,
        height=100,
        tab_size=2,
        force_jupyter=False,
        force_terminal=False,
        force_interactive=False,
        markup=True,
        record=False,
        file=io.StringIO(),
        quiet=False,
    )
    with console.capture() as capture:
        inspect(obj, console=console)
    print(crop_output(capture.get(), height=500))
```


```python
rinspect(ctx)
```

```text
╭────────────────────────────────── <class 'pipeline.infrastructure.launcher.Context'> ──────────────────────────────────╮
│ Context holds all pipeline state, consisting of metadata describing the                                                │
│ data set, objects describing the pipeline calibration state, the tree of                                               │
│ Results objects summarising the results of each pipeline task, and a                                                   │
│ small number of internal pipeline variables and objects.                                                               │
│                                                                                                                        │
│ ╭──────────────
```

### A live example of the pipeline `context.callibrary` object

In [13]:
rinspect(ctx.callibrary)
rinspect(ctx.callibrary.active)

╭──────────── <class 'pipeline.infrastructure.callibrary.IntervalCalLibrary'> ─────────────╮
│ CalLibrary is the root object for the pipeline calibration state.                        │
│                                                                                          │
│ ╭──────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ <pipeline.infrastructure.callibrary.IntervalCalLibrary object at 0x7a8455a19210>     │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                          │
│  active = <

### A live example of the `context.calimlist` attribute

In [14]:
rinspect(ctx.calimlist)

╭────────── <class 'pipeline.infrastructure.imagelibrary.ImageLibrary'> ───────────╮
│ ╭──────────────────────────────────────────────────────────────────────────────╮ │
│ │ <pipeline.infrastructure.imagelibrary.ImageLibrary object at 0x7a8455aba560> │ │
│ ╰──────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                  │
│ 27 attribute(s) not shown. Run inspect(inspect) for options.                     │
╰──────────────────────────────────────────────────────────────────────────────────╯



### A live example of the `context.project_summary` and `context.project_performance_parameters` attributes

The context also holds project-level metadata: a `ProjectSummary` object and the project performance parameters.

In [15]:
rinspect(ctx.project_summary)
rinspect(ctx.project_performance_parameters)

╭────────── <class 'pipeline.infrastructure.project.ProjectSummary'> ───────────╮
│ ╭───────────────────────────────────────────────────────────────────────────╮ │
│ │ <pipeline.infrastructure.project.ProjectSummary object at 0x7a8455aba6e0> │ │
│ ╰───────────────────────────────────────────────────────────────────────────╯ │
│                                                                               │
│    observatory = 'ALMA Joint Observatory'                                     │
│         piname = 'undefined'                                                  │
│  proposal_code = ''

### A live example of the `context.observing_run` attribute

The pipeline `ObservingRun` object and its attached pipeline `MeasurementSet` instances.
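
The relationship between an `ObservingRun` and its `MeasurementSet` instances can be sketched as a simple walk over the `measurement_sets` list. The helper `list_measurement_sets` below is hypothetical, and it assumes each `MeasurementSet` exposes a `name` attribute, falling back to `repr()` if it does not. The `SimpleNamespace` objects are stand-ins so the snippet runs outside a pipeline session; in a live session one would pass `ctx.observing_run` instead.

```python
from types import SimpleNamespace


def list_measurement_sets(observing_run):
    """Return one label per MeasurementSet attached to an ObservingRun-like object."""
    labels = []
    for index, ms in enumerate(observing_run.measurement_sets):
        # `name` is an assumed attribute; fall back to repr() if it is absent.
        labels.append(f"{index}: {getattr(ms, 'name', repr(ms))}")
    return labels


# Stand-in objects for illustration only (hypothetical file names).
fake_run = SimpleNamespace(
    measurement_sets=[
        SimpleNamespace(name="example_a.ms"),
        SimpleNamespace(name="example_b.ms"),
    ]
)
print("\n".join(list_measurement_sets(fake_run)))
```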

In [16]:

rinspect(ctx.observing_run)

╭───────────────────────────────── <class 'pipeline.domain.observingrun.ObservingRun'> ─────────────────────────────────╮
│ ObservingRun is a logical representation of an observing run.                                                         │
│                                                                                                                       │
│ ╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ <pipeline.domain.observingrun.ObservingRun object at 0x7a8455f86f80>                                              │ │
│ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────

### A live example of the `context.observing_run.measurement_sets` attribute

In [17]:
rinspect(ctx.observing_run.measurement_sets[0])

╭────────────────────────────────────────────────── <class 'pipeline.domain.measurementset.MeasurementSet'> ───────────────────────────────────────────────────╮
│ A class to store logical representation of a MeasurementSet (MS).                                                                                            │
│                                                                                                                                                              │
│ ╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ <pipeline.domain.measurementset.MeasurementSet object at 0x7a8455f86ec0>