### Connection Classes
Connection classes are a place to specify all the connections that a task intends to make with the rest of a larger pipeline. The class has the following connection types:
* InitInput - arguments that will be passed to init, dimensionless, normally things like schemas
* InitOutput - Values that the activator system will retrieve once init is done, normally post-init schemas
* Input - Input dataset types that are used to constrain an execution graph (i.e. the data that will be processed)
* PrerequisiteInput - Datasets produced outside the pipeline that are required. These do not constrain what data will be processed. Typical examples are reference catalogs or bright object masks
* Output - A specification of what dataset types this task will produce

Each of these connection types is declared with the following arguments:

* name - The dataset type name
* storageClass - The storage type of this object in the datastore
* deferLoad - Optional, default False. Tells the butler to return a deferred-load object that the task can load the dataset from later
* multiple - Optional, default False. Indicates this field will expect a list of python objects
* checkFunction - Optional, default None. Function that allows verification of the quantum the activator will produce

Additionally, `Input`, `PrerequisiteInput`, and `Output` connections take:

* dimensions - The set of dimensions that specify a particular connection, i.e. visit, detector, instrument

Connection classes must also be declared with a dimensions attribute, which specifies the fundamental unit this task will process, e.g. tract, patch.

The name fields on connections can be specified as a python format string, with a name in each of the template fields, e.g. "{coaddType}Coadd_meas". If any name contains a template, then the connection class must be declared with a defaultTemplates attribute: a dictionary specifying the default value for each template name.
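
The template mechanism rests on ordinary Python string formatting. The sketch below uses only the standard library, not the `lsst.pipe.base` machinery. The helper names `template_fields` and `resolve_name`, and the values "deep" and "goodSeeing", are hypothetical and exist only for illustration. It shows how the template fields in a name can be discovered, and why each one needs a default.

```python
from string import Formatter


def template_fields(name):
    """Return the set of template field names used in a connection name."""
    return {field for _, field, _, _ in Formatter().parse(name) if field}


def resolve_name(name, defaults, overrides=None):
    """Fill in a templated name, preferring overrides to defaults."""
    values = {**defaults, **(overrides or {})}
    missing = template_fields(name) - set(values)
    if missing:
        raise ValueError(f"no value for template(s): {sorted(missing)}")
    return name.format(**values)


# Illustrative defaults only; not taken from any real pipeline
defaults = {"coaddType": "deep"}
print(template_fields("{coaddType}Coadd_meas"))
print(resolve_name("{coaddType}Coadd_meas", defaults))
print(resolve_name("{coaddType}Coadd_meas", defaults, {"coaddType": "goodSeeing"}))
```

A name that uses a template with no default raises an error in this sketch, which mirrors the requirement that every template have a default value.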

#### Instantiation
Connection classes are instantiated with a config object (discussed below) that allows overriding connection names, as well as controlling whether a particular connection will be used when building a quantum graph (an execution flow). The set of each type of connection can be found under:
 * self.initInputs
 * self.initOutputs
 * self.inputs
 * self.prerequisiteInputs
 * self.outputs
 
 These names must not be otherwise used in the init method.

In [None]:
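from lsst.pipe.base import PipelineTaskConfig, PipelineTaskConnections, PipelineTask
# Connection types used in the class body below
from lsst.pipe.base.connectionTypes import (InitInput, InitOutput, Input,
                                            PrerequisiteInput, Output)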
import lsst.pex.config as pexConfig

class ExamplePipelineTaskConnections(PipelineTaskConnections):
    inputSchema = InitInput(name="dummy_schema",
                            storageClass="SourceCatalog")
    outputSchema = InitOutput(name="example_dummy_schema",
                              storageClass="SourceCatalog")
    exposure = Input(name="calexp", storageClass="ExposureF",
                     dimensions=("instrument", "visit", "detector"))
    brightStarMask = PrerequisiteInput(name="bright_star_mask",
                                       storageClass="StarMasks",
                                       dimensions=("tract", "patch"))
    resultExp = Output(name="{outputTemp}Exposure", storageClass="ExposureF",
                       dimensions=("instrument", "visit", "detector"))
    resultCat = Output(name="{outputTemp}meas_cat", storageClass="SourceCatalog",
                       dimensions=("instrument", "visit", "detector"))
    
    dimensions = ("instrument", "visit", "detector")
    defaultTemplates = {"outputTemp": "Example"}
    
    def __init__(self, *, config=None):
        super().__init__(config=config)
        if not config.doThings:
            self.prerequisiteInputs -= set(("brightStarMask",))

### PipelineTaskConfig
These are almost standard configuration classes, but they must be declared with a `pipelineConnections` keyword in addition to the `PipelineTaskConfig` base class. This adds all the configurable elements of the specified connections class to the config class under a field called `connections`. Right now this is only the name fields and, if any format strings are present, the names of their templates.

These fields are then set just like any normal configuration. If templates are present, assigning to them will have the effect of formatting all the format strings when the config is later used in constructing a connection class.

In [None]:
class ExamplePipelineTaskConfig(PipelineTaskConfig, pipelineConnections=ExamplePipelineTaskConnections):
    doThings = pexConfig.Field(dtype=bool, default=True, doc="Example field")

### PipelineTask
This is the task that will be called by an activator to process a quantum of data. 

#### Init
`InitInput` connections specified in the connections class will be passed into the `__init__` method inside an `initInputs` argument. It is expected that any declared `InitOutput`s will be assigned to an instance variable with the same variable name used in the connection class. The activator will look for this to write as an output.
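
The naming convention for init outputs can be pictured with a toy class. This is only a sketch under stated assumptions: `ToyTask` and `init_output_names` are hypothetical stand-ins, plain lists stand in for schemas, and the lookup by `getattr` is an illustration of the convention rather than the activator's actual code.

```python
class ToyTask:
    """Toy task; stands in for a PipelineTask subclass."""

    def __init__(self, initInputs=None):
        # initInputs is keyed by the InitInput attribute names
        schema = list(initInputs["inputSchema"])
        # Store the InitOutput under the same attribute name as its connection
        self.outputSchema = schema + ["example_flag"]


# Names an activator could learn from the connections class (hypothetical)
init_output_names = ["outputSchema"]

task = ToyTask(initInputs={"inputSchema": ["id", "coord"]})
collected = {name: getattr(task, name) for name in init_output_names}
print(collected)
```

If the instance attribute were named differently from the connection, the lookup in this sketch would fail, which is why the names must match.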

#### Execution
Execution happens inside the `runQuantum` method. This is passed a unit of work to process as defined by the dimensions on the connection class, e.g. a tract and patch. This method is responsible for loading the provided `DatasetRef`s, executing the task's `run` method, and saving the output.

`runQuantum` is provided a `ButlerQuantumContext` object to do the getting and putting. This is like a regular butler, but it only allows loading datasets defined in the connections class. The other arguments to `runQuantum` are `Struct`s that map the attribute names used in the connection class to `DatasetRef`s.

The `get` method of the butlerQC object can take one of these `Struct`s, a single `DatasetRef`, or a `list` of `DatasetRef`s. If a `Struct` is provided, it will load everything into a dictionary keyed by the attribute name in the connection class. If a single `DatasetRef` is provided, a single object will be returned. If a `list` of `DatasetRef`s is given, a dictionary mapping `DataId` to object will be returned.

The `put` method behaves similarly. If a `Struct` of references is given, it will look up the corresponding fields in the values `Struct` and save them. If a single `DatasetRef` is given, the value argument of `put` must be a single value. If a dict is given, it must be a mapping of `DataId` to object, and the value argument of `put` will be the dataset type name (the name field on a connection, the same thing that can be set on the config class).
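
The dispatch on argument type can be sketched with a toy class. Everything here is an assumption made for illustration: `ToyQuantumContext` is not the real `ButlerQuantumContext`, `SimpleNamespace` stands in for `Struct`, plain strings stand in for both `DatasetRef` and `DataId`, and the dict form of `put` is not modelled.

```python
from types import SimpleNamespace


class ToyQuantumContext:
    """Toy stand-in for the butlerQC object; NOT the real lsst class."""

    def __init__(self, storage):
        # storage maps a reference (here just a string) to a stored object
        self.storage = storage

    def get(self, refs):
        if isinstance(refs, SimpleNamespace):
            # Struct of refs: dictionary keyed by connection attribute name
            return {name: self.get(ref) for name, ref in vars(refs).items()}
        if isinstance(refs, list):
            # List of refs: dictionary keyed by the (toy) data ID
            return {ref: self.storage[ref] for ref in refs}
        # Single ref: a single object
        return self.storage[refs]

    def put(self, values, refs):
        if isinstance(refs, SimpleNamespace):
            # Struct of refs: take the matching field from the values struct
            for name, ref in vars(refs).items():
                self.storage[ref] = getattr(values, name)
        else:
            # Single ref: values is a single object
            self.storage[refs] = values


qc = ToyQuantumContext({"calexp@visit1": "exposure pixels", "mask@patch7": "star mask"})
inputRefs = SimpleNamespace(exposure="calexp@visit1", brightStarMask="mask@patch7")
print(qc.get(inputRefs))
print(qc.get("calexp@visit1"))
print(qc.get(["calexp@visit1"]))

outputRefs = SimpleNamespace(resultExp="out@visit1")
qc.put(SimpleNamespace(resultExp="processed"), outputRefs)
print(qc.storage["out@visit1"])
```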

The `run` method should accept keywords corresponding to the `Input` and `PrerequisiteInput` attribute names defined in the connection class. It should return a `Struct` whose attribute names correspond to the `Output` connections on the connection class. This is strictly required only if the default `runQuantum` is used, or if a custom `runQuantum` calls `butlerQC.put` directly with the `outputRefs` `Struct`.
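
A minimal sketch of this contract follows, using the attribute names from the example connections class. It assumes `SimpleNamespace` as a stand-in for `Struct` and strings as stand-ins for real datasets; the `OUTPUT_NAMES` set and the body of `run` are hypothetical.

```python
from types import SimpleNamespace

# Attribute names of the Output connections in the example connections class
OUTPUT_NAMES = {"resultExp", "resultCat"}


def run(exposure, brightStarMask=None):
    """Keywords match the Input and PrerequisiteInput attribute names."""
    if brightStarMask is None:
        masked = exposure
    else:
        masked = f"{exposure} masked by {brightStarMask}"
    # One field per Output connection, named after the connection attribute
    return SimpleNamespace(resultExp=masked, resultCat=[f"source from {masked}"])


# A dictionary shaped like the result of loading an input Struct
inputs = {"exposure": "calexp", "brightStarMask": "mask"}
outputs = run(**inputs)

# Every Output connection must have a matching field on the returned struct
missing = OUTPUT_NAMES - set(vars(outputs))
print(sorted(vars(outputs)), "missing:", sorted(missing))
```

Making `brightStarMask` optional here reflects that the example connections class can drop that connection when `doThings` is false.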

In [1]:
class ExamplePipelineTask(PipelineTask):
    def __init__(self, config, *args, initInputs=None, **kwargs):
        super().__init__(*args, config=config, **kwargs)
        # addToInputSchema is a placeholder for task-specific schema code; it is not defined here
        self.outputSchema = addToInputSchema(initInputs['inputSchema'])
    
    # This is the method from the base class, it is redefined here as an example only
    def runQuantum(self, butlerQC, inputRefs, outputRefs):
        inputs = butlerQC.get(inputRefs)
        outputs = self.run(**inputs)
        butlerQC.put(outputs, outputRefs)

        

In [13]:
# Create a config object, alter some properties
config = ExamplePipelineTaskConfig()
for k, v in config.connections.items():
    print(f"Field: {k: <15} Value: {v}")

Field: inputSchema     Value: dummy_schema
Field: outputSchema    Value: example_dummy_schema
Field: exposure        Value: calexp
Field: brightStarMask  Value: bright_star_mask
Field: resultExp       Value: {outputTemp}Exposure
Field: resultCat       Value: {outputTemp}meas_cat
Field: outputTemp      Value: Example


In [3]:
# Set the name for the inputSchema
config.connections.inputSchema = 'previous_step_schema'
# format the shared template name 'outputTemp'
config.connections.outputTemp = 'bgMasked'

In [4]:
# Create a connections instance to see what happened (the system normally does this; it is done here as a demo)
connections = ExamplePipelineTaskConnections(config=config)
# Print the names of the modified fields
print(connections.inputSchema.name)
print(connections.resultExp.name)
print(connections.resultCat.name)

previous_step_schema
bgMaskedExposure
bgMaskedmeas_cat
