The linking framework

One of the strengths of Glue is the ability to link different datasets together. The linking page describes how to set up links graphically from the Glue application; on this page, we look at how links are set up programmatically.

As described in data_tutorial, components are identified by ~glue.core.component_id.ComponentID instances. We can then use these to create links across datasets. Note that links are not defined between ~glue.core.data.Data or ~glue.core.component.Component objects, but between ~glue.core.component_id.ComponentID instances.

The basic linking object is ~glue.core.component_link.ComponentLink. This describes how two ~glue.core.component_id.ComponentID instances are linked. The following example demonstrates how to set up a ~glue.core.component_link.ComponentLink programmatically:

>>> from glue.core import Data, DataCollection
>>> d1 = Data(x1=[1, 2, 3])
>>> d2 = Data(x2=[2, 3, 4, 5])
>>> dc = DataCollection([d1, d2])
>>> from glue.core.component_link import ComponentLink
>>> link = ComponentLink([d1.id['x1']], d2.id['x2'])

Note that the first argument of ~glue.core.component_link.ComponentLink should be a list of ~glue.core.component_id.ComponentID instances.

Since no linking function was specified in the above example, ~glue.core.component_link.ComponentLink defaults to the simplest kind of link, identity. For the link to be useful, we need to add it to the data collection, after which we can see what it changes:

>>> dc.add_link(link)

If we look at the list of components on the ~glue.core.data.Data objects, we see that the x2 component in d2 has been replaced by x1:

>>> print(d1.components)
[Pixel Axis 0, World 0, x1]
>>> print(d2.components)
[Pixel Axis 0, World 0, x1]

This is because we used the identity transform: since the ~glue.core.component_id.ComponentID objects x1 and x2 are interchangeable, Glue decided to use x1 instead of x2 in d2 for simplicity.

The benefit of this is that if we now create a ~glue.core.subset.SubsetState based on the x1 ~glue.core.component_id.ComponentID, this ~glue.core.subset.SubsetState will be applicable to both datasets:

>>> subset_state = d2.id['x1'] > 2.5
>>> subset_group = dc.new_subset_group('x1 > 2.5', subset_state)

This has now created subsets in both d1 and d2:

>>> d1.subsets[0].to_mask()
array([False, False,  True], dtype=bool)
>>> d2.subsets[0].to_mask()
array([False,  True,  True,  True], dtype=bool)
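To see why a single subset definition yields a mask of a different length for each dataset, here is a minimal plain-Python sketch of the idea. It does not use Glue: the dictionaries stand in for the two datasets, and the names `datasets` and `evaluate_subset` are hypothetical. It assumes that, after the identity link, each dataset simply exposes its values under the shared name x1.

```python
# Conceptual model only: plain dicts stand in for Glue Data objects.
# After the identity link, both datasets expose their values under "x1".
datasets = {
    "d1": {"x1": [1, 2, 3]},
    "d2": {"x1": [2, 3, 4, 5]},  # originally stored as x2
}


def evaluate_subset(data, component, threshold):
    """Return a boolean mask for `component > threshold` on one dataset."""
    return [value > threshold for value in data[component]]


# One subset definition, evaluated independently on each dataset.
masks = {
    name: evaluate_subset(data, "x1", 2.5)
    for name, data in datasets.items()
}

print(masks["d1"])  # [False, False, True]
print(masks["d2"])  # [False, True, True, True]
```

The masks match the to_mask() output shown above: the same condition is applied to each dataset's own values, so the mask lengths follow the dataset sizes.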

Let's now try to use a custom linking function that is not simply identity:

>>> link = ComponentLink([d1.id['x1']], d2.id['x2'],
...                      using=lambda x: 2*x)
>>> dc.add_link(link)

This time, if we look at the list of components on the ~glue.core.data.Data objects, we see that d1 now has an additional component, x2:

>>> print(d1.components)
[Pixel Axis 0, World 0, x1, x2]
>>> print(d2.components)
[Pixel Axis 0, World 0, x2]

We can take a look at the values of all the components:

>>> print(d1['x1'])
[1 2 3]
>>> print(d1['x2'])
[2 4 6]
>>> print(d2['x2'])
[2 3 4 5]

In this case, both datasets have kept their original components, but d1 now also includes an x2 ~glue.core.component.DerivedComponent which was computed as being twice the values of d1['x1'].
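The behaviour above can be pictured with a small plain-Python model of a one-directional link. This is only a sketch and not Glue's implementation: `ToyLink` and its `compute` method are hypothetical names, and plain lists replace the arrays Glue works with.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class ToyLink:
    """Hypothetical stand-in for a link: input names, output name, function."""

    inputs: Sequence[str]
    output: str
    using: Callable[..., List[float]]

    def compute(self, data: Dict[str, List[float]]) -> List[float]:
        # Derive the output values from the input columns of one dataset.
        return self.using(*(data[name] for name in self.inputs))


d1 = {"x1": [1, 2, 3]}
link = ToyLink(inputs=["x1"], output="x2", using=lambda x: [2 * v for v in x])

# d1 stores only x1; x2 is obtained from it through the link.
derived_x2 = link.compute(d1)
print(derived_x2)  # [2, 4, 6]
```

In this model the function only maps x1 to x2 and nothing maps x2 back to x1, which is consistent with the output above where d2 gains no new component.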

Creating simple component links can also be done using arithmetic operations on ~glue.core.component_id.ComponentID instances:

>>> d3 = Data(xa=[1, 2, 3], xb=[1, 3, 5])
>>> dc = DataCollection([d3])
>>> diff = d3.id['xa'] - d3.id['xb']
>>> diff
<BinaryComponentLink: (xa - xb)>
>>> dc.add_link(diff)
>>> d3['diff']
array([ 0, -1, -2])

Note

This is different from using comparison operators such as > or <= on ~glue.core.component_id.ComponentID instances, which produces ~glue.core.subset.SubsetState objects.
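The distinction between arithmetic and comparison operators can be illustrated with ordinary Python operator overloading. The following is a toy sketch, not Glue's code: `ToyID`, `ToyLinkExpr` and `ToySubsetExpr` are hypothetical classes that only mimic the idea that arithmetic produces a link-like object while a comparison produces a subset-like object.

```python
class _ToyExpr:
    """Shared base: remembers the operands and operator of an expression."""

    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right

    def __repr__(self):
        return f"<{type(self).__name__}: ({self.left!r} {self.op} {self.right!r})>"


class ToyLinkExpr(_ToyExpr):
    """Result of arithmetic on IDs: describes a new derived quantity."""


class ToySubsetExpr(_ToyExpr):
    """Result of a comparison on an ID: describes a selection."""


class ToyID:
    """Hypothetical stand-in for a component identifier."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return self.label

    # Arithmetic operators build link-like expressions.
    def __sub__(self, other):
        return ToyLinkExpr(self, "-", other)

    def __mul__(self, other):
        return ToyLinkExpr(self, "*", other)

    # Comparison operators build subset-like expressions.
    def __gt__(self, other):
        return ToySubsetExpr(self, ">", other)

    def __le__(self, other):
        return ToySubsetExpr(self, "<=", other)


xa, xb = ToyID("xa"), ToyID("xb")
diff = xa - xb
selection = xa > 2.5

print(diff)       # <ToyLinkExpr: (xa - xb)>
print(selection)  # <ToySubsetExpr: (xa > 2.5)>
```

Neither expression touches any data when it is built; each one only records what should be computed later.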

It is also possible to add a component link to just one particular ~glue.core.data.Data object, in which case this is equivalent to creating a ~glue.core.component.DerivedComponent. The following:

>>> from glue.core import Data
>>> d4 = Data(xa=[1, 2, 3], xb=[1, 3, 5])
>>> link = d4.id['xa'] * 2
>>> d4.add_component_link(link, 'xa_double_1')
<glue.core.component.DerivedComponent object at 0x107b2c828>
>>> print(d4['xa_double_1'])
[2 4 6]

is equivalent to creating a derived component:

>>> d4['xa_double_2'] = d4.id['xa'] * 2
>>> print(d4['xa_double_2'])
[2 4 6]

When adding a component link via the ~glue.core.data_collection.DataCollection ~glue.core.data_collection.DataCollection.add_link method, new component IDs are only added to ~glue.core.data.Data objects in which all of the ~glue.core.component_id.ComponentID instances required by the link already exist. For instance, in the following example, xu is only added to d6:

>>> d5 = Data(xs=[5, 5, 6])
>>> d6 = Data(xt=[3, 2, 3])
>>> dc = DataCollection([d5, d6])
>>> from glue.core.component_id import ComponentID
>>> new_component = ComponentID('xu')
>>> link = ComponentLink([d6.id['xt']], new_component,
...                      using=lambda x: x + 3)
>>> dc.add_link(link)
>>> print(d5.components)
[Pixel Axis 0, World 0, xs]
>>> print(d6.components)
[Pixel Axis 0, World 0, xt, xu]
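The rule described above (a link only adds its output to datasets that already contain every input it needs) can be sketched in plain Python. This is a conceptual model under simplifying assumptions, not Glue's implementation: `apply_link` is a hypothetical helper, dictionaries stand in for the datasets, and the derived values are stored eagerly for simplicity.

```python
def apply_link(datasets, inputs, output, using):
    """Add `output` to every dataset that already has all of `inputs`."""
    for data in datasets.values():
        if all(name in data for name in inputs):
            data[output] = using(*(data[name] for name in inputs))


# Plain dicts stand in for the d5 and d6 datasets from the example.
datasets = {
    "d5": {"xs": [5, 5, 6]},
    "d6": {"xt": [3, 2, 3]},
}

apply_link(
    datasets,
    inputs=["xt"],
    output="xu",
    using=lambda x: [v + 3 for v in x],
)

print(sorted(datasets["d5"]))  # ['xs']
print(sorted(datasets["d6"]))  # ['xt', 'xu']
print(datasets["d6"]["xu"])    # [6, 5, 6]
```

Only the stand-in for d6 contains xt, so only it receives xu, mirroring the component lists printed above.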

Glue includes a number of built-in link functions that are collected in the link_function registry object from glue.config. You can easily create new link functions as described in custom_links, and these will then be available through the user interface, as shown in linking in the User guide.