Gijs Molenaar edited this page Feb 13, 2014 · 6 revisions

A Python-based Tree Definition Language

Since Python is so nicely extensible, a few cleverly written classes would allow trees to be defined at a very high level. I can see the following benefits:

  • Use all the power of a proper programming language to define your trees algorithmically. Things like expanding for stations and IFRs, attaching LSM sources, and chaining together peel units can be implemented cleanly and concisely.
  • High-level forest editing. Once a forest is constructed, its "python source" may be stored in the forest state record. End-users can then extract this source, modify it, and regenerate the forest.
  • Python may be called directly from the kernel.
  • LSM source models can be implemented as libraries of Python functions that define source trees.
  • Simplifies node naming -- the tree generator can take care of most of the details.
  • Information on high-level tree structure (concepts like "peeling units", "predict branches", etc.) is available in the script itself, and may be stored in the forest state as well. This simplifies the job of a tree visualizer.

Some code examples to focus the discussion follow. A more complete working example can be found here: InitialWorkingExample. A prototype can currently be found in CVS as Timba/TreeGen/src/TreeGen.py; it contains the Python code needed to implement the language, plus the working example (after the `if __name__ == '__main__':` statement).

Here's a "peeling unit" implementing a shift/solve/subtract sequence:

```python
def peelUnit(datasource, solve_sources, predict_sources=[], ns=GlobalScope):
    """this function defines a peeling unit for a list of solve sources and,
    optionally, a list of auxiliary predict sources"""
    # ns is a node scope: this object is responsible for auto-generating names
    # We create all our nodes in whichever scope has been passed in.
    for (s1, s2) in IFRS:
        for (icorr, corr) in enumerate(CORRELATIONS):
            # create condeq branch
            ns.condeq(s1=s1, s2=s2, corr=corr) << MeqCondeq(children=[
                ns.measured(s1=s1, s2=s2, corr=corr) << MeqSelector(index=icorr, children=[
                    ns.phaseshifter(s1=s1, s2=s2) << MeqPhaseShift(..., children=datasource(s1=s1, s2=s2))]),
                ns.predicted(s1=s1, s2=s2, corr=corr) << MeqAdd(children=
                    [gen(s1=s1, s2=s2, corr=corr) for gen in solve_sources + predict_sources])
            ])
        # create subtract branch
        ns.subtract(s1=s1, s2=s2) << MeqSubtract(children=[
            ns.phaseshifter(s1=s1, s2=s2),
            ns.predpeel(s1=s1, s2=s2) << MeqAdd(children=[
                ns.predcollect(s1=s1, s2=s2) << MeqCollector(children=[
                    gen(s1=s1, s2=s2, corr=corr) for corr in CORRELATIONS])
                for gen in solve_sources])
        ])
    # creates solver and sequencers
    ns.solver() << MeqSolver(..., children=
        [ns.condeq(s1=s1, s2=s2, corr=corr) for (s1, s2) in IFRS for corr in CORRELATIONS])
    for (s1, s2) in IFRS:
        ns.reqseq(s1=s1, s2=s2) << MeqReqSeq(..., children=[ns.solver(), ns.subtract(s1=s1, s2=s2)])
    # returns root nodes of unit
    return ns.reqseq
```
Now here's a top-level script to chain multiple peeling units together: 


```python
# create spigots
for (s1, s2) in IFRS:
    node.spigot(s1, s2) << MeqSpigot(...)
# create & chain together peel units
datasource = node.spigot
for (q, name) in enumerate(SOURCES):
    # get subtrees for this source from LSM
    predicter = lsm.getSourcePredicter(name, scope=tree.scope('predict', src=q))
    # create peel unit
    datasource = peelUnit(datasource, predicter, ns=tree.scope('peelunit', src=q))
# attach sinks to last unit in chain
for (s1, s2) in IFRS:
    node.sink(s1, s2) << MeqSink(..., children=datasource(s1=s1, s2=s2))
```
And here's what an LSM tree might look like: 


```python
def unpolarizedPointSource(ra=..., dec=..., sti=..., ns=None):
    """defines an unpolarized point source"""
    ns.lmn() << MeqLMN(children={
        'ra':   ns.ra()  << MeqParm(...),
        'dec':  ns.dec() << MeqParm(...),
        'ra0':  GLOBALS.PhaseCenter.ra,
        'dec0': GLOBALS.PhaseCenter.dec})
    for (s1, s2) in IFRS:
        ns.predict_i(s1=s1, s2=s2, stk='I') << MeqMultiply(children=[
            ns.flux(stk='I') << MeqParm(...),
            ns.dft(s1=s1, s2=s2) << MeqPSDFT(children=[
                ns.stdft(s=s1) << MeqPSStDFT(children=[GLOBALS.UVW(s=s1), ns.lmn()]),
                ns.stdft(s=s2) << MeqPSStDFT(children=[GLOBALS.UVW(s=s2), ns.lmn()]),
                ns.n() << MeqSelector(index=2, children=ns.lmn())
            ])
        ])
        # parallel-hand correlations each carry half of Stokes I
        for c in ('XX', 'YY'):
            ns.predict(s1=s1, s2=s2, corr=c) << MeqMultiply(children=(
                ns.predict_i(s1=s1, s2=s2, stk='I'), MeqConst(value='0.5')))
        # cross-hand correlations are zero for an unpolarized source
        for c in ('XY', 'YX'):
            ns.predict(s1=s1, s2=s2, corr=c) << MeqConst(value='0')
    return ns.predict
```
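The `ns.name(qualifiers) << Node(...)` notation used in the examples above can be supported with ordinary Python operator overloading. The following is a minimal sketch of that idea, not the actual TreeGen.py implementation: the class names `NodeDef`, `NodeStub`, `NameFamily` and `NodeScope`, and the colon-separated naming scheme, are all assumptions made for illustration.

```python
class NodeDef(object):
    """Hypothetical node definition: a class name plus its children."""

    def __init__(self, classname, children=()):
        self.classname = classname
        self.children = list(children)
        self.name = None  # assigned when the definition is bound to a name


class NodeStub(object):
    """A qualified name within a scope, such as ns.condeq(s1=1, s2=2)."""

    def __init__(self, scope, basename, quals):
        parts = [scope.prefix, basename] if scope.prefix else [basename]
        # Assumed naming scheme: qualifiers appended in sorted key order.
        parts += ["%s=%s" % (key, quals[key]) for key in sorted(quals)]
        self.scope = scope
        self.name = ":".join(parts)

    def __lshift__(self, nodedef):
        # "stub << definition" registers the definition under the generated
        # name and returns it, so bindings can be nested inside child lists.
        nodedef.name = self.name
        self.scope.nodes[self.name] = nodedef
        return nodedef


class NameFamily(object):
    """What attribute access on a scope returns; calling it adds qualifiers."""

    def __init__(self, scope, basename):
        self.scope = scope
        self.basename = basename

    def __call__(self, **quals):
        return NodeStub(self.scope, self.basename, quals)


class NodeScope(object):
    """Generates node names automatically from attribute access."""

    def __init__(self, prefix=""):
        self.prefix = prefix
        self.nodes = {}

    def __getattr__(self, basename):
        # Only reached for attributes that are not set in __init__.
        if basename.startswith("_"):
            raise AttributeError(basename)
        return NameFamily(self, basename)


ns = NodeScope("peelunit")
root = ns.sum(s1=1, s2=2) << NodeDef("MeqAdd", children=[
    ns.a(s1=1, s2=2) << NodeDef("MeqParm"),
    ns.b(s1=1, s2=2) << NodeDef("MeqParm"),
])
print(root.name)
print(sorted(ns.nodes))
```

Because `<<` returns the bound definition, a binding can appear anywhere an expression can, which is what lets whole subtrees be written inline as child lists. A fuller version would presumably also let an already-bound stub such as `ns.solver()` be passed as a child by reference.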

## Tony's Thoughts

This is a useful start - at least we are writing programs in Python rather than Glish, and if useful 'standard layout' functions and definitions are available, then some of the programming burden would be removed. However, when I heard that Oleg was working on a tree definition language I had hoped for something more.

What I would like to see is something like the following: 

* A true tree definition **language** 
This language allows me to lay out the tree in a text file.  Note that the text file describes a tree **layout**, and is not a program. I don't necessarily want to have to write a program in order to design a tree. 

* A language **processor**  
The processor takes the tree as defined above and turns this tree into something that can be executed by the computer. This output could indeed be a piece of python code, or maybe even a  compiled executable directly interfaced to the kernel. 

* These concepts are not new.  
In the Graphviz package ([[http://www.graphviz.org|http://www.graphviz.org]]) we describe the graph layout in a tree definition language called 'dot'. Various viewers and other tools can then process the 'dot' layout code and produce a meaningful drawing.

GUIs are basically defined by a tree structure. In Qt ([[http://www.trolltech.com/|http://www.trolltech.com/]]) we have a main widget  which can have children, etc. Qt comes with Qt Designer. The Designer allows you to lay out a widget,  set up all the locations and connections of the children etc and then stores the widget design in a text file (in  XML if I remember correctly). 

Then a processor program takes this design description in XML and generates the Python or C++ code which will actually be used to implement the GUI.

Obviously a fair amount of work would be required to implement such an entire system for [[MeqTrees|Home]] and I'm not sure I'd give it a high priority, but ... 

If [[MeqTrees|Home]] provided such a system, we would be off to the races! 


## Oleg's Response

Tony, I don't think I've conveyed the main principle right. Python's role here is not that of a quick kludge at all; there's huge added value in it. Your idea of a custom TDL is roughly where I started off with my thinking, but I was troubled by the ensuing complexity. Keep in mind that:

* Designing and implementing a language + parser + processor from scratch is a pretty huge project in itself. 
* Qt Designer and dot are kids' toys compared to what we need. They deal with very simple trees! 
* The complexity of our trees has to do with (a) size and (b) structure. Think of a 100-station LOFAR tree with 100 predict sources in it. You'll have roughly 5000 independent per-baseline predict segments, which hook up into a network of solvers at the bottom, and 100 predict branches at the top. Such a tree is completely intractable if you look at it node by node.  
* However, if you break such a tree into units, you can begin to deal with it. Say, here's what a baseline predict looks like, now we build 5000 of them. Here's what a predict of a point source looks like,  we use it for some sources. Here's a UVBrick branch for extended sources, we plug it in here and here. And so on. 
* Therefore, designing such trees is **programming**, whether you like it or not. You need things like: 
   * for-loops, to tell how similar units are replicated; 
   * if-clauses, to build trees differently depending on, say, source structure; 
   * procedures, to encapsulate larger units; 
   * symbol binding and resolution, to couple LSM sources to trees and to chain trees together; 
* Using Python allows us to leverage all that "programmatic" functionality without reinventing the wheel.

Finally, if you look at the provided code examples, they do nothing but describe trees, with minimal syntactic overhead. If we design our own language, it won't look all that different from the Python code you see here, simply because it will need to do roughly similar stuff. So is there really a point in creating a language?
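As a rough check on the scale argument above: N stations give N(N-1)/2 baselines, so 100 stations yield 4950, the "roughly 5000" quoted earlier. The sketch below uses only the standard library. The function names and the name format are hypothetical, and it assumes one predict unit per source per baseline. It shows how a two-level loop enumerates units that a node-by-node description would have to list individually.

```python
from itertools import combinations


def baselines(num_stations):
    """Return all station pairs (s1, s2) with s1 < s2."""
    return list(combinations(range(1, num_stations + 1), 2))


def predict_unit_names(num_stations, num_sources):
    """Generate one hypothetical node name per (source, baseline) pair."""
    return [
        "predict:src=%d:s1=%d:s2=%d" % (q, s1, s2)
        for q in range(num_sources)
        for (s1, s2) in baselines(num_stations)
    ]


ifrs = baselines(100)
names = predict_unit_names(100, 100)
print(len(ifrs))   # 4950 baselines for 100 stations
print(len(names))  # 495000 units under the one-per-source-per-baseline assumption
```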