
What_the_heck_is_a_'boio'_file_and_when_should_I_use_one?

Gijs Molenaar edited this page Feb 13, 2014 · 2 revisions

(Oleg, this needs a bit of fixing up!)

OK, I've got it solving for all 10 sources successfully. I've checked in a fix (including some better bookmarks). The problem is that you've inherited one feature too many from matrix343.py. This code here sets up the data source:

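   # Fragment: needs "import os"; short, boioname, msname and tile_size are
   # defined earlier in the script, and record comes from its existing imports.
   # If a readable boio dump exists (and this is not a short run), use it as input
   if not short and os.access(boioname, os.R_OK):
     rec = record(boio=record(boio_file_name=boioname, boio_file_mode="r"))
   # else use MS, but tell the event channel to record itself to boio file
   else:
     rec = record()
     rec.ms_name          = msname
     rec.data_column_name = 'DATA'
     rec.tile_size        = tile_size
     # select a single channel (index 0) with no additional selection string
     rec.selection = record(channel_start_index=0,
                            channel_end_index=0,
                            channel_increment=1,
                            selection_string='')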

It contains a useful optimization for calibration runs over large MSs (Measurement Sets). The first time you run the script, it uses the MS as input and RECORDS the input into a "boio" file, which is basically a fast binary dump of everything that was read from the input. The next time, if the boio dump exists, the script uses it as the input and completely ignores the MS. This gives a big speed boost to repeated runs, since boio dumps are read near-instantaneously. The StreamControl page has some more details.
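The record-then-replay idea can be illustrated outside MeqTrees with a minimal sketch. Everything here is an assumption made for illustration: `read_tiles`, `read_source` and the pickle-based dump format are hypothetical and are not the real boio format or API.

```python
import os
import pickle


def read_tiles(read_source, dump_path, use_dump=True):
    """Yield tiles from a dump if one is readable, else from the source.

    Hypothetical helper: read_source is any callable returning an iterable
    of picklable tiles. On the slow path, every tile is also recorded so
    that the next run can replay it.
    """
    if use_dump and os.access(dump_path, os.R_OK):
        # Fast path: replay the dump; the original source is never touched.
        with open(dump_path, "rb") as dump:
            while True:
                try:
                    yield pickle.load(dump)
                except EOFError:
                    return
    else:
        # Slow path: read the source and record each tile as it goes by.
        tmp_path = dump_path + ".tmp"
        with open(tmp_path, "wb") as dump:
            for tile in read_source():
                pickle.dump(tile, dump)
                yield tile
        # Publish the dump only after the source has been read completely.
        os.replace(tmp_path, dump_path)
```

Passing `use_dump=False` forces a fresh read of the source and overwrites the dump. Note that the replay path never looks at the source, so a dump that is out of date is used without any warning.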

Of course, in your case clar_source_predict was generating a boio dump of its input (an MS full of null visibilities), and clar_source_solve was reusing this dump and trying to solve for it, completely ignoring your changes to the MS. Disabling the if-branch above in clar_source_solve eliminates this problem.

The one hassle with using boio dumps is that they incorporate a fixed tiling (hence the <tile_size> component in their filenames). So you'd have to repeat the predict every time you want to try a different tile size in the solve.
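One way to keep dumps with different tilings apart is to derive the dump name from both the MS name and the tile size. The sketch below assumes a made-up naming scheme; `boio_name_for` and the `boio.<ms>.<tile_size>` pattern are hypothetical and may not match what the actual scripts use.

```python
import os


def boio_name_for(msname, tile_size, prefix="boio"):
    """Return a dump filename that is unique per (MS, tile size) pair.

    Hypothetical naming scheme, for illustration only.
    """
    base = os.path.basename(msname.rstrip("/"))
    return "%s.%s.%d" % (prefix, base, tile_size)


# A different tile size maps to a different file, so a dump recorded
# with one tiling is never picked up by a run that uses another.
name_10 = boio_name_for("demo.MS", 10)
name_20 = boio_name_for("demo.MS", 20)
```

With a scheme like this, changing the tile size simply misses the existing dump, and the input has to be read (or the predict repeated) again for the new tiling.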

Is this relevant? ... from a later message on April 6, 2006 ...

I've eliminated the bottleneck that we were experiencing when writing to an MS. As usual, the gains are better than 10x. :-) The I/O overhead is barely noticeable now, since the writing is done in a separate thread. But please be on the lookout for new bugs (especially if you select subsets of channels, etc.)
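The general pattern of moving output into a separate thread can be sketched as follows. This is not the MeqTrees implementation: `BackgroundWriter` and `write_fn` are hypothetical names, and the sketch only shows how a queue lets the processing thread hand off tiles without waiting for each write.

```python
import queue
import threading


class BackgroundWriter:
    """Hand tiles to a worker thread that writes them in order.

    Hypothetical illustration: write_fn is any callable that writes one tile.
    """

    def __init__(self, write_fn, maxsize=8):
        self._write_fn = write_fn
        self._error = None
        # A bounded queue makes the producer block if writing falls far behind.
        self._queue = queue.Queue(maxsize=maxsize)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            tile = self._queue.get()
            if tile is None:  # sentinel: no more tiles
                return
            try:
                self._write_fn(tile)
            except Exception as exc:
                # Remember the first failure but keep draining the queue,
                # so that put() can never block forever.
                if self._error is None:
                    self._error = exc

    def put(self, tile):
        """Queue one tile for writing and return as soon as there is room."""
        self._queue.put(tile)

    def close(self):
        """Flush all queued tiles, stop the worker, and report any failure."""
        self._queue.put(None)
        self._thread.join()
        if self._error is not None:
            raise self._error
```

A caller would create the writer once, call `put()` for each tile as it is produced, and call `close()` at the end. Because `None` is used as the stop sentinel, it cannot be written as a tile in this sketch.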

Tony's 6-second, 4800-timeslot simulated CLAR MS provides an amusing point of reference. AIPS++ newsimulator takes 15 minutes (on birch) to create the empty MS with a zero DATA column; we need just over 5 minutes to fill the same MS with a simulated CLAR field.
