Jack Radcliffe edited this page Nov 30, 2023 · 19 revisions

Introduction

Welcome to the VPIPE wiki! This wiki outlines how to use the VLBI pipeline built by Jack Radcliffe. If you have any issues with the code, feel free to create an issue or open a pull request. This pipeline was built mostly by me, so mistakes are probably there!

The wiki is designed to explain and guide the user through the steps conducted by the pipeline. Each step has its own page, which you can reach from this page, and those pages will help you complete the parameter file. This home page provides an overview of the pipeline and includes some future directions. As of 04-07-2022, the pipeline has been tested on continuum data sets from the European VLBI Network (EVN), the Very Long Baseline Array (VLBA) and the Long Baseline Array (LBA); however, it can easily be adapted for other arrays.

The pipeline was built to accommodate the many large VLBI data sets on the horizon along with promoting and transitioning VLBI data processing from AIPS to CASA. This pipeline includes some wide-field VLBI specialist codes such as primary beam correction, the casting of calibration solutions to multiple phase centres and multi-source self-calibration.

Executing the pipeline

To start, I recommend learning how to execute the pipeline and what input files are required for execution.

In short, to execute the pipeline you should follow these steps (more details for each step are hyperlinked):

  1. Make a working folder where you will calibrate these data and clone the github repository (git clone https://github.com/jradcliffe5/VLBI_pipeline).
  2. Copy the input and parameter files (vlbi_pipe_inputs.txt and vlbi_pipe_params.json) to that working folder.
  3. Edit the input and parameter files to set the steps you want to execute and the parameters to run those steps with (see the wiki pages for each individual step).
  4. Make the bash scripts to run the steps using your chosen CASA executable (e.g., casa --nologger --log2term -c VLBI_pipeline/run_vlbi_pipe.py vlbi_pipe_inputs.txt for self-contained CASA, or python VLBI_pipeline/run_vlbi_pipe.py vlbi_pipe_inputs.txt for modular CASA, i.e., installed through pip).
  5. Run the steps using the runfile via bash vp_runfile.bash. This should submit all the jobs to the job manager software if you are using one.
  6. Repeat 4. and 5. for the next steps until the end. (run_vlbi_pipe.py must be run before new steps are executed so that it can make a new runfile).
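
As a rough illustration of steps 4 and 5, the sketch below assembles the commands quoted above as argument lists and runs them in order. The names build_commands, run_steps and the flavour labels "self_contained" and "modular" are hypothetical and not part of the pipeline; only the command strings themselves are taken from the steps above.

```python
import subprocess

PIPE_SCRIPT = "VLBI_pipeline/run_vlbi_pipe.py"  # relative to the working folder
INPUT_FILE = "vlbi_pipe_inputs.txt"


def build_commands(flavour):
    """Return [make_runfile_cmd, run_cmd] for the given CASA flavour.

    flavour is a hypothetical label: "self_contained" for self-contained
    CASA, "modular" for CASA installed through pip.
    """
    if flavour == "self_contained":
        make_runfile = ["casa", "--nologger", "--log2term", "-c",
                        PIPE_SCRIPT, INPUT_FILE]
    elif flavour == "modular":
        make_runfile = ["python", PIPE_SCRIPT, INPUT_FILE]
    else:
        raise ValueError(f"unknown CASA flavour: {flavour!r}")
    # Step 4 writes the runfile, step 5 executes it.
    return [make_runfile, ["bash", "vp_runfile.bash"]]


def run_steps(flavour):
    """Run step 4 then step 5, stopping at the first failure."""
    for cmd in build_commands(flavour):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them.
    for cmd in build_commands("modular"):
        print(" ".join(cmd))
```

Calling run_steps once per batch of steps mirrors step 6: the runfile is regenerated before each new set of steps is executed.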

Finally, please read the known issues before running the pipeline.

The VLBI pipeline takes two input files: the ASCII file (vlbi_pipe_inputs.txt) determines which steps will be conducted, while the JSON file (vlbi_pipe_params.json) determines the detailed parameters.

The input file

The input file tells the pipeline which steps to run. A 1 indicates that the step should be run and a 0 that it should not. Remember to set the steps that you have already run back to 0 when running further steps.

### Inputs for the VLBI pipeline ###
## Extra parameters to be set in vlbi_pipe_params.json

parameter_file_path = vlbi_pipe_params.json

## Steps to run ##
prepare_data = 0
import_fitsidi = 0
make_mms = 0
apriori_cal = 0
init_flag = 0
fit_autocorrs = 0
sub_band_delay = 0
bandpass_cal = 0
phase_referencing = 0
apply_target = 0
## Wide-field steps ##
apply_to_all = 0
mssc = 0
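
Resetting finished steps to 0 by hand is easy to forget, so here is a minimal sketch of how the step flags could be toggled programmatically. The helper set_steps is hypothetical and not part of the pipeline; it assumes step lines have the form "name = 0" or "name = 1" as in the listing above, and it leaves comments and parameter_file_path untouched.

```python
import re

# Matches lines such as "apriori_cal = 0"; comment lines and
# parameter_file_path (whose value is not 0/1) do not match.
STEP_LINE = re.compile(r"^(\s*)(\w+)(\s*=\s*)([01])\s*$")


def set_steps(text, steps_to_run):
    """Return the input-file text with only steps_to_run set to 1.

    Every other step line is set to 0. Raises KeyError if a requested
    step does not appear in the text.
    """
    wanted = set(steps_to_run)
    seen = set()
    out = []
    for line in text.splitlines():
        match = STEP_LINE.match(line)
        if match:
            indent, name, equals, _ = match.groups()
            seen.add(name)
            line = f"{indent}{name}{equals}{1 if name in wanted else 0}"
        out.append(line)
    missing = wanted - seen
    if missing:
        raise KeyError(f"steps not found in input file: {sorted(missing)}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "parameter_file_path = vlbi_pipe_params.json\n"
        "## Steps to run ##\n"
        "prepare_data = 1\n"
        "import_fitsidi = 0\n"
    )
    # prepare_data has already been run, so only import_fitsidi is enabled.
    print(set_steps(example, ["import_fitsidi"]))
```

Working on the text rather than the file keeps the sketch free of side effects; reading and writing vlbi_pipe_inputs.txt around it is left to the caller.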

The parameter file

With this set, you then need to edit the parameter file (vlbi_pipe_params.json) to get the pipeline working. The parameter file is a JSON file, and errors will occur if you use a different datatype (e.g., list, float) from the one specified in the default setup.

The global parameters are the most important to set first. Please look at the other wiki pages on how to set the parameters for the other various steps. The behaviour of the global parameters is explained below:

"global":{
   "project_code"      :  "eg078d",   # the project code that each data & calibration product will have.
   "cwd"               :  "/Users/",  # the current working directory where all data processing will be done
   "vlbipipe_path"     :  "VLBI_pipeline", # the path to the VLBI_pipeline git repo from the cwd
   "job_manager"       :  "bash",     # Set this to the job manager you wish to use (options bash | pbs | slurm). 
                                      # If you are running on your personal computer then use bash, or if you are using HPC 
                                      # with a job manager then SLURM and PBS Pro are currently supported
   "HPC_project_code"  :  "ASTR1313", # the HPC account used to track the CPU hours
   "default_partition" :  "normal",   # default HPC partition / queue you want to use
   "default_walltime"  :  "10:00:00", # default expected job wall-time 
   "default_nodes"     :     2,       # default number of nodes to use
   "default_cpus"      :    24,       # default number of cpus to use
   "default_mpiprocs"  :    24,       # default number of mpiprocesses (normally ok to set to same as cpus)
   "default_nodetype"  :  "haswell_reg", # default nodetype to use (important generally for PBS pro)
   "email_progress"    :  "", # email address to send job updates to
   "mpicasapath"       :  "", # path to mpicasa executable (if casa v5 this should end in mpicasa while casa v6 will be mpirun)
   "casapath"          :  "python", # path to casa executable (if casa v5 then end in casa while casa v6 will be python)
   "singularity"       :  false,    # use singularity to run casa (if so set casapath to singularity container + casa)
   "AOflag_command"    : ["module load chpc/singularity","singularity exec /mnt/lustre/users/jradcliffe/singularity_ims/kern5-dev.simg aoflagger"],
   "wsclean_command"    : ["/Volumes/HD-LXU3/anaconda2/anaconda2/bin/wsclean"],
   "fitsidi_files"     : ["auto"], #if auto, pipeline will try to find the files from fitsidi_path
   "fitsidi_path"      :  "raw_UV", #location of the fitsidifile to conduct calibration on.
   "refant"            : ["T6","O8","MC","TR","SV","ZC","BD","UR","NT","JB","CM","DA"], #reference antennae in order of preference
   "fringe_finders"    : ["3C345","DA193"], #fringe finders / bandpass calibrators
   "phase_calibrators" : ["J1241+602","J1234+619"], #phase reference sources
   "targets"           : ["HDFC0155"], #target fields
   "do_parang"         :  true #do parallactic angle correction?
}
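
Because a wrong datatype causes errors, it can help to compare an edited parameter block against the defaults before running. The sketch below is one possible check, not something the pipeline provides: find_type_mismatches is a hypothetical helper, and the two small JSON fragments merely stand in for the "global" block. Note that the # annotations in the listing above are explanatory only; Python's json module rejects comments, so the fragments here are comment-free. The exact type comparison used may be stricter than the pipeline actually requires.

```python
import json


def find_type_mismatches(defaults, edited):
    """Return (key, expected_type, found_type) for each differing entry.

    Uses an exact type comparison, so a string "2" or a float 2.0 is
    reported where the default is the integer 2. Keys absent from
    `edited` are reported with found_type "missing".
    """
    problems = []
    for key, default_value in defaults.items():
        if key not in edited:
            problems.append((key, type(default_value).__name__, "missing"))
        elif type(edited[key]) is not type(default_value):
            problems.append(
                (key, type(default_value).__name__, type(edited[key]).__name__)
            )
    return problems


# Illustrative fragments only; real files contain many more keys.
defaults = json.loads(
    '{"default_nodes": 2, "refant": ["T6", "O8"], "do_parang": true}'
)
edited = json.loads(
    '{"default_nodes": "2", "refant": "T6", "do_parang": true}'
)

for key, expected, found in find_type_mismatches(defaults, edited):
    print(f"{key}: expected {expected}, found {found}")
```

In this example default_nodes was entered as a string and refant as a single string rather than a list, so both are reported, while do_parang passes.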

Known caveats and issues

Please read the issues below before running the pipeline; being aware of them will help you run it efficiently.

  • Self-contained CASA version 6.6 with Python 3.8 (on Linux at least) does not come with pyfits and/or astropy installed. You need to use its python executable (e.g., casa-6.6.0-20-py3.8.el7/bin/python3) to install astropy via python3 -m pip install astropy.
  • The pipeline is currently not end-to-end. This is due to the init_flag step currently requiring some metadata from the measurement set to produce the bash scripts to control the jobs. This means that, unless import_fitsidi has been run, you need to run everything before init_flag and then you can run the rest of the pipeline.
  • Sometimes the make_mms step corrupts the data and the resultant images are all NaNs. The cause is not yet known, but it is good to be aware of in case you encounter it.
  • The apply_to_all step needs a modification to casampi if you are using bash rather than a job manager (slurm / PBS Pro). Follow the instructions on the apply_to_all page.
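
For the first caveat, a quick way to see whether a given interpreter can import astropy is sketched below. The helper has_module is hypothetical; the idea is to run the script with the same python executable that the pipeline will use (for example the one inside the self-contained CASA distribution), since a package installed for a different interpreter would not be found.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if the top-level module `name` is importable here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    for name in ("astropy",):
        if has_module(name):
            print(f"{name}: available")
        else:
            # Suggest the pip command from the caveat, using this interpreter.
            print(f"{name}: missing, try: {sys.executable} -m pip install {name}")
```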

Setting the correct parameters for each step

This section outlines the steps in order of execution; visit each step's page to see how to set the correct parameters.

Now that you know the ropes, follow the upcoming pages so you can set the appropriate parameters (in vlbi_pipe_params.json) and calibrate your VLBI data. Note that the name in brackets corresponds to the step name in the vlbi_pipe_inputs.txt file. For a first run, it is recommended to run each step individually.

Standard VLBI steps

  1. Preparing fitsidi data - EVN/LBA only (prepare_data)
  2. Import fitsidi data to CASA measurement set (import_fitsidi)
  3. Parallelise calibration using multi-MS (make_mms)
  4. Derive a-priori calibration products (apriori_cal)
  5. Flag RFI and other unwanted/bad data (init_flag)
  6. Scalar bandpass correction using autocorrelations (fit_autocorrs)
  7. Instrumental / sub-band delay calibration (sub_band_delay)
  8. Bandpass calibration (bandpass_cal)
  9. Multi-band delay and phase referencing (phase_referencing)
  10. Apply solutions to target (apply_target)

Wide-field steps

  1. Apply calibration to all targets (apply_to_all)
  2. (experimental) Target self-calibration using MSSC (mssc)

Additional Scripts

The pipeline also comes with some additional scripts that can help you obtain better results from the pipeline. Please follow the links for information on the codes:

  • check_antab.py - debugging script which can find some of the most common errors in the ANTAB files.
  • adjust_gainpaths.py - script to change paths so that pipelines can be transferred to other machines for further processing.
  • plotcaltable_standalone.py - script to plot CASA calibration tables using matplotlib.
  • copy_ms_pols.py - copies one polarisation to the other, useful as CASA cannot deal with single pols (defunct as of CASA v5.8/6.3 with the development of the corrdepflags parameter in calibration tasks).