# LINC tutorial - Jan 2024

author: Etienne Bonnassieux etienne.bonnassieux@uni-wuerzburg.de
Official documentation: https://linc.readthedocs.io/en/latest/

## This document

The purpose of this document is to provide an intuitive, user-friendly addition to the official LINC documentation, in order to allow new LOFAR users to run the initial calibration pipeline on their data. We will go through defining our paths, datasets, and so on; we will then show how to first install LINC and then deploy it on the data as defined here.

## Prerequisites

The current software prerequisites for LINC are a huge improvement over past iterations: in principle, all that is strictly required is for git and python3 virtual environments to be installed. If they are not present, you can follow the steps below to install them along with the Python requirements (assuming a Debian-based Linux machine such as Ubuntu, since the commands use apt):

In [None]:
sudo apt update && sudo apt upgrade -y
sudo apt-get install git -y
sudo apt install python3-pip -y
sudo apt install build-essential libssl-dev libffi-dev python3-dev -y
sudo apt install python3-venv -y

This will, sequentially:

1. update your software list and upgrade the packages already installed
2. install git, easy peasy
3. install pip, necessary for convenient python package installation
4. install the build tools and development headers that some python packages need in order to compile, in case your distribution is out of date
5. install python3-venv itself, which is the python3 virtual environment package.

With this done, you are able to proceed to the next steps.
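
If you would like to confirm that the prerequisites are in place before continuing, a short check along the following lines can help. This is only a minimal sketch: the helper name `check_prerequisites` is illustrative and not part of LINC, and the remark about `ensurepip` reflects how Debian-based distributions usually package python3-venv, which may differ on your system.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Return a dict mapping each prerequisite to True if it was found."""
    return {
        # git must be on PATH so that the pipeline repository can be cloned
        "git": shutil.which("git") is not None,
        # the standard-library module behind `python3 -m venv`
        "venv": importlib.util.find_spec("venv") is not None,
        # assumption: on Debian/Ubuntu this module is usually what is
        # missing when python3-venv has not been installed
        "ensurepip": importlib.util.find_spec("ensurepip") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any entry reported as missing points back to the corresponding install command above.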


## I. Defining our environment variables

In order to facilitate this tutorial, we will define a series of important environment variables. **This step is crucial** in order to deploy the pipeline on a variety of environments. You will need to define:

1. The absolute path of your working directory. This is where you will want the pipeline to run, where you will place its input files, and where it will place its outputs. Note that **you will need a reasonable data quota at this specific location**: under **no circumstances** should it EVER be your home directory! If in doubt, ask your friendly local sysadmin where your data storage is located on your computational architecture.
2. The absolute path to the datasets you want to reduce. At present LINC can only reduce one observation at a time; multiple observations will need sequential LINC calls on their respective datasets.
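
As a rough sanity check on the first point, the sketch below tests whether a candidate working directory lies inside your home directory and reports the free space on its filesystem. The function names are illustrative, not part of LINC. Note also that `shutil.disk_usage` reports free space on the whole filesystem, which is not necessarily the same as your personal quota; your sysadmin remains the authority on that.

```python
import shutil
from pathlib import Path


def free_space_gb(path):
    """Free space, in GB, on the filesystem that holds `path` (must exist)."""
    return shutil.disk_usage(path).free / 1e9


def is_inside_home(path, home=None):
    """True if `path` is the home directory or lies somewhere below it.

    `home` defaults to the current user's home directory.
    """
    home = Path(home).resolve() if home is not None else Path.home().resolve()
    target = Path(path).resolve()
    return target == home or home in target.parents


if __name__ == "__main__":
    candidate = "/data/LOFAR/LBA"  # example working directory used in this tutorial
    print("inside home directory:", is_inside_home(candidate))
    if Path(candidate).exists():
        print(f"free space: {free_space_gb(candidate):.0f} GB")
```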

In the examples below, I have defined the environment variables for our local compute infrastructure in Würzburg. **If you don't change these for your use case, nothing will work**: path errors are the most common cause of errors and bugs with LINC, so please check that each of these folders and files exists with a quick ls!


In [None]:
working_dir = '/data/LOFAR/LBA'
data_dir    = '/data/LOFAR/LBA/DATA/3C380'
# mslist      = glob.glob(data_dir+'/L671058_SB*MS')
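
The "quick ls" check can also be done from Python. The sketch below reports any directory that does not exist and lists the measurement sets matching a glob pattern, mirroring the commented-out `mslist` line. The helper names `missing_paths` and `list_measurement_sets` are illustrative and not part of LINC, and the paths are the Würzburg examples from this tutorial, so replace them with your own.

```python
import glob
import os


def missing_paths(*paths):
    """Return the subset of `paths` that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


def list_measurement_sets(data_dir, pattern="*MS"):
    """Sorted list of entries in `data_dir` whose names match `pattern`."""
    return sorted(glob.glob(os.path.join(data_dir, pattern)))


if __name__ == "__main__":
    working_dir = "/data/LOFAR/LBA"
    data_dir = "/data/LOFAR/LBA/DATA/3C380"
    for path in missing_paths(working_dir, data_dir):
        print(f"missing directory: {path}")
    mslist = list_measurement_sets(data_dir, "L671058_SB*MS")
    print(f"found {len(mslist)} measurement sets")
```

If the script reports a missing directory or finds zero measurement sets, fix the paths before going any further.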

## II. Acquiring the pipeline

Once the prerequisites for LINC are installed, the pipeline itself can be acquired and installed with the following commands:

In [None]:
# $working_dir is assumed to hold the working directory path defined in section I
git clone https://git.astron.nl/RD/LINC.git "$working_dir/LINC"
cd "$working_dir/LINC"
./build_venv.sh
source venv/bin/activate
pip install --upgrade 'toil[cwl]'
pip install cwltool==3.1.20220628170238

This will, in order:

1. clone the pipeline repository with git
2. go to the cloned repository
3. create the virtual environment
4. activate it
5. install the necessary Common Workflow Language processing tool (in this case, toil)
6. install a version of a second Common Workflow Language processing tool (in this case, cwltool) which is known to work with LINC

And with that, you're done! Note that you will need to reactivate the virtual environment (the `source venv/bin/activate` step) each time you open a new terminal in which to start the pipeline. We will therefore add this command in the configuration and execution steps.
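
To check from Python whether the virtual environment is currently active and which versions of the two CWL tools are installed, something like the following sketch can be used. The function names are illustrative and not part of LINC; the check relies on the standard behaviour that `sys.prefix` differs from `sys.base_prefix` inside a venv.

```python
import sys
from importlib import metadata


def in_virtualenv():
    """True when the interpreter is running inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(package):
    """Installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    for package in ("toil", "cwltool"):
        print(f"{package}: {installed_version(package) or 'not installed'}")
```

If the environment is reported as inactive, or either package is missing, activate the environment again before launching the pipeline.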

## III. Configuring the pipeline

We now build scripts which will allow us to create the pipeline configuration files in a straightforward way.