AMIGA-IAA/hcg-16

1) Pipeline for HCG-16 Project

This repository hosts a pipeline to reproduce the data reduction and analysis of Jones et al. 2019.

Here are the steps to run the pipeline:

First make sure you create and go to a new working directory:

mkdir pipeline-run
cd pipeline-run

Then execute:

curl -O https://raw.githubusercontent.com/AMIGA-IAA/hcg-16/master/run.sh
bash run.sh

run.sh automatically downloads everything the pipeline needs, creates the hcg-16 conda environment, and runs the pipeline.

If you have already downloaded everything and just want to re-run the pipeline, you can do so with the following command (note that the hcg-16 conda environment created by run.sh must be active):

python pipeline.py make plotting --local --timeit=pipeline.time
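
Because the re-run only works with the hcg-16 environment active, it can help to check that first. The sketch below relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated; env_is_active is a hypothetical helper, not part of this repository.

```shell
# env_is_active NAME: succeed if the conda environment NAME is currently active.
# (Hypothetical helper; conda exports CONDA_DEFAULT_ENV on activation.)
env_is_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if env_is_active hcg-16; then
    echo "hcg-16 is active; the pipeline can be re-run"
else
    echo "hcg-16 is not active; run 'conda activate hcg-16' first" >&2
fi
```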

Before re-running the pipeline, move or delete any existing output files. At present some of these files are not overwritten by default, which may cause the pipeline to crash or fail to generate the desired output.
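
One way to move old outputs out of the way is to collect them in a dated directory. This is only a sketch: stash_outputs is a hypothetical helper, and apart from pipeline.time (the timing file named in the command above) the list of output files is not given here, so you would need to supply the names yourself.

```shell
# stash_outputs DIR FILE...: move each FILE that exists into DIR.
# (Hypothetical helper; DIR is only created if there is something to move.)
stash_outputs() {
    local dest=$1 f
    shift
    for f in "$@"; do
        # Skip names that do not exist, e.g. outputs from steps that never ran.
        [ -e "$f" ] || continue
        mkdir -p "$dest" && mv -- "$f" "$dest"/ || return 1
    done
}

# Example: set aside the timing file from a previous run, if there is one.
stash_outputs "previous-run-$(date +%Y%m%d-%H%M%S)" pipeline.time
```

Moving rather than deleting keeps the earlier results available for comparison.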

Pre-requisites

For the pipeline to execute successfully it requires approximately 20 GB of free disk space. Approximately 10 GB will be used in the directory where the pipeline is executed, and about 9 GB will be used in ~/.udocker.
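
The space requirement can be checked before starting. The following is a minimal sketch using df; free_gb and has_space are hypothetical helpers, and the thresholds are simply the figures quoted above.

```shell
# free_gb PATH: print the free space, in whole GB, on the filesystem holding PATH.
# (Hypothetical helper, not part of the hcg-16 repository.)
free_gb() {
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte blocks;
    # column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# has_space PATH MIN_GB: succeed if PATH has at least MIN_GB of free space.
has_space() {
    [ "$(free_gb "$1")" -ge "$2" ]
}

# ~/.udocker lives under $HOME, so that filesystem is checked for the 9 GB.
has_space . 10 || echo "Warning: less than 10 GB free in $(pwd)" >&2
has_space "$HOME" 9 || echo "Warning: less than 9 GB free in $HOME" >&2
```

If the working directory and your home directory are on the same filesystem, the two amounts add up, so check for roughly 20 GB there instead.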

2) Plots for HCG-16 Project on the cloud

If you do not wish to run the whole pipeline on your local machine, (almost) all the figures of this project can be regenerated and modified using mybinder or EGI Replay. There is a Jupyter notebook for each figure, which can be run using a link to mybinder or EGI Replay. These services can take some time to start, so please be patient.

To test integration with EGI Notebooks in the context of the European Open Science Cloud, the Jupyter notebooks have also been uploaded to EUDAT and are now discoverable via OpenAIRE Explore.

3) Plots for HCG-16 Project in local environment

Alternatively, you may regenerate the plots by running the notebooks on your own machine.

First, you will need to clone the repository:

git clone https://github.com/AMIGA-IAA/hcg-16.git
cd hcg-16

Download and install conda (if conda is already available on your system you may skip this step):

curl -o Miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-4.6.14-Linux-x86_64.sh
bash Miniconda.sh -b -p conda-install
source conda-install/etc/profile.d/conda.sh

Create and activate the conda environment:

# mamba is recommended
conda install mamba -c conda-forge --yes
mamba env create --file hcg-16.yml
conda activate hcg-16

You will also need to download the data files that are used in the notebooks by running the following commands:

cd plot_scripts
wget https://b2share.eudat.eu/api/files/878dbee0-01bf-4b85-8ed3-71818cd223bf/HCG16_final_data.tar.gz
tar -xzf HCG16_final_data.tar.gz
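
If the download was interrupted, extraction may fail or produce a partial data set. As a sketch, the archive can be tested with gzip -t and listed with tar -tzf before unpacking; archive_ok is a hypothetical helper, and no official checksum is assumed here.

```shell
# archive_ok FILE: succeed if FILE is a readable gzip-compressed tar archive.
# (Hypothetical helper; it checks integrity only, not the contents.)
archive_ok() {
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

archive=HCG16_final_data.tar.gz
if [ -f "$archive" ]; then
    if archive_ok "$archive"; then
        echo "$archive looks intact"
    else
        echo "$archive is incomplete or corrupt; download it again" >&2
    fi
fi
```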

Finally, you are ready to launch the Jupyter server and open the notebooks:

jupyter notebook 

Pre-requisites

Please make sure you have conda and git installed on your computer before cloning the repository and running conda commands.
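
A quick way to confirm both tools are available is to look them up with command -v. This is a minimal sketch; missing_tools is a hypothetical helper.

```shell
# missing_tools TOOL...: print the name of each TOOL that cannot be found.
# (Hypothetical helper; command -v also finds shell functions, which is how
# conda appears after sourcing conda.sh.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools conda git)
if [ -n "$missing" ]; then
    echo "Missing required tools:" $missing >&2
else
    echo "conda and git are both available"
fi
```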

4) Final data and visualisation

If you are only interested in downloading the final reduced data cubes, these are stored on the EUDAT B2SHARE service (DOI: 10.23728/b2share.a69a7b2dcc22449e8734552dde4d3906). The single cube on which most of the analysis depends is also available from CDS. There is also a 3D visualisation of the HI data cube hosted on the AMIGA webpage.