*This notebook was developed by Marina Ricci for the DESC CL_Cosmo_Pipeline team.*
# This notebook presents the cluster pipeline for stacked $\Delta\Sigma$ profiles and counts in richness/redshift bins.

It is meant to be run independently. If you have already produced the necessary outputs, `ceci` will reuse them; otherwise it will create them.

___

In [2]:
import os
from pprint import pprint

import ceci
import h5py

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import yaml
from IPython.display import Image
from astropy.table import Table

import re
import sacc

Make sure to change the path in the next cell so that it points to your TXPipe directory. See the examples for IN2P3 and NERSC below.

In [3]:
# user specific paths -- IN2P3 example
my_txpipe_dir = "/pbs/home/m/mricci/throng_mricci/desc/TXPipe"

# user specific paths -- NERSC example
# my_txpipe_dir = "/pscratch/sd/a/avestruz/TXPipe"

os.chdir(my_txpipe_dir)

import txpipe
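Every relative path in this notebook is resolved from the TXPipe directory, so a wrong path only shows up later as a missing-file error. Below is a minimal sketch of an early check. It assumes that a valid checkout contains an `examples` subdirectory (as the paths used in this notebook suggest); the helper name `check_txpipe_dir` is hypothetical and not part of TXPipe.

```python
import os


def check_txpipe_dir(path, marker="examples"):
    """Return the absolute path if it looks like a TXPipe checkout.

    `marker` is a subdirectory expected inside the checkout; using
    "examples" is an assumption based on the paths in this notebook.
    """
    path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(path):
        raise FileNotFoundError(f"No such directory: {path}")
    if not os.path.isdir(os.path.join(path, marker)):
        raise FileNotFoundError(f"{path} has no '{marker}' subdirectory")
    return path
```

It could be used as `os.chdir(check_txpipe_dir(my_txpipe_dir))`, so that a typo in the path fails immediately with an explicit message.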

___

# 1 - Launching the pipeline

## **Pipeline approach**

Here we use the 20deg2 files, but the 1deg2 files work as well: just replace `20deg2` with `1deg2` in the file names.
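The renaming can also be done programmatically. This is a small sketch, assuming the 1deg2 files follow exactly the same naming scheme as the 20deg2 ones; the helper `switch_area` is hypothetical.

```python
def switch_area(path, old="20deg2", new="1deg2"):
    """Swap the survey-area tag in a file name or path."""
    if old not in path:
        raise ValueError(f"'{old}' not found in '{path}'")
    return path.replace(old, new)


sub_20 = "examples/cosmodc2/Cluster_pipelines/20deg2-in2p3.sub"
sub_1 = switch_area(sub_20)
print(sub_1)  # examples/cosmodc2/Cluster_pipelines/1deg2-in2p3.sub
```

The explicit `ValueError` avoids silently returning an unchanged path when the tag is absent.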


Let's have a look at the submission script for this pipeline:
- to work at CCin2p3 we can use `examples/cosmodc2/Cluster_pipelines/20deg2-in2p3.sub`
- to work at NERSC we can use `examples/cosmodc2/Cluster_pipelines/20deg2-nersc.sub`

## **Commands to run the pipeline**
This launches a job of up to one hour (it should finish in about 30 min) on a single CC-IN2P3 node to run the pipeline. After the first run the output files already exist, so subsequent runs take much less time.




> ### In a terminal, **navigate to your TXPipe directory on IN2P3 and run**:
> ```
> sbatch examples/cosmodc2/Cluster_pipelines/20deg2-in2p3.sub
> ```


> ### If you are **on NERSC, you will instead run**:
> ```
> sbatch examples/cosmodc2/Cluster_pipelines/20deg2-nersc.sub
> ```

If you are at CCin2p3, you can look at the output of your submission in the file `slurm-xxx.out`, where xxx is the number of your batch job. If you see *'Pipeline successful.  Joy is sparked.'*, congratulations: it worked!
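When several jobs have been submitted, it can be convenient to scan all the logs at once. The sketch below looks for the success message in every `slurm-*.out` file of a directory. It assumes the logs sit in the directory the job was submitted from and only matches the first part of the message; the function name `successful_logs` is hypothetical.

```python
import glob
import os


def successful_logs(directory=".", marker="Pipeline successful"):
    """Return the slurm-*.out files in `directory` that contain `marker`."""
    hits = []
    for log in sorted(glob.glob(os.path.join(directory, "slurm-*.out"))):
        # errors="replace" keeps the scan going if a log has odd bytes
        with open(log, "r", errors="replace") as handle:
            if any(marker in line for line in handle):
                hits.append(log)
    return hits
```

Calling `successful_logs()` from the TXPipe directory returns the list of log files for runs that completed; an empty list means no run has reported success yet.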

Once the pipeline has run, you can go directly to section 4 to look at the results.

___

# 2 - Looking at the different pipeline files

### **Let's look at the submission script:**

=> If we use the CCin2p3 example:

In [21]:
! cat examples/cosmodc2/Cluster_pipelines/20deg2-in2p3.sub

=> If we use the NERSC example:

In [22]:
! cat examples/cosmodc2/Cluster_pipelines/20deg2-nersc.sub

### **The submission script specifies the pipeline YAML file. Let's look at it:**

=> The only differences between NERSC and CCin2p3 are in the first block, which defines the machine you run on.

In [112]:
! cat examples/cosmodc2/Cluster_pipelines/pipeline-20deg2-CL-nersc.yml
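To verify which top-level blocks differ between the two site configurations, the parsed YAML files can be compared key by key. The sketch below works on plain dictionaries, such as those returned by `yaml.safe_load`. The two small dictionaries, including their key names and values, are hypothetical placeholders and not the content of the real pipeline files.

```python
def differing_keys(cfg_a, cfg_b):
    """Return the sorted top-level keys whose values differ between two configs."""
    keys = set(cfg_a) | set(cfg_b)
    missing = object()  # sentinel so that absent keys count as different
    return sorted(k for k in keys if cfg_a.get(k, missing) != cfg_b.get(k, missing))


# Hypothetical, heavily shortened configurations for illustration only;
# in practice each would come from yaml.safe_load on a pipeline file.
config_a = {"site": {"name": "machine_a"}, "stages": [{"name": "StageA"}]}
config_b = {"site": {"name": "machine_b"}, "stages": [{"name": "StageA"}]}
print(differing_keys(config_a, config_b))  # ['site']
```

Applied to the two real files, the returned list shows at a glance whether anything other than the machine definition has diverged.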

___

# 3 - Producing and looking at the pipeline diagram

Here we run the pipeline in "dry-run" mode, to check that it can run and to produce a pipeline diagram as a PNG file.

In [4]:
# Read the appropriate pipeline configuration, and ask for a flow-chart.

pipeline_file = "examples/cosmodc2/Cluster_pipelines/pipeline-20deg2-CL-in2p3.yml"
# pipeline_file = "examples/cosmodc2/Cluster_pipelines/pipeline-20deg2-CL-nersc.yml"
flowchart_file = "CL_pipeline.png"


pipeline_config = ceci.Pipeline.build_config(pipeline_file, flow_chart=flowchart_file, dry_run=True)

# Run the flow-chart pipeline
ceci.run_pipeline(pipeline_config);

Here we have 6 uncommented stages and 5 uncommented input files.

This translates in the pipeline chart into 6 red ellipses and 5 yellow boxes. The blue boxes represent the output files.

In [None]:
Image(flowchart_file)
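The number of stages and input files can be cross-checked against the parsed pipeline file, since a YAML parser ignores commented lines. This is a minimal sketch, assuming the configuration has a `stages` list whose entries carry a `name` and an `inputs` mapping from tag to file path; the miniature configuration is hypothetical.

```python
def summarize_pipeline(config):
    """Return (stage names, input tags) from a parsed pipeline configuration.

    Assumes `stages` is a list of mappings with a `name` entry and
    `inputs` is a mapping from tag to file path.
    """
    stages = [stage["name"] for stage in config.get("stages", [])]
    inputs = sorted(config.get("inputs", {}))
    return stages, inputs


# Hypothetical miniature configuration, for illustration only.
example = {
    "stages": [{"name": "StageA"}, {"name": "StageB"}],
    "inputs": {"catalog_a": "a.hdf5", "catalog_b": "b.hdf5"},
}
stages, inputs = summarize_pipeline(example)
print(len(stages), "stages;", len(inputs), "inputs")  # 2 stages; 2 inputs
```

With the real file, the dictionary would come from `yaml.safe_load`, and the two counts should match the red ellipses and yellow boxes of the diagram.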

___

# 4 - Opening and looking at the outputs

In [5]:
# At the moment the output file is a pickle file
import pickle as pkl

### **Open the pipeline file to load correct input/output file names**

In [6]:
with open(pipeline_file, "r") as file:
    pipeline_content = yaml.safe_load(file)

### **Open the output**

In [7]:
with open(os.path.join(pipeline_content["output_dir"], "cluster_profiles.pkl"), "rb") as f:
    data = pkl.load(f)  # the context manager closes the file after loading
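If the pipeline has not finished, the output file does not exist yet and the load fails. The sketch below wraps the load in a hypothetical helper, `load_profiles`, that reports the missing file explicitly. The default file name is the one used in this notebook.

```python
import os
import pickle


def load_profiles(output_dir, filename="cluster_profiles.pkl"):
    """Load the pickled cluster profiles, with an explicit error if absent."""
    path = os.path.join(output_dir, filename)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} not found; has the pipeline finished running?"
        )
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

It could be called as `load_profiles(pipeline_content["output_dir"])`. As with any pickle file, only load files that you or a trusted pipeline run produced.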

In [8]:
data

The output is a dictionary containing a `CLMM` `ClusterEnsemble` object for each redshift/richness bin.

### **Exploring the output**

In [9]:
example_bin = 'bin_zbin_0_richbin_0'
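The bin key encodes the redshift and richness bin indices. Below is a small sketch that extracts them, assuming every key follows the same pattern as `bin_zbin_0_richbin_0`; the helper `parse_bin_key` is hypothetical.

```python
import re

# Pattern inferred from the example key; other key formats are not handled.
BIN_KEY = re.compile(r"^bin_zbin_(\d+)_richbin_(\d+)$")


def parse_bin_key(key):
    """Return (redshift bin index, richness bin index) from a bin key."""
    match = BIN_KEY.match(key)
    if match is None:
        raise ValueError(f"Unexpected bin key: {key!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_bin_key("bin_zbin_0_richbin_0"))  # (0, 0)
```

Under that assumption, `sorted(data, key=parse_bin_key)` would list the bins ordered by redshift bin and then by richness bin.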

In [10]:
#This is the info for this bin
data[example_bin]

In [13]:
# This shows the table with all clusters in this bin and their corresponding profiles
data[example_bin]['clmm_cluster_ensemble'].data

In [15]:
# This shows the metadata attached to the table of clusters in this bin
data[example_bin]['clmm_cluster_ensemble'].data.meta

In [16]:
# This shows the metadata of the ensemble stacked profiles
data[example_bin]['clmm_cluster_ensemble'].stacked_data.meta

In [21]:
# This shows the covariance for the ensemble stacked profiles
data[example_bin]['clmm_cluster_ensemble'].cov

### **Plot the output**

In [111]:
fig, ax = plt.subplots()

ax.semilogx(
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["radius"],
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["tangential_comp"],
        "bx-",
        label="tan",)

ax.semilogx(
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["radius"],
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["cross_comp"],
        "r.-",
        label="cross",)

ax.errorbar(
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["radius"],
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["tangential_comp"],
        yerr=data[example_bin]['clmm_cluster_ensemble'].cov["tan_sc"].diagonal() ** 0.5,
        color="blue",)


ax.errorbar(
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["radius"],
        data[example_bin]['clmm_cluster_ensemble'].stacked_data["cross_comp"],
        yerr=data[example_bin]['clmm_cluster_ensemble'].cov["cross_sc"].diagonal() ** 0.5,
        color="red",)



ax.set_xlabel('radius [Mpc]')
ax.set_ylabel('$\\Delta \\Sigma$')

plt.legend()
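The error bars in the plot are the square roots of the diagonal of the covariance matrices `cov["tan_sc"]` and `cov["cross_sc"]`. The off-diagonal terms, which describe how radial bins are correlated, are easier to read as a correlation matrix. The sketch below computes both on a toy 2x2 matrix. It is written with plain lists so that it runs without dependencies and assumes strictly positive variances; with NumPy the errors are simply `np.sqrt(np.diag(cov))`.

```python
import math


def errors_and_correlation(cov):
    """Return (1-sigma errors, correlation matrix) for a square covariance."""
    n = len(cov)
    sigma = [math.sqrt(cov[i][i]) for i in range(n)]
    corr = [
        [cov[i][j] / (sigma[i] * sigma[j]) for j in range(n)]
        for i in range(n)
    ]
    return sigma, corr


# Toy covariance matrix, not taken from the pipeline output.
cov = [[4.0, 1.0], [1.0, 1.0]]
sigma, corr = errors_and_correlation(cov)
print(sigma)       # [2.0, 1.0]
print(corr[0][1])  # 0.5
```

Correlation coefficients close to zero between radial bins indicate that the diagonal error bars capture most of the uncertainty.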