# DisPerSE & Filament Identification Tutorial

This Jupyter notebook aims to teach you how to use the Discrete Persistent Structures Extractor (DisPerSE) to find structures in the galaxy distribution, such as filaments and clusters. Much of the information here can also be found on the DisPerSE website (https://www2.iap.fr/users/sousbie/web/html/indexd41d.html), which includes manuals and tutorials for all of DisPerSE's functions as well as more in-depth explanations of how DisPerSE works.

Before getting into DisPerSE, I will import some of the libraries that will be used later on.

In [2]:
# Python Libraries
import numpy as np
from astropy.table import Table, Column, join
from astropy.coordinates import SkyCoord
from numpy import *
import matplotlib.pyplot as plt
%matplotlib inline
from astropy.wcs import WCS
from astropy.io import fits
from matplotlib.colors import LogNorm
from astropy.utils.data import download_file
import warnings
import os
import pandas as pd
from astropy.io import ascii

warnings.filterwarnings('ignore')

### Compiling and Installing DisPerSE

I was able to compile DisPerSE on a virtual machine running Ubuntu 20.04 via Windows WSL. DisPerSE can be downloaded from the DisPerSE website linked above or from GitHub (https://github.com/thierry-sousbie/DisPerSE). Two libraries/programs are required to use even the most basic functions of DisPerSE: CMake and GSL. You will also want CGAL, which is used by the delaunay_2D and delaunay_3D functions. These output files in a format (.NDnet) that can be fed into other DisPerSE functions.

Once you have downloaded DisPerSE, navigate to the "disperse/build" directory and run "cmake ../" to configure the makefiles. Installing should then be as simple as running "make install". Afterwards you should find several executables in the "disperse/bin" folder. The most important ones, and the ones I used, are mse, delaunay_2D, delaunay_3D, and skelconv. There are other executables as well, but many of them are not essential.
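
As a quick sanity check after "make install", the sketch below looks for the four executables named above in a given bin directory. The helper name `find_disperse_tools` and the example location `~/disperse/bin` are my own assumptions for illustration; point `bin_dir` at wherever your DisPerSE build actually lives.

```python
import os

# Executables named in the text above; extend the tuple if you use other tools.
DISPERSE_TOOLS = ("mse", "delaunay_2D", "delaunay_3D", "skelconv")


def find_disperse_tools(bin_dir, tools=DISPERSE_TOOLS):
    """Return {tool_name: full_path or None} for each expected executable."""
    found = {}
    for tool in tools:
        path = os.path.join(bin_dir, tool)
        # A tool counts as installed only if the file exists and is executable.
        is_ok = os.path.isfile(path) and os.access(path, os.X_OK)
        found[tool] = path if is_ok else None
    return found


# Hypothetical install location; replace with your own DisPerSE directory.
bin_dir = os.path.join(os.path.expanduser("~"), "disperse", "bin")
for tool, path in find_disperse_tools(bin_dir).items():
    print(f"{tool:12s} {'OK' if path else 'missing'}")
```

If any tool is reported as missing, the build most likely skipped it, for example because a library such as CGAL was not found when cmake configured the project.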

### Making an input file for DisPerSE

As input, DisPerSE needs the coordinates of the galaxies in the form of either RA, DEC, and Z or SGX, SGY, and SGZ. The easiest way to provide them is to write the contents of a catalog out to an ASCII text file, which is one of the file formats DisPerSE can read. Each column must also be named with a specific keyword in the header so that DisPerSE recognizes what that column's data represents. I will show an example of what I did below, but further examples of what this ASCII file should look like can be found at https://www2.iap.fr/users/sousbie/web/html/index744c.html?post/survey_ascii-format.

In [3]:
# Defines a home directory and path to the catalog I want to use
homedir = os.getenv("HOME")
catalog_path='/home/evan-barkus/Downloads/'

In [5]:
# VFS main catalog
maintab = Table.read(os.path.join(catalog_path, 'vf_v2_main.fits'))

# VFS environment catalog (has SG coords)
envtab = Table.read(os.path.join(catalog_path, 'vf_v2_environment.fits'))

In [6]:
# Cuts on VFS catalog from Zakharova+24
racut = (maintab['RA'] > 100) & (maintab['RA'] < 280)
deccut = (maintab['DEC'] > -1.3) & (maintab['DEC'] < 75)
vrcut = (maintab['vr'] > 500) & (maintab['vr'] < 3300)
cut = vrcut & deccut & racut

cuttab = maintab[cut]

In [9]:
c = 3*10**5  # Speed of light in km/s (rounded from 299792.458)

ra = np.array(cuttab['RA']) # Right ascension of each galaxy
dec = np.array(cuttab['DEC']) # Declination of each galaxy
z = np.array(cuttab['vr']/c) # Redshifts (VFS has recessional velocities so I had to convert them to z by dividing by c)

id = np.arange(0,len(ra)) # It is also useful to have a column of indices in the DisPerSE input file
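
Dividing the recessional velocity by c is the low-redshift approximation z ≈ v/c, which is reasonable for the velocity range kept by the cuts above (500 to 3300 km/s). The sketch below, using a hypothetical helper `velocity_to_redshift`, shows how little the rounded value c = 3*10**5 km/s changes the result compared with the exact value of c.

```python
C_KM_S = 299792.458  # Speed of light in km/s (exact by definition)


def velocity_to_redshift(v_km_s, c_km_s=C_KM_S):
    """Low-redshift approximation z = v / c."""
    return v_km_s / c_km_s


# Compare the exact and rounded speed of light at the edges of the velocity cut.
for v in (500.0, 3300.0):
    z_exact_c = velocity_to_redshift(v)
    z_round_c = velocity_to_redshift(v, c_km_s=3.0e5)
    rel_diff = abs(z_exact_c - z_round_c) / z_exact_c
    print(f"v = {v:6.0f} km/s  z = {z_exact_c:.6f}  "
          f"(c = 3e5: {z_round_c:.6f}, relative difference {rel_diff:.2%})")
```

The relative difference is the same at every velocity because it depends only on the ratio of the two values of c.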

In [12]:
# Defines the columns of the input file
col1 = fits.Column(name='ra', format='D', array=ra)
col2 = fits.Column(name='dec', format='D', array=dec)
col3 = fits.Column(name='z', format='D', array=z)
col4 = fits.Column(name='id', format='D', array=id)

In [13]:
# Combines the columns into an in-memory FITS binary table (nothing is written to disk yet)
coldefs = fits.ColDefs([col1, col2, col3, col4])
hdu = fits.BinTableHDU.from_columns(coldefs)

In [None]:
# Writes the FITS table "hdu" out to a plain-text (ASCII) file named "vfs_ascii_new"
ascii.write(hdu.data, 'vfs_ascii_new', names=['ra', 'dec', 'z', 'id'], overwrite=True)

And with that, we now have a file that DisPerSE can read.
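
Before handing the file to DisPerSE, it can be worth printing its first few lines to confirm that the header row carries the column keywords (ra, dec, z, id) and to compare its layout against the survey_ascii examples on the DisPerSE website. The helper `preview_lines` below is a hypothetical convenience function, not part of DisPerSE or astropy.

```python
import os
from itertools import islice


def preview_lines(path, n=3):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, "r") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


# "vfs_ascii_new" is the file written in the previous cell.
if os.path.exists("vfs_ascii_new"):
    for line in preview_lines("vfs_ascii_new"):
        print(line)
```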

### Running DisPerSE

The first thing you will want to do is run either delaunay_3D or delaunay_2D on this file. Both output an unstructured network file (.NDnet) that can be used as input for other DisPerSE functions, most importantly mse, which is the main function DisPerSE is built around. The only difference between the 3D and 2D versions is whether your input file includes three position coordinates or two; DisPerSE should let you know if the input does not contain enough coordinates for the version you chose. To use either version, run "<path>/disperse/bin/delaunay_nD filename", where n is 2 or 3. After running for a while, delaunay_nD should generate a new file called "filename.NDnet".
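
If you would rather stay inside the notebook than switch to a terminal, the same command can be launched from Python. This is only a sketch: the helper names `delaunay_command` and `run_delaunay` and the example path "/path/to/disperse/bin" are assumptions for illustration, and the command passes nothing beyond the input file name, exactly as in the call described above.

```python
import os
import subprocess


def delaunay_command(bin_dir, input_file, ndim=3):
    """Build the argument list for delaunay_2D or delaunay_3D."""
    if ndim not in (2, 3):
        raise ValueError("ndim must be 2 or 3")
    return [os.path.join(bin_dir, f"delaunay_{ndim}D"), input_file]


def run_delaunay(bin_dir, input_file, ndim=3):
    """Run the Delaunay step and return the CompletedProcess object."""
    cmd = delaunay_command(bin_dir, input_file, ndim)
    # check=True raises CalledProcessError if the program exits with an error.
    return subprocess.run(cmd, check=True)


# Hypothetical install path; this only prints the command, it does not run it.
print(" ".join(delaunay_command("/path/to/disperse/bin", "vfs_ascii_new")))
```

Calling `run_delaunay(bin_dir, "vfs_ascii_new")` with your real bin directory would then carry out the step and, if all goes well, leave "vfs_ascii_new.NDnet" in the working directory.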