# SFH Group Separator
This notebook shows how to separate the spatially resolved star formation histories (SFHs) of a galaxy into groups. The distance between two SFHs is the earth mover's distance, computed with `wasserstein_distance` from `scipy.stats`.  
The same folder contains a Python script, "GroupSeparater.py", with the same code, which is more convenient to run from the command line.  

In [12]:
import os
import sys
import glob
import numpy as np
import pandas as pd
from astropy.wcs import WCS
from astropy.io import fits
from astropy import units as u
from astropy.coordinates import SkyCoord
from scipy.ndimage import gaussian_filter
from scipy.stats import wasserstein_distance
from astropy.wcs.utils import pixel_to_skycoord,skycoord_to_pixel
from astropy.visualization import wcsaxes,ZScaleInterval,ImageNormalize
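
As a quick illustration of the earth mover's distance used throughout this notebook, the sketch below compares two toy SFHs on a 50-bin age grid. The arrays `sfhA` and `sfhB` are made-up data, not part of the pipeline; the bin indices serve as positions and the SFH values as weights.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Toy age grid with 50 bins, indexed 0..49 like uvals/vvals in this notebook
ageBins = np.arange(50)

# Two toy SFHs, each a single burst of star formation (hypothetical data)
sfhA = np.zeros(50)
sfhA[10] = 1.0
sfhB = np.zeros(50)
sfhB[20] = 1.0

# Bin indices are the positions and the SFH values are the weights
emd = wasserstein_distance(ageBins, ageBins, sfhA, sfhB)
print(emd)
```

All the weight has to move by 10 bins, so the distance is 10 in units of age bins.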

## The Galaxy and the Number of Groups
Here we specify the galaxy of interest and the target number of groups.  
In this example, I choose galaxy number 122 in the list and split it into 4 groups.  
When using the script, the equivalent command is "python GroupSeparater.py 122 4".  
The galaxy list is "FinalSelMuseMask3.csv".  
The list contains some core-collapse SNe, whose SFHs are not calculated.  
Please make sure you specify a SN Ia with a solved SFH cube here.  

In [16]:
i=122
groupSize=4
uvals,vvals=np.arange(50),np.arange(50)
FinalSelMuse=pd.read_csv('FinalSelMuseMask3.csv',index_col=0)
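
The script's own argument handling is not shown in this notebook. A minimal sketch of how the two command-line values could be read is given below; `parseArgs` is a hypothetical helper, and the argument order (galaxy index, then group number) is taken from the example command.

```python
def parseArgs(argv):
    """Hypothetical helper: read the galaxy index and group number from argv."""
    if len(argv) != 3:
        raise SystemExit('usage: python GroupSeparater.py <galaxy index> <group number>')
    return int(argv[1]), int(argv[2])

# Example with an explicit argument list; a script would pass sys.argv instead
i, groupSize = parseArgs(['GroupSeparater.py', '122', '4'])
print(i, groupSize)
```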

In [4]:
def emdKnn(sfhCube,massMap,stepper=100,coreNum=10):
    # sfhCube: SFH cube with the age axis first; massMap: mass map with the same spatial shape
    # Start from a random assignment of every pixel to one of coreNum groups (labels 1..coreNum)
    groupMap=np.random.randint(coreNum,size=(sfhCube.shape[1],sfhCube.shape[2]))+1
    # Mask pixels with an invalid SFH, a total SFH weight below 0.8, or no mass estimate
    groupMap[np.isnan(sfhCube.sum(axis=0))]=0
    groupMap[sfhCube.sum(axis=0)<0.8]=0
    groupMap[np.isnan(massMap)]=0
    groupMap=groupMap*1.0
    groupMap[groupMap==0]=np.nan
    groupMapOld=groupMap.copy()
    for indexer in range(stepper):
        # Mass-weighted mean SFH and total mass of every group
        meanSFH=[]
        massList=[]
        for group in range(1,coreNum+1):
            sfhCubeSel=sfhCube[:,groupMap==group]
            massSel=massMap[groupMap==group]
            if sfhCubeSel.shape[1]<=1:continue  # drop groups with at most one pixel
            sfhCubeAvg=np.sum(sfhCubeSel*massSel,axis=1)/np.sum(massSel)
            meanSFH.append(sfhCubeAvg)
            massList.append(np.sum(massSel))
        coreNum=len(meanSFH)  # the number of groups shrinks if some were dropped
        # Reassign every valid pixel to the group whose mean SFH is closest in EMD
        for x in range(groupMap.shape[0]):
            for y in range(groupMap.shape[1]):
                if np.isnan(groupMap[x,y]):continue
                emdList=[wasserstein_distance(uvals,vvals,meanSFH[j],sfhCube[:,x,y]) for j in range(0,coreNum)]
                groupMap[x,y]=np.argmin(emdList)+1
        print(indexer)
        # Stop as soon as no pixel changes its group
        if np.average(groupMap[np.isnan(groupMap)==False]==groupMapOld[np.isnan(groupMapOld)==False])==1:break
        else:groupMapOld=groupMap.copy()
    return groupMap,meanSFH,massList
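
The pixel loop in `emdKnn` calls `wasserstein_distance` once per pixel and group. On a common, unit-spaced age grid, the 1D earth mover's distance equals the sum of the absolute differences between the two cumulative distributions, so the reassignment step can be vectorized. The sketch below is an alternative under that assumption, not the code used in this pipeline; `emdToCores` is a hypothetical helper and the toy arrays are made up.

```python
import numpy as np

def emdToCores(sfhCube, meanSFH):
    """Hypothetical helper: EMD between every pixel SFH and every group mean SFH.

    sfhCube: (nAge, nx, ny) array of non-negative SFH weights
    meanSFH: (nCore, nAge) array of mean SFHs
    Returns an (nCore, nx, ny) array. Assumes unit-spaced age bins and
    a positive total weight for every SFH.
    """
    # Normalize to unit total weight and build the cumulative distributions
    pixCdf = np.cumsum(sfhCube / sfhCube.sum(axis=0), axis=0)
    coreCdf = np.cumsum(meanSFH / meanSFH.sum(axis=1, keepdims=True), axis=1)
    # 1D EMD on a unit-spaced grid = sum of absolute CDF differences
    diff = np.abs(coreCdf[:, :, None, None] - pixCdf[None, :, :, :])
    return diff.sum(axis=1)

# Toy data: two cores peaked at bins 10 and 30, and a 1x2 map of pixels
meanSFH = np.zeros((2, 50))
meanSFH[0, 10] = 1.0
meanSFH[1, 30] = 1.0
sfhCube = np.zeros((50, 1, 2))
sfhCube[12, 0, 0] = 2.0
sfhCube[25, 0, 1] = 1.0

emd = emdToCores(sfhCube, meanSFH)
groups = np.argmin(emd, axis=0) + 1  # group labels start at 1, as in emdKnn
print(groups)
```

Masked pixels have a zero total weight and would produce NaN after the normalization, so a real cube needs them excluded first, as `emdKnn` does with its NaN mask.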


## Read the Data
The "ArcName" column in the list gives the file names of the host galaxy IFU data cubes, which can be used to search the ESO science archive.  
In my pipeline, I use the data cubes with $3\times 3$ spatial binning, and the binned file names are changed from "ADP---" to "CDP---".  
All the resampled spectra are stored in the "../MuseObject/MuseResample/" directory, and all the SFH data are stored in the "../MuseSFH/dataOut/" directory. Please check both directory names before running the code.  
The .fits file read into the variable "ppxfOut" has two HDUs.  
The first HDU stores the fitted spectra from ppxf.  
The second HDU stores the spatially resolved SFHs.  
Both HDUs share the spatial dimensions of the resampled IFU data cube.  
In addition, the SFH in each pixel has the shape (50,7), matching the 50 age grid points and 7 metallicity grid points of the stellar population templates used in ppxf.  

In [5]:
arcName=FinalSelMuse['ArcName'][i]
crcName=arcName.replace('ADP','CDP')
fitsFile=fits.open('../MuseObject/MuseResample/'+crcName+'.fits')
ppxfOut=fits.open('../MuseSFH/dataOut/'+str(i)+'_'+FinalSelMuse['SN Name'][i]+'.fits')
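
Since both directories depend on the local setup, it can help to build the two paths first and check that they exist before opening anything. The sketch below assumes the naming scheme used in the cell above; `buildPaths` is a hypothetical helper, and 'ADP.example' and 'SNexample' are placeholders rather than real archive or SN names.

```python
import os

def buildPaths(arcName, index, snName,
               cubeDir='../MuseObject/MuseResample/',
               sfhDir='../MuseSFH/dataOut/'):
    """Hypothetical helper returning the binned cube path and the SFH path."""
    crcName = arcName.replace('ADP', 'CDP')  # binned cubes use the CDP prefix
    cubePath = os.path.join(cubeDir, crcName + '.fits')
    sfhPath = os.path.join(sfhDir, str(index) + '_' + snName + '.fits')
    return cubePath, sfhPath

# 'ADP.example' and 'SNexample' are placeholders, not real archive or SN names
cubePath, sfhPath = buildPaths('ADP.example', 122, 'SNexample')
print(cubePath)
print(sfhPath)

# List the files that are not where the notebook expects them
missing = [p for p in (cubePath, sfhPath) if not os.path.exists(p)]
print(missing)
```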

## The Pixel of SN
Here, we identify the pixel closest to the SN coordinates.  
The width of the point spread function (PSF) is also extracted here and converted from arcsec to pixels.  
Note that in the resampled data, one pixel corresponds to 0.6 arcsec.  

In [6]:
wcsselect=WCS(fitsFile[1])
snCoord=SkyCoord(FinalSelMuse['SN RA'][i]*u.deg,FinalSelMuse['SN DEC'][i]*u.deg)
minpos=skycoord_to_pixel(snCoord,wcsselect)
minpos=(int(np.round(minpos[0])),int(np.round(minpos[1])))  # round to the nearest pixel; int() alone would truncate
psfSig=fitsFile[0].header['SKY_RES']/0.6/2.355  # 0.6 is the pixel size in arcsec of the resampled MUSE cube; 2.355 converts FWHM to Gaussian sigma
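
The two conversions in this step can be written out as small functions. This is a sketch with hypothetical helper names; it assumes that the header value is a FWHM in arcsec and that pixel centers sit at integer coordinates (the 0-based convention), so the nearest pixel is found by rounding.

```python
import math

def fwhmToSigmaPix(fwhmArcsec, pixScale=0.6):
    """Hypothetical helper: Gaussian sigma in pixels from a FWHM in arcsec."""
    fwhmPix = fwhmArcsec / pixScale
    # FWHM = 2*sqrt(2*ln 2)*sigma, which is about 2.355*sigma
    return fwhmPix / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def nearestPixel(xFloat, yFloat):
    """Hypothetical helper: nearest integer pixel for pixel centers at integers."""
    return int(round(xFloat)), int(round(yFloat))

sigmaPix = fwhmToSigmaPix(1.2)   # a 1.2 arcsec FWHM spans 2 pixels of 0.6 arcsec
pixel = nearestPixel(10.7, 3.2)
print(sigmaPix, pixel)
```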

## The Map of Mass
In ppxf, the fitted spectra are normalized to a total initial stellar mass of 1 solar mass.  
Here, we divide the median flux of the observed spectrum by the median flux of the fitted spectrum in each pixel, and take this ratio as the stellar mass in the pixel.  
I am not sure whether the MUSE flux levels are properly calibrated, but such a mass map at least traces the relative mass distribution within the galaxy.  

In [7]:
fitFlux=ppxfOut[0].data
massMap=np.nanmedian(fitsFile[1].data,axis=0)/np.nanmedian(fitFlux,axis=0)
massMap[massMap<0]=np.nan
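
The same ratio can be checked on toy data where the answer is known. In the sketch below all arrays are made up: the observed cube is the fitted cube scaled by a known mass per pixel, and one pixel is left empty. `np.nanmedian` emits a RuntimeWarning for pixels whose spectrum is entirely NaN, which is silenced here.

```python
import warnings
import numpy as np

rng = np.random.default_rng(0)
nWave, nx, ny = 200, 3, 4

# Toy fitted spectra, normalized to 1 solar mass per pixel (hypothetical data)
fitFluxToy = rng.uniform(0.5, 1.5, size=(nWave, nx, ny))
# Toy observed spectra: the fit scaled by a known mass in each pixel
trueMass = rng.uniform(1e6, 1e8, size=(nx, ny))
obsFluxToy = fitFluxToy * trueMass
obsFluxToy[:, 0, 0] = np.nan  # a pixel without data

# Ratio of the median fluxes along the wavelength axis
with warnings.catch_warnings():
    warnings.simplefilter('ignore', category=RuntimeWarning)  # all-NaN pixels warn
    massMapToy = np.nanmedian(obsFluxToy, axis=0) / np.nanmedian(fitFluxToy, axis=0)
    massMapToy[massMapToy < 0] = np.nan  # negative masses are unphysical

print(np.isnan(massMapToy[0, 0]))
```

The empty pixel stays NaN, and every other pixel recovers the mass that was put in.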

## Refine the SFH data cube
Here, we ignore the metallicity by summing over the metallicity axis, which collapses the SFH in each pixel from shape (50,7) to 50 age bins.  
We also convolve the SFH cube along the spatial dimensions with the observed PSF, which makes the whole pipeline more stable.  
Without this convolution, the SFH map looks quite noisy and the PSF shape cannot be identified in it; we expect the convolution to regularize the SFH results.  

In [8]:
sfhCube=ppxfOut[1].data
sfhCube=sfhCube.sum(axis=1)
sfhCube[sfhCube<0]=0
sfhCube[np.isnan(sfhCube)]=0
sfhCube[:,np.isnan(massMap)]=0

In [9]:
for indexer in range(50):sfhCube[indexer]=gaussian_filter(sfhCube[indexer],sigma=psfSig,truncate=3)
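
The two refinement steps can be seen on a toy cube with a single non-empty pixel. The axis order (age, metallicity, then two spatial axes) is inferred from the `sum(axis=1)` call above, and the value of `sigmaPix` is an assumed example, not a measured PSF.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Toy SFH cube with 50 ages, 7 metallicities and a 21x21 map (hypothetical data)
sfh4d = np.zeros((50, 7, 21, 21))
sfh4d[5, 2, 10, 10] = 0.6   # one pixel with a single-age population ...
sfh4d[5, 4, 10, 10] = 0.4   # ... split over two metallicities

# Sum over the metallicity axis: (50, 7, nx, ny) -> (50, nx, ny)
sfhToy = sfh4d.sum(axis=1)

# Smooth every age slice with a Gaussian of sigma = 1.5 pixels (assumed value)
sigmaPix = 1.5
smoothed = np.stack([gaussian_filter(plane, sigma=sigmaPix, truncate=3)
                     for plane in sfhToy])

print(smoothed.shape, smoothed.sum())
```

The smoothing spreads the weight of the central pixel over its neighbours while keeping the total weight of the age slice unchanged.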

## The Group Map Calculator
The function is defined several cells above.  
The maximum number of iterations is 100, which is adequate for most cases.  
Finally, all the results are stored in a FITS file.  

In [10]:
groupMap,meanSFH,massList=emdKnn(sfhCube,massMap,coreNum=groupSize)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14


In [17]:
meanSFH=np.array(meanSFH)
massList=np.array(massList)

mapHdu=fits.PrimaryHDU(data=groupMap)
mapHdu.header['Type']='Group Map'
sfhHdu=fits.ImageHDU(data=meanSFH)
sfhHdu.header['Type']='Mean SFH'
masHdu=fits.ImageHDU(data=massList)
masHdu.header['Type']='Mass List'
mapFits=fits.HDUList([mapHdu,sfhHdu,masHdu])
mapFits.writeto('GroupMap/'+str(i)+'_'+FinalSelMuse['SN Name'][i]+'_'+str(groupSize)+'.fits',overwrite=True)
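
The output file has three HDUs: the group map, the mean SFH of each group, and the total mass of each group. The sketch below writes toy arrays with the same layout to an in-memory buffer and reads them back, as a way to check the structure without touching the "GroupMap/" directory. All three arrays are made-up stand-ins for `groupMap`, `meanSFH` and `massList`.

```python
import io
import numpy as np
from astropy.io import fits

# Toy products standing in for groupMap, meanSFH and massList (hypothetical data)
groupMapToy = np.array([[1.0, 2.0], [np.nan, 1.0]])
meanSFHToy = np.full((2, 50), 1.0 / 50)
massListToy = np.array([3.0e9, 1.0e9])

mapHdu = fits.PrimaryHDU(data=groupMapToy)
mapHdu.header['Type'] = 'Group Map'
sfhHdu = fits.ImageHDU(data=meanSFHToy)
sfhHdu.header['Type'] = 'Mean SFH'
masHdu = fits.ImageHDU(data=massListToy)
masHdu.header['Type'] = 'Mass List'

# Write to an in-memory buffer here; the notebook writes to GroupMap/ instead
buffer = io.BytesIO()
fits.HDUList([mapHdu, sfhHdu, masHdu]).writeto(buffer)
buffer.seek(0)

# Read the three HDUs back in the order they were written
with fits.open(buffer) as hdul:
    groupMapBack = hdul[0].data.copy()
    meanSFHBack = hdul[1].data.copy()
    massListBack = hdul[2].data.copy()
    hduTypes = [hdu.header['Type'] for hdu in hdul]

print(hduTypes)
```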