# Access to notebooks to study the SEDs in the context of FORS2 and StarLight


- author Sylvie Dagoret-Campagne
- creation date : 2023/01/11
- last update : 2023/01/21


- **purpose of this notebook: point to the notebooks in the sub-directories of this repository**


- github repository : https://github.com/JospehCeh/PhotoZ_PhD/tree/u/dagoret (branch u/dagoret)

- **StudyFors2SED** : view the SEDs to understand and check them
- **PCA** : perform PCA decomposition on full spectra
- **PCA_Fors2** : compute PCA coefficients of Fors2 spectra on the StarLight PCA eigenvector spectra
- **Clustering** : perform clustering to find similar spectra and reduce the number of SL spectra
- **SOM** : try to visualise the clustering with Self-Organising Maps
- **tSNE** : a kind of non-linear clustering
- **LLE** : try several non-linear clustering methods based on different manifold-learning techniques



All notebooks should be executed from their own directory so that they can access their data.
The appropriate directory change can be performed from this README.ipynb notebook.
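
As an alternative to repeated `os.chdir` calls, the directory change can be wrapped in a context manager that restores the previous directory on exit. This is only a minimal sketch: the name `working_directory` is hypothetical and is not part of the repository.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the working directory, then restore the previous one."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        # Yield the absolute path of the directory we are now in.
        yield os.getcwd()
    finally:
        # Always go back, even if the body raised an exception.
        os.chdir(previous)


# Example use (assumes a sub-directory such as ./PCA exists):
# with working_directory("./PCA") as cwd:
#     print(f"New directory {cwd}")
```

With this pattern the notebook always returns to its starting directory, so no check against `main_dir` is needed.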

In [1]:
import sys
import os

In [2]:
sys.path.append("./StudyFors2SED") 
sys.path.append("./PCA") 
sys.path.append("./PCA_Fors2")
sys.path.append("./Clustering")
sys.path.append("./SOM")

In [3]:
cwd=os.path.abspath("")
main_dir = cwd
print(f"notebook current executing path : {cwd}")

notebook current executing path : /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD


## 1) Exploration of the different types of SED including StarLight and Fors2 

- to run the notebooks in their working directory, change the execution path

In [4]:
os.chdir(main_dir)
cwd = os.path.abspath("")
if cwd == main_dir:
    os.chdir('./StudyFors2SED') 
cwd = os.path.abspath("")
print(f" New Directory {cwd}")

 New Directory /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD/StudyFors2SED


In [5]:
! ls -l *.ipynb

-rw-r--r--  1 dagoret  staff     62785 Jan 11 14:15 ExploreFors2.ipynb
-rw-r--r--  1 dagoret  staff     11451 Jan  4 20:27 ExploreFors2_comparespectra.ipynb
-rw-r--r--  1 dagoret  staff      9733 Jan 11 14:17 ExploreFors2_short.ipynb
-rw-r--r--  1 dagoret  staff     15632 Jan 20 22:21 ExploreFors2_viewspectra1by1.ipynb
-rw-r--r--  1 dagoret  staff     19416 Jan 20 22:22 ExploreFors2_viewspectra1by1_CompareSL.ipynb
-rw-r--r--  1 dagoret  staff   1297464 Dec 26 15:38 ExploreFors2_viewspectra1by1_v0.ipynb
-rw-r--r--  1 dagoret  staff    118845 Jan  4 20:27 ExploreFors2inRestFrame.ipynb
-rw-r--r--  1 dagoret  staff  47716465 Jan 12 20:48 ExploreSL_comparespectra.ipynb
-rw-r--r--  1 dagoret  staff   7374934 Jan 13 17:41 ViewStandardSED.ipynb


### a) Rubin sims and PySynphot SEDS

View all the SEDs available in rubin-sim and pysynphot. The goal is to compare them with Fors2/SL. Both Python packages must be installed.

- [the rubin-sim package](https://github.com/lsst/rubin_sim)

- [the pysynphot package](https://pysynphot.readthedocs.io/en/latest/)

-  the notebook : [StudyFors2SED/ViewStandardSED.ipynb](StudyFors2SED/ViewStandardSED.ipynb)

### b) Explore Fors2 spectra

#### Initial notebook

- deprecated: contains all of Eric's and Johan's code

-  [StudyFors2SED/ExploreFors2.ipynb](StudyFors2SED/ExploreFors2.ipynb)

#### Notebook with code separated from plots

- [StudyFors2SED/ExploreFors2_short.ipynb](StudyFors2SED/ExploreFors2_short.ipynb)

#### Check Fors2 spectra with respect to emission lines

- Tables of galaxy emission lines are overlaid on the spectra plots as a check

-  [StudyFors2SED/ExploreFors2_viewspectra1by1.ipynb](StudyFors2SED/ExploreFors2_viewspectra1by1.ipynb)

#### Check Fors2 spectra with respect to emission lines and StarLight

- [StudyFors2SED/ExploreFors2_viewspectra1by1_CompareSL.ipynb](StudyFors2SED/ExploreFors2_viewspectra1by1_CompareSL.ipynb)

## 2) PCA Decomposition

In [6]:
os.chdir(main_dir)
cwd = os.path.abspath("")
if cwd == main_dir:
    os.chdir('./PCA') 
cwd = os.path.abspath("")
print(f" New Directory {cwd}")

 New Directory /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD/PCA


In [7]:
!ls -l *.ipynb

-rw-r--r--  1 dagoret  staff   1348706 Jan 20 18:23 ComparePCACoefficientsandEigenVectors_SL_Brown_BruzualCharlot.ipynb
-rw-r--r--  1 dagoret  staff   1156657 Jan 20 17:33 ComparePCA_SL_Brown_BruzualCharlot.ipynb
-rw-r--r--  1 dagoret  staff    906036 Jan 20 17:33 ComputePCA_Brown_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff    931426 Jan 20 17:33 ComputePCA_BruzualCharlot_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff   1077876 Jan 16 14:44 ComputePCA_SL.ipynb
-rw-r--r--  1 dagoret  staff    918843 Jan 16 14:50 ComputePCA_SL_Eigenvalues.ipynb
-rw-r--r--  1 dagoret  staff   1226724 Jan 20 17:33 ComputePCA_SL_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff    809582 Jan 13 22:07 ComputePCA_SL_PCAmethod12D.ipynb
-rw-r--r--  1 dagoret  staff    883723 Dec 28 18:28 ComputePCA_SL_PCAmethod2.ipynb
-rw-r--r--  1 dagoret  staff   1032713 Dec 28 18:29 ComputePCA_SL_PCAmethod2_v2.ipynb
-rw-r--r--  1 dagoret  staff    631831 Dec 28 18:28 ComputePCA_SL_PCAmethod3.ipynb
-rw-r--r--  1 dagoret  staff  

### a) PCA decomposition on StarLight (+ Brown + Bruzual-Charlot)

#### SL SED

#### Save SL in fits file and normalise SL spectrum in $0<\lambda< 10000$ angstrom
- [PCA/prepareSL_toPCAana.ipynb](PCA/prepareSL_toPCAana.ipynb)
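
The normalisation step can be sketched as follows. How the notebook actually normalises the spectra is not stated here, so this example assumes a division by the mean flux inside the wavelength window; the function name `normalise_in_window` is hypothetical.

```python
import numpy as np


def normalise_in_window(wavelength, flux, wl_min=0.0, wl_max=10000.0):
    """Divide a spectrum by its mean flux in wl_min < lambda < wl_max (angstrom).

    Assumption: normalisation by the mean flux in the window; the notebook
    may use another convention (for example the integrated flux).
    """
    mask = (wavelength > wl_min) & (wavelength < wl_max)
    if not mask.any():
        raise ValueError("no wavelength sample falls inside the window")
    return flux / flux[mask].mean()


# Toy spectrum on a regular wavelength grid.
wavelength = np.linspace(500.0, 15000.0, 100)
flux = 3.0 + np.sin(wavelength / 2000.0)
normalised = normalise_in_window(wavelength, flux)
```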

#### Compare the eigenvectors of the decomposition obtained with the PCA / ICA / NMF methods
- [PCA/ComputePCA_SL.ipynb](PCA/ComputePCA_SL.ipynb)
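
A minimal sketch of such a comparison with scikit-learn is shown below. It runs on synthetic, non-negative toy spectra (an assumption, since NMF requires non-negative input) and does not reproduce the notebook's data or settings.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA, NMF

rng = np.random.default_rng(0)
wl = np.linspace(1000.0, 10000.0, 400)

# Toy spectra: non-negative mixtures of three smooth bumps plus small positive noise.
shapes = np.vstack(
    [np.exp(-0.5 * ((wl - c) / 1200.0) ** 2) for c in (3000.0, 5500.0, 8000.0)]
)
spectra = rng.uniform(0.1, 1.0, size=(150, 3)) @ shapes
spectra += 0.01 * rng.uniform(size=spectra.shape)

n_components = 3
models = {
    "PCA": PCA(n_components=n_components),
    "ICA": FastICA(n_components=n_components, random_state=0, max_iter=1000),
    "NMF": NMF(n_components=n_components, init="nndsvda", random_state=0, max_iter=1000),
}

# Each fitted model exposes its basis spectra in components_,
# with shape (n_components, n_wavelengths).
components = {}
for name, model in models.items():
    model.fit(spectra)
    components[name] = model.components_
```

The three sets of basis spectra can then be plotted against `wl` side by side.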

#### PCA decomposition of one SL spectrum into its coefficients

- study convergence of PCA coefficients

- [PCA/ComputePCA_SL_Eigenvalues.ipynb](PCA/ComputePCA_SL_Eigenvalues.ipynb)
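
The convergence study can be illustrated by reconstructing one spectrum from an increasing number of PCA coefficients and following the residual. The sketch below uses toy spectra and a plain SVD; it is an illustration under these assumptions, not the notebook's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spectra, n_wl = 200, 500
wl = np.linspace(1000.0, 10000.0, n_wl)

# Toy spectra: random mixtures of three smooth shapes plus small noise.
shapes = np.vstack(
    [np.exp(-0.5 * ((wl - c) / 1500.0) ** 2) for c in (3000.0, 5500.0, 8000.0)]
)
spectra = rng.uniform(0.2, 1.0, size=(n_spectra, 3)) @ shapes
spectra += 0.01 * rng.normal(size=spectra.shape)

mean_spectrum = spectra.mean(axis=0)
# Rows of vt are the eigenvectors, sorted by decreasing variance.
_, _, vt = np.linalg.svd(spectra - mean_spectrum, full_matrices=False)

target = spectra[0]
coeffs = (target - mean_spectrum) @ vt.T  # PCA coefficients of this spectrum

# RMS residual of the reconstruction using the first k coefficients.
rms_residuals = []
for k in range(1, 11):
    reconstruction = mean_spectrum + coeffs[:k] @ vt[:k]
    rms_residuals.append(float(np.sqrt(np.mean((target - reconstruction) ** 2))))
```

Because the eigenvectors are orthonormal, the residual cannot increase when a coefficient is added.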

#### PCA decomposition of all SL spectra into their coefficients

- 2D plot of coeff[0] versus coeff[1] for all spectra
- save a fits file with all coefficients for later clustering

- [PCA/ComputePCA_SL_PCAmethod1.ipynb](PCA/ComputePCA_SL_PCAmethod1.ipynb)

#### PCA decomposition of all SL spectra into their coefficients

- 2D plot of coeff[0] versus coeff[1] for all spectra
- save a fits file with all coefficients for later clustering

- [PCA/ComputePCA_SL_PCAmethod12D.ipynb](PCA/ComputePCA_SL_PCAmethod12D.ipynb)

#### Brown SED


- [PCA/ComputePCA_Brown_PCAmethod1.ipynb](PCA/ComputePCA_Brown_PCAmethod1.ipynb)

#### Bruzual Charlot SED

- [PCA/ComputePCA_BruzualCharlot_PCAmethod1.ipynb](PCA/ComputePCA_BruzualCharlot_PCAmethod1.ipynb)

#### Comparison of the PCA eigenvectors for SL/Brown/Bruzual-Charlot

- [PCA/ComparePCA_SL_Brown_BruzualCharlot.ipynb](PCA/ComparePCA_SL_Brown_BruzualCharlot.ipynb)

### b) PCA decomposition of Fors2

In [8]:
os.chdir(main_dir)
cwd = os.path.abspath("")
if cwd == main_dir:
    os.chdir('./PCA_Fors2') 
cwd = os.path.abspath("")
print(f" New Directory {cwd}")

 New Directory /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD/PCA_Fors2


In [9]:
! ls -l *.ipynb

-rw-r--r--  1 dagoret  staff   1024454 Jan 13 21:35 ComputePCA_Fors2.ipynb
-rw-r--r--  1 dagoret  staff  65175663 Jan 20 20:19 ComputePCA_Fors2_all.ipynb
-rw-r--r--  1 dagoret  staff  40414358 Jan 13 21:33 prepare_Fors2.ipynb


#### Compare FORS2 and average SL and create fits file for FORS2 data for later PCA decomposition

The normalisation of FORS2 is done with respect to the average.
-  [PCA_Fors2/prepare_Fors2.ipynb](PCA_Fors2/prepare_Fors2.ipynb)

#### Compute PCA decomposition for a few FORS2 spectra

-  [PCA_Fors2/ComputePCA_Fors2.ipynb](PCA_Fors2/ComputePCA_Fors2.ipynb)

#### Compute PCA decomposition for all FORS2 spectra

-  [PCA_Fors2/ComputePCA_Fors2_all.ipynb](PCA_Fors2/ComputePCA_Fors2_all.ipynb)
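
Computing the coefficients of FORS2 spectra on the StarLight eigenvectors amounts to a projection. The sketch below uses random stand-in arrays and assumes both sets of spectra share the same wavelength grid and normalisation; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_wl = 300
sl_spectra = rng.normal(size=(100, n_wl))   # stand-in for the StarLight spectra
fors2_spectra = rng.normal(size=(5, n_wl))  # stand-in for FORS2 spectra, same grid

# PCA basis computed from the StarLight spectra only.
sl_mean = sl_spectra.mean(axis=0)
_, _, vt = np.linalg.svd(sl_spectra - sl_mean, full_matrices=False)
eigenvectors = vt[:10]  # keep the 10 leading eigenvectors

# Project each FORS2 spectrum onto the StarLight eigenvectors.
fors2_coeffs = (fors2_spectra - sl_mean) @ eigenvectors.T

# Approximate reconstruction of the FORS2 spectra from those coefficients.
fors2_reconstructed = sl_mean + fors2_coeffs @ eigenvectors
```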

## 3)  Perform clustering

In [10]:
os.chdir(main_dir)
cwd = os.path.abspath("")
if cwd == main_dir:
    os.chdir('./Clustering') 
cwd = os.path.abspath("")
print(f" New Directory {cwd}")

 New Directory /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD/Clustering


In [11]:
! ls -l *.ipynb

-rw-r--r--  1 dagoret  staff  20176024 Jan 21 16:37 ClusteringAffinityPropagation_fromPCA_SL_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff    490746 Jan 13 21:29 ClusteringAggrDendrogram_fromPCA_SL_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff  13118693 Jan 21 16:04 ClusteringBisectingKMeans_fromPCA_SL_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff  25840365 Jan 21 16:19 ClusteringKmean_fromPCA_SL_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff  13524954 Jan 21 16:06 ClusteringKmean_fromPCA_SL_PCAmethod1_andPlot.ipynb
-rw-r--r--  1 dagoret  staff   8018876 Jan 21 16:21 ClusteringMeanShift_fromPCA_SL_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff    241614 Dec 28 18:09 ClusteringPCA_SL_PCAmethod1.ipynb
-rw-r--r--  1 dagoret  staff  10374607 Jan 21 16:51 ClusteringSpectral_fromPCA_SL_PCAmethod1.ipynb


#### Clustering with K-means

Clustering is run on 10 PCA coefficients and also on the SL spectra

-  [Clustering/ClusteringKmean_fromPCA_SL_PCAmethod1.ipynb](Clustering/ClusteringKmean_fromPCA_SL_PCAmethod1.ipynb)
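
A minimal K-means sketch on 10 coefficients per spectrum, using scikit-learn. The coefficients are synthetic and the number of clusters (4) is an arbitrary choice for the illustration, not the value used in the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for PCA coefficients: 4 groups of 50 spectra, 10 coefficients each.
centres = rng.normal(scale=5.0, size=(4, 10))
coeffs = np.vstack([c + rng.normal(size=(50, 10)) for c in centres])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(coeffs)

# Number of spectra assigned to each cluster.
counts = np.bincount(labels, minlength=4)
```

The same call applies to full spectra by replacing `coeffs` with an array of shape (n_spectra, n_wavelengths).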

#### Clustering with K-means and plot

Use only 2 PCA coefficients and show the clustering in the 2D plane of the two leading PCA coefficients
-  [Clustering/ClusteringKmean_fromPCA_SL_PCAmethod1_andPlot.ipynb](Clustering/ClusteringKmean_fromPCA_SL_PCAmethod1_andPlot.ipynb)

#### Clustering with BisectingKMeans


-  [Clustering/ClusteringBisectingKMeans_fromPCA_SL_PCAmethod1.ipynb](Clustering/ClusteringBisectingKMeans_fromPCA_SL_PCAmethod1.ipynb)

#### Clustering with Mean Shift 

Finds the number of clusters by itself, once a quantile is specified and a bandwidth is computed from it. The resulting clusters are unbalanced.

-  [Clustering/ClusteringMeanShift_fromPCA_SL_PCAmethod1.ipynb](Clustering/ClusteringMeanShift_fromPCA_SL_PCAmethod1.ipynb)
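
The quantile and bandwidth steps can be sketched with scikit-learn as follows. The data are synthetic 2D points and the quantile value 0.2 is an assumption made for the example.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

rng = np.random.default_rng(0)

# Synthetic 2D coefficients: three blobs of 60 points each.
centres = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
coeffs = np.vstack([c + rng.normal(scale=0.7, size=(60, 2)) for c in centres])

# The bandwidth is estimated from the data for the chosen quantile.
bandwidth = estimate_bandwidth(coeffs, quantile=0.2, random_state=0)

# Mean shift then decides the number of clusters by itself.
mean_shift = MeanShift(bandwidth=bandwidth)
labels = mean_shift.fit_predict(coeffs)
n_clusters = len(np.unique(labels))
counts = np.bincount(labels)  # cluster sizes, useful to check the balance
```

A smaller quantile gives a smaller bandwidth and usually more clusters.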

#### Clustering with AffinityPropagation

It finds the number of clusters by itself, either from the PCA coefficients (20 clusters found) or from the spectra (30 clusters found)

-  [Clustering/ClusteringAffinityPropagation_fromPCA_SL_PCAmethod1.ipynb](Clustering/ClusteringAffinityPropagation_fromPCA_SL_PCAmethod1.ipynb)


#### Clustering with Spectral Clustering

The clusters are poorly balanced in number of elements

-  [Clustering/ClusteringSpectral_fromPCA_SL_PCAmethod1.ipynb](Clustering/ClusteringSpectral_fromPCA_SL_PCAmethod1.ipynb)

#### Clustering with Dendrogram

Spectra not shown

-  [Clustering/ClusteringAggrDendrogram_fromPCA_SL_PCAmethod1.ipynb](Clustering/ClusteringAggrDendrogram_fromPCA_SL_PCAmethod1.ipynb)
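
A dendrogram comes from agglomerative (hierarchical) clustering. The sketch below uses SciPy on synthetic coefficients; the Ward linkage and the cut into 3 clusters are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic coefficients: three groups of 20 spectra, 10 coefficients each.
centres = rng.normal(scale=5.0, size=(3, 10))
coeffs = np.vstack([c + rng.normal(size=(20, 10)) for c in centres])

# Build the merge tree; each row of Z describes one merge.
Z = linkage(coeffs, method="ward")

# Cut the tree so that at most 3 clusters remain (labels start at 1).
labels = fcluster(Z, t=3, criterion="maxclust")

# Compute the dendrogram layout without drawing it.
tree = dendrogram(Z, no_plot=True)
```

Dropping `no_plot=True` draws the tree with matplotlib.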

## 4) Compute Self Organizing Maps

In [12]:
os.chdir(main_dir)
cwd = os.path.abspath("")
if cwd == main_dir:
    os.chdir('./SOM') 
cwd = os.path.abspath("")
print(f" New Directory {cwd}")

 New Directory /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD/SOM


In [13]:
! ls -l *.ipynb

-rw-r--r--  1 dagoret  staff  8563213 Jan 15 16:19 SOM_SL.ipynb
-rw-r--r--  1 dagoret  staff  8441739 Jan 16 14:33 SOM_SL_eigenvectorandcoeff.ipynb
-rw-r--r--  1 dagoret  staff  3580691 Jan 18 11:00 SOM_all.ipynb
-rw-r--r--  1 dagoret  staff  3911365 Jan 18 11:00 SOM_all_eigenvectorandcoeff.ipynb


#### Fast SOM on PCA coefficients
-  [SOM/SOM_SL.ipynb](SOM/SOM_SL.ipynb) (deprecated)


#### Fast SOM on PCA coefficients
-  [SOM/SOM_SL_eigenvectorandcoeff.ipynb](SOM/SOM_SL_eigenvectorandcoeff.ipynb)   (deprecated)

#### Fast SOM on PCA coefficients for SL and Brown and Bruzual-Charlot


- replaces the above notebook **SOM_SL_eigenvectorandcoeff.ipynb** and applies to any SED type

- [SOM/SOM_all_eigenvectorandcoeff.ipynb](SOM/SOM_all_eigenvectorandcoeff.ipynb)
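
For orientation, the principle of a Self-Organising Map can be sketched in plain NumPy. This is not the SOM implementation used in the notebooks; the function names, grid size and learning schedule are all assumptions chosen for the illustration.

```python
import numpy as np


def train_som(data, grid=(6, 6), n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (ny, nx, n_features)."""
    rng = np.random.default_rng(seed)
    ny, nx = grid
    weights = rng.normal(size=(ny, nx, data.shape[1]))
    yy, xx = np.mgrid[0:ny, 0:nx]  # grid coordinates of every node
    for it in range(n_iter):
        frac = it / n_iter
        lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5     # neighbourhood radius shrinks
        x = data[rng.integers(len(data))]       # pick one random sample
        # Best matching unit: node whose weight vector is closest to the sample.
        dist = np.linalg.norm(weights - x, axis=2)
        by, bx = np.unravel_index(np.argmin(dist), dist.shape)
        # Gaussian neighbourhood around the best matching unit.
        h = np.exp(-((yy - by) ** 2 + (xx - bx) ** 2) / (2.0 * sigma**2))
        weights += lr * h[:, :, None] * (x - weights)
    return weights


def best_matching_unit(weights, x):
    """Return the (row, column) of the node closest to sample x."""
    dist = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(dist), dist.shape)


# Synthetic stand-in for PCA coefficients: 300 spectra, 10 coefficients each.
rng = np.random.default_rng(1)
coeffs = rng.normal(size=(300, 10))
som = train_som(coeffs)
cell = best_matching_unit(som, coeffs[0])
```

Mapping every spectrum to its best matching unit gives the 2D occupancy map that is usually displayed.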

## 5) Non-linear clustering

### tSNE

In [18]:
os.chdir(main_dir)
cwd = os.path.abspath("")
if cwd == main_dir:
    os.chdir('./tSNE') 
cwd = os.path.abspath("")
print(f" New Directory {cwd}")

 New Directory /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD/tSNE


In [15]:
! ls

SL_tSNE_onSpectr_clusters_30.pickle datatools
SL_tSNE_onSpectr_clusters_50.pickle tSNE_allSED.ipynb


- [tSNE/tSNE_allSED.ipynb](tSNE/tSNE_allSED.ipynb)

### Locally Linear Embeddings (LLE)

In [16]:
os.chdir(main_dir)
cwd = os.path.abspath("")
if cwd == main_dir:
    os.chdir('./LLE') 
cwd = os.path.abspath("")
print(f" New Directory {cwd}")

 New Directory /Users/dagoret/MacOSX/GitHub/LSST/PhotoZ_PhD/LLE


In [17]:
!ls

LLEclustering_allSEDtypes.ipynb    datatools
ManyfoldLearning_allSEDtypes.ipynb example


#### Methods for Locally Linear Embeddings (LLE)

4 methods to interpolate locally from neighbours

- Standard locally linear embedding
- Local tangent space alignment
- Hessian eigenmap
- Modified locally linear embedding

- [LLE/LLEclustering_allSEDtypes.ipynb](LLE/LLEclustering_allSEDtypes.ipynb)
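
The four variants listed above correspond to the `method` argument of scikit-learn's `LocallyLinearEmbedding`. The sketch below applies them to a synthetic swiss-roll dataset, not to SEDs; the number of neighbours (12) and the dense eigen-solver are choices made for this small example.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# Synthetic 3D dataset lying on a 2D manifold.
X, _ = make_swiss_roll(n_samples=300, random_state=0)

methods = {
    "Standard LLE": "standard",
    "Local tangent space alignment": "ltsa",
    "Hessian eigenmap": "hessian",
    "Modified LLE": "modified",
}

# Embed the data in 2D with each variant.
embeddings = {}
for label, method in methods.items():
    lle = LocallyLinearEmbedding(
        n_neighbors=12,
        n_components=2,
        method=method,
        eigen_solver="dense",  # robust choice for a small sample
        random_state=0,
    )
    embeddings[label] = lle.fit_transform(X)
```

The Hessian variant needs more neighbours than `n_components * (n_components + 3) / 2`, which is why `n_neighbors` is not set lower.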

#### Methods for manifold learning

4 methods for manifold learning

- Isomap Embedding
- Multidimensional scaling
- Spectral Embedding
- T-distributed Stochastic Neighbor Embedding

- [LLE/ManyfoldLearning_allSEDtypes.ipynb](LLE/ManyfoldLearning_allSEDtypes.ipynb)
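
The four methods listed above are available in `sklearn.manifold`. The sketch below runs them on a synthetic swiss-roll dataset with illustrative parameters; it is an assumption-based example and not the notebook's configuration.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import MDS, TSNE, Isomap, SpectralEmbedding

# Synthetic 3D dataset lying on a 2D manifold.
X, _ = make_swiss_roll(n_samples=200, random_state=0)

models = {
    "Isomap": Isomap(n_neighbors=12, n_components=2),
    "Multidimensional scaling": MDS(n_components=2, random_state=0),
    "Spectral embedding": SpectralEmbedding(n_components=2, n_neighbors=12, random_state=0),
    "t-SNE": TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0),
}

# Each model maps the samples to 2D coordinates.
embeddings = {name: model.fit_transform(X) for name, model in models.items()}
```

The t-SNE perplexity must stay below the number of samples.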

#### Examples from scikit-learn or astroml

- [LLE/example/plot_compare_ManifoldLearning_methods.ipynb](LLE/example/plot_compare_ManifoldLearning_methods.ipynb)