## Latent distances comparison 

This notebook walks through how to compare the distances between stellar siblings for each of the methods considered (PolyDis, FaderDis, FactorDis).

## Measuring the distances

The first ingredient in creating these visualizations is the distance between every pair of stars in the dataset. We have written scripts for doing just that. Run the script ```/tagging/scripts/calculate_neural_latent_distance.py``` to get the distances for an existing neural network model, or run the script ```/tagging/scripts/calculate_polynomial_latent_distance.py``` to calculate the distances for a polynomial fit model.
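For orientation, the kind of quantity these scripts produce can be sketched in a few lines of NumPy. This is an illustration only, not the scripts' implementation: it assumes each star is represented by a latent vector, that distances are Euclidean, and that sibling pairs are known in advance as index pairs. The function `pair_distances`, the arrays `latents`, `sibling_pairs` and `random_pairs`, and the toy values are all hypothetical.

```python
import numpy as np


def pair_distances(latents, pairs):
    """Euclidean distance between the latent vectors of each (i, j) pair."""
    pairs = np.asarray(pairs)
    diff = latents[pairs[:, 0]] - latents[pairs[:, 1]]
    return np.linalg.norm(diff, axis=1)


# Toy data (hypothetical): 6 stars with 2-dimensional latents.
rng = np.random.default_rng(0)
latents = rng.normal(size=(6, 2))
latents[1] = latents[0]  # make stars 0 and 1 identical in latent space

sibling_pairs = [(0, 1), (2, 3)]  # assumed known sibling pairs
random_pairs = [(0, 4), (1, 5)]   # assumed randomly drawn pairs

# Same two keys that the plotting code later in this notebook reads.
example_distances = {
    "siblings": pair_distances(latents, sibling_pairs),
    "randoms": pair_distances(latents, random_pairs),
}
```

A model that separates chemically identical stars well would give sibling distances that are small compared with the random-pair distances.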

In [None]:
! python ../scripts/calculate_neural_latent_distances.py --help

In [None]:
! python ../scripts/calculate_neural_latent_distances.py --data_file /share/splinter/ddm/taggingProject/taggingClean/data/final/train/spectra_noiseless.pd --model_file /share/splinter/ddm/taggingProject/taggingRepo/outputs/results_fader/runWithFe0.00001/adN7214I3000 --savepath "/share/splinter/ddm/taggingProject/taggingRepo/outputs/intermediate/distances/distances_fader_0.p" --n_conditioned 3

In [None]:
! python ../scripts/calculate_neural_latent_distances.py --data_file /share/splinter/ddm/taggingProject/taggingClean/data/final/train/spectra_SN_50.pd --model_file /share/splinter/ddm/taggingProject/taggingRepo/outputs/results_fader/runWithFe0.00001/adN7214I3000 --savepath "/share/splinter/ddm/taggingProject/taggingRepo/outputs/intermediate/distances/distances_fader_50.p" --n_conditioned 3

## Plotting the distances

We show below the code for plotting these distances as was done in the paper. By default, it uses the precalculated outputs found in ```outputs/distances/```; feel free to replace them with those you calculated yourself in the first part of this notebook.
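If you substitute your own results, the plotting code in this notebook only needs each pickle file to load as a mapping with a `"siblings"` and a `"randoms"` entry, each an array of distances. A minimal sketch of writing and reading such a file is given here. The temporary directory and the toy values are placeholders, and the file name simply follows the `distances_{method}_{SNR}.p` pattern used by `load_ranking`.

```python
import os
import pickle
import tempfile

import numpy as np

# Toy distances (placeholders, not real results).
toy = {
    "siblings": np.array([0.1, 0.2, 0.3]),
    "randoms": np.array([1.0, 2.0, 3.0]),
}

# Write to a temporary directory so nothing in outputs/ is overwritten.
tmp_dir = tempfile.mkdtemp()
toy_path = os.path.join(tmp_dir, "distances_{}_{}.p".format("fader", 0))
with open(toy_path, "wb") as f:
    pickle.dump(toy, f)

# Read the file back with pickle, as load_ranking does.
with open(toy_path, "rb") as f:
    loaded = pickle.load(f)

# Normalisation applied in the plots: divide by the mean sibling distance.
normalised_randoms = loaded["randoms"] / np.mean(loaded["siblings"])
```

Dividing both distributions by the mean sibling distance puts every method on a common scale, which is what makes the panels comparable.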


In [None]:
import pickle
import numpy as np
import os
import matplotlib.pyplot as plt
import matplotlib
from tagging.paths import basepath

In [None]:
def load_ranking(method,SNR):
    """Load the pickled distances for a given method and SNR."""
    filename = 'distances_{}_{}.p'.format(method,SNR)
    filepath = os.path.join(os.path.split(basepath)[0],"outputs","intermediate","distances",filename)
    with open(filepath, 'rb') as f:
        ranking = pickle.load(f)
    return ranking

In [None]:
methods = ["factor","poly","fader"]
SNRs = [0,50] # 0 stands for infinite SNR (noiseless), for readability
distances = {}
for method in methods:
    distances[method]={}
    for SNR in SNRs:
        distances[method][str(SNR)] = load_ranking(method,SNR)


In [None]:
fig, axes = plt.subplots(2,3,sharex=True,sharey="row",gridspec_kw={'hspace': 0, 'wspace': 0})

def plot_axis(ax,distances,text,x_max=10,y_max=1):
    # both distributions are scaled by the mean sibling distance
    ax.hist(distances["siblings"]/np.mean(distances["siblings"]),bins=100,alpha=0.5,density=True,label="chemically identical")
    ax.hist(distances["randoms"]/np.mean(distances["siblings"]),bins=100,alpha=0.5,density=True,label="chemically different")
    ax.text(0.7, 0.08, text, transform=ax.transAxes, size=16)
    ax.get_xaxis().set_major_formatter(matplotlib.ticker.ScalarFormatter())
    ax.minorticks_on()
    ax.grid(which="both")
    ax.grid(which='minor', alpha=0.2)
    ax.grid(which='major', alpha=0.4)
    ax.set_xlim(0,x_max)
    ax.set_ylim(0,y_max)
    return ax



axes[0,0] = plot_axis(axes[0,0],distances["fader"]["0"],"a",y_max=1)
axes[1,0] = plot_axis(axes[1,0],distances["fader"]["50"],"d",y_max=2.5)

#yticks = axes[0,0].yaxis.get_major_ticks() 
#yticks[0].label1.set_visible(False)


axes[0,1] = plot_axis(axes[0,1],distances["factor"]["0"],"b",y_max=1)
axes[1,1] = plot_axis(axes[1,1],distances["factor"]["50"],"e",y_max=2.5)

axes[0,2] = plot_axis(axes[0,2],distances["poly"]["0"],"c",y_max=1)
axes[1,2] = plot_axis(axes[1,2],distances["poly"]["50"],"f",y_max=2.5)
axes[0,2].legend(fontsize=8)

fig.text(0.5,0.04, "d", ha="center", va="center",fontsize=14)
fig.text(0.035, 0.3, 'SNR=50', va='center', rotation='vertical',fontsize=11)
fig.text(0.03, 0.5, 'p', va='center', rotation='vertical',fontsize=14)
fig.text(0.035, 0.7, 'noiseless', va='center', rotation='vertical',fontsize=11)

# hide the last x tick of the first two bottom panels
for ax in (axes[1,0], axes[1,1]):
    xticks = ax.xaxis.get_major_ticks()
    xticks[-1].set_visible(False)

yticks = axes[1,0].yaxis.get_major_ticks()
yticks[-1].set_visible(False)




fig.text(0.185,0.9,"FaderDis")
fig.text(0.44,0.9,"FactorDis")
fig.text(0.72,0.9,"PolyDis")
fig.savefig("../../outputs/figures/distributionComparison.pdf",format="pdf")