# TensorFlow for $\xi_{\rm hm}$

In The Aemulus Project, we have measured the halo-matter correlation function $\xi_{\rm hm}$ for many halo masses over a large range of physical scales, so we have a lot of training data from which to build an emulator. What we would like to do is fit a flexible physical model to each measured correlation function, thereby performing a dimensionality reduction on the training data. The amount of training data for an individual Gaussian process will go down from $N_{\rm sims} \times N_{\rm z} \times N_{\rm M} \times N_{\rm r}$ to $N_{\rm sims} \times N_{\rm z} \times N_{\rm M}$, or possibly fewer if we identify parameters that are simple functions of redshift $z$ or mass (or peak height $\nu$). This represents a reduction by a factor of 50-1000, depending on the final fits.
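As a quick check on the bookkeeping above, here is a minimal sketch of the counting. The four counts are hypothetical placeholders chosen only for illustration; they are not the actual Aemulus values.

```python
# Hypothetical counts, chosen only to illustrate the bookkeeping.
n_sims, n_z, n_mass, n_r = 40, 10, 20, 50

# Emulating xi_hm directly: one training point per radial bin.
n_raw = n_sims * n_z * n_mass * n_r

# Emulating a fitted parameter: one training point per measured profile.
n_fit = n_sims * n_z * n_mass

# The reduction from this step alone is exactly the number of radial bins.
reduction = n_raw // n_fit
print(n_raw, n_fit, reduction)
```

Any further reduction would come from parameters that turn out to be simple functions of $z$, mass, or $\nu$, which shrinks `n_fit` itself.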

## Advantages of TensorFlow

The reason I'd like to use TF here is the quality of the optimizers it implements. The fit function parameters have unknown degeneracies with each other, sit at very different numerical scales, and carry units. Thus, I need a good affine-invariant optimizer, and so far I have found that Nelder-Mead doesn't cut it.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
import tensorflow as tf

In [3]:
#Load in correlation function data
print("Need to download the data!")

Need to download the data!


In [4]:
#Load "parameters" that are already known, as well as extra data we need
Mass = 999e9
bias = 1.0 #from emulator
redshift = 0.
r_xi  = 0 #np.loadtxt("r.txt")
xi_mm = 0 #np.load("xi_mm_z%.2f.npy"%redshift)

In [5]:
#Define a PGM
#Placeholders for the data and result
xi_model = tf.placeholder(dtype=tf.float32) #will hold model CF
xi_data  = tf.placeholder(dtype=tf.float32) #holds the measured CF
Cinv     = tf.placeholder(dtype=tf.float32) #holds the inverse covariance matrix

In [6]:
#Parameter guesses
conc = tf.Variable(initial_value=np.random.normal(loc=1, scale=1), dtype=tf.float32)
rt   = tf.Variable(initial_value=np.random.normal(loc=1, scale=1), dtype=tf.float32)
bt   = tf.Variable(initial_value=np.random.normal(loc=1, scale=1), dtype=tf.float32)
lnMa = tf.Variable(initial_value=np.random.normal(loc=1, scale=1), dtype=tf.float32)
ca   = tf.Variable(initial_value=np.random.normal(loc=1, scale=1), dtype=tf.float32)
lnMb = tf.Variable(initial_value=np.random.normal(loc=1, scale=1), dtype=tf.float32)
cb   = tf.Variable(initial_value=np.random.normal(loc=1, scale=1), dtype=tf.float32)

# IMPORTANT!

We need to wrap the modeling function from the cluster_toolkit in a form that is [tf.py_func](https://www.tensorflow.org/api_docs/python/tf/py_func) compatible.

In [7]:
#xi_model = tf.py_func(ct.function, [my_input], tf.float32)
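To make "py_func compatible" concrete: the wrapped function should accept NumPy arrays and return a NumPy array whose dtype matches the declared output type (`tf.float32` above). The sketch below shows only that wrapper shape. `toy_xi_hm` is a hypothetical power-law stand-in, not a cluster_toolkit function, and the parameter names are borrowed from the guesses above purely for illustration.

```python
import numpy as np

def toy_xi_hm(r, conc, rt):
    """Hypothetical stand-in for the real model; NOT a physical profile."""
    return (r / rt) ** (-conc)

def xi_model_np(r, conc, rt):
    """NumPy in, float32 NumPy out: the shape a py_func wrapper needs."""
    # Do the model evaluation in double precision.
    r = np.asarray(r, dtype=np.float64)
    xi = toy_xi_hm(r, float(conc), float(rt))
    # Cast back so the output dtype matches the declared tf.float32.
    return np.asarray(xi, dtype=np.float32)

# Quick check outside of TensorFlow.
r_test = np.logspace(-1, 1, 5)
xi_test = xi_model_np(r_test, np.float32(2.0), np.float32(1.0))
print(xi_test.dtype, xi_test.shape)
```

One thing to keep in mind: as far as I know, `tf.py_func` does not supply gradients for the wrapped function on its own, so a gradient-based TF optimizer may not be able to see through it.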

In [8]:
#Make the session
_ = """
with tf.Session() as sess:
    sess.run(fetches=tf.global_variables_initializer())
    
    i = 1
    obs_vars = sess.run(fetches=[[conc], [rt], [bt], 
                                 [lnMa], [ca], 
                                 [lnMb], [cb]])
    obs_lnLike = -sess.run(fetches=[chi2], feed_dict={"stuff"})
"""
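The session sketch above fetches a `chi2` node that is not defined yet. A minimal NumPy version of what I have in mind is below, assuming the usual quadratic form $\chi^2 = (\xi_{\rm data} - \xi_{\rm model})^T C^{-1} (\xi_{\rm data} - \xi_{\rm model})$. The arrays are made-up toy numbers, not measurements.

```python
import numpy as np

def chi2_np(xi_data, xi_model, Cinv):
    """Quadratic form (d - m)^T Cinv (d - m) for 1-D data and model vectors."""
    diff = xi_data - xi_model
    return float(diff @ Cinv @ diff)

# Toy inputs with a diagonal covariance, for illustration only.
xi_d = np.array([1.0, 2.0, 3.0])
xi_m = np.array([1.5, 2.0, 2.0])
cov = np.diag([0.25, 1.0, 4.0])
Cinv_toy = np.linalg.inv(cov)

chi2_toy = chi2_np(xi_d, xi_m, Cinv_toy)
print(chi2_toy)
```

The same expression should carry over to the graph by building it from the `xi_data`, `xi_model`, and `Cinv` placeholders.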