We recall the negative log-likelihood (with the constant $2\pi$ term dropped):
\begin{equation}
{\cal L} = -\log L = \frac{1}{2}\sum_{i=1}^N\left[\log(\sigma^2+\varepsilon_i^2)+\frac{(v_i-u)^2}{\sigma^2+ \varepsilon_i^2}\right]
\end{equation}
We take $\sigma$ and $u$ as the parameters to be fitted (following http://arxiv.org/abs/0904.3329). Setting the partial derivatives of $\cal L$ with respect to $u$ and $\sigma$ to zero yields:
\begin{align}
\sum_{i=1}^N \frac{v_i}{\sigma^2+\varepsilon_i^2} &= u\sum_{i=1}^N \frac{1}{\sigma^2+\varepsilon_i^2}\\
\sum_{i=1}^N \frac{(v_i-u)^2}{(\sigma^2+\varepsilon_i^2)^2} &= \sum_{i=1}^N \frac{1}{\sigma^2+\varepsilon_i^2}
\end{align} 
Note that in our code, we have been using $u=u_0\equiv\sum_{i=1}^N v_i/N$, the sample mean velocity. The first equation above shows that this is the solution for $u$ when the velocity measurement errors are small compared to the dispersion $\sigma$: $(\varepsilon_i/\sigma)^2\rightarrow 0\quad \forall i$. In that limit, the second equation yields $\sigma^2 = \sigma_0^2\equiv\sum_{i=1}^N (v_i-u_0)^2/N$, the sample variance, so that $\sigma_0$ is the sample standard deviation.
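
For reference, here are the derivatives behind the two conditions above, written out from the definition of $\cal L$ (a worked step added for convenience, not taken from the cited paper):
\begin{align*}
\frac{\partial{\cal L}}{\partial u} &= -\sum_{i=1}^N \frac{v_i-u}{\sigma^2+\varepsilon_i^2} = 0\\
\frac{\partial{\cal L}}{\partial \sigma} &= \sigma\sum_{i=1}^N\left[\frac{1}{\sigma^2+\varepsilon_i^2}-\frac{(v_i-u)^2}{(\sigma^2+\varepsilon_i^2)^2}\right] = 0
\end{align*}
The overall factor $\sigma$ in the second line means that $\sigma=0$ is always a stationary point; the second condition above is what remains once that factor is divided out. Setting $\varepsilon_i=0$ reduces the two lines to $\sum_i v_i = Nu$ and $N/\sigma^2 = \sum_i (v_i-u)^2/\sigma^4$, i.e. $\sigma^2=\sum_i (v_i-u)^2/N$.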

Thus, in the limit of vanishing measurement errors, the Gaussian likelihood has a trivial solution: $\sigma(\rho_0, r_0) = \sigma_0$, which defines the curve of minimal $\cal L$ in the $(\rho_0,\,r_0)$ plane of our DM density parameters. We should therefore first check whether $\mathrm{max}_i\,\varepsilon_i\ll\sigma_0$ for our dwarfs.
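
The zero-error limit can be checked numerically. The sketch below uses synthetic velocities (made-up numbers, not the dwarf data) and a hypothetical helper `neg_log_like`; it verifies that, with all errors set to zero, moving away from the sample mean and standard deviation in any direction increases $\cal L$.

```python
import numpy as np

def neg_log_like(u, s, v, dv):
    """Negative log-likelihood from the text, with the 2*pi term dropped."""
    d = s * s + dv * dv
    return 0.5 * np.sum(np.log(d) + (v - u) ** 2 / d)

# Synthetic velocities (NOT the dwarf data): true mean 100, dispersion 8.
rng = np.random.RandomState(0)
v = rng.normal(100.0, 8.0, size=200)
dv = np.zeros_like(v)            # vanishing measurement errors

u_0, s_0 = v.mean(), v.std()     # np.std divides by N, as sigma_0 does
best = neg_log_like(u_0, s_0, v, dv)

# Small displacements in u or s should all give a larger value.
shifts = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
worse = [neg_log_like(u_0 + du, s_0 + ds, v, dv) > best for du, ds in shifts]
print("(u_0, s_0) = (%g, %g), local minimum: %s" % (u_0, s_0, all(worse)))
```

Note that `v.std()` uses the $1/N$ normalisation by default, which is the one that matches the maximum-likelihood solution.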

In [11]:
import numpy as np
names = {'dra':"Draco",'seg1':"Segue 1",'umaI':"Ursa Major 1",'booI':"Bootes 1",
         'wil1':"Willman 1",'scl':"Sculptor",'for':"Fornax",'sgr':"Sagittarius",
         'com':"Coma Berenices"}
for gal in names :
    x,v,dv = np.loadtxt('data/velocities/velocities_'+gal+'.dat',dtype=float,usecols=(0,1,2),unpack=True)
    u_0 = v.mean()
    s_0 = v.std()
    print "%s : conservative %g likely more correct %g"%(names[gal],dv.max()/s_0,dv.mean()/s_0)


Ursa Major 1 : conservative 3.3286 likely more correct 0.659221
Sculptor : conservative 1.28245 likely more correct 0.273852
Fornax : conservative 1.09073 likely more correct 0.165878
Bootes 1 : conservative 2.96312 likely more correct 0.831952
Draco : conservative 0.99117 likely more correct 0.36275
Coma Berenices : conservative 3.27191 likely more correct 0.976253
Sagittarius : conservative 0.313037 likely more correct 0.146962
Segue 1 : conservative 5.36808 likely more correct 0.924557
Willman 1 : conservative 1.54722 likely more correct 0.808797


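The maximum error exceeds $\sigma_0$ for every dwarf except Draco and Sagittarius, so by this conservative measure we are not safe in neglecting the measurement errors. The mean error is perhaps a better indicator: it stays below $\sigma_0$ everywhere, with ratios ranging from about 0.15 (Sagittarius) to 0.98 (Coma Berenices), which is still far from negligible for several dwarfs. We therefore want to solve for $\sigma$ and $u$ with the measurement errors included. This can probably be done iteratively.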

In [19]:
def g(s,v,dv):
    # Inverse-variance weighted mean: solves the first equation for u at fixed s.
    d = s*s + dv*dv
    u = (v/d).sum()/(1./d).sum()
    return u

for gal in names:
    x,v,dv = np.loadtxt('data/velocities/velocities_'+gal+'.dat',dtype=float,usecols=(0,1,2),unpack=True)
    u_0 = v.mean()
    s_0 = v.std()
    u_1 = g(s_0, v, dv)
    print "%s : %g"%(names[gal],abs(u_1 - u_0)/u_0)

Ursa Major 1 : -0.00562636
Sculptor : 0.000550519
Fornax : 4.53144e-05
Bootes 1 : 0.00255312
Draco : -0.000237632
Coma Berenices : 0.00345137
Sagittarius : 7.35223e-05
Segue 1 : 0.00277647
Willman 1 : -0.0392069


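The relative change between $u_0=g(\sigma_0,v,0)$ and $u_1 = g(\sigma_0,v,dv)$ is below 1% for every dwarf except Willman 1 (about 4%). The negative signs in the output appear because only the numerator is inside abs(), so they simply reflect a negative $u_0$. This is a good sign that the initial guess with neglected errors is close to the correct solution for $u$, and thus that neglecting errors is probably not grossly inaccurate.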

In [42]:
from scipy.optimize import newton
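def h(v, dv, prev_s, curr_u):
    # Solve the second equation for s at fixed u, starting from prev_s.
    # No derivative is passed, so scipy's newton falls back to the secant method.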
    eq = lambda s : ( ((v-curr_u)/(s*s+dv*dv))**2 - (1./(s*s+dv*dv)) ).sum()
    new_s = newton(eq, prev_s)
    return new_s

for gal in names:
    x,v,dv = np.loadtxt('data/velocities/velocities_'+gal+'.dat',dtype=float,usecols=(0,1,2),unpack=True)
    s_0 = v.std()
    u_1 = g(s_0, v, dv)
    try :
        print names[gal], s_0, h(v,dv, s_0, u_1)
    except RuntimeError:
        print '\n',names[gal], ' failed'
        print names[gal], s_0, h(v,np.zeros_like(v), s_0, u_1)

Ursa Major 1 7.81108018788 7.56733020435
Sculptor 9.43504091239 8.95353536576
Fornax 10.9101270609 10.6949157573
Bootes 1 4.38727287336 
Bootes 1  failed
Bootes 1 4.38727287336 4.39433747662
Draco 9.27187479585 8.54081086709
Coma Berenices 6.57720745563 4.63235383369
Sagittarius 15.9405941804 15.7258738841
Segue 1 7.04907415167 
Segue 1  failed
Segue 1 7.04907415167 7.07305317348
Willman 1 5.93968230434 4.82532860248


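Newton's method fails to converge for Segue 1 and Bootes 1 once the measurement errors are included; for these two, the second value printed is the fallback obtained with the errors set to zero. The stepping may therefore have to compute the next $\sigma$ from the updated $u$ with the errors neglected. We need to investigate what is going on.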
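
One way to investigate is to replace the open-ended Newton search by a bracketed one and first check whether the second equation has a root at all. The sketch below does this with `scipy.optimize.brentq` on synthetic data (hypothetical numbers, not the dwarf catalogues); `score` and `solve_sigma` are names introduced here for illustration. If the left-hand side minus the right-hand side stays negative for every $\sigma>0$, then $\cal L$ only increases with $\sigma$ and its minimum sits at $\sigma=0$, in which case Newton has nothing to converge to. Whether this is what happens for Segue 1 and Bootes 1 is an assumption that has not been checked here.

```python
import numpy as np
from scipy.optimize import brentq

def score(s, v, dv, u):
    """Second equation written as LHS - RHS (zero at a solution)."""
    d = s * s + dv * dv
    return np.sum(((v - u) / d) ** 2 - 1.0 / d)

def solve_sigma(v, dv, u):
    """Bracketed root of score() in s, or None when both ends share a sign."""
    s_hi = 10.0 * (v.std() + dv.max())   # generous upper bound (assumption)
    s_lo = 1.0e-6 * s_hi                 # just above the trivial sigma = 0 point
    if score(s_lo, v, dv, u) * score(s_hi, v, dv, u) > 0:
        return None
    return brentq(score, s_lo, s_hi, args=(v, dv, u))

# Synthetic samples (hypothetical numbers, NOT the dwarf catalogues).
rng = np.random.RandomState(1)
n = 2000
dv = rng.uniform(1.0, 3.0, size=n)
noise = rng.normal(0.0, 1.0, size=n) * dv

# Case 1: intrinsic dispersion of 6 on top of the measurement noise.
v_resolved = 50.0 + rng.normal(0.0, 6.0, size=n) + noise
s_resolved = solve_sigma(v_resolved, dv, v_resolved.mean())

# Case 2: scatter far smaller than the quoted errors, so no sigma > 0 fits.
v_unresolved = 50.0 + 0.1 * noise
s_unresolved = solve_sigma(v_unresolved, dv, v_unresolved.mean())

print("resolved: %s, unresolved: %s" % (s_resolved, s_unresolved))
```

Running the same sign check on the real velocity files would tell us whether the failures come from a missing root or only from a poor starting point.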

In [60]:
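def update(prev_u, prev_s):
    # One fixed-point step; v and dv are read from the enclosing (global) scope.
    new_u = g(prev_s, v, dv)
    try :
        new_s = h(v, dv, prev_s, new_u)
    except RuntimeError:
        # Newton did not converge: fall back to the solution with zero errors.
        new_s = h(v, np.zeros_like(v), prev_s, new_u)
    return np.max([abs(new_u-prev_u), abs(new_s-prev_s)]), new_u, new_s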

# full iteration for Segue 1:
x,v,dv = np.loadtxt('data/velocities/velocities_seg1.dat',dtype=float,usecols=(0,1,2),unpack=True)
u = v.mean()
s = v.std()
conv = False
while not conv :
    eps, u, s = update(u, s)
    if eps <1.e-5 :
        conv = True
print "we can call it converged! ", u, s, v.mean(), v.std()

we can call it converged!  209.010329694 7.07293029247 209.590757576 7.04907415167
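
As an independent cross-check of the alternating scheme, one could also minimise $\cal L$ directly in both parameters and verify that the two stationarity conditions hold at the result. The sketch below does so with `scipy.optimize.minimize` on synthetic data (made-up numbers, not the dwarf velocities). Fitting $\log\sigma$ rather than $\sigma$ is a choice made here to keep $\sigma$ positive; it would not be appropriate if the minimum sits at $\sigma=0$.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_like(params, v, dv):
    """Negative log-likelihood from the text, in terms of (u, log sigma)."""
    u, log_s = params
    d = np.exp(2.0 * log_s) + dv * dv
    return 0.5 * np.sum(np.log(d) + (v - u) ** 2 / d)

# Synthetic sample (NOT the dwarf data): mean 200, dispersion 7, errors 1-3.
rng = np.random.RandomState(2)
n = 2000
dv = rng.uniform(1.0, 3.0, size=n)
v = 200.0 + rng.normal(0.0, 7.0, size=n) + rng.normal(0.0, 1.0, size=n) * dv

# Start from the zero-error solution, as the notebook does.
start = [v.mean(), np.log(v.std())]
res = minimize(neg_log_like, start, args=(v, dv), method="Nelder-Mead")
u_fit, s_fit = res.x[0], np.exp(res.x[1])

# Both sides of the two stationarity conditions at the fitted point.
d = s_fit ** 2 + dv ** 2
lhs1, rhs1 = np.sum(v / d), u_fit * np.sum(1.0 / d)
lhs2, rhs2 = np.sum((v - u_fit) ** 2 / d ** 2), np.sum(1.0 / d)
print("u = %g, sigma = %g" % (u_fit, s_fit))
print("condition ratios: %g, %g" % (lhs1 / rhs1, lhs2 / rhs2))
```

Both ratios should be close to 1 if the minimiser has landed on a solution of the two equations.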
