Mathematical Framework for latentcor

Main Framework

Latent Gaussian Copula Model for Mixed Data

latentcor utilizes the powerful semi-parametric latent Gaussian copula models to estimate latent correlations between mixed data types (continuous/binary/ternary/truncated or zero-inflated). Below we review the definitions for each type.

from latentcor import gen_data, latentcor

Definition of continuous model

A random vector $X\in\cal{R}^{p}$ satisfies the Gaussian copula (or nonparanormal) model if there exist monotonically increasing $f=(f_{j})_{j=1}^{p}$ with Z_j = f_j(X_j) satisfying Z ∼ N_p(0, Σ), σ_jj = 1; we denote X ∼ NPN(0, Σ, f) :cite:`fan2017high`.

print(gen_data(n = 6, tps = "con")['X'])

Definition of binary model

A random vector $X\in\cal{R}^{p}$ satisfies the binary latent Gaussian copula model if there exists W ∼ NPN(0, Σ, f) such that X_j = I(W_j > c_j), where I( ⋅ ) is the indicator function and c_j are constants :cite:`fan2017high`.

print(gen_data(n = 6, tps = "bin")['X'])

Definition of ternary model

A random vector $X\in\cal{R}^{p}$ satisfies the ternary latent Gaussian copula model if there exists W ∼ NPN(0, Σ, f) such that X_j = I(W_j > c_j) + I(W_j > c′_j), where I( ⋅ ) is the indicator function and c_j < c′_j are constants :cite:`quan2018rank`.

print(gen_data(n = 6, tps = "ter")['X'])

Definition of truncated or zero-inflated model

A random vector $X\in\cal{R}^{p}$ satisfies the truncated latent Gaussian copula model if there exists W ∼ NPN(0, Σ, f) such that X_j = I(W_j > c_j)W_j, where I( ⋅ ) is the indicator function and c_j are constants :cite:`yoon2020sparse`.

print(gen_data(n = 6, tps = "tru")['X'])

Mixed latent Gaussian copula model

The mixed latent Gaussian copula model jointly models W = (W₁, W₂, W₃, W₄) ∼ NPN(0, Σ, f) such that X_1j = W_1j, X_2j = I(W_2j > c_2j), X_3j = I(W_3j > c_3j) + I(W_3j > c′_3j) and X_4j = I(W_4j > c_4j)W_4j.

X = gen_data(n = 100, tps = ["con", "bin", "ter", "tru"])['X']
print(X[:6, :])
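
The four observation rules above can be sketched directly. The following is a minimal illustration, not the gen_data implementation: the helper name observe_mixed and the threshold values are hypothetical, the latent coordinates are drawn independently (Σ = I), and the transformations f_j are taken to be the identity.

```python
import random

def observe_mixed(w, c_bin=0.0, c_ter=(-0.5, 0.5), c_tru=0.0):
    """Map one latent vector w = (w1, w2, w3, w4) to
    (continuous, binary, ternary, truncated) observations.
    Thresholds are illustrative assumptions."""
    w1, w2, w3, w4 = w
    x_con = w1                                        # X_1 = W_1
    x_bin = int(w2 > c_bin)                           # X_2 = I(W_2 > c_2)
    x_ter = int(w3 > c_ter[0]) + int(w3 > c_ter[1])   # X_3 = I(W_3 > c_3) + I(W_3 > c'_3)
    x_tru = w4 if w4 > c_tru else 0.0                 # X_4 = I(W_4 > c_4) W_4
    return x_con, x_bin, x_ter, x_tru

rng = random.Random(0)
rows = [observe_mixed([rng.gauss(0.0, 1.0) for _ in range(4)]) for _ in range(6)]
for row in rows:
    print(row)
```

Each printed row holds one observation of the four variable types; only the thresholding step differs between them.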

Moment-based estimation of latent correlation matrix based on bridge functions

The estimation of latent correlation matrix Σ is achieved via the bridge function F which is defined such that E(τ̂_jk) = F(σ_jk), where σ_jk is the latent correlation between variables j and k, and τ̂_jk is the corresponding sample Kendall's τ.

Kendall's correlation

Given observed $\mathbf{x}_{j}, \mathbf{x}_{k}\in\cal{R}^{n}$,

$$\hat{\tau}_{jk}=\hat{\tau}(\mathbf{x}_{j}, \mathbf{x}_{k})=\frac{2}{n(n-1)}\sum_{1\le i<i'\le n}sign(x_{ij}-x_{i'j})sign(x_{ik}-x_{i'k}),$$

where n is the sample size.

latentcor calculates pairwise Kendall's τ̂ as part of the estimation process.

K = latentcor(X, tps = ["con", "bin", "ter", "tru"])['K']
print(K)
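
For reference, the formula for τ̂_jk can be written out as a direct double loop. This is a minimal O(n²) sketch for a single pair of variables; the function name kendall_tau_a is an assumption for illustration and is not part of the latentcor API.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau_a(x, y):
    """Sample Kendall's tau: 2 / (n (n - 1)) times the sum over pairs i < i'
    of sign(x_i - x_i') * sign(y_i - y_i'). Tied pairs contribute zero."""
    n = len(x)
    total = 0
    for i in range(n - 1):
        for k in range(i + 1, n):
            total += sign(x[i] - x[k]) * sign(y[i] - y[k])
    return 2.0 * total / (n * (n - 1))

print(kendall_tau_a([0, 0, 1, 1], [1.2, 2.5, 3.1, 4.0]))
```

Because ties contribute zero to the sum while the normalization stays n(n-1)/2, |τ̂| cannot reach 1 for discrete variables, which is what the bounds τ̄ later in this document quantify.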

Using F and τ̂_jk, a moment-based estimator is σ̂_jk = F⁻¹(τ̂_jk) with the corresponding Σ̂ being consistent for Σ :cite:`fan2017high,quan2018rank,yoon2020sparse`.

The explicit form of the bridge function F has been derived for all combinations of continuous (C), binary (B), ternary (N), and truncated (T) variable types, and we summarize the corresponding references. Each of these combinations is implemented in latentcor.

Below we provide an explicit form of F for each combination.

Theorem (explicit form of bridge function)

Let $W_{1}\in\cal{R}^{p_{1}}$, $W_{2}\in\cal{R}^{p_{2}}$, $W_{3}\in\cal{R}^{p_{3}}$, $W_{4}\in\cal{R}^{p_{4}}$ be such that W = (W₁, W₂, W₃, W₄) ∼ NPN(0, Σ, f) with p = p₁ + p₂ + p₃ + p₄. Let $X=(X_{1}, X_{2}, X_{3}, X_{4})\in\cal{R}^{p}$ satisfy X_j = W_j for j = 1, ..., p₁, X_j = I(W_j > c_j) for j = p₁ + 1, ..., p₁ + p₂, X_j = I(W_j > c_j) + I(W_j > c′_j) for j = p₁ + p₂ + 1, ..., p₁ + p₂ + p₃ and X_j = I(W_j > c_j)W_j for j = p₁ + p₂ + p₃ + 1, ..., p with Δ_j = f_j(c_j). The rank-based estimator of Σ based on the observed n realizations of X is the matrix R̂ with r̂_jj = 1, r̂_jk = r̂_kj = F⁻¹(τ̂_jk) with block structure

$$\begin{aligned} \mathbf{\hat{R}}=\begin{pmatrix} F_{CC}^{-1}(\hat{\tau}) & F_{CB}^{-1}(\hat{\tau}) & F_{CN}^{-1}(\hat{\tau}) & F_{CT}^{-1}(\hat{\tau})\\\ F_{BC}^{-1}(\hat{\tau}) & F_{BB}^{-1}(\hat{\tau}) & F_{BN}^{-1}(\hat{\tau}) & F_{BT}^{-1}(\hat{\tau})\\\ F_{NC}^{-1}(\hat{\tau}) & F_{NB}^{-1}(\hat{\tau}) & F_{NN}^{-1}(\hat{\tau}) & F_{NT}^{-1}(\hat{\tau})\\\ F_{TC}^{-1}(\hat{\tau}) & F_{TB}^{-1}(\hat{\tau}) & F_{TN}^{-1}(\hat{\tau}) & F_{TT}^{-1}(\hat{\tau}) \end{pmatrix} \end{aligned}$$

$$\begin{aligned} F(\cdot)=\begin{cases} CC: & 2\sin^{-1}(r)/\pi \\\ \\\ BC: & 4\Phi_{2}(\Delta_{j},0;r/\sqrt{2})-2\Phi(\Delta_{j}) \\\ \\\ BB: & 2\{\Phi_{2}(\Delta_{j},\Delta_{k};r)-\Phi(\Delta_{j})\Phi(\Delta_{k})\} \\\ \\\ NC: & 4\Phi_{2}(\Delta_{j}^{2},0;r/\sqrt{2})-2\Phi(\Delta_{j}^{2})+4\Phi_{3}(\Delta_{j}^{1},\Delta_{j}^{2},0;\Sigma_{3a}(r))-2\Phi(\Delta_{j}^{1})\Phi(\Delta_{j}^{2})\\\ \\\ NB: & 2\Phi_{2}(\Delta_{j}^{2},\Delta_{k},r)\{1-\Phi(\Delta_{j}^{1})\}-2\Phi(\Delta_{j}^{2})\{\Phi(\Delta_{k})-\Phi_{2}(\Delta_{j}^{1},\Delta_{k},r)\} \\\ \\\ NN: & 2\Phi_{2}(\Delta_{j}^{2},\Delta_{k}^{2};r)\Phi_{2}(-\Delta_{j}^{1},-\Delta_{k}^{1};r)-2\{\Phi(\Delta_{j}^{2})-\Phi_{2}(\Delta_{j}^{2},\Delta_{k}^{1};r)\}\{\Phi(\Delta_{k}^{2})\\\ & -\Phi_{2}(\Delta_{j}^{1},\Delta_{k}^{2};r)\} \\\ \\\ TC: & -2\Phi_{2}(-\Delta_{j},0;1/\sqrt{2})+4\Phi_{3}(-\Delta_{j},0,0;\Sigma_{3b}(r)) \\\ \\\ TB: & 2\{1-\Phi(\Delta_{j})\}\Phi(\Delta_{k})-2\Phi_{3}(-\Delta_{j},\Delta_{k},0;\Sigma_{3c}(r))-2\Phi_{3}(-\Delta_{j},\Delta_{k},0;\Sigma_{3d}(r)) \\\ \\\ TN: & -2\Phi(-\Delta_{k}^{1})\Phi(\Delta_{k}^{2}) + 2\Phi_{3}(-\Delta_{k}^{1},\Delta_{k}^{2},\Delta_{j};\Sigma_{3e}(r)) \\\ & +2\Phi_{4}(-\Delta_{k}^{1},\Delta_{k}^{2},-\Delta_{j},0;\Sigma_{4a}(r))+2\Phi_{4}(-\Delta_{k}^{1},\Delta_{k}^{2},-\Delta_{j},0;\Sigma_{4b}(r)) \\\ \\\ TT: & -2\Phi_{4}(-\Delta_{j},-\Delta_{k},0,0;\Sigma_{4c}(r))+2\Phi_{4}(-\Delta_{j},-\Delta_{k},0,0;\Sigma_{4d}(r)) \\\ \end{cases} \end{aligned}$$

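where $\Delta_{j}=\Phi^{-1}(\pi_{0j})$, $\Delta_{k}=\Phi^{-1}(\pi_{0k})$, $\Delta_{j}^{1}=\Phi^{-1}(\pi_{0j})$, $\Delta_{j}^{2}=\Phi^{-1}(\pi_{0j}+\pi_{1j})$, $\Delta_{k}^{1}=\Phi^{-1}(\pi_{0k})$, $\Delta_{k}^{2}=\Phi^{-1}(\pi_{0k}+\pi_{1k})$,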

$$\begin{aligned} \Sigma_{3a}(r)= \begin{pmatrix} 1 & 0 & \frac{r}{\sqrt{2}} \\\ 0 & 1 & -\frac{r}{\sqrt{2}} \\\ \frac{r}{\sqrt{2}} & -\frac{r}{\sqrt{2}} & 1 \end{pmatrix}, \;\;\; \Sigma_{3b}(r)= \begin{pmatrix} 1 & \frac{1}{\sqrt{2}} & \frac{r}{\sqrt{2}}\\\ \frac{1}{\sqrt{2}} & 1 & r \\\ \frac{r}{\sqrt{2}} & r & 1 \end{pmatrix}, \end{aligned}$$

$$\begin{aligned} \Sigma_{3c}(r)= \begin{pmatrix} 1 & -r & \frac{1}{\sqrt{2}} \\\ -r & 1 & -\frac{r}{\sqrt{2}} \\\ \frac{1}{\sqrt{2}} & -\frac{r}{\sqrt{2}} & 1 \end{pmatrix}, \;\;\; \Sigma_{3d}(r)= \begin{pmatrix} 1 & 0 & -\frac{1}{\sqrt{2}} \\\ 0 & 1 & -\frac{r}{\sqrt{2}} \\\ -\frac{1}{\sqrt{2}} & -\frac{r}{\sqrt{2}} & 1 \end{pmatrix}, \end{aligned}$$

$$\begin{aligned} \Sigma_{3e}(r)= \begin{pmatrix} 1 & 0 & 0 \\\ 0 & 1 & r \\\ 0 & r & 1 \end{pmatrix}, \;\;\; \Sigma_{4a}(r)= \begin{pmatrix} 1 & 0 & 0 & \frac{r}{\sqrt{2}} \\\ 0 & 1 & -r & \frac{r}{\sqrt{2}} \\\ 0 & -r & 1 & -\frac{1}{\sqrt{2}} \\\ \frac{r}{\sqrt{2}} & \frac{r}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 1 \end{pmatrix}, \end{aligned}$$

$$\begin{aligned} \Sigma_{4b}(r)= \begin{pmatrix} 1 & 0 & r & \frac{r}{\sqrt{2}} \\\ 0 & 1 & 0 & \frac{r}{\sqrt{2}} \\\ r & 0 & 1 & \frac{1}{\sqrt{2}} \\\ \frac{r}{\sqrt{2}} & \frac{r}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 1 \end{pmatrix}, \;\;\; \Sigma_{4c}(r)= \begin{pmatrix} 1 & 0 & \frac{1}{\sqrt{2}} & -\frac{r}{\sqrt{2}} \\\ 0 & 1 & -\frac{r}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\\ \frac{1}{\sqrt{2}} & -\frac{r}{\sqrt{2}} & 1 & -r \\\ -\frac{r}{\sqrt{2}} & \frac{1}{\sqrt{2}} & -r & 1 \end{pmatrix} \end{aligned}$$

and

$$\begin{aligned} \Sigma_{4d}(r)= \begin{pmatrix} 1 & r & \frac{1}{\sqrt{2}} & \frac{r}{\sqrt{2}} \\\ r & 1 & \frac{r}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\\ \frac{1}{\sqrt{2}} & \frac{r}{\sqrt{2}} & 1 & r \\\ \frac{r}{\sqrt{2}} & \frac{1}{\sqrt{2}} & r & 1 \end{pmatrix}. \end{aligned}$$
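
As a numerical sanity check of the BB and BC rows of the theorem, the sketch below evaluates them with the Python standard library only. The helper names (phi2, bridge_bb, bridge_bc) and the Simpson-rule evaluation of Φ₂ are assumptions made for illustration and are not how latentcor computes these quantities. With zero thresholds, the results can be compared with the classical closed form Φ₂(0, 0; ρ) = 1/4 + sin⁻¹(ρ)/(2π).

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf and cdf

def phi2(a, b, rho, steps=2000, lower=-8.0):
    """P(Z1 <= a, Z2 <= b) for a standard bivariate normal with |rho| < 1,
    computed as the integral of pdf(x) * cdf((b - rho x) / sqrt(1 - rho^2))
    over (lower, a) with Simpson's rule (steps must be even)."""
    if a <= lower:
        return 0.0
    h = (a - lower) / steps
    scale = math.sqrt(1.0 - rho * rho)

    def g(x):
        return _STD.pdf(x) * _STD.cdf((b - rho * x) / scale)

    total = g(lower) + g(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lower + i * h)
    return total * h / 3.0

def bridge_bb(r, delta_j, delta_k):
    """BB case: 2 {Phi2(D_j, D_k; r) - Phi(D_j) Phi(D_k)}."""
    return 2.0 * (phi2(delta_j, delta_k, r) - _STD.cdf(delta_j) * _STD.cdf(delta_k))

def bridge_bc(r, delta_j):
    """BC case: 4 Phi2(D_j, 0; r / sqrt(2)) - 2 Phi(D_j)."""
    return 4.0 * phi2(delta_j, 0.0, r / math.sqrt(2.0)) - 2.0 * _STD.cdf(delta_j)

# With zero thresholds the closed forms are asin(r)/pi and 2 asin(r/sqrt(2))/pi.
print(bridge_bb(0.5, 0.0, 0.0), math.asin(0.5) / math.pi)
print(bridge_bc(0.5, 0.0), 2.0 * math.asin(0.5 / math.sqrt(2.0)) / math.pi)
```

Both bridge functions are increasing in r, which is what makes the inversion F⁻¹(τ̂_jk) well defined.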

Estimation methods

Given the form of bridge function F, obtaining a moment-based estimation σ̂_jk requires inversion of F. latentcor implements two methods for calculation of the inversion:

method = "original"
method = "approx"

Both methods calculate inverse bridge function applied to each element of sample Kendall's τ matrix. Because the calculation is performed point-wise (separately for each pair of variables), the resulting point-wise estimator of correlation matrix may not be positive semi-definite. latentcor performs projection of the pointwise-estimator to the space of positive semi-definite matrices, and allows for shrinkage towards identity matrix using the parameter nu.

Original method (`method = "original"`)

The original estimation approach relies on numerical inversion of F by solving a one-dimensional optimization problem. Given the calculated τ̂_jk (sample Kendall's τ between variables j and k), the estimate of the latent correlation σ̂_jk is obtained by calling the scipy.optimize.fminbound function to solve the following optimization problem:

r̂_jk = arg min_r{F(r) − τ̂_jk}².

The parameter tol controls the desired accuracy of the minimizer and is passed to scipy.optimize.fminbound, with a default precision of 10⁻⁸.

estimate_original = latentcor(X, tps = ["con", "bin", "ter", "tru"], method = "original", tol = 1e-8)
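
To make the optimization problem concrete, here is a minimal standard-library sketch for the continuous/continuous case, where the answer can be checked against the closed form F⁻¹(τ) = sin(πτ/2). A golden-section search stands in for scipy.optimize.fminbound; the function names and the search interval [-0.99, 0.99] are assumptions for illustration, not latentcor internals.

```python
import math

def bridge_cc(r):
    """CC bridge function: F(r) = 2 asin(r) / pi."""
    return 2.0 * math.asin(r) / math.pi

def invert_bridge(bridge, tau, lo=-0.99, hi=0.99, tol=1e-8):
    """Minimise (bridge(r) - tau)^2 over [lo, hi] by golden-section search.
    The objective is unimodal because the bridge function is monotone in r."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0

    def objective(r):
        return (bridge(r) - tau) ** 2

    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if objective(c) < objective(d):
            b = d  # minimiser lies in [a, d]
        else:
            a = c  # minimiser lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

r_hat = invert_bridge(bridge_cc, 0.5)
print(r_hat, math.sin(math.pi * 0.5 / 2.0))  # numerical vs closed-form inverse
```

For the other variable types no closed-form inverse is available, so a one-dimensional search of this kind is applied to the corresponding F(r, Δ).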

Algorithm for Original method

Input: F(r) = F(r, Δ) - bridge function based on the type of variables j, k

Step 1. Calculate τ̂_jk using the sample Kendall's τ formula given above.

print(estimate_original['K'])

Step 2. For binary/truncated variable j, set $\hat{\mathbf{\Delta}}_{j}=\hat{\Delta}_{j}=\Phi^{-1}(\pi_{0j})$ with $\pi_{0j}=\sum_{i=1}^{n}\frac{I(x_{ij}=0)}{n}$. For ternary variable j, set $\hat{\mathbf{\Delta}}_{j}=(\hat{\Delta}_{j}^{1}, \hat{\Delta}_{j}^{2})$ where $\hat{\Delta}_{j}^{1}=\Phi^{-1}(\pi_{0j})$ and $\hat{\Delta}_{j}^{2}=\Phi^{-1}(\pi_{0j}+\pi_{1j})$ with $\pi_{0j}=\sum_{i=1}^{n}\frac{I(x_{ij}=0)}{n}$ and $\pi_{1j}=\sum_{i=1}^{n}\frac{I(x_{ij}=1)}{n}$.

print(estimate_original['zratios'])
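
Step 2 amounts to counting zeros (and ones) and applying the standard normal quantile function. A minimal sketch with the standard library follows; the function names are hypothetical and this is not the code that produces zratios. Note that NormalDist.inv_cdf raises an error when a proportion is exactly 0 or 1, so degenerate columns would need separate handling.

```python
from statistics import NormalDist

_STD = NormalDist()

def estimate_delta_binary(x):
    """Delta_j = Phi^{-1}(pi_0j) for a binary or truncated variable,
    where pi_0j is the observed proportion of zeros."""
    pi0 = sum(1 for v in x if v == 0) / len(x)
    return _STD.inv_cdf(pi0)

def estimate_delta_ternary(x):
    """(Delta_j^1, Delta_j^2) = (Phi^{-1}(pi_0j), Phi^{-1}(pi_0j + pi_1j))."""
    n = len(x)
    pi0 = sum(1 for v in x if v == 0) / n
    pi1 = sum(1 for v in x if v == 1) / n
    return _STD.inv_cdf(pi0), _STD.inv_cdf(pi0 + pi1)

print(estimate_delta_binary([0, 1, 0, 1]))
print(estimate_delta_ternary([0, 1, 2, 0, 1, 2, 0, 1]))
```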

Step 3. Compute F⁻¹(τ̂_jk) as r̂_jk = arg min_r{F(r) − τ̂_jk}², solved via the scipy.optimize.fminbound function with accuracy tol.

print(estimate_original['Rpointwise'])

Approximation method (`method = "approx"`)

A faster approximation method is based on multi-linear interpolation of the pre-computed inverse bridge function on a fixed grid of points :cite:`yoon2021fast`. This is possible as the inverse bridge function is an analytic function of at most 5 parameters:

Kendall's τ
Proportion of zeros in the 1st variable
(Possibly) proportion of zeros and ones in the 1st variable
(Possibly) proportion of zeros in the 2nd variable
(Possibly) proportion of zeros and ones in the 2nd variable

In short, d-dimensional multi-linear interpolation uses a weighted average of 2^d neighbors to approximate the function values at the points within the d-dimensional cube of the neighbors. To perform the interpolation, latentcor uses scipy.interpolate.RegularGridInterpolator. This approximation method was first described in :cite:`yoon2021fast` for the continuous/binary/truncated cases. In latentcor, we additionally implement the ternary case and optimize the choice of grid as well as the interpolation boundary for faster computations with a smaller memory footprint.

estimate_approx = latentcor(X, tps = ["con", "bin", "ter", "tru"], method = "approx")
print(estimate_approx['Rpointwise'])
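
The "weighted average of 2^d neighbors" can be illustrated for d = 2, where the four corners of the enclosing grid cell are combined. This is a hand-written sketch of the idea, not a substitute for scipy.interpolate.RegularGridInterpolator; the function name bilinear and the toy grid are assumptions.

```python
from bisect import bisect_right

def bilinear(grid_x, grid_y, values, x, y):
    """Interpolate at (x, y), where values[i][j] is the function value at
    (grid_x[i], grid_y[j]). Uses the 2^2 = 4 corners of the enclosing cell."""
    # Locate the cell, clamping so that points on the upper edge stay inside.
    i = min(max(bisect_right(grid_x, x) - 1, 0), len(grid_x) - 2)
    j = min(max(bisect_right(grid_y, y) - 1, 0), len(grid_y) - 2)
    tx = (x - grid_x[i]) / (grid_x[i + 1] - grid_x[i])
    ty = (y - grid_y[j]) / (grid_y[j + 1] - grid_y[j])
    return ((1 - tx) * (1 - ty) * values[i][j]
            + tx * (1 - ty) * values[i + 1][j]
            + (1 - tx) * ty * values[i][j + 1]
            + tx * ty * values[i + 1][j + 1])

# Toy grid: tabulate g(x, y) = 2x + 3y + 1, which bilinear interpolation reproduces exactly.
gx, gy = [0.0, 1.0, 2.0], [0.0, 0.5, 1.0]
table = [[2 * a + 3 * b + 1 for b in gy] for a in gx]
print(bilinear(gx, gy, table, 0.3, 0.7))
```

In the approximation method the tabulated function is the inverse bridge function, and the grid coordinates are Kendall's τ and the zero proportions, so d ranges from 2 to 5 depending on the variable types.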

Algorithm for Approximation method

Input: Let ǧ = h(g), pre-computed values F⁻¹(h⁻¹(ǧ)) on a fixed grid $\check{g}\in\check{\cal{G}}$ based on the type of variables j and k. For binary/continuous case, ǧ = (τ̌_jk, Δ̌_j); for binary/binary case, ǧ = (τ̌_jk, Δ̌_j, Δ̌_k); for truncated/continuous case, ǧ = (τ̌_jk, Δ̌_j); for truncated/truncated case, ǧ = (τ̌_jk, Δ̌_j, Δ̌_k); for ternary/continuous case, ǧ = (τ̌_jk, Δ̌_j¹, Δ̌_j²); for ternary/binary case, ǧ = (τ̌_jk, Δ̌_j¹, Δ̌_j², Δ̌_k); for ternary/truncated case, ǧ = (τ̌_jk, Δ̌_j¹, Δ̌_j², Δ̌_k); for ternary/ternary case, ǧ = (τ̌_jk, Δ̌_j¹, Δ̌_j², Δ̌_k¹, Δ̌_k²).

Steps 1 and 2 are the same as in the Original method.
Step 3. If |τ̂_jk| ≤ ratio × τ̄_jk( ⋅ ), apply interpolation; otherwise apply Original method.

To avoid interpolation in areas with high approximation errors close to the boundary, we use a hybrid scheme in Step 3. The parameter ratio controls the size of the region where the interpolation is performed (ratio = 0 means no interpolation, ratio = 1 means interpolation is always performed). For the derivation of the approximate bound for the BC, BB, TC, TB, and TT cases, see :cite:`yoon2021fast`. The derivation of the approximate bound for the NC, NB, NN, and NT cases is in the Appendix.

$$\begin{aligned} \bar{\tau}_{jk}(\cdot)= \begin{cases} 2\pi_{0j}(1-\pi_{0j}) & for \; BC \; case\\\ 2\min(\pi_{0j},\pi_{0k})\{1-\max(\pi_{0j}, \pi_{0k})\} & for \; BB \; case\\\ 2\{\pi_{0j}(1-\pi_{0j})+\pi_{1j}(1-\pi_{0j}-\pi_{1j})\} & for \; NC \; case\\\ 2\min(\pi_{0j}(1-\pi_{0j})+\pi_{1j}(1-\pi_{0j}-\pi_{1j}),\pi_{0k}(1-\pi_{0k})) & for \; NB \; case\\\ 2\min(\pi_{0j}(1-\pi_{0j})+\pi_{1j}(1-\pi_{0j}-\pi_{1j}), \\\ \;\;\;\;\;\;\;\;\;\;\pi_{0k}(1-\pi_{0k})+\pi_{1k}(1-\pi_{0k}-\pi_{1k})) & for \; NN \; case\\\ 1-(\pi_{0j})^{2} & for \; TC \; case\\\ 2\max(\pi_{0k},1-\pi_{0k})\{1-\max(\pi_{0k},1-\pi_{0k},\pi_{0j})\} & for \; TB \; case\\\ 1-\{\max(\pi_{0j},\pi_{0k},\pi_{1k},1-\pi_{0k}-\pi_{1k})\}^{2} & for \; TN \; case\\\ 1-\{\max(\pi_{0j},\pi_{0k})\}^{2} & for \; TT \; case\\\ \end{cases} \end{aligned}$$

By default, latentcor uses ratio = 0.9, as this value was recommended in :cite:`yoon2021fast` for its good balance of accuracy and computational speed. This value, however, can be modified by the user:

print(latentcor(X, tps = ["con", "bin", "ter", "tru"], method = "approx", ratio = 0.99)['R'])
print(latentcor(X, tps = ["con", "bin", "ter", "tru"], method = "approx", ratio = 0.4)['R'])
print(latentcor(X, tps = ["con", "bin", "ter", "tru"], method = "original")['R'])

The lower the ratio, the closer the approximation method is to the original method (with ratio = 0 being equivalent to method = "original"), but also the higher the computational cost.
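
The hybrid rule in Step 3 can be sketched for a few of the cases listed in the bound τ̄_jk( ⋅ ). The function names tau_bound and use_interpolation are hypothetical, and only the BC, BB, TC, and TT rows are included to keep the example short.

```python
def tau_bound(case, pi0j, pi0k=None):
    """Approximate upper bound on |tau| for a few variable-type pairs,
    given the zero proportions pi0j (and pi0k where needed)."""
    if case == "BC":
        return 2.0 * pi0j * (1.0 - pi0j)
    if case == "BB":
        return 2.0 * min(pi0j, pi0k) * (1.0 - max(pi0j, pi0k))
    if case == "TC":
        return 1.0 - pi0j ** 2
    if case == "TT":
        return 1.0 - max(pi0j, pi0k) ** 2
    raise ValueError("case not covered in this sketch: " + case)

def use_interpolation(tau_hat, bound, ratio=0.9):
    """Step 3: interpolate when |tau_hat| <= ratio * bound,
    otherwise fall back to the original (numerical inversion) method."""
    return abs(tau_hat) <= ratio * bound

bound = tau_bound("BC", 0.5)
print(bound, use_interpolation(0.40, bound), use_interpolation(0.48, bound))
```

With half of the binary values equal to zero, the BC bound is 0.5, so at ratio = 0.9 a Kendall's τ of 0.40 is interpolated while 0.48 falls in the boundary region handled by the original method.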

Rescaled Grid for Interpolation

Since |τ̂| ≤ τ̄, the grid does not need to cover the whole domain τ ∈ [ − 1, 1]. To optimize memory associated with storing the grid, we rescale τ as follows:

τ̌_jk = τ_jk/τ̄_jk ∈ [ − 1, 1],

where τ̄_jk is as defined above.

In addition, for ternary variable j, it always holds that $\Delta_{j}^{2}>\Delta_{j}^{1}$, since $\Delta_{j}^{1}=\Phi^{-1}(\pi_{0j})$ and $\Delta_{j}^{2}=\Phi^{-1}(\pi_{0j}+\pi_{1j})$.

Thus, the grid does not need to cover the area corresponding to

Δ_j¹ ≥ Δ_j².

We thus rescale as follows:

Δ̌_j¹ = Δ_j¹/Δ_j² ∈ [0, 1];

Δ̌_j² = Δ_j² ∈ [0, 1].

Adjustment of pointwise-estimator for positive-definiteness

Since the estimation is performed point-wise, the resulting matrix of estimated latent correlations is not guaranteed to be positive semi-definite. For example, this could be expected when the sample size is small (and so the estimation error for each pairwise correlation is larger).

X = gen_data(n = 6, tps = ["con", "bin", "ter", "tru"])['X']
print(latentcor(X, tps = ["con", "bin", "ter", "tru"])['Rpointwise'])

latentcor automatically corrects the pointwise estimator to be positive definite by making two adjustments. First, if the smallest eigenvalue of Rpointwise is less than zero, latentcor projects this matrix to the nearest positive semi-definite matrix. The user is notified of this adjustment through a message (suppressed in the previous code chunk), e.g.

print(latentcor(X, tps = ["con", "bin", "ter", "tru"])['R'])

Second, latentcor shrinks the adjusted matrix of correlations towards the identity matrix using the parameter ν with a default value of 0.001 (nu = 0.001), so that the resulting R is strictly positive definite with its minimal eigenvalue greater than or equal to ν. That is,

R = (1 − ν)R̃ + νI,

where R̃ is the nearest positive semi-definite matrix to Rpointwise.

print(latentcor(X, tps = ["con", "bin", "ter", "tru"], nu = 0.001)['R'])
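
The effect of the shrinkage step on the eigenvalues can be checked by hand on a 2 × 2 example. The sketch below uses plain Python lists and the closed-form eigenvalues of a symmetric 2 × 2 matrix; the helper names are hypothetical and the projection step itself is not shown.

```python
import math

def shrink(r_tilde, nu=0.001):
    """R = (1 - nu) * R_tilde + nu * I for a square matrix given as nested lists."""
    p = len(r_tilde)
    return [[(1.0 - nu) * r_tilde[i][j] + (nu if i == j else 0.0)
             for j in range(p)] for i in range(p)]

def eigenvalues_2x2(m):
    """Eigenvalues (smallest, largest) of a symmetric 2 x 2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius

# A singular positive semi-definite correlation matrix: eigenvalues 0 and 2.
r_tilde = [[1.0, 1.0], [1.0, 1.0]]
r = shrink(r_tilde, nu=0.001)
print(eigenvalues_2x2(r_tilde), eigenvalues_2x2(r))
```

Each eigenvalue λ of R̃ is mapped to (1 − ν)λ + ν, so a zero eigenvalue becomes ν and the diagonal of R stays equal to one.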

As a result, R and Rpointwise could be quite different when sample size n is small. When n is large and p is moderate, the difference is typically driven by parameter nu.

X = gen_data(n = 100, tps = ["con", "bin", "ter", "tru"])['X']
out = latentcor(X, tps = ["con", "bin", "ter", "tru"], nu = 0.001)
print(out['Rpointwise'])
print(out['R'])

Appendix

Derivation of bridge function for ternary/truncated case

Without loss of generality, let j = 1 and k = 2. By the definition of Kendall's τ,

$$\tau_{12}=E(\hat{\tau}_{12})=E\left[\frac{2}{n(n-1)}\sum_{1\leq i<i' \leq n} sign\{(X_{i1}-X_{i'1})(X_{i2}-X_{i'2})\}\right].$$

Since X₁ is ternary,

$$\begin{aligned} \begin{align} &sign(X_{1}-X_{1}') \nonumber\\ =&[I(U_{1}>C_{11},U_{1}'\leq C_{11})+I(U_{1}>C_{12},U_{1}'\leq C_{12})-I(U_{1}>C_{12},U_{1}'\leq C_{11})] \nonumber\\\ &-[I(U_{1}\leq C_{11}, U_{1}'>C_{11})+I(U_{1}\leq C_{12}, U_{1}'>C_{12})-I(U_{1}\leq C_{11}, U_{1}'>C_{12})] \nonumber\\\ =&[I(U_{1}>C_{11})-I(U_{1}>C_{11},U_{1}'>C_{11})+I(U_{1}>C_{12})-I(U_{1}>C_{12},U_{1}'>C_{12}) \nonumber\\\ &-I(U_{1}>C_{12})+I(U_{1}>C_{12},U_{1}'>C_{11})] \nonumber\\\ &-[I(U_{1}'>C_{11})-I(U_{1}>C_{11},U_{1}'>C_{11})+I(U_{1}'>C_{12})-I(U_{1}>C_{12},U_{1}'>C_{12}) \nonumber\\\ &-I(U_{1}'>C_{12})+I(U_{1}>C_{11},U_{1}'>C_{12})] \nonumber\\\ =&I(U_{1}>C_{11})+I(U_{1}>C_{12},U_{1}'>C_{11})-I(U_{1}'>C_{11})-I(U_{1}>C_{11},U_{1}'>C_{12}) \nonumber\\\ =&I(U_{1}>C_{11},U_{1}'\leq C_{12})-I(U_{1}'>C_{11},U_{1}\leq C_{12}). \end{align} \end{aligned}$$

Since X₂ is truncated, C₂ > 0 and

$$\begin{aligned} sign(X_{2}-X_{2}')=&-I(X_{2}=0,X_{2}'>0)+I(X_{2}>0,X_{2}'=0) \\ &+I(X_{2}>0,X_{2}'>0)sign(X_{2}-X_{2}') \\ =&-I(X_{2}=0)+I(X_{2}'=0)+I(X_{2}>0,X_{2}'>0)sign(X_{2}-X_{2}'). \end{aligned}$$

Since f is monotonically increasing, sign(X₂ − X₂′) = sign(Z₂ − Z₂′),

$$\begin{aligned} \begin{align} \tau_{12}=&E[I(U_{1}>C_{11},U_{1}'\leq C_{12}) sign(X_{2}-X_{2}')] \nonumber\\ &-E[I(U_{1}'>C_{11},U_{1}\leq C_{12}) sign(X_{2}-X_{2}')] \nonumber\\\ =&-E[I(U_{1}>C_{11},U_{1}'\leq C_{12}) I(X_{2}=0)] \nonumber\\\ &+E[I(U_{1}>C_{11},U_{1}'\leq C_{12}) I(X_{2}'=0)] \nonumber\\\ &+E[I(U_{1}>C_{11},U_{1}'\leq C_{12})I(X_{2}>0,X_{2}'>0)sign(Z_{2}-Z_{2}')] \nonumber\\\ &+E[I(U_{1}'>C_{11},U_{1}\leq C_{12}) I(X_{2}=0)] \nonumber\\\ &-E[I(U_{1}'>C_{11},U_{1}\leq C_{12}) I(X_{2}'=0)] \nonumber\\\ &-E[I(U_{1}'>C_{11},U_{1}\leq C_{12})I(X_{2}>0,X_{2}'>0)sign(Z_{2}-Z_{2}')] \nonumber\\\ =&-2E[I(U_{1}>C_{11},U_{1}'\leq C_{12}) I(X_{2}=0)] \nonumber\\\ &+2E[I(U_{1}>C_{11},U_{1}'\leq C_{12}) I(X_{2}'=0)] \nonumber\\\ &+E[I(U_{1}>C_{11},U_{1}'\leq C_{12})I(X_{2}>0,X_{2}'>0)sign(Z_{2}-Z_{2}')] \nonumber\\\ &-E[I(U_{1}'>C_{11},U_{1}\leq C_{12})I(X_{2}>0,X_{2}'>0)sign(Z_{2}-Z_{2}')]. \end{align} \end{aligned}$$

From the definition of U, let Z_j = f_j(U_j) and Δ_j = f_j(C_j) for j = 1, 2. Using sign(x) = 2I(x > 0) − 1, we obtain

$$\begin{aligned} \begin{align} \tau_{12}=&-2E[I(Z_{1}>\Delta_{11},Z_{1}'\leq \Delta_{12},Z_{2}\leq \Delta_{2})]+2E[I(Z_{1}>\Delta_{11},Z_{1}'\leq \Delta_{12},Z_{2}'\leq \Delta_{2})] \nonumber\\\ &+2E[I(Z_{1}>\Delta_{11},Z_{1}'\leq \Delta_{12})I(Z_{2}>\Delta_{2},Z_{2}'>\Delta_{2},Z_{2}-Z_{2}'>0)] \nonumber\\\ &-2E[I(Z_{1}'>\Delta_{11},Z_{1}\leq \Delta_{12})I(Z_{2}>\Delta_{2},Z_{2}'>\Delta_{2},Z_{2}-Z_{2}'>0)] \nonumber\\\ =&-2E[I(Z_{1}>\Delta_{11},Z_{1}'\leq \Delta_{12}, Z_{2}\leq \Delta_{2})]+2E[I(Z_{1}>\Delta_{11},Z_{1}'\leq \Delta_{12}, Z_{2}'\leq \Delta_{2})] \nonumber\\\ &+2E[I(Z_{1}>\Delta_{11},Z_{1}'\leq\Delta_{12},Z_{2}'>\Delta_{2},Z_{2}>Z_{2}')] \nonumber\\\ &-2E[I(Z_{1}'>\Delta_{11},Z_{1}\leq\Delta_{12},Z_{2}'>\Delta_{2},Z_{2}>Z_{2}')]. \end{align} \end{aligned}$$

Since $\{\frac{Z_{2}'-Z_{2}}{\sqrt{2}}, -Z_{1}\}$, $\{\frac{Z_{2}'-Z_{2}}{\sqrt{2}}, Z_{1}'\}$ and $\{\frac{Z_{2}'-Z_{2}}{\sqrt{2}}, -Z_{2}'\}$ are standard bivariate normally distributed variables with correlation $\frac{r}{\sqrt{2}}$, $\frac{r}{\sqrt{2}}$ and $-\frac{1}{\sqrt{2}}$, respectively, by the definition of Φ₃(⋅,⋅,⋅; ⋅ ) and Φ₄(⋅,⋅,⋅,⋅; ⋅ ) we have

$$\begin{aligned} \begin{align} F_{NT}(r;\Delta_{j}^{1},\Delta_{j}^{2},\Delta_{k})= & -2\Phi_{3}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},\Delta_{k};\begin{pmatrix} 1 & 0 & -r \\\ 0 & 1 & 0 \\\ -r & 0 & 1 \end{pmatrix} \right\} \nonumber\\\ &+2\Phi_{3}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},\Delta_{k};\begin{pmatrix} 1 & 0 & 0 \\\ 0 & 1 & r \\\ 0 & r & 1 \end{pmatrix}\right\}\nonumber \\\ & +2\Phi_{4}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k},0;\begin{pmatrix} 1 & 0 & 0 & \frac{r}{\sqrt{2}} \\\ 0 & 1 & -r & \frac{r}{\sqrt{2}} \\\ 0 & -r & 1 & -\frac{1}{\sqrt{2}} \\\ \frac{r}{\sqrt{2}} & \frac{r}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 1 \end{pmatrix}\right\} \nonumber\\\ &-2\Phi_{4}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k},0;\begin{pmatrix} 1 & 0 & r & -\frac{r}{\sqrt{2}} \\\ 0 & 1 & 0 & -\frac{r}{\sqrt{2}} \\\ r & 0 & 1 & -\frac{1}{\sqrt{2}} \\\ -\frac{r}{\sqrt{2}} & -\frac{r}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 1 \end{pmatrix}\right\}. \end{align} \end{aligned}$$

Using the facts that

$$\begin{aligned} \begin{align} &\Phi_{4}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k},0;\begin{pmatrix} 1 & 0 & r & -\frac{r}{\sqrt{2}} \\\ 0 & 1 & 0 & -\frac{r}{\sqrt{2}} \\\ r & 0 & 1 & -\frac{1}{\sqrt{2}} \\\ -\frac{r}{\sqrt{2}} & -\frac{r}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 1 \end{pmatrix}\right\} \nonumber\\ &+\Phi_{4}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k},0;\begin{pmatrix} 1 & 0 & r & \frac{r}{\sqrt{2}} \\\ 0 & 1 & 0 & \frac{r}{\sqrt{2}} \\\ r & 0 & 1 & \frac{1}{\sqrt{2}} \\\ \frac{r}{\sqrt{2}} & \frac{r}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 1 \end{pmatrix}\right\} \nonumber\\\ =&\Phi_{3}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k};\begin{pmatrix} 1 & 0 & 0 \\\ 0 & 1 & r \\\ 0 & r & 1 \end{pmatrix}\right\} \end{align} \end{aligned}$$

and

$$\begin{aligned} \begin{align} &\Phi_{3}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k};\begin{pmatrix} 1 & 0 & 0 \\\ 0 & 1 & r \\\ 0 & r & 1 \end{pmatrix}\right\}+\Phi_{3}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},\Delta_{k};\begin{pmatrix} 1 & 0 & -r \\\ 0 & 1 & 0 \\\ -r & 0 & 1 \end{pmatrix} \right\} \nonumber\\\ =&\Phi_{2}(-\Delta_{j}^{1},\Delta_{j}^{2};0) =\Phi(-\Delta_{j}^{1})\Phi(\Delta_{j}^{2}). \end{align} \end{aligned}$$

So that,

$$\begin{aligned} \begin{align} F_{NT}(r;\Delta_{j}^{1},\Delta_{j}^{2},\Delta_{k})= & -2\Phi(-\Delta_{j}^{1})\Phi(\Delta_{j}^{2}) \nonumber\\\ &+2\Phi_{3}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},\Delta_{k};\begin{pmatrix} 1 & 0 & 0 \\\ 0 & 1 & r \\\ 0 & r & 1 \end{pmatrix}\right\}\nonumber \\\ & +2\Phi_{4}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k},0;\begin{pmatrix} 1 & 0 & 0 & \frac{r}{\sqrt{2}} \\\ 0 & 1 & -r & \frac{r}{\sqrt{2}} \\\ 0 & -r & 1 & -\frac{1}{\sqrt{2}} \\\ \frac{r}{\sqrt{2}} & \frac{r}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 1 \end{pmatrix}\right\} \nonumber\\\ &+2\Phi_{4}\left\{-\Delta_{j}^{1},\Delta_{j}^{2},-\Delta_{k},0;\begin{pmatrix} 1 & 0 & r & \frac{r}{\sqrt{2}} \\\ 0 & 1 & 0 & \frac{r}{\sqrt{2}} \\\ r & 0 & 1 & \frac{1}{\sqrt{2}} \\\ \frac{r}{\sqrt{2}} & \frac{r}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 1 \end{pmatrix}\right\}. \end{align} \end{aligned}$$

It is easy to get the bridge function for truncated/ternary case by switching j and k.

Derivation of approximate bound for the ternary/continuous case

Let $n_{0x}=\sum_{i=1}^{n}I(x_{i}=0)$, $n_{2x}=\sum_{i=1}^{n}I(x_{i}=2)$, $\pi_{0x}=\frac{n_{0x}}{n}$ and $\pi_{2x}=\frac{n_{2x}}{n}$, then

$$\begin{aligned} |\tau(\mathbf{x})|\leq & \frac{n_{0x}(n-n_{0x})+n_{2x}(n-n_{0x}-n_{2x})}{\begin{pmatrix} n \\ 2 \end{pmatrix}} \\ = & 2\{\frac{n_{0x}}{n-1}-(\frac{n_{0x}}{n})(\frac{n_{0x}}{n-1})+\frac{n_{2x}}{n-1}-(\frac{n_{2x}}{n})(\frac{n_{0x}}{n-1})-(\frac{n_{2x}}{n})(\frac{n_{2x}}{n-1})\} \\ \approx & 2\{\frac{n_{0x}}{n}-(\frac{n_{0x}}{n})^2+\frac{n_{2x}}{n}-(\frac{n_{2x}}{n})(\frac{n_{0x}}{n})-(\frac{n_{2x}}{n})^2\} \\ = & 2\{\pi_{0x}(1-\pi_{0x})+\pi_{2x}(1-\pi_{0x}-\pi_{2x})\}. \end{aligned}$$
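
The first line of this bound counts the pairs that are not tied in the ternary variable, so it is attained when the continuous variable is increasing in the ternary one. The following brute-force check is a sketch with hypothetical function names that compares Kendall's τ for such a configuration with the exact count and with the approximation in terms of π_0x and π_2x.

```python
def sign(v):
    return (v > 0) - (v < 0)

def kendall_tau_a(x, y):
    """Sample Kendall's tau over all pairs i < i' (ties contribute zero)."""
    n = len(x)
    total = sum(sign(x[i] - x[k]) * sign(y[i] - y[k])
                for i in range(n - 1) for k in range(i + 1, n))
    return 2.0 * total / (n * (n - 1))

def ternary_bound_exact(x):
    """{n0 (n - n0) + n2 (n - n0 - n2)} / C(n, 2)."""
    n, n0, n2 = len(x), x.count(0), x.count(2)
    return (n0 * (n - n0) + n2 * (n - n0 - n2)) / (n * (n - 1) / 2.0)

def ternary_bound_approx(x):
    """2 {pi0 (1 - pi0) + pi2 (1 - pi0 - pi2)}."""
    n = len(x)
    pi0, pi2 = x.count(0) / n, x.count(2) / n
    return 2.0 * (pi0 * (1.0 - pi0) + pi2 * (1.0 - pi0 - pi2))

x = [0] * 3 + [1] * 4 + [2] * 3   # ternary variable, sorted
y = list(range(len(x)))           # continuous variable, increasing with x
print(kendall_tau_a(x, y), ternary_bound_exact(x), ternary_bound_approx(x))
```

The exact and approximate expressions differ by the factor (n − 1)/n, which is why the two agree only for large n.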

For ternary/binary and ternary/ternary cases, we combine the two individual bounds.

Derivation of approximate bound for the ternary/truncated case

Let x ∈ ℛⁿ and y ∈ ℛⁿ be the observed n realizations of ternary and truncated variables, respectively. Let $n_{0x}=\sum_{i=1}^{n}I(x_{i}=0)$, $\pi_{0x}=\frac{n_{0x}}{n}$, $n_{1x}=\sum_{i=1}^{n}I(x_{i}=1)$, $\pi_{1x}=\frac{n_{1x}}{n}$, $n_{2x}=\sum_{i=1}^{n}I(x_{i}=2)$, $\pi_{2x}=\frac{n_{2x}}{n}$, $n_{0y}=\sum_{i=1}^{n}I(y_{i}=0)$, $\pi_{0y}=\frac{n_{0y}}{n}$, $n_{0x0y}=\sum_{i=1}^{n}I(x_{i}=0 \;\& \; y_{i}=0)$, $n_{1x0y}=\sum_{i=1}^{n}I(x_{i}=1 \;\& \; y_{i}=0)$ and $n_{2x0y}=\sum_{i=1}^{n}I(x_{i}=2 \;\& \; y_{i}=0)$, then

$$\begin{aligned} |\tau(\mathbf{x}, \mathbf{y})|\leq & \frac{\begin{pmatrix}n \\ 2\end{pmatrix}-\begin{pmatrix}n_{0x} \\ 2\end{pmatrix}-\begin{pmatrix}n_{1x} \\ 2\end{pmatrix}-\begin{pmatrix} n_{2x} \\ 2 \end{pmatrix}-\begin{pmatrix}n_{0y} \\ 2\end{pmatrix}+\begin{pmatrix}n_{0x0y} \\ 2 \end{pmatrix}+\begin{pmatrix}n_{1x0y} \\ 2\end{pmatrix}+\begin{pmatrix}n_{2x0y} \\ 2\end{pmatrix}}{\begin{pmatrix}n \\ 2\end{pmatrix}}. \end{aligned}$$

Since n_0x0y ≤ min (n_0x, n_0y), n_1x0y ≤ min (n_1x, n_0y) and n_2x0y ≤ min (n_2x, n_0y) we obtain

$$\begin{aligned} \begin{align} |\tau(\mathbf{x}, \mathbf{y})|\leq & \frac{\begin{pmatrix}n \\ 2\end{pmatrix}-\begin{pmatrix}n_{0x} \\ 2\end{pmatrix}-\begin{pmatrix}n_{1x} \\ 2\end{pmatrix}-\begin{pmatrix} n_{2x} \\ 2 \end{pmatrix}-\begin{pmatrix}n_{0y} \\ 2\end{pmatrix}}{\begin{pmatrix}n \\ 2\end{pmatrix}} \nonumber\\\ & + \frac{\begin{pmatrix}\min(n_{0x},n_{0y}) \\ 2 \end{pmatrix}+\begin{pmatrix}\min(n_{1x},n_{0y}) \\ 2\end{pmatrix}+\begin{pmatrix}\min(n_{2x},n_{0y}) \\ 2\end{pmatrix}}{\begin{pmatrix}n \\ 2\end{pmatrix}} \nonumber\\\ \leq & \frac{\begin{pmatrix}n \\ 2\end{pmatrix}-\begin{pmatrix}\max(n_{0x},n_{1x},n_{2x},n_{0y}) \\ 2\end{pmatrix}}{\begin{pmatrix}n \\ 2\end{pmatrix}} \nonumber\\\ \leq & 1-\frac{\max(n_{0x},n_{1x},n_{2x},n_{0y})(\max(n_{0x},n_{1x},n_{2x},n_{0y})-1)}{n(n-1)} \nonumber\\\ \approx & 1-(\frac{\max(n_{0x},n_{1x},n_{2x},n_{0y})}{n})^{2} \nonumber\\\ =& 1-\{\max(\pi_{0x},\pi_{1x},\pi_{2x},\pi_{0y})\}^{2} \nonumber\\\ =& 1-\{\max(\pi_{0x},(1-\pi_{0x}-\pi_{2x}),\pi_{2x},\pi_{0y})\}^{2} \end{align} \end{aligned}$$

It is easy to get the approximate bound for truncated/ternary case by switching x and y.
