In [1]:
import numpy as np


### Generating correlated auxiliary noise

Here we will generate correlated bitstrings with a controlled amount of correlation. 

- I previously thought that defining a covariance matrix would be a good strategy. I have moved away from this for several reasons: covariance is not directly related to conditional entropy; a covariance matrix plus marginals does _not_ cleanly define a joint distribution; and sampling with fixed marginals is awkward (see the "arcsin method").

#### (1) Separable, symmetric UCAN
I will call my UCAN _separable_ if $p_{\Gamma^n, \Delta^n} = \bigotimes_{i=1}^n p_{\Gamma_i, \Delta_i}$. I will call a joint distribution on $\{0,1\}^2$ _symmetric_ if $p_{01} = p_{10}$.
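1. Sample $\Delta$ according to a fair coin flip (or, more generally, according to $p_\Delta$).
2. Fix $p_{\Gamma\Delta}(0,1) = p_{\Gamma\Delta}(1,0) = p_{diff}$ and $p_{\Gamma \Delta}(0,0)$. Normalization then gives $p_{\Gamma\Delta}(1,1) = 1 - 2p_{diff} - p_{\Gamma\Delta}(0,0)$, and consistency with step 1 requires $p_\Delta(0) = p_{\Gamma\Delta}(0,0) + p_{diff}$.
   - If $p_\Delta$ is not a fair coin flip, then in general $p_{\Gamma|\Delta}(0|1) \neq p_{\Gamma|\Delta}(1|0)$.
3. Sample $\Gamma \sim p_{\Gamma | \Delta}$, where $p_{\Gamma|\Delta}(\gamma|\delta) = p_{\Gamma\Delta}(\gamma,\delta) / p_\Delta(\delta)$.
4. The conditional entropy is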
 \begin{align}
 H(\Gamma | \Delta) &= -\sum_{\gamma \delta}p_{\Gamma \Delta}(\gamma, \delta) \log p_{\Gamma | \Delta}(\gamma | \delta)
 \\&= -\sum_{\gamma \neq \delta}p_{diff} \log \frac{p_{diff}}{p_\Delta(\delta)} - \left[ p_{\Gamma \Delta}(0,0) \log  p_{\Gamma |\Delta}(0|0) +  p_{\Gamma \Delta}(1,1) \log  p_{\Gamma |\Delta}(1|1) \right]
 \end{align}
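
The conditional distribution used in step 3 follows from the two free parameters. As a minimal sketch for a single bit, the code below builds the $2 \times 2$ joint table and derives $p_\Delta$ and $p_{\Gamma|\Delta}$ from it. The helper name `joint_and_conditional` and the example values are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def joint_and_conditional(p_diff, p_00):
    """Hypothetical helper: joint table, Delta marginal, and Gamma|Delta for one bit.

    joint[g, d] = Pr(Gamma=g, Delta=d) for a symmetric joint distribution.
    """
    p_11 = 1.0 - 2.0 * p_diff - p_00  # fixed by normalization
    if min(p_diff, p_00, p_11) < 0:
        raise ValueError("p_diff, p_00 and 1 - 2*p_diff - p_00 must be non-negative")
    joint = np.array([[p_00, p_diff],
                      [p_diff, p_11]])
    p_delta = joint.sum(axis=0)  # sum over Gamma; p_delta[d] = Pr(Delta=d)
    # cond[g, d] = Pr(Gamma=g | Delta=d); columns with Pr(Delta=d) = 0 stay at 0
    cond = np.divide(joint, p_delta, out=np.zeros_like(joint), where=p_delta > 0)
    return joint, p_delta, cond

# Example values chosen so that Delta is a fair coin: p_00 + p_diff = 0.5
joint, p_delta, cond = joint_and_conditional(p_diff=0.2, p_00=0.3)
print(p_delta)
print(cond)
```

With these example values $\Delta$ is a fair coin, so the two off-diagonal conditionals agree (both are approximately 0.4). Choosing $p_{00} + p_{diff} \neq 1/2$ makes them differ, as noted under step 2.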

TODO: can I massage the entropy expression into something nicer? Otherwise, it's definitely computable.
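
One candidate simplification, offered as a sketch rather than a final answer: the chain rule $H(\Gamma|\Delta) = H(\Gamma,\Delta) - H(\Delta)$ avoids the conditional probabilities entirely. The shorthand $p_{00} = p_{\Gamma\Delta}(0,0)$ and $p_{11} = 1 - 2p_{diff} - p_{00}$ is introduced here for convenience, and $h(x) = -x \log x - (1-x)\log(1-x)$ denotes the binary entropy function. Using $p_\Delta(0) = p_{00} + p_{diff}$:

\begin{align}
H(\Gamma | \Delta) &= H(\Gamma, \Delta) - H(\Delta)
\\&= -2 p_{diff} \log p_{diff} - p_{00} \log p_{00} - p_{11} \log p_{11} - h(p_{00} + p_{diff})
\end{align}

When $\Delta$ is a fair coin flip, $p_{00} + p_{diff} = 1/2$ and the last term reduces to $\log 2$.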

The tool to do this will be a covariance matrix

In [13]:
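import numpy as np
def bitwise_ucan_v1(n, n_data, p0_delta, p_diff, p_00, seed=0):
    """Generate a sample of (Gamma, Delta) UCAN pairs, in Gamma|Delta mode.

    The CAN is SEPARABLE and SYMMETRIC.

    This will return an (n_data, n, 2) array of UCAN pairs. The last axis corresponds to
    a noise bitstring (Gamma) and a CAN bitstring (Delta).

    Args:
        n: number of bits per bitstring
        n_data: number of (Gamma, Delta) pairs to sample
        p0_delta: length-n array containing Pr(Delta_i=0) at location i
        p_diff: length-n array containing Pr(Gamma_i=1, Delta_i=0) = Pr(Gamma_i=0, Delta_i=1)
        p_00: length-n array containing Pr(Gamma_i=0, Delta_i=0)
        seed: seed for numpy's global random state

    The inputs must describe a valid joint distribution: p0_delta = p_00 + p_diff.
    """
    p0_delta = np.asarray(p0_delta, dtype=float)
    p_diff = np.asarray(p_diff, dtype=float)
    p_00 = np.asarray(p_00, dtype=float)
    p_11 = 1 - p_00 - 2 * p_diff  # Pr(Gamma_i=1, Delta_i=1), fixed by normalization
    if not np.allclose(p0_delta, p_00 + p_diff) or np.any(p_11 < -1e-12):
        raise ValueError("need p0_delta == p_00 + p_diff and p_00 + 2 * p_diff <= 1")

    # Sample our Delta bits according to p0_delta
    np.random.seed(seed)
    # shape (n_data, n); the second axis probabilities of 0 are given by p0_delta
    delta = np.random.binomial(1, 1 - p0_delta, size=(n_data, n))
    print(delta.shape)
    print(delta)
    # Now compute p_{Gamma|Delta}(1|0) and p_{Gamma|Delta}(1|1). A conditional is set to 0
    # wherever its conditioning event has probability 0, since it is never sampled there.
    p1_delta = 1 - p0_delta
    # size n array, Pr(Gamma_i=1|Delta_i=0)
    p_gd_10 = np.divide(p_diff, p0_delta, out=np.zeros_like(p_diff), where=p0_delta > 0)
    # size n array, Pr(Gamma_i=1|Delta_i=1)
    p_gd_11 = np.divide(p_11, p1_delta, out=np.zeros_like(p_11), where=p1_delta > 0)
    # Clip away rounding error so both arrays are valid probabilities
    p_gd_10, p_gd_11 = np.clip(p_gd_10, 0, 1), np.clip(p_gd_11, 0, 1)

    # We'll just mask for two separate Bernoulli sampling experiments
    mask_11 = p_gd_11 * delta  # success probability is 0 wherever Delta_i=0
    gammas_11 = np.random.binomial(1, mask_11)
    mask_10 = p_gd_10 * (1 - delta)  # success probability is 0 wherever Delta_i=1
    gammas_10 = np.random.binomial(1, mask_10)
    gammas = gammas_11 + gammas_10

    # Now stack the two datasets of bitstrings
    return np.stack([gammas, delta], axis=-1)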

# TODO: compute entropy ;)

In [14]:
bitwise_ucan_v1(3, 10, np.array([0.5, 0.5, 1]), np.array([0.2, 0.2, 0.0]), np.array([0.3, 0.3, 1.0]))

(10, 3)
[[1 1 0]
 [1 0 0]
 [0 1 0]
 [0 1 0]
 [1 1 0]
 [0 0 0]
 [1 1 0]
 [1 0 0]
 [0 1 0]
 [1 1 0]]
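
The entropy TODO can be approached numerically. The sketch below evaluates $H(\Gamma|\Delta)$ per bit as $H(\Gamma,\Delta) - H(\Delta)$ and sums over bits, which is valid for a separable UCAN because the entropy of independent pairs adds. The function names `xlog2x` and `conditional_entropy_bits`, the choice of base-2 logarithms, and the convention $0 \log 0 = 0$ are assumptions made here, not part of the original notebook.

```python
import numpy as np

def xlog2x(p):
    """Elementwise p * log2(p), using the convention 0 * log 0 = 0."""
    p = np.asarray(p, dtype=float)
    safe = np.where(p > 0, p, 1.0)  # log2(1) = 0, so non-positive entries contribute 0
    return p * np.log2(safe)

def conditional_entropy_bits(p_diff, p_00):
    """Hypothetical helper: H(Gamma^n | Delta^n) in bits for a separable, symmetric UCAN.

    p_diff and p_00 are length-n arrays of per-bit joint probabilities.
    """
    p_diff = np.asarray(p_diff, dtype=float)
    p_00 = np.asarray(p_00, dtype=float)
    # Assumes valid inputs; the clip only removes tiny negative rounding error
    p_11 = np.clip(1.0 - 2.0 * p_diff - p_00, 0.0, None)
    p0_delta = p_00 + p_diff  # Pr(Delta_i=0) implied by the joint distribution
    h_joint = -(2 * xlog2x(p_diff) + xlog2x(p_00) + xlog2x(p_11))
    h_delta = -(xlog2x(p0_delta) + xlog2x(1 - p0_delta))
    return float(np.sum(h_joint - h_delta))

# Bit 0: independent fair coins. Bit 1: perfectly correlated fair coins.
print(conditional_entropy_bits(np.array([0.25, 0.0]), np.array([0.25, 0.5])))
```

As a sanity check on the two extremes, independent fair coins should contribute one bit of conditional entropy and perfectly correlated bits should contribute none, so the example total should be close to 1.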
