# `NumPy` Puzzles
This notebook is less of a tutorial and more of a cookbook/reference of `NumPy` puzzles that my friends and I have run into!

In [1]:
# Import the necessary packages.
import numpy as np

## Polarizing A Haplotype Matrix
Here I have a 2D matrix of shape (variants, samples), and I want to polarize the entire matrix using a 1D ancestral sequence array, i.e., recode every site so that 0 marks the ancestral allele and 1 marks the derived allele. First, I need to initialize a toy data set consisting of 5 samples over 3 sites.

In [2]:
# Initialize the ancestral sequence.
anc_seq = np.array([0, 1, 1])
print('ancestral sequence: ', anc_seq)
# Initialize the haplotype matrix.
hap_mat = np.empty((3, 5))
site_1 = np.zeros(5)
site_2 = np.zeros(5)
site_3 = np.ones(5)
hap_mat[0, :] = site_1
hap_mat[1, :] = site_2
hap_mat[2, :] = site_3
print('unpolarized haplotype matrix: '+'\n', hap_mat)

ancestral sequence:  [0 1 1]
unpolarized haplotype matrix: 
 [[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [1. 1. 1. 1. 1.]]
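
As a side note, the same toy matrix can be written out in a single call instead of being filled row by row. This is only a sketch: the name `hap_mat_compact` is made up for illustration, and `dtype=float` is passed so the result matches the float matrix that `np.empty` produces by default.

```python
import numpy as np

# Hypothetical compact construction of the same 3-site x 5-sample toy matrix.
hap_mat_compact = np.array(
    [
        [0, 0, 0, 0, 0],  # site 1
        [0, 0, 0, 0, 0],  # site 2
        [1, 1, 1, 1, 1],  # site 3
    ],
    dtype=float,  # match the default float dtype of np.empty
)
print('compact haplotype matrix shape: ', hap_mat_compact.shape)
```

Writing the rows out literally makes the (variants, samples) layout easy to see at a glance, which is handy for small test cases.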


Next, I need to define a function that polarizes a haplotype matrix given an ancestral sequence.

In [3]:
def polarize_haplotype_matrix(haplotype_matrix, ancestral_states):
    """
    ###########################################################################
    INPUT
        haplotype_matrix: 2D matrix of shape (n_sites, n_haplotypes).
        ancestral_states: 1D array of ancestral states to polarize from.
    ---------------------------------------------------------------------------
    OUTPUT: Polarized haplotype matrix
    ###########################################################################
    """
    # Initialize an empty haplotype matrix.
    polarized_haplotype_matrix = np.empty_like(haplotype_matrix)
    # For all haplotypes...
    for sample in range(haplotype_matrix.shape[1]):
        # Extract the sample's haplotype.
        haplotype = haplotype_matrix[:, sample]
        # Polarize the sample's haplotype.
        polarized_haplotype_matrix[:, sample] = np.where(ancestral_states == 1, np.abs(haplotype - 1), haplotype)
    return polarized_haplotype_matrix

Now, we test the function!

In [4]:
# Polarize the haplotype matrix.
polarized_hap_mat = polarize_haplotype_matrix(hap_mat, anc_seq)
print('ancestral sequence: ', anc_seq)
print('unpolarized haplotype matrix: '+'\n', hap_mat)
print('polarized haplotype matrix: '+'\n', polarized_hap_mat)

ancestral sequence:  [0 1 1]
unpolarized haplotype matrix: 
 [[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [1. 1. 1. 1. 1.]]
polarized haplotype matrix: 
 [[0. 0. 0. 0. 0.]
 [1. 1. 1. 1. 1.]
 [0. 0. 0. 0. 0.]]
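
The per-sample loop can probably be avoided by letting `NumPy` broadcast the ancestral states across the columns. The sketch below is one possible vectorized variant, not a replacement for the function above: the name `polarize_haplotype_matrix_vectorized` is hypothetical, and it assumes biallelic 0/1 data and an ancestral array with exactly one entry per site.

```python
import numpy as np

def polarize_haplotype_matrix_vectorized(haplotype_matrix, ancestral_states):
    """Hypothetical loop-free variant; assumes 0/1 alleles and one ancestral state per site."""
    # Reshape (n_sites,) -> (n_sites, 1) so the states broadcast across all haplotypes.
    anc = np.asarray(ancestral_states)[:, np.newaxis]
    # Flip alleles (0 <-> 1) at sites where the ancestral state is 1; keep the rest as-is.
    return np.where(anc == 1, np.abs(haplotype_matrix - 1), haplotype_matrix)

# Rebuild the toy data so this cell runs on its own.
anc_seq = np.array([0, 1, 1])
hap_mat = np.array([
    [0., 0., 0., 0., 0.],
    [0., 0., 0., 0., 0.],
    [1., 1., 1., 1., 1.],
])
vectorized_polarized_hap_mat = polarize_haplotype_matrix_vectorized(hap_mat, anc_seq)
print('polarized haplotype matrix (vectorized): '+'\n', vectorized_polarized_hap_mat)
```

On the toy data this should give the same matrix as the loop version. The key step is the `[:, np.newaxis]` reshape: without it, a 1D array of length `n_sites` would be matched against the last axis (samples) rather than the first axis (sites).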
