# Table of Contents
 <p><div class="lev1"><a href="#Initialisation-+-tests"><span class="toc-item-num">1&nbsp;&nbsp;</span>Initialisation + tests</a></div><div class="lev1"><a href="#Introduction"><span class="toc-item-num">2&nbsp;&nbsp;</span>Introduction</a></div><div class="lev1"><a href="#Single-trait-Fine-mapping"><span class="toc-item-num">3&nbsp;&nbsp;</span>Single trait Fine-mapping</a></div><div class="lev2"><a href="#Bayes-Factor-Computation"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Bayes Factor Computation</a></div><div class="lev3"><a href="#Derivation-of-$z$-values"><span class="toc-item-num">3.1.1&nbsp;&nbsp;</span>Derivation of <span class="MathJax_Preview" style="color: inherit;"><span class="MJXp-math" id="MJXp-Span-1"><span class="MJXp-mi MJXp-italic" id="MJXp-Span-2">z</span></span></span><span class="MathJax MathJax_Processing" id="MathJax-Element-1-Frame" tabindex="0"></span><script type="math/tex" id="MathJax-Element-1">z</script> values</a></div><div class="lev3"><a href="#Calculation-of-Bayes-Factor"><span class="toc-item-num">3.1.2&nbsp;&nbsp;</span>Calculation of Bayes Factor</a></div><div class="lev3"><a href="#Calculation-of-Posterior"><span class="toc-item-num">3.1.3&nbsp;&nbsp;</span>Calculation of Posterior</a></div><div class="lev3"><a href="#Implementation"><span class="toc-item-num">3.1.4&nbsp;&nbsp;</span>Implementation</a></div><div class="lev3"><a href="#Example"><span class="toc-item-num">3.1.5&nbsp;&nbsp;</span>Example</a></div><div class="lev2"><a href="#Trait-simulation"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Trait simulation</a></div><div class="lev3"><a href="#Explanation"><span class="toc-item-num">3.2.1&nbsp;&nbsp;</span>Explanation</a></div><div class="lev3"><a href="#Implementation"><span class="toc-item-num">3.2.2&nbsp;&nbsp;</span>Implementation</a></div><div class="lev3"><a href="#Example"><span class="toc-item-num">3.2.3&nbsp;&nbsp;</span>Example</a></div><div class="lev1"><a href="#Colocalisation"><span class="toc-item-num">4&nbsp;&nbsp;</span>Colocalisation</a></div><div class="lev1"><a href="#Tests"><span class="toc-item-num">5&nbsp;&nbsp;</span>Tests</a></div>

# Initialisation + tests

In [5]:
### Load modules and data

import numpy as np
import itertools as it
import matplotlib.pyplot as plt
import math
from scipy import stats
import pdb
from sklearn import preprocessing
import copy
from unittest import *
import itertools
from bidict import bidict

%matplotlib inline


# Introduction

In this notebook I implement basic fine-mapping methods. Firstly I implement a basic method to calculated Bayes Factors given sets of SNPs and their effect sizes using the LD structure calculated from 1000 genomes, from which I calculate posterior probabilities of gene sets. Then I simulate trait data with specified effect sizes, and generate summary statistics from these data. Following this, I implement a colocalisation method.

# Single trait Fine-mapping

## Bayes Factor Computation

### Derivation of $z$ values

To start, we assume that the trait $y$ is modelled as:

$$ y = X\beta + \epsilon $$

Where $X$ is an $n$x$m$ matrix of values consisting of 0,1,2 denoting whether a SNP is homozygous to the common allele, heterozygous, or homozygous to the rare allele respectively. $n$ denotes the number of samples, and $m$ the number of causitive SNPs.

We scale $X$ such that $\frac{1}{n}\sum^{n}_{i=1} X_{ij} = 0$, and $\frac{1}{n}\sum^{n}_{i=1} X^2_{ij} = 1$ for $j = 1,2, ... m$. We also scale $y$ such that $\frac{1}{n}\sum^n_{i=1} y_i = 0$ and $\frac{1}{n}\sum^n_{i=1} y_i^2 = 1$.

We assume $\epsilon$ ~ $N(0, \frac{1}{\tau} I_n)$. We also assume $\beta$ has a prior normal distribution $N(0,\nu \frac{1}{\tau})$. $\nu$ is diagonal, $\beta$ and $\epsilon$ are independent and we assume all SNPs have the same prior variance $\sigma^2 \frac{1}{\tau}$. Therefore $\nu = \sigma^2 I_m$.

Now given this prior on $\beta$, and using $X$ and $\epsilon$, we can deduce the expectation and mean of $y$.

$$E(y \: | \: \tau, X) = E(E(y \: | \: \tau,X,\beta)) = E(X \beta) = 0$$ 

<sub>[ *since* $E(\beta) = 0$ ]</sub>

$$ Var(y \: | \: \tau, X) = E(Var(y \: | \tau, X, \beta)) + Var(E(y \: | \: \tau, X, \beta)) $$

<sub>[ *since* $Var(X \: | \: Y) = E(Var(X \: | \: Y)) + Var(E(X \: | \: Y))$ ]</sub>

$$ = E(\frac{1}{\tau}I_n) + Var(X \beta)$$

$$ = \frac{1}{\tau}( I_n + X \nu X^T)$$

Now, since y is a linear transformation of a multivariate normal random vector,

$$ y \:|\: \tau, X \sim N \left( 0,\frac{1}{\tau}( I_n + X \nu X^T)) \right) $$

The null distribution is when $\beta = 0$. In which case,

$$y \:|\: \tau, X \sim N \left( 0,\frac{1}{\tau}I_n \right) $$

Now consider a new variable $z = \sqrt{\frac{\tau}{n}} X^{T}y$:

$$ z ~ \sim N \left( 0, \frac{X^T}{n}(I_n + X \nu X^T) X \right)$$

$$ = N \left( 0, \left(\frac{X^TX}{n} + \frac{X^TX \nu X^TX}{n}\right) \right)$$

Now let $\Sigma_x = \frac{X^T X}{n}$. Since all column in $X$ are standardised, this is equivalent to the correlation matrix or, more importantantly, the linkage disequilibirum structure of the SNPs which can be derived from the 1000 genomes data.

Then we have:

$$ z \sim N(0, \Sigma_x + \Sigma_x n\nu \Sigma_x) $$


### Calculation of Bayes Factor 

The *Bayes Factor* is the ratio of the likelihood functions under the alternative hypothesis, and under the null hypothesis. It is equivalent to the likelihood ratio.

$P_1(z \:|\: \tau, X)$, the likelihood of $z$ under our alternate hypothesis, i.e. when $\nu \neq 0$ is:

$$ P_1(z \:|\: \tau, X) = 2\pi^{-\frac{n}{2}} | \Sigma_x + \Sigma_x n\nu \Sigma_x |^{-\frac{1}{2}} \exp\left(-\frac{1}{2}z^T(\Sigma_x + \Sigma_x n\nu \Sigma_x)^{-1}z\right)$$

$P_0(z \:| \: \tau, X)$, the likelihood of $z$ under the null hypothesis when $\nu = 0$ is:

$$P_0(z \:| \: \tau, X) = 2\pi^{-\frac{n}{2}} |\Sigma_x|^{-\frac{1}{2}} \exp\left(-\frac{1}{2}z^T(\Sigma_x)^{-1}z\right)$$

Therefore we calculate the Bayes Factor as:

$$ \frac{
| \Sigma_x + n\nu \Sigma_x^2 |^{-\frac{1}{2}} \exp\left(-\frac{1}{2}z^T(\Sigma_x + \Sigma_x n\nu \Sigma_x)^{-1}z\right)
}{
|\Sigma_x|^{-\frac{1}{2}} \exp\left(-\frac{1}{2}z^T(\Sigma_x)^{-1}z\right)
}
$$

We assume that $X$ has full column rank, and that $\Sigma_x$ also has rank $m$ and is non-singular. That is to say, we assume that no two snps are in full linkage disequilibrium.

Using the Woodberry matrix identity:

$$
(\Sigma_x + \Sigma_x n\nu \Sigma_x)^{-1} = \Sigma_x^{-1} - ((n\nu)^{-1} + \Sigma_x)^{-1}
$$

Therefore the resulting Bayes Factor is:

$$
|I_m + n\nu \Sigma_x|^\frac{1}{2} \exp(\frac{1}{2}z^T((n\nu)^{-1} + \Sigma_x)^{-1}z)
$$

Crucially, this only depends on inverting matrices of size m, our candidate gene set. Therefore we compute these Bayes Factors using sets of candidate SNPs of size m, and choose the set with the highest calculated Bayes Factor.

In practice, we recieve $\beta$, $se(\beta)$, and the SNP linkage disequilibrium structure $\Sigma_x$.

Since both $X$ and $y$ are normalised, 

$$\beta = \frac{X^T y}{n}$$

Also, 
$$\tau = \frac{1}{\sigma^2}, \:\: se(\epsilon) = \frac{\sigma}{\sqrt{n}}$$

where $\sigma$ is the observed standard deviation of the errors $\epsilon$.

Therefore:
$$
se(\epsilon) = \frac{1}{\sqrt{n\tau}}
$$

Therefore we generate the $z$ vector exactly with and $se$ is the standard error:

$$
\frac{\beta}{se(\beta)} = \sqrt{\frac{\tau}{n}} X^{T}y = z
$$

The Bayes Factor can then be directly calculated using $z$ and $\Sigma_x$.



### Calculation of Posterior

We place a binomial prior on candidate gene sets. If our gene set $G$ has size $m$, we assume that each SNPs has probability  $p = \frac{1}{m}$ of being causal. Therefore the prior probability of a causal gene set with size $l$ is:

$$
P(G) = p^l(1-p)^{m-l}
$$

Therefore using Bayes Theorem:

$$
P(G \: | \: X) = \frac{P(X \: | \: G) \times P(G)}{P(X)}
$$

to calculate posterior probabilities of the gene sets where $P(X \: | \: G)$ is calculated from the normalised Bayes Factors.

However when we calculated the Bayes Factors, these are not exactly the likelihoods. They are however far easier to compute.

The Bayes Factors we have calculated are equivalent to:

$$
\frac{P(X \: | \: G)}{P(X \: | \: G_0)}
$$

where $G_0$ is the null hypothesis that no gene-set is casual.

However, since $P(X \: | \: G_0)$ is a constant for all gene-sets, this is proportional to the likelihood term. Therefore we can normalise to output the posterior probability distributions.

### Implementation

In [6]:
### Create selection of SNPs
def select_snps(z, subset):
    return [z[i] for i in subset]

#example
# for subset in it.combinations(range(len(z1)),3):
#     print subset, select_snps(z1, subset)    



### Select covariance submatrix

def select_cov(cov, subset):
    return cov[np.ix_(subset,subset)]

#example   
#select_cov(LD_tss_1, (0,1,5))

### Calculate Bayes Factor

def calc_BF(z, cov,n,v=0.1):
    """
    Calculate the Bayes factor of a single set of candidate SNPs effect sizes z,
    covariance matrix cov, a prior variance on beta v, and a sample
    size n.
    """
    z = np.matrix(z)
    z = z.T
    v_matrix = np.matrix(np.eye(len(z)) * v)
#     pdb.set_trace()
    coeff = 1. / math.sqrt(np.linalg.det((np.matrix(np.eye(len(z))) + n * v_matrix * np.matrix(cov))))
    exponent = 0.5* z.T * np.matrix(np.linalg.pinv((n*v_matrix).I + cov)) * z
    return np.array(math.log(coeff) + exponent)[0][0]

# example
# subset = (0,1,5,8)
# cov = select_cov(LD_tss_1, subset)
# z = select_snps(z1, subset)
# v = np.eye(len(z))/1000
# n = 1000
# calc_BF(z,cov,v,n)

def calc_prior(x,m,prior='binomial'):
    if prior == 'binomial':
        p = 1./m
        l = len(x)
        return p**l * (1-p)**(m-l)
    else:
        return None
    
# example
# calc_prior((1,3,5),30)
    
def calc_posterior(variant_set_BF,prior='binomial'):
    
    priors = [math.log(calc_prior(x[0],30)) for x in variant_set_BF]
    
    log_bayes_factors = [x[1] for x in variant_set_BF]

    unscaled_log_posteriors = [ log_bayes_factors[i] + priors[i] for i in range(len(log_bayes_factors))]

    scaled_log_posteriors = np.array(unscaled_log_posteriors) - max(unscaled_log_posteriors)

    scaled_posteriors = [math.exp(x) for x in scaled_log_posteriors]

    calib_factor = sum([math.exp(x) for x in scaled_log_posteriors])

    posteriors = [x/calib_factor for x in [math.exp(x) for x in scaled_log_posteriors]]
    
    aug_posteriors = [(variant_set_BF[i][0], posteriors[i]) for i in range(len(posteriors))]
    
    aug_posteriors.sort(key=lambda x: x[1], reverse=True)
    
    return aug_posteriors





def calc_variant_set_BFs(data,k,v=0.1,prior='binomial'):
    """
    Calculate variant set posteriors with a binomial prior as normal,
    searching all variant sets up till size k.
    v is the prior variance on beta.
    data has the format (z,LD,n) where z is the effect sizes, 
    LD is the linkage disequilibrium matrix, and n is the 
    number of samples.
    """
    bayes_factors = []
    for i in range(1,k):
        for subset in it.combinations(range(len(data[0])),i):
            z = select_snps(data[0], subset)
            cov = select_cov(data[1],subset)
            n = data[2]
            bayes_factors.append((subset, calc_BF(z, cov,n,v)))
    
    bayes_factors.sort(key=lambda x: x[1], reverse=True)
    return bayes_factors



### Example

In [7]:
s_tss_1=np.load('summary_stats_g1_tss60.npy')[0]
s_tss_2=np.load('summary_stats_g2_tss60.npy')[0]
LD_tss_1=np.load('LD_g1_TSS60.npy')
LD_tss_2=np.load('LD_g2_TSS60.npy')

### Generate z arrays

n1 = 10000
n2 = 1000
z1 = np.array(np.divide(s_tss_1['beta'],np.sqrt(s_tss_1['var_beta'])))
z2 = np.array(np.divide(s_tss_2['beta'],np.sqrt(s_tss_1['var_beta'])))
z1 = np.ndarray.flatten(z1)
z2 = np.ndarray.flatten(z2)

### Initialise hyper parameters
k=3
data1 = (z1, LD_tss_1, 10000)
data2 = (z2, LD_tss_2, 1000)

### Calculate variant set Bayes Factors
set1 = calc_variant_set_BFs(data1,k)
set2 = calc_variant_set_BFs(data2,k)

### Calculate variant set posteriors
posteriors1 = calc_posterior(set1)
posteriors2 = calc_posterior(set2)

posteriors1.sort(key=lambda x: x[1], reverse=True)
posteriors2.sort(key=lambda x: x[1], reverse=True)


In [76]:
posteriors2[0:10]

[((27, 29), 0.9151178367714483),
 ((29,), 0.02660978343371104),
 ((1, 29), 0.01814290094185906),
 ((6, 29), 0.014384187969434005),
 ((28, 29), 0.0072543067927032835),
 ((29, 30), 0.003287058453751547),
 ((16, 29), 0.003015627470309419),
 ((25, 29), 0.002901065736140464),
 ((0, 29), 0.002500323472232103),
 ((2, 29), 0.0010426464045647608)]

## Trait simulation

### Explanation

Given genotype data and an LD structure, simulate a trait which is linearly associated with a variant, or a set of variants. Here I generate a large $m \times n$ matrix ($m$=number of samples, $n$=number of SNPs), with $0,1,2$ as elements.

Then, I can choose a set of SNPs, and from these SNPs I generate a trait with a linear model with a given parameter $\beta$, as well as an unexplained variance parameter $\epsilon$.

Following this, I try to recover these sets of SNPs. I generate p-values for each SNP being associated with the trait, by individually building univariate linear models for each SNPs, as I understand summary statistics are generated.

### Implementation

In [8]:
### Sample genotypes

def simulate_genotype(n,m,geno_dist):
    """
    Simulate a genotype of n samples and m causal SNPs with specified genotype distribution for (0,1,2).
    """
    X=np.zeros([n,m])
    for i in range(m):
        X[:,i] = [np.random.choice(a=[0,1,2],p=geno_dist) for x in range(n)]
    return np.array(X)

###example
# X = simulate_genotype(n=10000,m=30,geno_dist=[0.85,0.1,0.05])

def simulate_traits(X,snp_group,eps=0.5):
    """
    SNPs in the form e.g. {3: 0.9, 5:0.4, 8:0.5}. Dictionary values are the linear model coefficients (beta values).
    eps is the level of unexplained variance. X is the genotype information.
    """
    beta = np.array(snp_group.values()).T
    snps = snp_group.keys()
    eps_vector = np.array(np.random.normal(0,eps,X.shape[0])).T
    return np.add(np.dot(X[:,snps], beta), eps_vector)
    
# examples
# y = simulate_traits(X,eps=0.5,snp_group={3: 5, 9: 3})

def build_linear_models(X,y):
    """
    Build univariate linear models for each SNP column in X against the trait y.
    """
    return [stats.linregress(X[:,i],y) for i in range(X.shape[1])]

# example
# models1 = [x for x in build_linear_models(X,y)]

def calc_effect_sizes(models):
    """
    Calculate the effect sizes = beta / se(beta) of individual SNPs towards the traits.
    Takes in a list of linear regression models.
    """
    return [x.slope / x.stderr for x in models]

# example
# z1 = [x.slope / x.stderr for x in models1]



### Example

In [67]:
snp_groups = [{1: 5}, {1: 5, 3: 6}, {1: 5, 3: 6, 15:3}, {1: 5, 3: 6, 15:3, 25:1}]

for g in snp_groups:
    n = 10000

    ### simulate genotypes
    X = simulate_genotype(n=10000,m=30,geno_dist=[0.85,0.1,0.05])
    ### scale columns
    X = preprocessing.scale(X)

    ### calculate LD matrix
    LD_matrix = np.corrcoef(X,rowvar=0)

    ### simulate traits
    y = simulate_traits(X,eps=0.5,snp_group=g)
    ### scale traits
    y = preprocessing.scale(y)

    t_statistics = build_linear_models(X,y)

    beta = [x.slope for x in t_statistics]
    se_beta = [x.stderr for x in t_statistics]

    ###calcuate z

    z =  np.divide(beta, se_beta)

    simulated_effectsize_data = ([x*np.sqrt(n) for x in beta], LD_matrix, n)

    gene_set_BFs = calc_variant_set_BFs(simulated_effectsize_data,k=5,v=0.01)

    gene_set_posteriors = calc_posterior(gene_set_BFs)
    print g, gene_set_posteriors[0:5]

{1: 5} [((1,), 0.9050352515170708), ((1, 21), 0.003189029568249363), ((1, 25), 0.00318428426718765), ((1, 19), 0.003161674088441622), ((1, 4), 0.0031344545531106003)]
{1: 5, 3: 6} [((1, 3), 0.9083736228868016), ((1, 3, 13), 0.0031873943086928548), ((1, 3, 10), 0.0031412730321097606), ((1, 3, 26), 0.003138966388610109), ((1, 3, 17), 0.0031380659516668206)]
{1: 5, 3: 6, 15: 3} [((1, 3, 15), 0.9150552356086875), ((1, 3, 12, 15), 0.0031637810581904495), ((1, 3, 4, 15), 0.0031548310641800084), ((1, 3, 15, 19), 0.003154222576982418), ((1, 3, 15, 27), 0.0031541062648322094)]
{1: 5, 3: 6, 25: 1, 15: 3} [((1, 3, 15, 25), 1.0), ((1, 3, 15), 3.0459784689226484e-28), ((1, 3, 9, 15), 1.0914551735029302e-30), ((1, 3, 13, 15), 1.083211641042767e-30), ((1, 3, 15, 21), 1.0756922570942834e-30)]


# Colocalisation

Is it possible to ascertain whether two traits are due to the same causal variant? This is the aim of colocalisation.

First simulate two traits with different effect sizes

In [62]:
def is_colocalised(X, trait1, trait2,db=0):
    """
    With respect to a shared genotype X, Determine whether trait1 and trait2 are colocalised.
    I.e. whether there is evidence that they share a genetic basis.
    """

    ### generate individual linear models
    models1 = build_linear_models(X,y1)
    models2 = build_linear_models(X,y2)

    ### pull out slope and standard error terms.
    beta1 = [x.slope for x in models1]
    se_beta1 = [x.stderr for x in models1]

    beta2 = [x.slope for x in models2]
    se_beta2 = [x.stderr for x in models2]

    ### calculate z scores
    simulated_effectsize_data1 = ([x*np.sqrt(n) for x in beta1], LD_matrix, n)
    simulated_effectsize_data2 = ([x*np.sqrt(n) for x in beta2], LD_matrix, n)

    ### generate the gene set Bayes Factors
    gene_set_BFs1 = calc_variant_set_BFs(simulated_effectsize_data1,k=4,v=0.01)
    gene_set_BFs2 = calc_variant_set_BFs(simulated_effectsize_data2,k=4,v=0.01)
    

    ### calculate the posteriors
    gene_set_posteriors1 = calc_posterior(gene_set_BFs1)
    gene_set_posteriors2 = calc_posterior(gene_set_BFs2)
    
    if db == 1: 
        
        print gene_set_BFs1[0:10]
        print gene_set_BFs2[0:10]
        
        print gene_set_posteriors1[0:10]
        print gene_set_posteriors2[0:10]


    ### sort by posterior size
    gene_set_posteriors1.sort(key=lambda x: x[0], reverse=False)
    gene_set_posteriors2.sort(key=lambda x: x[0], reverse=False)

    ### select just toe posteriors
    posteriors1 = [x[1] for x in gene_set_posteriors1]
    posteriors2 = [x[1] for x in gene_set_posteriors2]

    ### generate cartesian product from the posteriors
    cart_product = list(itertools.product(posteriors1,posteriors2))

    gene_set_len1 = len(gene_set_posteriors1)
    gene_set_len2 = len(gene_set_posteriors2)

    ### calculate colocalisation posteriors with a specificed scoring function.
    colocalisations = np.array(map(lambda x: min(x[0],x[1]), cart_product)).reshape(gene_set_len1,gene_set_len2)


    ### pull out sorted set list
    sorted_setlist1 = [x[0] for x in gene_set_posteriors1]
    sorted_setlist2 = [x[0] for x in gene_set_posteriors2]

    if db == 1:
        
        ###  create bidirectional map from gene_set to positon in colocalisation array
        setlist_1map = bidict([(sorted_setlist1[i],i) for i in range(len(sorted_setlist1))])
        setlist_2map = bidict([(sorted_setlist1[i],i) for i in range(len(sorted_setlist2))])

        bf_1map = dict(gene_set_BFs1)
        bf_2map = dict(gene_set_BFs2)


        posterior1_map = dict(gene_set_posteriors1)
        posterior2_map = dict(gene_set_posteriors2)
        pdb.set_trace()

    ### output total evidence for colocalisation
    return sum([colocalisations[i][i] for i in range(colocalisations.shape[0])])

In [69]:
gene_sets = [({8:6},{8:6}),
             ({8:6},{8:10}),
             ({8:6},{10:6}),
             ({8:6, 10:4},{8:6, 10:4}),
             ({8:6, 10:4},{8:6, 10:8}),
             ({8:6, 10:4},{8:6, 15:4}),
             ({8:6, 10:4, 12:3},{8:6, 10:4, 12:3}),
             ({8:6, 10:4, 12:3},{8:6, 10:4, 12:8}),
             ({8:6, 10:4, 12:3},{8:6, 10:4, 15:3}),
            ]

for g in gene_sets:
    ### set sample size
    n = 10000

    ### simulate genotypes and scale columns
    X = preprocessing.scale(simulate_genotype(n, 30, (0.85, 0.1, 0.05)))

    ### calculate LD matrix
    LD_matrix = np.corrcoef(X,rowvar=0)


    ### simulate two traits and scale columns
    y1 = preprocessing.scale(simulate_traits(X, g[0]))
    y2 = preprocessing.scale(simulate_traits(X, g[1]))

    print g, is_colocalised(X,y1,y2,db=0) > 0.6

({8: 6}, {8: 6}) True
({8: 6}, {8: 10}) True
({8: 6}, {10: 6}) False
({8: 6, 10: 4}, {8: 6, 10: 4}) True
({8: 6, 10: 4}, {8: 6, 10: 8}) True
({8: 6, 10: 4}, {8: 6, 15: 4}) False
({8: 6, 10: 4, 12: 3}, {8: 6, 10: 4, 12: 3}) True
({8: 6, 10: 4, 12: 3}, {8: 6, 10: 4, 12: 8}) True
({8: 6, 10: 4, 12: 3}, {8: 6, 10: 4, 15: 3}) False


In [24]:
%%time


0.0
CPU times: user 364 ms, sys: 7.86 ms, total: 372 ms
Wall time: 371 ms


# Tests

In [24]:
class TestStringMethods(unittest.TestCase):

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        # check that s.split fails when the separator is not a string
        with self.assertRaises(TypeError):
            s.split(2)


suite = unittest.TestLoader().loadTestsFromTestCase(TestStringMethods)
unittest.TextTestRunner().run(suite)

...
----------------------------------------------------------------------
Ran 3 tests in 0.002s

OK


<unittest.runner.TextTestResult run=3 errors=0 failures=0>